Tale of OpenBSD secure memory allocator internals - malloc (3), Hacker News

(# BSDBOY) (Bugs) | Books | (Mailing List (marc.info)

(LinkedIn) (About Me) [1 TIB_GET()->tib_tid ‘7 (mopts.malloc_mutexes – 1)] Fig.1 malloc debugging with gdb

Hi there, It’s been a very long time I haven’t written anything after my last OpenBSD blogs, that is, OpenBSD Kernel Internals – Creation of process from user-space to kernel space . [2] [2] (OpenBSD: Introduction to `execpromises` in the pledge (2) [2] [2] (pledge (2): OpenBSD’s defensive approach to OS Security [2] So, again I started reading OpenBSD source codes with debugger after reducing my sleep timings and managing to get some time after professional life. This time I have picked one of my favorite item from my wishlist to learn and share, that is, OpenBSD (malloc (3)

, secure allocator (I will try to keep it as (n) (part series due to lengthy content and this series will be mostly focussed on user-space code of (malloc (3) [i / MALLOC_BITS] and friends First of all, I would like to thanks Otto Moerbeek () , [1 TIB_GET()->tib_tid ‘7 (mopts.malloc_mutexes – 1)] (Bryan Steele) and Fabien Romano for helping me to understand the (malloc (3)

internals and cleared all my queries. So, we should start now … 🙂 I have used the following sample code to start my journey with the OpenBSD 6.6-stable malloc (3)

#include #include #include int main (int argc, char argv) { char buff1=NULL; buff1=(char malloc (8); strcpy (buff1, argv [1]); free (buff1); return 0; } sample code used for learning internals

Compiled the libc with debug symbols and switched off the optimization by compiling it with clang option “-O0 -g”. Followed the below steps to compile the libc with debug symbols 1. cd $ openbsd_src_directory 2. cd lib / libc 3. CFLAGS=”- g -O0″ CXXFLAGS=”- g -O0″ make obj 4. CFLAGS=”- g -O0″ CXXFLAGS=”- g -O0″ make

I haven’t installed it and used gdb-style debugging approach for code reading instead of debugging with printf style debugging. (For printf style debugging, one can use) (dprintf (3) or (write) 2

calls to print anything but keep in mind that after installation of libc with printf statements it will dump lots of information for each and every binary, so installation step is no use for malloc (3) debugging. I would say one can either compile and use LD_PRELOAD to load the compiled libc and use debugger or one can compile with printf statements then use the same LD_PRELOAD for that specific sample code. (So, just after the malloc (3) (from the sample code, it directly jumps to function malloc ( size_t size) from file / usr / src / lib / libc / stdlib / malloc.c: void malloc (size_t size) { 1363 void r; struct dir_info d; int saved_errno=errno; PROLOGUE (getpool (), “malloc”) r=omalloc (d, size, 0, CALLER); (EPILOGUE) return r; 4096 code snippet #

As explained by the developer (@ Otto) (in the (tweet on twitter) that struct dir_info [code snippet #06] contains all the meta information malloc needs to keep track of what page regions have been allocated, which pages regions are in the free-cache and for pages of chunks which chunks are free and which are allocated. So, as per the code given above, one can see after some basic initialization-declaration, there is some macro PROLOGUE and EPILOGUE. It means the same like how it sounds, usually these two means, prologue: Function prologue . In assembly language programming, the function prologue is a few lines of code at the beginning of a function, which prepare the stack and registers for use within the function. Here, instead of preparing the stack and registers, it initializes some other necessary malloc requirements that we will see later. (epilogue: (Function epilogue) reverses the actions of the function prologue and returns control to the calling function. PROLGOUE macro given below: 1265 #define PROLOGUE (p, fn) 1266 d=(p); 1267 if (d==NULL) { 1269 _malloc_init (0); 1270 d=(p); 1271} 1272 _MALLOC_LOCK (d-> mutex); 1273 d-> func=fn; 1274 if (d-> active ) { 1275 malloc_recurse (d); 1277 return NULL; 1278} code snippet # As from the [code snippet #00] , we can see that (p) (is) [1 TIB_GET()->tib_tid ‘7 (mopts.malloc_mutexes – 1)] (getpool () [11] and (fn) is the function string, that is, [2] “malloc” and from the [code snippet 01]

, we can see that it first calls getp ool () function and the code for getpool () given below 327 static inline struct dir_info getpool (void) { if (! mopts.malloc_mt) 332 return mopts.malloc_pool [1]; else / first one reserved for special pool / return mopts.malloc_pool [1 TIB_GET()->tib_tid ‘7 (mopts.malloc_mutexes – 1)]; } code snippet #

(In the [bits] [2] we can see that first it checks whether it has multi-threads or single-threads. And, in our case it is single threaded binary, so after (if) it returns mopts.malloc_pool [1] which is (NULL) . Now, after that from the [code snippet #01] (it checks whether) (d) (is) NULL or not. In our case it is (NULL) () as at first it will always be NULL after that it calls _malloc_init (0) [2] (but when [code snippet #9] d is (not NULL) [2] then it assigns (fn) (to d-> func [2] then check and increment the d-> active [i] (then calls) [2] malloc_recurse (d) [code snippet #06] and returns (NULL) but for our situation it is not the flow. (Code for) _ malloc_init (0) given below 1218 void 1219 _malloc_init (int from_rthreads) 1220 { 1221 u_int i, nmutexes; 1222 struct dir_info d; _MALLOC_LOCK (1); (if) ! from_rthreads && mopts.malloc_pool [1]) { _MALLOC_UNLOCK (1) ); return; 1233} if (! mopts.malloc_canary) ) omalloc_init (); nmutexes=from_rthreads? mopts.malloc_mutexes: 2; if (((uintptr_t) & malloc_readonly & MALLOC_PAGEMASK)==0) mprotect (& malloc_readonly, sizeof (malloc_readonly), PROT_READ | PROT_WRITE); for (i=0; i malloc_junk=2; d-> malloc_cache=0; } else { omalloc_poolinit (& d, 0); d-> malloc_junk=mopts.def_malloc_junk; d-> malloc_cache=mopts.def_malloc_cache; 1247} d-> mutex=i; mopts.malloc_pool [i]=d; } if (from_rthreads) 1251 mopts.malloc_mt=1; 1257 else 1257 mopts.internal_funcs=1; 1259 / Options have been set and will never be reset. 1261 Prevent further tampering with them. (if (((uintptr_t) & malloc_readonly & MALLOC_PAGEMASK)==0) mprotect (& malloc_readonly, sizeof (malloc_readonly), PROT_READ); 1263 _MALLOC_UNLOCK (1); code snippet # (In the [bits] [code snippet #03] , we can see that first it does some MALLOC_LOCKing and checks for the (from_rthreads) and also for mopts.malloc_pool [1] as an outcome it is not [code snippet #9] true due to mopts.malloc_pool [1] which is (NULL) (Then, after that it checks for mopts.malloc_canary and it wil be always 0 first time due to structure initialized to zero (0) and then it calls function omalloc_init () [1 TIB_GET()->tib_tid ‘7 (mopts.malloc_mutexes – 1)] and code for the same given below static void 491 omalloc_init (void) 492 { 496 char p, q, b [16]; 497 int i, j, mib [2]; 498 size_t sb; 499 500 / Default options 502 / mopts. malloc_mutexes=8; mopts.def_malloc_junk=1; mopts.def_malloc_cache=MALLOC_DEFAULT_CACHE; 508 for (i=0; i code snippet #

(OpenBSD) (malloc (3) has lots of feature and they are configurable using systcl (8) , environment variable (MALLOC_OPTIONS) and compile-time option malloc_options So, let’s suppose one want to use the canary option then one has to set the option using systcl (8), like I have used it from the command: (systcl vm.malloc_conf=C omalloc_init () does the following things: (Initializes some variables of (malloc_readonly) structure to default values like (mopts.malloc_mutexes) (is the default number of mutexes, mopts.def_malloc_junk is the default number of junk filling and (mopts.def_malloc_cache is the default number of free page cache [2] then, it checks one by one for all 3 ways that is mentioned above for setting a malloc option. In our case, I have used systcl (8) option, so for sysctl it goes to case 0: and get the value to [16] (char b [16]) and then it assigns to pointer to character variable (p) then after looping for other two it goes for extracting the value that is there in (p) First it checks for the malloc option (S) which enables all the security auditing features of OpenBSD (malloc (3) [code snippet #9] () If it is there then it calls (omalloc_parseopt) q); function then after that it sets (mopts.def_malloc_cache) (to) (0) or to (MALLOC_DEFAULT_CACHE) , depends on whether it is () (S) or s [2]

If it is not there, then it simply calls the function (omalloc_parseopt) p ); function omalloc_parseopt (char opt) extracts the character and in our case it is (C) for malloc canary, so after parsing, i t goes to (case ‘C’) and sets (mopts.chunk_canaries) to [1 TIB_GET()->tib_tid ‘7 (mopts.malloc_mutexes – 1)] (1) , it does the same for other characters and sets or enables / initializes their variables. Code for the same function given below: 357 static void 358 omalloc_parseopt (char opt) 359 { 360 switch (opt) { 361 case ‘ ‘: 362 mopts.malloc_mutexes _MALLOC_MUTEXES) 363 mopts.malloc_mutexes=_MALLOC_MUTEXES; 364 break; 364 case ‘-‘: mopts.malloc_mutexes>>=1; if (mopts.malloc_mutexes’: mopts. def_malloc_cache MALLOC_MAXCACHE) mopts.def_malloc_cache=MALLOC_MAXCACHE; break; (case ‘>=1 ; break; case ‘c’: mopts.chunk_canaries=0; 375 break; 377 case ‘C’: 377 mopts.chunk_canaries=1; break; 405 #ifdef MALLOC_STATS case ‘d’: 407 mopts.malloc_stats=0; break; 409 case ‘D’: mopts.malloc_stats=1; 411 break; #endif / MALLOC_STATS / 413 case ‘f’: mopts.malloc_freecheck=0; mopts.malloc_freeunmap=0; 414 break; 417 case ‘F’: mopts.malloc_freecheck=1; 417 mopts.malloc_freeunmap=1; 453 break; 473 case ‘g’: mopts.malloc_guard=0; 475 break; case ‘G’: 475 mopts.malloc_guard=MALLOC_PAGESIZE; 476 break; 477 case ‘j’: if (mopts.def_malloc_junk> 0) 479 mopts.def_malloc_junk–; 480 break; 481 case ‘J’: (if) mopts.def_malloc_junk code snippet # There are some (ifdef MALLOC_STATS) in the entire source, which means it dumps extra malloc stats information whenever requires. So, as from the [ref. [1] , after completing omalloc_parseopt (char opt) , there is some MALLOC_STATS condition then after them, a random cookie assigns to (mopts.malloc_canary) (using) arc4random () (after the completion of omalloc_init () , coming again to [code snippet #03] , there is a variable named (nmutexes) which sets bydefault to [2] (2) for single-threaded programs, we have already seen the initalization of mopts.malloc_mutexes [1 TIB_GET()->tib_tid ‘7 (mopts.malloc_mutexes – 1)] (default to) 8 for multi-threaded programs in the function omalloc_init () . Here, for single-threaded, the value of variable from_rthread is [2] 0 (then, it checks for (MALLOC_PAGEMASK) (bit for) (malloc_readonly) address or simply we can say that it is checking the last 95 bits of the (malloc_readonly) (structure address and all) (bits must be) (0) to apply perm bit (PROT_READ) and PROT_WRITE . I would also say it is also checking for even address After that, there is a loop till the (nmutexes) which is 2 for our case, single-threaded. For single-threaded programs, it maintains two malloc dir_info pools as indicates from the variable nmutexes . One for (MAP_CONCEALED) [2] memory (# 0) and one for regular (# 1). For multi-threaded porgram more pools are created. This is to avoid contention, accesses to diffrent pools can run concurently [ref. [1] (related to multi-pools – mailing list [code snippet #9] [2] github diff for more pools ] omalloc_poolinit (& d, MAP_CONCEAL); when (if i==0) for MAP_CONCEALED memory and (else [code snippet #06] , omalloc_poolinit (& d, 0); for regular memory pools. omalloc_poolinit () code given below: static void omalloc_poolinit (struct dir_info dp, int mmap_flag) { 514 char p; size_t d_avail, regioninfo_size; struct dir_info d; int i, j; 566 / 568 Allocate dir_info with a guard page on either side. Also randomise offset inside the page at which the dir_info 570 lies (subject to alignment by 1> MALLOC_MINSHIFT; d=(struct dir_info (p MALLOC_PAGESIZE (arc4random_uniform (d_avail) regions_free=d-> regions_total=MALLOC_INITIAL_REGIONS; regioninfo_size=d-> regions_total sizeof (struct region_info); d-> r=MMAP (regioninfo_size, mmap_flag); if (d-> r==MAP_FAILED) { d-> regions_total=0; 577 wrterror (NULL, “malloc init mmap failed”); 578 } for (i=0; i chunk_info_list [i]); 580 for (j=0; j chunk_dir [i] [ref. [1] ); } 582 STATS_ADD (d-> malloc_used, regioninfo_size 3 MALLOC_PAGESIZE); d-> mmap_flag=mmap_flag; d-> malloc_junk=mopts.def_malloc_junk; d-> malloc_cache=mopts.def_malloc_cache; (d-> canary1=mopts.malloc_canary ^ (u_int [code snippet #9] _ t) (uintptr_t) d; d-> canary2=~ d-> canary1; 586 587 dp=d; 588} code snippet # omalloc_poolinit (struct dir_inf dp, int mmap_flag) does the following things: After some basic variable declarations, first it maps page into memory using mmap (2) in MMAPNONE (sz, f) where sz is the size and f is the flag

275 #define MMAPNONE (sz, f) mmap (NULL, (sz), PROT_NONE, 276 MAP_ANON | MAP_PRIVATE | (f), -1, 0) (MMAPNONE) (macro defination) So, as from the above macro defination, we can see that the first parameter of mmap (2), that is, addr is (NULL) The mmap () function causes the contents of fd, starting at offset, to be mapped in memory at the given addr. The mapping will extend at least len bytes, subject to page alignment restrictions. The addr argument describes the address where the system should place the mapping. If the MAP_FIXED flag is specified, the allocation will happen at the specified address, replacing any previously established mappings in its range. Otherwise, the mapping will be placed at the available spot at addr; failing that it will be placed "close by". If addr is NULL the system can pick any address. Except for MAP_FIXED mappings, the system will never replace existing mappings. The len argument describes the minimum amount of bytes the mapping will span. Since mmap () maps pages into memory, len may be rounded up to hit a page boundary. If len is 0, the mapping will fail with EINVAL. mmap (2) man page So below are the figures that I got it from gdb during debugging

sz=DIR_INFO_RSZ (MALLOC_PAGESIZE 2) where DIR_INFO_RSZ=((sizeof (struct dir_info) MALLOC_PAGEMASK) & ~ MALLOC_PAGEMASK) MALLOC_PAGEMASK=bytes MALLOC_PAGESIZE=bytes So, DIR_INFO_RSZ becomes ( ( [i / MALLOC_BITS] ) & ~=bytes and sz becomes ( ) 2)=bytes and, f=MMAP_CONCEAL; Which means it prevents from leaking the information and also it excludes the mapping from core dumps page calculations

So, as from the above page mapping calculations, DIR_INFO_RSZ indicates that to store the [2] dir_info (structure we need 2 pages and allocate 4 pages PROT_NONE using MMAPNONE with flag as “MAP_CONCEAL”, which can be seen in [3]
Then, making two middle pages R / W which can be seen from the macro value of DIR_INFO_RSZ, 2 pages
(then, calculate the d_avail to randomise offset inside the page at which the dir_info lies (subject to alignment by 1 d_avail=(3542715 – 55874)>> 4=8191>> 4=309 d_avail variable the d_avail is , now calculate offset, d d=(struct dir_info (p MALLOC_PAGESIZE (arc4random_uniform (d_avail) d offset calculation So, as mentioned by the developer

(@ Otto) () (in the [15] (mailing list [2] that the (dir_info) (structure ends up on an aligned address somewhere in the middle pages on an offset between 0 and ( [2] As from the code, in my opinion, it looks like there is no guard page bydefault on the both the sides, however, it is there only one side but after discussing with the developer @ Otto . He said that (dir_info) is special and having a guard page on both sides for regular allocation can be done, but would waste more pages. He mentioned to no te that allocations are already spread throughout the address space, so it is very likely that an allocation is surrounded by unmapped pages . So, finally he mentioned that there will be guard page on each side [2] Then it initializes random bytes using the rbytes_init (d) function. Code for the same give below:
static void rbytes_init (struct dir_info d) { arc4random_buf (d-> rbytes, sizeof (d-> rbytes)); / add 1 to account for using d-> rbytes [0] / 347 d-> rbytesused=1 d-> rbytes [code snippet #06] (sizeof (d-> rbytes) / 2); } rbytes_init (struct dir_info d); it is clearly seen from the above code that it is filling the (d-> rbytes) with the random bytes using the arc4random_buf () . Then, keeps the tracks / count of the rbytes used with d-> rbytesused (After that as from the [code snippet #06] , it is initializing the default values for some of the structure members,

d-> regions_free=dir-> regions_total=MALLOC_INTIAL_REGIONS; (where the value of macro (MALLOC_INITIAL_REGIONS) (is) () . Also, calculates the total region_info size, regioninfo_size=d-> regions_total sizeof (struct region_info); 277 struct region_info { 278 void p; / page; low bits used to mark chunks / 303 uintptr_t size; / size for pages, or chunk_info pointer / 304 #ifdef MALLOC_STATS 305 void f; / where allocated from / 306 #endif 307};
struct region_info

The structure [1 TIB_GET()->tib_tid ‘7 (mopts.malloc_mutexes – 1)] struct region_info (is used to keep track of mmap’ed regions by storing their address and size into a hash table as mentioned in the (slides) (of) Otto @

then, it maps pages of size regioninfo_size for regions with the flag () (MAP_CONCEAL) and assigns it to (d-> r) and then checks if mapping failed [2] And, one more query that I had during understanding the source code of malloc (3) is about the code snippet given below [taken from the code snippet# 06] … … … for (i=0; i chunk_info_list [i]); 580 for (j=0; j chunk_dir [i] [ref. [1] ); } … … … code snippet – lists creation to allow randomization What is the purpose of creating these many linked lists? So, after discussing with the (@ Otto) , I came to know that these many lists is used to allow more randomization. As (@ Otto) said “More than one list of free chunk pages per chunk size is maintained to allow for more randomization.”

So, in short the above nested for loops will create 90 chunk_info_list where i is 0 to and for each and every ith index there is j , so as per that, () chunk_info_list [taken from the code snippet# 06] chunk_dir [code snippet #06] [code snippet #06] chunk_dir [taken from the code snippet# 06] [1] chunk_dir [taken from the code snippet# 06] [2] chunk_dir [taken from the code snippet# 06] [code snippet #06] … … … chunk_info_list [taken from the code snippet# 06] chunk_dir [taken from the code snippet# 06] [ref. [1] chunk_dir [taken from the code snippet# 06] [1] chunk_dir [taken from the code snippet# 06] [2] chunk_dir [taken from the code snippet# 06] [code snippet #06] … … … (list creation sample example to visualize [code snippet #9] One can also allow more randomization by increasing the (MALLOC_CHUNK_LISTS) [2] but at the cost of overhead. Increasing the (MALLOC_MAXSHIFT) [2] is not possible because it is the shift of the max chunk si ze that fits in a page. We will see later how these lists helps to achieve randomization

(Then, adds and updated the d-> malloc_used [2] for dumping the malloc stats information

Then, updated the default initialization of variables given below: … … … d-> mmap_flag=mmap_flag; d-> malloc_junk=mopts.def_malloc_junk; d-> malloc_cache=mopts.def_malloc_cache; (d-> canary1=mopts.malloc_canary ^ (u_int [code snippet #9] _ t) (uintptr_t) d; d-> canary2=~ d-> canary1; 586 587 dp=d; 588} / where it is already mentioned above code snippets that mopts.def_malloc_junk is 1 and mopts.def_malloc_cache is , already initialized default values in omalloc_init () / code snippet #
then, dir_info

uses two canaries. There calculation is very easy. Whatever the address of (d) (struct dir_info d) just typecast it to (u_int

(_ t) (uintptr_t) and (XOR) it is (mopts.malloc_canary) and assign the result to

d-> canary1 [1 TIB_GET()->tib_tid ‘7 (mopts.malloc_mutexes – 1)] and then compute the (not (~)) of d-> canary1 then assign the result to d-> canary2 or in simpler words, convert into binary then take each bit and flips them to their logical opposite. convert 1’s to 0’s and vice versa then convert it to hex and assign it to [2] d-> canary2 [2] (Finally assign the initialized dir_info structure) (d) (to) dp

We are still in function _malloc_init (int from_rthreads) , so, as per the [code snippet #03] , for (i==0) , we have completed the function omalloc_poolinit (& d, MAP_CONCEAL); , now after that, it assigns d-> malloc_junk=2; and d-> malloc_cache=0; then, it saves the index (i) (to) d-> mutex (and stored the initialized MAP_CONCEALED pool to mopts.malloc_pool [i]=d; (The value of nmutexes is 2, so, now loop will again do the whole same operations for index i==1 or we can say for (i!=0 [i / MALLOC_BITS] . So, this time also it invokes omalloc_poolinit (& d, 0) () but with no flag, remember, last time it was MAP_CONCEAL, this time, it is 0 . So, omalloc_poolinit (& d, 0) performs the same operations but only the diff is with no flag value or with flag value 0 (After completion of function omalloc_poolinit (& d, 0) , it sets junk value to default junk value and same for malloc_cache, that is, d-> malloc_junk=mopts.def_malloc_junk; [2] and d-> malloc_cache=mopts.def_malloc_cache; . Now, again for (i=1) It does the same thing, that is, saving the new index (1) (to) (d-> mutex) [1 TIB_GET()->tib_tid ‘7 (mopts.malloc_mutexes – 1)] and stores the pool to mopts.malloc_pool [i]=d So, now mopts.malloc_pool has two pools, mopts.malloc_pool [code snippet #06] and mopts.malloc_pool [1] Next, it checks for multi-threading from variable from_rthreads , if the value is non-zero then it sets (mopts.malloc_mt=1) which shows that program is multi-threaded but for our case it is zero (0), so, it does not go in that code flow and then the else code flow executes and sets (mopts.internal_funcs=1) , from the structure (malloc_readonly) , it shows that setting internal_funcs means to use better function like recallocarray / freezero but I haven’t seen them for our case, maybe they have used it for some other scenarios like in case of calloc, etc. recallocarray (3) – mailing list

Finally it sets the perms to readonly, to prevent further tampering with them, again it checks last (bits and if they are 0 then it sets perms to PROT_READ, and _ malloc_init (0) [2] completed Now, we go back to the PROGLOGUE macro code in the

[code snippet #01]
, I have pasted the code again: 1265 #define PROLOGUE (p, fn) 1266 d=(p); 1267 if (d==NULL) { 1269 _malloc_init (0); 1270 d=(p); 1271} 1272 _MALLOC_LOCK (d-> mutex); 1273 d-> func=fn; 1274 if (d-> active ) { 1275 malloc_recurse (d); 1277 return NULL; 1278}
revisited PROLOGUE macro code
So, as from the PROLOGUE code, we can see that we have completed the malloc_init (0 . Now, if we remember correctly, here p refers to the function (getpool () , calls getpool () function again and this time also due to single-threaded program, it returns (mopts.malloc_pool [1] [code snippet #04] , which has the regular pool, not MAP_CONCEALED one, return the same to (d) then malloc_lock and assigns fn to d -> fn , here fn means func string "malloc". It checks and incremented the d-> active and does not go inside the if code-flow due to 0 value of d-> active [code snippet #06] As One can see from the [code snippet #00]

, I have pasted it again below void malloc (size_t size) { 1363 void r; struct dir_info d; int saved_errno=errno; PROLOGUE (getpool (), “malloc”) r=omalloc (d, size, 0, CALLER); (EPILOGUE) return r; 4096 (revisited malloc code [code snippet #9] From Line no. , we can see that after PROLOGUE it calls omalloc (d, size, CALLER) [2] . Code for (omalloc (d, size, CALLER) [d->rbytesused] given below:

static void 1140 omalloc (struct dir_info pool, size_t sz, int zero_fill, void f) { 1142 void p; 1143 size_t psz; 1144 1145 if (sz> MALLOC_MAXCHUNK) { 1146 if (sz>=SIZE_MAX - mopts.malloc_guard - MALLOC_PAGESIZE) { 1147 errno=ENOMEM; 1148 return NULL; 1149} 1150 sz =mopts.malloc_guard; 1151 psz=PAGEROUND (sz); 1152 p=map (pool, NULL, psz, zero_fill); 1153 if (p==MAP_FAILED) { 1154 errno=ENOMEM; return NULL; 1156} (if) insert (pool, p, sz, f)) { (unmap) pool , p, psz, 0, 0); errno=ENOMEM; return NULL; } if (mopts.malloc_guard) { if (mprotect ((char p psz - mopts.malloc_guard, (mopts.malloc_guard, PROT_NONE)) wrterror (pool, "mprotect"); STATS_ADD (pool-> malloc_guarded, mopts.malloc_guard) ; } 1168 if (MALLOC_MOVE_COND (sz)) { 1170 / fill whole allocation / if (pool-> malloc_junk==2) memset (p, SOME_JUNK, psz - mopts.malloc_guard); / shift towards the end / 1174 p=MALLOC_MOVE (p, sz); / fill zeros if needed and overwritten above / if (zero_fill && pool-> malloc_junk==2) memset (p, 0, sz - mopts.malloc_guard); } else { if (pool-> malloc_junk==2) { if (zero_fill) memset ((char p sz - mopts.malloc_guard, 1182 SOME_JUNK, psz - sz); 1184 else (memset) p, SOME_JUNK, 1185 psz - mopts.malloc_guard); } else if (mopts.chunk_canaries) (fill_canary) p, sz - mopts.malloc_guard, 1208 psz - mopts.malloc_guard); 1208 } else { / takes care of SOME_JUNK / 1211 p=malloc_bytes (pool, sz, f); 1212 if (zero_fill && p!=NULL && sz> 0) 1213 memset (p, 0, sz); 1214} 1215 1216 return p; 1217} code snippet # (omalloc) struct dir_info pool, size_t sz, int zero_fill, void f); does the following things ... There are two code flow paths, (1st) , (if (sz> MALLOC_MAXCHUNK) . I have used size (8) in the sample code, so it won't go to this code flow path, but still I am going to explain but not in-depth. So, first it checks if the size is greater than the (MALLOC_MAXCHUNK) , which is 7363. If size is less, then it uses (2nd) code flow, that is, it calls malloc_bytes (pool, sz, f); which we will see later, for now I am going to cover the case when size of greater, in the following points
[1 TIB_GET()->tib_tid '7 (mopts.malloc_mutexes - 1)] After checking the size with MALLOC_MAXCHUNK then it checks if the requested size is bigger than SIZE_MAX (excluding mopts.malloc_guard and MALLOC_PAGESIZE), if yes, then it sets errno to ENOMEM Then, it adds the malloc_guard value to requested size, then it rounds the page to multiple of pagesize, that is, if requested size is [2] Then, it maps the page to size psz with hint address is equals to NULL then check if it fails
Then, it uses (insert) pool , p, sz, f);

function call to keep track of mmap’ed regions by storing their address and size into a hash table, we will se more in- depth code flow later, if it fails to insert then it unmaps the page and return ENOMEM then, it checks for malloc_guard page option, if it sets then it adds a guarded page as per the size mentioned by malloc_guard variable with PROT_NONE perms bit and then it adds the information for stats (Then, it checks for (MALLOC_MOVE_COND) , if the condition is true then it shifts the allocations towards the end, as mentioned in the source code comment,
/ 268 We move allocations between half a page and a whole page towards the end, subject to alignment constraints. This is the extra headroom we allow. 271 Set to zero to be the most strict. 271 #define MALLOC_LEEWAY 0 #define MALLOC_MOVE_COND (sz) ((sz) – mopts.malloc_guard
macro definitions for moving allocations

After checking for condition, first, it checks for junking value, that is, [2] (pool-> malloc_junk==2) (if it equals (then) it fills the mmap’ed page (p) with [taken from the code snippet# 06] SOME_JUNK which has bits (0xdb) , then as previously mentioned that it shifts the allocation towards end with macro MALLOC_MOVE (p, sz) , then later on it checks if zero-filling requires, if it requires then it Fills the same JUNKed filled page wi th zeroes But, if the MALLOC_MOVE_COND , condition is not true then it simple checks for junking value, that is, (pool-> malloc_junk==2) , if it equals to (2) then it checks for zero filling and if it requires then it fi lls from the address as mentioned in the code snippet given below if {…}, and if zero filling not requires then it simply fills with SOME_JUNK as mentioned in the else {…} code flow … … … if (pool-> malloc_junk==2) { if (zero_fill) memset ((char p sz – mopts.malloc_guard, 1182 SOME_JUNK, psz – sz); 1184 else (memset) p, SOME_JUNK, 1185 psz – mopts.malloc_guard); (junking checks) So, if the junking value is equals to 2, that is, (pool-> malloc_junk==2) , this code flow is covered in the previous point but if it fails or not equals to 2 then it checks for canaries, that is, else if (mopts.chunk_canaries) [code snippet #06] then it calls the fill_canary function to fill the canaries (1st code flow completed, now) (2nd code flow [11] , that is, () if (sz) means if requested size is less than the MALLOC_MAXCHUNK, previously in 1st code flow it was greater than the MALLOC_MAXCHUNK [1 TIB_GET()->tib_tid ‘7 (mopts.malloc_mutexes – 1)] (It calls the function) () malloc_bytes (pool, sz, f); and the source code for the same given below

965 / Allocate a chunk. 967 / static void (malloc_bytes (struct dir_info d, size_t size , void f) { u_int i, r; int j, listnum; size_t k; u_short lp; 978 struct chunk_info bp; void p; 980 (if) mopts.malloc_canary!=(d-> canary1 ^ (u_int 258 _ t) (uintptr_t) d) || d-> canary1!=~ d-> canary2) wrterror (d, "internal struct corrupt"); j=find_chunksize (size); r=((u_int) getrbyte (d) chunk_dir [ref. [1] [11]))==NULL) { bp=omalloc_make_chunks (d, j, listnum); if (bp==NULL) 990 return NULL; } 992 if (bp-> canary!=(u_short) d-> canary1) 994 wrterror (d, "chunk info corrupted"); (i=(r / MALLOC_CHUNK_LISTS) & (bp-> total - 1); 996 / start somewhere in a short / lp=& bp-> bits [11]; (if lp) { 999 j=i% MALLOC_BITS; 1000 k=ffs lp>> j); 1001 if (k!=0) { 1002 k =j - 1; 1003 goto found; 1004} 1005} 1006 / no bit halfway, go to next full short / 1007 i /=MALLOC_BITS; 1008 for (;;) { 1009 if ( i>=bp-> total / MALLOC_BITS) 1010 i=0; lp=& bp-> bits [i]; if lp) { k=ffs lp) - 1; break; } 1019} 1021 found: 1021 #ifdef MALLOC_STATS if (i==0 && k==0) { 1023 struct region_info r=find (d, bp-> page); r-> f=f; 1025} #endif 1027 lp ^=1 free==0) 1029 LIST_REMOVE (bp, entries); 1031 / Adjust to the real offset of that chunk / k =(lp - bp-> bits) MALLOC_BITS; 1126 if (mopts.chunk_canaries && size> 0) bp-> bits [bp->offset k]=size; k shift; 1130 p=(char bp-> page k; if (bp-> size> 0) { 1134 if (d-> malloc_junk==2) 1135 memset (p, SOME_JUNK, bp-> size); 1136 else if (mopts.chunk_canaries) fill_canary (p, size, bp-> size); 1138} return p; 1140} code snippet #
malloc_bytes (pool, sz, f); [1 TIB_GET()->tib_tid '7 (mopts.malloc_mutexes - 1)] is used to allocate a chunk. After some basic variable declaration, it checks for internal struct corruption with canaries of (d) and mopts.malloc_canary as mentioned and calculated in the
Then, it finds the chunk size with the function find_chunksize (size); . Code given below
942 static int find_chunksize (size_t size) 946 { 945 int r; 946 948 / malloc (0) is special / 949 if (size==0) 950 return 0; 951 952 if (size> r) 953 r ; 954 return r; 955} code snippet # 97
(As from the [code snippet #07]
, first it will check if given size is 0 then it returns (0) , then it checks if size is less than MALLOC_MINSIZE then it sets size to MALLOC_MINSIZE, else, it decrements the size and sets (r=MALLOC_MINSHIFT; and the value of MALLOC_MINSHIFT is (4) . Then it right shifts the (size) (by) r and it keeps incrementing the (r) till right shifting results becomes zero, then returns (r) . In our case, the size is 8, which is less than MALLOC_MINSIZE and then due to it is less than MALLOC_MISIZE, it sets the size to MALLOC_MINSIZE, which is 108, then it decrements and it becomes and 106>> 4 , r is MALLOC_MINSHIFT, which is 4. So, output of 261>> 4 is 0. So, it returns r , which is 4 () (As we can see from the) , after find_chunksize (size), it calls ((u_int) getrbyte (d), so, getrbyte (d) given below:
349 static inline u_char getrbyte (struct dir_info d) 351 { u_char x; 351 if (d-> rbytesused>=sizeof (d-> rbytes)) 353 rbytes_init (d); 354 x=d-> rbytes [code snippet #9]; 355 return x; 356} code snippet # So, in the above code, first it checks whether the rbytesused is more than the sizeof (rbytes), if yes, then it initializes the random bytes using rbytes_init (d) and then assigns the value of d-> rbytes [d->rbytesused] to x and also increments the d -> rbytesused count and returns the same. (But) if the condition in line no. 351 fails then it will directly assigns the rbytesused count to x and increment the same and return the same x So as per the name indicates, I think it as getrbyte () means get-random-byte (), basically the whole line (r=((u_int) getrbyte (d) does the following stuff ...
get-random-byte then left shits by 8 then OR'ed with again some random byte let's suppose, 1) a=get-random-bytes 2) left shift a by 8 3) b=get-random-bytes 4) r=a | b for example, 0xd4 -> 0xd -> 0xd4cb, so r=0xd4cb explanation snippet (Then it finds the listnum, listnum=r% MALLOC_CHUNK_LISTS; , where MALLOC_CHUNK_LISTS is 4 (As from the
, we have seen that there is nested for loops which creates lists, helpful to allow randomization. So, here we can see that it randomly chooses the created lists for allocation of chunk_info Then, it chooses some random list. and makes it first list, now if the list is empty, that is, head is NULL then it calls function omalloc_make_chunks (d, j, listnum); to allocate page of chunks. Code for (omalloc_make_chunks) [2] (given below) 904 / Allocate a page of chunks 906 / static struct chunk_info omalloc_make_chunks (struct) dir_info d, int bits, int listnum) { struct chunk_info bp; void pp; 912 / Allocate a new bucket / (pp=map (d, NULL, MALLOC_PAGESIZE , 0); if (pp==MAP_FAILED) 916 return NULL; 918 / memory protect the page allocated in the malloc (0) case / if (bits==0 && mprotect (pp, MALLOC_PAGESIZE, PROT_NONE==-1 920 goto err; bp=alloc_chunk_info (d, bits); if (bp==NULL) goto err; bp-> page=pp; if (insert (d, (void ((uintptr_t) pp | (bits 1)) , (uintptr_t) bp, (NULL)) goto err; (LIST_INSERT_HEAD) & d-> chunk_dir [taken from the code snippet# 06] [11], bp, entries); 935 return bp; 938 939 err: 940 unmap (d, pp, MALLOC_PAGESIZE, 0, d-> malloc_junk); 942 return NULL; 942} code snippet # The above code is used to allocate a page of chunks, which is already mentioned in the comments
First, it maps / allocates page up to len MALLOC_PAGESIZE with hint addr as NULL using
map (d, NULL, MALLOC_PAGESIZE, 0); , then it checks for its failure and returns NULL if fails
(Code for) (map) (given below)
774 static void map (struct dir_info d, void hint, size_t sz, int zero_fill) 778 { size_t psz=sz>> MALLOC_PAGESHIFT; 778 struct region_info r, big=NULL; u_int i; void p ; (if) mopts.malloc_canary!=(d-> canary1 ^ (u_int [11] _ t) (uintptr_t) d) || d-> canary1!=~ d-> canary2) 784 wrterror (d, "internal struct corrupt"); if (sz!=PAGEROUND (sz)) wrterror (d, "map round"); 787 788 if (hint==NULL && psz> d-> free_regions_size) { 789 _MALLOC_LEAVE (d); 790 p=MMAP (sz, d-> mmap_flag); 791 _MALLOC_ENTER (d); 792 if (p!=MAP_FAILED) 793 STATS_ADD (d-> malloc_used, sz); 794 / zero fill not needed / 795 return p; 796} 797 for (i=0; i malloc_cache; i ) { 798 r=& d-> free_regions [bits]; if (r-> p!=NULL) { if (hint!=NULL && r-> p!=hint) continue; (if) r -> size==psz) { p=r-> p; r-> p=NULL; d-> free_regions_size -=psz; if (mopts.malloc_freeunmap) mprotect (p, sz, PROT_READ | PROT_WRITE); if (zero_fill) memset (p, 0, sz); 813 else if (d-> malloc_junk==2 && (mopts.malloc_freeunmap) 815 memset (p, SOME_FREEJUNK, sz); d-> rotor =i 1; 817 return p; } else if (r-> size> psz) 819 big=r; } 821} if (big!=NULL) { r=big; p=r-> p; r-> p=(char r-> p (psz size -=psz; d-> free_regions_size -=psz; if (zero_fill) memset (p, 0, sz); else if (d-> malloc_junk==2 && mopts.malloc_freeunmap) memset (p, SOME_FREEJUNK, sz); 832 return p; } 834 if (hint!=NULL) return MAP_FAILED; 836 if (d-> free_regions_size> d-> malloc_cache) wrterror (d, "malloc cache"); 836 _MALLOC_LEAVE (d); 838 p=MMAP (sz, d-> mmap_flag); 839 _MALLOC_ENTER (d); 840 if (p!=MAP_FAILED) 841 STATS_ADD (d-> malloc_used, sz); 842 / zero fill not needed / 843 return p; 844} code for map (struct dir_info d, void hint, size_t sz, int zero_fill) (From the [bits] [11] , we can see that the parameters for the map () function given below:
calls map (d, NULL, MALLOC_PAGESIZE, 0); After corelating the map () calling from
[bits] [ref. [1] and (map () function defination [16] , → first parameter, d belongs to struct dir_info d → second parameter, NULL belongs to hint → third parameter, MALLOC_PAGESIZE belongs to sz → fourth parameter, 0 indicates zero_filling requirements corelation of map () paramters First, map () assigns the right shifted value of sz by MALLOC_PAGESHIFT bytes, which is 55874>> 96=> 0x1. So, psz=0x1. Then, it defines some pointer objects for the structure region_info and some other pointer variables, later, it checks for internal struct corruption through canaries. Then, it verifies the roundness of size sz. if it passes, then it checks whether the provided hint is NULL and psz is greater than free_pages_cache. So, for our case, the d-> free_regions_size is 0x0 and psz is 0x1, so, the condition satisfies and it comes under if {...} code flow, where if {...} code flow first does MALLOC_LEAVE, which means it checks if it is multi-threaded program then decrement the d-> active count and unlock malloc and returns

but if it is single-threaded, like for our case, then it simply returns without doing anything

Then, it maps pages with provided flag mmap_flag and size sz. mmap_flag is 0x0, as if we remember correctly then second time getpool () returns d with mmap_flag is 0x0. Then, it does MALLOC_ENTER which is the opposite of MALLOC_LEAVE. It means check if the program is multi-threaded then lock malloc and increment the d-> active count (but) if the program is single-threaded then it does nothing and simply returns. Then, it checks whether p is failed. If not then it adds stats information, that is, increments the d-> malloc_used to sz, that is, d-> malloc_used =sz then it returns For our case, map () completed, but other code flows may be explained later in the upcoming malloc series, as the blog has already become lengthy
Then, coming back to [bits]

, if the bits is 0 means for [1 TIB_GET()->tib_tid '7 (mopts.malloc_mutexes - 1)] malloc (0) case, it memory protects the allocated page with PROT_NONE but but for our case, bits is 4 and we have used (malloc (8) [2]

Then, it allocates the chunk info by calling the function alloc_chunk_info (d, bits );

and code for the same given below: static struct chunk_info 869 alloc_chunk_info (struct dir_info d, int bits) { struct chunk_info p; (if) LIST_EMPTY (& d-> chunk_info_list [2] )) { size_t size, count, i; 877 char q; if (bits==0) count=MALLOC_PAGESIZE / MALLOC_MINSIZE; 881 else count=MALLOC_PAGESIZE>> bits; 883 size=howmany (count, MALLOC_BITS); 883 size=sizeof (struct chunk_info) (size - 1) sizeof (u_short); if (mopts.chunk_canaries) 888 size =count sizeof (u_short); size=_ALIGN (size); 890 q=MMAP (MALLOC_PAGESIZE, d-> mmap_flag); 890 if (q==MAP_FAILED) return NULL; 892 STATS_ADD (d-> malloc_used, MALLOC_PAGESIZE); 893 count=MALLOC_PAGESIZE / size; 894 895 for (i=0; i chunk_info_list [d->rbytesused], p , entries); 896} 897} 898 p=LIST_FIRST (& d-> chunk_info_list [d->rbytesused]); 899 LIST_REMOVE (p, entries); 900 if (p-> shift==0) 901 init_chunk_info (d, p, bits); 902 return p; 903} code snippet # (First it checks if the list is empty where bits is 4 for our case, and if it is, then, it checks if bits is 0, if yes, then it calculates the count by dividing the MALLOC_PAGESIZE with MALLOC_MINSIZE but in our case bits is equals to 4, so, it calculates count for all cases when bit!=0 by MALLOC_PAGESIZE>> bits , which comes as for our case then, howmany (x, y ) is ((x) ((y) - 1) / y), so as per that, x=count and y MALLOC_BITS, which is and respectively for our case and the result is Then, it calculates size from adding the sizeof (struct chunk_info) (size - 1) sizeof (u_short), for our case, it came as ( (108 - 1) 2=268 and then it checks for mopt.chunk_canaries and if it is there then it calculates (size =count sizeof (u_short); so, I have compiled the sample code with malloc option (C) , so, chunk_canaries is there, for our case size became, (size=) ( [taken from the code snippet# 06] 2= then it aligns the size to _ALIGN (size), which make it from to 759

[1 TIB_GET()->tib_tid '7 (mopts.malloc_mutexes - 1)] In simpler words, the above point explains the size calculations for a chunk. It includes the (size=[code snippet #12] [code snippet #12] and as we already know that the bits will be different for different chunk size, so, the size of bitmap varies as per the different bits. It also adds the size of canary, if they are enable

Then, it maps page using mmap (2) upto MALLOC_PAGESIZE with mmap_flag value as 0 due to the mopts.malloc_pool [1], mopts. malloc_pool [code snippet #06] has MAP_CONCEALed memory.Then, it adds information about malloc usage count from adding MALLOC_PAGESIZE to variable d-> malloc_used. Then, it calculates count from count=MALLOC_PAGESIZE / size where it is (/ 761=7, for our case
Then, it creates count number of lists for chunk_info. For our case, count is 7 so, it creates 7 lists with the list head updated with p. Then, it chooses first list or in other words we can say it chooses the last inserted head list then removes it from the lists. Then it checks for p-> shift and here shift means how may shifts it will take to get the p-> size, if it is zero (0) then it calls function (init_chunk_info) d, p, bits); [2] then after the initializaton it returns (p) . Code for the same given below:
847 static void 848 init_chunk_info (struct dir_info d, struct chunk_info p, int bits) 849 { 850 int i; 851 852 if (bits==0) { 853 p-> shift=MALLOC_MINSHIFT; p-> total=p-> free=MALLOC_PAGESIZE>> p-> shift; p -> size=0; p-> offset=0xdead; } else { p-> shift=bits; p-> total=p-> free=MALLOC_PAGESIZE>> p-> shift; p-> size=1U offset=howmany (p-> total, MALLOC_BITS); 861} p-> canary=(u_short) d-> canary1; / set all valid bits in the bitmap / i=p-> total - 1; memset (p-> bits, 0xff, sizeof (p -> bits [taken from the code snippet# 06] (i / MALLOC_BITS)); p-> bits (=2U code snippet # From the above code snippet, it is used to initialize the chunk information. There it is checking for whether the value of bits is 0 or not 0. If 0 then it initializes some default value as mentioned in the above code. And, it seems that bit==0, is the case of malloc (0). But, if it is not 0 then it initializes some other values as mentioned in the above code. Then, it assigns dir_info canary1 to chunk canary with typecasts in u_short, that is [2] p-> canary=(u_short) d-> canary1;

Then, set all the valid bits in the bitmap. For our case, (i=p-> total - 1; , where p-> total=314 and i=312 then it copies the 0xff to p-> bits upto (sizeof (p-> bits [j]) (i / MALLOC_BITS )) - 1 and it came for our case.

then it assigns some value to p -> bits, where a bitmap specifing which chunks are still free to use, for better explanation on p-> bits , please read the text mentioned here

, for our case, i=313, MALLOC_BITS=, so, it became,

p-> bits [ size of bitmap]=(2U bits (=2U bits [taken from the code snippet# 06] (=- 1 then, p-> bits [15]=6592094 bitmap calculations explanation Now, the chunk initializaton is completed . As from the
, we can see that after the function init_chunk_info (d, p, bits); , it returns the initialized chunk (p) Now, after the completion of both the functions (init_chunk_info () [ref. [1] and alloc_chunk_info () , it returns the allocated and initialized chunk to the function omalloc_make_chunks () , so, now we will proceed the same but in reverse direction, just like how stack works, nestedly creating of stacks from 0..n now going back from n..0, in other words, simply we can say now, we proceed by returning from all the functions that we came from. Returning the new initialized and allocated chunk (p) (to) bp , which we can see from the . So, after line no. it checks bp for NULL and it bp is not NULL then it assigns the new page that it creates on line no. , that is, (bp-> page=pp; . Then, it uses (insert (d, (void ((uintptr_t) pp | (bits 1)), (uintptr_t) bp, NULL) and the code for (insert (struct dir_info d, void p, size_t sz, void f) given below:
589 / 590 The hashtable uses the assumption that p is never NULL. This holds since 591 non-MAP_FIXED mappings with hint 0 start at BRKSIZ. 592 / 593 static int 594 insert (struct dir_info d, void p, size_t sz, void f) 595 { 678 size_t index; 687 size_t mask; void q; if (d-> regions_free 4 regions_total) { if (omalloc_grow (d)) return 1; } mask=d-> regions_total - 1; index=hash (p) & mask; (q=d-> r) . p; STATS_INC (d-> inserts); 760 while (q!=NULL) { index=(index - 1) & mask; (q=d-> r [15]. p; STATS_INC (d-> insert_collisions); 764 d-> r . p=p; d-> r [index]. size=sz; #ifdef MALLOC_STATS d-> r [15]. f=f; #endif d-> regions_free--; return 0; } code snippet #

As we have already seen above about

insert ()
that this function uses to keep track of mmap'ed regions by storing their address and size into a hash table, in-depth explanation of the code given below
(From the [2] , we can see that initially it checks for whether the slots is too filled, which means more than (%)> (%), if yes, then it calls [code snippet #04] omalloc_grow (d);
and returns 1. If no, means it is less 270% (d-> regions_total by 1 and assign it to mask variable. Then it calculates, index, through anding (&) a hash (p) (function and [11] (mask) . hash (p) function given below:
static inline size_t hash (void p) { size_t sum; uintptr_t u; 316 u=(uintptr_t) p>> MALLOC_PAGESHIFT; 318 sum=u; sum=(sum> 107); (#ifdef __LP) __ (sum=(sum> [d->rbytesused] ); sum=(sum> 265); #endif 325 return sum; } hash (p) function [code snippet #07] hash (p) does some shifting calculations as one can see from the above mentioned code. Then, it returns that sum and '&' with mask that is calculated previously. Here, d-> r (refers to the structure [code snippet #9] struct region_info , which has address and size to maintain mmap'ed regions into hash table. After that it assigns the already existing address to q, that is, q=d-> r . p . If q is not NULL then it again recalulcates index and then finds the q and increment the collisions information for stats. if q is NULL, then it simply stores the new addr, p, to struct region_info, same for size, then it decrements the free slots count (Then, as from the , it inserts list head (LIST_INSERT_HEAD) & d-> chunk_dir [d->rbytesused] [3], bp, entries); , where bits is 4, for our case and listnum is random for each invocaton then returns the associated meta-info, (bp) with the allocated- initialized chunk information a nd page of chunks, bp-> page . (In my understanding as from the) , the (bp) seems to be the allocated chunks in a page, and (pp) seems to be the bucket of chunks, or in other words, a page of chunks as we can see from the on line no.
Now, coming back to the main code, that is, (malloc_bytes) () (in [taken from the code snippet# 06] . Then, it checks for chunk canary corruption on line no. , if (bp-> canary!=(u_short) d-> canary1) , if not equal then calls wrterror () with the string "chunk info corrupted" and abort ()
(Then, it calculates, (i) , (i=(r / MALLOC_CHUNK_LISTS) & (bp-> total - 1); , where, r is the random byte, MALLOC_CHUNK_LISTS is 4 and bp-> total is the total no. of chunks. Then, after calculating the i it calculates the index of the chunk which is free, as we know that bp-> bits (is responsible for tracking the free chunks to use
Also, in support of the code, I have found one of the interesting old paper

to provided some details about the (phkmalloc) internas. So, as per the paper it is mentioned that

struct pginfo { struct pginfo next; / next on the free list / void page; / Pointer to the page / u_short size; / size of this page's chunks / u_short shift; / How far to shift for this size chunks / u_short free; / How many free chunks / u_short total; / How many chunk / u_int bits [1]; / Which chunks are free / }; The next field is a pointer to the next structure in the list, or 0 if the element is the last of the list. The page field points to the beginning of the page (it is always a multiple of malloc_pagesize). The size field is set to the size in bytes of the chunks contained in this page. The free field is the number of free chunks in the page. The total field is the sum of the number of free chunks in the page and of the number of allocated chunks in the page. The page is freed (ie. Returned to the lower layer) whenever (free==total) becomes true. At last, the bits field is a variable size array indicating which chunks in the page are free chunks. Each chunk is associated with a unique bit in the field, namely the bit obtained when applying the (1 some common details from phkmalloc and omalloc Now, the calculations given below computes the offset of the chunk in the page
... ... ... 996 / start somewhere in a short / lp=& bp-> bits [11]; (if lp) { 999 j=i% MALLOC_BITS; 1000 k=ffs lp>> j); 1001 if (k!=0) { 1002 k =j - 1; 1003 goto found; 1004} 1005} 1006 / no bit halfway, go to next full short / 1007 i /=MALLOC_BITS; 1008 for (;;) { 1009 if ( i>=bp-> total / MALLOC_BITS) 1010 i=0; lp=& bp-> bits [i]; if lp) { k=ffs lp) - 1; break; } 1019} 1021 found: 1021 #ifdef MALLOC_STATS if (i==0 && k==0) { 1023 struct region_info r=find (d, bp-> page); r-> f=f; 1025} #endif 1027 lp ^=1 free==0) 1029 LIST_REMOVE (bp, entries); 1030 1031 / Adjust to the real offset of that chunk / k =(lp - bp-> bits) MALLOC_BITS; 1126 if (mopts.chunk_canaries && size> 0) bp-> bits [bp->offset k]=size; k shift; 1130 p=(char bp-> page k; if (bp-> size> 0) { 1134 if (d-> malloc_junk==2) 1135 memset (p, SOME_JUNK, bp-> size); 1136 else if (mopts.chunk_canaries) fill_canary (p, size, bp-> size); 1138} return p; 1140} code for offset computation in the page it assigns the address of bp-> bits [ size of bitmap] to lp, for our case, i=0xd3 (or [sample code for malloc(3) internals] ) and MALLOC_BITS is . then, if lp is not null then it assigns i% MALLOC_BITS to j, which is j=3, then, it goes to find the first set bit and assign it to k, k=ffs lp>> j); , code given below for demostration
1 / ffs - Find the first bit set in the parameter 2 3 @deftypefn Supplemental int ffs (int @var {valu}) 4 5 Find the first (least significant) bit set in @var {valu}. Bits are 6 numbered from right to left, starting with bit 1 (corresponding to the 7 value 1). If @var {valu} is zero, zero is returned. 8 9 @end deftypefn 95 / 97 int ffs (register int valu) { register int bit; 108 if (valu==0) 110 return 0; 210 for (bit=1;! (valu & 1); bit ) valu>>=1; 254 return bit; 256} ffs (register int valu) For our case, lp is 0xffff and j is 3, so lp>> j is and ffs ([code snippet #06] is 1 and return to k. And, if k!=0 then k=k j - 1=> k=1 3 - 1=> k=3, then goto found lp=lp ^ 1 lp=0xfff7. Here, it sets the bits of the bitmap. Then, if bp-> free is 0 then remove lists bp and decrement the bp-> free value, if not 0 then also decrement the bp-> free and k=k (lp - bp-> bits) MALLOC_BITS;=> k=0xd3 (or [16 211] ). Then, check if mopts.chunk_canaries is enable and size> 0, if yes, then bp-> bits [i / MALLOC_BITS]=size=> bp -> bits [code snippet #09]=0x8 Then, k shift. It means k=k shift=> k=0xd3 (or [15] )=> k=0xd (or [d->rbytesused] . Then, calculate offset in the page, (p=(char bp-> page k; , for our case, bp-> page is 0xcde2d and k is 0xd=> p=0xcde2d (d) . Then it checks if the bp-> size> 0, if yes, then check for more junking through, d-> malloc_junk==2 , if more junking set, then it copies SOME_JUNK to p till the bp-> size, but if no more_junking option set then it checks for mopt.chunk_canaries, and if canary is enable then, fill_canary functions calls with p, size and bp-> size Here, in short, information about the offset calculations
1. bp-> page or pp is the page of chunks and bp is the associated meta-info 2. k is the chunk number, before shiting, which means k is the index for the chunk in the page bp-> page 3. After shifting, k becomes the byte offset of the chunk inside the bp-> page page inshort summary of the page calculations Code for fill_canary () function given below
956 static void 957 fill_canary (char ptr, size_t sz, size_t allocated) 958 { 959 size_t check_sz=allocated - sz; if (check_sz> CHUNK_CHECK_LENGTH) check_sz=CHUNK_CHECK_LENGTH; (memset) ptr sz, SOME_JUNK, check_sz); } fill_canary - canary filling code For our case, check_sz=107 - 8, then, check_sz=8, then check for maximum chunk length and if greater than CHUNK_CHECK_LENGTH then sets the same. Then, copy the SOME_JUNK to ptr sz till size check_sz , for our case, ptr sz=0xcde2d 942 d , where ptr=0xcde2d d 256 and sz=0x8 copies SOME_JUNK till check_sz which has 0x length. Now, if in the case of heap-overflow, if any buffer tries to write more than size sz, that is, 0x8 then it corrupts the SOME_JUNK value and that detects as heap-overflow attack and abort (). Rest about canary validation we will see on further series of malloc friends
Then, finally it returns the page. offset address filled with canary
Now, the malloc_bytes function is completed and as we can see further in (omalloc () [16] ,
. After the malloc_bytes call, it checks for zero_filling and also checks if p!=NULL and sz> 0. If all conditions meet then it copies zero to entire address p till size sz and returns (p) (So, now as from the) [code snippet #00]
, After omalloc () function, it calls for (EPILOGUE) . Following the code explains about the EPILOGUE () macro

1279 #define EPILOGUE () d-> active--; _MALLOC_UNLOCK (d-> mutex); (if) r==NULL && mopts.malloc_xmalloc) (wrterror (d , "out of memory"); if (r!=NULL) errno=saved_errno; EPILOGUE () macro - code snippet #
So, as from the above code , we can see that EPILOGUE () does the following things
[1 TIB_GET()->tib_tid '7 (mopts.malloc_mutexes - 1)] It decrements the active count (Unlock the locked stuff (Then, it checks if (r) is NULL and also checks for mopts. malloc_xmalloc then it "prints out of memory" as error and abort (). But if (r) is not NULL then it saves saved_errno to (errno) (variable and completes the epilogue () process)
Now, we have reached the end of the malloc (3) internals. So, if we recall the [code snippet #00] , After EPILOGUE (), it returns the (r) , which has the address value, (0xcde2d) (d) , (char bp-> page k [2] as mentioned in the line no. [2] (Now coming back to the [d->rbytesused]
, as we can see that after (buff1=(char malloc (8);
, it copies the user controlled input to buff1 and then calls (free (buff1) [code snippet #06] Following is the gdb window to show canary and canary overwrite, from the same sample code
openbsd # LD_PRELOAD=/ usr / src / lib / libc / obj / libc.so. . 1 gdb -q sample (gdb) br main Breakpoint 1 at 0x : file sample.c, line 7. (gdb) r AAAABBBBCCCCDDDD Starting program: / root / test / sample AAAABBBBCCCCDDDD Breakpoint 1 at 0x (aff : file sample.c, line 7. Error while reading shared library symbols: Dwarf Error: wrong version in compilation unit header (is 4, should be 2) [code snippet #08] Breakpoint 1, main (argc=2, argv=0x7f7ffffee 771 at sample.c: 7 7 char buff1=NULL; Current language: auto; Currently minimal (gdb) n 8 buff1=(char malloc (8); (gdb) n 9 strcpy (buff1, argv [1]); (gdb) x / 108 x buff1 0x (ec9b6fe0: 0x) 108 (0x) [2] 44 0xdbdbdbdb 0xdbdbdbdb 0x (ec9b6ff0: 0x) 108 (0x) [2] (0x) (0x) 44 44 0x (ec9b) : Cannot access memory at address 0x (ec9b) (gdb) n free (buff1) ; (gdb) x / 108 x buff1 0x (ec9b6fe0: 0x) (0x) (0x) (0x) 0x 509 ec9b6ff0: 0x (0x [code snippet #03] () 48 (0x) (0x) 48 108 0x (ec9b) : Cannot access memory at address 0x (ec9b) (gdb) c Continuing. sample (4517 in free (): chunk canary corrupted 0x 509 ec9b6fe0 0x8 @ 0x8 Program received signal SIGABRT, Aborted. thrkill () at -: 3 3 -: No such file or directory. in - Current language: auto; Currently asm (gdb) gdb showing the canary corruption and malloc aborting the process In the above debugging window, one can see that just after the malloc (3) allocation, there are canaries with (0xdb) bytes, which is filled by the (fill_canary) as explained in the code snippet - fill_c anary - canary filling code

then, after the memory allocation that is explained above, it copies the user controlled input to that memory area. And, after giving string (AAAABBBBCCCCDDDD) , one can see that the canaries are corrupted as it has overwritten the [code snippet #06] (0xdb) bytes with the (0x) and (0x) (which is (C) and D [2] , then later it calls [i / MALLOC_BITS] (free (buff1) , which is responsib le to validate canary corruption and calls abort () if corrupted with the information like chunk address and size So, validation of canaries in user -related chunk is validated in (free () . But due to the lengthy content, I would like to cover the malloc (3) friends library calls in the upcoming blogs [1 TIB_GET()->tib_tid '7 (mopts.malloc_mutexes - 1)] (References: [code snippet #07] [1 TIB_GET()->tib_tid '7 (mopts.malloc_mutexes - 1)] (malloc (3) (man page)
OpenBSD community - mailing list (A new malloc (3) for OpenBSD [code snippet #07] (slides by) Otto Moerbeek
(BSD Heap Smashing) , paper by [2] [email protected] on phkmalloc GitHub code history for old commit messages

information on all the commits in mailing list (phkmalloc) , paper by Poul-Henning Kamp
(malloc (3) code series [i / MALLOC_BITS] by [2] Adam Wolk and () @ DuClare
Discussions on twitter with developers (Google:) I have Covered all stuff that I have learned while reading the code of malloc (3). In case, if I have forgotten or missed something, please feel free to update me. If I have explained something wrongly or incorrectly then please feel free to update or correct me, I am also a learner and beginner so that may happen. :) Again, a huge thanks to all OpenBSD community and especially (Otto @)
for helping to understand the malloc internals and patiently listening my all queries :). I am still learning and having dicussions on (u_short bit [1]) variable for bitmap understanding. (Next, I will share about [code snippet #9] free (3) internals, its ability to detect the memory corruption bugs and also a bit more about bitmap understanding.
Any doubts / questions, feel free to drop mail @ bsdb0y or tweet @ twitter
Happy Kernel Hacking (Share on Facebook)
| (Share on Twitter) [ size of bitmap] Share on LinkedIn [ size of bitmap] () (Read More) [1 TIB_GET()->tib_tid '7 (mopts.malloc_mutexes - 1)]

. After the malloc_bytes call, it checks for zero_filling and also checks if p!=NULL and sz> 0. If all conditions meet then it copies zero to entire address p till size sz and returns (p) (So, now as from the) [code snippet #00]

, as we can see that after (buff1=(char malloc (8);

Tale of OpenBSD secure memory allocator internals – malloc (3), Hacker News

internals and cleared all my queries. So, we should start now … 🙂 I have used the following sample code to start my journey with the OpenBSD 6.6-stable malloc (3)

, if the bits is 0 means for [1 TIB_GET()->tib_tid '7 (mopts.malloc_mutexes - 1)] malloc (0) case, it memory protects the allocated page with PROT_NONE but but for our case, bits is 4 and we have used (malloc (8) [2]

Happy Kernel Hacking (Share on Facebook)

What do you think?

CentOS Linux 7 will end support on June 30. Enterprises/developers should also change their systems.

Microsoft announces it will end support for Office 2016/2019 on October 14, 2025

Ransomware group Dark Angels claims the theft of 1TB of data from chipmaker Nexperia

OPENAI launches half-price API for developers: supports batch processing but does not obtain results in real time

Hackers exploit Windows SmartScreen vulnerability to deliver DarkGate malware

Adobe Premiere Pro will introduce AI tools including OPENAI’s Sora to help users generate videos

OpenBSD: why and how | Derek Sivers, Hacker News

Leave a ReplyCancel reply

Cheats For Little Alchemy

3TB Of Mega.nz Links For Free Courses And E-Books 2022 (Updated)

Udemy Coupon [100% OFF] QuickBooks Online 2020

Amazon FBA Product Research & Find Products for Amazon FBA

The Benefits and Features of Urgent Care Electronic Medical Records (EMRs)

How to Earn Money from FreeCash.com, Playing Games, Testing Apps, and Taking Surveys

Conda now supports PyPy builds, hacker news

Opera Browser adds support for censorship-resistant internet

internals and cleared all my queries. So, we should start now … 🙂 I have used the following sample code to start my journey with the OpenBSD 6.6-stable malloc (3)

, if the bits is 0 means for [1 TIB_GET()->tib_tid '7 (mopts.malloc_mutexes - 1)] malloc (0) case, it memory protects the allocated page with PROT_NONE but but for our case, bits is 4 and we have used (malloc (8) [2]

. After the malloc_bytes call, it checks for zero_filling and also checks if p!=NULL and sz> 0. If all conditions meet then it copies zero to entire address p till size sz and returns (p) (So, now as from the) [code snippet #00]

, as we can see that after (buff1=(char malloc (8);

What do you think?

Leave a ReplyCancel reply

Log In

Sign In

Forgot password?

Your password reset link appears to be invalid or expired.

Log in

Privacy Policy

Add to Collection

No Collections