Summarized using AI

Keynote: Taking Out The Trash

Aaron Patterson • June 24, 2016 • Singapore • Keynote

The video features a keynote presentation by Aaron Patterson, also known as Tender Love, at the Red Dot Ruby Conference 2016, titled 'Taking Out The Trash.' The talk primarily explores the intricacies of garbage collection (GC) in Ruby, focusing on the algorithms used within the MRI (Matz's Ruby Interpreter). Aaron begins by expressing gratitude for the audience and the conference before sharing humorous anecdotes about his experience working at GitHub. He introduces the garbage collector topic with a discussion about memory allocation and reclamation.

Key points covered in the talk include:

  • Understanding Garbage Collection in Ruby: Aaron explains the fundamental role of the garbage collector in managing memory by both collecting unused objects and allocating memory for new objects.
  • Types of Garbage Collection Algorithms: He discusses various algorithms, notably the mark-and-sweep method, generational garbage collection, and incremental garbage collection, detailing their phases and operations.
    • The mark-and-sweep algorithm identifies live objects by marking reachable nodes from roots and sweeping away unmarked objects, namely unreachable ones.
    • Generational GC improves efficiency by segregating objects into young and old generations, addressing the likelihood that newer objects die sooner.
    • The incremental GC allows collection processes to run in phases, reducing the pause times during execution by marking objects that are in a gray set and processing them incrementally.
  • Memory Management Techniques: Aaron highlights the importance of objects forming a tree structure, which aids the garbage collector in identifying reclaimable memory. He introduces additional concepts like right barriers and remembered sets in addressing issues during garbage collection.
  • Allocation Strategies: He describes how Ruby allocates memory using pages and slabs, emphasizing the efficiency of using a free list to manage object allocation.
    • An interesting memory allocation technique discussed is the use of tag pointers, which allows certain objects like integers to be represented without additional allocations.
  • Performance and Debugging Tools: The presentation also touched on introspection tools available in Ruby for monitoring garbage collection, such as GC.stat and ObjectSpace.dump_all, enabling developers to analyze and optimize memory management in their applications.

In conclusion, Aaron encourages developers to further explore garbage collection mechanisms in Ruby while highlighting the exciting aspects of coding and memory management. Key takeaways include the importance of understanding GC algorithms for effective memory management and the introduction to useful debugging tools provided by Ruby for tracking memory allocation and garbage collection.

Keynote: Taking Out The Trash
Aaron Patterson • Singapore • Keynote

Date: June 24, 2016
Published: unknown
Announced: unknown

Speaker: Aaron Patterson, Ruby & Rails Core, GitHub

Event Page: http://www.reddotrubyconf.com

Produced by Engineers.SG

Red Dot Ruby Conference 2016

00:00:14.879 okay okay oh I guess I'm using this mic hello hello hello good morning good
00:00:21.000 morning happy Friday I first I want to say thank you to everyone thank you to the organizers thank you all of you for
00:00:26.800 being here today even though I know it is the morning of the the second day which makes it a lot tougher to get here
00:00:34.719 so thank you for being here um it is a great honor to be here I think this is
00:00:39.960 my fourth year here I think and I love coming every time so thank you um today
00:00:45.600 I'm going to talk about uh taking out the trash uh this this is the title of
00:00:51.320 my talk first I'm going to introduce myself a little bit my name is Aaron Patterson I'm also known as Tender Love
00:00:57.879 uh if you don't recognize me this is what I look like on the internet uh so
00:01:03.519 my avatar looks a little bit different than me but that that really is me so uh I work for a company called GitHub uh
00:01:11.479 you may have heard of them I'm not sure um I've been working for the company for about three three months now
00:01:18.479 and it is the first uh legit company I've ever worked
00:01:25.400 for um so like I said I've been I've been working for the company
00:01:31.040 for about for about 3 months and it was really weird I had to go down to um I
00:01:36.200 live in Seattle I had to go to San Francisco for a week to do like training or whatever and it turns out that at the
00:01:43.000 company everybody refers to each other by their um GitHub nicknames and like in
00:01:50.880 person and this has never happened to me before all these people were calling me tender love and I just thought it was
00:01:57.159 really awkward so I mean if you if you want to call me
00:02:04.759 tender love you can but you know please call me Aaron that's fine that's fine too I'm happy with my I'm happy with my
00:02:11.200 real name so I should have called this I
00:02:16.440 should have called this talk cats puns and Friday hugs because it is Friday and I love puns and cats so I'm going to
00:02:22.800 show you I'm going to show you some of my cats this is one of my cats this is Cho Cho um her her her is too long for
00:02:30.920 her mouth so it always it always sticks out this is a more famous One this is
00:02:35.959 Gorbachev puff puff Thunder horse um my wife likes him better than the other cat
00:02:42.159 uh this is he's trying to hide in this photo here like he thinks he is hiding he is not uh this is Choo Cho again I
00:02:49.159 really I really like her she likes to sit on my desk while I'm programming and she always makes this face which is like
00:02:55.599 the same face that I make when I'm programming just like
00:03:07.120 sorry uh anyway so you know I I I also love cats and code just like our MC so I
00:03:14.040 want to go over I want to go over a few things that I've learned so far at the conference uh so first off I learned
00:03:21.040 about some new Ruby features uh we were talk like matz was talking about typing
00:03:26.480 and Ruby and we learned about gradual typing and I want to so I I want to demonstrate
00:03:33.360 it here today I have a live demo for all of you so I'm going to demonstrate gradual typing okay you
00:03:54.439 typing soft typing
00:04:03.079 anyway uh this is a mechanical keyboard so if you want to talk about mechanical keyboards you should come talk to me
00:04:08.239 later uh okay also I learned that we don't care what what dynamic tying ah
00:04:13.760 Dynamic typing yes Dynamic typing this is very Dynamic I was also going to do static typing where I don't move my hand
00:04:28.600 keyboard anyway so we don't we don't care about small things which is why in uh Ruby 2.5 we're going to remove the
00:04:35.440 down case method so no more no more Small
00:04:42.320 Things uh also I thought it was like I thought it I was watching the presentation I thought it was really
00:04:48.000 cool that we're going to have like upcase and down case and they're going to support utf8 characters so we have upcase and down case but I was thinking
00:04:55.960 about this I'm like well we we've got up and down we don't have we don't have
00:05:01.000 widen and and narrow and that doesn't like we're missing those so I I put
00:05:06.360 together an implementation of that so you can call Widen and narrow and so look look forward to this in Ruby
00:05:13.520 2.5 uh if you want to get this functionality today go ahead to my
00:05:19.840 GitHub repository you can get this Gem and and have those um I also learned
00:05:26.199 that there are no no durians allowed at this uh conference but that is too bad because I
00:05:31.919 love Duran so I will eat them anyway um
00:05:38.960 and sorry I'm trying to work in a durian pun somehow I I appreciate all of you for
00:05:52.960 people to fish I'm going to teach you all how to fish uh go ahead and look in your email
00:05:59.960 today all of you you you have an email from me just enter your bank account
00:06:05.639 information and I will help you out um so also I learned I learned from
00:06:12.039 today uh at this conference I learned from Terrence that uh learning a new language can keep me from getting rusty
00:06:18.319 so that was an important thing I learned from his talk today all right so let's let's move on to the real topic of
00:06:24.800 conversation uh I'm going to talk about GC today I'm going to talk about GC uh specifically I'm going to talk about
00:06:31.000 Godfrey Chan uh so I asked him I asked him you know
00:06:37.560 what is your what is your favorite memory and he said that it has been collected uh uh his favorite food is
00:06:45.039 pizza and I asked him on the bus this morning what is your favorite color and he said that it is Ruby uh actually said
00:06:52.599 it was orange but I fix that for him uh okay no no no no I'm I'm I'm actually
00:06:59.120 going to talk about the garbage collector today and this is this is a very exciting topic and I'm so happy for
00:07:04.919 all of you to be here at at 9930 in the morning to listen to talk about a
00:07:11.160 garbage collector so you know get ready to have a nap this morning it's okay you can
00:07:17.400 sleep in so we're going to talk about I'm going to talk about the GC and MRI um I'm going to talk about uh GC
00:07:25.120 algorithms but I'm going to talk about them specifically uh with regard to the algorithm algorithms that we use in MRI
00:07:31.800 so I'm going to talk about some some GC algorithms but I'm going to focus heavily on the AL algorithms we use in
00:07:38.120 uh MRIs GC uh but I think I'm also going to talk about two sides of the garbage collector I think it's interesting when
00:07:44.440 you see people give talks about the garbage collector they typically talk about uh collecting collecting memory
00:07:51.000 but they don't really talk so much about allocating memory so I'm actually going to cover both sides of that uh the GC is
00:07:57.759 the GC is um responsible not only for collecting memory but also for or also
00:08:03.639 for allocating it so we're going to cover that so we'll look at uh collection algorithms how Ruby actually
00:08:09.960 reclaims memory um and then we're also going to look at allocation algorithms
00:08:15.639 how we actually create objects in Ruby uh and I'm also going to touch a little bit if we have time I'm going to touch a
00:08:22.240 little bit on the introspection API like uh G the GC introspection API in MRI so
00:08:29.039 you can actually get statistics about um uh objects in your system uh
00:08:38.479 so I'm going to talk first about the collection algorithms in uh Mr GC so
00:08:44.680 let's let's describe a little bit what uh what type of collector MRI has uh the
00:08:50.920 type of collector we can we can describe it as um a mark and sweep collector it's
00:08:56.839 uh generational and it is also incremental so we can use these three three adjectives to describe the garbage
00:09:03.480 collector so we're going to cover each of those a little bit more in depth but uh first I want to step back a little
00:09:10.120 bit and talk about you know what what exactly is a GC so what you know we know
00:09:15.160 that it's it's it's a thing that that frees up memory for us but at a high level what does that what does that
00:09:21.040 really mean um if you think about the objects that are in your system so if
00:09:26.600 you if you look at the objects when you you're creating objects in Ruby you'll notice that they form sort of a tree
00:09:32.360 thing here so for example if you look at this Ruby code on the left you'll notice that it kind of forms a tree where we
00:09:37.959 start at the at the root uh and the root references an array which is that that a
00:09:43.720 variable and then that array references a hash and the hash references a key and a value pair so we can think of it as
00:09:50.440 kind of a tree data structure so when you're creating these objects in Ruby you're creating kind of a tree a tree
00:09:57.320 data structure and let's say let's say we Chang this code a little bit and we assigned nil into a we
00:10:04.959 actually cut that line up there with the root uh now the root is no no longer
00:10:10.200 holding a reference to that array and hash uh Etc so this these objects are going to get collected so essentially
00:10:18.040 what our garbage collector is doing is saying okay we're going to find everything in this tree that isn't referenced from the root anymore uh and
00:10:25.000 we're going to we're going to free that memory up so we can think of all of our objects as a tree and we can think of
00:10:31.200 the GC as something as trying to find find things that are not available in that tree anymore so uh a few terms that
00:10:38.560 we can learn from this are we have a we have what's called a root set which is our that very top root root thing we
00:10:45.279 have garbage objects those are ones that are no longer reachable from the root uh and then we have live data which is all
00:10:51.720 the all the objects that you can actually reach from the root in that tree so our gc's entire job the entire
00:10:58.320 job of the GC is to find those UNL nodes and then free those up so when we talk
00:11:04.920 about GC algor algorithms all we're really doing is talking about how do we find those how do we find those unlined
00:11:10.959 nodes right how do we get how do we get those what is an algorithm for finding those and releasing those
00:11:17.800 nodes so the first thing we're going to talk about is uh Mark and sweep this is a very simple GC algorithm and it has
00:11:26.279 two distinct phases uh mark and sweep which is why it's called Mark and
00:11:33.399 sweep garbage collector so if you look at the if we look at this let's pretend we have an object graph that looks like
00:11:39.000 this uh what happens is we go through a mark phase and the mark the mark phase
00:11:44.240 goes through walks every one of these arrows and actually marks that object as
00:11:49.639 something that we care about so it starts at the root and follows each of those arrows through the tree so you'll
00:11:55.120 see that it'll follow through a it'll follow through all the these all these arrows until it marks all the objects
00:12:02.440 and then we go through a sweep phase and what happens in The Sweep phase is any objects that were not marked we actually
00:12:09.160 just free up those objects they go away uh and we're left with we're left
00:12:15.040 with the actual live objects and at the end of The Sweep phase we go back through all those objects and unmark
00:12:20.480 them all so now we're back where we started and we can continue on with the program so uh right now the mark Mark
00:12:29.079 flag has been cleared so the mark and sweep algorithm is very it's very easy it's a simple algorithm uh but we have
00:12:35.639 some problems with it it's a little bit too slow we have to actually walk every one of those arrows every single time
00:12:41.560 look at every every object Mark whether or not it's alive and then sweep through the whole thing and another problem is
00:12:47.600 that we have to actually stop the world when we uh when we do this collection you may have noticed in older versions
00:12:53.920 of Ruby if you're especially if you're running a long process you'll notice like it'll just stop stop for a second
00:13:00.720 and then do you're like oh it's just pausing maybe my iTunes is playing too much or something no it's not your
00:13:06.880 iTunes it's fine it's actually the garbage collector just pausing doing the
00:13:12.279 mark and sweep phase and then continuing on so the other problem is that we have to visit objects every single time we
00:13:18.800 have to visit every one of the objects in the Heap and Mark all of those every single time and this is not very fun so
00:13:26.639 we have to walk every object every time so this one way we can get around this is if we introduce what's called a
00:13:32.120 generational garbage collector and the idea behind the generational garbage collector the theory behind it is that
00:13:38.720 objects in your system will die young so typically typically objects are going
00:13:43.880 you're going to allocate them and then they'll just go away very quickly uh so the idea is that if we divide objects up
00:13:50.880 into old and new objects so if we have old objects and new objects then maybe
00:13:55.920 let's only look at the only look at the Young objects only look at the new objects so the way this uh generational
00:14:02.600 collector works is say we have a say we have a object graph that looks something like this uh we do the same Mark and
00:14:10.959 sweep phase where we say okay well we're going to Mark here we Mark B and D and then uh we sweep A and C uh so those go
00:14:19.480 away but once that happens then we take we take b and d and we move them into another generation so they go into the
00:14:26.040 first generation right now let's say the program continues on and we allocate some new objects those new objects are
00:14:32.560 going to get allocated in the Young Generation or generation zero uh then we go through the same Mark and sweep phase
00:14:39.199 so we Mark FNG G we also Mark D uh then we know that e goes away and then f and
00:14:46.199 g get moved into generation one oh go transition there you go uh and
00:14:52.480 then we do the unmarked step again or we clear that Mark bit now the interesting thing that uh you may or may have have
00:14:58.880 not may or may not have noticed is that when we're doing that Mark phase when we
00:15:03.959 walk we're walking those arrows we actually didn't have to touch the B node we didn't go to B because we knew that
00:15:09.759 that one was old so we we're not dealing with B so the nice thing about this is
00:15:15.000 that we don't actually walk those old objects every single time so we can speed up that Mark and sweep phase
00:15:20.639 because we're only we're only looking at Old objects once in a while we only consider new objects frequently but
00:15:26.360 there is one slight problem with this with this uh uh algorithm and that's
00:15:32.160 that let's say we have let's say we have a graph that's like this B and D or in the old generation and for some reason
00:15:38.360 something happens such that that D actually allocates a new object e right
00:15:44.120 that new that new e object is in the Young Generation and when it's time to do the time to go through our Mark and
00:15:51.000 sweep phase we don't actually walk that Arrow up there because we know B is old
00:15:56.199 right we know B is old so we we don't go through that that go through all those arrows and what happens is at the end of
00:16:02.040 the mark phase we say well you know e is e hasn't been marked so we're just going
00:16:07.279 to GC that and now we've got this we've got this bad connection here so so your
00:16:13.600 program will crash we've we've uh freed up an object that actually is being used it's actually live so the way that we
00:16:21.440 fix that is we introduce something that's called a right barrier uh a right barrier in a remembered set and what a
00:16:27.000 right barrier is is when we write that when we write that connection between d and e when that that Arrow gets written
00:16:34.480 we keep track of that we keep track of that connection inside of what's called a remembered set so when that Arrow gets
00:16:41.399 written we remember that we put it inside a special set so that we know when we do our Mark and sweep phase we
00:16:47.160 say oh well we're going to go look at that remembered set too ah we need to Mark E and then e goes into the old
00:16:53.600 generation uh and we're uh free of our we don't have this bug anymore more so
00:17:00.079 we have a couple new terms from this we have a we have a right barrier which is just keeping track of those interesting
00:17:06.280 what we call interesting arrows which would be old generation to New Generation arrows and we have a
00:17:12.039 remembered set which is that list of those interesting interesting arrows so
00:17:17.520 generational garbage collectors they're faster it's it's faster because we don't have to walk as many nodes as frequently
00:17:24.240 uh but it's not quite as easy we have to implement this right barrier so it introduces some right barrier in a
00:17:29.640 remembered set so it introduces some some uh complexity to our garbage collector now unfortunately we still
00:17:36.440 have to stop the world with this with this algorithm even though we're walking fewer nodes we still have to stop go
00:17:42.480 through that go through that cycle and then continue on so to get around this
00:17:48.120 problem we've introduced what's called an incremental garbage collector and an incremental garbage collector is what it
00:17:53.760 says it's something that you can do incrementally and we'll look at how that works the way this algorithm works is it
00:17:59.080 it uses an algorithm called tricolor marking where you have um uh three
00:18:07.919 colors wow these names are so imaginative I love it all right so we
00:18:13.200 have three we have three colors white black and gray now white those those
00:18:18.600 objects are objects that we're going to collect um black objects they they have no references to white objects but
00:18:25.240 they're reachable from the root and then Gray objects are uh reachable from the
00:18:30.559 root but they haven't been scanned yet so we'll look at we'll look at how all these these colors work together so the
00:18:36.679 way this algorithm works is we pick we pick an object from the gray set and then we move it to Black we change that
00:18:41.960 over to the black set now for each objects each object that that one references we move those into the gray
00:18:48.840 set as well and then we just repeat step one and two until the gray set is empty
00:18:54.360 and then we know that everything that's left over that's white those can be those can be garbage collected
00:19:00.240 so uh an example of tricolor marking let's say we have a we have an object graph that looks something like this we
00:19:06.799 start out with a and F in the gray set because they're reference from the root uh we color those we color those black
00:19:13.960 then all of their references go to gray then we color those black and now we
00:19:19.320 know that we're done we don't have any more objects inside of the gray set so we can take all the white objects and GC
00:19:26.480 those we can free those up so those go away way now the advantage of this
00:19:32.200 algorithm is that we can interrupt any one of those steps so we can we can take a break at any time so we do we do the
00:19:39.640 gray to Black coloring and we can just stop right there we can continue on with the program so what this means is we can
00:19:46.520 perform each of these steps incrementally which is why it's called an incremental uh garbage collector so
00:19:52.679 what this means is that our halting time is actually reduced what we what we do is we say okay we'll run the program
00:19:58.240 prog for a bit now we're going to do one step of this incremental GC and then we're going to continue on with the
00:20:03.559 program so uh our program halts for less time but there's also a problem with
00:20:09.960 tricolor marking and that's very similar very similar to the generational problem which is let's say we have these these
00:20:17.039 uh gray colors here now we let's say we halted right right now we're gonna we're
00:20:22.720 going to pause the program uh and we do the we do the first step we pause the
00:20:28.960 program we let the user code continue so the GC pauses and then your code continues and somehow F this F object
00:20:35.840 creates a new object it it allocates something new so it's pointing at a g object now we have this new object now
00:20:43.080 let's say the GC starts up again and it says okay well now we're going to take all the gray objects and we're going to color those black so it it colors all
00:20:50.440 the gray objects black and it says okay we don't have any more gray objects so it's time to GC so we collect that g G
00:20:58.360 object and now we have a bad reference again that g object should have been live but it is no longer
00:21:04.480 live so in order to fix this problem uh we actually introduce another write
00:21:09.559 barrier so we say okay when when we write from F to G we're actually going to save that off so we have another
00:21:16.080 right barrier and a remembered set so we do exactly the same trick we were doing earlier we just keep track of that and
00:21:23.440 then we know that we can color that one gray later so uh interesting things from
00:21:28.960 this the glossery to learn from this is we have incremental GC which means that we can halt the GC at particular times
00:21:34.840 we have a right barrier uh again which keeps track of rights to objects and then we have a remembered set which is
00:21:41.039 those interesting interesting connections now all these algorithms the idea behind all these is that we want to
00:21:47.799 actually minimize tracing so we want to minimize the number of arrows that we Traverse we want to reduce those as much
00:21:54.159 as possible and we also want to decrease halting so that's those are our biggest things we want to reduce the number of
00:22:00.480 arrows we walk because that's less work we have to do and we want to decrease halting because uh if the program halts
00:22:06.760 it means it can't be doing other things like servicing your servicing requests from a user or something now things that
00:22:14.480 our GC is not which I want to talk a little bit about uh our GC is not parallel it means we're we're not able
00:22:20.880 to run this Mark and sweep algorithm in another thread we can't do that we can't do it in parallel uh it's also not real
00:22:27.760 time uh which means we can't we can't run MRI on things like um that require realtime
00:22:34.679 feedback for example like uh I don't know maybe you're building a robot or something you might you might not want
00:22:40.400 to run MRI on that you might want to use something like M Ruby or something that has a more real-time garbage collector
00:22:47.000 uh the other thing is that it's not compacting we don't actually take these objects and move them around and I'm
00:22:52.080 going to talk about that a little bit a little bit later those objects when we allocate them they stay in one
00:22:57.120 particular place in memory so we'll we'll look at that uh when we talk about
00:23:03.279 allocations all right so we've covered we've covered the way that we do um object freeing in MRI now I kind of want
00:23:11.320 to look at um allocation algorithms and the thing I want to look at first is I
00:23:16.679 want to look at our Heap layout so we have a heap uh our Heap contains all of
00:23:22.919 our objects now uh when a ruby object is allocated we don't actually call Malo
00:23:28.679 every single time uh we we actually allocate that into a page and the reason that we do that is that Malo isn't
00:23:38.600 free come on you're killing
00:23:47.200 me I know it's early but please I'm so proud of this
00:24:00.720 all right so Malik isn't Malik isn't free when when you call Malik it costs time so what we try to do is we want to
00:24:08.120 we try to allocate one large chunk of memory we allocate a large chunk or we we call that a page or a slab I prefer
00:24:15.600 to call it a slab though I think we it's referred to in the code as a page and the reason I like to call it a slab I'll
00:24:22.360 get into a little bit later um now we allocate one large chunk of contiguous
00:24:28.720 memory this page memory is contiguous so and each one of these chunks of memory
00:24:34.039 contains a linked list and ah yes I should Advance the slide so each page
00:24:39.360 holds a linked list now this linked list each member of the link list is what we
00:24:44.760 call a slot so nodes in this link list are called slots and those each slot is
00:24:50.000 actually a ruby object so to make this a little bit more clear when we do an allocation we'll allocate a page and it
00:24:56.159 is just a big chunk of memory like like this and when we when the uh garbage
00:25:01.279 collector allocates an object it'll just put one of the objects inside of the page so each time we allocate an object
00:25:07.880 we get a new object one after another oops yes there we go so we have a link list inside that's stored inside of this
00:25:15.640 page this list we call it the free list so all we have to do to allocate a new
00:25:20.919 object is we find the next slot that's open in the free list and then hand that back to you as your Ruby object so
00:25:28.760 uh if we actually fill up a page let's say this page is filled with objects when we go to the free list we see oh
00:25:35.240 it's filled so let's just allocate a new page and now we have a bunch of other places to store objects now uh each one
00:25:43.720 of these Pages uh actually one thing I want to talk about is when we're when we're looking
00:25:50.000 at the top of that the top of this link list where we actually pull that pull that object out when it's time to
00:25:56.000 allocate an object we call that our e Eden uh are Eden so that's where objects are
00:26:02.000 born um so when an object gets garbage collected it's actually pulled out it's
00:26:07.720 pulled out from this list so we free it from that we free it from that list and that actually leaves a hole in the page
00:26:14.960 which I want to talk about a little bit later but let's say we let's say we actually free up all of these objects
00:26:20.399 all these objects get gced if that happens then we'll actually free up this page so the page will go away so we can
00:26:27.279 free that um oh I want to talk about this this is exciting I love this part so I
00:26:32.840 want to talk about some interesting allocation hacks now not every object requires allocation
00:26:41.240 so Matt's talked a little bit about this yesterday with um integers um now not every object and you
00:26:49.880 may or may not know this but not every object in Ruby actually requires us to pull something off of that off of that
00:26:56.399 page so so I'm going to try and walk through how that actually works here so one page one page in Ruby is about 16k
00:27:04.840 this is just a size it's a fixed number it's a 16 16k page size now one object
00:27:10.440 in Ruby is 40 bytes okay so we got a 16k page and we got 40 by 40 by objects now
00:27:17.960 pages are actually aligned there're and what this means is that we say instead of malaking the pages we actually do
00:27:24.760 what's called an aligned Malo and what this means is we say we ask the computer
00:27:29.840 okay give me a chunk of memory but I want the beginning address of that chunk of memory to be divisible by some number
00:27:37.720 okay so the address of that chunk of memory is going to be divisible by some number and we choose 40 so we choose 40
00:27:44.480 as our multiple now 40 happens to be the size of a ruby object now what's cool is
00:27:50.120 if you start at 40 and you print out uh print out the binary representation of
00:27:56.159 that number so we're going to say all right 40 * 1 40 * 2 40 * 3 Etc as if
00:28:01.360 we're walking through the size of those objects so we're adding we keep adding an object of that list uh if you look at
00:28:07.559 the binary values of those now if I I'm going to just show you this here these are the binary binary representations of
00:28:13.399 those numbers if you look at the last three digits you'll notice that those last three digits are always zero okay
00:28:22.240 they're always zero so what this means is we can actually use those last three bits to have add some meaning to the
00:28:29.519 number so we'll say like let's use those let's use those last three bits to uh
00:28:35.120 indicate that that number is something special so what we can do is we can represent we can use that bit to
00:28:41.760 represent integers without doing any allocation so for example let's say we have the number let's say we have the number two uh and we want to represent
00:28:49.480 that as a ruby we want to represent that in Ruby what we would do is say well
00:28:54.760 let's let's have a flag of one for example example uh and if we convert two to Binary we know that that is 1 Z and
00:29:02.799 if we shift that over then we get 100 and we add the int flag to that and now we have one1 and if you look at that
00:29:11.000 number because those last three digits are not zero we know that that was not
00:29:16.240 allocated in Ruby's GC if an object is allocated in Ruby's garbage collector it's always going to
00:29:22.799 end in three zos and since this one doesn't end in three zeros we know that it's something special so what we can do
00:29:30.320 is just take that number two encode it like this and treat it as a ruby object
00:29:35.559 and then to decode we simply just shift that over shift that over one so using this information we can actually figure
00:29:42.120 out what is the largest fix num that we can allocate without actually without
00:29:48.679 actually pulling anything from the garbage collector so for example if we do 2 to the 64 - 1 you'll see that's
00:29:56.200 what the binary representation looks like it's 64 64 1es um but if we
00:30:01.919 subtract one from that we get a big num class uh and that's because you know we had to shift that over one we lose one
00:30:08.600 bit from that shifting so we we can't represent 63 bits uh and if we if we do
00:30:14.600 63 uh we still get a big num and the reason we get a big numb is because the very first bit represents plus or minus
00:30:21.679 right so if we actually do 2 to the 64 then we get a fixed num
00:30:27.760 so we can see that that's actually the largest fixed num that we can represent now even if you add one to that you'll
00:30:33.840 still get back a big num now unfortunately this is before Ruby 2.4 if you go look at Ruby 2.4 it's much more
00:30:40.120 confusing you can't tell the difference they're all just integers unfortunately but but if you look at the object IDs
00:30:47.960 you'll notice they are very very interesting those top two ones there uh those top two ones have very big object
00:30:56.919 ideas s and it's because we actually use the memory address the address of the object as the as the ID uh so you can
00:31:04.399 tell those two objects at the top those two are uh not allocated on in Ruby's GC
00:31:10.159 and the bottom two ones are uh and this technique this technique of using those
00:31:15.840 three bits those three remaining bits we call those tag pointers so it's a pointer it's some number stored in
00:31:22.639 memory but that number has special bits in it so I I love this hack I think it's
00:31:28.320 awesome it's really clever in my opinion now if you look at uh if you look at the
00:31:33.960 other objects in Ruby many many of these objects um are TAG pointers uh fixed nums are TAG pointers of course we don't
00:31:40.440 have fixed nums in two four uh floats are TAG pointers true false nil symbol
00:31:45.600 so a lot of these things don't actually require uh allocation from the garbage
00:31:50.799 collector so I want to move on a little bit uh to talk about some allocation
00:31:56.159 problems and these are these are problems that uh we've been seeing in our
00:32:01.600 application um now one one problem that that uh the garbage collector has is we
00:32:07.679 have an issue with poor Reclamation so let's say we have let's say we have three pages that look like this they're
00:32:14.679 full they're filled with objects right now let's say uh the garbage collector
00:32:19.840 runs and we and we reclaim some of these objects so they go away right we've filled we've reclaimed some of those
00:32:26.240 objects now unfortunately I mean what would be cool is if we could take these objects and move them move them around
00:32:32.320 like this then we could actually take this page and free it it would go away now unfortunately we can't do that that
00:32:39.279 does not happen in that does not happen in MRI we cannot move these objects so what ends up happening is we have these
00:32:45.600 three pages that are allocated with holes in them and we can't reclaim that
00:32:51.200 memory so unfortunately this causes this causes some problems for us where we have we have heaps that are too large
00:32:58.480 they could be smaller but they're not uh the other issue is that we have copy on
00:33:03.720 right we've experience copy on right problems and where this comes from is um
00:33:09.679 uh I'm going to explain how these copy on right issues occur now unfortunately
00:33:14.720 one a ruby memory page is not an OS memory page which is why I like to call
00:33:19.799 those pages in MRI slabs so an OS page is something totally different uh it's a
00:33:25.760 page of memory but it's not the same size as a ruby page so one Ruby page is about 16k on OS 10 an OS page is about
00:33:34.159 4K I think on uh the Linux systems we have in production to there One OS page is about 4K so uh one Ruby page contains
00:33:43.760 four OS Pages now let's say we have a parent process and a child process parent
00:33:50.120 process Fork to have a child process and they're both pointing at something that looks like this they're they're both pointing at a ruby page that looks like
00:33:56.480 this now let's say the uh child process allocates an object sticks it in there
00:34:02.760 now what happens is when we write to that bit of memory the operating system has to copy it so it's copy on write we
00:34:10.079 wrote to the memory it has to copy it now unfortunately the the operating system doesn't copy just that that
00:34:16.200 little section it actually copies One OS page so rather than copying those 40
00:34:22.520 bytes we actually we actually copy 4K so we wrote 40 bytes but we actually got 4K
00:34:28.879 4K copy so what would be cool is and this is something I'm trying to work on
00:34:34.240 and test test on our application is if we could group old objects together such that we didn't have as many holes in our
00:34:41.040 pages so for example let's say we had let's say we had two types of pages uh
00:34:46.919 one type is a probably old page and another type is just a a regular a regular page now what we could do is if
00:34:56.040 we alloc objects at the time we allocate them if we know hey I think you're going to be old we stick that into the old
00:35:02.520 page then maybe we can group all those together so they don't get uh we don't get holes in them like that so we'll
00:35:08.240 allocate an old object we know it's old when we allocate objects that are unknown then we just stick them into the
00:35:13.920 normal pages so you might be thinking Aaron how do you know what object is
00:35:19.599 going to be old it seems hard it is not that hard um
00:35:26.160 if you think about classes and modules and Ruby all these things are objects and they're probably not going to get
00:35:31.280 garbage eled when you do a class Fu it's probably going to be become an old object same with modules or constants or
00:35:38.680 possibly Frozen strings so we know we can have some heris sixs to determine like hey when we allocate this object
00:35:45.240 let's put it into the old page uh the other thing I was thinking about doing
00:35:50.599 and I'm not I'm just testing this now is if we could statistically determine
00:35:55.920 which objects are going to be be old and which ones aren't so let's say we have some Foo object and 90% of the time we
00:36:02.680 allocate a foo object it becomes old uh then the next time we allocate a foo object let's just stick that into the
00:36:09.040 old page um though unfortunately most objects becoming an old object is a rare
00:36:16.280 thing so I'm not sure if this technique will actually work for us but that's why I'm just trying to implement it and see
00:36:22.280 what happens in production uh so this will help us efficiently use space so if we we can
00:36:28.880 group all these old objects together then maybe we can reduce those reduce those empty slots without having to move
00:36:35.440 objects uh we could also reduce GC time with this with this technique because if we know that those objects are going to
00:36:41.839 be old and we mark them old already then we don't have to go through those first two cycles of of um garbage collection
00:36:49.200 in order to mark them as old all right so let's let's take a look at a little bit of some uh GC introspect
00:36:56.960 tools that we have available to us we use we use a lot of these tools to debug
00:37:02.200 um object leaks and just various GC issues in production so uh MRI has a lot
00:37:09.079 of GC in introspection tools so for example we can look at GC info now the
00:37:15.119 way to look at GC info is just with the gc. stat method and you don't need to read all these um just go run that
00:37:23.119 method uh but one of the main reasons I went through talk about all these all
00:37:28.520 these GC algorithms and pointing out the particular uh glossery and those those
00:37:33.680 particular terms is because each of those keys in this hash corresponds to
00:37:40.040 that particular thing with the garbage collector so if you know how the GC works when you read all these Keys it's
00:37:46.160 very easy to understand exactly what the GC is doing so now that we've talked
00:37:51.280 about all these techniques go run this run this code take a look at all the keys and you should be able to determine
00:37:57.119 what each of those keys means uh we can also get GC performance
00:38:02.920 so sometimes we'll look at uh the performance of our garbage collector and that's really easy with just a GC
00:38:09.119 profiler it looks the code looks just like this you can just enable the profiler and check reports for it you
00:38:15.160 can see how long each GC took uh so just call this method and then uh call the
00:38:22.079 report method to take a look at take a look at uh uh the performance of the garbage collector uh we also use a lot
00:38:29.200 of Heap introspection so we'll look at what objects are actually allocated and my favorite tool for using this is
00:38:34.920 object space dump all and what this does is it takes your entire Heap and actually dumps that out as a as a Json
00:38:41.240 file and if you have a lot of objects this can take some time so uh be careful
00:38:46.920 when you use this when you use this tool but this is one of my favorites to use uh you can actually dump one this dumps
00:38:53.400 out your entire Heap but you can actually dump one object at a time if if you want to uh you can just use objects
00:38:58.880 space. dump to actually dump out one object you'll see right there that's just the Json Json representation of
00:39:05.160 that and what's neat is if you use this if you use this uh dump method you can actually see uh see what happens to the
00:39:12.640 object as you GC so let's say we do gc. start three times um you'll see that
00:39:19.800 that object down at the very bottom uh after the three garbage collections it
00:39:24.839 becomes old and uncollectible and marked so you can see you can see how these
00:39:31.680 objects change over time which I think is very cool uh also not not all objects in Ruby have a right barrier and you'll
00:39:38.319 see when you dump out a particular object you can see which objects do and don't have a right barrier uh so in Ruby
00:39:46.119 objects have three generations as we saw from this uh the last one is the third we don't consider an object to be old
00:39:52.920 until it makes that third generation uh so another another thing I really like to use when debugging when
00:39:59.599 debugging um allocation issues is object space go check out all the methods on
00:40:04.800 that that class especially um Trace object allocations I like to use this
00:40:10.440 one I like to use this one a lot uh if you call this if you look at this uh
00:40:15.599 look at this method you can actually turn on uh tracing for objects so that every time an object gets allocated it
00:40:22.000 remembers where it was allocated what file and what line so if you're having troubles determining where this some
00:40:28.520 object came from enable this flag and then you can ask it hey where were you
00:40:35.079 allocated um hopefully am I I am out of time I I don't know yet I think so am I
00:40:43.520 am I okay thank you thank you so much I
00:40:51.040 oh please use this hashtag if you enjoyed the presentation
00:40:57.640 also I have stickers of my cat so come say hi to me I know I'm not supposed to have stickers apparently that's not cool
00:41:02.960 anymore but I do so come say hello find me later yes thank
00:41:12.040 you thanks and love thank you uh any questions from the audience
00:41:19.359 please I know it's early but you can do it yeah I was trying to think of some
00:41:24.480 questions but my memory is mixed I I I want to GC but it take too much
00:41:31.319 time hello um you mentioned in the tricolor marking phase you're using a
00:41:37.359 right barrier to go from blacks to White yes yes is there a reason we can't just
00:41:44.599 Mark uh so if a black object allocates an object and retains a reference to it is there a reason we can't just Mark that allocated object as right during
00:41:53.359 allocation what's your prence I don't think so I mean we probably could though
00:41:58.800 I don't I mean yeah I don't see why not is there a signicant overhead to the right barrier would it
00:42:04.359 be um there's I don't think there's any significant overhead we could try it
00:42:10.880 seems Seems like a good idea yes hey here I've got a question yes hey
00:42:18.200 so you mentioned that uh classes and modules are the objects that probably will become old and will not be garbage
00:42:23.880 collected what if I use a lot of Anonymous classes does it make it harder for GC to do its job and I should avoid
00:42:32.280 it no I don't think so if you use anonymous if you use Anonymous classes that's fine uh we can tell so when we
00:42:40.480 want to divide those when we want to divide those up we can tell like okay this is uh this class was allocated via
00:42:48.079 you know class Fu we know it was actually allocated in the source where if you're doing an anonymous class you're doing like class do you're doing
00:42:55.400 like class. new or something or module. new so we can tell the difference between those two those two types of
00:43:01.960 allocations so I mean feel continue to do that feel free to do
00:43:09.400 it any more
00:43:15.240 questions I think you guys need to GC okay uh I did it
00:43:25.079 yay yeah don't forget to take your cat stickers if not I'll take them all
Explore all talks recorded at Red Dot Ruby Conference 2016
+17