00:00:09.120
Hello everyone. So yeah, that's me, I was name-dropped in the previous talk a couple
00:00:15.039
of times. In this talk I'm going to present two bugs: one from about
00:00:22.920
10 years ago, which was my first real Rails contribution, and another one
00:00:28.679
from a bit more than a year ago, which was probably one of the, if not the, most intricate bugs I
00:00:35.559
ever had to fix. But first, let me introduce myself. I'm Jean, probably
00:00:41.360
better known as byroot. If you follow Ruby or Rails development you might also see my professional alter ego, which is
00:00:48.840
Casper Isfine. In addition to being part of Rails core and Ruby core, I also maintain a
00:00:56.039
number of popular gems, such as Bootsnap, redis-client, msgpack, a few
00:01:01.079
things like that. I probably don't need to present my employer; I'm sure you're all
00:01:06.240
shopping on your phone. So yeah, as I said, the first
00:01:13.920
bug was from 10 years ago. Like every bug and every contribution,
00:01:21.600
there's always a context, an opportunity. Back in the day, I joined
00:01:26.640
Shopify in October 2013, in their newly created office in
00:01:32.640
another town. We were growing very, very fast: we were assigned projects, but there were always more people than
00:01:39.000
projects, than work to do, so I just tried to keep myself busy
00:01:45.040
looking at all the issues on the bug tracker and starting to fix weird stuff. So I made a bit of a reputation for
00:01:51.520
myself, until one day my manager just told me: hey Jean, I have something
00:01:56.960
for you. As you can see here, it's a MySQL slow query report, and we
00:02:04.640
had a COUNT query taking 120 milliseconds. You might think:
00:02:11.000
okay, it's a read query, so it's fine; it's from a background job, so maybe it's fine. But actually it takes a lot of
00:02:16.879
database capacity: while your database does that, that's one thread in MySQL that cannot do anything else, so it's not
00:02:23.400
something you want to keep around. So, being a Rails developer, the
00:02:29.280
first thing I thought was: oh, that's easy. If you have slow COUNT queries you use Active Record counter caches; they're
00:02:34.959
made exactly for that. So what are Active Record counter caches, you might ask? It's
00:02:40.640
a very simple denormalization technique: whenever you insert a record you increment a counter, whenever you
00:02:46.840
delete a record you decrement it. That gives you, on the parent model, an
00:02:52.360
integer field that you can just read instead of doing an extra query. It's mostly beneficial if you
00:03:00.959
have a read-heavy use case; if you're inserting way more than you're counting, don't do it.
00:03:07.840
Under the hood, Active
00:03:13.560
Record does atomic increments and decrements as part of the transaction that inserts or deletes the
00:03:20.040
record, so it's all atomic, everything is kept in sync, not for
00:03:26.200
free but very cheaply. However, when I went to track
00:03:31.720
down where the query was coming from, I found that the model actually already had a
00:03:38.400
counter cache, but for some reason the code wasn't using it. Using the good old
00:03:44.400
git blame, I saw that just a few months prior, someone had purposely stopped
00:03:50.000
using the counter cache. So what seemed like a very simple fix started to be
00:03:55.680
more complicated. Tracking down that commit and the discussion prior to it, there
00:04:03.200
was a bug report about collections. It's Shopify, so you have products, they are part of collections,
00:04:09.200
and you might want to know how many products there are in a collection. Some people would see minus 500 products in the
00:04:15.959
collection, which obviously sounds a bit wrong. There were also several linked
00:04:23.160
PRs of attempts to move counter caches into Redis, because Redis is atomic, but nothing had worked so far, and
00:04:32.440
it had been a long-standing bug for about a year, I think, something like
00:04:38.240
that. So I went to tackle the issue, and all the previous attempts
00:04:44.759
assumed: okay, this must be a race condition somewhere in Rails. They went nowhere, and
00:04:52.560
for me, counter caches had been in Rails for as far back as I could remember, so people had been using them for 10 years; it
00:04:59.199
couldn't possibly be a bug in Rails, right? This was battle-tested. So I went instead
00:05:06.840
with the idea that we were probably doing something wrong, and I went to look at the implementation of
00:05:13.639
counter caches, which, don't get scared, I'm going to go over.
00:05:21.000
This is just metaprogramming to define model callbacks. If you remove the
00:05:27.080
metaprogramming layer you get this, which is much simpler, and this one all of you should be able to read:
00:05:33.520
basically, on after_create you increment, on before_destroy you decrement. That's it, as simple as that.
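Those two callbacks can be sketched in plain Ruby, with no Rails involved (the model names and in-memory "tables" below are illustrative stand-ins, not Active Record's actual code):

```ruby
# A plain-Ruby sketch of the counter-cache mechanics.
POSTS    = { 1 => { comments_count: 0 } }
COMMENTS = {}

def create_comment(id, post_id)
  COMMENTS[id] = { post_id: post_id }
  POSTS[post_id][:comments_count] += 1  # after_create: increment the parent
end

def destroy_comment(id)
  post_id = COMMENTS.delete(id)[:post_id]
  POSTS[post_id][:comments_count] -= 1  # before_destroy: decrement the parent
end

create_comment(10, 1)
create_comment(11, 1)
destroy_comment(10)
POSTS[1][:comments_count]  # => 1, read directly instead of running a COUNT(*) query
```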
00:05:41.360
So I basically audited the Shopify codebase and went looking:
00:05:47.840
since those are callbacks, you could use update_column or some lower-level APIs that bypass
00:05:55.479
callbacks. That was my theory, but I just couldn't prove it. I spent a couple of days looking everywhere for
00:06:03.039
proof of that, but just couldn't find it. So, back to square
00:06:09.479
one. But then it hit me: looking at the production data,
00:06:14.759
all the desynchronized counters were skewed the same way.
00:06:20.400
They were always displaying less than the actual count, and
00:06:26.000
many were actually negative, absolutely negative. This already allowed me to narrow down the
00:06:33.319
problem: if we decrement too much, it means the issue is on the destroy path, not on the
00:06:40.199
create path. Oh, actually, it could also be that we don't increment when we create,
00:06:45.360
but still. So perhaps it was indeed a concurrency
00:06:51.960
issue. When you try to reproduce a bug on a huge app, it can be complicated, because
00:06:58.919
you have lots of overhead, and recreating the scenario with all the validations can be complicated. So I just
00:07:05.199
went to create a small application, your basic blog posts with
00:07:10.599
comments and whatever, and I ran it with Unicorn to have actual concurrency, because that's also a
00:07:17.000
problem when you try to reproduce in development: usually many things are different from
00:07:23.599
production. And that's where I managed to actually reproduce the problem. As you can see,
00:07:29.160
the counter cache displayed on top says there's one product, but you can actually see two under it, and
00:07:34.759
all I did was mash the delete button in the
00:07:40.520
browser. For the anecdote, I tried to reproduce this 10 years later for this talk, and I just couldn't,
00:07:46.039
because Chrome and Firefox now prevent double-clicking, so I had to simulate it a
00:07:53.879
bit. Now that I had a reproduction, I could really look in detail at what happened. Rather than
00:07:59.720
trying to get through gigabytes and gigabytes of production logs, I had just a few lines, and I
00:08:08.520
colored the relevant ones. You can see the first request, which is in red,
00:08:17.039
loads the record, and
00:08:22.720
then, just after that, the second request loads the same record in memory as well,
00:08:28.360
and then they both delete it. So that's where the
00:08:35.800
problem comes from. As with most concurrency issues, since the job of our database,
00:08:43.039
with transactions and everything, is to make everything appear as if it were a serial series of events, there's
00:08:49.120
concurrency of course, but at the end of the day you can recreate a linear series of events:
00:08:54.920
you don't need threads or anything to reproduce it. Here's the log translated
00:09:00.440
into a simple Ruby snippet; it's just that. In the same process, main thread, you load
00:09:05.720
the same product twice, and then you destroy it
00:09:10.880
twice.
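That double destroy can be boiled down to a plain-Ruby sketch (a tiny in-memory "table" stands in for MySQL here; nothing below is the actual Rails code):

```ruby
# The race, linearized: two requests each destroy the same loaded record.
products = { 1 => { title: "Shirt" } }
products_count = 1  # counter cache on the parent collection

destroy = lambda do |id|
  products.delete(id)   # DELETE is idempotent: the second call removes nothing
  products_count -= 1   # ...but the destroy callback decrements unconditionally
end

destroy.call(1)  # request 1: row deleted, counter goes to 0
destroy.call(1)  # request 2 (the double click): no row deleted, counter -> -1
products_count   # => -1, the negative counts seen in production
```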
00:09:16.920
I wasn't so great at SQL at the time; I had just come out of school, and they had told me transactions are atomic, so that's
00:09:28.640
that. But it turns out it's a bit more complicated. In SQL, DELETE
00:09:34.399
is an idempotent operation: you can repeat it and it's always going to succeed. It might delete something or it
00:09:43.040
might delete nothing, but in both cases it's considered a success. So if we always decrement after
00:09:50.720
a DELETE, assuming we actually deleted something, then we decrement too
00:09:56.720
much. So, same thing: it was very easy to create a test case to
00:10:01.880
reproduce this. When you're trying to debug something, having a good, very
00:10:08.160
fast feedback loop is a major productivity gain. And worst case, with this
00:10:14.040
test case, if I couldn't fix the bug myself, I could just open an issue on Rails and say: hey, here's a
00:10:20.079
test case, you can experiment for yourself. Getting a repro, a
00:10:26.200
programmatic repro, is really the most important thing you should drive for when you're chasing a
00:10:31.880
bug. Now that I knew what the bug was: as I said, this had
00:10:37.720
been in Rails for 10 years; it was impossible that we were the first ones to hit it. Now I had a
00:10:45.959
keyword I could search for, and there were indeed some proper bug reports.
00:10:51.760
There were even some proposed fixes, some pull requests, and one of them
00:10:57.800
was to use a pessimistic lock, that is, database locking, to prevent this race
00:11:04.040
condition, but it was rejected, or at least not considered, because the
00:11:10.639
performance impact of locking rows explicitly during the transaction is really significant. So I knew I had
00:11:17.480
to find another solution. But reflecting back, if you look at
00:11:22.639
that MySQL console session, the
00:11:28.200
answer is right before our eyes: it tells us that the first DELETE actually deleted the record, but the second
00:11:35.519
one didn't. So if we could get this information on the Ruby side, we could
00:11:41.560
just say: hey, if you deleted the record, you decrement; if you didn't, you do
00:11:48.880
nothing. And conveniently, we actually have this information. Maybe you have
00:11:55.480
sometimes used the lower-level Rails APIs like update_all or delete_all and things like that: they
00:12:01.800
actually return you that number, the number of affected rows. So the data was there; we just couldn't
00:12:08.399
use it, because since counter caches are callbacks, they don't get
00:12:14.440
this information; they run in a different context. So we had everything, it
00:12:20.600
was just a matter of rewiring, of putting all the pieces
00:12:27.760
together. So I went to refactor counter caches so that
00:12:35.279
they're no longer callbacks but actually core to Active Record, and this way they could run the
00:12:40.440
actual DELETE statement, get the value back, and check that the affected-row count
00:12:46.560
is indeed greater than zero.
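The shape of that fix can be sketched the same way: only decrement when the DELETE actually removed a row, using the affected-row count (again a plain-Ruby stand-in, not the actual patch):

```ruby
# Same scenario as before, but the decrement is guarded by the
# affected-row count, like the number update_all / delete_all return.
products = { 1 => { title: "Shirt" } }
products_count = 1

destroy = lambda do |id|
  affected_rows = products.delete(id) ? 1 : 0
  products_count -= 1 if affected_rows > 0  # a no-op DELETE no longer decrements
end

destroy.call(1)
destroy.call(1)  # idempotent delete: zero rows affected, counter untouched
products_count   # => 0, the counter stays in sync
```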
00:12:46.560
But before submitting this patch to Rails, I wanted to test it in production, because maybe my fix wasn't sound, maybe it also had
00:12:59.560
performance issues, maybe there were other bugs; as we say, one train can
00:13:05.480
hide another. So I made a monkey-patch version of that fix and I shipped
00:13:13.240
it to production, and I did a data migration to reset all the counters and
00:13:19.000
everything. And after several weeks, I don't remember quite how many,
00:13:24.040
we couldn't notice any desynchronization whatsoever, so I was able to revert the
00:13:30.839
initial commit that had stopped using the counter cache, which solved the slow COUNT in the meantime.
00:13:36.440
Now that I had my fix, I had to send it upstream, right? Because
00:13:44.279
other people can benefit from it, and you don't want to run a monkey patch forever. So I just did a clean
00:13:50.040
PR, and it felt really, really good.
00:13:55.839
But Aaron Patterson ended up reviewing my pull request, and it didn't exactly go as I
00:14:01.600
imagined. Based on his responses, he didn't quite understand what the bug
00:14:06.759
was, which was probably my fault for not explaining it well. I was a bit
00:14:11.880
destabilized, so I answered very stupidly and told him his suggestion
00:14:19.839
was a terrible idea, "with all due respect". My English level,
00:14:25.560
and my communication skills in general, weren't great at the time. Now that I
00:14:30.920
maintain many projects, I totally understand how that works: I had been working on this for weeks, I
00:14:38.120
had all the context in my head, and I didn't do a good job of actually dumping that context into the
00:14:46.199
issue. So yeah, it's
00:14:52.120
really about the fact that people cannot read your mind; it's really important to write down all
00:14:57.880
the important information and present it well. Luckily for me, Aaron is really
00:15:03.759
nice, so he just mocked me a bit, and that was fine. But many maintainers would have just ignored me or
00:15:11.000
something; I'm pretty sure even the me of today would have ignored the me of back then. But
00:15:18.839
the PR was ultimately merged. So, reflecting back on all of this,
00:15:18.839
a few takeaways I took for myself, or whatever: debugging is just the scientific method.
00:15:31.839
You theorize, you do an experiment, you confirm or infirm, and then you cycle back until the problem
00:15:37.199
is narrowed down enough that you know where it is. When you get more
00:15:43.399
experience, or you really know the application, you can sometimes make better theories, so you
00:15:51.000
might go faster, but the process is still the same. And I touched on this a bit
00:15:57.639
earlier, but the most challenging part at the beginning is to get a reproduction. I know very,
00:16:04.560
very few bugs that have a reproduction and haven't been fixed; however, I know a million bugs where the reproduction
00:16:12.199
steps are unclear or non-existent, and they're just rotting there. As a corollary, it means the best
00:16:19.000
person to investigate a bug is the person who experiences it. If you just go on a bug tracker and say:
00:16:24.440
hey, I have a bug, and it's unclear how to reproduce it, then the maintainer, 99% of the
00:16:29.560
time, just cannot do anything for you. And if you go look at Rails,
00:16:35.959
you're going to find five- or six-year-old issues that are just a vague bug report with no reproduction; nobody
00:16:41.560
can fix those for you. Another thing is to always
00:16:49.120
question your assumptions. I lost a couple of days thinking: oh, the bug cannot be in Rails, because...
00:16:54.800
Just be careful about that. Another thing is that your
00:17:00.240
dependencies are your code too. This is open source; it's the
00:17:07.240
same thing as, I don't know, a controller that's been written by a colleague of yours. Maybe they're on
00:17:14.319
vacation; you're not going to wait two weeks or whatever. It's your code too, right?
00:17:22.559
You don't have to know it beforehand to use it, but if it doesn't work as you expect, you need to be able to figure it
00:17:28.319
out. And also, hopefully this is never the case, but you shouldn't
00:17:34.760
need to ask your manager: can I do a PR upstream? It's your application,
00:17:40.360
really. And finally, communication is important. We all know the people who
00:17:45.760
think they're Linus Torvalds or whatever and that they can just be rude. As a
00:17:50.799
maintainer, I can work on your issue in my free time, or I can do a million other
00:17:58.679
things. I'm not just saying you should be polite, because you can be extremely insulting without swearing,
00:18:05.640
or you can be very nice while swearing a lot. It's really about trying to cooperate and help, not being judgy,
00:18:11.080
and keeping your grumblings to yourself, that kind of thing. Make me want to
00:18:24.960
help you, or make the maintainer want to help you.
00:18:30.080
So, second bug, 10 years
later.
00:18:36.760
So yeah, fast forward 10 years: I became a Ruby committer, I became a
00:18:41.840
Rails core member, still working at Shopify but on a team dedicated to working on Ruby
00:18:47.840
and Rails. So I thought, you know, no Ruby bug and no Rails bug
00:18:53.799
can scare me anymore. Or so I thought. In
00:19:04.720
March 2023, I saw a toot on Mastodon,
00:19:11.360
and someone said: hey, there's a bug in Mastodon (Mastodon is written in Rails). I had a bit of time to kill, and
00:19:17.679
this person was asking nicely, so I decided to go
00:19:26.880
look. So, a classic, not-super-clear bug report. But recently, Aaron
00:19:32.200
Patterson had been making changes to Mastodon to allow
00:19:37.600
it to upgrade: Mastodon was stuck on Ruby 3.1, Ruby 3.2 was new, and he
00:19:44.000
wanted to see the performance difference. So he unblocked the upgrade, some users started to upgrade, and they started
00:19:49.760
hitting errors, like actual
00:19:56.919
exceptions. This is the backtrace, and just reading it,
00:20:03.280
without even looking at the code: it's failing while trying to read the id attribute
00:20:10.240
of an Active Record instance. To me this looked simple. I was very
00:20:17.200
overconfident: oh, the Mastodon developers simply ran into a small limitation, of course. I said:
00:20:24.000
this is probably one of your models that doesn't have an actual id attribute; maybe it's just
00:20:29.480
a join table. I'm just going to fix that in Rails, it's no problem, you can just
00:20:35.240
work around it that way. And of course, 5 minutes later, while trying to do
00:20:40.679
the change in Rails, I was like: wait a minute, this doesn't add
00:20:46.760
up. I had ignored tons of clues in the thread and jumped to conclusions way too
00:20:51.960
quickly, because I was clearly overconfident. When I went to look at
00:20:57.200
the source code, the problem was not that the model didn't
00:21:03.320
have an id column, but that we were trying to access the id of an instance
00:21:08.640
that didn't have its attribute values loaded yet. So it was a partially
00:21:15.880
initialized record, much weirder than that. Another clue that I had been
00:21:23.559
totally ignoring is that it was only happening to people upgrading to 3.2, so the Rails-limitation theory was totally off
00:21:29.440
track. I had to come up with a theory that would include
00:21:34.600
the upgrade, the change between 3.1 and 3.2, as part of the
00:21:39.799
went to look for the changes in Marshal, with a good old git log,
00:21:52.320
and looking at the commits there was basically nothing. Honestly, from experience, when you're starting to use
00:21:57.640
git log to try to find a bug, you're desperate: you have no idea what you're looking at, and this is never how
00:22:04.799
you find bugs, unless you know it's really between like five commits. But one year's worth of changes? Forget
00:22:09.919
it. At this point I just didn't think I could go any further without a
00:22:16.440
reproduction, because here I was trying to debug something that wasn't happening to me. It's like trying to help
00:22:23.840
your parents over the phone to debug their internet box or something; it's absolutely terrible. If I
00:22:29.159
couldn't reproduce it myself, I couldn't go any further.
00:22:35.600
The next day, another user posted an interesting observation: they tried to load the broken Marshal payload from
00:22:41.799
Ruby 3.1 (they took a cache payload from production and tried to load it in the old Ruby), and it was failing
00:22:48.559
the same way. So this was one more clue. I was
00:22:55.080
convinced the bug was in Marshal, but I didn't know if it was a bug in loading the payload or in serializing it,
00:23:00.360
and if a payload generated by 3.2 was still
00:23:05.679
broken in 3.1, then I knew it was in the generation. So we narrowed the hypothesis
00:23:12.960
yet again. But I really needed to get my hands on a corrupted payload, because I
00:23:21.039
needed to inspect it. The problem is, Mastodon's cache contains lots of personally identifying data and stuff
00:23:27.080
like that, so you cannot just say: hey, take this. It may contain private messages, things
00:23:33.039
like that. So the next day someone shared a decompiled payload, basically
00:23:38.400
a textual dump of the payload. I couldn't execute it, but
00:23:45.200
they did this so they could strip the personal information, and at least I could gather various clues about
00:23:51.799
what it looked like. And that led me to a new theory:
00:23:57.080
in the payload I could see there were many circular references. I'm going to touch a bit more on that, but basically
00:24:03.880
it's objects referencing each other, A referencing B and B referencing A; Marshal is able to handle
00:24:10.559
that. And so my theory was that an Active Record instance was being used as
00:24:16.080
a Hash key, in a circular way, before it was fully instantiated.
00:24:22.679
Basically, that would look like this pattern. If you're not familiar with allocate:
00:24:28.440
when you do SomeClass.new in Ruby, it basically calls allocate and then calls initialize on the return value of
00:24:34.080
allocate. So here I'm allocating an object without initializing it, without calling
00:24:40.000
initialize, and if I use that object as a Hash key, I get exactly the same error
00:24:46.200
users had in production. So that was a confirmation.
00:24:46.200
You need to understand a bit how Marshal works. It's not a magical thing where you call load and
00:24:52.600
instantly your whole object graph has been recreated. It reads a stream of
00:24:59.279
instructions on how to recreate the object graph: it creates the objects one by one and then rebuilds the
00:25:05.520
graph, which means that at many points in time you have partially initialized
00:25:13.919
objects. And because of this, in the case of circular references,
00:25:20.080
Marshal may have to insert a partially initialized instance as a Hash key or something.
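You can see Marshal handling a cycle with plain Hashes; it has to register each object before it is fully filled in, which is exactly why partially initialized objects exist mid-load:

```ruby
# Two objects referencing each other survive a Marshal round-trip.
a = { name: "a" }
b = { name: "b", peer: a }
a[:peer] = b  # a and b now reference each other

restored = Marshal.load(Marshal.dump(a))
restored[:peer][:peer].equal?(restored)  # => true, the cycle is rebuilt
```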
00:25:27.760
That was purely a theory at that point, but something like that could reproduce it, so I had a pattern that could explain it, and I needed to look
00:25:34.159
for it. Since I know Active Record relatively well, I had a few ideas of
00:25:41.440
where the thing could be, so I started grepping and things like that,
00:25:46.600
and one place where we use Active Record instances as Hash keys is the
00:25:51.880
association cache: when you load, say, product.comments or something, there's a Hash with instances as
00:26:01.039
keys. But there was a missing piece, right?
00:26:06.520
Because, okay, the record was
00:26:11.640
partially instantiated, but it meant that we would load an association cache
00:26:17.279
before we loaded the attributes, which made no sense, because if you were to create a new record and look at the order of
00:26:25.120
the instance variables, attributes come first. So we also needed to find a
00:26:33.520
place where the ordering changed. Let me show you how ordering works; you probably
00:26:39.799
never asked yourself this question, but instance variables are ordered in
00:26:45.880
Ruby. On the surface it's quite simple: Ruby just preserves the order of
00:26:52.440
assignment. Here we have two classes, A and B;
00:26:58.399
they both define the same variables, in reverse order from each other, and when you ask for the instance variables you get the order you set the
00:27:03.919
variables in. But what if the same class can sometimes have different initialization
00:27:10.080
orders? In this case it's the same class, and in one instance we define @a
00:27:15.600
and then @b, and in the other we define @b and then @a. You can see that
00:27:21.960
the first case is what we expect, but the second case is less clear,
00:27:28.000
right? Should you reuse the same order as the first instantiation, or do you get your own order because
00:27:33.799
you're a different instance?
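The question, in code; the second result is version-dependent, which is the whole point:

```ruby
# Same class, two instances, reversed instance-variable assignment order.
class Point; end

a = Point.new
a.instance_variable_set(:@x, 1)
a.instance_variable_set(:@y, 2)

b = Point.new
b.instance_variable_set(:@y, 2)  # reversed order on the second instance
b.instance_variable_set(:@x, 1)

a.instance_variables  # => [:@x, :@y]
b.instance_variables  # up to Ruby 3.1: [:@x, :@y] (first order seen by the class wins)
                      # Ruby 3.2+:      [:@y, :@x] (each instance keeps its own order)
```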
00:27:40.360
On the Ruby committers Slack we have a very useful bot: you can give it a snippet of Ruby and it will run it over all the
00:27:46.480
versions of Ruby and show you the differences. So, if you ignore 1.8,
00:27:52.039
where this returned strings, you can see that up to 3.1 the very first
00:27:59.519
insertion order seen by the class was preserved, and the same for every single other instance of the class. But in
00:28:06.000
3.2 that changed: every single instance has its own
00:28:13.960
order. This was a smoking gun, because we had a difference between 3.1 and 3.2,
00:28:19.760
and the reason for that is object shapes. I work closely with two colleagues
00:28:24.960
of mine who were working on that; it's an optimization. So now we needed to find a pattern in
00:28:31.240
Active Record that could cause this. I grepped for it, and I thought
00:28:39.320
I saw it, because I saw that initialize_dup, when you dup
00:28:45.720
a record, resets the association cache and then calls super, so all the
00:28:50.760
attributes are set later. I was really, really excited, like: I found the bug! I had so many exclamation
00:28:57.159
points. But actually I was
00:29:03.679
wrong: when you dup, the attributes are actually copied directly by the time the callback is called, so it just didn't
00:29:11.240
work. Oops, sorry. But then, thinking about it more,
00:29:20.519
it turned out I had been messing around in my console for a very long time and had created a state that was reproducing, but
00:29:26.039
that just wasn't it. The truth is a bit complicated, but
00:29:32.000
shapes are an optimization, unless you define your instance variables in so many different orders that it
00:29:37.399
creates too many possibilities. When that happens, Ruby de-optimizes and says: okay, I'm just going to use a hash
00:29:43.480
for that class, because otherwise it makes a gigantic tree. So to reproduce this you had to
00:29:51.159
create so many different insertion orders on that particular class that it would de-optimize the class.
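A sketch of that trigger: hammer one class with many distinct insertion orders. The de-optimization threshold is a CRuby implementation detail and not directly observable from Ruby, so this only shows the pattern, not an assertion about the internal representation:

```ruby
# Give one class lots of different instance-variable insertion orders,
# the pattern that can push its shape tree past the "too complex" limit.
class Shapey; end

names = (0...8).map { |i| :"@v#{i}" }
last = nil
64.times do
  obj = Shapey.new
  names.shuffle.each { |n| obj.instance_variable_set(n, 0) }
  last = obj
end
last.instance_variables.sort  # all 8 ivars are set, whatever the internal layout
```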
00:29:56.240
So finally, after four or five days, I
00:30:10.840
actually had my repro. So, it's just a
00:30:17.360
bug in Ruby, right? It's a bug in Ruby, not in Rails, so we should fix it in Ruby. But
00:30:22.840
Rails 7 was just out; I couldn't say: oh, don't upgrade to Ruby 3.2
00:30:29.000
if you're using Rails 7, it just doesn't work. So we needed to fix it in Rails as well.
00:30:35.679
And as an aside, the actual fix we did in Ruby
00:30:43.120
only shipped like nine months later or something, so it would have been terrible not to fix it elsewhere too. I needed a quick workaround for
00:30:50.440
Mastodon users, I needed a fix in Ruby, and I needed a fix in Rails. At that point I was curious, because
00:30:58.159
at Shopify we had been on 3.2 for three or four months and we had never hit that bug. But that's because we don't use
00:31:06.919
Marshal: we basically banned Marshal for this sort of reason, and we use a library called Paquito instead.
00:31:16.000
So I thought: okay, I'm going to backport something from Paquito into Mastodon. Basically, instead
00:31:22.240
of just doing Marshal.dump, I replaced it with a custom serializer. I'll show you very quickly how it
00:31:28.480
works: every record is basically turned into an array with the class name and the list of attributes of your
00:31:34.360
record, and then we rebuild it cleanly from that; we don't rely on Marshal restoring instance variables in order and everything.
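The idea can be sketched with a plain-Ruby stand-in (names hypothetical; Paquito's real format is richer): dump each record as [class_name, attributes] and rebuild from the attributes on load, so nothing depends on Marshal's instance-variable ordering:

```ruby
# A record class that can be rebuilt from an attributes hash.
class FakeRecord
  attr_reader :attributes
  def initialize(attributes)
    @attributes = attributes
  end
end

def dump_record(record)
  Marshal.dump([record.class.name, record.attributes])
end

def load_record(payload)
  class_name, attributes = Marshal.load(payload)
  Object.const_get(class_name).new(attributes)  # clean, deterministic rebuild
end

copy = load_record(dump_record(FakeRecord.new("id" => 1, "title" => "Shirt")))
copy.attributes  # => {"id"=>1, "title"=>"Shirt"}
```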
00:31:34.360
As an added benefit, it was way faster: the
00:31:39.799
payloads were half the size, and the serialization time was also cut in half, more or less. I then basically
00:31:47.320
did pretty much the same thing for Rails itself. Of course the payload format is different, so there's a migration
00:31:53.000
step, you need to activate the new version and everything, but now it's
00:31:58.360
fixed for everyone in Rails. And I even did kind of a better version, because objects can
00:32:05.200
declare how they're serialized by Marshal. So it's still using Marshal, but not the
00:32:11.279
automatic kind of thing: the object just says, if you want to convert me with Marshal, convert this array with
00:32:16.559
three elements.
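That's Ruby's built-in marshal_dump / marshal_load hook: when an object defines both, Marshal stops walking its instance variables and uses the compact representation instead (the class and its fields here are illustrative):

```ruby
# An object that controls its own Marshal representation.
class CacheEntry
  attr_reader :id, :title
  def initialize(id, title)
    @id = id
    @title = title
  end

  def marshal_dump
    [@id, @title]        # what actually gets written to the payload
  end

  def marshal_load(data) # deterministic rebuild, no ivar-order reliance
    @id, @title = data
  end
end

copy = Marshal.load(Marshal.dump(CacheEntry.new(1, "Shirt")))
[copy.id, copy.title]  # => [1, "Shirt"]
```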
00:32:22.679
Similarly, in Rails, that cut down the payload size and the deserialization time quite a bit. Now I needed
00:32:33.240
to fix it in Ruby. I mean, Tenderlove
00:32:40.200
broke it, so I told him to go fix it. The issue is that when Ruby de-optimizes
00:32:46.600
and says: stop using shapes, I'm going to use a hash, it was using an unordered hash. Technically
00:32:53.039
it's ordered, but not by an order we care about, which is the insertion order. The patch is much more
00:32:58.720
complicated than that, because it's C so you need to fix things everywhere, but basically it changes the type:
00:33:05.200
instead of using an id table it uses an st_table, which is exactly the same hash you're using when you do Hash.new
00:33:10.480
in Ruby, it's really the same thing. The fix was then back-
00:33:17.880
ported in Ruby 3.2.3. So, just
00:33:24.880
to conclude on this, some more takeaways. I really need to
00:33:31.000
double-check before posting, because if you go look up this thread, it's all me saying: oh, I found it! And
00:33:36.320
two minutes later: oh no, I was wrong. I made a bit of a fool of
00:33:41.360
myself. Before, I said that when you get more experience and know
00:33:48.840
the system well, you might be able to take shortcuts; beware, because you might actually discard some information
00:33:54.760
because of that. Also, as I said before, it's way harder to debug something you're not
00:33:59.880
experiencing yourself. It would have been really, really helpful if from day one I had had a
00:34:05.000
corrupted payload; that would have saved three or four days of debugging. And to me that's the big
00:34:10.520
advantage of open source: you can directly debug it yourself, you're not doing a
00:34:17.000
slow back-and-forth with other people. Be very careful with Marshal:
00:34:23.119
it can be convenient, but it's really, really dangerous. Honestly, my
00:34:29.599
advice would be: if the thing that's going to deserialize what you serialized is the same program with the same version,
00:34:35.240
it's fine; but even across two versions of the same program it starts to be very, very dangerous, because maybe
00:34:41.440
it's referencing a constant you removed, things like that. So try to avoid it like the
00:34:49.320
plague. And, a small one for Ruby: semi-deterministic behavior is
00:34:56.280
really the worst, because this is what made it so hard to reproduce. And
00:35:01.680
also, just because you try something in IRB and it works, and maybe it has worked
00:35:07.359
like that for five years, doesn't mean it will continue like that. If the documentation doesn't tell you that the
00:35:13.200
insertion order can be trusted, you shouldn't trust it. There's something called Hyrum's Law that says
00:35:21.240
every single observable behavior of your program will end up being relied upon by some of its users, and that's exactly what was
00:35:28.119
happening. And yeah, it's never a Ruby bug; don't walk out of this talk thinking: oh, the weird thing
00:35:34.920
I'm seeing must be a Ruby bug. It never is, until it is. And the Ruby VM is
00:35:42.520
your code too, because your code's running on it; if it's buggy, your program is buggy. So just be curious, go dig into
how things work and search into it. That's all I had, thank you very much.