Putting Rails in a corner: Understanding database isolation

by Emil Ong

In this video titled "Putting Rails in a Corner: Understanding Database Isolation," Emil Ong provides an insightful discussion aimed at Rails developers dealing with database transactions and isolation levels. He addresses the common issue of inconsistent data appearing in applications even when code is wrapped in transactions, emphasizing the importance of understanding database isolation levels in Rails applications.

Key Points Discussed:

  • Understanding Database Transactions:
    • Transactions are sequences of operations performed on a database. They must follow the ACID properties: Atomicity, Consistency, Isolation, and Durability.
  • Focus on Isolation:
    • Isolation ensures that concurrent transactions appear to be executed serially. This is vital for data integrity.
  • Spectrum of Isolation Levels:
    • The video outlines various isolation levels, from "Read Uncommitted" to "Serializable," explaining the trade-offs between performance and data consistency.
    • Common Isolation Levels Explained:
      • Read Uncommitted: No guarantees; can read uncommitted changes.
      • Read Committed: Only reads committed changes but can still have varying results if read multiple times.
      • Repeatable Read: Ensures consistency in reads within a transaction but can still lead to inconsistencies if not handled properly.
      • Serializable: Full isolation, prevents any anomalies, but can be the most performance-intensive.
  • Rails and Isolation Levels:
    • Since Rails 4, developers can control isolation levels directly in their transactions. However, increasing isolation can lead to performance issues due to increased load and potential deadlocks.
  • Practical Examples:
    • Ong provides a hypothetical scenario using a congratulatory card application to illustrate race conditions that can cause inconsistent data statuses. He explains how simultaneous actions within transactions can lead to unexpected outcomes.
  • Testing and Handling Exceptions:
    • Recommendations on testing strategies that accommodate isolation levels and how to identify and manage rollback scenarios in Rails applications. Testing isolation changes in combination with transactions can pose challenges.
    • Emphasizes that specific isolated transaction tests may require database cleaning strategies instead of standard transaction rollbacks.
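The Rails 4+ control mentioned above is an `:isolation` keyword argument on the transaction call. A minimal sketch of that call shape follows; `FakeActiveRecord` is an invented stand-in that only mimics the signature of `ActiveRecord::Base.transaction` so the example runs without Rails or a database.

```ruby
# Invented stand-in for ActiveRecord::Base.transaction, which accepts
# an :isolation keyword since Rails 4. Only the call shape is real;
# no database is involved in this sketch.
module FakeActiveRecord
  LEVELS = %i[read_uncommitted read_committed repeatable_read serializable].freeze

  def self.transaction(isolation: nil)
    if isolation && !LEVELS.include?(isolation)
      raise ArgumentError, "unknown isolation level: #{isolation}"
    end
    yield # real Rails would run the block in a transaction at this level
  end
end

# Usage: request (simulated) serializable isolation for the block.
result = FakeActiveRecord.transaction(isolation: :serializable) { :card_sent }
```

In real Rails the same shape would be `Card.transaction(isolation: :serializable) { ... }`, with the caveats about performance and retries that the talk covers.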

Conclusions and Takeaways:

  • The Necessity of Isolation Levels: Understanding and configuring database isolation levels is crucial for ensuring data integrity in applications.
  • Performance vs. Consistency Trade-off: There is always a balance to maintain between performance gains and transaction isolation requirements.
  • Ongoing Improvement Needed: Ong advocates for a more accessible way to manage these complexities in Active Record and Rails, implying that community contributions may be necessary to facilitate better practices for developers.

Putting Rails in a corner: Understanding database isolation by Emil Ong

If you've ever had inconsistent data show up in your app even though you wrapped the relevant code in a transaction, you're not alone! Database isolation levels might be the solution...

This talk will discuss what database isolation levels are, how to set them in Rails applications, and Rails-specific situations where not choosing the right isolation level can lead to faulty behavior. We'll also talk about how testing code that sets isolation levels requires special care and what to expect to see when monitoring your apps.

RailsConf 2018

00:00:11.210 so I'll go ahead and get started thanks
00:00:15.150 everybody for coming
00:00:16.379 the title of this talk is putting rails
00:00:18.330 in a corner understanding database
00:00:19.980 isolation and that was originally a
00:00:22.800 reference to a quote from the movie
00:00:25.710 Dirty Dancing but much like a lot of
00:00:28.500 lines from 80s movies that kind of came
00:00:30.840 out of nowhere and doesn't really go
00:00:33.149 anywhere so this isn't really a good
00:00:35.040 Dirty Dancing themed talk but instead
00:00:38.100 you get this lovely botanical
00:00:39.870 illustrations which which I find quite
00:00:42.420 nice so I'm Emil and I'm now a
00:00:48.960 software writer at Binti which is a
00:00:51.500 great company we work in
00:00:57.059 the gov tech space working with foster
00:01:00.270 care agencies with the goal of helping to
00:01:03.239 find every child a family and we're
00:01:07.770 hiring if you want to talk to us about
00:01:10.110 that I'm also on Twitter I've got my
00:01:12.689 handle at the bottom right hand corner
00:01:14.280 if at any time you want to tweet
00:01:17.159 something about the talk I talked about
00:01:19.619 computers on there sometimes but mostly
00:01:21.390 like music theory and jazz and sometimes
00:01:24.119 cute animals cool so let me answer the
00:01:28.110 first question that's on all of your minds
00:01:29.970 right now which is should you try and sneak
00:01:31.979 out or rather who might benefit from this
00:01:35.100 talk so say that you have some
00:01:40.380 controller actions that you wrapped in a
00:01:42.119 and a transaction and you're finding
00:01:44.850 that you're still having inconsistent
00:01:46.950 data afterwards stick around this is
00:01:49.500 probably for you if you have
00:01:53.040 big slow background jobs that use Active
00:01:56.369 Job or something like that
00:01:57.750 and those need to run transactions so
00:02:00.360 it's another good candidate
00:02:03.979 or if you simply write anything to the
00:02:07.140 database that's based on something that
00:02:08.789 you read from the database so really
00:02:11.520 quite general
00:02:13.830 in other words people who need
00:02:15.090 confidence in their data and probably
00:02:17.610 use a SQL database otherwise this
00:02:19.440 might not apply as much but if on the
00:02:23.160 other hand somebody just called you and
00:02:25.080 said that they've got some ice cream and
00:02:26.490 it's gonna melt like totally that sounds
00:02:28.740 better so feel free to duck
00:02:31.680 out great so thanks for staying and let
00:02:41.190 me just make a kind of meta note before
00:02:43.620 we really dive in so this is
00:02:47.580 kind of a dense topic and I
00:02:50.220 think the takeaway here is not that
00:02:52.170 you'd be able to walk away and like
00:02:53.610 immediately write code based on you know
00:02:57.180 setting and configuring database
00:02:58.860 isolation levels so I'd recommend that
00:03:01.260 you just kind of get a sense of the kind
00:03:02.940 of problems that are affected by the
00:03:05.370 space and and how you might think about
00:03:08.130 configuration and then just remember
00:03:10.140 some terms and go back and search for it
00:03:12.720 later cool so let's consider the context
00:03:18.090 that we're talking about so we're
00:03:19.320 talking about transactions we're talking
00:03:21.570 about how they interact with ORMs and
00:03:24.810 especially active record and it was
00:03:29.490 awesome that DHH set this term up in our
00:03:32.010 mind of leaky abstractions because
00:03:33.990 that's what this is all about and why we
00:03:36.930 still need to talk about this so just a
00:03:40.950 reminder on transactions they're really
00:03:44.820 like a sequence of interactions with the
00:03:46.590 database that really you hope have in a
00:03:51.990 lot of cases these following properties
00:03:53.850 so if you've heard the term acid it's an
00:03:56.250 acronym that stands for atomicity
00:03:57.540 consistency isolation durability
00:04:02.070 atomicity means that either everything
00:04:04.410 that you tried to do happened or nothing
00:04:06.630 happened
00:04:08.240 consistency means that when you run a
00:04:10.980 transaction your transaction starts
00:04:14.310 with a consistent state of the database
00:04:16.290 and it ends with a consistent state of
00:04:18.359 the database it might be the same state
00:04:20.070 or it might be a different one but it
00:04:21.390 should be consistent isolation is kind
00:04:26.010 of the primary
00:04:27.340 focus of this talk and I think it's
00:04:29.620 really fair to use your intuition
00:04:31.870 about what isolation means but the
00:04:34.810 official definition of this is something
00:04:37.930 along the lines of if you have a bunch
00:04:39.820 of transactions that are running at the
00:04:41.290 same time they can be run as if they
00:04:44.860 were run one after the other serially
00:04:47.020 and then durability means that the
00:04:50.020 things that you write and commit you
00:04:51.910 expect that they stay
00:04:54.910 there until you delete them at least so
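As a toy illustration of the all-or-nothing atomicity just described, here is a sketch with a plain in-memory store. `ToyStore` is invented for this example and is not Active Record; it only mimics the commit-or-roll-back behavior.

```ruby
# Toy in-memory "transaction": either every write in the block is
# applied, or none are. ToyStore is a stand-in invented for this
# sketch, not a real Active Record API.
class ToyStore
  def initialize
    @data = {}
  end

  def [](key)
    @data[key]
  end

  def []=(key, value)
    @data[key] = value
  end

  def transaction
    snapshot = @data.dup
    yield self
  rescue StandardError
    @data = snapshot # roll back: discard every write made in the block
    raise
  end
end

store = ToyStore.new
store.transaction { |s| s[:signed] = true } # commits normally

begin
  store.transaction do |s|
    s[:sent] = true
    raise "boom" # a failure mid-transaction rolls the write back
  end
rescue RuntimeError
  # the :sent write above was discarded
end
```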
00:04:59.010 what we're going to talk about is how we
00:05:01.000 can control isolation and in effect
00:05:03.780 control consistency so not too much
00:05:06.700 about durability maybe a little bit
00:05:08.260 about atomicity in this talk active
00:05:12.070 record is awesome it gives us this
00:05:13.540 really helpful object model for dealing
00:05:15.580 with our data and it even gives us some
00:05:18.880 controls for transactions so we can
00:05:21.729 start transaction we can roll back a
00:05:23.169 transaction as of rails 4 we can even
00:05:25.750 control the isolation level so that's
00:05:28.740 super helpful but unfortunately
00:05:32.020 it might hide some of the finer points
00:05:33.280 about what's going on with a database
00:05:35.039 that we need to know about to make
00:05:37.510 our code behave like we expect so
00:05:43.380 many of you may have used transactions
00:05:46.240 and you take a block of code which you
00:05:49.720 think should have those kind of acid
00:05:51.910 properties and you wrapped it in an
00:05:53.979 Active Record based transaction block
00:05:56.650 and then we hope that everything goes
00:06:00.160 right well let's take a look at a
00:06:02.950 scenario of an app called the congrats
00:06:06.190 app so this is just an imaginary app but
00:06:10.479 it's an app to send a card from a group
00:06:14.020 of people to one person usually probably
00:06:17.229 to congratulate them so you can go in
00:06:19.660 you can create a card identify a
00:06:21.280 recipient and set a group of people who
00:06:24.669 can sign that card they might be leaving
00:06:26.080 a note and we set a send by date so
00:06:28.960 people can come in and sign and
00:06:31.229 remove their signature or update their
00:06:34.630 note and then we have a send by date so
00:06:39.180 say that
00:06:41.319 not everyone has signed by that send by
00:06:43.330 date we go ahead and send it if you've
00:06:45.879 ever tried to collaborate and get a
00:06:48.039 bunch of people to sign a card for
00:06:49.629 somebody you know there's always some
00:06:50.860 stragglers so we'll just put that into
00:06:52.690 our app so let's say that we have this
00:06:58.000 timeline so Edie gets a promotion that's
00:07:00.310 pretty awesome
00:07:01.930 Edie has a friend named Pat who is
00:07:04.840 really cool and decides that they want
00:07:06.909 to send a congratulatory card to Edie
00:07:10.300 and invites Dana and Rhys so Dana is
00:07:14.590 totally on top of things signs right
00:07:16.419 away but then some drama happens between
00:07:19.960 Dana and Edie I don't know what it is
00:07:22.060 but I'd love to speculate so also it's
00:07:27.370 totally imaginary scenario so I can make
00:07:29.259 it up whatever I want but that's not the
00:07:31.389 important bit the important bit is that
00:07:32.800 the next thing that happens is that Rhys
00:07:35.590 finally signs and Dana at the exact same
00:07:38.860 time decides to remove their signature
00:07:41.669 so we've set up our app so that Edie is
00:07:45.400 getting a card but when are they
00:07:49.030 gonna get it whose signatures are gonna
00:07:50.919 be on it it's not really obvious from
00:07:53.590 this right so let's zoom in on that
00:07:56.639 somewhat that simultaneous event so this
00:08:00.490 could be one way that happens so Rhys
00:08:02.110 clicks the sign button it comes to
00:08:05.529 our controller we say load the card with
00:08:08.139 the signatures add Rhys's signature see
00:08:11.289 if everybody signed and if if everybody
00:08:14.650 has sign then go ahead and send the card
00:08:15.969 at the same time
00:08:18.099 Dana clicks like remove me I no
00:08:21.159 longer want to be associated and we load
00:08:24.130 their signature and then delete it so
00:08:27.400 it could happen this way
00:08:28.810 depending on you know garbage collection
00:08:33.070 or whatever it could happen this way
00:08:35.880 who knows so what's supposed to happen
00:08:39.849 in either of these scenarios what do we
00:08:42.370 intend to happen well it could go two
00:08:44.950 ways as far as I can see one is that we
00:08:47.829 don't send a card until the send by date
00:08:49.390 and only the people who intended to sign
00:08:51.490 during that entire time
00:08:52.769 are represented and I think that's
00:08:55.170 probably what I would want
00:08:56.850 so like Dana's signature is not on there
00:08:59.519 but everybody else's is but we could
00:09:04.019 also end up sending the card before the
00:09:06.899 send by date and still having deleted
00:09:09.480 data still asserted in our database so at some
00:09:12.660 point not only have we
00:09:16.049 misrepresented what everybody wanted in
00:09:18.929 our database we have a card that was
00:09:20.819 sent before the send by date
00:09:22.649 without all the signatures so that seems
00:09:25.559 bad so why why can we end up in these
00:09:30.689 inconsistent states well when we
00:09:35.040 actually do the sending we're assuming
00:09:37.350 that the context from before hasn't
00:09:40.529 changed we just made a decision based on
00:09:43.139 on that context so the thing that we
00:09:48.839 have to know is even though we talked
00:09:51.299 about those ACID transactions what went
00:09:55.049 wrong is that the database actually
00:09:57.449 makes trade-offs the
00:10:01.319 database has relaxed isolation to improve
00:10:03.540 concurrency and performance so
00:10:06.420 the database says in certain cases we
00:10:09.089 don't have to be as strict and I put
00:10:11.040 this really cute dog photo on there so
00:10:13.230 that you remember this really key
00:10:15.660 important part so remember the
00:10:17.790 blissed-out dog
00:10:21.179 so there's actually a spectrum of
00:10:24.299 isolation that we can choose and it
00:10:26.339 trades off between performance cost and
00:10:29.660 how isolated you are and don't try
00:10:33.540 and read all these right now we'll go
00:10:34.829 through them but you are always in one
00:10:38.910 of these isolation levels no matter what
00:10:41.009 and depending on which database you're
00:10:44.220 in you might be in one by default or
00:10:46.649 another one by default and also
00:10:48.509 depending on which database you're in it
00:10:50.970 might implement these differently
00:10:52.350 because the requirements are only
00:10:53.910 minimum requirements each at every level
00:10:57.179 it could actually be more isolated than
00:10:59.489 is specified so let's go through these
00:11:03.529 read uncommitted
00:11:06.100 first of all there's no guarantee about
00:11:07.930 isolation there's nothing
00:11:10.480 that is required of the database to
00:11:12.639 isolate your transaction so you can in
00:11:14.860 fact even read rows that have been
00:11:18.190 updated or inserted in other
00:11:19.779 transactions that haven't even committed
00:11:21.670 yet and that means that if you read data
00:11:25.810 during your transaction if you read it
00:11:29.620 again it could be totally different
00:11:30.850 there's no warning and you don't know
00:11:32.980 where it came from and in fact it might
00:11:34.660 even roll back so really you
00:11:38.740 have no guarantees whatsoever I can't
00:11:42.009 think of a really good reason to use
00:11:43.569 this for production code but I have
00:11:44.980 actually used it for sneaking a peek at
00:11:47.199 what's going on in MySQL in
00:11:49.089 production if I have a really long-running
00:11:50.940 transaction so the next level up is
00:11:57.339 read committed so in this case when you
00:12:02.350 read something from the database you're
00:12:04.120 guaranteed that it has at least been
00:12:05.980 committed by another transaction however
00:12:10.000 that means that you still may
00:12:12.550 be able to read data twice and it be
00:12:15.550 different without you having changed
00:12:18.160 anything in that transaction so it
00:12:21.250 still doesn't sound super isolated but
00:12:23.769 it's better than reading somebody's kind
00:12:26.079 of like half done work and then there's
00:12:28.930 no real warning if anything
00:12:31.449 happened and this is what Postgres
00:12:33.670 chooses as its default repeatable read is
00:12:36.850 the next level up so this means that
00:12:38.829 once you've read a row if you try and
00:12:42.009 read it again you'll only see changes
00:12:46.120 that you've made to it so at this point
00:12:48.399 we're now in a place where the things
00:12:50.740 that we kind of touched during our
00:12:52.449 transaction are now consistent within
00:12:55.500 the transaction so that's a
00:12:58.360 pretty nice guarantee and what happens
00:13:02.110 is if you read something and then start
00:13:06.220 updating things the database will warn
00:13:08.649 you and say hey you just did something
00:13:11.740 based on something that changed and
00:13:13.930 committed to a different state by the
00:13:15.760 time you're done with your transaction
00:13:17.500 so you may want to
00:13:19.840 do something else so it won't fix anything
00:13:21.400 it's not an automatic fix but it is a
00:13:25.270 really helpful warning you can still end
00:13:28.060 up with inconsistent data in
00:13:30.940 this level it's a little bit harder but
00:13:33.970 it's definitely possible and then this
00:13:37.450 is what MySQL chooses as its default
00:13:39.610 although as I said it's
00:13:41.580 implemented differently from Postgres
00:13:44.250 serializable is basically full isolation
00:13:48.190 so transactions can only happen in a way
00:13:52.510 that they could be they could have been
00:13:54.940 written serially so this is hopefully an
00:13:58.510 implementation of that full I in the
00:14:03.310 ACID spec I could not
00:14:03.310 think of a way to get inconsistent data
00:14:05.710 without like incorrect code in this with
00:14:10.030 this level but if you think of a
00:14:12.160 way or you know of a way let me know
00:14:13.750 I think it would be really interesting
00:14:15.040 and then this is the most expensive
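A toy sketch of how a serializable database can notice that a transaction's assumptions changed: each read remembers a version, and a commit against a stale version is rejected. `VersionedRow` is invented for this example; real databases track read and write sets internally.

```ruby
# Toy optimistic row illustrating a serialization failure: a commit
# whose read version is stale is rejected, like the warnings the
# talk describes. VersionedRow is invented, not a real API.
class VersionedRow
  attr_reader :value, :version

  def initialize(value)
    @value = value
    @version = 0
  end

  def read
    [@value, @version] # remember which version our decision was based on
  end

  def commit(new_value, read_version)
    if read_version != @version
      raise "serialization failure: your assumptions changed, retry"
    end
    @value = new_value
    @version += 1
  end
end

card = VersionedRow.new(%w[Pat Dana])
_, rhys_version = card.read # Rhys's transaction reads the signatures
_, dana_version = card.read # Dana's transaction reads concurrently

card.commit(%w[Pat Dana Rhys], rhys_version) # Rhys commits first

warning = begin
  card.commit(%w[Pat Rhys], dana_version)    # Dana's read is now stale
  nil
rescue RuntimeError => e
  e.message # the "your assumptions changed" warning from the talk
end
```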
00:14:18.850 let's go back to our scenario and see
00:14:21.690 what would happen if we used repeatable
00:14:25.720 read or serializable with this
00:14:30.130 scenario that happened before so if we
00:14:32.950 got to the end of the transaction in
00:14:35.380 Rhys's transaction and say Dana's
00:14:37.510 committed before then we would get a
00:14:41.680 warning from the database that says hey
00:14:43.690 sorry your assumptions changed whatever
00:14:45.670 you did you can't rely on that we're
00:14:48.790 gonna roll back everything and try
00:14:50.770 again so that's at least a helpful
00:14:54.730 warning to you to help you know that you
00:14:57.160 have to do something else it's also
00:15:00.190 possible that Dana could have gotten a
00:15:02.350 rollback and said sorry we couldn't have
00:15:04.210 deleted your signature before we sent
00:15:06.970 the card but at least we're able to
00:15:08.950 warn the user in that case so everything
00:15:13.810 I talked about before is basically just
00:15:17.410 SQL we didn't really need to know
00:15:20.800 anything about Rails or Ruby but as I
00:15:24.910 mentioned you can enable isolation in
00:15:29.290 rails as of rails 4 and what you do is
00:15:32.530 when you have a transaction block you
00:15:36.930 specify an isolation level as an
00:15:41.050 argument to the transaction call so
00:15:42.910 cool we're all done right it was less
00:15:46.090 than one line of code this whole
00:15:47.530 talk is simply a parameter change yeah
00:15:50.950 not quite so as I mentioned as you
00:15:54.280 increase the the isolation guarantees
00:15:58.120 your performance may suffer and it may
00:16:00.970 suffer because well there is
00:16:02.980 actually more load on the database as
00:16:04.720 it may hold locks things like that
00:16:07.020 you may have to repeat transactions but
00:16:10.750 that's the cost of doing business there
00:16:12.730 and depending on how you implement
00:16:15.130 things you may introduce deadlocks it's
00:16:17.680 not the worst thing in the world it
00:16:19.360 actually looks a lot like those
00:16:20.910 serialization errors that we saw before
00:16:22.900 but it's another thing that might come
00:16:25.000 up so what's actually special about ORMs
00:16:31.960 and Active Record well there's this
00:16:36.640 great abstraction that reads from the
00:16:40.120 database for you at times that
00:16:45.340 allow caching and then it caches things
00:16:47.140 and it also writes to the database and
00:16:49.990 it may do that at any time so you
00:16:51.640 actually don't always really know if the
00:16:56.530 data that you're
00:16:58.180 reading has been read into a model yet
00:17:01.390 so we have to go through special
00:17:03.670 procedures like preloading and eager
00:17:06.310 loading or refreshing so in order to get
00:17:10.540 this great facility from the database
00:17:12.880 that warns us when our isolation
00:17:17.080 has a problem we need to give a
00:17:20.560 hint to the database to tell it hey
00:17:23.760 I'm using this data and I'm making
00:17:26.560 assumptions based on it and my logic my
00:17:29.770 application logic needs that so let's
00:17:34.180 consider how we might implement that
00:17:36.100 send by dates that we mentioned so one
00:17:40.060 thing that we could do is use an active
00:17:41.590 job
00:17:42.040 implementation and if you're familiar
00:17:44.020 with active job it's really cool or if
00:17:47.110 you're not it's also cool one thing that
00:17:51.160 you can do is call these jobs and say
00:17:53.530 perform this later on this model so
00:17:55.960 maybe we've got a card model and we say
00:17:58.980 send the card at the send by date later
00:18:02.350 on with with this card and it does this
00:18:05.680 really cool magic underneath the hood it
00:18:07.870 says okay this is a card I'm going to
00:18:10.180 take an ID stick that into whatever our
00:18:13.900 store is for our background job
00:18:16.540 queue maybe it's Redis or
00:18:19.480 something like that and then
00:18:21.400 when I call you back in this
00:18:23.560 perform method I'm going to re-create that
00:18:27.850 card for you and pass it in
00:18:32.130 unfortunately what that means is that
00:18:34.060 the read from the database happened
00:18:36.520 outside the transaction so we thought we
00:18:39.280 could use the serializable isolation
00:18:42.460 level for the body of this
00:18:45.910 active job but it turns out that the
00:18:48.250 database has no way of knowing that
00:18:49.930 we're using the data in the card to make
00:18:52.570 our assumptions so one thing that we can
00:18:55.840 do to improve that unfortunately we have
00:18:57.700 to kind of not let Active Job get the
00:19:02.230 card for us but it's not too bad we
00:19:05.290 can just introduce an additional
00:19:07.440 line here pass in the card ID
00:19:11.520 to perform the job later and then
00:19:15.310 grab it within the transaction so now
00:19:18.040 the database knows hey this is
00:19:20.410 something that we're using for our logic
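The change just described — enqueue the card's ID so the read happens inside the job — can be sketched with plain Ruby stand-ins (an array for the job store, a hash for the cards table; none of this is the real Active Job API):

```ruby
# Sketch: enqueue only the card's ID; the job re-reads the record
# itself, so in real code that read could happen inside the isolated
# transaction. QUEUE and CARDS are invented stand-ins, not Rails.
QUEUE = [] # stands in for Redis or whatever backs the job queue
CARDS = { 42 => { sent: false } }

def perform_later(card_id)
  QUEUE << card_id # only the ID crosses the queue boundary
end

def perform
  card_id = QUEUE.shift
  # real code: Card.transaction(isolation: :serializable) do ... end
  card = CARDS.fetch(card_id) # the read now happens inside the job
  card[:sent] = true
  card
end

perform_later(42)
sent = perform
```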
00:19:24.120 but there's a problem what if we sent a
00:19:28.720 card in the actual mail and then the
00:19:30.550 database told us hey you shouldn't have
00:19:33.010 done that
00:19:34.300 somebody's going to have to go digging
00:19:35.470 through the post box and that's not
00:19:37.510 great so we can try something else so
00:19:43.270 this is another technique that you can
00:19:44.440 use which is we'll still continue to pass
00:19:47.590 the card ID and we'll still grab it
00:19:49.510 within the course of the transaction and
00:19:51.280 then maybe we'll add a boolean field to
00:19:54.250 the card table
00:19:55.810 that says are we committed to send and
00:19:59.260 what we want to do in terms of
00:20:02.320 the logic here is we want to say
00:20:05.140 we're only going to send the card in the
00:20:07.870 mail when this field goes from false to
00:20:10.300 true never again so it should happen
00:20:14.670 really at most once and depending on
00:20:17.650 how we deal with our jobs it might
00:20:19.450 happen at least once as
00:20:21.130 well but here we kind of guarantee that it
00:20:22.930 happens at most once and so in this
00:20:28.300 second line here with the return
00:20:30.700 we're making a decision based on
00:20:33.220 what we read from the database yeah is
00:20:35.110 this committed to send if not then
00:20:37.900 continue on say we are committed to send
00:20:40.150 and then the only way that we can either
00:20:44.550 exit the transaction and continue on to
00:20:47.500 the next line or exit the transaction
00:20:49.270 without a rollback is if our
00:20:52.570 assumption went from false to true and
00:20:54.070 we use the database to helpfully get us
00:20:56.980 from one state to the next just
00:20:59.710 once you've got to watch out for loops too
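That false-to-true transition can be sketched in plain Ruby. A hash stands in for the card row; `committed_to_send` is the hypothetical boolean column described above, and in real code the check-and-set would run inside the isolated transaction:

```ruby
# At-most-once guard: only the caller that flips committed_to_send
# from false to true proceeds to send the card. The hash stands in
# for a row in a hypothetical cards table.
def mark_committed_to_send(card)
  return false if card[:committed_to_send] # someone already claimed the send
  card[:committed_to_send] = true          # false -> true happens only once
  true
end

card = { committed_to_send: false }
first_attempt  = mark_committed_to_send(card) # claims the send
second_attempt = mark_committed_to_send(card) # sees true, backs off
```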
00:21:04.110 so one thing that you might think about
00:21:06.460 doing is sending like a reminder email
00:21:08.620 to everybody who hasn't signed a card
00:21:09.910 yet and it's before the send by date
00:21:11.950 so one thing you might think is I'm just
00:21:14.710 gonna do this loop in a transaction
00:21:18.130 do them all at once and that might be
00:21:21.070 safe and it might be okay too
00:21:23.110 if you know that this doesn't happen often or
00:21:25.330 if there's not very many cards like that
00:21:26.980 or not very many reminders to send out
00:21:29.550 but you may also experience a lot of
00:21:31.990 rollbacks so really you've got
00:21:35.260 to watch out for the surface area of
00:21:36.850 your transactions so it's better if
00:21:39.850 you don't make a lot of assumptions and
00:21:41.830 then make a lot of actions based on
00:21:44.410 those assumptions so classic advice keep
00:21:47.320 your transactions small if you
00:21:49.960 do decide to do that though at least try
00:21:52.060 and put an ordering a consistent
00:21:53.530 ordering on the way that you access
00:21:54.700 resources here we're only accessing them
00:21:57.070 from one table but we at least should
00:22:01.630 try and access them in say an ID order
00:22:04.510 or something that's consistent between
00:22:06.150 transactions
00:22:07.680 otherwise you could end up with
00:22:09.540 deadlocks now I can't talk too much
00:22:11.040 about that but that's another thing that
00:22:12.330 I would recommend digging into a little
00:22:14.760 bit more if you haven't come across
00:22:16.460 database deadlocks it's another huge
00:22:19.680 topic so maybe a better idea would be to
00:22:25.910 go and get all of the cards that you
00:22:33.030 think might need a
00:22:34.950 reminder and then do a transaction
00:22:34.950 around each one and then explicitly do a
00:22:37.680 reload within the transaction block to
00:22:40.350 say hey go get this from the database
00:22:43.320 again tell it that we're using this data
00:22:45.270 and then make our decision based on what
00:22:49.620 we read within the transaction and then
00:22:51.929 send it if need be a little trick you
00:22:56.670 can do is just select the ID and then if
00:22:58.740 you do a reload it will actually reload
00:23:00.270 all the fields for you so we talked
00:23:04.770 about how to configure and use all of
00:23:07.830 the database isolation levels
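The reminder-loop advice above — fetch candidate IDs first, then re-read each card inside its own small transaction before acting — can be sketched with hashes standing in for Active Record (`DB` and the reload step are invented for this example):

```ruby
# Sketch: cheap pre-filter outside any transaction, then a short
# per-card "transaction" that re-reads (reloads) the row before
# deciding, so decisions are based on fresh data. DB is a stand-in.
DB = {
  1 => { signed: false },
  2 => { signed: false },
}

candidate_ids = DB.select { |_, card| !card[:signed] }.keys # possibly stale

DB[2][:signed] = true # card 2 gets signed before we process it

reminded = []
candidate_ids.each do |id|
  # a per-card transaction would open here; reload the row inside it
  card = DB[id]         # stands in for card.reload
  next if card[:signed] # re-check the assumption against fresh data
  reminded << id        # send the reminder for this card only
end
```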
00:23:10.980 but how do you actually identify this
00:23:12.179 rollback so you can make a decision
00:23:13.890 about whether or not you should do
00:23:16.140 something well if you're running a
00:23:23.100 transaction what will happen is
00:23:24.870 Active Record will throw this
00:23:27.179 active record statement invalid error
00:23:31.100 unfortunately at the current state of
00:23:33.960 things the exact nature of that
00:23:37.950 statement invalid could be anything it
00:23:40.890 could be that your SQL is invalid or
00:23:42.870 something else so active record does
00:23:46.170 provide this cause field on this
00:23:49.110 statement invalid error and then
00:23:51.870 what's attached is database driver
00:23:54.570 specific not just database specific
00:23:57.290 Postgres is pretty well typed and
00:24:00.929 then within there you can decide to
00:24:03.929 retry or not retry to
00:24:07.620 paraphrase and MySQL has done
00:24:16.160 less with the inheritance tree here so
00:24:19.080 it's just Error and these also might
00:24:21.630 be different for JDBC as well so you
00:24:24.750 really have to the best way to do this
00:24:26.970 is try and create scenarios where you
00:24:29.700 will run into isolation errors and see
00:24:33.300 what's thrown
00:24:38.040 unfortunately we haven't talked at all about
00:24:40.890 tests but I'm sorry this leads to trouble with tests
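Before moving to tests, the cause-inspection pattern just described can be sketched with invented stand-in error classes so it runs standalone; in a real app the cause would be a driver-specific class (for Postgres, a serialization-failure error from the pg gem):

```ruby
# Stand-ins for ActiveRecord::StatementInvalid and the driver-level
# error it wraps; invented here so the sketch runs without Rails.
class StatementInvalid < StandardError; end
class SerializationFailure < StandardError; end

def run_transaction
  raise SerializationFailure, "could not serialize access"
rescue SerializationFailure
  # Re-raising inside a rescue makes Ruby attach the original error
  # as #cause, which is how Active Record surfaces the driver error.
  raise StatementInvalid, "statement failed"
end

retryable = begin
  run_transaction
  false
rescue StatementInvalid => e
  e.cause.is_a?(SerializationFailure) # decide whether to retry
end
```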
00:24:43.800 so between each test for most tests
00:24:48.270 ideally what you can do is set up a
00:24:50.700 transaction before the test runs run all
00:24:53.730 your logic and then when it's done roll
00:24:55.650 it back and get back to the state that
00:24:57.630 you started out in possibly empty or
00:25:00.360 possibly with some seed data and that's
00:25:02.580 great it's super fast and it's a
00:25:06.210 really consistent way of dealing with
00:25:08.700 test data unfortunately once a
00:25:14.160 transaction has been started you cannot
00:25:16.140 change the isolation level so if
00:25:20.040 you're testing code that changes the
00:25:22.080 isolation level you're probably
00:25:24.990 going to want to use deletion or
00:25:27.750 truncation strategies for your database
00:25:29.550 cleaner so I wrote a little poem this is
00:25:35.820 just to say that I've slowed the tests
00:25:37.320 that were in transactions and which
00:25:39.090 you were probably hoping would remain fast
00:25:40.920 forgive me they were so simple and so
00:25:43.080 clean and so clear but now they're not
00:25:45.500 sorry also apologies to William Carlos
00:25:48.450 Williams so some more testing considerations
00:25:52.050 how do you actually test the concurrency
00:25:53.880 here it's actually really hard I don't
00:25:55.440 have a great answer for you and I'm not going to
00:25:57.060 lie so one thing you can do is load
00:26:00.000 testing this is more likely to come up
00:26:02.160 with load testing and then the other
00:26:05.760 thing you can do is kind of manually
00:26:08.610 test it and this really stinks but you
00:26:12.180 can try and add random sleeps to things
00:26:14.540 just to prove at least to yourself what
00:26:17.610 kinds of errors come up and that they're
00:26:19.080 handled yeah I can't say sorry
00:26:22.560 enough for that but it totally works
00:26:24.630 and it works really consistently but
00:26:26.130 it's really hard to automate so if
00:26:29.100 anybody comes up with some great ideas
00:26:30.240 for testing let me know
00:26:33.059 so let's do a quick review so remember
00:26:39.070 my happy dog slide databases try to
00:26:43.900 trade isolation for performance the
00:26:47.230 database and active record will let you
00:26:49.419 choose the level but choosing it may
00:26:52.960 require code and test changes and
00:26:56.110 it's a performance hit it's not just a
00:26:58.179 performance hit and I say it's worth it
00:27:00.700 for the correctness I mean when I say
00:27:02.919 it's worth it I mean it's required
00:27:04.620 there's not really much else you can do
00:27:07.240 you just kind of have to do it so yeah
00:27:11.700 so I added this slide after I saw both
00:27:17.140 DHH's and Eileen's talks and I
00:27:19.870 realized as I was going through this
00:27:22.030 talk this kind of sucks yeah yeah this
00:27:25.059 is really hard and when you saw like the
00:27:27.070 final form of that job I hope
00:27:30.400 that like I don't know if you like me
00:27:33.370 it's like wow this really looks like
00:27:34.990 threading code and that's really kind of
00:27:37.919 rough so I don't think it has to be this
00:27:41.919 way I think we can do more with this and
00:27:44.500 I think the more people understand this
00:27:47.980 and kind of get an idea of the use cases
00:27:50.230 for this maybe we can do better and try
00:27:52.960 and push some of this stuff further down
00:27:54.880 the stack maybe we can figure out when
00:27:57.370 we entered a transaction that
00:27:59.890 was very specific about the
00:28:02.110 isolation level and we explicitly reread
00:28:05.020 things so that we can hint to
00:28:08.740 the database that we made
00:28:10.929 assumptions on that data something like
00:28:12.580 that but we can definitely compress the
00:28:15.669 concept down into Rails and Active Record
00:28:17.980 I'm not totally sure how yet but there's
00:28:20.080 definitely work that we
00:28:22.270 could be doing with that cool so
00:28:28.539 thanks to SlidesCarnival whose
00:28:30.789 nice slide deck I used under Creative
00:28:33.370 Commons Attribution and thanks everybody