Static Type Checking in Rails with Sorbet

by Hung Harry Doan

In the talk titled "Static Type Checking in Rails with Sorbet," Hung Harry Doan, a Staff Software Engineer at the Chan Zuckerberg Initiative, explores how to implement static type checking in Ruby on Rails applications using the tool Sorbet. The presentation covers a variety of aspects related to Sorbet and its integration into a codebase that comprises over 2,500 files, emphasizing the following key points:

  • Intro to Static Type Checking: The discussion begins with an introduction to static type checking and its significance in catching errors early, improving code maintainability, and enhancing developer productivity.
  • Sorbet Overview: Sorbet is highlighted as a fast and powerful static type checker specifically designed for Ruby, which supports gradual typing and can be integrated with text editors for real-time type checking.
  • Real-World Application: The speaker shares insights into the experiences of adopting Sorbet across their large codebase, demonstrating how it was able to uncover subtle bugs, such as misspelled method calls that would typically go unnoticed until runtime.
  • Challenges with Rails: Doan discusses the challenges posed by metaprogramming in Rails when it comes to type checking, as Rails often generates methods dynamically, complicating type inference.
  • Introduction of Sorbet-Rails Gem: To address the unique challenges posed by Rails, the creation of the Sorbet-Rails gem is introduced, which helps bridge the gap between Sorbet and Rails applications. This gem facilitates the generation of method signatures, thereby enhancing the ability to type check dynamic Rails methods.
  • Implementation Strategies: Key strategies for implementing Sorbet in a Rails environment are shared, including generating RBI (Ruby Interface) files, leveraging Rails reflections for dynamic methods, and adopting type checking gradually.
  • Adoption Metrics: The speaker discusses metrics for measuring the adoption of type checking within their team, including method call coverage and the number of files type-checked, along with strategies to maintain engagement and support among team members.
  • Conclusion and Insight: The talk concludes with a recommendation to the audience to explore Sorbet for their own projects, emphasizing the potential for improved code quality and developer efficiency. The presenter invites collaboration and contribution to the Sorbet community, highlighting the positive impact it has had on their workflow.

In summary, Hung Harry Doan provides a compelling overview of the benefits and challenges of incorporating static type checking into Ruby on Rails applications using Sorbet, offering practical advice and insights for teams considering similar implementations.

"Sorbet is a powerful static type-checking tool for Ruby. Type checking can catch errors early and improve the developer experience—and with the right tools, it can even be used with dynamic methods in Rails.

In this talk, we’ll cover the basics of type checking and share the lessons learned from our adoption of Sorbet type checking across our 2,500-file codebase in just six months. We’ll talk about the technical challenges we faced and how we solved them with a new Sorbet-Rails gem, which makes it easy to gradually integrate type checking into your Rails app & improve your team’s efficiency."

__________

Harry Doan is a Staff Software Engineer at the Chan Zuckerberg Initiative. He’s passionate about building great developer productivity tools and applications for the community. He created Sorbet-Rails to help Rails developers benefit from static type checking. In his free time, he can be found meditating or studying Buddhist teachings.

RailsConf 2020 CE

00:00:08.599 Hello everyone, thanks for tuning into RailsConf. Today I would like to talk to
00:00:14.510 you about static type checking in Rails using an awesome tool called Sorbet. We are
00:00:21.110 going to cover three main things today. First, we will talk about what static
00:00:26.179 type checking is and the benefits of it. Afterwards, I will walk you through how
00:00:31.759 we use Sorbet to type check our Rails app, which is going to be a fun challenge.
00:00:37.809 Lastly, I will give you some tips and tricks for how you can drive adoption
00:00:43.550 of Sorbet in your team. My name is Hung Doan. I am a staff engineer
00:00:49.969 at the Chan Zuckerberg Initiative. Harry is a name I picked for myself
00:00:55.219 because I really enjoy Harry Potter. I enjoy Harry Potter so much that I will be
00:01:01.039 using wizard examples throughout this talk. CZI is a new kind of philanthropy
00:01:07.729 that applies technology to solve challenging social problems. We have
00:01:14.509 three main initiatives: one, working across science and disease research; two,
00:01:20.150 education; and three, working in the areas of justice and opportunity. I work on the
00:01:27.829 education initiative, where we build the Summit Learning program, a personalized
00:01:33.290 education platform for any K-12 school in the U.S. It empowers students to learn
00:01:39.860 at their own pace in the way that works best for them. Sorbet can help with
00:01:46.880 codebases and teams that are large and small, old and new. To give you a sense of
00:01:52.579 our scale, our team has about 35 engineers. We've been developing this
00:01:57.740 platform for six years now. It is built on top of Ruby on Rails, with more than
00:02:03.290 2,000 Ruby files and more than 150,000 lines of Ruby. It is a reasonably sized codebase and
00:02:11.360 team. As the team has expanded, we have seen challenges that I think are typical
00:02:18.199 of any engineering team, such as maintaining legacy code built years ago,
00:02:23.310 making big code changes safely and reliably, and onboarding new engineers. This also becomes a challenge
00:02:31.240 as the codebase grows and the features become more complex. Because of that, we
00:02:37.120 have been thinking about different tools that can improve the success of our team.
00:02:43.170 So why Sorbet? Let's look at the code example below. I define a levitate
00:02:49.870 function that takes a caster as an argument, followed by three calls to the function. Only one of them is valid, but
00:02:58.000 Ruby does not complain at all. The errors are only revealed when the code is run.
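
The slide with the levitate example is not part of the transcript. The following is a minimal plain-Ruby sketch of the situation described, in which the `Wizard` struct, the method body, and the three calls are all assumptions made for illustration. It shows that Ruby loads the file without complaint and that the two invalid calls fail only once they are executed.

```ruby
# Hypothetical reconstruction: the Wizard struct, the method body, and the
# three calls below are assumptions, not the code shown in the talk.
Wizard = Struct.new(:name)

def levitate(caster)
  "#{caster.name} casts Wingardium Leviosa"
end

# Runs a call and returns either its result or the name of the error class
# that was raised while executing it.
def outcome
  yield
rescue NoMethodError, ArgumentError => e
  e.class.name
end

def levitate_outcomes
  [
    outcome { levitate(Wizard.new("Harry")) }, # valid call
    outcome { levitate("Harry") },             # a String has no #name method
    outcome { levitate }                       # the argument is missing
  ]
end
```

Nothing here fails at load time; the problems surface only when `levitate_outcomes` actually runs the calls, which is the gap a static checker is meant to close.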
00:03:03.960 Wouldn't it be great if there was a way to detect the bugs automatically? It would
00:03:10.300 help us find errors early, reduce time testing code, and increase time focusing
00:03:16.390 on building features. Static type checkers do exactly that. It is not a new
00:03:23.230 idea to use static type checking in a dynamic language. There are successful
00:03:29.050 tools widely used for other languages, such as Flow and TypeScript for
00:03:34.270 JavaScript, or Pyre and mypy for Python. One
00:03:40.060 of the most exciting things last year for me was the release of Sorbet, the type
00:03:45.400 checker open-sourced by Stripe. Sorbet is fast and powerful: it can process a hundred
00:03:52.480 thousand lines of code per second and can be integrated with editors to
00:03:57.610 perform type checking as you write code. Sorbet is designed to support gradual
00:04:03.820 typing: you can add type checks to parts of your codebase, benefit from typing immediately, and add more files as
00:04:11.440 you like. It also comes with a runtime type checking component. This sets it
00:04:17.770 apart from tools like Flow and TypeScript in keeping the type definitions correct
00:04:23.440 and up to date with the code. And the tool is production-ready: it has been
00:04:29.110 battle-tested in production with hundreds of engineers at Stripe for the last two
00:04:34.510 years. Sorbet can eliminate common errors like typos,
00:04:40.750 NoMethodErrors, and ArgumentErrors. The type information also makes code a lot
00:04:47.260 easier to understand and makes it faster to onboard new engineers. At its core,
00:04:55.210 Sorbet relies on method signatures to do type checking. A signature defines the types
00:05:02.020 of the parameters and the return value of a method. Sorbet uses this
00:05:08.020 information to enforce that the method is used correctly. And it really works.
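
To make the idea of a method signature concrete, here is a small sketch using Sorbet's `sig` syntax. The `Wizard` and `Spellbook` classes are hypothetical, and the `rescue LoadError` fallback is an assumption added only so the snippet still loads when the `sorbet-runtime` gem is not installed; it is not part of Sorbet.

```ruby
# typed: true
begin
  require "sorbet-runtime"
rescue LoadError
  # Fallback for this sketch only: without the gem, `sig` becomes a no-op
  # so the file still loads. Real projects simply depend on sorbet-runtime.
  module T
    module Sig
      def sig(*); end
    end
  end
end

# Hypothetical class used as the parameter type.
class Wizard
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class Spellbook
  extend T::Sig

  # The signature declares the parameter type and the return type once,
  # right next to the method it describes.
  sig { params(caster: Wizard).returns(String) }
  def levitate(caster)
    "#{caster.name} casts Wingardium Leviosa"
  end
end
```

With a signature like this in place, a call such as `Spellbook.new.levitate("Harry")` is the kind of mistake the static checker is expected to report before the code runs, since a String is passed where a `Wizard` is declared.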
00:05:16.740 Since we integrated Sorbet, it has had a tremendous impact on our team, and it was
00:05:23.140 immediate. Below is an example bug that Sorbet found in our code. Someone tried
00:05:31.090 to call academic_year on the site object. However, this method didn't exist.
00:05:37.150 The correct method name was current_academic_year. This bug was subtle and
00:05:43.480 hard to spot by reading the Ruby code, but Sorbet caught it. As we expanded Sorbet usage,
00:05:52.630 it found more bugs in our code that we didn't know about. There was dead code
00:05:58.420 accessing classes or functions that were already removed.
00:06:03.760 There were bugs caused by incomplete renames of functions. Bugs like these
00:06:09.580 lower the quality of the code and slow down engineers working on those product
00:06:15.190 areas. However, when we started out, things were not easy.
00:06:20.860 Sorbet could understand Ruby syntax and came with support for core Ruby APIs
00:06:26.800 like Array, String, and Hash, but it offered very little support for Rails out of the box.
00:06:34.570 Rails code is not easy to type check, and let me explain why. First, it is a big
00:06:41.950 framework with many functionalities. Sure, that is one of the challenges. But more
00:06:47.890 than that, Rails relies heavily on metaprogramming to offer its core functionalities. What
00:06:56.630 is metaprogramming? It is a way to program where you generate methods
00:07:02.230 dynamically using code instead of writing them by hand. For example, I have
00:07:09.920 an Oracle class here. I'm not writing the answer method, but it exists: it is
00:07:16.250 created by calling make_magic. If make_magic is not called, answer is not
00:07:21.980 created and the last line produces an error. Now you can already see here how
00:07:29.150 this can be a challenge for a static type checker: sometimes it doesn't even
00:07:34.790 know that a method exists, let alone what the method does. Let's look
00:07:43.310 at an example in Rails. Here I have Wizard, a typical Rails model. It has an
00:07:51.470 object-relational mapping definition and a database backing it up. With this much
00:07:59.570 code, Rails is able to create a fully fledged model with many functionalities.
00:08:05.350 You can query the model for records in the database. You can access the
00:08:10.910 attributes of the database table. You can also access the associations to retrieve
00:08:18.230 data from other tables. All these methods are generated dynamically. But how does
00:08:27.050 Sorbet see the code? With some custom
00:08:32.510 method signatures, Sorbet can see that Harry is a Wizard. However, it doesn't
00:08:39.919 know much about the attributes: to Sorbet, name and house are untyped. Same with the
00:08:49.070 associations: they are untyped as well. But this is quite typical day-to-day Rails
00:08:55.580 code. It is not very useful to type check your code when nothing is typed, is it?
00:09:02.950 If only Sorbet knew the types of these Rails methods. If only it
00:09:10.040 worked. Well, we made it happen. At CZI, I created a gem called
00:09:17.810 sorbet-rails to bridge the gap between Sorbet and Rails. It is designed to be a
00:09:25.730 one-stop shop that makes Sorbet work with Rails seamlessly. It has a static
00:09:31.820 component that generates method signatures for dynamic methods created by Rails, and
00:09:38.980 additional runtime features that also help type checking. Let me take a moment to
00:09:46.220 talk about RBI files, because this is an important concept. RBI files are like C++
00:09:55.580 header files: they only contain method definitions and signatures, but no
00:10:01.190 implementation. They provide additional information that Sorbet doesn't get
00:10:07.610 from parsing the code. They are perfect for dynamic methods. So sorbet-rails's
00:10:13.940 approach is: let's generate the RBI files for dynamic methods so that Sorbet can
00:10:19.430 know what they do. We weren't the first team that was dealing with
00:10:24.770 metaprogramming. An alternative was metaprogramming plugins. The plugins ran
00:10:31.460 alongside Sorbet when it parsed the code and let it know when dynamic
00:10:36.950 methods may be created. It is best explained with an example. Given the
00:10:44.120 Oracle class I defined earlier, you can write a plugin that detects the
00:10:49.760 make_magic calls and generates a signature corresponding to the answer method they
00:10:55.460 would produce, and Sorbet will know that the answer method exists in the Oracle class.
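
The Oracle slide is not in the transcript, so the following is a hedged reconstruction: the body of `make_magic`, the value returned by `answer`, and the `SilentOracle` contrast class are assumptions. It illustrates why a static checker cannot see `answer` by reading the source, and what an RBI entry describing it might look like.

```ruby
# Hypothetical reconstruction of the Oracle example; method bodies and the
# returned value are assumptions made for illustration.
class Oracle
  # Class-level macro: #answer exists only after this has been called.
  def self.make_magic
    define_method(:answer) { 42 }
  end

  make_magic
end

class SilentOracle
  def self.make_magic
    define_method(:answer) { 42 }
  end
  # make_magic is never called here, so #answer is never defined.
end

# A sketch of an RBI entry for Oracle: a signature and an empty method
# definition, with no implementation, much like a header file.
def oracle_rbi
  <<~RBI
    class Oracle
      sig { returns(Integer) }
      def answer; end
    end
  RBI
end
```

`Oracle.new.answer` works while `SilentOracle.new.answer` raises NoMethodError, yet neither class contains a `def answer` for a parser to find. Whether the signature comes from a plugin or from a pre-generated RBI file, the goal is the same: tell the checker about the method it cannot see.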
00:11:03.310 Metaprogramming plugins have the benefit of being evergreen: they run
00:11:09.530 with every type checking call, so they always see the latest version of the
00:11:14.810 code. However, for that same reason, they slow down type checking greatly, and thus cannot
00:11:22.200 be integrated with the editor, where type checking happens with
00:11:27.390 every keystroke. It can also be difficult to write the plugins because they only have
00:11:34.350 static information about the code being parsed. Compared to the plugins,
00:11:40.709 sorbet-rails's approach, which is generating RBI files ahead of time, can be less accurate: if
00:11:47.970 the code changes and you haven't rerun the generation scripts, they may be out of date.
00:11:54.380 However, because we have method signatures at hand, it is faster to type check, even
00:12:01.500 when it is in the editor. Also, it is much
00:12:06.810 easier to generate signatures when you have access to a fully fledged Rails runtime
00:12:12.810 environment. Because of the advantages of this approach, it is now the recommended
00:12:19.620 approach for handling metaprogramming. Let's see how this approach works with
00:12:27.120 the example we had earlier. Using the model's configuration, sorbet-rails can
00:12:33.360 generate signatures for name and house. Note that although house is an integer column, the
00:12:39.630 model method returns a String, because this is defined as an enum in the model.
00:12:45.000 So sometimes generating signatures is not as straightforward as it seems.
00:12:50.570 Anyway, with these signatures, Sorbet starts to understand the code. It now
00:12:56.790 knows that name and house return String. Similarly, we generate signatures that
00:13:04.260 specify the return types of the associations as well. sorbet-rails can
00:13:10.260 generate signatures for a whole lot more methods in a model. Finally, because Rails
00:13:16.260 relies on metaprogramming to generate methods, we can also mimic its process to
00:13:22.110 generate the signatures. It is a very scalable way to write signatures, especially if you have many
00:13:29.550 models like we do. Still, signature generation can be tricky: for each type
00:13:37.050 of method, we need a different strategy. I'm going to explain a few strategies implemented in sorbet-rails. These are
00:13:44.910 quite fun technical challenges. First, for database attributes and association
00:13:51.540 methods, we rely on Rails reflections to generate their signatures.
00:13:57.500 What is reflection? In short, it is a means for a process to see and change its own
00:14:03.930 structure, like seeing the dynamic methods. Rails comes with a lot of
00:14:09.300 reflection classes, like association reflections. They are great: they have a
00:14:15.240 lot of information about the dynamic methods. When they exist, our job is easy: we can take the
00:14:21.899 information they provide to generate the corresponding method signatures. Except
00:14:28.110 when they aren't there, and we have to rely on other sources such as the Rails documentation or source code.
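
As a rough illustration of turning runtime metadata into signatures, here is a plain-Ruby sketch that does not load Rails and is not sorbet-rails's implementation. The `Column` struct is a stand-in for the column metadata a model exposes at runtime, and the type map, helper names, and `enum_columns` option are all assumptions.

```ruby
# Stand-in for the column metadata a Rails model exposes at runtime
# (an assumption for this sketch; no Rails is loaded here).
Column = Struct.new(:name, :type, :null)

# Hypothetical mapping from database column types to Sorbet type names.
SORBET_TYPES = { string: "String", integer: "Integer", boolean: "T::Boolean" }.freeze

# Picks the return type for an attribute reader. Columns declared as enums
# return String even when the underlying column is an integer.
def return_type(column, enum_columns)
  base = enum_columns.include?(column.name) ? "String" : SORBET_TYPES.fetch(column.type, "T.untyped")
  column.null && base != "T.untyped" ? "T.nilable(#{base})" : base
end

# Builds RBI text with one signature and one empty method per column.
def attribute_rbi(model_name, columns, enum_columns: [])
  methods = columns.map do |column|
    "  sig { returns(#{return_type(column, enum_columns)}) }\n  def #{column.name}; end\n"
  end
  "class #{model_name}\n#{methods.join("\n")}end\n"
end
```

For a `Wizard` with a non-null string column `name` and an integer column `house` listed in `enum_columns`, both readers come out as `returns(String)`, matching the enum caveat mentioned in the talk. In a real Rails app the inputs would come from the model itself rather than hand-built structs.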
00:14:36.750 Sometimes we have to look up which file generates a method to detect its
00:14:42.390 type and signature. For enums, we even override a method provided by Rails
00:14:49.350 to collect more information for the generation code. It gets more complicated
00:14:55.860 with gems and private concerns that can add even more methods to a model. To
00:15:03.089 solve that, sorbet-rails provides a configurable generator. Each plugin can
00:15:09.029 generate signatures, and the generator will aggregate them to produce a final RBI file. You can add or remove plugins
00:15:18.029 as you want. Even the code generation logic I described earlier is implemented
00:15:24.120 as plugins. Gem plugins allow the community to share generation code for
00:15:29.459 public gems. Currently we have plugins for a handful of gems such as Shrine and Elasticsearch. I
00:15:37.199 hope that as the community develops, we'll see more and more gem plugins
00:15:42.240 being shared. Lastly, you can write custom plugins for
00:15:47.880 your own private libraries. The generation and aggregation logic here
00:15:54.240 is supported by the Parlour gem. I recommend using it if you want to write your own
00:15:59.279 generation module. I spent a lot of time talking about models. It is definitely the most complex
00:16:07.710 generation module that sorbet-rails provides. We also support generating methods for
00:16:13.740 the other Rails objects listed here. Among these objects, mailers and jobs pose an
00:16:21.570 interesting challenge: they have class-level methods that are generated based
00:16:26.910 on custom user-defined instance methods. In this example mailer class, when we
00:16:34.110 define an instance method notify_subscriber, Rails will add a corresponding
00:16:39.420 class-level method, and sorbet-rails generates a signature for this method as well.
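
The mailer case can be sketched with Ruby's own reflection. This is a simplified illustration, not sorbet-rails's generator: `HogwartsMailer` is a plain class rather than a real mailer, the helper name is hypothetical, and only required positional parameters are handled.

```ruby
# Plain class standing in for a mailer (an assumption; no Rails is loaded).
class HogwartsMailer
  def notify_subscriber(wizard_name, house)
    "Owl sent to #{wizard_name} of #{house}"
  end
end

# For every public instance method defined directly on the class, emit the
# text of a class-level stub with the same parameter names. This sketch
# only handles required positional parameters.
def class_level_stubs(mailer_class)
  mailer_class.public_instance_methods(false).sort.map do |name|
    params = mailer_class.instance_method(name).parameters.map { |_kind, param| param }
    "def self.#{name}(#{params.join(', ')}); end"
  end
end
```

Calling `class_level_stubs(HogwartsMailer)` yields one stub for `notify_subscriber`. A fuller generator would also carry over the parameter types when the instance method has its own signature, which is the refinement the talk describes next.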
00:16:46.520 When there is a custom signature for the instance method, sorbet-rails uses Sorbet
00:16:52.770 itself to reflect on the signature and generates a better signature for the
00:16:58.110 class method. On top of the RBI generation logic, there are a few runtime
00:17:05.250 features in sorbet-rails to make typing easier. Here I will only talk about typed
00:17:13.080 params, because they show how your code can become safer and easier to understand with types. Normally, params in
00:17:21.540 controller actions are strings: even if it is an integer or boolean value, the
00:17:27.810 value the controller receives is a string. However, in our code we can define the
00:17:35.340 structure of the params with corresponding types for its components. Using typed params,
00:17:41.700 we can coerce the normal params into this structure and let the rest of the
00:17:47.130 code use typed values. The structure also acts as documentation for the controller
00:17:53.550 actions: now it is super easy to know which parameters you need to send to the
00:18:00.000 actions from the client side. So we have solved many technical
00:18:06.350 challenges with using Sorbet in a Rails app, but this is only where the fun begins.
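
The typed params idea described above can be illustrated without the gem. The sketch below is plain Ruby and deliberately does not use sorbet-rails's API; `EnrollParams`, `COERCIONS`, and `coerce_params` are hypothetical names chosen for this example.

```ruby
# Declared structure of the params: it doubles as documentation of what
# the controller action expects.
EnrollParams = Struct.new(:wizard_id, :house, :prefect, keyword_init: true)

# One coercion per field, turning the raw string into a typed value.
COERCIONS = {
  wizard_id: ->(value) { Integer(value) },  # raises ArgumentError on "abc"
  house:     ->(value) { String(value) },
  prefect:   ->(value) { value == "true" }
}.freeze

# Converts a raw string-keyed params hash into the typed structure.
# A missing key raises KeyError; a malformed integer raises ArgumentError.
def coerce_params(raw)
  values = COERCIONS.to_h { |key, coerce| [key, coerce.call(raw.fetch(key.to_s))] }
  EnrollParams.new(**values)
end
```

After `coerce_params("wizard_id" => "7", "house" => "Gryffindor", "prefect" => "true")`, the rest of the action can work with an Integer and a boolean instead of re-parsing strings in several places.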
00:18:12.429 Adopting type checking means that you ask a team to change their workflow to
00:18:18.110 make room for type checking. In this section, I will share some lessons we learned from driving adoption in our team
00:18:25.490 and how you can apply them to make your adoption successful. First, let's talk
00:18:32.750 about the metrics to measure adoption progress. Sorbet offers two metrics to
00:18:39.110 check adoption: a file-level metric and a call-site-level metric.
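
The file-level metric can be approximated by hand, since each file declares its level in a `# typed:` comment. The following is a rough sketch and not how Sorbet computes its own metrics; the helper names are hypothetical. It treats a file without the comment as `false`, which is Sorbet's default level.

```ruby
# Matches a sigil comment such as "# typed: strict" on its own line.
SIGIL_PATTERN = /^#\s*typed:\s*(ignore|false|true|strict|strong)\s*$/

# Returns the declared level of one file's source text, defaulting to
# "false" when no sigil is present.
def sigil_level(source)
  match = source.match(SIGIL_PATTERN)
  match ? match[1] : "false"
end

# Tallies how many sources sit at each level.
def sigil_counts(sources)
  sources.each_with_object(Hash.new(0)) do |source, counts|
    counts[sigil_level(source)] += 1
  end
end
```

In a project, the sources could be read with something like `Dir.glob("app/**/*.rb").map { |path| File.read(path) }`, giving a quick picture of how many files sit at each level.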
00:18:45.159 There are five different type checking levels you can set for each file. I will explain typed: true and typed: strict because
00:18:52.269 they are the primary levels you will focus on. typed: true files will be checked,
00:18:59.099 but you don't have to write method signatures in them. This is the ideal
00:19:04.720 level to start with. For typed: strict, requirements are higher: at this level
00:19:11.889 you are required to write signatures and types for instance variables. This is the
00:19:19.869 best level to aim for, because the next level is too limiting and only good for
00:19:25.479 simple files or RBI files. So Sorbet can tell you the number of files at each
00:19:32.799 level. Within the typed files, Sorbet checks the number of method calls
00:19:38.710 that are type checked. Any method call on an untyped object is counted as untyped. This
00:19:46.269 number matches more closely the notion of type check coverage. Both
00:19:52.539 metrics provide insight into where to focus your adoption efforts. Usually you
00:19:58.840 want to drive up the number of files type checked, then focus on the number of call sites in those files. You can also
00:20:07.929 create your own metrics to define what adoption success means to you. For example, we track who is
00:20:14.979 participating in typing code and how much. This is important because success to us
00:20:20.470 means everyone on the team is writing and type checking their code. Currently we
00:20:27.099 have over 90% of our files at typed: true or higher; that includes our models and
00:20:33.039 controllers. That's pretty high. But how about the call-site-level metric?
00:20:39.340 Our percentage is currently at about 66%. The blue line shows we have seen
00:20:45.820 steady growth in the number of call sites typed, whereas the number of untyped call sites,
00:20:51.910 the red line, has stayed at about the same level. Now, there was a slight bump in the red
00:20:59.150 line in the middle: that is when we made our models and controllers typed: true.
00:21:05.290 It significantly increased the surface area that Sorbet could type check. After
00:21:12.520 that, the red line has stayed at the same level again. These are very promising results: we see
00:21:19.440 everyone on the team writing typed code and driving up the metrics every day. Some sub-teams have even set up their own
00:21:27.600 metric goals for their code. There are two principles that I think
00:21:33.710 helped us immensely in driving a successful adoption. First, adopt
00:21:39.410 gradually. Sorbet can be useful even when you only have part of your codebase type checked.
00:21:46.240 In fact, I think it is a non-goal to type check all files or every line of code.
00:21:51.809 After all, Ruby is a dynamic language; it allows fun and creative magic to happen.
00:21:57.720 So you should think about where type checking can be useful. Even in typed
00:22:04.450 files, it is okay to use escape hatches like T.untyped and T.unsafe to bypass type
00:22:11.679 checks if needed. Also, we found that it is easier to add types to new code. By adding type checks
00:22:19.419 to new code, you are more aligned with the team's interest in building new features, so I recommend you start there. The
00:22:28.210 second principle is to not block people from doing their work. We want
00:22:33.850 everyone to find Sorbet a tool that helps them, not another check that is in
00:22:40.179 their way of getting things done. There are a few things you can do here. Learn
00:22:46.510 about the challenges of type checking code early; they could be different in each codebase depending on the coding
00:22:53.440 patterns and libraries your team uses. Be sure to provide workarounds so that
00:23:00.080 people can bypass the type checker when needed, and allow them to make mistakes when they are getting used to the tool.
00:23:07.159 If they mistake a String for a Symbol in a signature, ideally it
00:23:13.130 shouldn't break the feature in production. With those principles in mind,
00:23:18.770 let's go through the step-by-step of driving adoption in your team. First,
00:23:24.919 setting up the gems. This step is easy: you just add the gems to your Gemfile, run bundle install, and
00:23:31.880 follow the initialization steps documented in the gems. One thing to note here is that you
00:23:39.260 should disable Sorbet's runtime checks in production. The runtime checks act as
00:23:44.770 assertions to enforce the correctness of the input and output types, but they
00:23:50.779 will abort the running code if violated. At the beginning, it is best to run them in specs only so that they do
00:23:58.279 not affect your production app. Once you're done with setting up,
00:24:03.320 check out the adoption metrics provided by Sorbet. I think you'd be surprised. I was
00:24:11.659 surprised: Sorbet was able to type 80% of our files and 40% of the call sites in
00:24:18.230 them. This is because Sorbet knows all the core Ruby logic, and sorbet-rails
00:24:24.140 enables it to understand a lot of the Rails logic. Next, let's build up a good
00:24:31.940 foundation. Here you want to set up tracking for your adoption metrics. Those
00:24:38.690 will later reveal to you where you should focus. It is also a good place to
00:24:45.710 investigate how to integrate Sorbet with the team's development tools like Git, CI,
00:24:51.710 or editor. However, this is optional for now: our team ran the Sorbet check
00:24:59.360 in our CI just to collect logs. Also, I must say that editor
00:25:06.799 integration is a game changer: it will change how you write code. The most important effort here will
00:25:15.220 be understanding type checking and figuring out which code to type. You
00:25:20.920 should know the types of code people write most often and how to make them type check. Have answers for most of
00:25:29.260 the challenges so that you can guide and unblock people in the next phase.
00:25:35.410 You don't have to do this all in one day. Take it slow and build a strong foundation; it is critical for the
00:25:42.640 success of the next phase. After all, adoption is supposed to be a gradual
00:25:48.130 process. Once you've got a strong foundation, it is time to drive up the adoption metrics
00:25:55.660 and get people to use the tools. In terms of adoption metrics, you want to make your
00:26:02.020 core files typed: true. Usually this is easy because you don't have to write any
00:26:08.200 method signatures yet. First, we focused on our models, controllers, and data mutation
00:26:15.580 classes. While typing the files, I also found a lot of existing bugs, which I shared
00:26:22.450 with the team to get people excited about Sorbet. We also held workshops and
00:26:28.900 recruited early adopters to participate in typing. People really enjoyed the sorbet
00:26:35.950 ice cream dessert we prepared for a workshop, which you can see in the photo.
00:26:43.620 After that, you may think about guiding people into doing type checking. One
00:26:50.980 valuable tool I found here is rubocop-sorbet. On the right side, I have a cop
00:26:57.640 that enforces that whenever people add new files, the files need to be typed: true
00:27:02.740 or higher. You should also turn on runtime checks in development and turn on Git integration.
00:27:09.270 This should run a check automatically when people push updates to remote
00:27:14.800 branches. Just a reminder here: provide escape hatches so that people
00:27:21.100 can bypass the checks if needed. On the people side,
00:27:26.950 I find it immensely useful to celebrate adoption progress and early adopters
00:27:32.560 frequently. Just do what you can to make people excited about the effort they are
00:27:38.860 putting in. The next phase, the current phase we are in, is when type checking
00:27:45.220 becomes the norm. When your error logs have shown that most people have gotten used
00:27:51.850 to doing type checking, you can step up the adoption guidance. For example, you may
00:27:57.460 require that new files are typed: strict and new methods have signatures. Again, I
00:28:03.430 find rubocop-sorbet useful here. For runtime errors, it is time to direct them to the
00:28:10.870 responsible teams so that they own them. Check out the screenshot here to see our
00:28:17.110 setup: runtime errors are reported without breaking the production app. This is
00:28:23.320 great. We also think more about adding types to old code. One practice I find
00:28:29.110 very effective here is getting new engineers and junior engineers to
00:28:35.020 type check part of the code in their sub-team. This helps them understand the code they will work on and helps the team get more
00:28:42.760 code typed. In fact, this is currently our favorite way to onboard new engineers.
00:28:48.910 I want to acknowledge that the tools are not perfect yet. Sorbet and sorbet-rails are
00:28:55.420 still maturing, and sometimes you may run into problems. For Sorbet, there is
00:29:00.460 syntax that is not very well supported yet, like shapes or block binding. However, the team
00:29:06.850 is actively developing the tools. I'm excited that they will be releasing official support for VS Code very soon.
00:29:15.000 sorbet-rails is also being updated frequently, with new things we can
00:29:20.350 do to make typing easier. So make sure you upgrade Sorbet and sorbet-rails
00:29:25.390 frequently to get the latest features.
00:29:30.410 CZI is not the only team trying to use Sorbet in Rails. Here are a few companies
00:29:36.420 that I know of that are using Sorbet as well. There are big companies with thousands of employees
00:29:42.840 like Shopify, and there are also small startups with just a few engineers. All
00:29:48.030 of them are finding benefits when using Sorbet. I hope this serves as an inspiration for
00:29:55.050 you to give Sorbet and sorbet-rails a try. Just follow the first step, that is, set up
00:30:01.350 the gems, and see if Sorbet can find any bugs in your codebase. You are also welcome to join the
00:30:08.970 Slack community, provide feedback, and contribute to the gem.
00:30:15.440 Before I end, I would like to thank the CZI engineers and leadership. Their continuous
00:30:20.480 support has helped us achieve a successful adoption. Thanks to everyone on the Sorbet team for
00:30:27.590 building a wonderful tool. It has provided so much value and changed how I
00:30:33.289 write Ruby code. Last but not least, I would like to thank our sorbet-rails contributors. They have helped the
00:30:40.009 gem very much with their commits to the repo. I'm very grateful for your
00:30:45.950 contributions, and thank you for attending the talk. If you have any
00:30:52.279 questions, feel free to reach out to me through GitHub or the Sorbet Slack. I will
00:30:58.009 be more than happy to answer your questions.