Upgrading a big application to Rails 5

by Rafael França

In this presentation from RailsConf 2017, Rafael França discusses the journey and strategies involved in upgrading a large monolithic Rails application, specifically Shopify, from Rails 4.2 to Rails 5. França outlines the challenges faced along the way and the solutions implemented, emphasizing lessons learned for both the company and the Ruby on Rails community.

Key Points Discussed:
- Timeline of Rails and Shopify Development: Shopify's codebase, started in 2004 at around the same time as Rails itself, has been upgraded in close step with Rails releases. The upgrade to Rails 5 took more than a year from start to production deployment, which highlights the scale and coordination such a large application requires.
- Complexity of Upgrading: A long-lived upgrade branch was not practical with hundreds of developers committing daily, so the same codebase was made to run on two versions of Rails simultaneously. Code had to operate under both Rails 4.2 and Rails 5, which was managed through conditionals in a shared Gemfile.
- Key Challenges Encountered: The upgrade process was not without complications. Major issues included managing deprecations such as protected attributes transitioning to strong parameters, which required code adjustments to ensure secure and smooth data handling.
- Testing and Continuous Integration: França emphasized the importance of establishing a parallel Continuous Integration build that validated the application's behavior under both Rails versions. This practice allowed bugs to be identified and fixed collaboratively with other developers while keeping the application stable.
- Community Contributions: França highlighted the role of contributing back to the community, particularly when some dependencies required updates to work with Rails 5, demonstrating a cycle of giving back to the framework that underpins the Shopify application.
- Future Plans and Considerations: Moving forward, França noted the desire for Shopify to maintain a consistent upgrade path, staying on the latest versions of Rails to avoid the technical debt that stems from outdated code. This proactive approach seeks to foster a culture within the company where everyone contributes to keeping the application up to date.

In conclusion, the upgrade of Shopify to Rails 5 serves as a valuable case study for large Rails applications, demonstrating that while the upgrade process can be complex and challenging, strategic planning, community engagement, and an emphasis on testing can lead to a successful transition.

RailsConf 2017: Upgrading a big application to Rails 5 by Rafael França

In this talk we will look at different strategies for upgrading a Rails application to the newest version, taking a huge monolithic Rails application as an example. We will learn what the biggest challenges were and how they could be avoided. We will also learn why the changes were made in Rails and how they work.

RailsConf 2017

00:00:12.049 So I'm here to talk about a story: how
00:00:18.480 to upgrade a big Rails application to Rails 5. This is the story of how Shopify
00:00:26.039 shipped the Rails 5 upgrade to production. So first I'm going to
00:00:31.260 introduce myself. I am Rafael França. You can find me on Twitter or GitHub as
00:00:38.309 rafaelfranca. I'm a member of the Rails core team and I work for Shopify as a
00:00:46.350 production engineer. You may know me more because in every single repository of the
00:00:53.280 Rails organization you enter, you're going to see this as the last commit. So I'm also the self-proclaimed
00:01:02.460 Rails maintainer, because I usually do all the releases. So this is the
00:01:13.080 story of how Shopify released Rails 5 to production. But why look at Shopify?
00:01:18.390 Why is Shopify so special that you have to look at it? Because Shopify
00:01:24.630 started around the same time as Rails. The initial commit of the Shopify
00:01:32.159 codebase was in 2004, previous to the first stable release of Rails. So Shopify
00:01:40.920 started in 2004; Rails 1.0 was only released in 2005. And also, this
00:01:50.640 application was never rewritten, so we are still using the same
00:01:56.159 application that started 13 years ago. So
00:02:02.329 this is a timeline comparison between Rails and the Shopify codebase.
00:02:09.660 We can see that Shopify follows the Rails versions really closely. This is
00:02:17.190 the point where Rails 2.0 was released; Shopify upgraded to 2.0 soon after. The
00:02:24.630 same happened with Rails 3.0; the upgrade was just close to the release. But for
00:02:35.370 later versions we took a little more time to upgrade, almost one year
00:02:41.010 after. Then we upgraded to Rails 3.2, and we also skipped one release because it was
00:02:51.000 too much change for the application. So
00:02:56.550 this is when we actually released 4.2. And for the case of Rails 5,
00:03:05.570 Rails 5 was released in June of 2016, and we started the upgrade to Rails
00:03:13.370 5.0 right after the Rails 4.2 upgrade was merged into Shopify, and we only
00:03:21.870 finished it in 2017.
00:03:26.970 It was more than a year after we started to upgrade Shopify
00:03:33.780 to Rails 5.0 that we could actually deploy it to production. Also,
00:03:39.720 Shopify is a big application. When I say big, it's really big: maybe it is the biggest Rails
00:03:47.489 application in production right now. We have more than three
00:03:53.700 hundred seventy-four thousand lines of code, just counting lines inside the app
00:04:00.270 directory, excluding any kind of Sass code. We have about 100,000 lines in
00:04:11.280 our models, and we have more than 2,000 classes inside the models directory. And
00:04:19.230 also we have a lot of code in the controllers directory, and our test
00:04:24.870 coverage ratio is 1 to 1.3; that is to say, we have 1.3
00:04:32.110 lines of test for each line of code. And Shopify is also on the latest
00:04:38.170 Rails version, if you don't count that Rails 5.1 is
00:04:45.430 probably going to be released today. So how do you upgrade the
00:04:53.620 application to Rails 5? Well, it's simple: you create a new branch, you make
00:05:01.030 all the changes necessary to upgrade to Rails 5, you merge it, I don't know
00:05:06.940 what happens later, and you profit. It's simple. Well, that's not our case. Shopify is
00:05:14.740 very massive. We have some hundreds of developers working every single day
00:05:20.950 in this same codebase. Creating a branch to upgrade the Rails version and keeping it up
00:05:27.340 to date is hard because of all the conflicts that may happen because
00:05:32.410 someone changed something, or people could introduce new bugs with every single commit.
00:05:41.760 We needed a different way to use and
00:05:46.810 test the same codebase with two different versions, so we came up with
00:05:53.260 a strategy to make it possible for the application to work with two versions of
00:06:00.160 Rails. Dual booting in our case was the best solution because it allows us to run the
00:06:07.510 same code revision with two different versions of the framework in tests, in development, or even in production. The
00:06:15.820 solution is not hard to implement. You create two different Gemfiles, the
00:06:22.060 Gemfile.next and the normal Gemfile. You use an environment variable that is
00:06:29.140 already a feature of Bundler to install the gems, and this works just fine.
00:06:37.480 It is the way recommended by the Bundler team to do this. But in a big application, two
00:06:44.319 Gemfiles are also really hard to maintain, since people could update one and forget to
00:06:49.840 update the other one, and also the versions that are in the Gemfile.next
00:06:55.840 should be the same as the versions in the main Gemfile. So to solve this problem we did
00:07:03.190 something that I'm not proud of: we did a hack to share the same Gemfile. This is the beginning of our
00:07:11.169 Gemfile right now. It's a huge monkey patch inside Bundler internals, and I'm not proud of this code at all, but
00:07:17.880 forget about that; let's focus on what's important. We have some conditionals
00:07:23.289 inside the Gemfile like this: if you pass an environment variable called RAILS_NEXT, it's going to run with Rails 5; if
00:07:31.509 not, it's going to run with Rails 4.2. And you could use this environment variable
00:07:38.110 to install your gems, run your tests, or start your Rails server.
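The conditional described here can be sketched in plain Ruby. This is only an illustration, not Shopify's actual Gemfile: the variable name RAILS_NEXT follows the talk's description, while the helper name `rails_requirement` and both version constraints are assumptions made for the example.

```ruby
# Hypothetical helper: picks a Rails version constraint from the environment.
# In a real Gemfile the same conditional would sit around the `gem "rails"` line.
def rails_requirement(env = ENV)
  if env["RAILS_NEXT"]
    "~> 5.0.0" # assumed constraint for the next Rails version
  else
    "~> 4.2.0" # assumed constraint for the current Rails version
  end
end

# In a Gemfile this would be used as:
#   gem "rails", rails_requirement
puts rails_requirement({})
puts rails_requirement({ "RAILS_NEXT" => "1" })
```

With a single Gemfile, Bundler still writes a single lock file by default, which is presumably why the talk mentions patching Bundler internals to make this work.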
00:07:43.229 This approach is good because it keeps it possible to do dual booting, and it
00:07:49.360 also addresses the issues in the previous approach, like having two Gemfiles,
00:07:54.490 because you don't have the downside of forgetting to update the
00:08:00.009 other Gemfile. Of course, the downside of this is having to monkey
00:08:05.740 patch Bundler. So I tried to come up with a better way. I never tested this
00:08:11.650 in production yet, but you could use a feature in Bundler called eval_gemfile.
00:08:18.820 So you have a Gemfile with Rails 5 and, let's say, you have a
00:08:25.060 Gemfile.next with Rails 5.1, and you have this shared
00:08:30.610 Gemfile where you put the gems that you want to share. In the
00:08:36.459 config/boot.rb you add these three
00:08:43.149 lines of code to change the Gemfile depending on the environment variable. The
00:08:49.329 downside of this is that you need to pass that on the command line to install gems, but to
00:08:58.139 restart the server you only need to change the environment variable. So when I do this, dual
00:09:07.980 booting is in place: it's now possible to run my codebase with two different versions of
00:09:14.639 Rails. So what we did: we created a parallel CI build that was running with
00:09:21.810 Rails 4.2 and Rails 5. Given that that's now possible,
00:09:28.050 I can now start working on the upgrade itself. So we started by upgrading
00:09:35.430 all the dependencies. It's important to
00:09:40.649 always have the dependencies working in both versions, because it is the best
00:09:46.260 way to be sure that the behavior is correct. So what we did
00:09:52.800 was really simple: we made sure that all the dependencies were using the same version, one that works with Rails 4.2 and
00:10:01.019 Rails 5. And when it was not possible to have a version that works with Rails 5, we
00:10:08.130 contributed back to the dependencies. Some of the dependencies take a
00:10:14.490 long time to upgrade to new Rails versions, so this is a really good opportunity to
00:10:20.670 give back to the community. I have one example here: I had to upgrade the gem
00:10:27.000 actionpack-xml_parser to support Rails 5, and this was a good
00:10:32.940 opportunity to also simplify the code of the dependency itself, because in Rails
00:10:38.639 5 we had some API simplifications, and that made the code
00:10:44.430 of the plugin easier to write. So in
00:10:49.769 this case, this is the entire code of the gem. You have a bunch of
00:10:56.160 lines of code that I don't want you to focus on, but after the Rails 5
00:11:02.339 upgrade we could change the gem to a version that is just a simple
00:11:07.500 callable that calls Hash.from_xml and returns a
00:11:13.500 hash. So that was a good example of how
00:11:18.830 upgrading your Rails version can make your dependencies improve. So after that,
00:11:26.610 after we upgraded all the dependencies, we had to fix all the tests, because we had
00:11:32.370 thousands of tests failing. So what we did was: for a
00:11:39.360 broken test, we created a branch, we ran
00:11:44.820 the test, we fixed the test, and we opened a pull request. This is as boring as you
00:11:50.370 can imagine. But I have here the
00:11:55.830 list of all the pull requests that were created up to the time we upgraded to
00:12:02.760 Rails 5 in production. As you can see, it's huge. There is nothing really fancy in this
00:12:08.430 task: we just created a bunch of PRs fixing all the things that were
00:12:14.130 broken. Sometimes it can take days to fix one single test, and
00:12:20.430 sometimes one single code change can fix hundreds of them. Let's talk about
00:12:29.610 our biggest challenges in this upgrade. The first one was the protected attributes
00:12:35.339 gem. If you don't know what protected attributes was, let's say that you
00:12:42.510 have a User model that has as attributes name, password, and admin. By
00:12:51.900 the way, if you don't know about this, this attributes API is actually working, so you
00:12:57.690 can define attributes like this if you want to in your application. So let's
00:13:03.839 say that you have this model and you have a users controller in the admin namespace. You find the user by
00:13:11.640 ID, you update the attributes that come from the form, and you redirect back. In your form you have
00:13:19.459 name, password, and admin. In Rails 4.2, actually
00:13:26.279 in Rails 3.2, for that to work you
00:13:31.320 needed to add a call to a method called attr_accessible to tell Rails that
00:13:37.649 those attributes are accessible through mass assignment. And let's say that now
00:13:46.260 your requirements change and you want to give users the possibility to update their
00:13:53.160 own name and password. So you create another controller that does basically
00:14:00.630 the same thing, but the form is different, because you only want the users to be
00:14:05.940 able to change the name and the password, but not the admin flag.
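The setup described so far can be sketched in plain Ruby, without Rails. The `User` class and its `update_attributes` method below are hypothetical stand-ins for a model whose accessible attributes are declared once at the model level; they are not the protected attributes API itself.

```ruby
# Hypothetical model with a single, model-level whitelist, similar in
# spirit to declaring name, password and admin as accessible.
class User
  ACCESSIBLE = %w[name password admin].freeze

  attr_reader :name, :password, :admin

  def initialize
    @admin = false
  end

  # Assigns every whitelisted key, no matter which form sent it.
  def update_attributes(attributes)
    attributes.each do |key, value|
      next unless ACCESSIBLE.include?(key.to_s)

      instance_variable_set("@#{key}", value)
    end
    self
  end
end

# The self-service form only shows name and password, but nothing stops
# a crafted request from adding an extra "admin" key.
crafted_request = { "name" => "Mallory", "password" => "secret", "admin" => true }
user = User.new.update_attributes(crafted_request)
puts user.admin
```

Because the whitelist lives on the model, both controllers share it, so the form that hides the admin field offers no protection on its own.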
00:14:11.899 What kind of problem does this cause? Well, something like this may happen. If
00:14:20.370 you don't know, this was a hack in 2012 where a guy called Homakov could
00:14:27.690 commit to the Rails repository without having access to the repository itself. So how did that
00:14:35.640 guy make this happen? What he did was really simple: he created a new SSH key entry,
00:14:44.000 using his own SSH key, in another user's account. So
00:14:50.089 GitHub did not have protection against this kind of
00:14:56.010 attack at that time, so it was possible for a person to add a new SSH key to
00:15:02.790 another user's account. And it was really simple to mitigate this kind of problem.
00:15:08.750 What you could do: you could add a new attr_accessible call to say that
00:15:16.709 the admin flag is only allowed when you are doing the mass assignment as admin, and
00:15:24.089 in your admin controller you say that you want to update the
00:15:30.540 attributes as admin. So that was in Rails 3. In Rails 4 we changed the way
00:15:38.230 that this mass assignment protection works, with strong parameters. It brings the
00:15:46.060 protection closer to the source of the input, in this case the controller, so it's easier for you to remember to protect your
00:15:52.830 data before sending it to the model. So how do strong parameters work? They work
00:16:01.720 similarly, but you have to filter the parameters before sending them to the update, in the controller itself. So we
00:16:11.410 started a huge effort to remove all the attr_accessible usage in Shopify,
00:16:18.280 because Rails 5 would not support attr_accessible anymore, and we had more than 150 pull requests in
00:16:26.610 three months of work to remove this usage from Shopify itself. In some
00:16:32.770 cases it was as simple as moving the
00:16:38.230 attr_accessible calls to the controllers that were updating that model,
00:16:43.900 but in other cases we found that maybe we were missing some kind of
00:16:51.670 abstraction in the framework. So in some cases, like when we had some massive forms with
00:16:58.420 a lot of attributes, we actually created an abstraction called a form object,
00:17:04.290 using the dry-validation gem to filter all the attributes before sending them into the model. Another challenge
00:17:13.960 we had was controller tests. We had a lot of controller tests like this:
00:17:19.960 we create a request to an action, sending a
00:17:25.330 title and, in this case, an empty array of tags.
00:17:31.260 In Rails 4.2, when the request comes
00:17:38.020 to the controller, the tags parameter was actually an empty array, but in
00:17:45.760 Rails 5 the tags parameter was not an empty array; it was actually blank. So why
00:17:52.150 this change of behavior between those two versions? Let's say that you have
00:18:01.060 code like this. This is valid code, but it is not exercising what would
00:18:09.400 actually happen in the browser. So you have the post, you do the request, you set
00:18:16.120 the parameters, and you send the parameters to the controller. But in
00:18:22.480 Rails 5 what we did was actually encode the parameters as the browser
00:18:29.530 would have done. You cannot send an empty array to an application using the
00:18:36.670 browser itself; that's invalid by definition of the encoding of parameters
00:18:43.300 in the browser. So at that time we had no
00:18:48.820 way to fix this regression, because
00:18:54.120 there was no way for you to tell Rails not to encode the parameters as the browser would. Because what we were trying to
00:19:01.450 test is whether it's possible to send this information as JSON to the application,
00:19:07.540 we had to open a pull request in Rails itself to make it possible to set the
00:19:12.580 encoding type, with the "as" option, in the controller test. What changes is that now
00:19:18.160 it's possible for you to pass this option in the request itself, and both
00:19:26.140 Rails 4.2 and Rails 5 are going to behave in the same way, because now it's not encoding the parameters as a
00:19:32.260 browser would, but encoding the parameters as a JSON request.
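The difference between the two encodings can be seen without Rails. The sketch below is an illustration only: `form_encode` is a hypothetical helper that approximates how an HTML form serializes array values, and it is not the code Rails uses in its test helpers. In a Rails 5 controller test, the option discussed here is passed along with the request, as in `post :create, params: { ... }, as: :json`.

```ruby
require "json"
require "uri"

# Hypothetical helper: encodes params roughly the way an HTML form would.
# An empty array produces no key/value pairs at all.
def form_encode(params)
  pairs = params.flat_map do |key, value|
    if value.is_a?(Array)
      value.map { |item| ["#{key}[]", item] }
    else
      [[key, value]]
    end
  end
  URI.encode_www_form(pairs)
end

params = { "title" => "Hello", "tags" => [] }

form_body = form_encode(params)   # the empty array disappears
json_body = JSON.generate(params) # the empty array survives

puts form_body
puts JSON.parse(json_body).inspect
```

A form-encoded body has no way to say "this key exists and is empty", while a JSON body does, which is why the test needs to state which encoding it means.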
00:19:37.440 Speaking of parameters, that was another thing that gave us a lot of struggles,
00:19:43.110 because since Rails 5 ActionController::Parameters doesn't inherit from Hash anymore. A lot of code that was doing
00:19:51.360 type checks with Hash broke. So in Rails 4.2 we had parameters like this, which were
00:19:58.510 actually a hash; in Rails 5 parameters do not inherit from Hash. This change was made to improve security,
00:20:05.509 because many places that are not just models are vulnerable to mass
00:20:11.339 assignment attacks, like libraries that are not even Active Record models, like Active
00:20:19.070 Resource. To avoid these flaws we made parameters not inherit from Hash.
00:20:26.959 So what's happening in Rails 5 now is
00:20:32.759 this: you have parameters with the name Rafael. If you call params.to_h you
00:20:39.179 get an empty hash, because you did not filter anything. If you call to_unsafe_h
00:20:45.269 you get all the parameters, and if you do the filtering properly and call to_h, you get
00:20:51.419 what you filtered. So we had to fix a bunch of places that were relying on the
00:20:58.019 behavior that parameters is a hash. We had a whole bunch of code with this
00:21:05.099 kind of check: if params is a Hash, do something. And this will not
00:21:11.999 work anymore. So what we did was to
00:21:17.249 avoid parameters entering the model layer: in the controller layer we call
00:21:24.269 to_h before the model, or in some cases we leak the
00:21:29.549 abstraction inside the models and do the type checks against parameters. In
00:21:37.049 my opinion that is not the ideal solution: it causes a lot of pain in
00:21:43.949 your codebase and it's also opening a lot of doors for bugs. So we need to
00:21:50.849 think of a solution that would improve security but keep it pleasant to work with.
00:21:57.499 So I think six days ago, yeah, six days ago, I opened this pull request in Rails
00:22:04.589 itself to improve the upgrade path for ActionController::Parameters. The change is: now,
00:22:13.229 if you call to_h without filtering your parameters, you get an exception.
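The proposed behavior can be sketched with a toy class. Everything below is hypothetical and written for illustration: `ToyParameters` and `UnfilteredParametersError` are stand-ins, not the Rails classes, and the sketch only mirrors the three cases named in the talk (to_h before filtering, to_unsafe_h, and to_h after filtering).

```ruby
# Raised by the toy class when to_h is called before any filtering.
class UnfilteredParametersError < ArgumentError; end

# Toy stand-in for request parameters: it wraps a hash instead of
# inheriting from Hash, so `is_a?(Hash)` checks no longer match.
class ToyParameters
  def initialize(data, permitted = false)
    @data = data
    @permitted = permitted
  end

  # Returns a new, permitted copy containing only the allowed keys.
  def permit(*keys)
    allowed = keys.map(&:to_s)
    filtered = @data.select { |key, _value| allowed.include?(key) }
    ToyParameters.new(filtered, true)
  end

  # Refuses to hand out a hash until the parameters were filtered.
  def to_h
    raise UnfilteredParametersError, "call permit before to_h" unless @permitted

    @data.dup
  end

  # Escape hatch: returns everything, filtered or not.
  def to_unsafe_h
    @data.dup
  end
end

params = ToyParameters.new({ "name" => "Rafael", "admin" => "1" })
puts params.permit(:name).to_h.inspect
puts params.to_unsafe_h.inspect
```

Raising instead of quietly returning an empty hash makes a forgotten filter show up as an error at the call site rather than as silently missing data.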
00:22:18.690 This is good because usually that is exactly what you want: you don't want to
00:22:23.850 send unfiltered parameters into your models or into other parts of your
00:22:29.610 application. If you call to_unsafe_h, of course you get all the information
00:22:36.750 in the parameters, but if you call to_h after doing the filtering you get what
00:22:43.860 you want. So in my opinion this is going to help everyone to get an easier
00:22:52.170 upgrade, because we suffered a lot with this feature, and it's also going to improve
00:22:58.890 security. So what does a deploy to production look like in an application like this?
00:23:05.820 We have Shopify running with both versions of Rails; we could make sure that all the
00:23:14.070 tests are passing with both versions. But how do we deploy that to production? It's not just a matter of deploying to production
00:23:21.810 the Rails 5 container and that's it. So what we did was deploy to
00:23:29.730 production gradually, with a
00:23:35.610 small slice of the production servers using the new version of
00:23:42.360 Rails. And we had to write some compatibility layers, because
00:23:47.760 Shopify needed to run with both versions in production, and there is no way
00:23:53.220 for you to tell that one request is only going to hit one version of Rails.
00:23:59.250 Let's say that your user is going to the checkout page, and as soon as they
00:24:05.040 click checkout, the request from a page that was served by the Rails 4.2 application is going
00:24:10.770 to be served by the Rails 5 application. So we need to make sure that there is no difference between two
00:24:20.820 requests to different versions. In the Rails 4.2 era we had to do some monkey
00:24:28.260 patches in Rails itself to make it possible to work in both
00:24:34.720 versions. In this case we created a monkey patch to generate a CSRF token in
00:24:44.440 Rails 4.1 that is compatible with Rails 4.2. The difference is that in Rails
00:24:51.820 5 we now try to create the same kind of compatibility layers inside the
00:24:58.390 framework itself, so you don't need to do that in your application. So this is one
00:25:03.460 of those in Rails 5: we created a
00:25:09.180 legacy YAML coder for parameters, because parameters were
00:25:16.230 hashes in Rails 4.2 and they are not hashes in Rails 5, so we had to be
00:25:22.300 able to read the YAML-encoded parameters from previous versions in the
00:25:30.070 new versions. So we did what we call the percentage rollout: we deployed to
00:25:37.720 production to just a small percentage of the servers. This is the message from our bot
00:25:43.720 about the deploy to production with 50% of the
00:25:48.850 servers running Rails 5. So what we did was: we deployed to production, we found
00:25:56.110 bugs in production, we fixed those bugs, we rolled back to zero percent of the
00:26:02.680 application, and we deployed later with the bug fixes again. We also did some
00:26:10.560 benchmarks. So this is our bot responding
00:26:18.400 to me with a profile of five seconds on one of the production machines. It gives
00:26:26.350 us a stack trace with all the methods that were called the most on the
00:26:32.830 server in that period. So what we did was profile different servers to compare
00:26:40.210 the results and see if there was any performance regression between the two versions. If a regression
00:26:47.470 was found, we reverted the deploy, we fixed the regression in the framework, and
00:26:59.110 that I actually did the the point to 100% of the server's it was 7:00 p.m. a
00:27:10.990 must ATM at night decimates and cheese so
00:27:16.230 what's really brave to do that but at least is broken so after the the
00:27:24.040 deployed production we had to start the cleanup because we end up with a lot of feature toggles inside our code base
00:27:30.820 checking if the ratio is for wave ratio is 5 we had to first remove all those
00:27:37.120 conditioners and we also had to remove other duplications in in this project i
00:27:45.430 smart enough for people working on it it's not possible so I smarting off for
00:27:52.450 people remove all the duplications that you have in the Shopify codebase so what
00:27:57.580 we did we need the help from everyone the company our first approach was to
00:28:03.970 consider that pretty deprecations while you wanna test locally well enough but turned out that
00:28:11.050 people are really good doing you know read the petitions so we created a
00:28:17.350 whitelist of test fires that have duplication in a 15-2 remove all the
00:28:23.460 duplications of those slash fires that are part of the components like and in
00:28:33.190 the last the last thing that that we do after we remove the other applications
00:28:38.530 err pregrated the configuration to match they near the poles of the whales and
00:28:44.650 after that we start the preparation of the new upgrade this is a this kind of
00:28:52.990 push it never have in because rails is always really sunny version so we are always trying to keep
00:29:01.040 track of the realization so for the future what we want a shop fight is to
00:29:07.040 avoid more capacity at our course that means that a pizza eases because we are
00:29:15.040 only usually features of purveyors and the only way to do that is to keep out
00:29:21.470 the page number of the peninsula smaller because as much as more depends you have
00:29:27.919 more likely to have more capatch on the depends we are going to keep the parlor
00:29:34.760 side one in the entire year and my goal is to keep it tracking off the rails
00:29:41.480 machination forever it means that chopped firewood be running in the last version of the rails forever and of
00:29:50.150 course things will breaking and it's not what ever wants to make everyone's
00:29:57.530 concern everyone at the company should be concerned about keeping the
00:30:03.590 application up to date we also want to
00:30:08.900 think more about backwards compatibility inside the framework itself it is part
00:30:15.679 of the shop I give it back into community to be to pay me to look at the
00:30:23.240 rails on work to make easier to everyone propagate this application because subphyla prevail and we want to race to
00:30:31.970 succeed and we want everyone to be able to use delicious version of the waves
00:30:37.100 without any problem and that's it
00:30:47.499 Yeah, so the question was: we have a whitelist of deprecations; did we configure
00:30:55.129 Rails to raise on deprecations outside those files? Yeah, we used the Rails deprecation behavior,
00:31:02.570 but we have special code, and what it does is record all the deprecations,
00:31:08.809 and if the deprecations don't match the list of recorded deprecations, it fails
00:31:14.479 the test. And also, if there is no deprecation in that file anymore and you
00:31:21.200 have it recorded, it also fails. We are planning to open source that tool too,
00:31:28.450 so maybe next month. So the question is: it was said that there are ops
00:31:34.940 and front-end developers at Shopify, and whether I am in ops or part of it. We
00:31:42.200 don't have ops in that kind of organization; we have what Google calls
00:31:49.929 SRE and Facebook calls production engineering, which is a kind of developer with an ops background.
00:31:56.959 And yes, I work in production engineering, so my job is to make tools for developers,
00:32:04.249 developer tools or any kind of production tools. More questions? I will be here if