List

Semi Automatic Code Review

Semi Automatic Code Review

by Richard Huang

In this talk titled "Semi Automatic Code Review," Richard Huang presents a framework for improving productivity in code reviews, particularly in Ruby on Rails projects, as part of the development life cycle. This framework addresses the challenges that arise when code quality is not prioritized in the early stages of a fast-growing company.

Key points discussed include:
- Development Life Cycle: Richard outlines the typical development process which includes new feature implementation, bug fixing, manual code reviews, and QA testing prior to deployment.
- Code Review Importance: Emphasizing the critical nature of code reviews, he notes that they can identify bugs and performance issues, albeit being a manual process that can introduce complexities.
- Automation Opportunities: Huang suggests that while many stages of development can be automated, code reviews still largely rely on human evaluation. He introduces the idea of partially automating this by utilizing tools that help analyze code based on best practices.
- Rails Best Practice Gem: Richard details the creation and functionality of the Rails Best Practice gem, which analyzes Rails project code and offers suggestions for refactoring. He highlights the benefits such as better readability and adherence to coding guidelines. For example, it can suggest using scope access for querying posts belonging to the current user, and using built-in methods to simplify code.
- Realbp.com: In response to the needs identified during code reviews, Richard discusses the launch of realbp.com, an online service that integrates with GitHub. This service allows developers to track code quality over time, share analysis results with teammates, and customize what checks to apply based on team preferences.
- Plugins and Customization: Richard explains the extensibility of the gem through plugins, allowing teams to adapt the checks to their specific coding guidelines.
- Practical Demonstration: He provides a step-by-step guide on registering a GitHub repository with realbp.com, analyzing code, responding to suggestions, and tracking improvements.

In conclusion, Huang stresses the dual focus of semi-automatic code reviews: improving code quality while enabling engineers to dedicate more time to critical performance and scalability concerns. The main takeaway is the push towards leveraging automation in the code review process to ultimately foster better coding practices and enhanced collaboration among engineers.

Rails is so popular to be used to fast build a website, at the beginning we sometimes write codes too fast without considering code quality, but after your company grows fast, you have to pay more attentions on code review to make your website more robust and more maintainable.

In this talk I will introduce you a way to build a semi automatic code review process, in this process a tool will analyze the source codes of your rails project, then give you some suggestions to refactor your codes according to rails best practices. It can also check your codes according to your team's rails code guideline. So engineers can focus on implementation performance, scalability, etc. when they do code review.

Rails Conf 2012

00:00:25.320 here uh let's get
00:00:27.080 started uh hi uh my name is Richard I
00:00:31.119 come from China uh my topic is
00:00:34.399 semi-automatic code review it's about a
00:00:37.320 process to improve your productivity of
00:00:40.760 code
00:00:42.719 revie uh I'm now working for G
00:00:46.079 maintaining the open fan platform it's
00:00:49.039 the biggest so mobile social gaming
00:00:52.199 platform and I'm also active in the open
00:00:55.559 source Community I created the real best
00:00:59.199 practice Jam
00:01:00.640 and I also built the rb.com website
00:01:04.400 recently I will discuss them uh more
00:01:07.360 later uh you can check out my GitHub
00:01:10.240 account to get more useful
00:01:13.360 tools okay this is our development life
00:01:17.200 cycle for each Sprint uh we will
00:01:20.119 implement the new features of fix bugs
00:01:24.200 and then we will ask other Engineers to
00:01:26.439 do code review and QA will verify our
00:01:29.960 chance then we merge merge the chance to
00:01:34.079 a release branch and finally we will
00:01:36.920 deploy it to production server I think
00:01:40.399 most uh most teams using rails should
00:01:43.640 have a similar process just like
00:01:48.119 this uh in the development life cycle
00:01:51.680 you will see uh some process can be done
00:01:54.640 automatic and the some some process have
00:01:57.360 to be done uh manually uh manual work
00:02:01.000 are always easy to introduce bugs into
00:02:04.119 your product while automation makes
00:02:07.640 things simple more accurate and less
00:02:10.520 buggy so for us we will try to do
00:02:14.280 processes automatically as much as
00:02:18.000 possible so in development process uh
00:02:22.480 it's OB obvious that we will uh write
00:02:25.480 codes manually but sometimes we will
00:02:28.519 generate codes a automatically and we
00:02:31.239 will we write a lot of unit test and
00:02:33.800 integration test to make code make test
00:02:36.920 code
00:02:38.440 automatically uh in our code review
00:02:40.920 process we have to do it manually
00:02:44.239 because uh sometimes we were sitting
00:02:47.080 together face Toof Face review and
00:02:49.920 sometimes just sending a code compare
00:02:52.959 link and a leave comments on
00:02:56.080 GitHub and in our QA process we we have
00:03:01.239 an integration uh continuous integration
00:03:04.720 test uh server to guarantee it that all
00:03:08.040 the unit test and uh integration test
00:03:11.120 are green and we we also have a lot of
00:03:14.799 uh script script code to to to run the
00:03:19.080 uh uh regression test automatically but
00:03:22.400 beside this we we have man we have to
00:03:24.959 manually verify our
00:03:27.439 product in merge process
00:03:30.439 uh we use the Version Control System git
00:03:33.560 to uh automatically merge the Fe the
00:03:37.000 feature Branch into release Branch uh
00:03:40.480 only when that conflict occurred then we
00:03:44.000 have to manually conflict
00:03:45.920 them and in deploy process thanks to G
00:03:50.159 cap trainer that we can release the new
00:03:53.439 feature Almost 100%
00:03:57.239 automatically so in the uh uh as you
00:04:01.640 seen that we have more or less
00:04:03.640 Automation in every process in the life
00:04:06.280 cycle
00:04:08.680 but uh accept the code review process
00:04:12.439 you know code review is very important
00:04:14.720 that it can solve most bugs and the
00:04:17.880 potential performance issues but it
00:04:20.479 really depends on Engineers for example
00:04:24.479 if Engineers are familiar with Ruby and
00:04:27.280 the reals who will uh suggest
00:04:30.280 opportunities to use a lot of reals
00:04:33.080 default uh building helpers method and
00:04:36.840 uh if Engineers are uh have a strong
00:04:40.039 knowledge of database and uh HTTP
00:04:43.479 protocol he will suggest where you can
00:04:47.199 uh optimize your DB queries and uh add
00:04:50.199 cach
00:04:51.800 layer so uh so we ask if we could Auto
00:04:57.479 partially automate the
00:05:00.400 code review
00:05:02.280 process the answer is yes and this is
00:05:05.720 what I want to talk
00:05:07.560 today so let's see what we review in
00:05:11.960 general uh it contain two parts the
00:05:15.479 first part is easy and its goal is to
00:05:18.199 make code more readable and
00:05:21.800 maintainable uh like if like you should
00:05:25.720 following your your team's coding
00:05:27.960 guidelines you should write better
00:05:30.520 coding syntax and you should remove
00:05:33.479 unused methods Etc and the second part
00:05:37.360 is important but a bit complicated that
00:05:41.280 you you should uh review if the code is
00:05:45.120 performant and scalable and this part is
00:05:49.280 really depends on the programming
00:05:52.199 experience
00:05:54.440 so what if we can review the first part
00:05:59.319 a
00:06:00.240 automatically and the engineer just
00:06:03.280 focus on their uh focus on their time on
00:06:07.400 the this more important stuff when they
00:06:10.520 do code
00:06:12.440 rev so I have built rails best practice
00:06:16.319 gem over the last two years it is create
00:06:20.479 for analyzing the source code of your
00:06:22.960 reals project and uh giving you some
00:06:26.280 suggestions to refect and improve your
00:06:29.319 code
00:06:31.280 quality this is an
00:06:33.759 example uh the in the edit action on top
00:06:38.160 code you will see it the code read uh
00:06:41.639 post from database then uh it check the
00:06:45.560 privilege by uh just checking if the
00:06:49.160 current user is the user is just the
00:06:52.080 user of
00:06:53.560 post the real best prti gem will tells
00:06:57.520 you that there there is an
00:07:00.280 uh Co May in in the post controller live
00:07:03.080 five it it suggests that you should use
00:07:06.039 scope access just like uh just like what
00:07:09.919 the code in
00:07:11.919 buttons uh you should just read the post
00:07:15.560 from current users posts so if the post
00:07:19.000 does not belong to current current user
00:07:22.440 uh and not found error will be
00:07:25.800 raised and a similar example uh
00:07:30.080 on the top code before you create a post
00:07:33.800 you you assign the user ID from current
00:07:37.759 user then if you check by reals best
00:07:40.879 practice gem it will suggest you to use
00:07:43.759 model Association just like what the
00:07:46.560 botton code does it it creates the post
00:07:51.120 by with current user. poost that it will
00:07:55.199 automatically set the user
00:07:58.039 ID uh yeah
00:08:00.759 and uh this is a example of a suggestion
00:08:05.240 to use R's default haer method right on
00:08:09.159 the top code you uh the code read users
00:08:12.879 login attribute and check if it is
00:08:16.000 present and real best practice gem will
00:08:19.479 suggest you that you should use the
00:08:22.840 query attribute qu by query attribute
00:08:26.560 means you can directly call login
00:08:28.960 question mark mark on user to check if
00:08:31.360 the attribute is
00:08:34.479 exist I saw more and more developers
00:08:37.640 begin to use the real best practice gem
00:08:40.880 someone integrated it into God running
00:08:44.200 the real best practice frequently in
00:08:47.480 development mode and uh someone just
00:08:51.240 integrate the integrate it into Jen kins
00:08:55.320 so he can running the real best proam in
00:08:59.120 on their continuous integration server
00:09:02.440 and I also saw a team that they build
00:09:06.440 their internal syst internal tool by
00:09:09.519 based on the Real's best price gem they
00:09:12.399 they they score their developers code
00:09:15.640 also I also I don't like this
00:09:18.800 idea so there was a chance in our team
00:09:22.760 that we decided to pay down the
00:09:25.920 technical debt so you know we have a
00:09:30.079 large ra space c space or we started in
00:09:34.600 early
00:09:35.920 2009 there are a lot of unused classes
00:09:39.560 unused methods Dead Road and Old ugly
00:09:44.040 code then I used the real best prce gam
00:09:47.680 to refact and clean up our code
00:09:50.959 base uh but real best practice generates
00:09:54.800 the analysis in just in terminal so I
00:09:58.680 found it's not
00:09:59.839 easy to share the analysis with the team
00:10:03.560 and I also want I I I need to I think I
00:10:08.120 need to just contact the auth of the
00:10:10.760 code which I want to refect and discuss
00:10:14.040 them if it is easy to if the code is
00:10:18.279 easy to be removed then uh an idea
00:10:22.120 jumped into my mind that uh what why not
00:10:26.399 build an online code review service
00:10:31.800 so I buil the real bp.com early this
00:10:35.760 year it's it it's just integrated with
00:10:39.839 uh GitHub every time you push code to
00:10:42.440 GitHub the online co uh code Checker
00:10:45.760 service will be executed automatically
00:10:49.240 and you can easily share the analysis
00:10:52.560 with your teams collaborators because
00:10:54.680 it's online and the Ros BP will uh keeps
00:10:59.399 every analyze an analysis result so you
00:11:03.240 can key you can track the history to see
00:11:06.360 if your code quality is improving or not
00:11:10.000 and you can also config what to review
00:11:13.760 for example if you don't take or you
00:11:16.480 don't care about the uh like
00:11:19.760 use Query attribute you can just disable
00:11:23.560 this checker on rb.com
00:11:26.760 this is uh how it works
00:11:30.399 and every time you or first you should
00:11:33.519 register your your GitHub repository on
00:11:36.880 real bp.com then every time you push the
00:11:40.440 code to GitHub GitHub will notify RBP
00:11:44.000 there is a new push then Ros BP will
00:11:46.920 check out your source code and analyze
00:11:49.519 your code quality then generate generate
00:11:52.560 the analysis report so you and your
00:11:55.920 collaborators can all view the uh
00:11:58.760 analysis this report so you can keep
00:12:03.160 reflecting your code just according to
00:12:05.160 the
00:12:07.480 report so there is a real BP H service
00:12:11.920 hook on on your GitHub repository admin
00:12:15.279 page so after you register your
00:12:19.360 repository on reals BP you will be
00:12:21.880 assigned a token just set a token in in
00:12:25.160 this hook page and activate this this
00:12:28.240 hook
00:12:30.240 so if you you activate it then well
00:12:34.720 every time you push code the GitHub will
00:12:36.720 notify notify us so we can analyze the
00:12:40.880 source code for
00:12:42.440 you this is the
00:12:45.079 analysis uh report page as you seen that
00:12:49.120 it will tells you the uh commit ID
00:12:52.800 commit message and it will list all the
00:12:55.680 Cod Mes in a
00:12:57.560 repository for for for each cosail it
00:13:02.000 will tells you the uh file name the line
00:13:05.360 number the warning message and which
00:13:08.519 commit and who introduced this
00:13:12.360 COA and the Highlight line means that
00:13:16.519 this Co mail is introduced by current
00:13:20.800 commit so you can also uh click the
00:13:24.560 warning message it will bring you to a
00:13:28.440 bring you to a page which gives you a
00:13:30.880 suggestion how to reflector your your Co
00:13:33.279 mail and you can also click the file
00:13:36.079 name it will bring you to the GitHub
00:13:38.600 file page uh if you want you can
00:13:42.320 directly refect your cail on
00:13:46.040 giab so before you ask your senior
00:13:50.160 Engineers to do code review for you you
00:13:52.759 can Cod reveal your your your code on
00:13:56.480 Ros BP by yourselves so it's can improve
00:13:59.639 your Co quality and also save the save
00:14:02.839 other Engineers
00:14:05.160 time so this is the history tracking
00:14:08.720 page the visual chart shows how many co
00:14:12.920 Mes in your repository historically and
00:14:16.160 there there is also a table that lists
00:14:19.519 every analyze result you can click any
00:14:23.399 item to go to the analyze report
00:14:28.040 page okay it is also configurable I
00:14:32.320 understand that not all Checkers in the
00:14:34.720 gy are suitable for all reals reals
00:14:38.279 teams uh some reals teams may have
00:14:41.279 different idea of reals best practice so
00:14:44.680 you can select which Checker should be
00:14:48.519 activated and uh turn off other Checkers
00:14:52.160 as you like so this is only part of the
00:14:55.600 configurations you can just uh select
00:14:58.680 the check check that you think is
00:15:00.320 suitable to your team and some some
00:15:03.320 Checker also accept some parameter so
00:15:06.480 you can customize for your
00:15:09.959 team uh this is also the this is the
00:15:12.759 configur or collaborator page so you can
00:15:15.279 add or remove the collaborator to grant
00:15:18.440 them a privilege to to config your
00:15:21.839 repository on real bp.com and for
00:15:24.920 Simplicity you can synchronize the
00:15:26.959 collaborators from GitHub
00:15:31.279 okay so I I know that different yo
00:15:37.040 different teams may have different code
00:15:39.839 guideline it's probably probably there
00:15:42.600 are some code guideline in your team
00:15:45.600 that the the gam count analyze
00:15:49.399 so but don't be depressed the real best
00:15:53.279 best proam uh provide a way to extend
00:15:57.199 itself for your team by by writing
00:15:59.680 plugins so here I will give you an
00:16:03.120 example how to write the plug-in so you
00:16:06.199 know that reals provide a convenient
00:16:09.240 helper method try it uh calling try our
00:16:13.720 new object will always return return new
00:16:16.440 so I want to find the code in in my
00:16:19.920 repository to see any code that is doing
00:16:23.079 the same thing as the the try method so
00:16:26.120 I can replace the manual code by by by
00:16:29.680 using the default try
00:16:32.519 method okay so let's see how to do
00:16:37.440 it uh can you can you see this TM
00:16:41.480 clearly and uh this is the test code for
00:16:45.120 for use try Checker plugin as you see
00:16:48.920 it's it's readable there are only three
00:16:51.720 tests in the first two tests if I use
00:16:56.240 all question mark all the name column
00:17:00.199 new or I use al. new question question
00:17:04.640 mark new column or. name that means they
00:17:08.839 did the same thing with try method so I
00:17:13.280 I should replace the code with au
00:17:17.959 do. try name right so the jam told me
00:17:23.039 the F name the line number and the we
00:17:26.880 and the warning message with use try on
00:17:32.039 or so and in the in the third test there
00:17:36.240 should no such code may be
00:17:40.400 detected and uh every time you write a
00:17:43.720 plug-in for your uh for real best PR gem
00:17:47.320 be sure you write the test code first so
00:17:50.080 it makes your things
00:17:51.880 easier so this is the uh use try Checker
00:17:56.559 plug-in implementation code
00:17:59.840 uh I recommend you to read the document
00:18:03.080 of ripper in Ruby one9 first uh Ripper
00:18:07.280 is a library to convert your uh Ruby
00:18:10.919 source code to abstract syntax tree so
00:18:14.280 real best PRM is based on
00:18:16.799 Ripper so here I only check the if op
00:18:20.640 node in all the Ruby files and it checks
00:18:24.640 the two cases that I have written in the
00:18:26.720 test
00:18:27.760 code so as you you can see that it's not
00:18:31.559 too difficult to write plugin so you can
00:18:34.400 just write the plugins for your teams so
00:18:37.840 you can
00:18:39.039 just you can promise your code base is
00:18:42.679 always good
00:18:46.720 quality if you don't want to write
00:18:49.360 plug-in just in your own repositories
00:18:52.400 feel free to uh for the gem in GitHub
00:18:57.760 and uh as you see that the website is
00:19:00.919 also open- sourced so feel free to
00:19:03.640 contribute to
00:19:05.280 them and uh at the end I will give you a
00:19:09.760 demo so this is I I create a reals
00:19:13.640 project on on GitHub the real BP is just
00:19:17.039 a simple reals project and I add the
00:19:20.280 user and the post model uh by by the
00:19:24.559 scold okay so first let me register the
00:19:29.039 repository on RBP I typed the uh GitHub
00:19:34.280 name and I create
00:19:38.360 repository it will synchronize the
00:19:40.480 information onab and you will be
00:19:42.799 assigned a tok a repository token just
00:19:46.280 copy it and update the token on
00:19:52.559 GitHub yeah you will see this is your
00:19:54.799 repository ad administrator page there's
00:19:57.600 a service hooks scroll down there's a
00:20:00.360 real BP
00:20:01.919 service and uh scroll up you should uh
00:20:05.679 replace the UR and the token and
00:20:08.640 activate this token or this service then
00:20:12.039 just update the
00:20:15.120 settings then next you push the code you
00:20:18.520 will see the the an analysis report page
00:20:23.000 you see we have some codes mes and we
00:20:26.520 have a lot of replace instance variable
00:20:28.960 with local variable so I if I don't care
00:20:31.760 about this Checker I can add it to
00:20:35.360 repository I set a
00:20:38.679 configuration Let's uh search the
00:20:44.159 replace then I uncheck this
00:20:46.919 Checker uh update the
00:20:50.360 configuration so let's see next I we
00:20:54.360 push the code so I directed on GitHub I
00:20:57.760 changed the read me the doc doc file to
00:21:01.400 to push the
00:21:05.720 code okay I added the
00:21:11.600 file I change its value with real BP
00:21:15.840 demo and uh yeah I push the
00:21:22.760 code after that when we back to analysis
00:21:27.720 report page you will see that replace
00:21:30.640 instance variable with local Val the the
00:21:33.320 Cod me just disappeared and let's see
00:21:35.640 the use Scopes use scope access you
00:21:39.000 click the message then it brings you to
00:21:41.720 the web page to say what's the mail is
00:21:46.039 and how to reflect
00:21:48.279 it then back you will click the file
00:21:50.960 name it brings you to the GitHub file
00:21:54.480 name file page let's just fix the
00:21:58.679 reflector fix the co on on
00:22:04.760 giab okay addit this
00:22:13.120 file so I change post. found. find with
00:22:17.520 current users. post. find and remove the
00:22:21.600 other line
00:22:24.720 code then I add the commit message you
00:22:28.520 use scope access for
00:22:34.000 post then I push the code
00:22:38.240 again so let's back to the RBP analysis
00:22:43.120 report page you will see the use scope
00:22:45.159 access Cod m is
00:22:47.960 disappeared and let's do
00:22:50.559 another example I change the user model
00:22:56.039 file I add a
00:23:01.200 method uh admin question mark and it
00:23:04.320 always return
00:23:08.279 forse okay I add a commit
00:23:13.279 message and push the
00:23:16.799 code so let's see the analysis report
00:23:20.200 page if there's a new new Cod it says
00:23:23.480 there's a an unused method user admin
00:23:27.159 question mark because we just have the
00:23:29.240 method in model but no place to call
00:23:32.039 that
00:23:33.120 method so let's just go to the
00:23:35.760 controller and uh let's change the
00:23:40.919 destroy logic say only admin can destroy
00:23:45.640 the M the the
00:23:48.000 post so if current use is admin and it
00:23:52.760 can
00:23:53.679 destroy
00:23:55.799 SS or we just return the flash
00:24:10.760 error okay the
00:24:13.760 commit commit messages only admin can
00:24:17.480 destroy the
00:24:18.960 post let's push the
00:24:22.080 code and uh in the analysis report page
00:24:27.520 the co mail paed because you already
00:24:29.760 have a a code to called the admin page
00:24:33.480 admin method so uh in in when I just
00:24:38.600 clean up my code space uh I will found a
00:24:41.559 lot of unused method but I will not add
00:24:44.799 code to to call them I just uh remove
00:24:47.919 these code because they are rarely
00:24:50.440 unused and uh uh I didn't just re uh fix
00:24:54.960 the Cod mail on GitHub I will create a
00:24:57.600 local branch to to to do the co co-
00:25:02.480 refacing okay that's it thank
00:25:27.320 you
00:25:57.240 he