00:00:05.299
thank you for coming to my session the theme of this session is scaling Rails
00:00:12.660
API to write-heavy traffic first I will introduce myself
00:00:19.199
I am a software engineer and product manager at DeNA
00:00:25.800
I develop back-end systems for mobile game apps I love developing and
00:00:31.920
playing games I also love Ruby and Ruby on Rails this is our company DeNA our mission is
00:00:41.340
to delight people beyond their wildest dreams we offer a wide variety of services
00:00:48.920
games sports live streaming healthcare automotive e-commerce and so on
00:00:57.780
we started using multiple databases for e-commerce in the 2000s
00:01:06.420
today our game servers also run on multiple databases
00:01:12.540
this is the agenda for today's session
00:01:20.939
first I will present an explanation of mobile game API backends and mobile
00:01:26.880
game app characteristics also we will take a quick overview of scalable
00:01:33.840
database configurations then we move on to the API designs for
00:01:40.200
write scalability there we will explore replication API response design and
00:01:46.619
sharding after that we will check some pitfalls around multiple databases and
00:01:53.100
tests with multiple databases then I will share some considerations on
00:01:58.500
the past present and future of multiple databases at our game servers
00:02:05.340
finally we will sum up the session
00:02:10.979
okay let's start with the mobile game API backends
00:02:20.700
generally speaking mobile game app backends are split into two categories one is real-time servers and
00:02:29.280
the other one is non-real-time servers in this session we will focus on
00:02:36.440
non-real-time servers the objective of these servers is to provide secure
00:02:43.379
data storage and transactions and multiplayer or multi-device communication for example
00:02:50.360
non-real-time servers handle inventories friends daily bonuses
00:02:55.980
and other transactions this figure illustrates an example of
00:03:02.819
API calls a client wants to consume energy and help a friend the server
00:03:09.300
conducts a resource exchange transaction and returns a successful response
00:03:17.220
this slide explains the system overview of our mobile game API backends we
00:03:23.879
started developing our game API server in the mid-2010s the system is running on
00:03:31.140
Ruby on Rails 5.2 and mainly running on IaaS
00:03:36.360
the main API is a monolithic application and there exist other components as
00:03:43.860
separate services such as proxy servers a management tool and HTML servers for
00:03:50.940
webview we utilize the Rails API mode to minimize
00:03:56.220
controller modules and overhead also we employ schema-first development with
00:04:02.700
protocol buffers we use MySQL memcached Redis and
00:04:07.980
elasticsearch now we will look at mobile game app
00:04:15.180
characteristics
00:04:22.199
how do you try a new game open the app store link tap the install button wait a
00:04:29.820
minute and play it is super easy for players to
00:04:36.479
play however it is also super easy to make servers
00:04:42.180
write this slide shows the reasons for the write-heavy traffic
00:04:47.940
first as mentioned earlier easy install leads to easy write access
00:04:54.900
downloading is very easy and mobile app distribution platforms ensure installs
00:05:00.840
from all around the globe there is no need to make an account and the first
00:05:06.900
in-game action means the first write secondly various in-game actions trigger
00:05:14.160
accesses especially writes such as receiving daily bonuses playing battles
00:05:20.580
buying items and so on generally speaking the order of
00:05:26.940
magnitude for mobile apps ranges from 10 kilo-requests per second to 1 mega-
00:05:34.440
request per second therefore scaling the API to write-heavy
00:05:40.199
traffic is required here
00:05:46.259
now we will take a quick overview of scalable database configurations
00:05:57.300
this slide is an overview of our database configurations
00:06:02.699
from the application side we use vertical splitting horizontal sharding
00:06:08.340
and read write splitting on the other hand from the infrastructure side we conduct load
00:06:15.960
balancing on readers and automatic writer failover let's look at them one by one
00:06:25.440
the first technique is vertical splitting we split the databases into two parts
00:06:31.979
the first part is common there is one writer node and this is optimized for
00:06:39.060
read-heavy traffic we prioritize consistency over
00:06:44.520
scalability here for example game setting data and
00:06:49.680
pointers to shards are stored in this database
00:06:56.340
the other part is shard which is the main focus of today's session there are
00:07:02.340
multiple writer nodes and this is optimized for write heavy traffic we
00:07:08.160
prioritize scalability over consistency here also this shard database is
00:07:15.600
horizontally sharded for example player-generated data is stored in this database
00:07:24.419
this is the horizontal sharding we take data locality on a player and partition
00:07:30.840
the data by player ID not by resource type this architecture is effective for holding
00:07:38.160
consistency on each player and efficient for player-based data access
00:07:45.240
we use a mapping table called player_shards to assign a shard
00:07:50.880
we decide a shard on the player's first access by weighted sampling this dynamic
00:07:57.840
shard allocation is flexible for adding new shards and balancing the load on
00:08:04.020
them there is a limitation on maximum inserts per second for this architecture
00:08:11.460
for example thousands or tens of thousands of queries per second
00:08:17.460
are the limit for MySQL if we hit the limit we can use a range
00:08:23.639
mapping table or distributed mapping tables
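Going back to the basic player_shards approach, a minimal sketch of the dynamic shard allocation could look like the following; the PlayerShard model, the weights, and the Rails 6.1+ horizontal-sharding API are assumptions for illustration, not the production code.

```ruby
# Hypothetical weights: a heavier weight sends more newly created players to
# that shard, which lets a freshly added shard fill up gradually.
SHARD_WEIGHTS = { shard_01: 3, shard_02: 3, shard_03: 4 }.freeze

# Assign a shard on the player's first access and record it in the
# player_shards mapping table (weighted sampling: pick the max of rand**(1/w)).
def assign_shard!(player_id)
  PlayerShard.find_or_create_by!(player_id: player_id) do |mapping|
    mapping.shard_name = SHARD_WEIGHTS.max_by { |_shard, weight| rand**(1.0 / weight) }.first
  end
end

# Route a block to the player's shard; reused by later sketches in this talk.
def on_player_shard(player_id, role: :writing, &block)
  shard = PlayerShard.find_by!(player_id: player_id).shard_name.to_sym
  ApplicationRecord.connected_to(role: role, shard: shard, &block)
end
```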
00:08:29.220
of course we utilize read write splitting too in our case we use manual read write
00:08:36.539
splitting this is complex but efficient API servers access writer nodes to
00:08:43.500
retrieve data used for transactions and the player's own data
00:08:49.680
on the other hand API servers access reader nodes to retrieve data not used
00:08:55.920
for transactions and other players' data
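A hedged sketch of this manual splitting with the Rails connected_to API; the model names are illustrative, not our actual schema.

```ruby
# The requesting player's own data may feed a transaction and must reflect
# their latest writes, so it is read from the writer.
def my_knapsack_items(player_id)
  ApplicationRecord.connected_to(role: :writing) do
    Item.where(player_id: player_id).to_a
  end
end

# Other players' data is not written in this request and a little staleness
# is acceptable, so it is read from a replica.
def other_player_profile(other_player_id)
  ApplicationRecord.connected_to(role: :reading) do
    Profile.find_by(player_id: other_player_id)
  end
end
```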
00:09:01.440
from the infrastructure side we employ several multiple-database mechanisms the
00:09:07.620
key concept here is do that outside of Rails this concept increases flexibility
00:09:15.540
and achieves separation of concerns in our case we use a DNS-based service
00:09:22.920
discovery system with a custom DNS resolver on Rails load balancing is conducted by
00:09:30.779
weighted sampling of IP addresses and automatic writer failover is executed by
00:09:37.080
changing the IP address to a new writer node
00:09:43.140
so every aspect of multiple databases is indispensable for scalability
00:09:51.779
let's dive deeper into some critical features
00:09:57.839
now we are going to explore the API design for write scalability
00:10:06.839
let's begin with traffic offload by replication
00:10:13.640
replication is an effective technique to offload read traffic
00:10:18.959
here we replicate DDL and DML from writers to readers and almost the same
00:10:27.480
data is available on reader nodes by offloading read traffic to readers we
00:10:34.080
can reduce the writer's load and make the system scalable for read traffic
00:10:43.920
however there is a replication delay strictly speaking the latest data is
00:10:51.720
only available on the writer node
00:10:57.240
then how do we balance scalability and consistency
00:11:06.779
the solution is read-your-write consistency which ensures consistency for
00:11:13.860
your own writes this slide shows the read-your-write
00:11:19.800
consistency in Rails 6 the solution is time-based connection switching when
00:11:26.760
there is a recent write the database selector connects reads to the writer
00:11:32.399
this connection switching is the default behavior in Rails 6 and a replication
00:11:38.399
delay of up to two seconds is tolerated by default
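For reference, this built-in behavior is enabled with the standard Rails 6 configuration below; the values shown are the documented defaults, not necessarily our production settings.

```ruby
# config/environments/production.rb
# After a write, the same session keeps reading from the writer for the
# configured delay; afterwards reads fall back to the replicas.
config.active_record.database_selector = { delay: 2.seconds }
config.active_record.database_resolver =
  ActiveRecord::Middleware::DatabaseSelector::Resolver
config.active_record.database_resolver_context =
  ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
```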
00:11:45.420
using this architecture we gain high scalability for read-heavy applications and strong consistency is
00:11:50.820
guaranteed where players care about it on the other hand there are mainly two
00:11:56.760
cons for this architecture the first one is that this is not
00:12:02.700
suitable for write-heavy applications because of the higher proportion of
00:12:08.040
writer traffic so offloading is not so effective
00:12:13.079
also we have to strike a delicate balance between replication delay and the write-
00:12:19.800
read ratio when the tolerated replication delay increases the write-read ratio also
00:12:28.320
increases which means it is not scalable so we must further improve this for
00:12:35.640
write-heavy traffic this slide explains the read-your-write
00:12:42.899
consistency for write-heavy applications the solution is a resource-based
00:12:48.899
strategy the player's own data and critical data are fetched from the writer and other
00:12:55.860
data is fetched from readers one of the pros of this architecture
00:13:00.899
is effectiveness for write-heavy traffic also a much longer replication delay is
00:13:08.040
tolerated say tens or hundreds of seconds this magnitude of replication delay is
00:13:16.800
often invoked by cloud infrastructure failures or by request surges
00:13:23.040
this architecture is effective to offload inter-player API traffic because
00:13:28.860
there are fewer connections to writers the cons of this architecture are the
00:13:35.700
complicated business logic and ineffectiveness of offloading the intra-player API in other words this
00:13:43.920
architecture greatly enhances scalability but further improvement is
00:13:49.200
required considering the intra-player API
00:13:56.279
okay now we will take a look at traffic offload by resource-focused API design
00:14:10.560
let's conduct a case study the scenario is buy hamburgers and add
00:14:18.899
them into a knapsack there is a knapsack with items a wallet
00:14:24.360
with money and infinite hamburgers player A buys hamburgers and adds them to
00:14:32.459
the knapsack the wallet balance will decrease and the items in the knapsack will increase what APIs are required for
00:14:41.399
this scenario there are three required APIs create
00:14:49.320
burger purchase index items in knapsack and show wallet you can see the
00:14:55.500
Rails routes on your right-hand side the routes consist of three resources
00:15:02.040
let's see the API call sequence on the left-hand side on the first access a client calls index
00:15:10.199
items in knapsack and show wallet then the player decides what to buy and
00:15:17.160
the client calls create burger purchase index items in knapsack and show wallet
00:15:23.040
balance again the player decides what to buy and
00:15:28.500
the client calls all the APIs again these are mainly API calls that access
00:15:35.820
the writer this write-heavy traffic is because of the recent-write rule of read-your-write in
00:15:43.860
Rails 6 and because of the own-resources rule of read-your-write for write-heavy traffic
00:15:53.880
how do we solve this write-heavy API calling
00:16:01.380
the question is why do we fetch the data from a writer
00:16:07.019
this is because the items and wallet are changed
00:16:13.740
what exactly are the changed items and wallet
00:16:20.940
we can find them in the controller they are the resources the client wants to
00:16:26.399
watch the resources the write request affected as a side effect and the
00:16:32.100
resources Rails has in memory as Active Record instances
00:16:40.500
so the solution is to include all the affected resources in the API response
00:16:47.880
in this scenario there are two kinds of affected resources the first one is
00:16:54.360
the target resource which is the primary purpose of the API in this case the
00:17:00.060
burger purchase the other one is the affected resource which is the side
00:17:05.520
effect of the API in this case items in knapsack and wallet
00:17:12.959
in the JSON sample on your right-hand side you can see all the resources are
00:17:18.419
included in the response and you can see the API call sequence below the JSON
00:17:25.079
here the client calls index items in knapsack and show wallet on the first
00:17:31.080
access this sequence is the same as the original API call sequence
00:17:37.380
however after deciding what to buy the client calls only create burger purchase
00:17:44.640
this is because all the required resources are included in the response
00:17:50.160
of create burger purchase so there are no read API requests except
00:17:56.880
the first one and there are no writer database accesses except for the
00:18:01.980
transaction itself this reduced number of requests is beneficial both to databases and to
00:18:10.020
clients although there are duplicated resources and increased complexity we can make the
00:18:17.880
writer databases focus on transactions which is the only thing not achievable
00:18:24.120
by readers
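A hedged sketch of such a resource-focused write API; the controller, helper, and model names are assumptions, not the code on the slide. The response returns the target resource plus every resource the transaction affected, so the client does not need follow-up reads.

```ruby
class BurgerPurchasesController < ApplicationController
  def create
    purchase = nil
    ApplicationRecord.connected_to(role: :writing) do
      ApplicationRecord.transaction do
        purchase = current_player.burger_purchases.create!(amount: params[:amount])
        current_player.wallet.withdraw!(purchase.price)                   # side effect
        current_player.knapsack_items.add!(:hamburger, purchase.amount)   # side effect
      end
    end

    render json: {
      burger_purchase: purchase,                      # target resource
      knapsack_items: current_player.knapsack_items,  # affected resource
      wallet: current_player.wallet                   # affected resource
    }
  end
end
```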
00:18:29.580
this slide explains the summary of the combined result of replication and resource-
00:18:36.059
focused API design on the one hand read-your-write consistency
00:18:41.220
based on the data's owner is effective for write-heavy traffic
00:18:46.520
inter-player APIs and sharding on the other hand including affected
00:18:52.799
resources in write API responses is effective for minimizing read API
00:18:59.760
requests and writer database accesses also it is effective for the intra-player API
00:19:09.360
as a combined result we have obtained the following effects firstly we have gained
00:19:16.860
a complex but adaptive architecture due to manual read write switching
00:19:23.160
secondly we have achieved maximum traffic offload due to affected resources
00:19:29.700
in write API responses thirdly we have achieved robustness against
00:19:36.360
replication delay the architecture tolerates a very long replication delay
00:19:44.340
so we have achieved both consistency and write scalability with higher robustness
00:19:49.799
against replication delay
00:19:55.020
now we have successfully let writer databases focus on transactions
00:20:02.700
however what if the number of transactions exceeds the writer database's capacity
00:20:11.160
the answer is sharding now let's see the scalable data locality
00:20:19.080
for sharding
00:20:25.919
general multi-tenant sharding splits the data by the owner tenant
00:20:31.679
there is no interaction between shards
00:20:37.260
this is a scalable architecture we only have to add another shard to scale
00:20:45.440
also this is a valid architecture to guarantee the best data locality in
00:20:52.320
multi-tenant sharding
00:20:59.880
how about a single-tenant multiplayer game
00:21:05.820
interaction between shards is the most critical feature required for single
00:21:11.460
tenant multiplayer games players in one large game world must
00:21:17.340
interact with each other also sharding does not guarantee the
00:21:25.380
best data locality by itself furthermore sharding can invoke
00:21:32.460
consistency issues on node failures or network failures
00:21:42.480
so we have to handle it
00:21:47.940
the objective of data locality is to improve performance and consistency by
00:21:54.720
locating data wisely the basic strategy here is first consider cross-shard
00:22:01.620
operation costs as expensive second decrease the accessed shards and
00:22:07.440
additional computation per request then strike a balance on accumulated cost
00:22:14.460
let's conduct some case studies on the architecture
00:22:21.299
the first scenario is train myself and respond to a call for help
00:22:30.419
in this scenario player A and player B train themselves then player C calls
00:22:38.760
one of them for help here there is no need for the latest
00:22:44.580
trained state the requests consist of 95 percent everyday
00:22:50.520
training requests and five percent occasional help requests
00:22:56.100
the solution is a pull architecture you can see the players and databases in the
00:23:01.559
figure we locate the data on the owner's shard indexed by player ID then 95 percent of
00:23:09.900
write requests involve just one writer and five percent read requests involve
00:23:16.380
one writer and one reader this architecture is performant on the
00:23:22.260
majority write requests the next scenario is send gifts list
00:23:30.539
gifts and receive gifts in this scenario player A and player B
00:23:39.360
send gifts to player C player C lists them in chronological
00:23:46.140
order the requests consist of 30 percent occasional
00:23:51.539
sending requests and 70 percent everyday listing or receiving requests
00:23:57.419
you can see the figures on your right hand side the solution here is a push architecture
00:24:04.679
we locate the data on the receiver's shard indexed by receiver player ID
00:24:09.960
and sent_at by doing this 30 percent write requests involve two writers and 70
00:24:18.059
percent read-write requests involve just one writer
00:24:23.280
although there is an additional cost on sending gifts this architecture is
00:24:29.460
performant on the majority read-write requests also there is no merge-sort cost due to
00:24:36.840
pre-indexed data the third scenario is become friends and
00:24:44.520
list friends in this scenario player A player B and
00:24:52.260
player C list friends player C becomes friends with or is linked with
00:24:57.720
player B the requests consist of 95 percent everyday
00:25:02.880
listing requests and five percent occasional linking requests
00:25:08.940
the solution here is a double-write architecture we locate the data on both shards
00:25:16.799
indexed by player ID and friend player ID
00:25:22.980
then five percent write requests involve at most two writers and 95 percent read
00:25:30.840
requests involve only one reader this is performant on the majority read requests
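A hedged sketch of the double-write architecture, reusing the hypothetical on_player_shard helper from earlier; the Friendship model is an assumption.

```ruby
# 5% of requests: write the link to both players' shards so that either side
# can later list friends without a cross-shard query.
def link_friends!(player_id, friend_player_id)
  [[player_id, friend_player_id], [friend_player_id, player_id]].each do |owner, friend|
    on_player_shard(owner, role: :writing) do
      Friendship.create!(player_id: owner, friend_player_id: friend)
    end
  end
end

# 95% of requests: a single-reader query on the player's own shard.
def list_friends(player_id)
  on_player_shard(player_id, role: :reading) do
    Friendship.where(player_id: player_id).to_a
  end
end
```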
00:25:37.919
let's look into a more complicated
00:25:43.740
scenario now the scenario is join a guild and
00:25:49.919
defeat a raid boss player A player B and player C join a
00:25:55.380
guild they fight together against a powerful enemy
00:26:00.600
they share a limited amount of consumable weapons each player's attack and the enemy's
00:26:07.679
attack change the current situation both ACID and performance are critical
00:26:13.500
for updating the players' health state the formation the enemy's health state and
00:26:20.880
the guild's remaining weapons and state the solution is partial re-sharding
00:26:29.640
from the database's viewpoint players belong to different shards and players
00:26:36.059
belong to the same guild both ACID and performance are required here
00:26:42.779
first select another shard for the guild by weighted sampling
00:26:48.299
you can see the new shard with the sunglasses in the figure on your right-hand side
00:26:54.539
then reference the new shard from each player's original shard
00:26:59.640
at the design phase we extract critical resources for the raid boss
00:27:06.299
for example health state formation weapons and so on
00:27:12.179
then locate the extracted resources on the guild's shard
00:27:18.000
this achieves ACID and performance because the number of involved shards is
00:27:24.720
only one for transactions we can leave the non-critical resources
00:27:30.720
on the original shards in this way there is no cross-shard
00:27:37.740
overhead for guild-based transactions and APIs
00:27:43.140
and ACID is also guaranteed
00:27:49.980
we have successfully improved data locality and achieved scalability
00:27:59.520
this data locality technique benefits not only conventional databases but also
00:28:07.200
some distributed databases considering their under-the-hood architecture
00:28:16.200
however we have not yet fully solved the difficult issue for conventional
00:28:22.620
databases now let's look at consistency for
00:28:29.159
sharding the last part of the API design
00:28:39.240
keeping consistency over multiple databases is a difficult problem
00:28:47.159
especially back in the mid-2010s when our game server was
00:28:53.279
born someone says you know two-phase commit
00:28:59.580
can ensure atomicity over multiple databases like an XA transaction
00:29:08.340
this is the overview of the two-phase commit for comparative study first of all
00:29:15.000
there are two kinds of participants here one is the transaction manager in this case
00:29:21.720
the API server and the other one is the resource manager in this case the MySQL database
00:29:29.340
the sequence is as follows first the transaction manager asks resource
00:29:36.240
managers to update records as usual and the transaction manager asks resource
00:29:42.480
managers whether or not they are prepared to commit then resource managers execute the
00:29:50.399
transaction up to the point where they will be asked to commit
00:29:56.340
after that resource managers respond to the transaction manager whether they can commit or not
00:30:03.480
if all the resource managers can commit then the transaction manager tells all the
00:30:08.760
resource managers to commit it however if any of the resource managers
00:30:14.399
cannot commit the transaction manager tells all of the resource managers to roll it back
00:30:21.659
this is simple at least for nominal cases
00:30:27.179
you can see the more casual description of that sequence in the picture on your
00:30:32.580
right hand side here transaction manager asks resource managers whether they can commit or not
00:30:41.460
transaction manager is asking for an explicit agreement here
00:30:50.100
however unfortunately the MySQL documentation says prior to MySQL 5.7.7 XA
00:30:58.140
transactions were not compatible with replication at all
00:31:03.960
also two-phase commit like XA transactions has potential issues for
00:31:09.659
scalability prioritizing scalability we have to
00:31:16.320
relax the atomicity while maintaining practically acceptable consistency
00:31:25.380
let's conduct another case study for this consistency
00:31:30.419
player A consumes an item in the knapsack and player B receives the consumed
00:31:38.039
amount as a gift pretty easy isn't it
00:31:44.940
this code is the result of eventual consistency implemented in Rails 6 it seems
00:31:52.500
difficult because of connected_to but do not worry we will explore this
00:31:58.799
one by one rule one is reduce the risk of cross-shard
00:32:07.500
transaction anomalies before proposing this rule we should
00:32:13.140
check the preconditions for relaxed cross-shard transactions first all resource managers are composed
00:32:21.779
of the same architecture MySQL second resource managers tend to fail for
00:32:29.399
a continuous period of time for example failures in cloud infrastructure can
00:32:35.279
trigger this kind of failure also resource managers are unlikely to
00:32:41.399
fail during an ongoing commit let's look at the figures on your right
00:32:47.220
hand side there are two MySQL databases in the middle the top part of this figure is a
00:32:54.059
conventional two-phase commit and the bottom part of this figure is the proposed relaxed transaction
00:33:01.559
in the latter case the transaction manager asks the resource managers to update the
00:33:06.779
records and the resource managers respond with OK then the transaction manager assumes that
00:33:13.860
they can probably commit and the transaction manager executes the
00:33:20.460
commit order let's take a look at the Rails code that achieves a relaxed transaction
00:33:28.200
in the beginning there are two nested transaction blocks these nested transactions execute
00:33:36.059
consecutive commits at the end of the transaction blocks right after the beginning of the
00:33:43.140
transaction block you can see Player.lock.find(player.id)
00:33:48.600
this code is locking granular resources to avoid race conditions during the
00:33:55.140
transactions which further ensures the success rate of the transaction
00:34:02.220
the main business logic is omitted in this example
00:34:08.220
finally toward the end of the transaction blocks there are consecutive save methods this code is the critical
00:34:15.899
component in the relaxed transaction the Ruby code invokes consecutive update
00:34:22.800
queries followed by consecutive commit queries which minimizes the time and
00:34:29.399
uncertainty between the transaction manager and resource managers
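A hedged reconstruction of that relaxed cross-shard transaction, using the Rails 6.1+ sharding API; the model, variable, and shard names are assumptions rather than the code on the slide.

```ruby
def deliver_gift_across_shards!(key:, amount:, sender_player_id:, receiver_player_id:,
                                sender_shard:, receiver_shard:)
  ApplicationRecord.connected_to(role: :writing, shard: sender_shard) do
    ApplicationRecord.transaction do
      Player.lock.find(sender_player_id)  # granular row lock on the sender's shard
      # ... main business logic omitted here, as on the slide ...
      SenderHistory.new(idempotency_key: key, amount: amount).save!  # UPDATE on the sender's shard

      ApplicationRecord.connected_to(role: :writing, shard: receiver_shard) do
        ApplicationRecord.transaction do
          Player.lock.find(receiver_player_id)  # granular row lock on the receiver's shard
          Gift.new(idempotency_key: key, receiver_player_id: receiver_player_id,
                   amount: amount).save!        # UPDATE on the receiver's shard
        end                                     # COMMIT on the receiver's shard
      end
    end                                         # COMMIT on the sender's shard, right after
  end
end
```

Because both commits fire back to back when the blocks close, the window in which only one shard has committed stays as small as possible.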
00:34:34.859
the typically expected but very unlikely anomalies are as follows
00:34:41.520
one transaction manager or Rails failures between the consecutive commits
00:34:47.820
the next one is network failures between the consecutive commits and the third one is
00:34:54.480
sender shard node failure before the last commit
00:35:00.240
here we have achieved an XA-comparable success rate for nominal cases except
00:35:06.839
for the difference from XA
00:35:12.000
prepare to update and the difference from XA commit to
00:35:17.820
commit also there are no blocking factors on
00:35:23.280
node or network failures although we do not have a persistent log
00:35:28.560
on resource managers we have gained a reasonable success rate for nominal
00:35:33.720
cases here rule 2 is to keep inconsistency
00:35:38.880
observable and acceptable before proposing this rule we have to know that
00:35:45.599
there are three states for cross-shard transactions one is none of the shards commit this is
00:35:53.280
the failure state the second one is some but not all of the
00:35:58.500
shards commit this is the anomaly state and the last one is all of the shards
00:36:05.220
commit this is the success state in order to recover from the anomaly
00:36:11.460
state the general two-phase commit forces all the failed shards to commit so we
00:36:18.000
do the same thing firstly we serialize and control the
00:36:23.160
order of commits this reduces the anomaly states to N minus 1 where N is the number
00:36:29.099
of resource managers secondly we make the inconsistency
00:36:34.579
observable from the transaction manager in this scenario the sender this is the
00:36:41.339
trigger for retry or recovery after that we make the inconsistency
00:36:46.740
acceptable from others or from the next action we let the inconsistency behave
00:36:54.180
as if the transactions are all successful also the next action should not affect
00:37:02.040
the recovery or vice versa this practice leads to non-blocking characteristics
00:37:09.540
now let's see the Rails code on your right-hand side in the beginning you can see the outer
00:37:16.320
transaction block is the sender's shard so the anomaly case is that the receiver
00:37:22.500
reached commit and the sender reached rollback let's imagine what will happen in this
00:37:29.099
anomaly at the bottom of this transaction first the sender history and the sender
00:37:34.800
summary would not be saved in this case the absence of history is observable and
00:37:42.240
the sender can retry based on that also the receiver gift would be saved in the
00:37:49.079
anomaly case and the receiver can use the receiver gift regardless of the sender's
00:37:55.500
transaction by this observable and acceptable
00:38:01.200
inconsistency we have gained a non-blocking starting point for retry or
00:38:07.200
recovery then here comes rule three keep idempotency
00:38:16.140
and reach eventual consistency the proposed solution utilizes retry
00:38:23.099
because retry is the simplest but reliable method actually it works even
00:38:29.040
for network failures outside the server systems in order to utilize retry we have to use
00:38:37.320
an idempotency key to identify the resources the identifier is often called a
00:38:44.220
transaction ID idempotency key or request ID
00:38:50.040
we build the business logic assuming the inconsistency a history-like table and find_or_initialize_
00:38:57.480
by is a convenient way to handle this
00:39:02.700
inconsistency let's look at the actual Rails code this code is the main business logic of
00:39:10.619
the cross-shard transaction in the middle you can see the return if sender_history.persisted?
00:39:18.660
this code ensures the idempotency for retried writes if the sender history is persisted
00:39:24.960
it means that all the transactions successfully reached commit no further
00:39:30.960
update is required at the bottom you can see Gift.
00:39:36.720
find_or_initialize_by in the anomaly case the gift already exists so the
00:39:42.839
find_or_initialize_by mitigates this inconsistency efficiently
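A hedged sketch of that idempotent business logic; the names and the surrounding shard plumbing are assumptions. The idempotency key lets a retry skip what already reached commit and repair only the missing side.

```ruby
def send_gift!(idempotency_key:, sender_player_id:, receiver_player_id:, amount:)
  # Already persisted means every shard reached commit on a previous attempt,
  # so a retried request becomes a no-op (idempotency).
  sender_history = SenderHistory.find_or_initialize_by(idempotency_key: idempotency_key)
  return if sender_history.persisted?

  # In the anomaly case the gift already exists on the receiver's shard;
  # find_or_initialize_by reuses it instead of creating a duplicate.
  gift = Gift.find_or_initialize_by(idempotency_key: idempotency_key)
  gift.assign_attributes(receiver_player_id: receiver_player_id, amount: amount)

  # ... run the relaxed cross-shard transaction from rule one here, saving
  #     `gift` on the receiver's shard and `sender_history` on the sender's ...
end
```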
00:39:48.660
by introducing idempotency and eventual consistency we not only
00:39:54.180
overcome database anomalies but also failures outside the server systems
00:40:00.599
both eventual consistency and scalability are achieved I think this is
00:40:07.920
sufficiently practical are you not satisfied do you require a
00:40:15.180
stricter solution then rule four meets the requirement
00:40:22.020
rule four is utilize an at-least-once message queue for strict completion
00:40:28.560
the precondition for this rule is that business logic already complies with the
00:40:34.680
rules one two and three we divide the architecture into two parts the first part is the enqueue
00:40:41.940
phase and the other one is the dequeue phase in the enqueue phase we save data to a
00:40:48.900
single-shard database ACID is guaranteed here then we enqueue a
00:40:54.839
message referencing the data within the transaction this guarantees at-least-once message
00:41:02.940
existence then make the message dequeued only after
00:41:08.880
the transaction is finished force timeouts both on Rails and on
00:41:14.460
the databases and use perform_later with a delayed queue
00:41:19.859
let's see the Rails code on your right-hand side in the middle of the code you
00:41:25.200
can see the wait: 30.seconds and perform_later within the transaction
00:41:30.780
given the MySQL and Rails timeouts it is guaranteed that the transaction
00:41:36.119
will have reached either commit or rollback when the Active Job is performed
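A hedged sketch of the enqueue phase; the job and model names are assumptions, and the job class itself appears in the dequeue sketch further below.

```ruby
def enqueue_gift_delivery!(idempotency_key:, payload:)
  ApplicationRecord.transaction do
    # Single-shard write: ACID is guaranteed here.
    request = GiftRequest.create!(idempotency_key: idempotency_key, payload: payload)

    # Enqueue within the transaction, delayed past the timeouts forced on Rails
    # and MySQL, so the transaction has reached commit or rollback by run time.
    GiftDeliveryJob.set(wait: 30.seconds).perform_later(request.id)
  end
end
```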
00:41:43.380
in the dequeue phase just retrieve the messages from the queue and check if the
00:41:49.020
data in the database exists if it does not exist then the transaction reached
00:41:55.200
rollback so do nothing if it exists then the transaction reached commit so continue
00:42:01.800
the process then conduct idempotent business logic and be
00:42:07.920
sure to finish the message only after the business logic succeeds this guarantees at-least-once job execution
00:42:16.099
also do not forget to consider the risk of a Ruby process crash
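A hedged sketch of the dequeue phase, continuing the assumed names above and a queue adapter that retries failed jobs; complete_gift_delivery! is a hypothetical helper.

```ruby
class GiftDeliveryJob < ApplicationJob
  queue_as :default

  def perform(gift_request_id)
    request = GiftRequest.find_by(id: gift_request_id)
    # Missing data means the enqueue-phase transaction rolled back: do nothing.
    return if request.nil?

    # The transaction reached commit: run the idempotent business logic.
    # Raising here leaves the message unfinished so it is retried at least once,
    # which also covers a Ruby process crash in the middle of this method.
    complete_gift_delivery!(request)
  end
end
```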
00:42:22.800
by this architecture we have achieved a quasi-theoretically guaranteed
00:42:29.240
completion of cross-shard transactions
00:42:36.000
thus we have acquired practically acceptable consistency without losing
00:42:41.160
scalability this is essentially a concern of conventional
00:42:46.859
databases so we should bear in mind that this should be more elegantly solved by
00:42:52.500
sophisticated databases so is that all we have to do
00:43:00.300
for write scalability there's more
00:43:06.780
and we are going to look at the pitfalls around multiple databases
00:43:17.819
the objective of sharding is horizontal write scalability of databases
00:43:24.480
specifically speaking given N as the number of shards keep time and space complexity
00:43:31.200
to O(1) or at least O(log N) sounds simple and trivial we have
00:43:37.920
actually achieved that by sharding however there are more pitfalls around
00:43:44.099
us the first pitfall is the number of connections per database
00:43:50.940
the connection pool of Rails caches and reuses established database connections for performance
00:43:57.900
and there is a limitation here because the cached pool cannot be shared
00:44:05.640
across processes or across instances also due to Ruby's GVL only one
00:44:15.599
native thread is runnable at a time per process the result of these two limitations
00:44:21.540
is a number of processes larger than the CPU core count
00:44:27.599
and generally speaking there are many API servers and many shards
00:44:34.560
so let's do the math here the number of servers is 1,000 the CPU
00:44:39.900
core count is 32 and the pool size per process is 5 then with one process per core we yield 160,
00:44:48.060
000 connections per database and with more processes than cores up to 480,000
00:44:54.720
actually this number exceeds the maximum connections of MySQL and PostgreSQL
00:45:00.900
and we can see deteriorated database performance due to many connections in terms of
00:45:06.720
memory and CPU actually the typical default from cloud
00:45:12.300
computing providers is around hundreds or thousands so we have to target this
00:45:18.420
value the strategy for reducing the number is
00:45:24.480
like this first assume an average of one actually used database connection
00:45:30.359
per request and then release the connection so that each process holds around that average
00:45:37.140
solution one is multiplexing connections at proxies between databases and API servers
00:45:44.579
PgBouncer for PostgreSQL is famous for this solution
00:45:50.400
the pro of this solution is there is no connect overhead at databases
00:45:57.660
however there are cons as for transaction pooling some database
00:46:02.700
features are disabled and regarding session pooling clearing connections is
00:46:08.040
also required and solution two is clear the connection pool after each request
00:46:15.240
clearing Active Record connections is a well-known way to do this
00:46:20.339
the pros of this solution are there are no additional components required and
00:46:26.280
full database features are available although this has a connect overhead
00:46:32.760
at databases this is not so problematic for MySQL and we have achieved a low connection number with solution two
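A hedged sketch of solution two as a Rack middleware, not the exact production code: the connections are physically closed after every request, trading a small reconnect cost on MySQL for a much lower connection count on each database.

```ruby
class ClearConnectionPools
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  ensure
    # Close this process's cached database connections once the response is done.
    ActiveRecord::Base.clear_all_connections!
  end
end

# config/application.rb
# config.middleware.use ClearConnectionPools
```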
00:46:38.700
the next pitfall is the connection
00:46:45.720
reaper thread per API server a connection reaper thread is a thread
00:46:51.300
to periodically flush idle database connections this was enabled by default
00:46:57.180
in Rails 5 in the original design one reaper thread was created per
00:47:04.680
database spec so let's do the math here again the number of Unicorn processes is 100
00:47:12.000
and the number of shards is 300 and the number of writer and reader roles is two
00:47:18.420
then we can yield 60,000 reaper threads in total here
00:47:26.520
this actually exceeds default kernel parameters and also we can see
00:47:32.880
deteriorated performance due to many threads
00:47:37.980
actually in Rails 6 this is already fixed and if you are using Rails
00:47:44.280
below 6 you can set reaping_frequency to nil and disable the thread
00:47:53.940
the summary of pitfalls is like this we have learned that even
00:47:59.220
elaborately crafted code fails before a huge number of databases and there might
00:48:04.920
be other pitfalls behind you
00:48:10.619
so how do we avoid pitfalls there are too many things to consider
00:48:19.380
then comes the test tests with multiple databases
00:48:28.140
if we haven't tried it assume it's broken
00:48:33.480
we should test with multiple databases
00:48:39.060
the recommendation is to use real remote databases in tests first use real
00:48:45.000
multiple database configurations in tests avoid stubs and use real databases and
00:48:52.099
middlewares and separate read write connections and there should be more
00:48:58.380
than one shard for horizontal sharding and we prefer FactoryBot over fixtures
00:49:04.079
because it can test the dynamic Shard allocation and access
00:49:10.260
and we recommend using real transactions in tests
00:49:15.540
by setting the config and we also simulate partial transaction
00:49:21.060
failures and this practice can detect wrong
00:49:26.520
understanding of multi-database behavior wrong choice of reader or writer connections misallocation of or mis-access to shards and
00:49:34.560
wrong handling of recovery or retry of partially failed transactions
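A hedged sketch of such a test in RSpec style; send_gift! and the models follow the earlier assumed names, and sender and receiver are assumed to be created by the surrounding setup. The anomaly state where only the receiver shard committed is set up directly, and the idempotent retry is expected to repair it without duplicating the committed side.

```ruby
it "repairs a partially failed cross-shard gift transaction on retry" do
  key = SecureRandom.uuid
  # Anomaly state: the receiver's shard committed, the sender's shard rolled back.
  Gift.create!(idempotency_key: key, receiver_player_id: receiver.id, amount: 1)

  send_gift!(idempotency_key: key, sender_player_id: sender.id,
             receiver_player_id: receiver.id, amount: 1)

  expect(SenderHistory.find_by(idempotency_key: key)).to be_present
  expect(Gift.where(idempotency_key: key).count).to eq(1)  # no duplicate gift
end
```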
00:49:41.940
also we inject anomalies in the sandbox environment
00:49:47.460
the purpose of the sandbox environment is to develop the client system and conduct QA
00:49:53.280
on integration of the client the server and other components the anomalies should be tested there
00:50:00.720
in our case we always had a 10-second replication delay
00:50:06.839
and we implemented a debugger to inject exceptions for important and complicated
00:50:12.900
APIs this practice can detect server bugs on
00:50:18.240
accessing readers client bugs in relying on the API for related data and architecture bugs in the
00:50:27.540
design of read write APIs and idempotency and eventual consistency
00:50:34.800
and also we highly recommend conducting load tests before service launch
00:50:40.079
because this is the most unpredictable part of multiple databases for example
00:50:46.319
does the app work with 10 shards 100 shards or 1,000 shards
00:50:53.640
and actually it is required to provision servers for the maximum load in
00:50:59.099
production and these practices can detect an uneven
00:51:04.440
distribution of load across partitions and time and space complexity of O(N) or
00:51:10.800
greater and other issues due to the real scale
00:51:17.160
now we can relax ourselves in addition to relaxing multiple databases
00:51:24.839
lastly past present and future of multiple
00:51:31.319
databases at our game servers in the past our project started with Rails 4
00:51:38.700
2 which was the latest version at that time and there was no native multiple database
00:51:45.240
support back then so we extended Rails and various gems and made many monkey
00:51:53.400
patches for example in-house modules were created and existing gems were extended
00:52:01.380
and we prioritized performance over complexity and we prioritized
00:52:07.619
Rails-like syntax over implicit computation there we implemented various features and
00:52:14.700
this was successful in supporting games then at present
00:52:21.000
there are two different Rails versions ours is 5.2 plus extended gems plus
00:52:27.780
monkey patches and upstream Rails is 6 plus native multiple
00:52:33.839
database support the feature discrepancy is shrinking for example common pitfalls are being
00:52:40.559
fixed at upstream Rails and useful features are being implemented at up
00:52:45.780
stream Rails however there is a growing code discrepancy
00:52:53.760
because there is different code to achieve similar things and there is a more sophisticated design at upstream
00:53:00.900
Rails so which way are we going
00:53:09.000
of course on the Rails we are considering upgrading to
00:53:15.480
upstream with simplified multiple databases I think this is sustainable continuous
00:53:21.780
development toward the future also we are considering migrating and
00:53:27.240
contributing our technical expertise to Rails this not only benefits Rails but
00:53:34.079
also benefits us and we can turn the technical debt into technical assets
00:53:42.720
we'll be back on Rails
00:53:51.059
so let's sum up today's session we have explored various techniques to
00:53:58.740
scale Rails API to write-heavy traffic first of all we utilize all aspects of
00:54:04.920
multiple databases vertical splitting horizontal sharding replication and load
00:54:11.220
balancing are all necessary then we let writer databases focus on
00:54:17.460
transactions replication can offload the read traffic and resource-focused API
00:54:23.280
response design can minimize unnecessary data access to writers
00:54:28.380
next we optimize data locality to reduce the cross-shard performance overhead
00:54:34.740
we have explored pull push double-write and partial re-sharding architectures
00:54:41.760
after that we introduced practically acceptable and scalable consistency it
00:54:48.599
consists of risk minimization observable and acceptable inconsistency and
00:54:54.599
eventual consistency we also learned some pitfalls and risks even for elaborately crafted code
00:55:03.180
therefore we recommend tests with multiple databases this can ensure all
00:55:09.180
the techniques here are working as intended by using real databases real anomalies
00:55:16.319
and real scale we can ensure everything is working well
00:55:22.500
lastly I believe it is crucial to contribute to Rails upstream which turns
00:55:29.520
technical debt into technical assets
00:55:35.760
thank you for listening DeNA loves Rails