
Scaling Rails API to Write

by Takumasa Ochi

In the video titled "Scaling Rails API to Write," presented by Takumasa Ochi at RailsConf 2021, the speaker delves into the challenges and solutions for handling the write-heavy traffic encountered by mobile game API backends. Takumasa introduces his professional background and the mission of DeNA, the company he represents, which offers diverse services including mobile gaming.

The session outlines several key points:
- Mobile Game API Characteristics: The speaker emphasizes the significance of non-real-time servers in mobile gaming environments, which manage transactions and data storage tasks, highlighting the disproportionate traffic focused on write operations.
- Scalable Database Configurations: Takumasa discusses the importance of strategic database management approaches, including vertical splitting, horizontal sharding, and read-write splitting. He explains how these configurations allow for managing heavy database transactions effectively.
- API Design for Write Scalability: Through the exploration of resource-focused API designs and replication processes, the speaker highlights strategies that offload read traffic from writer nodes, addressing challenges of latency and consistency to ensure optimal performance.
- Sharding Mechanisms and Data Locality: The session illustrates various sharding configurations such as multi-tenant and single-tenant designs and their impact on performance and consistency in gameplay scenarios. Some scenarios, like handling friends lists and training requests, demonstrate how sharding can enhance scalability without sacrificing user experience.
- Consistency Issues with Multiple Databases: Takumasa discusses methods for achieving eventual consistency across multiple databases, highlighting rules to manage transactions reliably, including a relaxed two-phase commit and the use of idempotency keys to manage anomalies.
- Avoiding Pitfalls in Database Management: The commentary cautions against oversights when managing multiple databases, urging extensive testing to confirm system integrity and reliability before launch.

The session concludes by summarizing the importance of employing all aspects of multiple databases and emphasizes testing techniques that ensure robustness and adaptability. Takumasa encourages developers to contribute back to the Rails community, emphasizing that collaboration can turn technical debt into assets for future development.

This comprehensive session provides practical insights into optimizing Rails API systems for game development, addressing both technical implementation and consistency management.

Tens of millions of people can download and play the same game thanks to mobile app distribution platforms. Highly scalable and reliable API backends are critical to offering a good game experience. A notable characteristic here is not only the magnitude of the traffic but also the ratio of write traffic. How do you serve millions of database transactions per minute with Rails?

The keyword to solve this issue is "multiple databases," especially sharding. From system architecture to coding guidelines, we'd like to share the practical techniques derived from our long-term experience.

RailsConf 2021

00:00:05.299 Thank you for coming to my session. The theme of this session is scaling Rails API to write-heavy traffic.

00:00:12.660 First, I will introduce myself. I am Takumasa Ochi, a software engineer and product manager at DeNA. I am developing backend systems for mobile game apps. I love developing and playing games, and I also love Ruby and Ruby on Rails.

00:00:41.340 This is our company, DeNA. Our mission is "We delight people beyond their wildest dreams." We offer a wide variety of services: games, sports, live streaming, healthcare, automotive, e-commerce, and so on. We started using multiple databases for e-commerce in the 2000s. Today's theme is games and multiple databases.

00:01:12.540 This is the agenda for today's session.
00:01:20.939 First, I will present an explanation of mobile game API backends and mobile game app characteristics. We will also take a quick overview of scalable database configurations. Then we move on to the API designs for write scalability, where we will explore replication, API response design, and sharding. After that, we will check some pitfalls around multiple databases and testing with multiple databases. Then I will share considerations on the past, present, and future of multiple databases at our game servers.

00:02:05.340 Finally, we will sum up the session.
00:02:10.979 Okay, let's start with the mobile game API backends.

00:02:20.700 Generally speaking, mobile game app backends are split into two categories: one is real-time servers, and the other one is non-real-time servers. In this session we will focus on non-real-time servers. The objective of these servers is to provide secure data storage and transactions, plus multiplayer or multi-device communications. For example, non-real-time servers handle inventories, friends, daily bonuses, and other transactions.

00:02:55.980 This figure illustrates an example of API calls: a client wants to consume energy and help a friend; the server conducts a resource exchange transaction and returns a successful response.
00:03:17.220 This slide explains the system overview of our mobile game API backends. We started developing our game API server in the mid-2010s. The system is running on Ruby on Rails 5.2, mainly on IaaS. The main API is a monolithic application, and there exist other components as separate services, such as proxy servers, management tools, and HTML servers for webviews. We utilize API mode to minimize controller modules and overhead. Also, we employ schema-first development with Protocol Buffers. We use MySQL, Memcached, Redis, and Elasticsearch.

00:04:07.980 Now we will look at mobile game app characteristics.
00:04:22.199 How do you try a new game? Open the app store link, tap the install button, wait a minute, and play. It is super easy for players to play. However, it is not so easy for servers to handle the writes.

00:04:42.180 This slide shows the reasons for write-heavy traffic. First, as mentioned earlier, easy install leads to easy write access: downloading is very easy, and mobile app distribution platforms ensure installs from all around the globe. There is no need to make an account, and the first in-game action means the first write. Secondly, various in-game actions trigger accesses, especially writes, such as receiving daily bonuses, playing battles, buying items, etc. Generally speaking, the order of magnitude for mobile apps ranges from 10K requests per second to 1M requests per second. Therefore, scaling API to write-heavy traffic is required here.
00:05:46.259 Now we will take a quick overview of scalable database configurations.

00:05:57.300 This slide is an overview of our database configurations. From the application side, we use vertical splitting, horizontal sharding, and read-write splitting. On the other hand, from the infrastructure side, we conduct load balancing on readers and automatic writer failover. Let's look at them one by one.

00:06:25.440 The first technique is vertical splitting. We split the databases into two parts. The first part is "common": there is one writer node, and this is optimized for read-heavy traffic. We prioritize consistency over scalability here. For example, game setting data and pointers to shards are stored in this database. The other part is "shard," which is the main focus of today's session: there are multiple writer nodes, and this is optimized for write-heavy traffic. We prioritize scalability over consistency here. Also, this shard database is horizontally sharded. For example, player-generated data is stored in this database.
00:07:24.419 This is the horizontal sharding. We take data locality on a player and partition the data by player ID, not by resource types. This architecture is effective for holding consistency on each player, and efficient for player-based data access.

00:07:45.240 We use a mapping table called player_shards to assign a shard. We decide a shard on the player's first access by weighted sampling. This dynamic shard allocation is flexible for adding new shards and balancing the load on them. There is a limitation on the maximum inserts per second for this architecture: for example, thousands or tens of thousands of queries per second are the limit for MySQL. If we hit the limit, we can use a range mapping table or distributed mapping tables.
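As a rough illustration only, dynamic shard allocation on first access could look like the sketch below; the `PlayerShard` model and the weight table are hypothetical, not the talk's production code.

```ruby
# Hypothetical sketch of dynamic shard allocation: on a player's first access,
# pick a shard by weighted sampling and record it in a mapping table.
class PlayerShard < ApplicationRecord
  # columns: player_id (unique), shard_name
  SHARD_WEIGHTS = { "shard_01" => 1, "shard_02" => 1, "shard_03" => 2 }.freeze

  def self.assign!(player_id)
    # idempotent: an existing mapping wins over a new sample
    find_or_create_by!(player_id: player_id) do |mapping|
      mapping.shard_name = weighted_sample
    end
  end

  def self.weighted_sample
    point = rand * SHARD_WEIGHTS.values.sum
    SHARD_WEIGHTS.each do |name, weight|
      return name if point < weight
      point -= weight
    end
    SHARD_WEIGHTS.keys.last # guard against floating-point edge cases
  end
end
```

Adding a shard or rebalancing the load then amounts to changing the weights, which is the flexibility described above.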
00:08:29.220 Of course, we utilize read-write splitting too. In our case, we use manual read-write splitting. This is complex but efficient: API servers access writer nodes to retrieve data used for transactions and the player's own data. On the other hand, API servers access reader nodes to retrieve data not used for transactions and other players' data.
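Expressed in Rails 6 terms (the talk's Rails 5.2 code base uses in-house equivalents), that manual rule could be sketched like this; the `items` association is an assumption for illustration.

```ruby
# Sketch of manual read-write splitting: the viewer's own data and data used
# for transactions come from the writer; other players' data comes from a reader.
def fetch_knapsack_items(viewer_id, owner)
  role = viewer_id == owner.id ? :writing : :reading
  ActiveRecord::Base.connected_to(role: role) do
    owner.items.order(:created_at).to_a # materialize inside the block
  end
end
```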
00:09:01.440 From the infrastructure side, we employ several multiple-database mechanisms. The key concept here is: do that outside of Rails. This concept increases flexibility and achieves separation of concerns. In our case, we use a DNS-based service discovery system with a custom DNS resolver on Rails. Load balancing is conducted by weighted sampling of IP addresses, and automatic writer failover is executed by changing the IP address to a new writer node.

00:09:43.140 So every aspect of multiple databases is indispensable for scalability.

00:09:51.779 Let's dive deeper into some critical features.
00:09:57.839 Now we are going to explore the API design for write scalability.

00:10:06.839 Let's begin with traffic offload by replication.

00:10:13.640 Replication is an effective technique to offload read traffic. Here we replicate DDL and DML from writers to readers, and almost the same data is available on reader nodes. By offloading read traffic to readers, we can reduce the writer's load and make the system scalable for read traffic.

00:10:43.920 However, there is a replication delay. Historically speaking, the latest data is only available on the writer node.

00:10:57.240 Then how do we balance between scalability and consistency?
00:11:06.779 The solution is read-your-write consistency, which ensures consistency on your own writes.

00:11:13.860 This slide shows the read-your-write consistency in Rails 6. The solution is time-based connection switching: when there is a recent write by the session, the database selector connects to the writer. This connection switching is the default in Rails 6, and a 2-second replication delay is tolerated by default. Using this architecture, we gain high scalability for read-heavy applications, and strong consistency is guaranteed where players care.
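For reference, this is the standard Rails 6 configuration for that automatic switching, with the 2-second tolerated delay mentioned above (this is the documented upstream mechanism, not the speaker's in-house one):

```ruby
# config/environments/production.rb
config.active_record.database_selector = { delay: 2.seconds }
config.active_record.database_resolver =
  ActiveRecord::Middleware::DatabaseSelector::Resolver
config.active_record.database_resolver_context =
  ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
```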
00:11:50.820 On the other hand, there are mainly two cons for this architecture. The first one is that it is not suitable for write-heavy applications: because of the higher proportion of traffic to the writer, offloading is not so effective.

00:12:13.079 Also, we have to strike a severe balance between the replication delay and the write-read ratio: when the tolerated replication delay increases, the write-read ratio also increases, which means it is not scalable. So we must further improve this for write-heavy traffic.
00:12:35.640 This slide explains the read-your-write consistency for write-heavy applications. The solution is a resource-based strategy: players' own data and critical data are fetched from the writer, and other data is fetched from readers.

00:12:55.860 One of the pros of this architecture is its effectiveness for write-heavy traffic. Also, a much longer replication delay is tolerated, say tens or hundreds of seconds; this magnitude of replication delay is often invoked by cloud infrastructure failures or by request surges. This architecture is also effective for offloading inter-player API traffic, because there are fewer connections to writers. The cons of this architecture are the complicated business logic and the ineffectiveness of offloading for intra-player APIs. In other words, this architecture greatly enhances scalability, but further improvement is required considering the intra-player API.
00:13:56.279 Okay, now we will take a look at traffic offload by resource-focused API design.

00:14:10.560 Let's conduct a case study. The scenario is: buy hamburgers and add them into a knapsack. There is a knapsack with items, a wallet with money, and infinite hamburgers. Player A buys hamburgers and adds them to the knapsack. The wallet balance will decrease, and the items in the knapsack will increase. What APIs are required for this scenario?

00:14:41.399 There are three required APIs: create burger purchase, index items in knapsack, and show wallet. You can see the Rails routes on your right-hand side; the routes consist of three resources. Let's see the API call sequence on the left-hand side. On the first access, a client calls index items in knapsack and show wallet. Then the player decides what to buy, and the client calls create burger purchase, index items in knapsack, and show wallet again. The player decides what to buy again, and the client calls all the APIs. There are many API calls that access the writer. This write-heavy traffic arises because of the "recent write" rule of read-your-write in Rails 6, and because of the "own resources" rule of read-your-write for write-heavy traffic.
00:15:53.880 How do we solve this write-heavy API calling?

00:16:01.380 The question is: why do we fetch the data from a writer? This is because the items and wallet are changed. What exactly are the changed items and wallet? We can find them in the controller: they are the resources the client wants to watch, the resources the write request affected as a side effect, and the resources Rails has in memory as Active Record instances.
00:16:40.500 So the solution is to include all the affected resources in the API response.

00:16:47.880 In this scenario, there are two kinds of affected resources. The first one is the target resource, which is the primary purpose of the API; in this case, the burger purchase. The other one is the affected resource, which is the side effect of the API; in this case, the items in the knapsack and the wallet.

00:17:12.959 In the JSON sample on your right-hand side, you can see all the resources are included in the response, and you can see the API call sequence below the JSON. Here the client calls index items in knapsack and show wallet on the first access; this sequence is the same as the original API call sequence. However, after deciding what to buy, the client calls only create burger purchase. This is because all the required resources are included in the response of create burger purchase. So there are no read API requests except the first one, and there are no writer database accesses except for the transaction itself. This reduction of requests is beneficial both to databases and to clients. Although there are duplicated resources and increased complexity, we have made writer databases focus on transactions, which is the only thing not achievable by readers.
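A minimal controller sketch of that response shape follows; the model names and helper methods (`withdraw!`, `add!`, `Burger::PRICE`) are hypothetical illustrations, not the talk's code.

```ruby
class BurgerPurchasesController < ApplicationController
  # POST /burger_purchases: returns the target resource plus every
  # affected resource, so the client needs no follow-up read APIs.
  def create
    quantity = params.require(:quantity).to_i
    purchase = nil
    ActiveRecord::Base.transaction do
      current_player.wallet.withdraw!(Burger::PRICE * quantity)
      current_player.knapsack.add!(:burger, quantity)
      purchase = current_player.burger_purchases.create!(quantity: quantity)
    end
    render json: {
      burger_purchase:   purchase,                        # target resource
      items_in_knapsack: current_player.knapsack.items,   # affected resource
      wallet:            current_player.wallet            # affected resource
    }
  end
end
```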
00:18:29.580 This slide summarizes the combined result of replication and resource-focused API design. On the one hand, read-your-write consistency based on the data's owner is effective for write-heavy traffic, inter-player APIs, and sharding. On the other hand, including affected resources in the write API response is effective for minimizing read API requests and writer database accesses; it is also effective for the intra-player API.

00:19:09.360 As a combined result, we have obtained the following effects. Firstly, we have gained a complex but adaptive architecture due to manual read-write switching. Secondly, we have achieved maximum traffic offload due to affected resources in write API responses. Thirdly, we have achieved robustness against replication delay: the architecture tolerates a very long replication delay.

00:19:44.340 So we have achieved both consistency and write scalability, with high robustness against replication delay.
00:19:55.020 Now we have successfully let writer databases focus on transactions.

00:20:02.700 However, what if the number of transactions exceeds the writer database's capacity?

00:20:11.160 The answer is sharding. Now let's see the scalable data locality for sharding.
00:20:25.919 General multi-tenant sharding splits the data by the owner tenant. There is no interaction between shards. This is the scalable architecture: we only have to add another shard to scale. Also, sharding is the very architecture that guarantees the best data locality in multi-tenant sharding.

00:20:59.880 How about a single-tenant multiplayer game?

00:21:05.820 Interaction between shards is the most critical feature required for a single-tenant multiplayer game: players in one large game world must interact with each other. Also, sharding does not guarantee the best data locality by itself. Furthermore, sharding can invoke consistency issues on node failures or network failures.

00:21:42.480 So we have to handle it.
00:21:47.940 The objective of data locality is to improve performance and consistency by locating data wisely. The basic strategy here is: first, consider cross-shard operation costs as expensive; second, decrease the accessed shards and additional computation per request; then balance the accumulated cost.

00:22:14.460 Let's conduct some case studies on the architecture.
00:22:21.299 The first scenario is: train myself and respond to a call for help. In this scenario, Player A and Player B train themselves; then Player C calls one of them for help. Here there is no need for the latest trained state. The requests consist of 95 percent everyday training requests and 5 percent occasional help requests.

00:22:56.100 The solution is a pull architecture. You can see the players and databases in the figure. We locate the data on the owner's shard, indexed by player ID. Then 95 percent of requests (the everyday writes) involve just one writer, and 5 percent (the reads) involve one writer and one reader. This architecture is performant on the majority write requests.
00:23:22.260 The next scenario is: send gifts, list gifts, and receive gifts. In this scenario, Player A and Player B send gifts to Player C; Player C lists them in chronological order. The requests consist of 30 percent occasional sending requests and 70 percent everyday listing or receiving requests. You can see the figures on your right-hand side.

00:23:57.419 The solution here is a push architecture. We locate the data on the receiver's shard, indexed by receiver player ID, at send time. By doing this, 30 percent of requests (the sending writes) involve two writers, and 70 percent (the receiver's reads and writes) involve just one writer. Although there is an additional cost on sending gifts, this architecture is performant on the majority read/write requests. Also, there is no merge-sort cost, thanks to the pre-indexed data.
00:24:44.520 The third scenario is: become friends and list friends. In this scenario, Player A, Player B, and Player C list friends; Player C becomes friends with (links with) Player B. The requests consist of 95 percent everyday listing requests and 5 percent occasional linking requests.

00:25:08.940 The solution here is a double-write architecture. We locate the data on both shards, indexed by player ID and by friend player ID. Then 5 percent of requests (the linking writes) involve at most two writers, and 95 percent (the listing reads) involve only one reader. This is performant on the majority read requests.
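A sketch of the double-write idea, written with the Rails 6.1-style horizontal sharding API (the talk's Rails 5.2 code base uses in-house equivalents; the `Friendship` model and `shard_for` helper are hypothetical):

```ruby
# Double-write architecture: the same friendship is written to both players'
# shards, so listing friends later touches only one reader, with no merge sort.
# Note: the two writes are not atomic across shards; that is exactly the
# consistency problem addressed in the next part of the talk.
def become_friends!(player_id, friend_id)
  [[player_id, friend_id], [friend_id, player_id]].each do |owner, friend|
    ApplicationRecord.connected_to(role: :writing, shard: shard_for(owner)) do
      Friendship.create!(player_id: owner, friend_player_id: friend)
    end
  end
end

# Listing needs only the owner's shard reader.
def friends_of(player_id)
  ApplicationRecord.connected_to(role: :reading, shard: shard_for(player_id)) do
    Friendship.where(player_id: player_id).pluck(:friend_player_id)
  end
end
```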
00:25:43.740 Let's look into a more complicated scenario now. The scenario is: join a guild and defeat a raid boss. Player A, Player B, and Player C join a guild. They fight together against a powerful enemy. They share a limited amount of consumable weapons. Each player's attack and the enemy's attack change the current situation. Both ACID and performance are critical for updating the players' health states and formation, the enemy's health state, and the guild's remaining weapons and state.

00:26:20.880 The solution is partial resharding. From the database's viewpoint, players belong to different shards while belonging to the same guild; both ACID and performance are required here. First, select another shard for the guild by weighted sampling; you can see the new shard with the sunglasses in the figure on your right-hand side. Then reference the new shard from each player's original shard. At the design phase, we extract the critical resources for the raid boss (for example, health state, formation, weapons, and so on) and locate the extracted resources on the guild's shard. This achieves ACID and performance, because the number of involved shards is only one for transactions. We can leave the non-critical resources on the original shards. In this way, there is no cross-shard overhead for guild-based transactions and APIs, and ACID is also guaranteed.
00:27:49.980 We have successfully improved data locality and achieved scalability.

00:27:59.520 This data locality technique benefits not only conventional databases but also some distributed databases, considering their under-the-hood architecture.

00:28:16.200 However, we have not yet fully solved a difficult issue for conventional databases. Now let's look at consistency for sharding, the last part of API design.
00:28:39.240 Keeping consistency over multiple databases is a difficult problem, especially back in the mid-2010s, when our game server was born.

00:28:53.279 Someone says, you know, two-phase commit can ensure atomicity over multiple databases, like XA transactions.
00:29:08.340 This is an overview of the two-phase commit, for comparative study. First of all, there are two kinds of participants here: one is the transaction manager, in this case the API server, and the other one is the resource manager, in this case the MySQL database.

00:29:29.340 The sequence is as follows. First, the transaction manager asks the resource managers to update records as usual, and the transaction manager asks the resource managers whether or not they are prepared to commit. Then the resource managers execute the transaction up to the point where they will be asked to commit. After that, the resource managers respond to the transaction manager with whether they can commit or not. If all the resource managers can commit, then the transaction manager tells all of the resource managers to commit. However, if any of the resource managers cannot commit, the transaction manager tells all of the resource managers to roll back.

00:30:21.659 This is simple, at least for nominal cases. You can see a more casual description of that sequence in the picture on your right-hand side: the transaction manager asks the resource managers whether they can commit or not. The transaction manager is asking for an explicit agreement here.
00:30:50.100 However, unfortunately, the MySQL documentation says that prior to MySQL 5.7.7, XA transactions were not compatible with replication at all. Also, two-phase commit, like XA transactions, has potential issues for scalability. Prioritizing scalability, we have to relax the atomicity while maintaining practically acceptable consistency.
00:31:25.380 Let's conduct another case study for this consistency. Player A consumes items in the knapsack; Player B receives the consumed amount as a gift. Pretty easy, isn't it?

00:31:44.940 This code is the resulting eventual consistency implemented in Rails 6. It seems difficult because of connected_to, but do not worry: we will explore this one by one.
00:31:58.799 Rule 1 is: reduce the risk of cross-shard transaction anomalies.

00:32:07.500 Before proposing this rule, we should check the preconditions for relaxed cross-shard transactions. First, all resource managers are composed of the same architecture, MySQL. Then, resource managers tend to fail for a continuous period of time; for example, a failure in cloud infrastructure can trigger this kind of failure. Also, resource managers are unlikely to fail right at a commit.
00:32:47.220 Let's look at the figures on your right-hand side. There are two MySQL databases in the middle. The top part of this figure is a conventional two-phase commit, and the bottom part of this figure is the proposed relaxed transaction. In the latter case, the transaction manager asks the resource managers to update the records, and the resource managers respond with OK. Then the transaction manager assumes that, yeah, they can probably commit, and the transaction manager executes the commit orders.

00:33:20.460 Let's take a look at the Rails code that achieves a relaxed transaction.
00:33:28.200 In the beginning, there are two consecutive nested transaction blocks. These nested transactions execute consecutive commits at the end of the transaction blocks. Right after the beginning of the transaction block, you can see `Player.lock.find(player.id)`. This code locks granular resources to avoid race conditions during the transactions, which further ensures the success rate of the transaction. The main business logic is omitted in this example. Finally, toward the end of the transaction block, there are consecutive save methods. This code is the critical component in the relaxed transaction: the Ruby code invokes consecutive UPDATE queries followed by consecutive COMMIT queries, which minimizes the time and uncertainty between the transaction manager and the resource managers.
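The slide's code is not reproduced in the transcript, but the shape it describes is roughly the following sketch, where `SenderRecord` and `ReceiverRecord` stand for hypothetical abstract classes each bound to one shard's connection:

```ruby
# Relaxed cross-shard transaction: two nested transaction blocks whose COMMITs
# fire back to back, a row lock taken early, and saves grouped at the end.
SenderRecord.transaction do
  player = Player.lock.find(player_id)  # lock a granular resource early to
                                        # avoid races and raise the success rate
  ReceiverRecord.transaction do
    # ... main business logic omitted ...
    receiver_gift.save!   # UPDATE on the receiver's shard
    sender_history.save!  # UPDATE on the sender's shard
  end                     # COMMIT on the receiver's shard...
end                       # ...immediately followed by COMMIT on the sender's shard
```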
00:34:34.859 The typically expected but very unlikely anomalies are as follows: first, transaction manager (Rails) failures between the consecutive commits; second, network failures between the consecutive commits; and third, sender shard node failure before the last commit.

00:35:00.240 Here we have achieved an XA-competitive success rate for nominal cases, except for the difference from XA PREPARE to UPDATE and the difference from XA COMMIT to COMMIT. Also, there are no blocking factors on node or network failures. Although we do not have a persistent log on the resource managers, we have gained a reasonable success rate for nominal cases.
00:35:33.720 Rule 2 is to keep inconsistency observable and acceptable.

00:35:38.880 Before proposing this rule, we have to know that there are three states for cross-shard transactions. One is that none of the shards commit; this is the failure state. The second one is that some, but not all, of the shards commit; this is the anomaly state. The last one is that all of the shards commit; this is the success state. In order to recover from the anomaly state, the general two-phase commit forces all the remaining shards to commit, so we do the same thing.
00:36:23.160 Firstly, we serialize and control the order of commits; this reduces the anomaly states to N-1 for N resource managers. Secondly, we make the inconsistency observable from the transaction manager (in this scenario, the sender); this is the trigger for retry and recovery. After that, we make the inconsistency acceptable from the others, or from the next transaction: we let the inconsistency behave as if the transactions were all successful. Also, the next action should not affect the recovery, or vice versa. This practice leads to a non-blocking characteristic.
00:37:09.540 Now let's see the Rails code on your right-hand side. In the beginning, you can see that the outer transaction block is the sender's shard, so the anomaly case is that the receiver reached commit and the sender reached rollback. Let's imagine what will happen in this anomaly. At the bottom of this transaction, first, the sender history and the sender summary would not be saved; in this case, the absence of the history is observable, and the sender can retry based on that. Also, the receiver gift would be saved in the anomaly case, and the receiver can use the receiver gift regardless of the sender's transaction.

00:37:55.500 By this observable and acceptable inconsistency, we have gained a non-blocking starting point for retry or recovery.
00:38:07.200 Then here comes rule 3: keep idempotency and reach eventual consistency.

00:38:16.140 The proposed solution utilizes retry, because retry is the simplest yet reliable method; actually, it works even for network failures outside the server systems. In order to utilize retry, we have to use an idempotency key to identify the resources. The identifier is often called a transaction ID, idempotency key, or request ID. We build the business logic assuming the inconsistency: a history-like table and find_or_initialize_by are a convenient way to handle this inconsistency.
00:39:02.700 Let's look at the actual Rails code. This code is the main business logic of the cross-shard transaction. In the middle, you can see the early return on `sender_history.persisted?`. This code ensures idempotency for retried writes: if the sender history is persisted, it means that all the transactions successfully reached commit, and no further update is required. At the bottom, you can see `Gift.find_or_initialize_by`. In the anomaly case, the gift already exists, so find_or_initialize_by mitigates this inconsistency efficiently.
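Pieced together, the idempotent retry path might look like this sketch (model and column names are hypothetical):

```ruby
# Idempotent cross-shard gift transfer keyed by a transaction ID.
def transfer_gift!(transaction_id:, receiver:, amount:)
  sender_history = SenderHistory.find_or_initialize_by(transaction_id: transaction_id)
  return if sender_history.persisted? # every shard already reached COMMIT

  # In the anomaly case the receiver's row already exists; reusing it
  # prevents granting the gift twice on retry.
  receiver_gift = Gift.find_or_initialize_by(transaction_id: transaction_id)
  receiver_gift.assign_attributes(player_id: receiver.id, amount: amount)

  # ... run the relaxed transaction from rule 1: save receiver_gift on the
  # receiver's shard, then sender_history last on the sender's shard ...
end
```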
00:39:48.660 By introducing idempotency and eventual consistency, we not only overcome database anomalies but also failures outside the server systems. Both eventual consistency and scalability are achieved. I think this is sufficiently practical. Are you not satisfied? Do you require a more robust solution? Then rule 4 meets the requirement.
00:40:22.020 Rule 4 is: utilize an at-least-once message queue for strict completion. The precondition for this rule is that the business logic already complies with rules 1, 2, and 3. We divide the architecture into two parts: the first part is the enqueue phase, and the other one is the dequeue phase. In the enqueue phase, we save data to a single shard database; ACID is guaranteed here. Then we enqueue a message referencing the data within the transaction; this guarantees at-least-once message existence. Then we make the message dequeued only after the transaction is finished: force timeouts both on Rails and on the databases, and use perform_later for a delayed enqueue.
00:41:19.859 Let's see the Rails code on your right-hand side. In the middle of the code, you can see `wait: 30.seconds` and `perform_later` within the transaction. Given the MySQL and Rails timeouts, it is guaranteed that the transaction will have reached either commit or rollback by the time the Active Job is performed.

00:41:43.380 In the dequeue phase, just retrieve the messages from the queue and check whether the data in the database exists. If it does not exist, the transaction reached rollback: do nothing. If it exists, the transaction reached commit: continue the process. Then conduct the idempotent business logic, and be sure to finish the message only after the business logic succeeds. This guarantees at-least-once job execution.
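A sketch of both phases follows; the job and helper names are hypothetical, and `wait: 30.seconds` is assumed to exceed the database and Rails timeouts:

```ruby
# Enqueue phase: the message is scheduled inside the single-shard transaction.
SenderRecord.transaction do
  sender_history.save!
  GiftCompletionJob.set(wait: 30.seconds).perform_later(transaction_id)
end

# Dequeue phase: by run time, the transaction has reached COMMIT or ROLLBACK.
class GiftCompletionJob < ApplicationJob
  def perform(transaction_id)
    # absent data means the transaction rolled back: do nothing
    return unless SenderHistory.exists?(transaction_id: transaction_id)

    # present data means it committed: run the idempotent business logic
    # (rule 3); the job finishes only if this succeeds, so execution is
    # retried until completion (at-least-once).
    complete_gift_delivery!(transaction_id)
  end
end
```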
00:42:16.099 Also, do not forget to consider the risk of a Ruby process crash.

00:42:22.800 By this architecture, we have achieved a quasi-theoretically guaranteed completion of cross-shard transactions.

00:42:36.000 Thus we have acquired practically acceptable consistency without losing scalability. This is essentially a concern of databases, so we should bear in mind that this should be more elegantly solved by sophisticated databases.

00:42:52.500 So, is that all we have to do for write scalability? There's more.
00:43:06.780 Now we are going to look at the pitfalls around multiple databases.

00:43:17.819 The objective of sharding is horizontal write scalability of databases. Specifically speaking, given N as the number of shards, keep the time and space complexity to O(1), or at least O(log N). It sounds simple and trivial, and we have actually achieved that by sharding. However, there are more pitfalls around us.

00:43:44.099 The first pitfall is the number of connections per database. The connection pool of Rails caches and reuses established database connections for performance.
00:43:57.900 There is a limitation here, because the cached pool cannot be shared across processes or across instances. Also, due to Ruby's GVL, only one native thread is runnable per process. These two limitations lead to a number of processes larger than the number of CPU cores. And, generally speaking, there are many API servers facing the many shards.

00:44:34.560 So let's do the math here. The number of servers is 1,000, the CPU cores are 32, and the pool connections per process are 5; then we can yield 160,000 to 480,000 connections per database. This number actually exceeds the maximum connections of MySQL and PostgreSQL.
00:45:00.900 And we can see deteriorated database performance due to many connections consuming memory and CPU. Actually, the typical default of cloud computing providers is around hundreds or thousands, so we have to target this value.

00:45:18.420 The strategy for reducing the number is like this: first, assume an average of one actually used database connection per request, and then release the connections so that the average held per process approaches that value.
00:45:37.140 Solution one is to multiplex connections at proxies between the databases and the API servers. PgBouncer for PostgreSQL is famous for this solution. The pro of this solution is that there is no connect overhead at the databases. However, there are cons: as for transaction pooling, some database features are disabled; and regarding session pooling, clearing connections is also required.

00:46:08.040 Solution two is to clear the connection pool after each request. Active Record's connection clearing is well known for this solution. The pros of this solution are that no additional components are required and full database features are available. Although this has a connect overhead at the databases, it is not so problematic for MySQL, and we have achieved a low connection number with solution two.
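Solution two could be sketched as a small Rack middleware (the exact production implementation differs; with a single-threaded Unicorn worker this per-request disconnect is safe):

```ruby
# Disconnect pooled connections after each request so each database sees
# close to zero idle connections per process between requests.
class ClearConnectionsMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  ensure
    ActiveRecord::Base.connection_handler.connection_pool_list.each(&:disconnect!)
  end
end

# config/application.rb: config.middleware.use ClearConnectionsMiddleware
```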
00:46:38.700 The next pitfall is the connection reaper thread per API server. A connection reaper thread is a thread that periodically flushes idle database connections; this was enabled by default in Rails 5. In the original design, a reaper thread was created per database spec.

00:47:04.680 So let's do the math here again. The number of Unicorn processes is 100, the number of shards is 300, and the number of writers and readers is 2; then we can yield 60,000 as the total number of reaper threads here. This actually exceeds the default kernel parameters, and we can also see deteriorated performance due to the many threads.

00:47:37.980 Actually, in Rails 6 this is already fixed, and if you are using Rails under 6, you can set the reaping frequency to nil and disable the thread.
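On Rails 5 that amounts to setting `reaping_frequency` to nil in each shard's configuration, sketched here in Ruby (in config/database.yml, the equivalent is `reaping_frequency: ~` per entry):

```ruby
# Hypothetical sketch: disabling the per-pool reaper thread on Rails 5.
ActiveRecord::Base.establish_connection(
  adapter:  "mysql2",
  database: "shard_001",
  pool: 5,
  reaping_frequency: nil # nil disables the connection reaper thread
)
```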
00:47:53.940 The summary of pitfalls is like this: we have learned that even elaborately crafted code fails before a huge number of databases, and there might be other pitfalls behind you.

00:48:10.619 So how do we avoid pitfalls? There are too many things to consider.

00:48:19.380 Then here comes the test: test with multiple databases.

00:48:28.140 If we haven't tested it, assume it's broken. We should test multiple databases.
00:48:39.060 The recommendation for using real multiple databases in tests is: first, use real multiple database configurations in tests; avoid stubs and use real databases and middlewares; separate read and write connections; and have more than one shard for horizontal sharding. We love FactoryBot over fixtures, because it can test the dynamic shard allocation and access.

00:49:10.260 We also recommend using real transactions in tests, by setting the config, and we also simulate partial transaction failures. This practice can detect wrong understanding of transactional behavior, wrong choice of reader/writer connections, misallocation of and wrong access to shards, and wrong handling of recovery and retry of partially failed transactions.
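For instance, a partial-failure simulation could be sketched like this in RSpec with FactoryBot; the `GiftTransfer` service and models are hypothetical, and failure is injected at the sender shard's COMMIT, matching the rule 2 anomaly:

```ruby
RSpec.describe GiftTransfer do
  # Assumes transactional tests are disabled so real COMMITs/ROLLBACKs happen:
  #   RSpec.configure { |config| config.use_transactional_fixtures = false }
  let(:sender)   { create(:player) }
  let(:receiver) { create(:player) }

  it "keeps the anomaly observable when the sender shard fails to commit" do
    # Inject a failure at the sender-side COMMIT, after the receiver commits.
    allow(SenderRecord.connection)
      .to receive(:commit_db_transaction)
      .and_raise(ActiveRecord::StatementInvalid.new("injected commit failure"))

    expect {
      GiftTransfer.new(transaction_id: "tx-1", sender: sender,
                       receiver: receiver, amount: 3).call
    }.to raise_error(ActiveRecord::StatementInvalid)

    # Receiver committed, sender rolled back: the absent history triggers retry.
    expect(Gift.find_by(transaction_id: "tx-1")).to be_present
    expect(SenderHistory.find_by(transaction_id: "tx-1")).to be_nil
  end
end
```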
00:49:41.940 Also, we injected anomalies in the sandbox environment. The purpose of the sandbox environment is to develop the client system and conduct QA on the integration of client, server, and other components; the anomalies should be tested there. In our case, we always had a 10-second replication delay, and we implemented a debugger to inject exceptions for important and complicated APIs. This practice can detect server bugs in accessing readers, client bugs in relying on the write API for the related data, and many architecture bugs in the design of read/write APIs and of idempotency and eventual consistency.
00:50:34.800 We also highly recommend conducting load tests before the service launch, because this is the most unpredictable part of multiple databases. For example, does the app work with 10 shards, 100 shards, or 1,000 shards? Actually, it is required to provision servers for the maximum load on production. These practices can detect unevenly distributed load across partitions, time and space complexity of O(N) or greater, and other issues that appear only at real scale.
00:51:17.160 Now we can relax ourselves, in addition to relaxing multiple databases.
00:51:24.839 Lastly: the past, present, and future of multiple databases at our game server.

00:51:31.319 In the past, our project started with Rails 4.2, the latest version at that time, and there was no native multiple database support back then. So we extended Rails and various gems and made many monkey patches: for example, an in-house gem was created and existing gems were extended. We prioritized performance over simplicity, and we prioritized verbose syntax over implicit computation. There we implemented various features, and this was successful in supporting games.
00:52:21.000 In the present, there are two different Rails versions: ours is 5.2 plus extended gems plus monkey patches, and upstream Rails 6+ has native multiple database support. The feature discrepancy is shrinking; for example, common pitfalls are being fixed in upstream Rails, and useful features are being implemented in upstream Rails. However, there is a growing code discrepancy, because there is different code to achieve similar things, and there is a more sophisticated design in upstream Rails.

00:53:00.900 So which way are we going?
00:53:09.000 Of course, on the Rails. We are considering upgrading to upstream with its simplified multiple databases; I think this is sustainable, continuous development toward the future. Also, we are considering migrating and contributing our technical expertise to Rails. This not only benefits Rails but also benefits us, and we can turn the technical debt into technical assets.

00:53:42.720 We'll be back on Rails.
00:53:51.059 So let's sum up today's session. We have explored various techniques to scale Rails API to write-heavy traffic. First of all, we utilize all aspects of multiple databases: vertical splitting, horizontal sharding, replication, and load balancing are all necessary. Then we let writer databases focus on transactions: replication can offload the read traffic, and resource-focused API response design can minimize unnecessary data access to the writer. Next, we optimize data locality to reduce the cross-shard performance overhead; we have explored the pull, push, double-write, and partial resharding architectures. After that, we introduced practically acceptable and scalable consistency; it consists of risk minimization, observable and acceptable inconsistency, and eventual consistency. We also learned some pitfalls and risks, even for elaborately crafted code. Therefore, we recommend testing with multiple databases; this can ensure all the techniques here are working as intended. By using real databases, real anomalies, and real scale, we can ensure everything is working well. Lastly, I believe it is crucial to contribute to Rails upstream, which turns technical debt into technical assets.

00:55:35.760 Thank you for listening. DeNA loves Rails.