Configuration Management Patterns

by Beau Harrington

In this engaging presentation titled "Configuration Management Patterns," Beau Harrington discusses the complexity of managing configurations in modern Rails applications as they scale. The session begins with an acknowledgment of the common perception of configuration management, often associated with tools like Chef and Puppet, and sets the stage for a broader discussion on patterns for effective management of configurations within growing systems.

Key points covered in the talk include:

  • Defining Configuration: Harrington provides a broad definition of configuration, encompassing all values that might need sharing or changing across applications, including settings like hostnames, user credentials, feature flags, and translations.

  • Case Study - Kingdoms of Camelot: He illustrates the concept using his experience working on the mobile game "Kingdoms of Camelot," highlighting the need to manage over 25,000 configuration values, which cannot be efficiently handled through static YAML files alone.

  • Configuring for Context: The talk emphasizes the need for configuration to adapt based on context, including different environments, server roles, and regions. Harrington introduces composite configurations, which allow for hierarchical settings tailored to specific contexts.

  • Decoupling Configurations from Source Code: Harrington suggests moving configurations out of source control and incorporating them into a build process, effectively treating configuration files like build artifacts which can be versioned and accessed independently.

  • Dynamic Configuration Updates: A significant part of the presentation focuses on techniques for dynamically updating configuration without requiring deployments or service restarts. He discusses using background threads for polling configuration changes.

  • Empowering Non-Engineers: The approach not only streamlines configuration management for engineers but also empowers non-developers, such as game designers, to make configuration changes directly, thereby reducing dependency on engineering resources.

  • Testing and Change Control: The importance of rigorous testing — both technical QA and play-testing QA — is emphasized before deploying configuration changes, as even minor alterations can significantly affect user experience in gaming contexts.

  • Open Source References: The presentation ends with references to useful tools and practices in the industry, including Netflix's architectural patterns and tools for configuration management, indicating that the principles discussed are part of a larger movement towards more robust and flexible configuration management strategies.

The main conclusion of the talk is that proper configuration management is crucial for maintaining control and flexibility as applications grow and evolve. By employing the discussed patterns, organizations can streamline their processes, avoid configuration bottlenecks, and enhance overall efficiency.

As your simple Rails app grows into a larger system or set of systems, using simple constants and YAML files for configuration may no longer suffice. The meaning of 'configuration' expands to include business logic alongside the customary hostnames and timeout intervals; the rate at which configuration changes are required increases; non-engineers begin to require the ability to make configuration changes themselves; different environments require different configurations. This presentation will examine several patterns that can be applied to handle these issues, keeping iteration speed high and reducing the burden on your engineering team. We'll create and iterate on a simple game as a case study to illustrate the value of these principles in practice, and also look at a few open source projects that integrate some of these concepts.
Topics:
* moving configuration values out of source
* sharing configuration across multiple applications/services
* working with sensitive configuration data (eg API keys)
* dynamically updating configuration without deployments or restarts
* cascading/overlaying configuration values based on environment and context
* running experiments and A/B tests
* change control
* testing and multi-stage deployment of configuration changesets
* allowing non-developers to change configuration values

Rails Conf 2013

00:00:12.259 thank you
00:00:15.800 hello
00:00:17.460 what's up
00:00:18.779 let's see
00:00:20.640 Wednesday afternoon day three of
00:00:23.220 railsconf everyone's a little sleepy so
00:00:26.100 I'm gonna wake you all up with
00:00:28.980 the most insane mind-blowing topic I
00:00:31.380 could ever conceive
00:00:34.260 I mean when I get neck deep in XML and
00:00:37.200 my favorite Turing complete YAML files
00:00:39.379 uh I just go crazy so hence 40 minutes
00:00:43.200 just for you
00:00:44.340 uh so I'm Beau Harrington
00:00:46.500 um there's my amazing GitHub profile
00:00:48.420 which I think has one repo on it and
00:00:49.800 then my Twitter where I make jokes about
00:00:51.600 things that have nothing to do with
00:00:52.620 software
00:00:54.239 uh so
00:00:56.460 first things first I noticed about three
00:00:59.820 minutes after I submitted my talk that the
00:01:04.019 magical two words configuration
00:01:05.400 management to a lot of people especially
00:01:07.200 our devops crowd means Chef and Puppet
00:01:09.780 and this is not what this is so here's
00:01:12.060 your chance to escape
00:01:14.100 all right uh next I'm going to steal a
00:01:17.580 phrase I heard this from Chris Kelly I
00:01:20.220 think on Monday and I probably badly
00:01:22.320 paraphrased it but this is intended to
00:01:24.960 be just a quick uh
00:01:28.560 okay anyway this is a conversation about
00:01:31.979 ideas uh it's not a strict prescription
00:01:34.560 as to what you have to do uh just going
00:01:38.100 over some things that worked for us some
00:01:39.420 things that we haven't actually put into
00:01:41.119 production on all of our games yet but
00:01:44.159 have still shown promise or things that
00:01:46.920 are just been cool ideas for personal
00:01:49.079 projects
00:01:51.420 so
00:01:52.860 going to fiddle with a cable or
00:01:54.899 something and then
00:01:56.340 in the meantime uh
00:02:00.240 uh I'm just gonna go on yeah okay cool
00:02:02.640 uh so I'm chief software architect at Kabam
00:02:05.460 we're a company that makes mobile and
00:02:06.659 web games that includes the number one
00:02:08.220 top grossing iOS app of 2012 Kingdoms of
00:02:11.220 Camelot: Battle for the North of course
00:02:13.319 we are hiring and that's all about that
00:02:16.800 all right so what is configuration you
00:02:19.920 guys missed the best slides uh
00:02:24.420 well you can probably start off with
00:02:27.480 some of the contents of your config
00:02:29.580 directory in your rails app we've got
00:02:32.000 database.yml here this looks like
00:02:33.660 configuration right we got host names we
00:02:36.360 got usernames and passwords timeouts
00:02:39.780 those are my favorite
00:02:42.060 then you know some kind of newer hotness
00:02:44.459 feature Flags we want to be able to turn
00:02:47.400 features on and off with just a simple
00:02:48.959 deploy or otherwise so we can flip the
00:02:51.959 search box or append the request time to
00:02:54.360 request very simply and easily yeah it
00:02:56.879 looks like configuration right
00:02:59.459 how about translations
00:03:01.319 those live in config I don't really know
00:03:03.060 why but they're there
00:03:05.300 you know it's so frequent that we have
00:03:07.920 the people that are going to be doing
00:03:09.180 internationalization and translation for
00:03:11.099 us on bigger teams and bigger apps
00:03:13.319 they're all software Engineers so this
00:03:15.360 is cake to them so yeah we've got all
00:03:18.180 right we're going to translate some
00:03:19.800 strings that's great
00:03:21.360 I guess it's configuration
00:03:25.680 I hate this so much
00:03:30.060 uh for some reason we're putting view
00:03:32.879 logic into our models but you know what
00:03:35.640 I'll take that, it's configuration
00:03:37.379 that'll work
00:03:38.700 how about if you're playing uh a
00:03:40.799 game, you know, a free-to-play game or a
00:03:42.480 game for mobile or web like that we always
00:03:44.459 have quests right I want a reward when
00:03:46.920 I finish one of those quests
00:03:48.780 50 coins all right more configuration
00:03:53.580 hmm okay now I'm in my production.rb
00:03:57.239 file
00:03:58.799 now we got some assignments going here
00:04:01.440 we got a second level access of
00:04:04.140 variables might even see a block or two
00:04:07.459 okay so configuration is basically and
00:04:11.580 very annoyingly everything you want it
00:04:14.220 to be and for our purposes we're going
00:04:16.260 to say today that it's values that you
00:04:18.299 may want to share or to change and I'm
00:04:21.419 again intentionally keeping the
00:04:23.100 definition of it super broad because
00:04:25.400 uh you know you've heard the phrase you
00:04:27.660 know code is configuration or
00:04:28.919 configuration is code and the line is
00:04:32.160 extremely blurry so if it doesn't have a
00:04:35.240 precise definition that's okay
00:04:38.220 so why do we care or I guess why do I care
00:04:40.740 about configuration so much so I lied I
00:04:43.979 am going to go back to my corporate life
00:04:45.180 for a second so Kingdoms of Camelot: Battle
00:04:47.400 for the North uh it's a real-time, um,
00:04:51.120 it's a strategy game with a shared world
00:04:52.979 uh it's
00:04:55.080 you can have up to I think 70,000
00:04:56.820 players in a single world and they're
00:04:58.440 all fighting over the exact same
00:04:59.520 resources they join alliances they fight
00:05:01.860 each other they kill each other it's
00:05:03.360 quite exciting
00:05:04.740 and so part of that is you have a
00:05:07.919 kingdom and your kingdom has
00:05:10.320 lots of little buildings there's 15
00:05:12.120 different kinds of buildings in fact and
00:05:14.520 each of those buildings can get leveled
00:05:15.840 up gradually over time we have 10 plus
00:05:18.000 levels of each building and then each of
00:05:20.400 those levels has eight different
00:05:21.840 requirements including different kinds
00:05:23.520 of resources a certain amount of time
00:05:25.440 that needs to pass and a prerequisite
00:05:27.720 for uh buildings that have already been
00:05:30.479 built so just for buildings that's 1200
00:05:33.240 plus values that we have to deal with
00:05:34.860 that's 1200 integers and a couple
00:05:37.740 strings
00:05:39.180 but mostly integers
00:05:41.220 and that's just buildings look at all
00:05:42.960 this other crap too
00:05:44.520 research, knights, battle tuning, NPCs, items,
00:05:47.520 quests, which gives us a grand sum
00:05:50.039 total of over 25,000 values that we have to
00:05:52.259 deal with for one game
00:05:54.419 I'm not particularly relishing copying
00:05:57.840 and pasting stuff into a yaml file for
00:06:00.539 somebody out of an email or worse yet
00:06:03.300 out of a Word attachment inside an email
00:06:07.560 so let's get serious uh
00:06:10.740 when we're going to put out new features
00:06:13.380 or even configuration changes for our
00:06:15.300 games we have two levels of QA that we
00:06:17.820 need to do Beyond just our normal
00:06:19.380 automated testing we have technical QA
00:06:21.840 but we also have play testing QA
00:06:23.280 everything can be working perfectly from
00:06:25.199 a technical standpoint but from a play
00:06:27.419 testing standpoint if we release
00:06:28.979 something that nerfs a very valuable
00:06:30.720 item or makes something cheap very
00:06:32.960 overpowered that's just as bad if not
00:06:36.419 worse than a failed deploy on a
00:06:39.060 technical basis because of the nature of
00:06:41.759 our games they're free to play which
00:06:42.900 means you can pay zero dollars or you can
00:06:44.580 pay more than zero dollars for the items
00:06:47.039 that you have in your game so if we mess
00:06:48.479 up the configuration values of those we
00:06:50.819 are basically nuking people's
00:06:52.620 Investments and we don't want to do that
00:06:54.419 so we want to have a process around
00:06:57.300 deploying new configuration that's just
00:06:59.880 as rigorous as when we're deploying code
00:07:02.759 so now at this point an agenda
00:07:06.720 so we're going to talk about all right
00:07:09.000 we have a context right we have an
00:07:10.979 environment we have data centers things
00:07:13.380 like that where we're going to be
00:07:15.539 deploying that's really relevant to how
00:07:17.520 we're going to configure our
00:07:18.300 applications we also need to think about
00:07:20.280 how we even generate those
00:07:21.419 configurations in the first place
00:07:23.400 and then how we're going to push them
00:07:25.500 out onto our servers finally
00:07:27.960 what is all this going to do for us what
00:07:29.520 is it going to enable us to do and what
00:07:31.500 are some other things that we can
00:07:32.520 explore other software packages that we
00:07:34.680 can look at
00:07:35.940 so
00:07:37.860 let's talk context
00:07:41.520 let's actually just make our own game
00:07:43.020 instead right now
00:07:44.759 so
00:07:45.840 it's a really cool spec it's a really
00:07:47.580 cool idea I think it's going to make
00:07:48.539 millions
00:07:49.500 you're going to click this puppy
00:07:52.740 and you will get 10 gold coins
00:07:55.919 and sometimes you will get a prize as
00:07:57.900 well you'll get either a yoyo or a piece
00:08:00.360 of candy
00:08:01.740 so let's take our first stab at this
00:08:03.240 we've got the puppy click class yes
00:08:06.599 so we've got three beautiful constants
00:08:08.880 there: coins per click, odds of winning a prize,
00:08:11.759 and an array of prizes
00:08:14.520 I feel like we've done pretty good so
00:08:15.720 far we've avoided any kind of magic
00:08:17.460 number it's not terrible right
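
The slide for this first pass is not part of the transcript. A minimal sketch of what such a class might look like, assuming the class name `PuppyClick`, the constant names, and the injectable random number generator (all hypothetical; only the 10 coins, the 0.1 odds, and the two prizes come from the talk):

```ruby
# Hypothetical first pass: every tunable value is a named constant in the class.
class PuppyClick
  COINS_PER_CLICK       = 10
  ODDS_OF_WINNING_PRIZE = 0.1 # 10 percent chance per click
  PRIZES                = %w[yoyo candy].freeze

  # The RNG is injectable so tests can pass a seeded Random.
  def initialize(rng: Random.new)
    @rng = rng
  end

  # Returns the coins earned and the prize won (nil when no prize is won).
  def click
    prize = @rng.rand < ODDS_OF_WINNING_PRIZE ? PRIZES.sample(random: @rng) : nil
    { coins: COINS_PER_CLICK, prize: prize }
  end
end
```

There are no magic numbers, but changing any value still means a commit and a deploy.
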
00:08:21.060 but at the very least we should get the damn
00:08:22.500 things out of source if I was happy with
00:08:24.419 this I wouldn't be yammering on about
00:08:26.460 this for 40 minutes
00:08:27.840 so let's pull them out and do a simple
00:08:30.599 config file I'll call it game.yaml in
00:08:33.180 config let me just go ahead and tell you
00:08:35.399 right now yes all the examples in here
00:08:37.080 are in YAML for the sole reason that
00:08:38.760 YAML takes up less space on this dear
00:08:41.219 screen here
00:08:42.320 no particular affinity to it
00:08:45.360 so we've got our little game hash here
00:08:47.640 and then we've stored everything as part
00:08:49.740 of that yaml data structure awesome now
00:08:52.440 everything's out of source and it's in a
00:08:54.060 single place in config that's really
00:08:55.920 good because it's kind of it's natural
00:08:58.260 that we're going to start looking at
00:09:00.600 the config directory first when we're
00:09:02.220 looking at configuration
00:09:05.700 so what about when we need to actually
00:09:07.740 care about the Rails env I mean this
00:09:09.779 seems pretty basic also it's a basic
00:09:11.459 pattern you've got things split by
00:09:13.560 development, test, production
00:09:16.140 it's kind of gross though because we're
00:09:17.459 repeating ourselves a little bit
00:09:19.560 um
00:09:20.339 so when I'm in development I want to
00:09:22.740 make sure that I win a prize every time
00:09:24.060 because I'm super impatient my time is
00:09:26.100 super valuable so I've set the odds of
00:09:28.800 winning a prize from 0.1, or 10 percent, to 1, or 100
00:09:32.880 percent
00:09:34.019 and then all the things that are the
00:09:35.220 same I've had to replicate for
00:09:36.300 development and then test and then in
00:09:38.580 production as well if the screen were
00:09:41.220 bigger
00:09:42.959 so let's move on a bit to composite
00:09:45.180 configurations uh I think these have
00:09:47.760 like seven or eight different names
00:09:48.899 composite configurations cascading
00:09:50.760 configurations the same kind of concept
00:09:52.800 in CSS where we'll have a core set of
00:09:56.640 configuration and then we will apply
00:09:59.220 Transformations on top of it based on a
00:10:01.200 weighted set of
00:10:03.000 uh factors and so the first one and
00:10:06.360 pretty much the only one that gets used
00:10:07.800 usually in this kind of configuration is
00:10:09.660 the Rails environment so we've got all
00:10:11.820 the core values and then anything that
00:10:13.740 we're overriding we're just going to
00:10:15.899 change we're just going to State those
00:10:17.279 values in the sections for those
00:10:19.440 particular environments so we've got
00:10:21.420 odds of winning a prize set to one for
00:10:23.040 development and then nothing else; for
00:10:24.660 test we want to make sure that we're
00:10:25.980 testing the screen properly so we change
00:10:27.720 the image from a piece of candy to a
00:10:30.360 screen test
00:10:31.560 and so yeah we're DRY we're not
00:10:34.200 repeating ourselves that's great
00:10:36.300 so
00:10:38.100 yeah there you go there's an example of
00:10:40.140 all the wonderful goodness that you can
00:10:41.940 have there
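
The composite file described above can be sketched roughly as follows. This is an illustration, not the slide from the talk: the key names, the image file names, and the `config_for` helper are assumptions.

```ruby
require "yaml"

# Hypothetical game.yaml: a core section plus per-environment overrides.
GAME_YAML = <<~YAML
  core:
    coins_per_click: 10
    odds_of_winning_prize: 0.1
    prizes:
      - name: yoyo
        image: yoyo.png
      - name: candy
        image: candy.png
  development:
    odds_of_winning_prize: 1
  test:
    prizes:
      - name: yoyo
        image: yoyo.png
      - name: candy
        image: screen_test.png
  production: {}
YAML

# Start from the core values and overlay whatever the environment overrides.
# This is a shallow merge: an overridden key (such as prizes) is replaced whole.
def config_for(env, source = GAME_YAML)
  sections = YAML.safe_load(source)
  sections.fetch("core").merge(sections.fetch(env, {}) || {})
end
```

With this layout, `config_for("development")` keeps the core `coins_per_click` and only changes the odds, so shared values are stated once.
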
00:10:43.019 but why stop at Rails envs you can do a
00:10:46.200 lot more with being able to infer
00:10:47.820 information about the context of your
00:10:49.560 deployment
00:10:50.760 and from there make bigger changes I
00:10:54.420 know a lot of people are on AWS and if
00:10:56.640 you're doing a multi-data center deploy
00:10:58.320 for example or multi-region deploy
00:11:00.540 you'll need to change the server names
00:11:02.399 that you're using for each of those data
00:11:03.899 centers
00:11:04.800 if you have
00:11:07.500 certain machines that are more powerful
00:11:09.120 than others you might want to change the
00:11:10.440 settings if you're in a heterogeneous
00:11:11.940 environment you might want to assign
00:11:13.740 more specific roles to machines as well so
00:11:16.380 how do we do that
00:11:18.060 well we can just pull uh different yaml
00:11:21.420 files
00:11:22.320 and then so we've got the core and then
00:11:24.480 we've got the environment so production
00:11:26.100 or what have you we have the region so
00:11:28.680 we'll say us-east-1 God help you then
00:11:31.680 we've got uh
00:11:33.240 hosts, we will take the hostname of
00:11:35.820 the machine as well so we'll go through
00:11:38.100 and we'll load the contents of each of
00:11:39.839 those yaml files and we'll do a deep
00:11:42.120 merge on what starts out as an empty
00:11:44.700 hash and then we keep merging in the
00:11:48.300 contents of each of those so you can see
00:11:50.160 that the further down the file list you
00:11:52.380 go the more likely they are to you know win
00:11:55.560 so to speak they'll be the ones that
00:11:57.360 have the highest priority so anything
00:11:58.680 that gets set in a host specific file
00:12:00.779 will always win
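
A minimal sketch of that layered load, assuming one YAML file per layer in a single directory (the `.yml` naming and both helper names are hypothetical). Plain Ruby has no built-in deep merge, so a small recursive one is defined here; in a Rails app, ActiveSupport's `Hash#deep_merge` would do the same job.

```ruby
require "yaml"

# Recursively merge overlay into base; on conflict the overlay value wins.
def deep_merge(base, overlay)
  base.merge(overlay) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Load each layer in order, starting from an empty hash. Later layers have
# higher priority; layers without a file are skipped.
def load_composite(dir, layers)
  layers.reduce({}) do |config, layer|
    path = File.join(dir, "#{layer}.yml")
    next config unless File.exist?(path)

    deep_merge(config, YAML.safe_load(File.read(path)) || {})
  end
end
```

A call might pass layers such as `["core", "production", "us-east-1", Socket.gethostname]` (with `require "socket"`), so that anything set in the host-specific file wins.
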
00:12:03.480 we can go on with this concept we can
00:12:06.660 basically have tags in which you can
00:12:09.000 pass a tag through an environment
00:12:10.800 variable and then load up a different
00:12:13.079 configuration this is useful for if you
00:12:15.360 need to do kind of transient quick loads
00:12:17.399 of different sets of values if you're
00:12:19.560 having a crisis or something you need to
00:12:21.300 pull up specific debugging values or you
00:12:23.700 want to canary some new code
00:12:25.500 you can also assign role-specific
00:12:27.060 configurations this way and again the
00:12:29.399 code is pretty simple just pull in the
00:12:31.320 list of tags split them by comma and
00:12:33.360 then we load in each of the relevant
00:12:35.640 yaml files and that's it
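
The tag handling can be sketched in a few lines. The environment variable name `CONFIG_TAGS` and the `tags/` directory are assumptions made for illustration; the talk only says that tags arrive through an environment variable and are split by comma.

```ruby
# Turn a comma-separated tag list into the extra YAML files to overlay,
# in the order given. Blank entries and surrounding whitespace are ignored.
def tag_layers(raw = ENV["CONFIG_TAGS"])
  raw.to_s.split(",").map(&:strip).reject(&:empty?).map { |tag| "tags/#{tag}.yml" }
end
```

Starting a process with `CONFIG_TAGS=debug,canary` would then add `tags/debug.yml` and `tags/canary.yml` to the end of the layer list, where they take priority over everything loaded before them.
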
00:12:37.920 so there's already a gem that
00:12:39.540 encapsulates a bunch of this stuff
00:12:42.120 uh rails_config I'm sure there's more
00:12:44.100 there's a whole lot it seems like the
00:12:46.139 killer feature for all of them is being
00:12:47.760 able to have dot notation instead of
00:12:49.680 having to do a hash based access to
00:12:53.220 I guess that's important so the rails_config
00:12:56.160 gem will also do hot reloads of
00:12:58.800 settings files for you so if you do
00:13:00.860 Settings.reload! it will suck in all the
00:13:03.420 new values it'll reload all the files
00:13:06.180 there for you so you can
00:13:08.579 do some interesting things with pushing
00:13:10.380 new configurations without doing a code
00:13:13.019 deploy you can also do the composite
00:13:15.180 configurations that we talked about it
00:13:17.459 also has support for a set of developer
00:13:19.019 local settings well it's more of a
00:13:21.240 configuration or a convention than
00:13:22.800 anything else but a git-ignored
00:13:25.019 file has settings that are specific to
00:13:26.700 you and then you don't have to worry
00:13:28.200 about committing those back up and
00:13:29.880 messing anything else up and
00:13:31.320 you can get it on GitHub so all
00:13:34.200 this is great we now have a little more
00:13:36.120 sanity a little more clarity and more
00:13:38.339 predictability as to what configuration
00:13:40.860 settings are going to be applied when
00:13:42.300 and yet
00:13:45.600 I can't give this to a non-engineer
00:13:48.420 I mean I could
00:13:50.220 but I don't want to
00:13:52.560 all right so our settings are aware of
00:13:55.620 their context and environment that's
00:13:56.820 great but everything is still in Source
00:13:58.440 still tied to a commit still tied to
00:14:00.600 deploy and it's inaccessible to any
00:14:03.000 other apps if you guys have been paying
00:14:04.680 attention to all the other talks this
00:14:05.820 week I'm sure by now you all have
00:14:07.500 completely reconfigured and read
00:14:08.940 factored your applications so they're
00:14:10.560 all beautiful sparkling service oriented
00:14:13.019 architectures
00:14:14.220 and so if we have configuration stuck
00:14:16.740 inside of a single app
00:14:18.600 it's going to be extremely hard to
00:14:19.920 extract that and you're going to be
00:14:21.000 doing things like putting configuration
00:14:22.560 sets into gems and passing them around
00:14:24.839 and I don't really think that's the best
00:14:26.639 pattern and yes of course still
00:14:28.200 inaccessible to non-engineers
00:14:30.600 so let's look at how we're actually
00:14:32.040 going to bake these configurations or
00:14:33.720 build or
00:14:34.980 uh yeah bake or build these configurations
00:14:38.279 the requirements on our end: we want
00:14:40.920 to decouple the process of generating
00:14:42.720 the configuration entirely from the
00:14:45.300 application and the repository itself
00:14:48.240 do you guys remember how we used to do
00:14:49.980 software like before we had scripting
00:14:51.660 languages all over the place we would do
00:14:53.699 builds and the build process would
00:14:55.440 generate an artifact and then we would
00:14:56.940 take that artifact and put it somewhere
00:14:59.100 and so that process Maps really well to
00:15:02.519 how to handle complicated configuration
00:15:04.740 changes we can take whatever data source
00:15:07.800 we have it can be something that's
00:15:09.000 entirely different than yaml that's
00:15:10.440 maybe more uh appropriate from an
00:15:13.199 editing point of view then we can have a
00:15:14.880 process to turn it into a yaml file or a
00:15:16.800 Json file we can version it store it in
00:15:19.260 a central repository and then have some
00:15:21.300 Shenanigans to deploy it in a way that's
00:15:23.220 separated from that overall application
00:15:25.260 itself
00:15:27.720 sounds like a good job for a web app to me
00:15:29.940 in terms of generating and holding and
00:15:32.880 making accessible to users individual
00:15:34.980 configuration files we can also hold a
00:15:37.320 whole bunch of logic in that web app
00:15:38.699 that we don't want to shove into our
00:15:40.440 actual applications things like access
00:15:42.899 controls individual auditing uh
00:15:46.079 making sure that we keep a list of every
00:15:47.639 single change and then any kind of
00:15:50.399 bounds or sanity checking that's in
00:15:51.899 there and then for those of you in
00:15:53.279 larger corporate environments you can
00:15:54.540 put in God knows what else
00:15:57.060 so
00:15:58.320 it's a really simple principle I think
00:16:00.300 you have your Dimensions you've got
00:16:02.639 basically core and then you can do an
00:16:04.560 environment you can do AWS region tag
00:16:07.079 anything else that we've previously
00:16:08.279 specified and then the value for that
00:16:10.079 whether it's core, dev, test, production,
00:16:11.699 us-east-1, whatever, and then simple key
00:16:14.399 value
00:16:16.040 uh and yes
00:16:18.300 it would map even a little more nicely
00:16:20.279 to a document store
00:16:22.260 so why don't we just pull it directly
00:16:24.180 from the database well uh the database
00:16:26.940 is precious right compared to everything
00:16:30.000 else you have if you have a decent
00:16:31.199 architecture you can spin up more of
00:16:32.639 anything else
00:16:33.920 except database power and yes you can
00:16:36.779 say that you'll have you know your
00:16:38.639 horizontally scaling databases that can
00:16:41.399 go uh you know multi-data Center and a
00:16:43.680 lot of stuff
00:16:46.139 I wouldn't trust it compared to being
00:16:48.540 able to do simpler application or simple
00:16:51.000 operations just pulling things from
00:16:52.500 files and also if you're having you know
00:16:56.040 hundreds or even thousands of apps I'd
00:16:58.079 rather do thousands of requests against
00:16:59.820 a static file going through nginx or S3
00:17:02.160 rather than trying to pull all of that
00:17:04.260 information through uh your precious
00:17:07.260 database as well
00:17:08.640 but if you don't have tons of load it'll
00:17:11.040 work fine
00:17:12.600 so our build process is we're going to
00:17:14.880 take our configuration values however
00:17:16.439 they're stored and we're going to
00:17:17.939 transform them into a single YAML doc
00:17:19.740 or what have you that's our artifact
00:17:21.600 we'll also take note at the same time of
00:17:24.660 the builder, we'll set a monotonically
00:17:27.900 increasing build number, a timestamp, and
00:17:30.120 then we can enforce the changelog as
00:17:31.679 well which is actually really handy and
00:17:34.320 then we can include a checksum within
00:17:36.419 there
00:17:37.400 especially people who are
00:17:39.059 quasi-technical will love to go into
00:17:42.120 files and make changes and not annotate
00:17:44.220 them it's really a pain in the butt when
00:17:46.260 those files are not under diversion
00:17:47.760 control so with a checksum that's not
00:17:49.559 going to happen
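
A rough sketch of such a build step, under stated assumptions: the field names, the choice of SHA-256, and both function names are hypothetical, and the talk does not say how the checksum is computed. Here it covers only the configuration payload, so a hand edit to the values is detectable.

```ruby
require "digest"
require "time"
require "yaml"

# Wrap the configuration values in an artifact with build metadata.
# A changelog entry is mandatory; the checksum is taken over the YAML dump
# of the values only.
def build_artifact(values:, build_number:, builder:, changelog:, built_at: Time.now.utc)
  raise ArgumentError, "changelog entry is required" if changelog.to_s.strip.empty?

  {
    "build_number" => build_number,
    "built_at"     => built_at.iso8601,
    "builder"      => builder,
    "changelog"    => changelog,
    "checksum"     => Digest::SHA256.hexdigest(YAML.dump(values)),
    "config"       => values
  }
end

# True when the config section still matches the recorded checksum.
def verify_artifact(artifact)
  Digest::SHA256.hexdigest(YAML.dump(artifact.fetch("config"))) == artifact.fetch("checksum")
end
```

Writing `YAML.dump(artifact)` to a versioned location gives the single document described above, and a consumer can call `verify_artifact` before trusting it.
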
00:17:51.360 so quick example let's get double coins
00:17:53.940 let's get 20 coins every time we click
00:17:56.220 on this puppy here
00:17:57.600 we're just going to go in we're going to
00:17:59.100 change our value in our configuration
00:18:00.900 table
00:18:01.980 and then we'll run our build process
00:18:04.380 and there, there's our YAML doc, awesome
00:18:06.620 so here's information about our actual
00:18:09.600 artifact we put in the build number and
00:18:12.240 then like a tag name for it basically
00:18:14.400 and then the checksum and then we've
00:18:16.559 also got it on an asset server as well
00:18:18.780 so if you want to pull this particular
00:18:20.400 version of the configuration it's there
00:18:22.200 for the taking and that way you can
00:18:24.120 experiment with it not just in
00:18:25.500 production but any other environment as
00:18:27.059 well
00:18:29.179 so how do we get that artifact into a
00:18:33.120 particular environment we can do a
00:18:34.740 promotion process where we basically
00:18:36.299 will map a single build artifact a
00:18:39.660 single configuration artifact to an
00:18:41.820 environment uh or a region or something
00:18:44.160 else so we basically have uh we end up
00:18:47.340 with the same kind of
00:18:49.620 overall end product which is we've got
00:18:51.660 directories uh you know based on
00:18:54.000 environments and data centers and
00:18:56.280 regions and tags and there are YAML files
00:18:58.559 in them they're just on a remote disk
00:19:01.140 somewhere or on a remote web server
00:19:02.640 somewhere and if you build your tools
00:19:05.100 correctly you can do this promotion
00:19:06.480 process well any user can do this
00:19:08.820 promotion process it doesn't have to be
00:19:10.200 just an engineer
00:19:12.179 so
00:19:13.500 our settings are aware of their context and
00:19:15.059 environment the settings are outside the
00:19:16.740 repo they're accessible to other apps
00:19:18.840 you can have any app you want access
00:19:20.580 those URLs the changes are accessible to
00:19:24.000 non-engineers but we can't deploy them
00:19:25.679 yet
00:19:26.460 so how do we do I guess the final step
00:19:28.799 which is going to be the deployment
00:19:30.780 process for all this stuff so
00:19:32.940 got a title slide for it even
00:19:35.100 all right so a release process we're
00:19:38.100 going to be running you know standard
00:19:40.140 tests and continuous integration against
00:19:42.000 the configuration
00:19:43.740 uh we'll run uh both QA and play test QA
00:19:47.700 against a running instance of the
00:19:50.220 application that has that config and
00:19:52.679 then if everything checks out we're going
00:19:54.660 to deploy to production
00:19:56.520 so we've got uh the same kind of deal
00:19:59.340 we're going to loop through our Rails env,
00:20:02.460 our AWS region and our hostname and then
00:20:05.160 we will go get that file from our config
00:20:07.380 artifact server and for each of those
00:20:09.240 files we're then going to do that same
00:20:11.220 deep merge so that we have that
00:20:12.900 composite set of configuration
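
The same layered merge, fetched over HTTP instead of from local disk, might look like the sketch below. The URL layout (`<base>/<layer>.yml`), the function names, and the injectable `fetcher` are assumptions for illustration; the talk does not show how the artifact server is addressed.

```ruby
require "net/http"
require "uri"
require "yaml"

# Default fetcher: returns the body on a 2xx response, nil otherwise
# (for example a 404 for a layer that has no overrides).
HTTP_FETCHER = lambda do |url|
  response = Net::HTTP.get_response(URI(url))
  response.is_a?(Net::HTTPSuccess) ? response.body : nil
end

# Recursively merge overlay into base; on conflict the overlay value wins.
def deep_merge(base, overlay)
  base.merge(overlay) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Fetch each layer from the config artifact server and merge them in order.
def remote_config(base_url, layers, fetcher: HTTP_FETCHER)
  layers.reduce({}) do |config, layer|
    body = fetcher.call("#{base_url}/#{layer}.yml")
    body ? deep_merge(config, YAML.safe_load(body) || {}) : config
  end
end
```

Passing the fetcher in as an argument keeps the merge logic testable without a network and leaves room to swap in S3 or a local cache.
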
00:20:17.400 but you still have to restart I don't
00:20:19.380 want to restart and if you go check out
00:20:21.059 my lightning talk in about an hour and a
00:20:22.919 half you'll see that I really don't like
00:20:24.059 to do restarts so what we can do is if
00:20:27.179 we're running on a modern interpreter we
00:20:29.700 can spawn a thread in the background to
00:20:31.260 poll for updates to that configuration
00:20:34.320 file
00:20:35.340 or set of configuration files a really
00:20:38.460 good mechanism for this is Celluloid
00:20:40.320 which is a library that basically brings
00:20:43.440 actor patterns into Ruby and works great
00:20:47.220 on both jvm and rubinius
00:20:50.640 and you can spawn a Celluloid worker
00:20:54.000 that will run on a timer and we'll go
00:20:56.220 fetch your configuration values and then
00:20:59.940 you know every 30 minutes we'll keep
00:21:01.620 recreating your configuration as long as
00:21:03.720 you write it in a thread-safe way then
00:21:05.340 you'll be able to basically just sit
00:21:07.559 there and any new artifacts that are
00:21:10.020 promoted to production will
00:21:11.460 automatically get brought into your app
00:21:13.559 this doesn't work, well, doesn't work
00:21:15.900 well for uh applications that are
00:21:19.260 running on MRI or that have a
00:21:20.880 global interpreter lock it's also kind
00:21:22.919 of gross in multi-process environments
00:21:24.900 that means Unicorn because once they've
00:21:27.000 forked they've forked
00:21:28.440 so the big thing is that if you're going
00:21:30.840 to be doing this you should be on a
00:21:32.760 modern interpreter and should try to be
00:21:35.100 kind of thread safe because trying to do
00:21:37.380 in-place updates can really get you so
00:21:41.059 here's an example I think it's pretty
00:21:44.400 intuitive even without in-depth
00:21:46.020 knowledge of the Celluloid Library so
00:21:48.840 we'll spin up a worker and then
00:21:52.039 refresh will pull and assemble all of
00:21:55.440 the configuration just like we've seen
00:21:56.880 before and then we have a timer that'll
00:21:58.679 run every 30
00:22:00.720 minutes and refresh those
00:22:04.460 values and then reset the timer so it
00:22:06.720 keeps occurring
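The slide itself is not in the transcript, so here is a rough stand-in for the timer loop just described. It uses a plain Ruby `Thread` rather than Celluloid so that it runs without any gems; the `PollingWorker` class and its method names are assumptions for illustration, not the speaker's code.

```ruby
# Runs a refresh block once at startup and then again every +interval+
# seconds on a background thread.
class PollingWorker
  def initialize(interval, &refresh)
    @interval = interval
    @refresh = refresh
    @thread = nil
  end

  def start
    @refresh.call                 # load once before serving anything
    @thread = Thread.new do
      loop do
        sleep @interval           # the "timer"; looping resets it
        @refresh.call
      end
    end
    self
  end

  def stop
    return unless @thread

    @thread.kill
    @thread.join
    @thread = nil
  end
end

# Usage: refresh every 30 minutes, as in the talk.
# worker = PollingWorker.new(30 * 60) { reload_configuration }.start
# (reload_configuration is a placeholder for the fetch-and-merge step.)
```

An exception raised inside the block would end the background thread, so a real worker would also want the kind of error handling discussed later in the talk.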
00:22:08.039 so what you can do is then you stick
00:22:10.740 those
00:22:12.120 configuration accesses behind a
00:22:14.159 method which will basically make sure
00:22:15.419 that any accesses are thread safe and
00:22:18.539 that you're also getting the latest
00:22:19.559 value so when we deploy our change to
00:22:23.940 increase the number of coins per click
00:22:25.740 from 10 to 20 once that configuration
00:22:29.220 file is promoted
00:22:31.140 all we have to do is wait because that
00:22:33.059 timer will go off and reload all of our
00:22:35.159 configurations and then the next time we
00:22:36.720 want to access it there you go the new
00:22:38.520 value is there
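One way to put configuration access behind a method, sketched under the assumption that the poller hands over a complete new Hash on every refresh. `ConfigStore`, `replace`, `get`, and the dotted key path are illustrative names, not an API from the talk.

```ruby
# Holds the current configuration and swaps it atomically on refresh.
class ConfigStore
  def initialize(initial = {})
    @lock = Mutex.new
    @values = deep_freeze(initial)
  end

  # Called by the polling worker with a freshly assembled configuration.
  # The hash is frozen in place, so readers can never see a half-updated one.
  def replace(new_values)
    frozen = deep_freeze(new_values)
    @lock.synchronize { @values = frozen }
  end

  # Look up a value by a dotted path such as "game.coins_per_click".
  # Returns nil when any segment of the path is missing.
  def get(path)
    snapshot = @lock.synchronize { @values }
    path.split(".").reduce(snapshot) do |node, key|
      node.is_a?(Hash) ? node[key] : nil
    end
  end

  private

  def deep_freeze(obj)
    obj.each_value { |v| deep_freeze(v) } if obj.is_a?(Hash)
    obj.each { |v| deep_freeze(v) } if obj.is_a?(Array)
    obj.freeze
  end
end

SETTINGS = ConfigStore.new({ "game" => { "coins_per_click" => 10 } })
before = SETTINGS.get("game.coins_per_click")

# Simulate the timer firing after the new artifact was promoted.
SETTINGS.replace({ "game" => { "coins_per_click" => 20 } })
after = SETTINGS.get("game.coins_per_click")
```

Swapping in a whole frozen hash, instead of editing values in place, is what keeps the reads simple: a reader holds either the old snapshot or the new one.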
00:22:40.440 so it's good for simple things but then
00:22:43.320 there's a problem if we want to try to
00:22:44.760 change more advanced sets of
00:22:46.620 configuration well
00:22:49.559 let's see what if we need to break a
00:22:51.179 cache so we can put an on-change hook on
00:22:55.380 when that particular configuration value
00:22:57.480 changes it can be a string or a regex
00:22:59.760 something like that and then once we get
00:23:02.520 that new value we can you know do a
00:23:04.679 logger warn statement we can break
00:23:06.960 caches we can do all kinds of things in
00:23:08.880 this way we're cognizant of what's going
00:23:10.500 on with updates and changes to our
00:23:12.659 configuration
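A sketch of such an on-change hook, assuming a flat key/value configuration for brevity. The `WatchedConfig` class, the `on_change` signature, and the example keys are hypothetical; the matcher may be a String or a Regexp because both respond to `===`.

```ruby
# Keeps the current values and fires registered hooks for keys that changed.
class WatchedConfig
  def initialize(initial = {})
    @values = initial
    @hooks = []
  end

  # Register a block to run when a matching key changes.
  def on_change(matcher, &block)
    @hooks << [matcher, block]
  end

  # Install a freshly polled configuration and notify hooks about every key
  # whose value differs from the previous one (including added/removed keys).
  def update(new_values)
    old_values = @values
    @values = new_values
    (old_values.keys | new_values.keys).each do |key|
      old_val = old_values[key]
      new_val = new_values[key]
      next if old_val == new_val

      @hooks.each do |matcher, block|
        block.call(key, old_val, new_val) if matcher === key
      end
    end
  end

  def [](key)
    @values[key]
  end
end

events = []
config = WatchedConfig.new({ "coins_per_click" => 10, "cache.version" => 1 })

config.on_change("coins_per_click") do |key, old_val, new_val|
  events << "#{key}: #{old_val} -> #{new_val}"   # e.g. a logger.warn call
end
config.on_change(/\Acache\./) do |key, _old_val, _new_val|
  events << "bust cache for #{key}"              # e.g. clear a cache here
end

config.update({ "coins_per_click" => 20, "cache.version" => 1 })
```

Only the first hook fires in this run, because `cache.version` did not change between the two configurations.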
00:23:14.640 this is really handy for more
00:23:16.559 operational side of the house type
00:23:17.880 things like if we want to add or remove
00:23:19.919 memcache servers from our pool
00:23:23.520 anytime the list of memcache servers
00:23:25.679 changes when we're polling the
00:23:27.059 configuration we can go ahead and
00:23:29.220 reinitialize the Rails cache client
00:23:32.580 to be a new copy of whatever your client
00:23:35.159 of choice is with the new list of memcache
00:23:37.260 servers so your operations guys will
00:23:39.539 probably be a lot happier with you
00:23:40.679 because they will not have to harangue
00:23:42.179 you and try to find you to take out half
00:23:44.100 of the memcache servers to do maintenance
00:23:47.340 so why have I been talking about polling
00:23:49.620 and not pushing the entire time well
00:23:52.140 with polling basically you're keeping
00:23:54.179 the burden of what happens when things
00:23:56.460 go wrong in the same place actually
00:23:58.559 it's the same code base and same source
00:24:01.799 as the code that's trying to poll and
00:24:05.640 pull the configuration updates if we
00:24:08.159 have a failed poll then
00:24:10.860 you know we can easily rescue that
00:24:12.840 timeout or whatever and we can nuke the
00:24:15.299 worker or we can take other steps we can
00:24:17.880 preserve the original configuration we
00:24:20.280 can do lots of different things but if
00:24:21.539 we're trying to do push and the push
00:24:22.980 fails
00:24:25.080 if I can't access a machine from my
00:24:28.080 control server
00:24:29.640 then I can't further communicate with it
00:24:31.860 in order to tell it to do anything else
00:24:33.179 so it's going to continue to just kind
00:24:34.440 of be a zombie and keep chewing through
00:24:37.559 data with whatever its original
00:24:39.600 configuration is
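To make the polling side of that argument concrete, here is a sketch of rescuing a failed poll and preserving the original configuration. The `ResilientRefresher` class, the failure threshold, and the list of rescued exceptions are assumptions chosen for illustration.

```ruby
require "timeout"

# Wraps a fetch block so that a failed poll never replaces good config.
class ResilientRefresher
  attr_reader :config, :consecutive_failures

  def initialize(initial, max_failures: 3, &fetcher)
    @config = initial
    @fetcher = fetcher
    @max_failures = max_failures
    @consecutive_failures = 0
  end

  # Returns true if the poll succeeded. On failure the previous
  # configuration is left untouched and the failure is counted.
  def poll
    @config = @fetcher.call
    @consecutive_failures = 0
    true
  rescue Timeout::Error, IOError, SystemCallError => e
    @consecutive_failures += 1
    warn "config poll failed (#{e.class}): keeping previous configuration"
    false
  end

  # The caller can use this to decide when to stop the worker or alert.
  def healthy?
    @consecutive_failures < @max_failures
  end
end
```

The decision about what to do after repeated failures stays inside the application process, which is the point being made about polling versus push.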
00:24:41.520 and so
00:24:42.840 at that point your responsibility is in
00:24:44.340 the wrong place and you can't do
00:24:45.600 anything about it so that's why for the
00:24:47.640 most part you want to do polling
00:24:50.159 instead of push for this one although
00:24:52.020 I'll talk about an exception in a minute
00:24:54.659 all right so the config is out of our
00:24:57.539 source that's awesome we've got
00:24:59.280 non-engineers that can build and deploy
00:25:00.720 even better we've got configuration
00:25:02.760 that's shared across our
00:25:04.559 just massive constellation of service
00:25:06.600 oriented
00:25:08.159 services we can update the configuration
00:25:11.220 live without doing any restarts and we
00:25:13.620 can test those configurations easily
00:25:15.120 because this doesn't just work for
00:25:16.440 production you can set up an arbitrary
00:25:18.360 amount of environments or hosts or
00:25:19.799 whatever you want and then you can
00:25:20.940 easily
00:25:21.860 have whoever is responsible for those
00:25:25.140 configuration updates you know a game
00:25:26.700 designer or playtester or someone like
00:25:29.039 that can easily push that configuration
00:25:30.720 to a test server and test it themselves
00:25:33.120 you're not even involved which I think
00:25:35.580 is a big plus because I know that
00:25:37.020 everyone in this room has been bugged
00:25:38.940 for
00:25:40.100 quite a lot of configuration changes
00:25:42.840 and so forth by people who can't do it
00:25:45.120 themselves
00:25:46.020 and then if you design your tool set
00:25:47.820 correctly you can also record any
00:25:50.220 possible changes you can audit
00:25:52.679 everything you know exactly what's been
00:25:53.940 going on and you can actually get more
00:25:56.340 than just having your configuration
00:25:58.740 files in git will be able to supply to
00:26:00.659 you
00:26:01.380 so
00:26:02.580 this is all really good for games you
00:26:04.380 know we've got thousands upon thousands
00:26:05.880 of configuration values and now we have
00:26:07.559 a framework and basically peace of mind
00:26:10.080 when it comes to managing them making
00:26:11.880 sure that we don't screw things up what
00:26:13.919 else can we use this for
00:26:15.419 well it's really good for pushing out
00:26:17.760 lots of A/B testing if you want to get
00:26:20.940 lots of experiments out there this is a
00:26:23.100 good way to make sure that all of your
00:26:25.200 experiment data well in terms of what's
00:26:27.960 going to be changed can easily be pushed
00:26:29.880 out to all your machines
00:26:31.620 also this is really good for dealing
00:26:33.059 with i18n and translation management
00:26:35.419 some of our games are published in 15
00:26:37.559 languages and have tens of thousands of
00:26:39.360 strings each
00:26:40.740 we don't really want them mucking around in
00:26:42.720 config/locales and so with this
00:26:44.940 framework and setup we're able to have
00:26:47.760 them use whatever tools that they like
00:26:50.159 and have become used to and then we work
00:26:52.679 with them to create a simple kind of
00:26:54.600 tool that will bake that into a YAML
00:26:57.240 file that the game can then consume on
00:26:59.400 its own and finally feature flags
00:27:03.059 all right bonus non-Ruby content because
00:27:05.700 we still have a little bit of time these
00:27:07.860 are three things that I think are very
00:27:09.179 influential and very cool to check out
00:27:10.679 even though none of them are strictly
00:27:12.659 Ruby per se first of all how many of
00:27:15.539 you have been keeping up with the
00:27:17.340 Netflix efforts with open sourcing their
00:27:19.740 internal API
00:27:22.140 everyone's hands should go up
00:27:25.080 their stuff is the commitment that
00:27:27.360 they've put toward pushing the envelope
00:27:29.460 and pushing out source for how to deal
00:27:33.179 with not just working in the cloud but also
00:27:35.480 working with kind of the mundane
00:27:37.620 problems like configuration management
00:27:39.860 we've gotten a bunch of ideas from their
00:27:42.059 configuration library the composite
00:27:44.340 configuration library called Archaius
00:27:46.799 and it's in Java you can use it in
00:27:49.020 JRuby and that's just the start of the
00:27:51.960 kind of Netflix ecosystem of Java
00:27:54.900 plugins and Java libraries and they've
00:27:58.320 also demonstrated a commitment to
00:28:01.200 helping their stuff run with other JVM
00:28:04.380 languages like JRuby as well there's
00:28:07.020 also ZooKeeper
00:28:08.820 ZooKeeper has a reputation for being
00:28:11.640 kind of tough to run in production and
00:28:13.440 I won't dispute that but it does solve an
00:28:17.220 extremely hard problem which is a
00:28:18.779 strongly consistent cluster
00:28:20.220 synchronization service so it
00:28:23.400 basically just supplies some
00:28:26.039 basic primitives of sets and puts and
00:28:28.020 watches and things like that and you can
00:28:30.120 be guaranteed that they'll occur in the
00:28:31.679 order that you specify why this is
00:28:34.440 relevant to us is because the watch
00:28:36.120 primitive on ZooKeeper allows us to
00:28:39.299 basically get a stronger reliability
00:28:42.659 guarantee for being able to get push
00:28:45.299 updates to configuration and so that
00:28:47.760 leads into a very interesting Python app
00:28:51.539 that's based on ZooKeeper called Jones
00:28:53.520 and basically takes a lot of the
00:28:56.100 concepts that I've been talking about in
00:28:58.080 this talk and combines them with
00:29:00.539 ZooKeeper to allow you to instantly push
00:29:02.820 configuration changes into whatever
00:29:06.179 hosts are hooked up and watching those
00:29:09.000 specific keys on a ZooKeeper instance
00:29:12.840 all right I think I have some extra time
00:29:15.059 that's it you get another gratuitous
00:29:16.919 puffy picture
00:29:19.679 cool you're all experts now
00:29:22.140 thank you very much