00:00:05.120
hey everybody my name is Austin story and I am a tech lead and manager at
00:00:10.620
Doximity I'm honored to be here sharing lessons that Doximity, that our
00:00:16.320
team has learned over the last several years about how to effectively sync data between Rails microservices and do that
00:00:23.039
at scale and I'd like to start with painting you more of a picture of what we're going to
00:00:28.260
be talking about over the next half hour and that's going to be whenever you have
00:00:33.420
a company that has more complex data requirements and whenever you start
00:00:39.059
things are a little more simple it's easy to make decisions you have a rails monolith following the rails way
00:00:47.040
how do you keep your application developers and your data engineers in sync
00:00:54.120
so that you can lean into all of the rich things that rails provides for you in order to keep your Rich domain models
00:01:00.660
in sync over time as your business grows your data needs
00:01:06.659
grow your applications grow get more lines of businesses you end up in a situation where you have multiple apps
00:01:12.240
multiple teams multiple data teams multiple application teams and those problems become harder and
00:01:18.720
harder to keep the data team and the web application team in sync and then finally imagine that you're in
00:01:24.900
a situation where you have dozens of lines of business and you have 70 plus application
00:01:30.780
developers dozens of teams that are working on and relying on Rails to provide Rich data needs for your clients
00:01:38.340
and then data teams where you have 45 Plus data engineers and over a dozen
00:01:44.520
teams that's what I'm going to be talking to you about today how doximity has solved
00:01:50.579
that problem to effectively sync the data between multiple rails microservices and enable our data teams
00:01:55.740
and our web application teams to work together the way that I'm going to do that is
00:02:01.020
first I'm going to start talking a little bit about the background talk about the domain a little bit about doximity our company and then we're
00:02:08.039
going to talk and we're going to Define explicitly what we mean when we say effective data syncing after that we'll
00:02:13.800
move into more of the application and Company growth that we experienced and
00:02:19.200
some of the things that we tried along the way to keep our data team and our web application team in sync and finally we'll end up on what I call our secret
00:02:25.680
sauce the architecture that has worked for us over the last several years and has enabled billions of effective data
00:02:32.760
syncs in our system now a little bit of background and domain now first our company is
00:02:39.480
called Doximity it is an 11-year-old Rails-based application
00:02:45.420
and our company is a professional
00:02:51.540
medical network that is focused on enabling physicians to save time so that
00:02:57.239
they can provide better care to patients we provide doctors with a lot of ways to communicate in a more modern way to
00:03:03.840
enable their workflows and also some continuing education tools for them
00:03:09.120
and some of the products that we've developed over the years in order to enable that are things like Doximity Dialer
00:03:15.900
that is a product that has facilitated over 100 million Telehealth calls in the
00:03:22.019
US and then another one that I'll be talking about a little bit later is called continuing medical education that
00:03:27.659
is an entire system where we ingest articles that are medically relevant and
00:03:33.720
we give them to the doctors so that they can read them and get credit for that toward their continuing medical education
00:03:39.360
we have Rich search that is enabled through our integration with rails
00:03:45.480
domain modeling we have secure faxing and messaging uh we also have Rich
00:03:51.120
profile data for our physicians you know they have simple things like name but also things like their specialty the
00:03:57.659
things that they've done and where they went to college or university those sorts of things and another important
00:04:04.860
area is medically relevant news we provide uh that for our teams and our
00:04:10.680
physicians in the form of a news feed
00:04:15.900
and because of all of those features that we've had we've grown to a point where we have over 70% of all the
00:04:21.660
U.S. physicians and 45% of nurse practitioners and physician assistants as verified members on our site
00:04:28.440
now with that amount of features and users
00:04:34.080
we have to have a lot of teams in order to build that out at this point we have over 10 data teams
00:04:40.380
with 45 plus engineers and 20 plus application teams with over 70 Engineers that are building out and
00:04:48.180
maintaining these features and with the system that I'm going to introduce you to a little bit later our data update tool
00:04:53.220
we've performed over 7.7 billion data updates since April of 2019.
00:05:01.020
so now that you know a little bit of the background and the domain that we're going to be talking about let's define
00:05:06.360
effective data syncing what I'm talking about here is data integration in rails so you have many
00:05:13.380
rails based microservices and you have many different data stores how do we
00:05:18.479
move data to and from those different data sources while respecting application business logic without
00:05:24.660
breaking things now before we go into the talk about application growth I just want to give
00:05:31.800
like a preview of the solution that we ended up on in general it is a Kafka
00:05:37.320
based system that allows our data team to produce messages and our application developers to consume those messages
00:05:42.720
that they're produced they're able to work independently because of that now
00:05:48.660
in the beginning before we had all the teams all of the microservices we had the monolith and our
00:05:57.240
monolith was quite majestic sparkly had unicorns
00:06:02.580
and it had sprouted wings at some point and just to give a summary of how data
00:06:08.460
updates work in a monolith application I'd like to kind of step through that
00:06:14.639
this is the way whenever you have a monolith trying to get data updates in but at the end of the day all we really
00:06:20.160
care about is that we're able to serve our users for us it's Physicians they don't care about all the stuff that
00:06:26.520
we're doing all that they care about is that they're able to get the stuff that they need and access the data that they
00:06:32.160
want whenever they want it but Rails is so fantastic at providing a
00:06:37.440
rich way to model all of the domain logic that exists and is distributed amongst
00:06:42.720
many data stores you know MySQL Redis and there's also a lot of Rails developers that are very familiar with
00:06:49.500
all of The Primitives that rails provides all the abstractions in order to integrate or in order to communicate
00:06:54.660
with those data sources and whenever you have a data need something that is
00:07:00.300
more simple say you want to go in and upcase all the physicians' first names
00:07:06.240
business talks to the Rails developers and the Rails developers have a very well-known mature set of tools in order
00:07:13.259
to handle that cron Rake Active Job the Rails console in order to get those data updates to go
00:07:20.520
through the rich domain modeling that rails has provided and sync it to all the data stores
00:07:26.880
and just as a way to demonstrate how fantastic rails is at keeping these data
00:07:32.039
stores in sync I'd like to talk about what you would do if you wanted to add
00:07:37.199
better search for your users and say you want to enable your Physicians to be
00:07:43.440
able to find each other by many other types of criteria so name where they went to University and you want to be
00:07:49.380
able to sort that by uh relevance and control scoring and that sort of stuff so your team decides to use
00:07:54.660
elasticsearch for that because it's very good at that sort of stuff how do you keep
00:08:01.080
your elasticsearch system up to date with your users now there's a lot of ways that you can
00:08:07.919
approach this problem but I think that this is one of the areas where rails shines with its Rich application domain
00:08:15.300
modeling now there's a lot involved with doing search effectively but whenever I'm
00:08:22.080
focusing on just the step of making sure that your user data stays in sync
00:08:29.280
whenever you change it with your elasticsearch index all that you really have to do to
00:08:34.740
accomplish that is create an after_commit hook in your user model to schedule a background
00:08:40.680
elasticsearch sync and then have that method kick off a background job that
00:08:46.380
will successfully re-index the user in your elasticsearch
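The pattern just described can be sketched in Ruby. This is a minimal, self-contained illustration: in a real app `User` would be an ActiveRecord model whose `after_commit` callback Rails fires for you, and `SearchIndexJob` (a name assumed here, not from the talk) would be an ActiveJob subclass; the tiny stand-ins below exist only so the sketch runs on its own.

```ruby
ENQUEUED = []

# Stand-in for ActiveJob's ApplicationJob so the sketch runs standalone;
# in a real app this base class and its queueing come from Rails.
class ApplicationJob
  def self.perform_later(*args)
    ENQUEUED << [name, args] # ActiveJob would serialize this onto a real queue
  end
end

class SearchIndexJob < ApplicationJob
  def perform(user_id)
    # Real job body: load the user and (re)index it in Elasticsearch, e.g.
    # client.index(index: "users", id: user_id, body: user.as_json)
  end
end

# Stand-in for an ActiveRecord model; only the callback wiring is shown.
class User
  attr_accessor :id, :first_name

  def initialize(id:, first_name:)
    @id = id
    @first_name = first_name
  end

  def update!(attrs)
    attrs.each { |k, v| public_send("#{k}=", v) }
    after_commit # ActiveRecord fires this once the transaction commits
  end

  private

  # The talk's pattern: after_commit schedules a background Elasticsearch sync.
  def after_commit
    SearchIndexJob.perform_later(id)
  end
end

User.new(id: 42, first_name: "austin").update!(first_name: "Austin")
ENQUEUED.first # => ["SearchIndexJob", [42]]
```

Because the callback runs only after the commit, the background job never indexes data the database later rolls back.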
00:08:51.779
this is one of my favorite parts of rails it makes tasks like this so simple and straightforward and it's also one of
00:08:58.980
the reasons why over time you get a ton of very rich domain logic in your rails
00:09:04.680
models now let's talk a little bit about what happens whenever your business gets to a
00:09:10.620
point where it needs more data like more data expertise
00:09:15.839
and here's some examples of reasons that that would happen so doximity is a very data driven company you know most of the
00:09:21.660
decisions that we make for new products leaning into specific areas of our
00:09:27.300
business are related to the feedback that we're getting from our users on whether they're
00:09:32.580
engaging with specific features or specific products so there's a very sophisticated analytics pipeline and we
00:09:39.600
need to know whether what we're doing is working in a timely manner doing that is very difficult it requires people
00:09:45.959
that have deep specialized knowledge in data pipelines and analytics
00:09:51.000
now there's also some other features that we lean on quite heavily like machine learning recommendations and
00:09:56.279
data Pipelines but let's look at a real example of a
00:10:01.860
complex data need that we've actually built out so that we're talking in concrete terms so earlier I mentioned that we have a
00:10:08.279
lot of physician profile data these are things like first name last
00:10:14.760
name where they went to university their specialty their sub-specialty and then we also have that continuing
00:10:21.000
medical education that I talked about where we ingest articles and we're able to put them through a pipeline where we
00:10:28.740
can extract out things like who has been cited in other articles so business gets
00:10:35.160
this idea hey you know how empowering would it be for our physicians if they were able to
00:10:41.100
see that when they created a white paper or a journal somewhere that somebody
00:10:46.440
else cited their work in the other person's article
00:10:51.540
and the end result that we had was this is something that we've released and
00:10:57.120
physicians like it it's empowering it's cool whenever people do work and other people rely on that and it allows
00:11:03.839
the physicians to do things like you know one feel better about writing articles and write more articles
00:11:09.180
or journal entries or go in and double-check that everything gels with what they were saying
00:11:15.480
now doing this is a subtly complex problem there is a lot involved with
00:11:23.279
matching CME article citations with physician data you know the first part of this is
00:11:30.000
you have to make sure that all the physician names are correct you know how do you do that in the first place
00:11:36.300
and then after that you have to make sure that you have the information there on the Physicians like what is their
00:11:42.240
specialty where did they go to university uh then for the CME articles you have to
00:11:48.360
clean and standardize all of the citation names that appear in these
00:11:53.940
journal entries that's also hard because there is no
00:11:59.160
standard format for this you know a journal could choose to do first name last name last name first name they could put any string of
00:12:05.940
characters that they want in there and then after you're able to get all the
00:12:11.220
physician data good and standardize all the cited names which are both hard processes by themselves then I think the
00:12:18.720
real difficulty starts where you get into name matching you know for uncommon
00:12:23.880
names like Austin Story that may not be too difficult but what if you have a very common name and you have
00:12:29.519
multiple Physicians that share the exact same name then you start getting into confidence scores where you look at the person's
00:12:35.760
specialty is this a journal entry related to the specialty that they would be uh writing in you know uh is this
00:12:42.540
related to something that they've done in the past so this is very
00:12:47.700
very difficult and our data team does an incredible job at it but it does require
00:12:52.740
deep expertise and specialists so how do we integrate these data
00:12:58.680
specialists that have their own unique sets of tools that they need into our existing rails
00:13:06.600
monolith application so that the things that they're doing get piped through our Rich domain models because they're used
00:13:12.720
to things like python spark jobs raw SQL how do we do this
00:13:18.120
well I'll tell you about the steps that we took in order to accomplish this and the first one is a question that
00:13:25.500
I ask a lot is what is the easiest way to do this and one of the easiest ways that you
00:13:31.920
could arrive at is to just give them direct database access and then promise
00:13:37.560
to be super careful but before you just walk away from that promise you have to solidify that with
00:13:44.940
one of the most binding contracts possible which is the pinky promise
00:13:51.779
now after you've made a pinky promise to be super careful whenever they have direct database access
00:13:59.040
there's a lot of pros and cons that you can evaluate with this system the pro is that it's very easy to
00:14:06.660
integrate them in the system you know you don't have to make any changes to anything that is over in
00:14:14.100
your Rails ecosystem you just give them direct database access but that does come with a lot of cons
00:14:21.480
the first of which is that pinky promises are actually pretty hard to keep you know uh even if everybody has
00:14:27.360
the best intentions what if the context of that Pinky Promise changes and not everybody on the team gets the update
00:14:33.360
that the context has changed what if there are some tables that for some reason need to be highly available and you
00:14:39.360
can't change them uh During certain hours you know what if there's accidents
00:14:44.459
and even if there's no accidents and that Pinky Promise is completely managed
00:14:49.860
correctly you can't control the load here the data team can just go in and do whatever they
00:14:55.560
want whenever they want to you have to coordinate the load there so that they're not overwhelming specific tables
00:15:01.019
whenever you're needing to serve those to the Physicians and then I think the biggest reason that
00:15:06.060
I don't like this as the way that we do things is because even if all of that
00:15:12.360
happens perfectly you keep your pinky promise the load doesn't ever bring down the site the database servers
00:15:17.940
never get overwhelmed the biggest thing here is that the application your rich domain logic from
00:15:23.399
rails is not going to be respected whenever the data team goes in and does that update on all the users first names
00:15:31.139
there's no way for them to run all of the rich domain modeling that rails is
00:15:37.860
providing you you know you don't get the after_commit whenever you're doing a direct update in the database so
00:15:44.519
your Elasticsearch job isn't kicked off also caching you know even if your
00:15:50.279
data team knows like okay whenever I update a record I need to also update the updated_at to now whenever I do
00:15:57.120
that what about the other models that are dependent on
00:16:03.779
touches you know like there's no way that you can expect the data team to be
00:16:09.300
aware of all of the other concerns that could be added in the future you know if
00:16:14.339
you need to touch the account whenever you update the user the data team has no way to know that that needs
00:16:19.740
to happen so the application logic not being respected I think is the biggest reason why we didn't go with this one so the next way is
00:16:27.180
I think the next easiest thing which is just an admin UI with some REST over it
00:16:32.339
so instead of giving them direct database access you provide some API so that they can do their updates submit
00:16:39.420
them through some restful call and then all of your updates do get piped through
00:16:44.940
those Rich domain models at that point so that is much better there's also some other good benefits here uh you know
00:16:50.940
REST is well known and it can also be shared with other clients like web and mobile APIs but there are also some
00:16:59.160
downsides for this and also the reasons that we didn't continue
00:17:04.620
down this path the first one is that the limiting of the clients is completely
00:17:10.860
dependent on them respecting when they get a message to stop sending requests
00:17:16.319
you can't control that the clients can just keep pushing stuff up and you just keep
00:17:22.319
responding with you know back off and make requests later also batch processing is difficult here
00:17:30.540
most of the time whenever you're doing batch processing there's different subtle needs for each type of process
00:17:36.840
that you're going to be pushing it through and what we ended up with was a lot of snowflake type setups where each
00:17:44.460
type of update was different and each batch processing update was different so
00:17:50.580
we didn't stick with this long term but we landed on what I'm going to call temp tables plus sync internally we
00:17:55.919
call this our data update tool volume one and with this type of setup you know
00:18:01.559
instead of the direct database access or the REST APIs you create some temp tables
00:18:07.080
data team populates those temp tables and then you create a tool so that
00:18:13.440
whenever the data team wants to they can sync those tables to the real Rails
00:18:19.440
application tables and we ran with this one for a while several years the pros on this were pretty high you
00:18:26.340
know the data team had no direct access to the main database so we didn't have to worry about them accidentally taking down the
00:18:32.039
main database uh we were able to separate the data writing from the consuming which is
00:18:37.140
fantastic but there were also some cons here it was hard to manage the load
00:18:43.880
and our batch processing was pretty difficult using the batch processing
00:18:49.080
system that we were using so this probably could have worked forever for us but right as we got this
00:18:58.020
temp tables plus sync solution running the size of our team started growing a
00:19:04.320
lot and we started bringing up a lot more line-of-business applications
00:19:10.380
and that's what I'm talking about right now so we had a lot of growth about this time so we had our main application which had
00:19:17.640
the setup that I just talked to you about where we had you know rails developers and data team working in harmony through this temp table
00:19:23.460
and then we also brought up our news feed about the same time this is where we're delivering all the medically relevant news that I was talking about
00:19:29.520
we brought up another service to handle all of our colleague features in order to facilitate connections between
00:19:34.740
people in our Network how do we manage this
00:19:40.440
sort of set up so that each team is not having to maintain their own way for their data team to integrate with the
00:19:46.740
rails application because keep in mind at the end of the day Physicians don't care about the way that our back end is
00:19:52.679
set up all that they care about is that they're able to get the data that they need and that's what we should be focused on you know enabling the teams
00:19:59.100
to work so the Physicians can get what they need so that their jobs are easier
00:20:04.440
cool so now we're going to move on to the architecture that works our secret sauce here
00:20:09.960
just to tie this back again to our definition of effective data syncing how do we move data to and from the different
00:20:16.679
sources while respecting the application business logic and not breaking things now let's talk about how we do this
00:20:22.440
so we just went through a lot of growth we had the temp tables plus sync solution going
00:20:28.500
and that gave us an opportunity to kind of pump the brakes a little bit and say hey you know what we have here is good
00:20:35.760
but how do we build this in a way where it is scalable where it is going to grow
00:20:41.039
with our system and our team as we get more teams and more microservices
00:20:46.500
so we were able to sit down and Define a vision set some goals some requirements
00:20:51.539
for what we wanted our system to work with or look like for our data updates
00:20:57.240
and the first thing was that it has to work with our existing code base you know like just completely stopping
00:21:02.760
developing for six months or a year is absolutely not an option so it has to work with what we're doing right now
00:21:08.640
it also has to be easy to use for the data team that was some of the feedback that we got about the data update tool you know it was an extra step and a
00:21:14.880
little bit more complex for them to go in and update things has to support multiple apps out of the
00:21:22.020
box so it has to be easy for us to bring a new application into the system safeguards to avoid disaster if the data
00:21:28.679
team is producing too much data we need an easy way for that to not be impacting our web application servers that are
00:21:34.740
running all those updates through the rich application domain models it has to be bulk processing by default
00:21:40.559
we wanted this to be a first class concern in our system and we also wanted a complete split
00:21:46.380
between the people that are producing data and the people that are consuming data completely independent
00:21:53.039
and in order to fulfill a lot of those needs we ended up reaching for a tool
00:21:58.679
called Kafka and I'll talk a little bit about it it's not the most important part of this but just know that it's a
00:22:04.500
tool that we used in order to fulfill these needs so Kafka the things that it fulfilled for us are it allowed us to
00:22:10.380
split the producers and consumers apart it kind of acted as like a bridge in between those it gave us multiple app
00:22:16.440
support easily and it also gives us safeguards because the data team or the producers of the data are completely
00:22:22.200
independent from the consumers of that data so they're able to be independent they were also able to go at
00:22:27.299
their own speed The Producers could produce as fast as they want to the consumers could consume as slow or as
00:22:32.580
fast as they want to and there was now no longer a need to communicate between them you know you
00:22:38.820
didn't have to go reach out to the application team before a data team was doing a big push and uh
00:22:44.640
what we added was an app topic for each of these
00:22:50.520
applications in order to allow them to communicate on just their bandwidth so
00:22:55.679
you know like you have a main app topic you have a news feed topic you have a colleagues topic and for those of you that are not
00:23:02.460
familiar with Kafka at all I'll just do like a really high level overview the easiest way to think about it is imagine
00:23:09.840
that you have a way to send a message to somebody
00:23:16.020
and do it in a JSON payload you know it's not restricted to JSON but that's
00:23:21.659
how we use it internally and you're able to put it in a
00:23:27.179
text file and you just keep appending to that text file and Kafka provides all of the
00:23:34.260
abstractions that you need so that you can distribute this text file and you can consume this from this text file in
00:23:40.919
a way that is very fault tolerant and resilient and the producers produce to this
00:23:46.679
file and then a consumer basically just gets like a pointer to the text file and they just start reading and then they can stop and start as much as they want
00:23:52.980
to so another way to think of it if you're familiar with Active Support concerns or
00:23:58.140
Active Support notifications you are able to dispatch using an Active Support notification and it's just persisted
00:24:04.620
somewhere and you can read chronologically through all of those notifications at your leisure
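That text-file analogy can be made concrete with a toy in-memory version. This is purely illustrative (no real Kafka client, and all class names are made up): producers append to a log, and each consumer owns nothing but an offset into it, so it can stop and resume at will.

```ruby
# Toy model of the Kafka analogy: an append-only log (the "text file")
# producers write to, consumed by readers that track only an offset.
class AppendLog
  def initialize
    @entries = []
  end

  def append(message)
    @entries << message
  end

  def read(offset)
    @entries[offset] # nil when the reader has caught up
  end
end

class LogConsumer
  def initialize(log, offset: 0)
    @log = log
    @offset = offset # Kafka tracks this per consumer group
  end

  def poll
    message = @log.read(@offset)
    @offset += 1 if message
    message
  end
end

log = AppendLog.new
log.append("type" => "update", "model" => "User", "attributes" => { "first_name" => "Austin" })

consumer = LogConsumer.new(log)
consumer.poll # returns the message appended above
consumer.poll # => nil, caught up; it picks back up from the same offset later
```

Because producers only ever append and consumers only ever advance a pointer, neither side has to know the other exists or how fast it is going.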
00:24:10.620
cool so how did this look for our system you know we have the split with our data team on the
00:24:17.159
left and our application team on the right we wedge Kafka right in between those two and the way that it works is the
00:24:24.600
data team is going to produce into a Kafka topic and our application team is going to
00:24:30.840
consume from that topic now before we go into a little bit more
00:24:36.900
details about how the data team produces and how the application team consumes I want to talk about a couple of Primitives that are kind of core to our
00:24:43.260
system and the first is called an operation uh
00:24:48.600
keep in mind we've done 7.7 billion of these at this point
00:24:54.720
and it's composed of a few parts first it is a command to change the data in
00:25:00.240
some way and these are the normal things that you would expect you know create or insert update delete
00:25:05.400
upsert they're also self-contained an operation has to be able to live completely
00:25:12.299
on its own it has everything that it needs so that whenever the operation is consumed somewhere the
00:25:18.960
consumer can do everything that it needs to with it and it must belong to a batch even if it
00:25:24.659
is a batch of one this is how we make sure that batching is a first class concern
00:25:30.360
and some of the specifics about how we did our operation so we created the operation concept and then
00:25:37.080
whenever we are dispatching out these operations I'll show you an example of what one looks like later we used
00:25:43.260
Avro and JSON in order to do this Avro is a schema format that allows us to validate
00:25:49.620
the format of the values that are coming through but the two things that I want to focus on here are model
00:25:55.740
and type those are two very important Concepts here because it's how we allow
00:26:01.559
the consumers and we'll talk about this later to look up the right importer in order to pull the data in and it also
00:26:07.679
includes things like an identifier which is a key-value pair in order to look up the object a batch ID that it belongs to and then
00:26:14.940
any attributes that it needs in order to update so you know if you're updating somebody's name you would
00:26:21.179
send name and then the new name that you want and then the requester so that we can give people alerts and also audit
00:26:29.279
cool another primitive batching this solely exists as a way to
00:26:34.440
track manage and report bulk-processed operations that's the only reason that it exists
00:26:41.640
cool so here's the diagram of what our system looks like in order to facilitate
00:26:47.820
the data team being separated from the application team and here up at the top we have symbolized by python the data
00:26:55.200
processes that's our orange box we have Kafka as our purple box
00:27:01.980
and our rails as our green box so in general the way that this works is the
00:27:07.799
python process there The Orange Box is going to produce it's going to write to a topic
00:27:13.620
and that topic is going to really just sit there and then on the Rails side with
00:27:18.960
the green they have a consumer and that consumer is going to read from that same topic
00:27:24.120
now it's also going to do a couple other things it's going to write to a results topic as it
00:27:32.640
is consuming and it's also going to reach out to a main controller which is shown in red
00:27:38.220
in order to check and see if it needs to just stop doing what it's doing for a
00:27:43.440
little bit this is how we prevent disasters from happening and then our red box down there in the bottom left
00:27:50.279
this is the only time we'll talk about this but we have a metadata consumer where we're able to look at both the
00:27:57.059
operations and the results of them and be able to report on the status of these
00:28:03.120
batches that the Python side is writing to the topics
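As a rough sketch of that consumer side, with topics reduced to plain arrays: the loop reads operations from the app topic, applies each one, writes a result message to the results topic, and checks a controller so consumption can be paused. Every class and method name here is an illustrative stand-in, not Doximity's actual code.

```ruby
# Stand-in for the main controller; the real one is a separate service the
# consumer checks to see whether it should stop what it's doing for a bit.
class Controller
  def paused?
    false
  end
end

class ConsumerLoop
  def initialize(topic:, results_topic:, controller:)
    @topic = topic                 # stand-in for the app's Kafka topic
    @results_topic = results_topic # stand-in for the results topic
    @controller = controller
  end

  def run
    @topic.each do |operation|
      next if @controller.paused? # a real loop would wait and retry, not skip
      status = apply(operation)
      @results_topic << { batch_id: operation.dig(:batch, :id), status: status }
    end
  end

  private

  # Stand-in for piping the operation through the app's rich domain models.
  def apply(_operation)
    :ok
  end
end

topic   = [{ batch: { id: "b1" }, type: "update", model: "User" }]
results = []
ConsumerLoop.new(topic: topic, results_topic: results, controller: Controller.new).run
results # => [{ batch_id: "b1", status: :ok }]
```

The results topic is what lets the metadata consumer report batch status without ever talking to the web application directly.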
00:28:09.179
cool so let's talk more specifically about the data side so data producers that is The Orange Box
00:28:15.779
and there are a few things here some of the things that are important for the data team to work
00:28:21.720
independently they have to own their own data stores and be able to work completely independently of anything on
00:28:27.120
the application side so when the data team is doing processing you know a good example of that is the CME and
00:28:34.200
profile example that I talked about where we are associating citations to real profiles
00:28:39.900
they have to do that in their own data stores so they pull the data in they do any transforms that they need to and
00:28:46.080
then whenever they are done with what they're doing they'll run a
00:28:51.900
Python script or submitted job and that will write to the Kafka topic that they are targeting
00:28:57.480
there's also several other ways that we can produce into this it does not have to be the data team you can do this from
00:29:03.480
any language and I'll show you exactly what that looks like here in a second but you can do this also from a web UI
00:29:09.120
that red part that I talked about earlier we also have a web UI that allows you to submit jobs and then you could also just do this in any like
00:29:15.360
other rails based system so more specifically this is an example of what a data producer python script
00:29:22.440
would look like so up in the top we have a module that we've created and this
00:29:29.159
module allows you to create a batch notice that with that batch you just specify a few things one is like your
00:29:35.520
application Target you know I chose my app here that would be the application that is going to be piping this through
00:29:41.880
its Rich domain models put your username that is who you are for auditing and then I mentioned
00:29:49.860
earlier that there was a model and a type that is right there where it says Job that is the model name that is
00:29:56.279
very important because it tells the system which model they are targeting and then last is prioritization this
00:30:03.299
isn't something that you would need to do in your own implementation of this but it's something that we've added and it
00:30:08.880
really helps the data updates go through so that is wrapping up the batch then once you have your batch you
00:30:15.240
are going to add some operations to it here we are adding an insert and we are passing
00:30:21.480
some attributes that I mentioned earlier that is the thing
00:30:26.940
that represents the update that we are pushing through the system and then we are adding another operation
00:30:33.659
after that that is doing a separate update with a description and
00:30:39.840
this Python script can look however you want you really just need to build some sort of an
00:30:45.179
abstraction so that your data team can interact with producing data into this topic but the messages
00:30:52.020
actually end up looking something like this so after you run the script it'll produce a bunch of messages that look like this
00:30:57.360
and this can really look however you want as well these are just some decisions that we've
00:31:03.659
made some things I want to point out here we have the batch it gives you the ID and then the size of the batch in the previous slide we created
00:31:10.440
two operations so this has two operations in that batch we have the index of this update in that
00:31:16.020
batch we have a type which here is an update and then a model which is the
00:31:21.480
Job so we've given in this operation all of the context that is needed
00:31:27.299
for some process that is using it to be able to read the update to
00:31:34.799
find the model that it needs and to update it properly cool so we talked about the data side
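Putting the producer pieces together, here's a minimal sketch of what that batch abstraction and the resulting message shape could look like. The real producer described above is a Python module writing to Kafka; this Ruby version collects messages in memory so the shape is visible, and every name here (`DataBatch`, `add_operation`, `produce!`) is illustrative rather than the actual API:

```ruby
# Hypothetical sketch of the batch-producing abstraction described above.
# All names are illustrative; the real system is a Python module that
# writes JSON messages to a Kafka topic.
class DataBatch
  attr_reader :messages

  def initialize(app:, username:, model:, priority: "normal")
    @app = app            # which application's domain models pipe this through
    @username = username  # who submitted the batch, for auditing
    @model = model        # the target model, e.g. "Job"
    @priority = priority  # optional prioritization, as mentioned in the talk
    @operations = []
    @messages = []
  end

  # Queue one operation (insert/update) with its attributes.
  def add_operation(type:, attributes:)
    @operations << { type: type, attributes: attributes }
  end

  # Flush the batch: each message carries the batch id and size, its own
  # index within the batch, plus type, model, and attributes -- all the
  # context a consumer needs to find and update the right record.
  def produce!
    batch_id = "batch-#{rand(10_000)}"   # illustrative id only
    @operations.each_with_index do |op, index|
      message = {
        batch: { id: batch_id, size: @operations.size },
        index: index,
        app: @app,
        username: @username,
        priority: @priority,
        model: @model,
        type: op[:type],
        attributes: op[:attributes]
      }
      @messages << message
      # A real producer would write to the Kafka topic here, e.g.
      # kafka_producer.produce(topic: "data-updates", payload: message.to_json)
    end
    @messages
  end
end

batch = DataBatch.new(app: "my_app", username: "austin", model: "Job")
batch.add_operation(type: "insert", attributes: { title: "Cardiologist" })
batch.add_operation(type: "update", attributes: { description: "Updated description" })
messages = batch.produce!
puts messages.length   # two operations produce two messages
```

The point of the design, as described in the talk, is that each message is self-describing: a consumer can process it without any extra lookups.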
00:31:41.820
now let's talk about the application side so the app consumers that's this Rails part up here what are some of the
00:31:47.820
things that it needs first as part of consumption you'll need the concept of a dispatcher
00:31:54.240
so as you are reading from this topic you will need something that will be able to look at the messages that's
00:32:00.059
coming in and find the correct importer for that message
00:32:05.279
for us we use model and type so you specify the model you say that it's an update and then we use that to look
00:32:11.640
up the proper class in order to run that through our system and then for the importer it's also implement what you
00:32:17.640
need some of the things that have been important for us are permitted attributes so
00:32:24.480
specifying in advance exactly what is allowed to be updated through the system you probably don't want admin flags to
00:32:30.779
be toggled here or maybe you do but just whatever you need for your system
00:32:36.480
and we also created a parent class an abstract class that you
00:32:41.520
can inherit from and then you implement import whenever you need to do something special in order to send these
00:32:48.059
through our system and another important idea is whenever we are returning from these results we return an
00:32:55.440
OperationResult more of a value object as opposed to just a straight up hash
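A small sketch of those two ideas might help: a dispatcher that maps a message's model and type to the right importer class, and an `OperationResult` value object returned instead of a bare hash. Every class and method name here is an assumption for illustration, not the actual Doximity implementation:

```ruby
# Value object returned from importers instead of a raw hash, so callers
# get a stable interface. Illustrative, not the real implementation.
OperationResult = Struct.new(:success, :model, :errors, keyword_init: true) do
  def success?
    !!success
  end
end

class Dispatcher
  REGISTRY = {}

  # Importers register themselves for a (model, type) pair.
  def self.register(model:, type:, importer:)
    REGISTRY[[model, type]] = importer
  end

  # Look up the importer for an incoming message; fail loudly when nothing
  # matches so a bad message is surfaced instead of silently dropped.
  def self.importer_for(message)
    REGISTRY.fetch([message[:model], message[:type]]) do
      raise ArgumentError, "no importer for #{message[:model]}/#{message[:type]}"
    end
  end
end

# Example registration and lookup:
class JobUpdateImporter; end
Dispatcher.register(model: "Job", type: "update", importer: JobUpdateImporter)

message = { model: "Job", type: "update", attributes: { description: "..." } }
puts Dispatcher.importer_for(message)   # prints JobUpdateImporter
result = OperationResult.new(success: true, model: "Job", errors: [])
puts result.success?                    # prints true
```

Raising on an unknown (model, type) pair is a deliberate choice: with a shared Kafka topic you want malformed or unexpected messages to fail visibly rather than vanish.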
00:33:00.960
now this is the first example of what an importer is your base importer
00:33:07.500
will need to do a few things like the first thing it will need to wrap all of
00:33:13.320
the logic for communicating with Kafka I omitted that from this because I don't think it's important for this talk but
00:33:20.340
you'll need to handle things like batch sizes with Kafka you know your topic configuration all that sort of stuff but
00:33:26.700
the thing that is important I think is this has all of the logic that's related to consuming so earlier I said
00:33:33.419
it's important to add permitted attributes here we add a class attribute that allows you to specify some
00:33:40.019
permitted attributes we initialize this with some operations and then say hey
00:33:45.179
you need to implement the import method so that you can do what you need to do then you can also provide some helper
00:33:52.679
methods like I have there at the
00:33:58.919
bottom like the failed operation result to make it a little bit easier for people to
00:34:03.960
implement the stuff and then here's an example of something that is going to inherit from that this is a basic insert
00:34:10.440
importer so here as part of the import method that we have to define
00:34:16.080
we initialize an array with some results and then we look at the operations
00:34:22.440
and take the first model and constantize it because we've had a dispatcher that's dispatched to this
00:34:28.260
we're able to look up the model because we know that it's been dispatched properly and we're only going to be
00:34:33.359
importing based on one model then after that we loop through all of the operations and then for each operation
00:34:39.839
you'll define your business logic here this could be different for the most part it's just going to be a lookup
00:34:45.839
by ID you find the record and then you slice the permitted attributes and update the object and at
00:34:52.740
the end you add the results to that array and return it at the end
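Those two abstractions might look roughly like this. This is a hedged reconstruction: the class and method names are assumptions, and ActiveRecord is swapped for a tiny in-memory model so the sketch runs anywhere; the real importers look records up by ID through ActiveRecord:

```ruby
# Abstract base importer: holds the permitted-attributes declaration, the
# import contract, and result helpers. Kafka wiring is omitted, as in the
# talk. All names are illustrative.
class Importer
  # Rails would use class_attribute; a plain class-level reader/writer here.
  def self.permitted_attributes(*attrs)
    @permitted_attributes = attrs unless attrs.empty?
    @permitted_attributes || []
  end

  def initialize(operations)
    @operations = operations
  end

  # Subclasses must say how operations become record updates.
  def import
    raise NotImplementedError, "#{self.class} must implement #import"
  end

  private

  # Helpers so every importer reports results the same way.
  def success_result(record)
    { success: true, record: record }
  end

  def failed_operation_result(operation, error)
    { success: false, operation: operation, error: error }
  end
end

class BasicInsertImporter < Importer
  def import
    results = []
    # The dispatcher guarantees a batch targets one model, so constantize
    # the first operation's model name.
    model = Object.const_get(@operations.first[:model])
    @operations.each do |operation|
      # Slice only the permitted attributes before touching the record.
      attrs = operation[:attributes].slice(*self.class.permitted_attributes)
      results << success_result(model.create(attrs))
    rescue StandardError => e
      results << failed_operation_result(operation, e.message)
    end
    results
  end
end

# Minimal in-memory stand-in for an ActiveRecord model.
class Job
  def self.all
    @all ||= []
  end

  def self.create(attrs)
    all << attrs
    attrs
  end
end

class JobImporter < BasicInsertImporter
  permitted_attributes :title, :description
end

ops = [{ model: "Job", type: "insert", attributes: { title: "Cardiologist", admin: true } }]
results = JobImporter.new(ops).import
puts results.first[:record]   # the admin flag was stripped by the permitted slice
```

Note how a failure on one operation is captured as a failed result rather than aborting the batch, which keeps bulk processing resilient.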
00:34:59.099
once you have those two uh abstractions you can build on it in order to really easily Implement other importers like
00:35:05.220
here's an example of a city importer so on this because we've built the
00:35:10.800
importer and the basic insert importer all we have to do is implement the permitted attributes and then
00:35:16.140
anybody that's producing in the system can update a uuid and a name anytime that they want to and we don't have to
00:35:21.359
override the import method if we had something more specific that we needed to do whenever this message was coming
00:35:26.880
in you have the ability to implement the import method itself to override
00:35:33.240
that but the goal is to not have to do that as much as possible
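That's the payoff of the layering: a concrete importer can be a few lines. Here's a hedged sketch of what such a city importer could look like; the parent class is stubbed minimally so the snippet is self-contained, and in the real system it would come from the shared base importer, with all names illustrative:

```ruby
# Minimal stub of the base class so this snippet stands alone; in the real
# system BasicInsertImporter carries the full import logic.
class BasicInsertImporter
  def self.permitted_attributes(*attrs)
    @permitted_attributes = attrs unless attrs.empty?
    @permitted_attributes || []
  end
end

class CityImporter < BasicInsertImporter
  # Only uuid and name can ever be written through the pipeline; anything
  # else in a message's attributes is dropped by the permitted slice.
  permitted_attributes :uuid, :name
end

incoming = { uuid: "abc-123", name: "Austin", admin: true }
safe = incoming.slice(*CityImporter.permitted_attributes)
puts safe.keys.inspect   # prints [:uuid, :name]
```

Declaring the allow-list at the class level means the safety check lives with the importer, not with every producer, which is what lets any team produce into the topic without being able to touch fields like admin flags.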
00:35:38.339
cool so we just talked about some of the specifics related to how we've enabled
00:35:44.339
the data team to be able to work in Python and do all of their data processing
00:35:50.099
and how we use Kafka to integrate them and separate the concerns between them producing and the application side
00:35:56.099
consuming and we did it in a way where it is scalable and easy for the data team to
00:36:02.520
use so when we started we said hey these were the goals that we came up with has to work with existing code base
00:36:08.099
is used by data teams for multiple apps safeguards for disaster bulk processing
00:36:13.140
by default and independent concerns I'd say at the end of this we have really filled the need we've had over
00:36:19.440
7.7 billion data updates with this system which speaks to how much we've leaned on it in order to
00:36:25.980
provide the ability for our data teams to work independently of the application teams and the application teams to model
00:36:32.280
using Rails all that rich domain logic cool so we just talked about
00:36:38.820
the effective data syncing between Rails microservices thank you so much for watching this talk just as a summary we
00:36:45.599
talked about the domain how doximity is a physician-first medical network we have a lot of line of business
00:36:50.940
applications and our team has grown a lot and the Kafka-based solution that
00:36:56.820
we arrived at in order to facilitate the application and the data teams working independently of each other in synchrony
00:37:04.040
and a couple things I want to point out here if you like anything that I've said here go to work at doximity
00:37:10.320
any questions if you're at RailsConf I'll be in the effective data syncing between Rails microservices
00:37:17.099
Discord channel otherwise you can ping me on Twitter Osteo 36 and I'd like to give a special thanks for the slide
00:37:23.940
assistance to Hannah on our design team she is the reason that these slides don't look like
00:37:29.760
they were made by a caveman so thank you all very much for attending the talk and if you have any questions please reach
00:37:35.579
out thank you