00:00:21.260
hi everyone
00:00:22.920
how about this amazing RailsConf 2019
00:00:26.789
I hope you all have had
00:00:34.110
an amazing week like I have and I really
00:00:36.629
appreciate you coming to my talk cuz I
00:00:38.550
know you're probably tired my name is
00:00:40.980
Colleen and I run a Ruby on Rails
00:00:43.170
consulting business I'm here today to
00:00:46.050
teach you a little bit about my
00:00:47.940
adventures or misadventures as they
00:00:51.449
were migrating a production application
00:00:54.149
from shrine to active storage using
00:00:56.489
Amazon S3 storage I actually used shrine
00:00:59.760
when I did this for my client but for
00:01:02.250
the purposes of this talk I'm gonna use
00:01:03.750
paperclip because the five people that
00:01:06.509
responded to my Twitter poll said they
00:01:08.160
used paperclip more than shrine but
00:01:11.940
first I'd like to start with a little
00:01:13.530
story so how did I get here I was
00:01:17.729
contacted by a cool new startup looking
00:01:20.789
for a rails developer to do just that
00:01:22.649
migrate their solution from shrine to
00:01:24.330
active storage I was excited to work
00:01:26.550
with this company and this was the first
00:01:28.950
time I was going to get to use active
00:01:30.929
storage and I was very excited to use
00:01:32.789
active storage I had actually attended
00:01:35.729
the active storage talk I believe it was
00:01:37.860
railsconf last year so I was feeling
00:01:40.229
quite confident in my ability to migrate
00:01:42.929
this application to active storage so
00:01:45.660
for those of you who are not yet on
00:01:47.520
rails 5.2 let's start with what is
00:01:51.360
active storage so active storage is an
00:01:55.979
easy way to attach files to active
00:01:59.009
record objects and store those files in
00:02:01.530
cloud-based storage have you ever needed
00:02:04.530
to add an avatar to a user or maybe a
00:02:08.610
resume to an applicant active storage
00:02:11.310
helps you take care of all of those file
00:02:14.129
attachment needs well that's great
00:02:17.069
Colleen but paperclip is working fine
00:02:19.470
for me why should I go through the
00:02:21.540
trouble of switching well that's a good
00:02:23.610
question why should you migrate to
00:02:26.819
active storage well the first and
00:02:30.209
possibly most important reason
00:02:32.310
is because active storage is now the
00:02:34.560
built-in solution for handling file
00:02:37.200
uploads to cloud storage in rails
00:02:40.100
it supports Amazon Google and Microsoft and
00:02:44.270
this one's fun there's no additional
00:02:47.280
migrations needed maybe if you remember
00:02:49.680
with paperclip every time you add a new
00:02:52.170
file you have to write a new migration
00:02:54.980
active storage is different it doesn't
00:02:56.910
work that way and if I still haven't
00:02:59.610
convinced you paperclip is deprecated so
00:03:02.400
you're out of luck so I accepted the
00:03:06.209
contract and the first thing I did was I
00:03:08.790
went and looked at the active storage
00:03:10.590
docs so in my experience the
00:03:14.430
documentation for rails is usually
00:03:16.380
excellent and active storage appeared to
00:03:19.019
be no different step one install active
00:03:22.110
storage step 2 configure cloud storage
00:03:26.120
step 3 add an attachment to a model and
00:03:30.750
step 4 let the magic of rails
00:03:33.650
extrapolate away all of the heavy
00:03:35.610
lifting for you and it just works well
00:03:39.180
has anyone tried to migrate an
00:03:41.459
application to active storage following
00:03:43.260
these steps if you have tried you might
00:03:47.100
know that implementing active storage in
00:03:49.980
a new application you can follow the
00:03:52.290
steps and it is relatively easy but
00:03:56.540
migrating to active storage can be quite
00:03:59.220
challenging why is that
00:04:03.650
well active storage is fundamentally
00:04:08.040
different from paperclip paperclip works
00:04:11.730
by attaching file data to the user table
00:04:15.120
so for example here we have an avatar on
00:04:18.900
a user so if we added an avatar to our
00:04:22.019
user using paperclip it's going to
00:04:24.180
change the users table
00:04:26.220
it adds these four columns to your users
00:04:29.400
table I didn't include the whole table
00:04:31.110
here so you could actually see what
00:04:32.729
paperclip does active storage is just
00:04:35.490
different active storage creates two new
00:04:39.300
tables the active storage attachments
00:04:42.270
table and the active storage blobs table
00:04:45.810
so if we revisit our steps I'm gonna say
00:04:50.610
that step 3 add an attachment to a model
00:04:53.490
well active storage is not going to be
00:04:55.200
able to access the data since there's
00:04:57.720
currently nothing in your active storage
00:04:59.730
tables but we can't do step 3 yet but we
00:05:04.980
can do step 1 and step 2 so step 1 is
00:05:12.169
install active storage create the tables
00:05:16.669
and then you need to configure your
00:05:20.820
cloud storage so the way this is set up
00:05:23.880
right here is we have an Amazon
00:05:26.130
which is going to be our production
00:05:27.630
storage and Amazon Dev which is our dev
00:05:30.810
storage I created this little contrived
00:05:32.940
example for this talk so you can see I
00:05:35.220
came up with a very clever bucket name
00:05:36.840
there a really fun bucket for Colleen
00:05:38.640
which was unique
00:05:39.660
so go me but when we did this on our
00:05:41.970
production application this is how we
00:05:43.889
had it set up as well and it's
00:05:46.229
really going to depend on your setup but
00:05:48.330
I would highly recommend testing this on
00:05:50.070
a dev bucket on your cloud storage
00:05:52.590
provider and after you configure it in
00:05:58.560
storage.yml you then have to configure
00:06:00.570
it on a per environment basis so what
00:06:03.720
I'm showing you here is development and
00:06:05.640
as I said you're gonna configure
00:06:07.289
it to use Amazon dev and
00:06:09.900
production would be using Amazon so
00:06:13.530
okay great so that took like one minute
00:06:16.700
so at this point you already have active
00:06:20.370
storage installed and now your active
00:06:23.370
storage tables exist in your database so
00:06:27.510
let's talk about step three I have
00:06:29.700
changed step three to say move Avatar
00:06:33.120
data from the user table to the active
00:06:36.810
storage tables well how do we move data
00:06:42.390
from one table to another in our
00:06:45.060
database a rake task so we are gonna
00:06:50.910
write a rake task together and let's
00:06:53.760
talk about this rake task we're gonna be
00:06:56.400
moving a good amount of data and we're
00:06:58.500
it's not a
00:06:59.160
one-to-one because we have one user
00:07:00.900
table and two active storage tables so
00:07:03.030
we're also going to be mapping some data
00:07:04.680
so the only way to make this work is to
00:07:07.470
understand what we are doing I don't
00:07:09.510
really think there's a copy and paste
00:07:10.800
solution for this particular problem so
00:07:13.710
let's talk a little more about what we
00:07:16.470
are trying to do so we are moving this
00:07:21.990
data from the users table which I'm
00:07:26.790
going to show you again to the active
00:07:30.870
storage attachments and active storage
00:07:32.340
blobs and we're technically copying it
00:07:34.830
over there for now but so I don't know
00:07:38.370
about you but I find reaching into my
00:07:41.460
database with SQL to change records
00:07:43.830
on a production application to be a
00:07:45.900
little bit scary plus I was told I
00:07:49.440
wasn't gonna have to write SQL this
00:07:53.850
is from last year so this is recent but
00:07:57.000
alas it seems to be the case here so
00:08:00.560
Before we jump into what the rake task
00:08:03.630
is going to be let's talk about the
00:08:06.570
active storage tables because as I said
00:08:08.760
you really need to understand what
00:08:09.960
you're doing here so the first table I
00:08:13.710
want to talk about is the active storage
00:08:16.169
attachments table we're gonna start with
00:08:19.860
a name which is the name of your
00:08:22.050
attachment in this case avatar then you
00:08:25.530
have your polymorphic Association
00:08:27.600
columns user and the user ID and then
00:08:31.350
you have your blob ID okay so that's
00:08:33.840
Table one now Table two is the blob
00:08:35.940
table so if we look at the blobs table
00:08:39.570
the key is the location of your current
00:08:43.380
file in Amazon S3 storage and then you
00:08:47.490
have your file name your content type
00:08:51.740
byte size I don't know why I skipped that
00:08:53.760
one
00:08:53.940
and your checksum alright so how do
00:08:57.450
these tables relate to one another so
00:09:00.510
I'm going to do one table at a time so
00:09:02.370
on your left is the users table and on
00:09:06.420
your right is the active storage
00:09:07.710
attachments table so user becomes our
00:09:10.140
record type the ID
00:09:12.480
becomes our record ID and the name
00:09:16.970
becomes just avatar okay so now I have
00:09:22.709
the users table on the left and
00:09:25.440
the blobs table on the right and we have
00:09:28.350
avatar file name that's from our user
00:09:30.750
table is going to go to our blob as the
00:09:33.029
file name avatar content type is going
00:09:36.089
to go to the content type and file size
00:09:39.089
is gonna go to byte size so let's get
00:09:44.490
started on that rake task so the good
00:09:51.000
people of thoughtbot put together the
00:09:53.250
skeleton of a task that's an excellent
00:09:56.459
starting place they
00:09:59.070
actually use a migration as I mentioned
00:10:01.709
I would recommend using a rake task so
00:10:04.290
if we look at this we
00:10:08.279
get our blob ID and then these two
00:10:10.910
statements are just defining our insert
00:10:14.160
statements so this is actually all
00:10:16.110
pretty cut and paste for you after that
00:10:21.630
what's happening here is we're looping
00:10:24.839
through all of the models and pulling
00:10:27.240
out the attachment names the important
00:10:30.690
thing to realize here is this code
00:10:33.149
that's used to pull out the attachment
00:10:35.819
name is specific to paperclip because
00:10:38.069
that's how paperclip names the files on
00:10:41.449
your user table right so that's what
00:10:44.339
we're looking at right there avatar
00:10:45.630
underscore file underscore name so if we
00:10:49.139
go this is the same slide the same piece
00:10:50.490
of code so if you look at this that is
00:10:52.949
specific that's just pulling out your
00:10:55.139
avatar string and that is specific to
00:10:56.579
paperclip so as you are going through
00:10:57.980
depending on what gem you are migrating
00:11:00.690
from you have to be aware of this and
00:11:03.180
all this is doing is pulling out the
00:11:05.069
string avatar so once we get the
00:11:08.970
attachment names the next step is to
00:11:14.370
loop through the models and their
00:11:16.709
associated attachments
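
As a rough illustration of that paperclip-specific step, here is a minimal plain-Ruby sketch of pulling attachment names out of a model's column names. The helper name `paperclip_attachment_names` and the sample column list are assumptions made for illustration, not the code from the slides; in a Rails app the column list would come from something like `User.column_names`.

```ruby
# Hypothetical helper (not from the talk's slides): given a model's column
# names, return the paperclip attachment names by looking for the
# "<name>_file_name" column that paperclip adds for each attachment.
def paperclip_attachment_names(column_names)
  column_names.map { |column| column[/\A(.+)_file_name\z/, 1] }.compact
end

# Sample columns, assumed for illustration. The four avatar_* columns are the
# ones paperclip adds for an attachment called "avatar".
columns = %w[
  id email
  avatar_file_name avatar_content_type avatar_file_size avatar_updated_at
]

puts paperclip_attachment_names(columns).inspect
```

If you are migrating from a different gem, the column naming convention (and so the pattern) will differ, which is why this part cannot be copied blindly.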
00:11:24.550
and as a side note what I wanted to
00:11:27.020
share if you only have one or two models
00:11:29.120
with attachments or one model with one
00:11:31.190
attachment you don't have to do all of
00:11:33.050
this you can just call out the model and
00:11:35.030
the attachment name instead of looping
00:11:37.310
through every single model looking for
00:11:39.170
attachments so the thing I wanted to
00:11:44.390
show you here is this the reason I want
00:11:49.400
to show you this is this instance
00:11:51.230
which is just the instance of your user
00:11:53.930
so this is instance our attachment is
00:11:56.300
avatar in our example so user avatar
00:12:00.110
path blank that statement is dependent
00:12:05.210
on the relationship paperclip creates
00:12:07.580
between user and avatar that is
00:12:10.460
important it's important because this is
00:12:14.660
going to take two deploys so why does
00:12:18.380
this process require two deploys well
00:12:21.260
the rake task we're building right now
00:12:23.440
needs that user avatar relationship
00:12:26.420
defined by paperclip I just showed you
00:12:28.070
in circle and it needs the active
00:12:30.950
storage tables because it needs a
00:12:32.720
place to move the data to
00:12:34.790
so it needs to put the data in
00:12:37.160
the active storage tables now active
00:12:40.970
storage needs data in the active storage
00:12:44.870
tables so you
00:12:47.120
can't run active storage without first
00:12:50.060
running the rake task and the rake task
00:12:52.940
is dependent on paperclip and we will
00:12:54.890
revisit this all right so let's go back
00:12:57.350
to our rake task
00:13:00.160
this right here is okay so this is just
00:13:04.370
calling our blob insert statement and
00:13:06.070
the thing I wanted to point out here are
00:13:09.640
the key and checksum methods and the
00:13:13.910
other ones are just you know use their
00:13:15.320
avatar file name content type file size
00:13:17.810
but I want to call out the key and
00:13:20.020
checksum for a few reasons you're gonna
00:13:24.650
have to write these methods yourself I
00:13:26.750
didn't actually include my solution
00:13:28.880
because your solution is going to be so
00:13:30.560
specific to your paperclip configuration
00:13:33.620
and your Amazon s3 configuration
00:13:36.339
and okay so the key the key is where
00:13:41.209
active storage is gonna look for your
00:13:43.580
files as a funny or frustrating aside
00:13:47.990
depending on how you want to look at it
00:13:49.490
I was using paperclip so I assumed the
00:13:51.649
key would be user avatar path so that's
00:13:55.130
what I put in my rake task well maybe it
00:13:57.890
was the way I had my S3 bucket
00:13:59.209
set up or my paperclip config that
00:14:01.520
actually returned a forward slash right
00:14:05.149
there and because of that forward slash
00:14:09.110
when active storage went to look for my
00:14:11.240
files it could not find my files so
00:14:13.339
everyone knows that keys are hard so
00:14:15.380
that's a potential pitfall as
00:14:17.180
you're going through this process and
00:14:18.260
then checksum so when I did this on
00:14:22.190
production we had about 80,000 images so
00:14:25.010
it wasn't too many so I actually opened
00:14:27.470
each image and ran it through the MD5
00:14:29.360
process I think some of the gems
00:14:31.220
actually provide the checksum for you so
00:14:34.370
that'll just be depending on what you're
00:14:35.839
migrating from alright so that is then
00:14:42.620
all of those records we need to write
00:14:44.750
the very last step is just writing to
00:14:47.029
your attachments table and that's your
00:14:50.329
attachment which we discussed is the
00:14:52.070
string avatar model name which is our
00:14:54.860
user instance ID okay excellent
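
To make the mapping concrete, here is a minimal plain-Ruby sketch of turning one paperclip-style users row into a blobs row and an attachments row. Everything named here (`blob_key`, `blob_checksum`, `blob_row`, `attachment_row`, the `:avatar_path` field and the sample data) is a hypothetical stand-in, since the real key and checksum methods depend on your paperclip and S3 configuration. The two assumptions it encodes are that the key should not start with a forward slash, and that the checksum is stored as a Base64-encoded MD5 digest of the file contents.

```ruby
require "digest"

# Hypothetical helpers, assumed for illustration. The real key and checksum
# methods depend on the paperclip and S3 configuration.

# Drop any leading "/" so the key matches the object's path inside the bucket.
def blob_key(paperclip_path)
  paperclip_path.sub(%r{\A/+}, "")
end

# Base64-encoded MD5 digest of the file contents.
def blob_checksum(file_contents)
  Digest::MD5.base64digest(file_contents)
end

# Map one users row (paperclip columns) to a row for the blobs table.
def blob_row(user, file_contents)
  {
    key:          blob_key(user[:avatar_path]),
    filename:     user[:avatar_file_name],
    content_type: user[:avatar_content_type],
    byte_size:    user[:avatar_file_size],
    checksum:     blob_checksum(file_contents)
  }
end

# Map the same users row to a row for the attachments table.
def attachment_row(user, blob_id)
  { name: "avatar", record_type: "User", record_id: user[:id], blob_id: blob_id }
end

# Sample data, made up for this sketch. :avatar_path stands in for whatever
# user.avatar.path returns in your app.
user = {
  id: 1,
  avatar_path: "/users/avatars/1/original/me.png",
  avatar_file_name: "me.png",
  avatar_content_type: "image/png",
  avatar_file_size: 5
}

blob = blob_row(user, "hello")
attachment = attachment_row(user, 42)
puts blob.inspect
puts attachment.inspect
```

In the real rake task these hashes would feed the two insert statements, with the blob ID coming back from the first insert rather than being hard-coded.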
00:14:59.720
so that is the whole rake task so the
00:15:05.120
next thing to do after you have run your
00:15:07.550
rake task is figure out if it worked so
00:15:11.720
the quickest way to figure out if it
00:15:14.029
worked is to actually see if you've
00:15:16.040
created the correct number of blob
00:15:18.020
records and attachment records if you're
00:15:20.540
feeling feisty you can go into your
00:15:21.950
database take one record from your user
00:15:25.160
table and see if it has transposed
00:15:26.839
correctly to your attachments tables and
00:15:29.120
your blobs tables but if you're not
00:15:32.209
that's fine we'll figure it out when we
00:15:35.570
get there alright I feel like I kind of
00:15:39.860
sped read through a lot of code there so
00:15:43.640
let's do a brief overview of what we
00:15:47.209
have done
00:15:49.400
so we created the active storage tables
00:15:51.080
by installing active storage and running
00:15:53.300
the migrations we configured the active
00:15:56.510
storage cloud storage so that was
00:15:58.550
storage.yml and that was configuring on
00:16:01.070
a per environment basis so it's kind of
00:16:04.400
long we wrote the whole rake task to
00:16:06.290
create the user avatar records in the
00:16:08.600
attachments and blobs tables and
00:16:10.760
we sourced that data from the user table
00:16:13.700
or whatever table currently has the
00:16:15.830
file attached to it
00:16:18.280
and we have hopefully confirmed that
00:16:23.360
records were created in the active
00:16:25.700
storage table so we don't actually know
00:16:31.520
if the records are right unless you took
00:16:33.260
the time to actually peek into your
00:16:34.940
database and look we don't know if
00:16:37.400
they're right but we know they exist
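
The count check described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: the method name `migration_looks_complete?` is hypothetical, and the rows are plain hashes here, whereas in a real app they would come from the users table and the two active storage tables.

```ruby
# Hypothetical check, assumed for illustration: confirms that records exist
# in the expected numbers, not that their contents are right.
def migration_looks_complete?(users, blobs, attachments)
  expected = users.count { |user| !user[:avatar_file_name].to_s.empty? }
  blob_ids = blobs.map { |blob| blob[:id] }

  # Same number of blob and attachment records as users with an avatar...
  counts_match = blobs.size == expected && attachments.size == expected
  # ...and every attachment points at a blob that exists.
  links_valid = attachments.all? { |attachment| blob_ids.include?(attachment[:blob_id]) }

  counts_match && links_valid
end

# Sample data, made up for this sketch.
users = [
  { id: 1, avatar_file_name: "me.png" },
  { id: 2, avatar_file_name: nil } # no avatar, so nothing to migrate
]
blobs = [{ id: 10, filename: "me.png" }]
attachments = [{ name: "avatar", record_type: "User", record_id: 1, blob_id: 10 }]

puts migration_looks_complete?(users, blobs, attachments)
```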
00:16:38.720
so that's good enough to move on to the
00:16:40.700
next step okay before you move on to the
00:16:46.220
next step I would highly recommend
00:16:48.160
checking out a new branch technically
00:16:51.800
you do not have to do this you can push
00:16:54.140
one branch up run your rake task and
00:16:56.930
then push the second branch up with
00:16:58.160
active storage but for testing I think
00:17:00.440
it's a lot easier to do a new branch
00:17:03.430
this was my preferred method as I
00:17:06.770
mentioned I got the key wrong the first
00:17:08.089
time so I had one branch with paperclip
00:17:10.310
and the rake task and another branch with
00:17:12.320
active storage run the rake task use
00:17:15.770
active storage if it doesn't work
00:17:18.080
you can blow out the active storage
00:17:20.150
records fix the rake task rewrite to the
00:17:23.209
tables and try again as I said here's
00:17:29.930
our deploy run the rake task then go to
00:17:33.890
your active storage models and views so
00:17:38.170
now I will show you that alright so now
00:17:44.270
we have installed active storage we have
00:17:47.810
data in our tables our active storage
00:17:49.880
tables so now we can actually preferably
00:17:53.870
on a new branch in my opinion now we can
00:17:57.050
actually change our code and our models
00:17:59.870
views controllers
00:18:00.690
and tests to use the active
00:18:04.289
storage functionality so the thing I
00:18:08.580
really want to show you this is you know
00:18:10.590
this is why it looks so easy in the
00:18:11.940
docs right because you just do
00:18:13.440
has_one_attached but it only works you know if
00:18:16.649
there's data so the reason I wanted to
00:18:19.950
show you this is I wanted to show you
00:18:21.960
the bottom here I wanted to show you
00:18:24.019
views if you look at the views you can
00:18:28.169
see if you're gonna be using multiple
00:18:29.970
sizes of images you use something called
00:18:32.940
variants and the cool thing about
00:18:36.479
variants is you can just pick your image
00:18:38.849
size kind of on the fly
00:18:40.649
you aren't hamstrung into specific sizes
00:18:43.499
that you've predefined so let's talk a
00:18:47.190
little bit more about variants because
00:18:48.840
if you're working with images as I was
00:18:50.789
they're very important so paperclip
00:18:55.229
I think pre-processes all
00:18:57.419
your image sizes so they're going to
00:18:58.619
give you your whatever they are large
00:19:00.479
thumb medium whatever sizes you're
00:19:02.820
working with so active storage is gonna do a
00:19:05.700
lazy transform on the original blob on
00:19:09.359
the fly hence the airplane and rails
00:19:15.119
does cache the variant so that the
00:19:16.889
processing is only gonna happen the
00:19:18.179
first time it's generated so here I was
00:19:23.099
in this process of working for this
00:19:26.099
client migrating this application and I
00:19:29.220
had a rake task I knew it was working I
00:19:31.679
had looked at my database active storage
00:19:33.720
could find my files and I ran it and
00:19:37.669
probably 30 percent we're a very
00:19:40.349
image-heavy website it's important to
00:19:42.450
note probably thirty percent of the
00:19:44.759
images were blurry that's how I felt
00:19:49.859
right then so why were 30 percent of our
00:19:55.049
images blurry
00:19:56.099
they were blurry because active storage
00:19:59.700
uses MiniMagick for image transformation
00:20:02.369
MiniMagick does not support the advanced
00:20:05.849
image processing that we had been using
00:20:07.830
with shrine and that I think is pretty
00:20:10.379
important and that was a really big pain
00:20:13.259
point for us
00:20:17.210
but fortunately there will be a happy
00:20:20.610
ending so we did this I want to say I
00:20:24.990
did this it was eight months to a year
00:20:26.460
ago and I feel like we're a little early
00:20:30.510
to the active storage party mainly
00:20:32.790
because of this image processing snafu
00:20:36.900
we had to deal with fortunately for us
00:20:42.110
rails 6 should be solving
00:20:45.480
this specific issue active storage on
00:20:47.760
rails 6 has deprecated MiniMagick and is
00:20:50.070
now using the image_processing gem so
00:20:52.830
fortunately that image I believe it was
00:20:54.780
like the resize_to_fill resize_to_fit
00:20:57.740
that did not work with MiniMagick all
00:21:02.100
right so we have already done what's up
00:21:08.310
there deploy with paperclip run the rake
00:21:10.650
task and the active storage tables and
00:21:13.340
the next step is to deploy with the
00:21:18.420
active storage models and views
00:21:21.410
implemented that I just showed you and
00:21:24.270
if that works then you have completed
00:21:28.200
well made good progress on your
00:21:30.420
migration to active storage so let's
00:21:34.110
revisit all of our steps alright so we
00:21:39.210
installed active storage configure the
00:21:42.420
cloud storage move the avatar data from
00:21:45.720
the user table to the active storage
00:21:47.250
tables and now active storage can work
00:21:52.320
its magic and it should just work