00:00:09.110
welcome to railsconf 2020 couch edition and enterprise identity management on
00:00:15.449
Rails I'm Bridget and I'm Oliver
00:00:21.360
Sandford we're software engineers at mode analytics this talk is not about
00:00:28.230
resetting passwords we don't like passwords very much and we're assuming you're already familiar enough with
00:00:34.649
rails to have seen what password management typically looks like we're also going to skip right past session
00:00:41.640
storage yes in a web app session storage is the backbone of identity
00:00:46.710
management it's something you have to handle in a way that suits your scale and infrastructure the various rails
00:00:54.149
session stores are well covered elsewhere this talk is also not about
00:00:59.280
real ID verification biometrics or other new forms of strong identity verification for government or health
00:01:06.030
processes it is about several topics you'll want to consider if you're
00:01:11.460
building a product or service for larger businesses this talk is about some of
00:01:16.979
the lessons we've learned in handling identity concerns in rails apps and particularly through debugging SAML and
00:01:23.310
developing a SCIM integration we'll start with a big-picture design consideration and then we'll look at a
00:01:29.880
couple areas more closely handling Enterprise authentication in rails and implementing SCIM the obvious but wrong
00:01:40.709
way to set up identity management in rails is to have all users share a single space users belong to an
00:01:48.840
organization validate that no two user accounts can have the same email or
00:01:54.119
username and you're done right wrong actually sometimes the same person
00:02:00.869
belongs to more than one organization add a memberships table and voila users
00:02:07.020
can be part of more than one organization here's where the trouble starts you'll run into policy conflicts
00:02:13.650
immediately when two organizations have different login policies which one do
00:02:19.290
you apply if someone has logged in fully according to one organization using a strategy or provider that
00:02:26.860
another organization otherwise wouldn't what is their authorization level when a
00:02:32.530
person is removed which organization has permission to delete their account a
00:02:39.420
better approach is to scope users to organizations from the beginning of your
00:02:44.769
project with this approach you'll validate traits like the email and username only within a single
00:02:52.299
organization if billy@greenfield.com needs to be part of two orgs each org
00:02:58.750
will need to maintain a separate billy@greenfield.com account to understand
00:03:05.739
the consequences of the two different design patterns let's compare github with slack github
00:03:12.670
emerged out of an open source oriented ethos much like Twitter developers had
00:03:18.609
handles in a single global namespace of users slack was designed for the
00:03:24.489
enterprise from the beginning you log in to a workspace specific to your organization the github model is
00:03:32.139
oriented toward global interactions and sharing a user can certainly still belong to an organization however things
00:03:40.450
become more challenging when they belong to multiple organizations I definitely logged into my personal
00:03:46.780
github account recently to discover I was still a member of a large organization I stopped contracting for
00:03:52.389
three years ago the fact github has developed its Enterprise Cloud Edition
00:03:58.120
which offers unique username account spaces SAML authentication and SCIM
00:04:03.669
integration at considerable additional cost speaks to the effort required in
00:04:08.829
rebuilding or refactoring their service to move away from the global namespace pattern to the extent that github is
00:04:16.000
really a b2b service that is if the platform is mostly used while people are
00:04:21.579
at work with their employers paying for it the open-source model becomes extraneous if you have the luxury of
00:04:30.370
planning for enterprise support from the ground up take those early decisions carefully
00:04:36.039
how can you design your product or service so that Enterprise Identity Management is a breeze probably the best
00:04:43.059
gift you can give yourself is if you are building a business product scope each
00:04:48.430
user account entirely to a single organization if you need individuals to
00:04:53.860
have accounts either add them all to an invisible general public organization or
00:04:59.800
make them each an organization of one observability is fundamental identify
00:05:09.580
and define any regular business events you want to log such as authentication success or failure a user provisioned
00:05:17.469
event or a user deletion event in an existing service your product probably
00:05:24.430
already has service objects to handle exceptions and analytics use them if you
00:05:31.749
don't have them yet build them half the battle in development is exploring and
00:05:37.930
tuning a living system if you work with the running system locally and explore actual payloads you'll find it makes the
00:05:45.459
documentation much more concrete than the formalistic description of an RFC
00:05:50.610
even for local development a log visualization product will serve you well compared to manually parsing the
00:05:57.610
rails logs for everything you need to find omniauth
00:06:07.590
many rails apps start out life offering authentication via username and password
00:06:13.169
perhaps using something like devise try to skip this step storing individual passwords is insecure
00:06:20.490
and risky avoid it if you can what's good about devise is that its
00:06:26.250
modules implement many fundamentals of Identity Management in an itemized way they're instructive and represent
00:06:33.960
concerns you may still want to address even if in a way that fits better with
00:06:39.120
your existing practices for our purposes the minimum viable implementation is
00:06:44.940
single sign-on via some combination of public OAuth 2 vendors users will
00:06:50.970
click on the first available single sign-on button and defer to an external service at this stage you'll need
00:06:58.050
something more modular than a username and password implementation in the rails ecosystem that something is omniauth
00:07:07.760
omniauth really has only one central idea all forms of authentication boil
00:07:14.520
down to a request phase and a callback phase in the request phase we check for
00:07:19.710
evidence of the user's identity we look for an existing session and ask for credentials if we don't find it in the
00:07:27.060
callback phase we pass verified identity information to your application and handle anything required to log the user
00:07:33.780
in omniauth supports some useful hooks such as before request phase a
00:07:40.320
good place for CSRF verification and sending information to analytics before
00:07:46.200
callback phase and on failure in between it delegates the actual auth to
00:07:51.450
whichever authentication strategy is appropriate
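The request phase, callback phase, and hooks described above can be sketched as a toy model in plain Ruby. This is not omniauth's real implementation: `TinyAuthFlow`, `SecretStrategy`, and the example URL are hypothetical names used only to show the order in which hooks and the strategy run.

```ruby
# Toy model of a two-phase authentication flow with hooks.
# Hypothetical classes; omniauth's actual strategies are Rack middleware.
class TinyAuthFlow
  attr_reader :events

  def initialize(strategy:, before_request: nil, before_callback: nil, on_failure: nil)
    @strategy = strategy
    @before_request = before_request
    @before_callback = before_callback
    @on_failure = on_failure
    @events = []
  end

  # Request phase: run the hook (e.g. CSRF check, analytics), then let the
  # strategy decide how to ask the user for credentials.
  def request_phase(env)
    @before_request&.call(env)
    @events << :request
    @strategy.request(env)
  end

  # Callback phase: run the hook, then return verified identity information.
  # Any error from the strategy is routed to the failure hook instead.
  def callback_phase(env)
    @before_callback&.call(env)
    identity = @strategy.verify(env)
    @events << :callback
    identity
  rescue StandardError => e
    @events << :failure
    @on_failure&.call(env, e)
    nil
  end
end

# Hypothetical strategy: trusts a shared secret instead of a real provider.
class SecretStrategy
  def request(_env)
    { redirect_to: "https://idp.example.test/login" }
  end

  def verify(env)
    raise ArgumentError, "invalid credentials" unless env[:secret] == "s3cret"

    { provider: "secret", uid: env[:uid] }
  end
end
```

The flow object never knows how credentials are checked; swapping the strategy changes the provider without touching the hooks, which is the modularity the talk is pointing at.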
00:07:56.470
depending on your customers you may find Google OAuth 2 is the most popular
00:08:01.700
single sign-on method or perhaps github or Azure Active Directory in these cases
00:08:07.250
you're connecting to public OAuth 2 authorization servers and the connection
00:08:12.320
details are more or less the same for all your customers so you'll wind up with something that looks like a pile of
00:08:18.230
configuration constants from various omniauth authentication strategy gems this
00:08:24.470
may work for a while but there are some disadvantages to this system not all your customers will have all the login
00:08:31.310
methods set up so people will likely click on the wrong thing and wind up in a dead end
00:08:36.729
moreover your customers may require you to ensure their team only logs in with an
00:08:41.839
approved strategy get access to both the request and callback actions where
00:08:47.720
possible since your system now supports multiple authentication providers you'll
00:08:54.140
need to track the appropriate external UID for each of them you'll find it in omniauth's auth hash and save it to an
00:09:01.460
identity model associated with the user at this point the minimal possible ERD
00:09:07.730
looks something like this
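Since the slide with the ERD is not part of the transcript, here is a minimal sketch of the shape it describes: a user belongs to an organization, and an identity record ties that user to one provider and external UID. The struct and class names are assumptions; the `"provider"` and `"uid"` keys mirror the two top-level fields of omniauth's auth hash, represented here as a plain Hash.

```ruby
# Hypothetical stand-ins for three tables:
#   organizations 1--* users 1--* identities (provider, uid)
User = Struct.new(:id, :organization, :email, keyword_init: true)
Identity = Struct.new(:user, :provider, :uid, keyword_init: true)

class IdentityStore
  def initialize
    @identities = {}
  end

  # One user may hold several identities, one per authentication provider.
  def link(user, provider:, uid:)
    key = [user.organization, provider, uid]
    @identities[key] = Identity.new(user: user, provider: provider, uid: uid)
  end

  # Looks a user up by organization plus the provider/uid pair from the
  # auth hash. Email is deliberately not part of the lookup key.
  def find_user(organization, auth)
    key = [organization, auth.fetch("provider"), auth.fetch("uid")]
    @identities[key]&.user
  end
end

billy = User.new(id: 1, organization: "greenfield", email: "billy@greenfield.com")
store = IdentityStore.new
store.link(billy, provider: "google_oauth2", uid: "1001")
store.link(billy, provider: "saml", uid: "billy-saml")
store.find_user("greenfield", { "provider" => "saml", "uid" => "billy-saml" })
```

Keying on the external UID rather than the email is what later makes email changes and email inheritance distinguishable.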
00:09:13.700
SAML stands for security assertion markup language the 2.0 specification
00:09:19.380
was approved way back in 2005 before the world saw its first iPhone SAML has
00:09:26.640
some limitations but it's fully realized and fairly widely adopted in larger organizations it's a dialect of XML with
00:09:35.280
a few interrelated documents you may wish to handle with the public OAuth 2
00:09:41.850
vendors we discussed in the last section most of your connection details were constants no worries it won't be long
00:09:49.650
before customers want to connect to identity providers entirely under their own control they might ask you to do
00:09:57.120
this with OAuth 2 but we found the most widely requested Enterprise authentication protocol is SAML in the
00:10:04.650
future it may be OpenID Connect a SAML
00:10:09.750
authentication flow can be initiated either directly from the identity provider or it can be initiated from the
00:10:15.720
service provider that's your service there are some good reasons you want to initiate the request yourself your
00:10:23.310
request will likely have a short timeout associated with it and may also contain some disposable immutable data you wish
00:10:30.570
the identity provider to send back as part of the SAML assertion in the OAuth
00:10:35.910
2 world the SAML assertion would be called an authorization token both
00:10:41.250
measures help mitigate the risk of man-in-the-middle attacks in a SAML
00:10:47.040
authentication flow the security assertion or SAML response document contains the verified identifying
00:10:54.300
information of the logged in user so once the SAML response is received by the client and submitted to the
00:11:00.720
service provider the authentication flow is done the service provider may decide
00:11:06.300
not to log the user in for some reason a policy violation regarding a particular resource for example but from the moment
00:11:14.670
the client sends the token the response document back to the service provider the identifying information has been
00:11:21.660
delivered of course it's also essential that it be signed so SAML documents should
00:11:28.110
always contain a signature that matches the rest of the information provided the
00:11:34.770
main issue with SAML became clear within a few years after the release of 2.0 it's that request assertion consumer
00:11:41.640
service step in which the client provides the security assertion to your service
00:11:47.250
provider the SAML specification makes the assumption that the client is a web browser and supports transport for this
00:11:54.600
final step using either post or redirect this diagram implicitly shows the post
00:12:00.120
variant since it responds with an XHTML form but for mobile clients this step does not
00:12:06.750
work OAuth 2 addresses this limitation providing for direct communication
00:12:12.120
between the service and the authorization server the catch is that the identifying information is not
00:12:19.050
provided to the user as part of the authentication token rather the service provider has to request it as part of
00:12:26.160
the verification step if you need to support multiple customers with their
00:12:32.970
own identity providers take a look at omniauth-multi-provider this makes it
00:12:38.640
easy to pull the parameters for a SAML strategy out of an active record model
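As a rough illustration of per-customer settings, the sketch below resolves strategy options from a stored record instead of from constants. It does not use the omniauth-multi-provider gem's API: `SamlProvider` and `SamlSettingsResolver` are hypothetical, the record would be an Active Record model in a real app, and the option key names are illustrative and should be checked against the documentation of whichever SAML strategy you use.

```ruby
# Hypothetical record holding one customer's identity provider details.
SamlProvider = Struct.new(:organization_slug, :sso_url, :certificate, keyword_init: true)

class SamlSettingsResolver
  def initialize(providers, service_host:)
    @by_slug = providers.to_h { |provider| [provider.organization_slug, provider] }
    @service_host = service_host
  end

  # Builds the options for one customer's identity provider.
  # Key names are assumptions, not a confirmed gem interface.
  def options_for(slug)
    provider = @by_slug.fetch(slug) { raise KeyError, "no SAML provider for #{slug}" }
    {
      idp_sso_service_url: provider.sso_url,
      idp_cert: provider.certificate,
      assertion_consumer_service_url: "https://#{@service_host}/auth/saml/#{slug}/callback"
    }
  end
end

resolver = SamlSettingsResolver.new(
  [SamlProvider.new(organization_slug: "greenfield",
                    sso_url: "https://idp.greenfield.test/sso",
                    certificate: "PEM-ENCODED-CERT")],
  service_host: "app.example.test"
)
resolver.options_for("greenfield")
```

Putting the organization in the callback path lets the service pick the right certificate before it tries to verify the response.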
00:12:44.240
one limitation of omniauth is that while SAML 2.0 supports automatic
00:12:50.210
configuration through metadata exchange and omniauth supports an options phase
00:12:55.470
for uses like that omniauth-multi-provider will need some extra tuning to
00:13:00.480
get that to work in the meantime consider setting up a feature flag or
00:13:05.730
environment toggle you can use to log the full XML of SAML assertions in the
00:13:11.940
event a particular customer has trouble setting up their SAML provider you can grab the assertion and see exactly what
00:13:18.510
is required debugging in practice here are a few
00:13:25.090
other suggestions that are useful for debugging SAML and other authentication protocols set up test
00:13:32.590
organizations with your most popular identity vendors such as Okta or OneLogin even if it's an identity provider
00:13:39.910
that your organization also uses internally you probably won't be able to
00:13:44.920
use your organizational account to do development and debugging instead you'll want a test organization with the larger
00:13:53.050
vendors such as Okta this is not necessarily included in your contract and it's something you'll want to
00:13:58.210
negotiate if you use trial accounts for development and testing you'll find that
00:14:03.520
they may expire and lose valuable State every 30 days to avoid this you'll want
00:14:09.040
your vendor account manager to include permanent testing resources from the beginning
00:14:14.990
rather than rely on a SaaS vendor with a large surface area another option that's
00:14:20.910
helpful for understanding SAML is to debug against an open-source identity provider such as Keycloak this has the
00:14:28.680
advantage of being free plus you can quickly set up and run the identity provider locally using docker Keycloak
00:14:35.790
is certainly a thinner solution than Okta but in practice this makes it easier to locate the certificates and
00:14:42.000
settings you need and to see how settings on the identity provider side match up with things you're seeing in
00:14:48.270
Ruby you'll also want to be able to inspect
00:14:53.720
authentication flows in process from the client perspective for this use ngrok
00:14:59.570
or a SAML browser plugin the latter adds XML decoding and inspection to your browser developer
00:15:06.230
tools it's super helpful to be able to see exactly what's being sent for each stage of the flow and to decode the
00:15:13.130
SAML assertions right there before you get your SCIM implementation
00:15:20.529
in place you may need to handle account provisioning just in time after your
00:15:25.870
single sign-on auth response is received the main idea is to treat the
00:15:32.050
authorization token as being authoritative that is if Google says that Billy Moore from Greenfield is at
00:15:39.040
your door and you don't have an account for Billy Moore yet you can create one for him as part of the login flow this
00:15:46.180
is what we call just-in-time provisioning as we'll see shortly with SCIM a SAML plus just-in-time
00:15:53.290
provisioning scenario raises a couple of edge cases which you'll need to think through among them email and name
00:16:00.370
changes as when an individual gets married and email inheritance as when
00:16:05.800
Billy Moore departs and Billy Strayhorn arrives and also wants to be billy@greenfield.com what is SCIM SCIM stands
00:16:18.970
for system for cross domain Identity Management note that throughout this presentation when we say SCIM we are
00:16:25.540
referring to version 2.0 of the SCIM API
00:16:30.839
is an API that is implemented by a service provider and used by an identity
00:16:35.860
provider to manage resources on the service provider as a b2b company we
00:16:40.990
implemented the SCIM API to allow our customers to manage their users and groups through third-party software
00:16:47.279
specifically starting with Okta this gives the administrators of our customer organizations more control over who can
00:16:54.459
access our products and the permissions they have within our products
00:17:00.570
there are two portions of the SCIM API discoverability and operations the
00:17:07.150
discoverability portion of the API tells a client about the supported features resources and attributes that the
00:17:14.050
operations portion implements the operations portion of the SCIM API provides the abilities to create read
00:17:21.670
update and delete resources search for resources and perform operations on
00:17:28.060
those resources the two resources defined in the SCIM API are users and
00:17:33.460
groups however this can also be extended to other resources in your application
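For readers without the slides, here is a small sketch of what SCIM 2.0 User and Group resources look like, built as Ruby hashes and printed as JSON. The schema URNs and attribute names come from the SCIM core schema (RFC 7643); every identifier and value is made up for illustration, and only a handful of the many defined attributes are shown.

```ruby
require "json"

USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"
GROUP_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:Group"

# A trimmed-down User resource. All ids and values are invented examples.
user = {
  schemas: [USER_SCHEMA],
  id: "2819c223",                 # assigned by the service provider
  externalId: "00u1abcd",         # assigned by the identity provider
  userName: "billy@greenfield.com",
  name: { givenName: "Billy", familyName: "Moore" },
  emails: [{ value: "billy@greenfield.com", type: "work", primary: true }],
  active: true,
  meta: { resourceType: "User" }
}

# A Group resource refers to its members by their service provider ids.
group = {
  schemas: [GROUP_SCHEMA],
  id: "e9e30dba",
  displayName: "Analysts",
  members: [{ value: user[:id], display: "Billy Moore" }],
  meta: { resourceType: "Group" }
}

puts JSON.pretty_generate(user)
puts JSON.pretty_generate(group)
```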
00:17:39.630
here is an example of the user resource we can see many of the attributes that
00:17:45.310
are defined in the SCIM core schema and here is the group resource there's a lot
00:17:52.090
to take in here from these two resources so to start off we will want to determine how much of the SCIM API we
00:17:58.450
need and want to support in order for our administrators our customers to
00:18:03.910
manage resources in our application in an efficient way this will narrow the
00:18:09.520
scope of our SCIM API we will want to think about how SCIM interacts with our
00:18:16.750
internal permission structure if we use groups to assign permissions we'll want to implement the groups operations of
00:18:23.440
the SCIM API we could also use a roles attribute on users to specify multiple
00:18:28.990
values or the userType field to specify a singular value for our
00:18:34.150
application we use both roles and groups for different permissions so we will
00:18:39.340
implement the roles attribute in our SCIM API and support the groups resource we have also worked with our
00:18:46.480
product managers customer success and sales teams to determine which identity providers are the most valuable to our
00:18:52.390
customers we have decided to start by integrating with Okta so we will only
00:18:57.700
need to implement the portions of the SCIM API which Okta requires it's important to also spend time researching
00:19:04.929
the other identity providers we may want to support in the future we don't want to dig ourselves into a hole by
00:19:11.320
ignoring the rest of the SCIM API and missing a crucial piece from the beginning
00:19:17.019
so we'll remove some of the user attributes from the user resource that we don't need to support and add some
00:19:25.129
others to get our supported user resource and we'll do the same for
00:19:31.039
groups our resources are starting to look more manageable now we will also be able to
00:19:40.519
limit which endpoints are included in our first iteration of our SCIM API by only implementing those that Okta uses
00:19:50.049
this eliminates the entire discoverability section of the SCIM API and the bulk operations too so isn't
00:20:02.539
there a gem for all this let's take a look at three gems to help us implement the SCIM API scim-kit by mokhan scim_rails
00:20:11.539
by lessonly and scim_engine by cisco-amp
00:20:17.590
each of these gems covers a slightly different piece of the SCIM API scim-kit
00:20:23.960
covers the discoverability portion while scim_rails covers the operations portion
00:20:28.970
but only for users scim_engine covers both the discoverability and operations
00:20:37.360
the scim-kit gem focuses on the schema and resources your application
00:20:42.680
supports it helps provide the discoverability portion of the SCIM API telling an identity provider about
00:20:49.310
how to use your API it doesn't help you implement the operations portion of SCIM so after implementing this gem into your
00:20:56.510
rails application you will still need to do all the work to manage resources the
00:21:01.580
pieces of the SCIM API that scim-kit will help you implement aren't required by Okta so for us this meant skipping
00:21:07.970
this entirely in order to narrow our scope
00:21:13.059
next up we have the scim_rails gem this gem is not fully SCIM compliant it
00:21:19.580
focuses on the components of SCIM that are required by Okta within the set of
00:21:24.589
features Okta supports this gem helps implement the SCIM API for users it doesn't support groups or adding other
00:21:31.249
resources this gem also makes assumptions about how your underlying data model looks it assumes your users
00:21:38.089
are organized within a company that has a scope to get these users it assumes that your data model matches the core
00:21:44.269
SCIM schema due to our existing code there wasn't enough flexibility here to
00:21:50.149
add support for groups and handle the differences between our data models and the SCIM core schema lastly we have the
00:21:58.219
scim_engine gem this gem aims to be more general-purpose than the scim-kit and
00:22:03.440
scim_rails gems it supports the core schema endpoints the operations
00:22:08.899
endpoints and can be extended to have multiple resource types there are some pieces you will need to implement
00:22:14.389
yourself though to be fully SCIM compliant you'll need to implement the index action with filtering handling
00:22:21.019
PATCH parameters and other application specific logic that other logic is
00:22:28.549
summarized nicely in an example controller in the scim_engine repository convert the scim_engine resources user
00:22:36.349
to your application object and save this is definitely easier said than done
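To make the conversion step concrete, here is a minimal sketch that maps a parsed SCIM 2.0 User payload onto application attributes. It does not use scim_engine's classes: `scim_user_to_attributes` and the target attribute names are hypothetical, and the payload keys follow the SCIM core schema.

```ruby
require "json"

# Maps a parsed SCIM User payload (string keys, as returned by JSON.parse)
# to a hash of hypothetical application attributes.
def scim_user_to_attributes(payload)
  emails = payload.fetch("emails", [])
  # Prefer the email flagged as primary; fall back to the first one listed.
  primary = emails.find { |email| email["primary"] } || emails.first
  name = payload.fetch("name", {})

  {
    external_uid: payload["externalId"],
    username: payload.fetch("userName"), # required by the core schema
    email: primary && primary["value"],
    first_name: name["givenName"],
    last_name: name["familyName"],
    active: payload.fetch("active", true)
  }
end

body = <<~JSON
  {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
    "externalId": "00u1abcd",
    "userName": "billy@greenfield.com",
    "name": { "givenName": "Billy", "familyName": "Moore" },
    "emails": [{ "value": "billy@greenfield.com", "primary": true }],
    "active": true
  }
JSON

attributes = scim_user_to_attributes(JSON.parse(body))
```

The hard part the talk refers to starts after this mapping: deciding whether the resulting attributes may create, update, or link a record in your own data model.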
00:22:44.370
factoring your SCIM API endpoints is the hardest part of writing your SCIM API
00:22:51.409
it kind of poses the question should I really do what this request is asking
00:22:57.589
the answer to this question lies in the permissions of your application in our
00:23:03.029
case users belong to organizations and organizations have domains if a request
00:23:08.459
comes in for a user with an email address domain that belongs to the organisation that's making the request
00:23:14.759
we do it we use domains as an additional way to determine if the request has
00:23:20.219
permission to create or link to users with a specific email in addition to
00:23:25.469
linking users we also ran into some edge cases and questions from our customers these edge cases also apply to SSO with
00:23:33.479
just-in-time provisioning the first edge case we ran into involved
00:23:40.420
emails someone got married or decided that goat lover was not an excessively professional email handle with SCIM it's
00:23:48.970
now time for your administrator to link your accounts they use your first name last name email in the identity provider
00:23:54.550
and it doesn't link to your goat lover email handle that you used to sign up for the service provider with instead
00:24:02.500
you have a fresh blank account that is missing all your previous work in this case we provided our customers with
00:24:10.990
instructions on how to remove the new account change the email within our application on the old account and then
00:24:17.320
to retry the SCIM request another solution would be to use the goat lover email handle in the identity provider
00:24:24.010
and then change it within the identity provider to trigger an update operation on our SCIM API in this case we are
00:24:31.570
expecting the request to have an external UID that matches an existing user and a new email address or other
00:24:38.320
changed attributes for the future we've also added this scenario as a warning in
00:24:43.660
our SCIM integration guide SSO with just-in-time provisioning handles this similarly when we get the
00:24:51.760
SSO login request we can check both the external UID and email address if the
00:24:57.820
user already exists in our system as determined by their external UID but the
00:25:03.400
authorization token specifies a different name or email address we're looking at an email change
00:25:13.730
on the other hand we may see that the external UID has changed but the email does already exist in the system and so
00:25:21.710
we have a slightly more subtle situation to think about this happens when Billy Moore departs and Billy Strayhorn
00:25:27.950
arrives and really wants to be billy@mycompany.com
00:25:33.980
it really depends on how the external system treats email addresses if it does
00:25:39.660
not enforce email uniqueness stop right here there's nothing really you can conclude and your customer will have to
00:25:46.410
take certain administrative actions manually if you can assume that the
00:25:51.540
email though it might be mutable is unique within the customer organization at that moment then when your auth
00:25:58.380
handling code sees the new external UID coupled with an old email address it probably means the old UID is now
00:26:06.180
inactive or defunct someone has left the customer organization and you are now looking at a new person
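The two situations just described, an email change and email inheritance, can be told apart by checking the external UID first and the email second. The sketch below assumes the accounts all belong to one organization and that the identity provider enforces email uniqueness, as the talk requires; `Account` and `classify_login` are hypothetical names.

```ruby
Account = Struct.new(:external_uid, :email, :name, keyword_init: true)

# Returns [decision, matching_account_or_nil] for an incoming login or
# provisioning request. Assumes emails are unique within the organization.
def classify_login(accounts, external_uid:, email:)
  by_uid = accounts.find { |account| account.external_uid == external_uid }
  by_email = accounts.find { |account| account.email.casecmp?(email) }

  if by_uid
    # Known person: either nothing changed or their email was updated.
    by_uid.email.casecmp?(email) ? [:existing_user, by_uid] : [:email_changed, by_uid]
  elsif by_email
    # New UID with a known email: the old holder has likely left.
    [:email_inherited, by_email]
  else
    [:provision_new_user, nil]
  end
end

accounts = [Account.new(external_uid: "u1", email: "billy@greenfield.com", name: "Billy Moore")]
classify_login(accounts, external_uid: "u1", email: "billy.moore@greenfield.com") # email change
classify_login(accounts, external_uid: "u2", email: "billy@greenfield.com")       # email inheritance
```

The function only classifies; what to do with an `:email_inherited` result is a business decision for the application.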
00:26:12.540
this is email inheritance checking the external UID is the solution for both
00:26:18.360
SCIM and SSO with just-in-time provisioning the challenge here is that
00:26:23.880
the correct permissions action is then to deactivate or delete the prior user account however in a mature system these
00:26:32.190
actions may not be so easy to take a user may have many associated records
00:26:37.620
that are shared and accessed within the team you will have to be careful about what you delete and how you manage
00:26:45.360
the disposition of any ambiguous situations for instance if the user scheduled for deletion is the creator or
00:26:52.560
owner of a resource to whom does that resource fall in the person's absence it's also possible user or
00:27:00.450
administrative error has led to this situation in which case you'd better hope there's an easy way to undelete
00:27:07.200
anything you may need to delete in order to provision the new account for the email inheritor when you decide to
00:27:14.940
deprovision a user in the course of another user's login or SCIM request that's a significant business decision you will
00:27:22.170
definitely want to make sure that you have observability and maybe even proactive alerting for these events
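One lightweight way to get that observability is to emit each identity event as a single JSON line that a log visualization tool can index. This is a minimal sketch using only Ruby's standard library; `IdentityEventLog`, the event name, and the detail fields are assumptions, not the service objects used by the speakers.

```ruby
require "json"
require "logger"
require "time"

# Writes one JSON object per line so events are easy to search and alert on.
class IdentityEventLog
  def initialize(io = $stdout)
    @logger = Logger.new(io)
    # Drop the default prefix so each line is valid JSON on its own.
    @logger.formatter = proc { |_severity, _time, _progname, message| "#{message}\n" }
  end

  # Records a business event with arbitrary details and returns the entry.
  def record(event, **details)
    entry = { event: event, at: Time.now.utc.iso8601 }.merge(details)
    @logger.info(JSON.generate(entry))
    entry
  end
end

log = IdentityEventLog.new
log.record("user.deprovisioned",
           organization: "greenfield",
           external_uid: "u1",
           triggered_by: "email_inheritance")
```

An alert on `user.deprovisioned` events whose `triggered_by` is another user's login would cover the case the talk singles out.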
00:27:30.299
next we'll talk about testing your SCIM API to ensure it will work with identity providers to manage resources
00:27:36.970
on your application there are a couple things we want to confirm when testing
00:27:44.110
your SCIM integration the API complies with the format specified by the SCIM
00:27:49.360
protocol each endpoint implements the business logic we expect to take place
00:27:55.110
requests are authorized appropriately and only modify resources they should have access to integrating with our
00:28:02.530
preferred identity provider works we're able to make changes in the identity provider and see those changes
00:28:08.169
within our application this also includes running through common and uncommon administrative workflows so for
00:28:16.870
each of these there will be different tools you can use Okta provides tests for your api and
00:28:22.570
basic tests of integrating with their preview environment that work with BlazeMeter Runscope you'll also want to
00:28:28.809
include RSpec or another framework for integration and unit tests and possibly penetration testing to ensure only
00:28:35.710
authorized users are accessing and making changes through your SCIM API Okta also requires an entire suite of
00:28:42.789
manual testing to be done through the Okta preview environment this manual testing is tedious but will teach you
00:28:49.299
how to use Okta as an administrator and show the workflows administrators use for common tasks like assigning users to
00:28:55.840
your applications deactivating and suspending users and pushing groups if you don't know how a specific workflow
00:29:02.350
is done in Okta you can probably find it in the manual testing spreadsheet we also relied on the tool called ngrok to
00:29:09.190
capture and replay requests that hit our SCIM development server
00:29:14.639
when your SCIM API is working you can then submit it to Okta's integration network in order to submit
00:29:21.820
you'll need to provide Okta with an integration guide show passing automated tests in BlazeMeter Runscope confirm
00:29:28.570
manual QA tests pass and provide testing credentials for your application
00:29:34.799
each time we contacted Okta there was about a one week turnaround our initial
00:29:40.360
feedback involved updating support contacts of our company removing optional BlazeMeter Runscope
00:29:45.909
tests that were not passing providing testing credentials and confirming that we did the manual
00:29:51.940
testing of our integration our first two bugs came from a race condition on
00:29:57.340
creating users quickly and business logic for our application that needed to handle periods in email addresses the
00:30:04.659
next round of bugs involved asking us to update screenshots in the integration guide and an error on Okta's end
00:30:11.230
about how to use the roles attribute for our application there was a typo in the
00:30:16.269
role they assigned after one more round of updating screenshots again in our integration guide our application was
00:30:22.629
approved it took about five weeks from first submission to getting final approval we hope you have enjoyed
00:30:31.419
hearing the lessons we've learned in handling identity concerns in rails apps debugging SAML and developing a
00:30:38.649
SCIM integration and have taken away some design considerations and insights into Enterprise Identity Management on
00:30:45.340
Rails thanks for watching