00:00:11.200
oh okay so we're gonna get this party started my name is kevin newton um and i would
00:00:17.039
like to formally welcome you all to the um
00:00:22.560
if this is going to turn on hold on a second aha welcome to the live keynote yeah
00:00:28.400
i want to thank matz for the introduction um since we're in the keynote room and i'm the first person speaking here i feel like i have this
00:00:34.000
privilege to just say welcome to the live keynote apparently that joke's not landing so that's okay you can laugh if you want
00:00:40.800
just a little pity laugh would be appreciated okay cool anyway my name is kevin newton uh i work at shopify uh i
00:00:47.600
work on the yjit team along with aaron allen noah and maxime uh
00:00:52.960
if you want to talk about that or any other thing come and find me at the booth we're the nerds with the green
00:00:58.559
background so yeah i'm gonna open up with a quick
00:01:04.559
oh this thing is dying i'm gonna open it up with a quick warning um so i talk pretty quickly when i'm nervous i am
00:01:10.080
nervous i've had a lot of caffeine we're gonna talk pretty quickly um i'm also going to say if you're a junior
00:01:15.119
developer please don't leave the room um this is a somewhat complicated topic i've spent hours agonizing over this
00:01:21.920
trying to make it accessible for everyone so if you're a junior developer don't leave the room if you're a senior developer
00:01:27.439
please stay with me i promise there's content in here for you it's not the very beginning but it will be there
00:01:34.079
uh so i want to talk to you about parsing ruby i want to talk to you about how ruby has been parsed over time how we
00:01:41.040
get uh from plain source text into a structure that we can deal with uh but
00:01:46.640
in order to do that i need to back up and talk about the fundamentals that underlie those concepts in order to give you an understanding of how these things
00:01:52.640
work i need to give you the the theory before i can give you the practice so
00:01:58.240
here here's the game plan we're going to build a grammar for a simple language i'll explain what a grammar is in a
00:02:04.399
second we're going to build a parser for a simple language we're going to look at the history of the ruby parser and how
00:02:09.759
that has evolved over time and how it's used and finally we are going to look at
00:02:14.879
how ripper works which is an internal library standard library that is used to
00:02:20.879
gather information from the parser as it is parsing so first step
00:02:26.319
building a grammar a grammar is a syntactical representation of what is allowed in a
00:02:32.000
language language here is used kind of loosely it's not english it's not ruby it's it's just a language it's any kind
00:02:38.239
of concept that puts together a series of tokens so if we look and we just wanted a language that only accepted one
00:02:44.480
single number this is what the grammar would look like it's actually overly complicated for this because it could
00:02:49.920
just say program points to number but i'm doing this for illustrative purposes a program is going to be our root node
00:02:55.360
it's going to point to our overall grammar it says the only thing that it accepts in this grammar is a number the
00:03:01.040
number is a non-terminal token as opposed to a terminal token and it accepts only a single number token
00:03:08.560
i realize i just said token a whole bunch of times this will accept something like 1 or 2
00:03:14.879
or 7 or any number but it's not extensive enough for us to do anything with so i'm going to extend
00:03:19.920
it a little bit we're going to add the ability to do addition
00:03:25.680
now in when we're doing addition we are now accepting a number plus a number or just an individual number so
00:03:32.319
now we can accept a couple things we can accept one we can accept one plus two but we can't accept one plus two plus
00:03:38.159
three why there's no recursion here all right so we need to extend it a little bit further
00:03:43.599
this is now if i clicked the right thing this is now a little bit of recursive so
00:03:48.959
right so this is now left recursive is what we say in the theory this is pointing at itself the expression node
00:03:56.080
in the tree is pointing at itself it can go and accept more and more information it accepts one accepts one plus two now
00:04:02.480
accepts one plus two plus three we can extend it for subtraction and go all the way as far as we want it's infinite
00:04:08.239
right this is left recursive we can continue on and make another one
00:04:14.400
we can make another rule that includes terms this is the reason we're splitting this up and not making it expressions is
00:04:20.400
something i'm going to explain in a second but that's called operator precedence we'll get into that but the function of this is to accept another
00:04:27.280
set of rules and if we put it back into our grammar so that it will understand itself it can recurse down all the way
00:04:34.320
understand plus minus times divide the last step the last thing you want to add
00:04:40.160
to this is going to be parentheses this kind of language can accept one times two
00:04:45.680
you have to do a little bit of substitution in your head you say okay one times two is the entire language
00:04:50.880
right okay our language is an expression we go down to the expression we say an expression is
00:04:56.960
just a term a term is a term times a number a term is a number so it's a number
00:05:03.120
times a number stay with me i promise this is going to make sense in a second the last thing we want to add is
00:05:09.280
parentheses this one's a little fun this is uh any single token or
00:05:16.400
symbol in this language can be made into an overall program
00:05:22.400
because you see a program is equal to expression by wrapping it in parentheses so you can put it anywhere in our language all of a sudden we support
00:05:28.320
parentheses this is our grammar this is what we're going to use throughout this presentation to understand this language
00:05:34.320
to build on top of it all right so we've got we've got our language
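The grammar built up so far can be written down as plain data. This is a minimal sketch, not the slide from the talk: the rule names `program`, `expression`, `term` and `factor`, the `:NUMBER` terminal and the `left_recursive?` helper are assumptions chosen for illustration.

```ruby
# Each rule maps a non-terminal to its alternative productions.
# Symbols in lowercase are non-terminals; :NUMBER and the strings are terminals.
GRAMMAR = {
  program:    [[:expression]],
  expression: [[:expression, "+", :term], [:expression, "-", :term], [:term]],
  term:       [[:term, "*", :factor], [:term, "/", :factor], [:factor]],
  factor:     [[:NUMBER], ["(", :expression, ")"]]
}.freeze

# A rule is left recursive when one of its productions starts with the rule itself.
def left_recursive?(name)
  GRAMMAR.fetch(name).any? { |production| production.first == name }
end

# Everything that appears in a production but never on the left-hand side is a terminal.
TERMINALS = GRAMMAR.values.flatten.uniq - GRAMMAR.keys
```

Splitting `expression` and `term` into separate rules is what later gives multiplication a higher precedence than addition.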
00:05:40.000
the next step is going to be building a parser that understands this language the grammar was a
00:05:46.000
uh abstract concept we're going to now implement that so this is our source we're going to
00:05:52.639
take this source file this is your dot numbers or whatever uh you know language you want to call it
00:05:59.120
we're going to loop over it in ruby it's going to be a language implemented in ruby we're going to say until the
00:06:04.479
input string is empty we are going to uh we're going to click our mouse is what
00:06:10.479
we're going to do come on um we are going to switch over the input we're going to
00:06:15.840
skip over white space and the little dollar sign uh apostrophe
00:06:21.360
there means skip over the last match uh we're going to
00:06:27.039
go and take our numbers anything that matches that regex
00:06:32.319
and we're going to yield out a number token same thing with these operators we're
00:06:37.520
going to yield that operator token that dollar sign ampersand means take the the last match string
00:06:43.120
and finally we're going to throw a parse error if we don't understand any remaining syntax
00:06:48.639
the important part is here the this is called lexical analysis tokenization any kind of thing like that there's a lot of
00:06:54.400
names for it the point is that we are taking these segments of the file and yielding out individual tokens
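The lexer described above could look roughly like the following. It is a sketch under assumptions: the method name `tokenize`, the `ParseError` class and the `[type, value]` token shape are illustrative, not taken from the slides.

```ruby
class ParseError < StandardError; end

# Yields [type, value] pairs for a tiny arithmetic language.
def tokenize(source)
  return enum_for(:tokenize, source) unless block_given?

  input = source.dup
  until input.empty?
    case input
    when /\A\s+/
      # $' is the text after the last match, so this skips the whitespace.
      input = $'
    when /\A\d+/
      # $& is the text of the last match itself.
      token, input = [:NUMBER, $&.to_i], $'
      yield token
    when %r{\A[-+*/()]}
      token, input = [$&, $&], $'
      yield token
    else
      raise ParseError, "unexpected input: #{input.inspect}"
    end
  end
end

tokens = tokenize("1 + 2 * (3 - 4)").to_a
```

The match variables are copied before `yield` so that nothing the caller does can affect the remaining input.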
00:07:01.680
okay so if we go when we we parse this and we execute over this source string we're going to find a number
00:07:07.440
plus all these different tokens as we build them up and now we have our list
00:07:13.120
this is what was parsed by that file everyone with me i see some nods
00:07:18.240
yes oh more nods excellent great okay so this is called a token stream
00:07:24.319
or a token buffer or just a list of things and we're going to take this token
00:07:29.919
stream and we're going to pass it through our grammar so at this point we have tokens we have the concept of
00:07:35.199
lexical analysis we don't have semantic meaning yet so we're going to add that so it it's called accepting the input
00:07:41.919
we're going to run through how it works this is our grammar we already went over this we're going to add a stack and this
00:07:48.639
stack is going to be a token stack and it's also called the semantic stack
00:07:53.680
semantic meaning stack semantic token stack people really don't like having one name for things um but what it's going to do is it's
00:08:00.319
going to take the first token and this is called shifting we are shifting a token
00:08:05.440
we have shifted it off of the list of input tokens and then we're going to say okay we have our token what are we going
00:08:11.039
to do with it we need something equivalent to it in the terms of our grammar okay well we can see from our grammar
00:08:17.680
that a number is equal to a factor so we're going to replace it with factor
00:08:23.039
on our stack that thing we just did is called reducing shift and reduce those are the two terms
00:08:28.479
i'm going to add to your idiolect today so we're going to shift more tokens
00:08:34.080
shift more tokens continue shifting until we get to something else we can reduce okay we've got a number we can
00:08:39.599
reduce that again using our grammar down to a factor continue shifting continue shifting
00:08:46.160
again we get a factor and finally we get to something we can actually do now this our this stack is looking a little bit
00:08:51.200
large and i've been working for aaron for a little bit so i couldn't resist just showing that this is in fact overflowing
00:08:56.959
um so we can go and take the factor and we can say okay a factor is actually
00:09:02.240
a term a term minus a term can
00:09:07.279
just be an expression uh as it gets reduced further we can go
00:09:13.920
through this process and just keep substituting things if we ever get to a point where we can no longer substitute something or shift
00:09:20.880
a token then we throw a syntax error right we continue down this process we look at
00:09:27.519
the various rules we continue to shift overall we finally get to the end of our input we substitute we substitute we
00:09:34.480
substitute we substitute we substitute all the way down to program at this point we have accepted our input and
00:09:40.640
this is a valid syntax for our grammar the thing that we just did
00:09:46.240
in taking those tokens and passing it through this process of semantic analysis of going through that stack and
00:09:53.040
replacing them over the course of time with shifting and reducing it turns out that that's very repetitive it turns out
00:09:58.560
that that's language agnostic it doesn't have specific meaning it's just something that we can do over and
00:10:04.160
over again and in the late 80s early 90s the
00:10:09.360
thing in vogue to do if you were building a language was to use a parser generator a parser generator is a
00:10:14.480
program that takes that kind of grammar and those kinds of actions and does that for you
00:10:19.839
not to say it does everything but it it takes the shifting and reducing part out of the need of your
00:10:24.959
head using very unfortunately large integer arrays anyway
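Before handing this work to a parser generator, the shift and reduce steps can be sketched by hand. This is a simplified recognizer, not the table-driven algorithm a generator emits: it peeks at one token of lookahead to respect precedence, and the names `reduce_once` and `accept?` as well as the rule names are assumptions for illustration.

```ruby
MULTIPLICATIVE = ["*", "/"].freeze
ADDITIVE = ["+", "-"].freeze

# Tries to apply one grammar rule to the top of the stack.
# Returns true if a reduction happened, false otherwise.
def reduce_once(stack, lookahead)
  if stack.last == :NUMBER
    stack[-1] = :factor                                   # factor: NUMBER
  elsif stack.last(3) == ["(", :expression, ")"]
    stack.pop(3)
    stack.push(:factor)                                   # factor: ( expression )
  elsif stack.last == :factor && MULTIPLICATIVE.include?(stack[-2]) && stack[-3] == :term
    stack.pop(3)
    stack.push(:term)                                     # term: term * factor
  elsif stack.last == :factor
    stack[-1] = :term                                     # term: factor
  elsif stack.last == :term && !MULTIPLICATIVE.include?(lookahead)
    # Only reduce when the next token does not bind tighter than + or -.
    if ADDITIVE.include?(stack[-2]) && stack[-3] == :expression
      stack.pop(3)                                        # expression: expression + term
    else
      stack.pop                                           # expression: term
    end
    stack.push(:expression)
  else
    return false
  end
  true
end

# Takes a list of token types and reports whether the grammar accepts it.
def accept?(token_types)
  tokens = token_types.dup
  stack = []
  loop do
    next if reduce_once(stack, tokens.first)  # reduce as long as possible
    break if tokens.empty?
    stack.push(tokens.shift)                  # otherwise shift the next token
  end
  # A lone expression reduces to program, which means the input is accepted.
  stack == [:expression]
end
```

For example, `accept?([:NUMBER, "+", :NUMBER, "*", :NUMBER])` holds the `+` on the stack until the multiplication has been reduced, which is the shift versus reduce decision that operator precedence settles.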
00:10:30.880
what this would look like with a parser generator so ruby has a parser generator in the standard library it's called racc
00:10:36.399
there's a reason for that i'll show you in a minute and it looks something like this this is a parser generator using racc um
00:10:43.040
the thing up top that is that we're looking at is the operator precedence up where it says left uh left is
00:10:49.519
associativity that i'm not gonna get into today but that operator precedence tells you if you have to determine
00:10:55.120
between a shift and a reduce that you're gonna go with the operator with the higher precedence
00:11:01.120
the expressions down there are actually an equivalent grammar to the grammar we already have
00:11:06.399
uh when you when you pass a grammar into a parser generator it's going to take that and do all that shifting and reducing for you and generate it for you
00:11:13.600
the last thing you can do on top of this is you can execute actions when rules are reduced
00:11:19.440
so when rules are reduced you can do something this one is going to evaluate it immediately this is going to take whatever the input
00:11:25.279
is and just evaluate it as soon as we find it so if you find an expression plus expression you're gonna do the value plus the value
00:11:31.360
you can also do something like this which is building up a syntax tree this is what was done in ruby before one
00:11:38.240
nine we built up a syntax tree and then in order to execute ruby it walked over that tree and understood what it was
00:11:43.600
doing over time this is as opposed to building a bytecode interpreter which is what yarv is in 1.9
00:11:49.680
if we take this file and we pass it through racc and we build our file and we go into irb
00:11:56.800
and we require it and we parse it you get this
00:12:02.079
this may not look like much but it's kind of interesting because if we blow it up a little bit
00:12:07.760
this is actually a tree this is a syntax tree this is what we have built using our parser generator
00:12:14.079
if you look back at the source you can see how that relates to it and it takes care of precedence for us
00:12:20.000
um right this was our original grammar and and this can build that kind of thing and we're going to come back to this okay
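To make the tree-walking idea concrete, here is a small sketch that evaluates a syntax tree by walking it. The nested-array node shape `[operator, left, right]` and the `evaluate` method are assumptions for illustration, not the exact output of the generated parser shown in the talk.

```ruby
# A node is either an Integer leaf or [operator, left, right].
def evaluate(node)
  return node if node.is_a?(Integer)

  operator, left, right = node
  # Evaluate both children first, then apply the operator to the results.
  evaluate(left).public_send(operator, evaluate(right))
end

# 1 + 2 * 3: the multiplication sits deeper in the tree, so it runs first.
tree = [:+, 1, [:*, 2, 3]]
result = evaluate(tree)
```

Precedence never has to be checked at evaluation time because the parser already encoded it in the shape of the tree.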
00:12:26.000
okay we're actually on track here for time so it's it's good good progress so far
00:12:34.000
so the next thing i want to talk about is the history of the ruby parser now the reason that it's called racc the
00:12:39.519
parser generator that the standard library uses is because yacc was the original one yacc
00:12:46.000
um yet another compiler compiler i believe it's a parser generator that was built
00:12:52.480
way back when and it was in vogue in 1993 when matz
00:12:57.519
started using it for building out the ruby parser so in the very very early days this is
00:13:03.040
the earliest changelog entry i could find ruby 0.06 in 1994. um
00:13:09.760
it's i had to go through the wayback machine and find a tarball that had a change log that was entirely japanese
00:13:16.639
but i found it so it's there um for the first uh pre
00:13:21.760
1.0 all of the change log entries are in japanese so i had some fun with google translate um there's some fun things in
00:13:28.560
here ruby didn't used to look like ruby ruby used to look like both python and c plus plus um the top one there is saying
00:13:36.320
that uh dicts are hash literals and he added like braces for hash literal syntax
00:13:42.880
which didn't used to exist um it's a backwards incompatibility because that used to be array syntax
00:13:48.000
back in you know whenever it was um rescue was misspelled for about 90
00:13:53.680
versions it got renamed to rescue like the correct one um that in japanese that's
00:14:00.800
saying that it was embarrassing so i feel badly sorry matz but i couldn't help including that
00:14:06.800
uh after a while i guess we get up to ruby 1.0 uh ruby 1.0 started looking more and
00:14:12.320
more like ruby the super class syntax went from a colon like c plus plus to a less than that we recognize today the
00:14:18.000
continue keyword went from being continue to next which we also recognize from ruby today they added syntax to access the
00:14:23.680
singleton classes we started getting regex flags that were specific to encoding which was kind of an
00:14:28.720
interesting thing because at the time a lot of stuff was very western-centric um i i love the fact that ruby was
00:14:34.959
written in japan because we have encoding kind of as a first-class citizen especially ruby 1.9
00:14:40.160
ruby 1.3 comes out the day before ruby 1.2 i know that's confusing uh the odd
00:14:45.279
versions were developer releases and used to be development branches effectively this was svn or actually i
00:14:50.880
don't know if it was svn yet um but we get a couple of interesting things like body statements with begin rescue else ensure
00:14:57.680
clauses we get indented heredocs uh the next day ruby 1.2.0 is released
00:15:04.160
and uh you know more more and more things the true and false keywords were added for the first time which is
00:15:10.160
kind of a funny thing to add to a language in ruby 1.2 um percent w array literals stuff like that
00:15:16.959
the next year we get ruby 1.4 um we get binary number literals a couple of other interesting things multibyte character
00:15:23.120
identifiers again with the multi-byte strings this is something that ruby was ahead of time on
00:15:29.120
uh 1.5 had compile time string concatenation i don't know if you know this but in a ruby file if you put a
00:15:34.320
string and then a space and then another string it becomes one string when ruby parses it kind of a weird
00:15:39.519
thing came from c uh it the reason i have it in here is because it was on the to-do list since
00:15:44.720
ruby 0.06 i don't know why i mean it's great i guess if you like that kind of thing
00:15:50.720
uh the next year we get the rescue modifier form modifier means inline so like foo rescue bar
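Both of these pieces of syntax still work in current Ruby. A short sketch with illustrative variable names:

```ruby
# Adjacent string literals are joined into one string when the file is parsed.
greeting = "hello, " "world"

# The rescue modifier: evaluate the left side, and fall back to the
# right side if it raises a StandardError.
value = Integer("not a number") rescue 0
```

Here `greeting` is a single string and `value` falls back to `0` because `Integer` raises an `ArgumentError` for that input.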
00:15:57.120
um and finally the next year we get nodeDump
00:16:02.240
nodeDump is not a ruby version nodeDump is a project that came out of the pragmatic programmers and it's a c extension to
00:16:07.680
ruby and it's written in english in the us and this is kind of interesting because
00:16:13.839
ruby is starting to pick up steam and starting to get some popularity nodeDump was an extension so at the time ruby
00:16:19.279
before 1.9 was a tree walk interpreter meaning it took the ast that we already built up like we looked at it walked
00:16:25.519
over it and interpreted it as it went nodeDump took that yeah excuse me ast and printed it out in
00:16:32.399
a human readable format so you can understand it this was the first attempt to my knowledge to take the ruby ast and
00:16:38.480
do something with it that was not execute ruby ruby 1.7.1 comes out
00:16:44.079
this was right around the time of the first rubyconf and we get some interesting things break and next now accept values
00:16:50.560
um rescue and singleton method bodies more and more things are getting added around this time jruby gets created
00:17:02.160
file the grammar file that we looked at you remember the racc file from earlier it takes that grammar file takes all of
00:17:07.679
the action blocks which were the things in the in the braces and rewrites them all and and i love
00:17:12.880
this i love just thinking about this like what would make this language better if it were written in java that would be better
00:17:19.280
and at the time it was i mean jvm is still one of the greatest inventions we've ever done as programmers it's
00:17:24.559
incredibly powerful and it's achieved better peak performance numbers than yarv at the time that it was introduced
00:17:30.320
but the interesting thing about this is to take this file to take these things
00:17:36.000
okay a one-time translation is one thing but how are you going to stay up to date
00:17:41.520
you're not it's freaking hard you have to watch the
00:17:46.799
commits on parse.y and just stay up to date you got to keep translating and that's a hard thing to do and you know
00:17:53.440
tons of props to the jruby team for keeping up with this because this is not an easy thing to do and the reason i'm
00:17:58.480
harping on this and taking time here is because this is what you have to do in order to maintain
00:18:04.480
another parser in ruby you have to watch the commits on this file and just
00:18:09.919
make it work um so i'm gonna come back to that point later around this time ripper 0.0.1 is
00:18:16.880
released minero aoki publishes it on his website it takes the grammar file rewrites it to dispatch
00:18:24.000
parser events and scanner events instead of building the ast that ruby 1.7 would build
00:18:29.840
still to this day in the documentation it says ripper is still early alpha version that's what it says in ruby 3.1.0
00:18:36.799
it's been 20 years still early alpha maybe we'll get to beta someday
00:18:43.360
ruby one eight comes out two years later and we get a couple of more interesting things i'm gonna start skipping through this a lot faster so if you thought i
00:18:49.520
was going fast now um around the same time parse tree gets released parse tree is a project by ryan
00:18:55.039
davis that builds out an ast from the source file using a c extension it relies on ruby 1.8
00:19:01.760
internals so unfortunately ryan had to deprecate this when ruby 1.9 came out
00:19:06.799
because it completely changed to a bytecode interpreter rubinius comes out about this time rubinius is a very fascinating project
00:19:13.440
rewriting the standard library and rewriting ruby in ruby it bootstraps ruby
00:19:18.480
unfortunately it is not at this point what it was at the time it's it's it's
00:19:23.520
no longer kind of up to date it's it's a different project at this point um but there are a couple of interesting things
00:19:29.120
like having to rewrite again those action blocks in in ruby i included cardinal because it's a very
00:19:35.120
interesting thing it's a rewrite of the ruby thing except this time instead of taking the parse.y
00:19:40.880
file it actually took the grammar matz had published a grammar the abstract grammar that we looked at on a website
00:19:47.840
way back when this was a fork of the ruby 1.4 grammar and they rewrote it in order to put it on the parrot vm iron
00:19:54.799
ruby was written around this time for the.net framework finally ruby parser comes out ryan davis
00:20:00.080
rewrites his parser so that you can use it with yarv in 1.9 uh completely had to rewrite
00:20:07.039
it and used racc to generate it around this time yarv comes out yarv is a
00:20:12.400
phd thesis that is a bytecode interpreter for ruby it upgrades a couple things it changes it so that instead of using uh
00:20:19.679
yacc it uses bison which is just a successor that has a lot of the same functionality ripper is merged into the
00:20:25.840
standard library we'll show how that works and a couple other of uh controversial things are done at the
00:20:32.720
time including symbol hash keys and lambda literals ruby 1.9 we keep moving along we get
00:20:38.799
the ruby intermediate language this is a research project that rebuilds the ruby parser in ocaml
00:20:43.840
if you were wondering of the list of if you had your list of bingo languages of of places that ruby had been rewritten
00:20:50.159
in i doubt ocaml would be on your list but there you go uh ruby 1.9.3 is released it's the last of
00:20:55.520
the 1.x series and these two standards are recognized by the international
00:21:00.799
committee as uh international standards for this language we get into the ruby 2.0 series
00:21:08.159
and we get all kinds of fun things we get refinements we get uh percent i symbol lists keyword arguments ruby 2.x
00:21:14.640
keyword arguments not ruby 3 keyword arguments and finally we get the parser gem from whitequark
00:21:20.159
it is a published gem with a parser api that uses ragel to generate using another parser generator to build out an
00:21:26.960
ast this one has stayed up to date miraculously over time and you can see
00:21:32.559
this is a very short list of all a very short abbreviated list of all the different tools that are built on top of this gem
00:21:39.039
truffle ruby comes out around this time i'm going to start skipping through these ruby 2.1 we get required keyword arguments 2.2 dynamic symbol hash keys 2
00:21:46.480
3 we get squiggly heredocs the frozen string literal pragma around this time everyone starts freaking out about memory and they start opening prs to
00:21:52.880
rails saying dot freeze dot freeze dot freeze dot freeze the entire pull request is just dot freeze finally we
00:21:58.480
get the frozen string literal pragma and people can calm down with that we get to_proc so you can call
00:22:04.080
refinements with to_proc we get top level return multiple assignment in conditionals around this time a project comes out
00:22:10.320
called tree-sitter tree-sitter is a very interesting parser project a parser generator
00:22:15.520
library that is used mostly for ides
00:22:21.039
and it has a plugin for the ruby grammar that is not all the way there but it's enough that you can use it for like go
00:22:27.200
to definition on github typedruby comes out around this time which is a type system written in rust the parser is written in c plus plus i included
00:22:34.400
this because it's kind of interesting they basically forked the grammar the parse.y from ruby they
00:22:39.440
took the lexer from ruby parser and then eventually sorbet when they wanted to have a ruby parser they just used this
00:22:45.679
one as well two five comes out with uh rescue and ensure at the block level two six comes
00:22:51.120
out ruby vm abstract syntax tree is introduced ruby vm abstract syntax tree was not
00:22:56.320
what i would call an intentional release it was part of a test suite for another feature um and it's a very very
00:23:03.360
interesting thing i encourage you to go to the ruby issue tracker and go read about it it's another
00:23:08.400
form of a parser that can be used though it doesn't it's unclear if it's going to be supported in
00:23:14.960
the long term or how it's going to be supported it might do some optimizations before it hands things back to you not
00:23:20.320
entirely clear flip-flop is deprecated much to the chagrin of the flip-flop fans everywhere
00:23:26.559
2.7 flip-flop is undeprecated so it's okay yeah we get a lot of introduction of
00:23:33.039
syntax in ruby 2.7 as you saw in matz's keynote uh method reference operator was added method reference operator was
00:23:38.960
removed um we get some interesting syntax for like star star nil for saying this method
00:23:44.320
accepts no keywords uh rightward assignment numbered parameters argument forwarding all these different things
00:23:49.600
finally ruby three comes out happy about ruby three we get keyword arguments that are different keyword
00:23:54.880
arguments previously the keyword arguments were based on allocating a hash and then there was a whole bunch of syntax errors it was very confusing now
00:24:01.200
we get some actual separated keyword arguments we get endless method definitions you can do def foo equals bar
00:24:07.760
and that is all working we get some interesting pragmas for ractors and we get
00:24:14.080
keyword pattern matching in single line ruby 3.1 preview one came out today
00:24:21.760
there you go yeah all right uh we get some hash literal syntax uh hash literal shorthand so that you don't have
00:24:27.840
to write the value in a hash this is somewhat like javascript um and we get
00:24:33.200
in pattern matching you can pin expressions not just variables okay
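A few of the newer pieces of syntax mentioned here, sketched with made-up names. This assumes Ruby 3.1 or later; the method `double` and the variables are illustrative only.

```ruby
# Endless method definition (Ruby 3.0+).
def double(x) = x * 2

# Hash literal shorthand (Ruby 3.1+): { x:, y: } means { x: x, y: y }.
x = 1
y = 2
point = { x:, y: }

# Pinning an expression, not just a variable, in a pattern (Ruby 3.1+).
matched =
  case 4
  in ^(double(2)) then true
  else false
  end
```

Each of these needed new rules in the grammar, which is why every alternative parser has to catch up whenever such syntax ships.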
00:24:38.320
that that was the abridged history believe it or not i have a website i will share with you a link at the end that has
00:24:43.919
way more information with way more implementations of ruby and you can go and click around to your heart's desire
00:24:49.840
i got five minutes left i want to tell you how ripper works
00:24:55.919
ripper is a standard library it hooks into the parser it gives you events
00:25:01.760
we go back to our grammar we go back to our parser generator we see this is right this is our lexer down
00:25:08.320
at the bottom it's our grammar up at the top ripper is going to hook in here and it's going to hook in here
00:25:14.799
what that is to say is when ripper finds a token say a token initially it is going to
00:25:20.080
fire a scanner event using this in quotes because that doesn't really mean anything to a lot of people but it's a
00:25:25.360
scanner event up top anytime a rule is reduced you remember when i showed you what reduced meant at the beginning it's
00:25:31.440
going to fire an event whenever a rule is reduced so you're going to get events for these different things i'll show you the syntax for what that means
00:25:38.559
internally to ruby in the source this is in parse.y you can see how this kind of works it found a comment
00:25:45.520
so it's going to dispatch a scan event for that comment okay so we can go and we can build a subclass of ripper
00:25:51.840
we can say on comment we're going to build the subclass we're gonna add an attr_reader for comments
00:25:58.320
we're gonna have this on comment this is how you handle an event in ripper you define this on method there are 200 some
00:26:03.679
odd methods i will show you the link to the docs later if we go and we build a parser with this
00:26:10.559
source we tell it to go and parse itself and then we pull the comments back out you will get a list of comments this is
00:26:17.600
how you can use ripper in your own thing if you want to to go and pull tokens out it's a lot
00:26:25.120
easier on the token side than it is on the node side and i'll show you why
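A comment collector along the lines described could be sketched like this. The class name `CommentRipper` and the sample source are assumptions; `Ripper`, `on_comment` and `parse` are part of the standard library.

```ruby
require "ripper"

# Collects every comment token that the parser scans.
class CommentRipper < Ripper
  attr_reader :comments

  def initialize(*)
    super
    @comments = []
  end

  # Scanner event handler: called with the text of each comment token.
  def on_comment(value)
    @comments << value
  end
end

source = <<~RUBY
  # first comment
  foo = 1 # second comment
RUBY

parser = CommentRipper.new(source)
parser.parse
comments = parser.comments.map(&:strip)
```

Scanner events like this one are self-contained, so a handler for a single token type works without defining any of the others.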
00:26:30.480
remember that i said it hooked into these different spots up top it's hooking into the reductions of the rules
00:26:35.600
so this is also in parse.y in ruby and this is the part of the grammar
00:26:41.200
that is handling the super calls whenever you call super you can call super with no
00:26:46.559
arguments or you can call them with arguments and you might see in the faint faint font that weird comment this is
00:26:52.640
how ripper works ripper has comments all over the parse.y file ripper then builds
00:26:57.760
its own parse.y file based on this file using those comments using macros yeah
00:27:03.600
it takes those comments and then builds its own parse.y file which it can then dispatch different events the important part is right here
00:27:10.640
that's a dsl it's a tiny little language that's baked into the language parser generator that is used
00:27:16.480
by ruby but then gets generated by makefile which then gets generated by ripper
00:27:21.600
i couldn't tell you that again if i tried um if we go we try to build a
00:27:27.600
parser using ripper to handle this and we go when we pass in this stuff we build these handlers for these events
00:27:34.720
z super took no arguments super took one argument and we tell it to go parse we'll get
00:27:40.240
this super called without arguments great z super worked just fine super called with arguments and we get
00:27:45.679
nothing why did we get nothing if we go back to our grammar we can see
00:27:51.039
okay actually this is being passed an argument what argument is it being passed it's being
00:27:56.799
passed the paren args argument remember the way that this works is as things are getting reduced whatever value you have
00:28:03.360
for that node gets passed up the tree we didn't implement a handler for those events so we get nil
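The situation described here can be reproduced with a sketch like the following. The class name `SuperRipper` and the recorded array shape are assumptions; `on_zsuper` and `on_super` are the Ripper parser events for `super` without and with arguments.

```ruby
require "ripper"

# Records calls to super, with and without arguments.
class SuperRipper < Ripper
  attr_reader :calls

  def initialize(*)
    super
    @calls = []
  end

  # Parser event for a bare `super`.
  def on_zsuper
    @calls << [:zsuper]
  end

  # Parser event for `super(...)`. The argument is whatever the handlers
  # for the child nodes returned while they were being reduced.
  def on_super(args)
    @calls << [:super, args]
  end
end

parser = SuperRipper.new("def foo; super; end\ndef bar; super(1); end")
parser.parse
calls = parser.calls
```

Because this class defines no handlers for the argument nodes, the value that arrives in `on_super` carries no useful information, which matches the nil seen in the talk.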
00:28:09.360
so okay we have a problem what are we going to do ripper ships with two subclasses
00:28:15.120
there is sexp builder s expression builder and s expression builder pp for
00:28:20.240
pretty printer it has an implemented handler for every single one using method missing it's a
00:28:26.559
whole thing if we go and we call it with this
00:28:31.919
subclass we actually will get something that looks a lot like the ast we built earlier right so
00:28:37.760
what are your options if you want to use ripper you can implement every method handler yourself which is fine you can do it
00:28:45.279
you can inherit from ripper sexp builder or ripper sexp builder pp or you can do some combination of both
00:28:51.520
as we saw with the comment handler earlier it just worked you can just get your tokens out it's fine
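The two built-in subclasses can be tried directly. In this sketch `Ripper::SexpBuilder` is instantiated by hand and `Ripper.sexp` is the convenience method that uses the pretty-printing variant; the sample source string is an assumption.

```ruby
require "ripper"

source = "def bar; super(1); end"

# Every event has a handler here, so values propagate all the way up the tree.
raw_tree = Ripper::SexpBuilder.new(source).parse

# Ripper.sexp builds a slightly flatter tree using SexpBuilderPP.
pretty_tree = Ripper.sexp(source)
```

Both results are nested arrays rooted at a `:program` node, which makes them a convenient starting point before writing custom handlers.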
00:28:57.200
uh i have implemented one for you it's here it's in prettier um
00:29:02.720
and it's you know it's there it happens uh there are 200 some odd tokens there is a hash for each of them
00:29:09.679
i'm actively working on upstreaming it so that you can have one for yourself if you want to build a syntax tree what
00:29:16.080
are your options to this day ryan is still maintaining ruby parser this is an option for you it
00:29:22.080
has some community adoption it's not 100 percent compatible though he did just bring it up to ruby 3.0 new stuff may break
00:29:27.360
because it's not shipped with core the parser gem tons of community adoption it backs rubocop it backs
00:29:33.120
standard which is a wrapper around rubocop it's very well documented not 100 percent compatible necessarily new stuff
00:29:38.799
may break every time new syntax comes out they have to play catch up it's just part of the name of the game it doesn't ship with or test with core so i mean
00:29:46.559
that's just what it is ruby vm abstract syntax tree it's still too early to tell what's actually happening
00:29:51.760
with this it's on the issue tracker you can check it out it's not implemented on any other ruby implementation so if you're interested in portability don't
00:29:58.000
choose that option finally there's ripper as we talked about it's built into the parser generator it's well tested in core
00:30:03.600
ships with ruby there's no documentation uh in the last hack days for shopify
00:30:09.279
thank you shopify i uh spent a fair amount of time uh and there is now documentation for
00:30:15.200
every event and with an example for every single one that will trigger all those events you can go and find it i
00:30:20.320
will send you the link uh it is here all of that is to say
00:30:27.360
ruby uses a parser generator it originally used yacc it now uses bison
00:30:32.720
parser generators are complicated technologies that use shift and reduce
00:30:37.840
operations to build up syntax trees parser generators are difficult to maintain across implementations of
00:30:43.840
languages they're not the most intuitive of technologies and it's difficult to maintain upstream compatibility it's a
00:30:51.200
good thing that ruby is going to slow down on syntax and future development because it's going to give an opportunity for all the other
00:30:56.799
implementations to catch up but this is not something that we can solve as a community without the help of
00:31:02.240
core this is not something that we can really fix without having a standard
00:31:07.440
library parser because as you saw with the massive list of tools that i went through
00:31:13.200
all of those went away when the money dried up you can only keep up with core for so
00:31:18.320
long um it's exhausting and you you have to have
00:31:23.360
the parser ship with ruby that's really the only option so ruby vm ast or ripper
00:31:28.720
those are the options i'm pushing for ripper but i'd be happy if we just had one that was standard that rubocop used
00:31:34.480
so that we could all get used to that and use it and yeah i hope this inspires
00:31:40.480
you to learn more about syntax trees i hope this inspires you to build more tools on ruby you know we're here to answer the call
00:31:45.760
that matz just put out to build more tools and uh yeah that's all i got thank you very much