00:00:13.519
I work at a company called AppSignal
00:00:15.960
and we're sort of fully remote,
00:00:18.900
so we have an internal conference that
00:00:20.640
we do every year because you kind of
00:00:21.840
have to invent these ways of
00:00:23.880
staying in touch with each other and
00:00:25.439
kind of understanding who everybody
00:00:27.720
is, so this
00:00:30.900
presentation has been like four years in
00:00:32.399
the making like I did a bunch of
00:00:34.140
internal
00:00:35.940
sort of talks about music stuff and I
00:00:39.840
kind of wrapped it all up into one big
00:00:41.340
thing here so this is like some examples
00:00:44.040
of flux stuff that people do at a
00:00:46.079
company there's a makeouts or I
00:00:48.539
did not know
00:00:49.980
uh people are into pipes
00:00:53.399
uh procrastination is a big thing
00:00:55.079
amongst developers I've heard
00:00:58.620
yeah so what I usually do is I get into
00:01:01.620
I've been getting into music production
00:01:03.480
myself for a couple of years and I
00:01:07.380
sort of figured out that writing code is
00:01:09.900
a really good way of understanding the
00:01:11.400
world because it kind of forces you to
00:01:13.020
actually
00:01:14.420
block the whole mental model or
00:01:16.560
otherwise you're not going to be able to
00:01:17.820
automate something so for this
00:01:19.920
presentation I went through this process
00:01:21.479
which I've done before which is
00:01:23.159
basically not understanding anything
00:01:24.960
about it like forcing yourself to
00:01:27.780
understand it through modeling it in
00:01:29.460
code, and now I'm here sharing
00:01:33.060
the results of that with you
00:01:35.340
so what we're going through today is
00:01:37.200
we'll do a really quick history of
00:01:39.540
music recording technologies, just so we
00:01:41.880
kind of know where we ended up and then
00:01:43.799
we're getting into uh digital audio and
00:01:46.619
like some ways to manipulate it and
00:01:48.240
generate sample in it
00:01:51.240
so
00:01:53.880
um
00:01:54.960
yeah so what is music actually I think
00:01:57.479
that's that's sort of the starting point
00:01:59.420
music is a really weird thing
00:02:01.799
because it it only exists in our brain
00:02:03.960
like, our neuroscientists
00:02:06.719
don't really understand how this whole
00:02:08.340
process works, and somehow there are
00:02:09.899
these waves
00:02:11.099
uh sounds
00:02:13.200
which you can kind of visualize like
00:02:15.720
this they're they're
00:02:17.459
um kind of similar to like how a wave
00:02:19.680
would operate in water, which is a
00:02:21.660
little more intuitive for us so what
00:02:23.580
you're seeing here actually also happens
00:02:25.140
at the moment in this room where these
00:02:27.720
speakers kind of just
00:02:30.420
create movement in the air and
00:02:33.000
it kind of oscillates its way into
00:02:35.700
your ear
00:02:38.220
once it makes it to your ear there's a
00:02:40.980
bunch of these really tiny hairs that
00:02:42.900
are called follicles and they vibrate as
00:02:45.300
well and that gets picked up by your
00:02:46.800
brain
00:02:47.760
and when I say your brain like we have
00:02:50.099
no clue, somehow we kind of find
00:02:51.959
meaning in all these waveforms and
00:02:54.360
we just perceive it as music and
00:02:56.879
it's an emotional thing and we have no
00:02:59.400
clue why
00:03:00.420
maybe we'll find out someday
00:03:03.660
so how does this sort of like basic
00:03:05.819
process work so we've got a
00:03:08.819
uh
00:03:10.019
one important aspect especially is
00:03:12.360
pitch, so this is the number of times
00:03:14.340
that this waveform kind of oscillates
00:03:17.940
um
00:03:20.580
so this is like a very simple waveform
00:03:23.040
I'm just going to let you hear it
00:03:26.280
so this is just a sine wave
00:03:30.300
which is kind of
00:03:32.099
oscillating 440 times a
00:03:35.040
second and then you end up with the
00:03:36.599
sound
00:03:37.980
if you play it at multiple
00:03:41.220
frequencies you kind of perceive it
00:03:43.440
as pitch, as having notes; that sounds a
00:03:45.360
little bit like this
00:03:50.159
sorry to mess up the order there
00:03:55.980
that's purely a sine wave that we
00:03:58.200
generated that's just playing at
00:04:00.959
these different frequencies
00:04:04.260
next up is timbre, so this is what
00:04:07.500
you get when you go from having
00:04:09.720
like a really simple wave to like more
00:04:11.879
complicated waveform with all these kind of
00:04:13.680
like little edges you see there so then
00:04:16.199
you get something like a piano note
00:04:18.739
it just has a completely
00:04:21.959
different sound to it but it's the same
00:04:23.400
pitch
00:04:26.720
next up is tempo, so tempo is like
00:04:31.080
what you get when you
00:04:32.940
play the drum, for example
00:04:35.759
so this is like two waveforms with
00:04:39.419
some space in between, and we really
00:04:41.639
perceive that as a sort of tempo
00:04:47.639
um
00:04:48.479
well if you combine all that stuff you
00:04:50.639
already get something that's kind of
00:04:51.960
like music so you get like a little
00:04:54.900
little few notes and then if you add
00:04:58.320
Rhythm to that
00:05:04.100
sort of a song
00:05:06.979
it's probably the most boring song that
00:05:09.419
has ever been created but it is a song
00:05:11.820
like I think we all agree
00:05:13.740
that it's music
00:05:16.680
yeah so back in the day
00:05:20.040
there was no recorded music, so
00:05:22.199
you could just go into a
00:05:24.000
room and see somebody play and that
00:05:25.680
was it and this kind of changed
00:05:30.000
at the beginning of the 20th century
00:05:33.780
so we'll just really quickly gloss
00:05:36.300
over this so we have a bit of a
00:05:38.880
basis
00:05:40.020
so this is the first
00:05:41.840
known music recording device in
00:05:45.360
history so there's a little roll on
00:05:47.340
there and there's a wax layer and then
00:05:49.259
if you kind of like shout really hard
00:05:51.000
into that hole, this little
00:05:53.520
needle kind of vibrates and it makes a
00:05:56.160
dent in the wax, and if you
00:05:58.560
replay that you get a
00:06:00.840
really silent signal back
00:06:04.680
so this wasn't really useful but it was
00:06:06.780
the first one that kind of evolved into
00:06:08.880
these like record players it was still
00:06:10.740
completely mechanical so they didn't
00:06:12.600
produce a lot of volume
00:06:14.820
so people would record like this,
00:06:17.460
like, this little cone you see over
00:06:19.560
there really had to be like right in the
00:06:21.180
middle of the of the action otherwise
00:06:23.039
there would like basically be no signal
00:06:26.880
then the tape machines happened, so
00:06:29.160
that kind of encoded the waveforms as
00:06:31.080
magnetic charge, like, there were
00:06:33.360
these little iron particles on the tape
00:06:36.360
and they're kind of charged in
00:06:38.699
one direction or the other, and that
00:06:40.560
was a way to also
00:06:42.539
record these waveforms and
00:06:44.220
play them back
00:06:45.600
but then World War II happened and
00:06:48.300
then like the whole thing really started
00:06:50.400
moving fast because a lot of
00:06:52.680
technology was invented for radar and so
00:06:55.199
on that later turned out to be really
00:06:57.060
useful in recording as well
00:06:59.340
so we got tubes, so with these tubes you
00:07:02.039
could amplify signals so you could take
00:07:04.080
something that's really silent, like
00:07:06.419
the signal
00:07:08.160
coming out of this microphone, for
00:07:09.539
example, there's a tiny
00:07:11.460
amount of electricity and in the end you
00:07:13.500
need a lot of electricity to kind of
00:07:15.180
power that speaker that's
00:07:17.160
getting the air to move
00:07:20.880
yeah so we've got these tubes, you know,
00:07:23.160
things started becoming more ambitious
00:07:24.780
like we got like a lot of microphones
00:07:26.639
these types of microphones also had like
00:07:28.620
little tubes in them to amplify the
00:07:30.419
signal
00:07:32.099
you know we got these mixing desks
00:07:34.560
uh record players
00:07:36.840
like a lot of transistor based stuff was
00:07:38.880
then done, so this is an old
00:07:40.740
compressor that doesn't use tubes
00:07:42.900
anymore
00:07:43.919
and then the 80s happened and the whole
00:07:46.139
thing changed
00:07:48.180
or, like, a lot of people think
00:07:50.940
this is kind of when it got terrible
00:07:52.560
because then we got digital audio
00:07:54.479
and we got the CD, and digital
00:07:59.099
audio is
00:08:01.020
sort of perfect, like, it took a long
00:08:03.180
time for people to find a way to make
00:08:05.039
digital audio actually sound good and
00:08:07.460
that's where we're
00:08:09.360
getting to now
00:08:10.919
so what's digital audio? it's a way to
00:08:14.639
sample these waveforms and reproduce
00:08:17.340
them that way, so in the
00:08:19.440
natural world you get a really
00:08:22.020
smooth, like, actually smooth curve,
00:08:25.020
it's an almost perfect smooth curve
00:08:27.300
and what we do when we create this
00:08:30.300
digital audio is you sample that, so you
00:08:32.580
just take these measurements uh
00:08:34.919
thousands of times per second and you
00:08:37.620
can use that to sort of like recreate
00:08:39.240
the waveform that you measured in a
00:08:41.760
way that just
00:08:44.219
sounds almost exactly the same
00:08:46.500
yeah and then the older recording
00:08:50.519
technology got kind of replaced by
00:08:52.200
software like this, so this is a program I like
00:08:54.060
to use
00:08:54.959
and it contains all the
00:08:57.839
gear that used to be in a million-dollar
00:08:59.160
studio, it's now kind of embedded in
00:09:01.800
this type of software
00:09:03.959
and what I'm going to do now is recreate
00:09:06.000
like a few parts of this software using
00:09:08.220
Ruby code, and we'll get to see
00:09:11.160
how this stuff actually works
00:09:14.399
so digital audio, it's a lot of
00:09:16.560
numbers
00:09:17.700
so we're going to use a gem called the
00:09:20.580
wavefile gem in this presentation and
00:09:23.700
that gem is able to read
00:09:26.220
and write
00:09:28.140
digital audio
00:09:29.580
so this is like a little bit of code, so
00:09:32.040
we just open a file we just
00:09:34.500
smash all the numbers into one big thing
00:09:37.260
and then we can work with it and then
00:09:39.720
you get this
00:09:41.459
so yeah
00:09:42.959
that's not really useful, well, that's not
00:09:44.880
really useful for a human
00:09:47.339
we can also write stuff back
00:09:49.800
another gem I'm using is ChunkyPNG, so
00:09:52.560
that lets us create some images
00:09:53.880
because we need to like be able to see
00:09:56.160
what we're doing
00:09:58.019
so what we're going to do is like go
00:10:00.660
from this
00:10:02.160
to an image like this, so these
00:10:05.220
images,
00:10:06.180
the next one, this is like the
00:10:09.240
beginning of a hi-hat sound, so it looks
00:10:11.880
kind of random, if you take a sine wave
00:10:14.100
that we talked about earlier
00:10:15.779
it's a bit easier to track what
00:10:17.519
happens so the white line in the middle
00:10:19.440
is kind of like the zero point and
00:10:21.839
then we've got a wave that's
00:10:23.580
kind of oscillating from
00:10:25.260
positive to negative and it kind of
00:10:27.180
flows around this line in the
00:10:29.760
middle
00:10:31.980
and this is some code to generate these
00:10:34.860
images so what we're doing is we're
00:10:36.899
taking this really long array of numbers
00:10:38.820
these samples
00:10:40.740
and we're just drawing a point
00:10:42.720
either above or below the line
00:10:45.240
we're doing some calculations to kind
00:10:47.339
of figure out where
00:10:49.560
they are supposed to end up, it doesn't
00:10:50.700
really matter what the calculations are, I
00:10:53.820
will share this code after the
00:10:54.959
presentation if you want to play around
00:10:56.220
with it
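A minimal sketch of the plotting idea described above: each sample becomes one point above or below a zero line. This is not the speaker's code. It draws a text grid instead of a PNG so it runs without any gem, and the names `row_for` and `plot`, the grid height, and the 16-bit sample range are all assumptions made for illustration.

```ruby
# Assumed 16-bit sample range; the talk does not state the bit depth.
MAX_AMPLITUDE = 32_768.0

# Convert one sample into a row index. Row 0 is the top of the image
# and the middle row is the zero line.
def row_for(sample, height)
  middle = (height - 1) / 2.0
  (middle - (sample / MAX_AMPLITUDE) * middle).round
end

# Plot one column per sample: "-" marks the zero line, "*" marks the sample.
def plot(samples, height: 9)
  grid = Array.new(height) { Array.new(samples.length, " ") }
  zero_row = row_for(0, height)
  samples.each_with_index do |sample, x|
    grid[zero_row][x] = "-"
    y = row_for(sample, height).clamp(0, height - 1)
    grid[y][x] = "*"
  end
  grid.map(&:join)
end

puts plot([0, 16_000, 32_767, 16_000, 0, -16_000, -32_768, -16_000])
```

Positive samples land above the zero line and negative ones below it, which matches the picture described in the talk.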
00:10:58.800
and another visualization
00:11:02.339
that's very often used is
00:11:05.459
this visualization where you're
00:11:06.959
kind of compressing these
00:11:08.760
shapes into each other, so this one is
00:11:10.320
still like we see individual dots for
00:11:12.600
every single sample here so this is
00:11:14.160
probably only like 0.0 or one one
00:11:16.260
seconds of audio that we're looking at here
00:11:19.920
and here we're looking at like four or
00:11:22.260
five seconds of audio so we're kind of
00:11:23.940
compressing it so that's that's going to
00:11:25.440
be the two visualizations I'm going to
00:11:27.660
use for the rest of the presentation
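One way the compressed overview could be computed, sketched under assumptions: the sample array is cut into one slice per image column, and only the lowest and highest value of each slice is kept for drawing. The helper name `column_peaks` is hypothetical, and the talk does not say how its own overview images are built.

```ruby
# Reduce a long sample array to one [min, max] pair per image column.
# Returns at most `columns` pairs.
def column_peaks(samples, columns)
  return [] if samples.empty?

  per_column = (samples.length.to_f / columns).ceil
  samples.each_slice(per_column).map(&:minmax)
end

p column_peaks([0, 5, -3, 8, -9, 2, 1, 1], 4)
# => [[0, 5], [-3, 8], [-9, 2], [1, 1]]
```

Drawing a vertical line from min to max in each column gives the familiar filled waveform shape for several seconds of audio.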
00:11:31.019
well, you can look at this if you
00:11:33.300
want later
00:11:41.880
like this,
00:11:43.200
like these tubes used to do
00:11:45.120
so we're taking this piece of audio
00:11:52.700
which is the same drum Loop we did
00:11:55.200
earlier
00:11:57.720
and we're going to make it a little bit
00:11:59.760
louder
00:12:05.100
and
00:12:06.680
then even a little louder than that
00:12:13.320
we're going to get to this in
00:12:15.720
a minute, so we made it a
00:12:17.640
little bit louder and then we made it
00:12:18.899
louder again and then things went
00:12:20.760
horribly wrong
00:12:25.140
so what are we doing here? so
00:12:29.160
basically, in the end it's
00:12:31.920
relatively simple, we're just looping
00:12:33.360
through all the samples and we're just
00:12:35.399
multiplying them, so that's
00:12:39.180
the whole thing, there's nothing else to
00:12:40.800
it, so
00:12:43.680
whichever sample had a value of 100
00:12:45.839
gets a value of 200, and if you do that
00:12:47.820
consistently over the duration of the
00:12:49.920
whole sample set it's going to
00:12:51.660
sound the same, only louder
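The loop described above can be sketched in a few lines. The method name `amplify` is an assumption, and the samples are taken to be a plain array of integers as read from an audio file.

```ruby
# Multiply every sample by the same ratio. A sample of 100 becomes 200
# when the ratio is 2, so the shape of the wave stays the same.
def amplify(samples, ratio)
  samples.map { |sample| (sample * ratio).round }
end

p amplify([100, -250, 0], 2)
# => [200, -500, 0]
```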
00:12:53.820
and if we do it with a ratio of
00:12:56.399
four, so we make it four times as loud,
00:12:58.200
you get this thing which is the
00:13:00.839
bane of every sound engineer
00:13:04.440
out there, it's clipping, it's what we
00:13:06.180
just heard, so
00:13:09.480
so this is what happens
00:13:11.459
what you see here is that
00:13:13.079
these peaks in the signal are kind of
00:13:14.579
going higher than we have space
00:13:16.860
for in the image and that means that they're
00:13:19.019
kind of being cut off, so you get
00:13:20.459
this distortion effect that sounds
00:13:23.100
pretty bad if you do it in a digital
00:13:25.980
way, but it's also a very crucial element
00:13:28.139
of a lot of music, so any time
00:13:30.120
you hear like Jimi Hendrix play
00:13:31.560
guitar, that basically uses this
00:13:33.899
effect of cranking too much signal
00:13:36.779
into something that can't really handle
00:13:38.220
that and then it starts kind of
00:13:40.500
mutating the signal
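A small sketch of why the peaks get cut off, assuming 16-bit samples (the talk does not state the bit depth): once a multiplied value no longer fits the available range, it can only be stored as the largest or smallest value, so the top of the wave is flattened. The method name is hypothetical.

```ruby
# Assumed 16-bit signed sample range.
MIN_SAMPLE = -32_768
MAX_SAMPLE = 32_767

# Amplify, then force every value back into the range that can be stored.
# Anything beyond the limits is flattened, which is the clipping distortion.
def amplify_with_clipping(samples, ratio)
  samples.map { |sample| (sample * ratio).round.clamp(MIN_SAMPLE, MAX_SAMPLE) }
end

p amplify_with_clipping([10_000, -9_000, 2_000], 4)
# => [32767, -32768, 8000]
```

The first two samples hit the limits and lose their original shape, while the quiet third sample is simply four times as loud.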
00:13:46.079
yeah so that's sort of like the
00:13:47.940
simplest thing we can do it's just
00:13:49.380
amplifying the sound
00:13:50.820
we can also make sounds
00:13:52.860
so
00:13:54.000
in the analog world you've got
00:13:57.240
lots of ways to make sound, but
00:13:58.680
one way is using a synthesizer, so
00:14:00.839
a synthesizer has these sound sources
00:14:02.760
and filters and all kinds of stuff we're
00:14:04.980
going to focus on the sound sources
00:14:07.139
so there's a couple of things you can
00:14:09.839
do to make sound, so one of them is
00:14:13.079
having noise
00:14:21.300
what you're seeing on the
00:14:23.519
screen is basically random pixels, and if
00:14:26.579
you translate that to a sound
00:14:30.600
you just get white noise that
00:14:32.639
you might be familiar with, like if you
00:14:34.019
tune an old TV to a non-existent
00:14:37.200
signal then you will get stuff like this
00:14:39.240
and generating noise in Ruby is also
00:14:42.899
relatively simple code
00:14:45.180
so we're just going to loop through,
00:14:48.180
we've got a loop and we just
00:14:50.339
kind of insert a random number in
00:14:52.860
the range of the lowest negative
00:14:54.899
value to the highest positive value, you
00:14:58.019
slam it into the array and, well, you've
00:15:01.079
got noise
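The noise loop described above might look roughly like this. The 16-bit range, the method name `white_noise`, and the optional `random:` argument (added here so the output can be made repeatable) are assumptions, not the speaker's code.

```ruby
# Assumed 16-bit signed sample range.
MIN_SAMPLE = -32_768
MAX_SAMPLE = 32_767

# Fill an array with random values between the lowest negative and the
# highest positive sample value.
def white_noise(sample_count, random: Random.new)
  Array.new(sample_count) { random.rand(MIN_SAMPLE..MAX_SAMPLE) }
end

# One second of noise, assuming 44,100 samples per second.
noise = white_noise(44_100)
puts noise.first(5).inspect
```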
00:15:03.240
um so it don't have that like a lot of
00:15:05.399
this audio stuff is like if you if you
00:15:07.260
translate it into Ruby code it's it's
00:15:09.180
you end up plug with like five lines
00:15:11.519
which is kind of interesting I think
00:15:14.639
another thing you can make is a square
00:15:16.860
wave
00:15:21.660
that sounds like this, so this is
00:15:25.740
also, you hear this in music
00:15:28.079
a lot actually, it sounds horrible now
00:15:29.519
because we made a really simple one but
00:15:30.959
if you kind of process it, it's
00:15:33.000
a big part of electronic music
00:15:35.639
and making a square wave looks a little
00:15:37.980
bit like this
00:15:39.720
so
00:15:42.000
again we're looping, and I created
00:15:46.139
an oscillator here, which is kind of like
00:15:48.540
a thing that alternates
00:15:50.220
between two sides, kind of like a
00:15:52.019
metronome
00:15:53.639
and we're looping through samples again
00:15:56.100
and depending on where we are
00:15:59.220
in the oscillation we either pick a
00:16:01.260
high value or a low value, and then you end
00:16:04.620
up with a graph like this where these
00:16:06.600
highs and lows are kind of stacked
00:16:09.000
on the opposite ends of
00:16:11.820
the middle and you get a square wave
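A sketch of the square wave idea, not the speaker's oscillator: for each sample, work out how far along the current cycle we are and pick the high value for the first half and the low value for the second half. The sample rate, the 16-bit limits, and the method name are assumptions.

```ruby
# Assumed values: 44,100 samples per second and 16-bit limits.
SAMPLE_RATE = 44_100
HIGH = 32_767
LOW = -32_768

def square_wave(frequency, sample_count)
  samples_per_cycle = SAMPLE_RATE.to_f / frequency
  Array.new(sample_count) do |i|
    # Position within the current cycle, from 0.0 up to (not including) 1.0.
    position = (i % samples_per_cycle) / samples_per_cycle
    position < 0.5 ? HIGH : LOW
  end
end

# At 441 Hz one cycle is exactly 100 samples: 50 high, then 50 low.
wave = square_wave(441, 200)
puts wave[48..51].inspect
```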
00:16:17.399
uh
00:16:20.220
yeah, this is the code to
00:16:23.760
generate the so-called oscillator
00:16:25.800
another
00:16:27.480
often-used wave is a sine wave
00:16:33.199
that sounds like this
00:16:42.480
and the sine wave has a slightly more
00:16:44.579
complex math behind it, so we're using
00:16:48.560
Math.sin to calculate the
00:16:52.259
next point based on the angle
00:16:55.560
that the signal is moving in
00:16:57.600
and again, if you're
00:17:00.899
interested in this like definitely look
00:17:02.339
at the examples
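A minimal sine generator in the spirit of the description above, assuming 44,100 samples per second and 16-bit amplitude (neither is stated in the talk). The angle advances by a fixed step for every sample and `Math.sin` turns that angle into a value; the method name `sine_wave` is hypothetical.

```ruby
# Assumed values: 44,100 samples per second and 16-bit amplitude.
SAMPLE_RATE = 44_100
MAX_AMPLITUDE = 32_767

def sine_wave(frequency, sample_count)
  # How far the angle (in radians) advances from one sample to the next.
  angle_per_sample = 2 * Math::PI * frequency / SAMPLE_RATE
  Array.new(sample_count) do |i|
    (Math.sin(i * angle_per_sample) * MAX_AMPLITUDE).round
  end
end

# One second of the 440 Hz tone from earlier in the talk.
tone = sine_wave(440, SAMPLE_RATE)
puts tone.first(5).inspect
```

Changing the frequency argument gives the different notes played earlier in the talk.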
00:17:06.720
this is Mr Fourier, and he's
00:17:10.980
a French mathematician and he found out
00:17:13.500
that all sounds can actually be
00:17:15.900
represented by different sine waves
00:17:18.319
that can be merged, so you could have
00:17:20.819
stuff like this which is a combination
00:17:23.819
of two sine waves
00:17:26.939
and you get a chord, so if you
00:17:30.780
combine these two
00:17:32.580
it's a little bit off actually, but
00:17:34.620
it's a chord
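Merging sine waves into a chord can be sketched as adding the waves together sample by sample. The frequencies below (roughly an A major triad) are an assumption chosen for illustration, as are the method names, the sample rate, and the choice to average the sum so it stays within the 16-bit range.

```ruby
# Assumed values: 44,100 samples per second and 16-bit amplitude.
SAMPLE_RATE = 44_100
MAX_AMPLITUDE = 32_767

# Sine values between -1.0 and 1.0 for one frequency.
def sine_values(frequency, sample_count)
  Array.new(sample_count) do |i|
    Math.sin(2 * Math::PI * frequency * i / SAMPLE_RATE)
  end
end

# Add the waves together sample by sample, then average so the result
# still fits the sample range.
def chord(frequencies, sample_count)
  waves = frequencies.map { |frequency| sine_values(frequency, sample_count) }
  Array.new(sample_count) do |i|
    sum = waves.sum { |wave| wave[i] }
    (sum / frequencies.length * MAX_AMPLITUDE).round
  end
end

# Hypothetical example: three notes at once for one second.
a_major = chord([440.0, 554.37, 659.25], SAMPLE_RATE)
puts a_major.first(5).inspect
```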
00:17:39.000
and the code for that is
00:17:41.880
also not a whole lot of code, so
00:17:45.120
we're creating three
00:17:47.400
different sine wave generators
00:17:49.799
and we kind of merge those
00:17:51.240
together, so this brings us to the
00:17:54.960
next part of the presentation, I'm
00:17:57.900
just going to skip this,
00:17:59.580
which is mixing, so this is a
00:18:03.000
mixing desk
00:18:04.200
so the sound comes in into all these, like,
00:18:07.620
paths as you see here, that all have a
00:18:09.120
different fader, and then it's merged
00:18:11.640
into one signal that comes out at the
00:18:13.440
end of the thing
00:18:15.179
uh
00:18:16.320
so what you do is you have like multiple
00:18:18.660
waveforms multiple channels and you kind
00:18:21.240
of combine them into this more complex
00:18:22.919
thing
00:18:24.240
so these are three waveforms that we
00:18:26.220
were listening to earlier
00:18:29.160
we can read them
00:18:31.320
into into three separate arrays
00:18:34.500
and then we loop through the
00:18:37.260
whole thing and we just take
00:18:38.820
all the numbers that we got
00:18:42.419
from all three of these tracks, just sum
00:18:44.640
them together and we get the signal
00:18:46.980
back, so one thing we have to do here
00:18:49.559
is, like, divide things by 1.5 to
00:18:52.679
avoid the clipping issue, because if you
00:18:54.480
keep stacking up those numbers
00:18:56.280
you're going to go above the limit
00:18:57.960
of the thing and you kind of have
00:18:59.400
to bring the level down to get back to
00:19:01.020
the proper level again
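The mixing step above, sketched with plain arrays standing in for the three tracks: sum the samples at each position and divide by 1.5 to bring the level back down. The final clamp to the 16-bit range and the handling of tracks with different lengths are additions for safety, not something described in the talk, and the method name `mix` is hypothetical.

```ruby
# Sum several tracks sample by sample and scale the result down.
def mix(tracks, divisor: 1.5)
  length = tracks.map(&:length).max || 0
  Array.new(length) do |i|
    # Tracks shorter than the longest one contribute silence (0).
    sum = tracks.sum { |track| track[i] || 0 }
    # Clamp is an extra guard (assumed 16-bit range), not from the talk.
    (sum / divisor).round.clamp(-32_768, 32_767)
  end
end

p mix([[300, -600], [150, 0], [0, 90]])
# => [300, -340]
```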
00:19:06.780
and then we get this result, this thing
00:19:10.440
has actually been mixed by the Ruby code
00:19:12.480
that we were looking at earlier
00:19:17.400
and the last technique I want to talk
00:19:19.500
to you about today is compression, so
00:19:21.660
and I'm not talking about
00:19:23.640
MP3, this is audio compression, there
00:19:27.360
used to be machines like this one that
00:19:30.000
did it, well, this one is still actually
00:19:31.260
really popular, this is an extremely
00:19:33.360
expensive device that's used on a lot of
00:19:35.100
records
00:19:36.179
so what compression does is
00:19:39.059
it takes a waveform,
00:19:42.900
it takes a waveform that has
00:19:45.480
this peak, so if you play the drum you're
00:19:47.520
going to get
00:19:49.799
a sort of really high value at the
00:19:52.140
beginning and then as the sound
00:19:53.580
kind of tapers off it
00:19:55.440
becomes less loud but that's often not
00:19:57.720
what you want if you're making music
00:19:59.340
because you want to have like a
00:20:00.780
consistent level that's pleasant to
00:20:03.000
listen to, so what compression does is
00:20:06.419
you kind of draw a line which is
00:20:08.220
called the threshold, you want the
00:20:11.340
peaks that are above this line to kind
00:20:13.320
of become less loud
00:20:15.360
so what you do is you make them less
00:20:17.820
loud
00:20:19.320
and then
00:20:22.260
um
00:20:23.940
and then you can make the whole thing
00:20:25.860
a little bit louder, so
00:20:28.700
you get a consistently
00:20:31.799
higher level for the whole signal
00:20:35.160
so we'll take a little sample I have
00:20:38.220
here
00:20:42.900
like a little simple hit
00:20:47.960
and we're going to apply this compression
00:20:50.580
to it in two steps, so again we're
00:20:52.980
looping through all the samples and
00:20:55.080
whenever we see that this
00:20:57.240
value is above the threshold that we
00:20:59.700
set we just divide it by the ratio, and
00:21:03.419
if we see that it's lower than the
00:21:05.039
threshold that we just set then we just add
00:21:07.620
it, so we're only going to
00:21:09.000
manipulate these larger parts of
00:21:10.980
the signal to balance it all out
00:21:13.679
and then in the second
00:21:15.539
step
00:21:16.320
we're going to apply some gain to it so
00:21:18.840
we're just multiplying it by a
00:21:21.840
number
00:21:23.400
and adding it back
00:21:25.860
into this array
00:21:28.020
and it looks like this in the end, so
00:21:30.120
just a two-step process and then we're
00:21:31.679
writing it back to disk
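The two steps can be sketched as two small methods. As described in the talk, samples above the threshold are divided by the ratio and everything else is kept as is; a gain is then applied to the whole signal. Comparing the absolute value (so negative peaks are treated the same way), the clamp to a 16-bit range, and the method names are assumptions.

```ruby
# Step 1: make everything that is louder than the threshold less loud.
def compress(samples, threshold:, ratio:)
  samples.map do |sample|
    # Assumption: compare the absolute value so negative peaks count too.
    sample.abs > threshold ? (sample / ratio.to_f).round : sample
  end
end

# Step 2: make the whole signal louder again.
def apply_gain(samples, gain)
  # Clamp is an extra guard (assumed 16-bit range), not from the talk.
  samples.map { |sample| (sample * gain).round.clamp(-32_768, 32_767) }
end

hit = [20_000, -12_000, 4_000, -1_000]
squashed = compress(hit, threshold: 10_000, ratio: 2)
p squashed                 # => [10000, -6000, 4000, -1000]
p apply_gain(squashed, 2)  # => [20000, -12000, 8000, -2000]
```

The peaks end up at their original level while the quiet tail is twice as loud, which is the more consistent level described in the talk. A compressor this crude changes the shape of the wave abruptly at the threshold, which fits the speaker's remark that it sounds horrible.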
00:21:33.360
and then you go from this form
00:21:36.000
to this form, so you see it
00:21:39.360
elevated the parts of the signal
00:21:40.980
that were less loud, and it sounds
00:21:43.080
horrible because this is a terrible
00:21:44.580
compressor, but you can sort of
00:21:46.799
hear what happens, so this is the
00:21:48.299
original one
00:21:53.400
yeah it's not really natural if you do
00:21:55.740
it like this
00:22:00.299
but it's a good illustration yeah and
00:22:03.960
that brings me to the end of
00:22:05.820
all the things I want to discuss today
00:22:07.980
so it's almost lunchtime
00:22:11.280
um but I have one more thing so I think
00:22:13.740
this was all pretty abstract, so I wanted to
00:22:16.140
prove to you all that you can actually
00:22:17.880
make music just using those samples that
00:22:20.100
I used
00:22:21.299
so I made a cover of I think the best
00:22:23.580
song that was ever made in history and
00:22:26.580
I'm going to play it for you now
00:22:44.780
thank you