Introduction to Metaprogramming and Build Systems
Metaprogramming here refers not to programming itself but to the tools and processes surrounding software development, such as building, testing, and managing dependencies. Build systems automate the repetitive commands needed to produce software artifacts like executables or documents.
What is a Build System?
- Purpose: Automate commands to build targets (e.g., paper.pdf, software binaries).
- Targets and Dependencies: Define files and commands necessary to produce outputs.
- Example tool: make, widely available on Unix-like systems and well suited to simple and medium-complexity projects.
Writing a Makefile
- Specify targets and dependencies with rules describing commands to transform dependencies into targets.
- Support for pattern rules and special variables (e.g., $*, $@) allows flexible matching and command execution.
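As a sketch of what such rules look like (file names here are illustrative, and recipe lines must be indented with a tab):

```make
# target: dependencies
#     command(s) that turn the dependencies into the target
paper.pdf: paper.tex plot-data.png
	pdflatex paper.tex

# Pattern rule: % is a wildcard; $* expands to whatever % matched,
# and $@ expands to the target's name.
plot-%.png: %.dat plot.py
	./plot.py -i $*.dat -o $@
```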
Example Workflow
- Build a PDF from a TeX source with embedded plot images.
- Use make to regenerate files only when their dependencies change, minimizing build time.
Managing Dependencies
- Dependencies vary widely: files, programs, libraries, system packages.
- Repositories: Centralized collections (e.g., PyPI, npm, RubyGems) provide dependency libraries.
Semantic Versioning
- Version numbers follow the MAJOR.MINOR.PATCH scheme:
- Patch: Backwards-compatible bug fixes.
- Minor: Backwards-compatible new features.
- Major: Incompatible API changes.
- Ensures compatibility and smooth upgrades across dependent software.
- Example: Python 2 and Python 3 are incompatible, as signaled by the major version change.
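The increment rules above can be expressed as a small function (a sketch, not tied to any particular packaging tool):

```python
def bump(version: str, change: str) -> str:
    """Return the next semantic version after a given kind of change.

    change is one of: "patch" (backwards-compatible bug fix),
    "minor" (backwards-compatible new feature),
    "major" (incompatible API change).
    """
    major, minor, patch = (int(x) for x in version.split("."))
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    if change == "minor":
        return f"{major}.{minor + 1}.0"  # reset patch
    if change == "major":
        return f"{major + 1}.0.0"        # reset minor and patch
    raise ValueError(f"unknown change kind: {change}")

print(bump("8.1.7", "patch"))  # 8.1.8
print(bump("8.1.7", "minor"))  # 8.2.0
print(bump("8.1.7", "major"))  # 9.0.0
```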
Lock Files and Dependency Freezing
- Lock files record exact versions of dependencies for reproducible builds and faster development.
- Prevent unexpected breakages due to upstream changes.
- Extreme locking involves vendoring (copying dependencies into the project).
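The exact lock-file format varies by ecosystem; schematically, an entry records something like the following (package name, URL, and hash are hypothetical):

```json
{
  "dependencies": {
    "some-library": {
      "version": "8.2.4",
      "resolved": "https://registry.example.com/some-library-8.2.4.tgz",
      "integrity": "sha512-..."
    }
  }
}
```

The recorded hash lets the tool verify that a re-downloaded artifact is byte-for-byte the one that was originally audited.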
Continuous Integration (CI) Systems
- Cloud-based automation of build, test, deployment triggered by events like commits or pull requests.
- Examples: Travis CI, GitHub Actions.
- Use cases include automatic testing, linting, deploying documentation or releases, and dependency updates.
- Support collaborative workflows and improve software quality.
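As a sketch of what a CI recipe looks like (the syntax is GitHub Actions'; the workflow name and steps are illustrative for a Python project):

```yaml
# .github/workflows/test.yml — run the test suite on every push and pull request
name: test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - run: pytest
```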
Testing Fundamentals
- Test suites: Collections of automated tests.
- Unit tests: Test isolated components.
- Integration tests: Test interactions between components.
- Regression tests: Prevent reintroduction of past bugs.
- Mocking: Replace parts of the system with controlled test doubles for isolated testing.
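A minimal sketch of mocking in Python, using the standard library's unittest.mock (the function and URL are invented for illustration):

```python
from unittest import mock

def fetch_greeting(http_get):
    """Build a greeting from a network call.

    http_get is passed in as a parameter so that tests can substitute
    a controlled test double for the real network function.
    """
    response = http_get("https://example.com/name")
    return f"hello, {response}"

# Unit test: replace the network call with a test double.
fake_get = mock.Mock(return_value="world")
assert fetch_greeting(fake_get) == "hello, world"
# We can also assert *how* the dependency was used.
fake_get.assert_called_once_with("https://example.com/name")
print("tests passed")
```

This is the core idea of mocking: the component under test is exercised in isolation, with its dependency's behavior fully controlled by the test.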
Advanced Build Systems
- Specialized tools like CMake (for C/C++ projects), Maven/Ant (Java), or Bazel (multi-language at scale) understand language-specific layouts and dependencies.
- Can integrate with simpler tools like make as glue for complex workflows.
Practical Takeaways
- Automate repetitive build steps to save time and reduce errors.
- Use semantic versioning and lock files to maintain compatibility and security.
- Employ continuous integration to automate testing and deployment.
- Understand the role of different test types and leverage mocking for reliable software.
Mastering these tools and concepts significantly improves software development efficiency, reliability, and maintainability, especially in larger projects.
All right everyone, let's get started with the next lecture. Today we're gonna tackle the topic of metaprogramming, and this title is a little weird — it's not entirely clear what we mean by metaprogramming. We couldn't really come up with a better name for this lecture, because this lecture is about the processes that surround the work that you do when working with software. It is not about programming itself, necessarily, but about the process: things like how your system is built, how it's tested, how you add dependencies to your software — the sort of stuff that becomes really relevant when you build larger pieces of software but isn't really programming in and of itself. So the first thing we're going to talk about in this lecture is the notion of
build systems. So, how many of you have used a build system before, or know what it is? Okay, about half of you. For the rest of you: the central idea behind a build system is that you're writing a paper, you're writing software, you're working on a class, whatever it might be, and you have a bunch of commands that you've either written down in your shell history or in a document somewhere, which you know you have to run if you want to do a particular thing. For example, there's a sequence of commands you need to run in order to build your paper, or your thesis, or just to run the tests for whatever class you're currently in. The idea of a build system is that you encode these rules — what commands to run in order to build particular targets — into a tool that can do it for you, and in particular, you teach this tool about the dependencies between the different artifacts that you might build. There are a lot of different types of tools of this kind, and many of them are built for particular purposes or particular languages: some are built for building papers, some for building software, some for particular programming languages like Java, and some tools even have build support built in. npm, for example, as you might be aware if you've done Node.js development, has a bunch of built-in tools for tracking dependencies and building them and everything that depends on them. But more generally these are known as build systems, and at their core they all
function in a very similar way. You have a number of targets — these are the things that you want to build. They are things like paper.pdf, but they can also be more abstract things, like "run the test suite" or "build the binary for this program". Then you have a bunch of dependencies, which are things that need to be built in order for a given target to be built. And then you have rules that define how you go from a complete list of dependencies to the given target. An example of this might be: in order to build my paper.pdf, I need a bunch of plot images that are going to go into the paper, so they need to be built; but once they have been built, how do I construct the paper given those files? That is what a rule is: a sequence of commands you run to get from one to the other. How you encode these rules differs between tools. In this particular class we're gonna focus on a tool called make. make is a tool that you will find on almost any system that you log into today: it'll be on macOS, it'll be on basically every Linux and BSD system, and you can pretty easily get it on Windows. It's not great for very complex software, but it works really well for anything that's simple to medium complexity. Now, when you run
make — make is a user command you can run on the command line — and this is an empty directory, it just says "No targets specified and no makefile found. Stop." So it helpfully tells you that it stopped running, but also tells you that no makefile was found. make will look for a file literally called Makefile in the current directory, and that is where you encode these targets, dependencies, and rules. So let's try to write one. Let's imagine that I'm writing this hypothetical paper, so I'm gonna make a Makefile, and in this Makefile I'm going to say that my paper.pdf depends on — that's what the colon here indicates — paper.tex, which is going to be a little TeX file, and plot-data.png, and the command in order to build this is going to be pdflatex paper.tex. For those of you who are not familiar with this particular way of building documents: TeX is a really handy programming language for documents. It's a really ugly language and a pain to work with, but it produces pretty nice documents, and the tool you use to go from a TeX file to a PDF is pdflatex. Here I'm saying that I also depend on this plot-data.png that's gonna be included in my document, and what I'm really saying is that if either of those two dependencies changes, I want you to rebuild paper.pdf. They both need to be present, and should they ever change, I want it rebuilt. But I haven't really told it how to generate this plot-data.png, so I might want a rule for
that as well. So I'm gonna define another target here, and it's gonna look like this: plot-%.png. % in make is a wildcard that matches any string, but the cool thing is that you can repeat this pattern in the dependencies. So I can say that plot-%.png is going to depend on %.dat — .dat is a common suffix for data files — and it's also going to depend on some script that's gonna actually do the plotting for me. Then there are the rules for how to go from one to the other; these can be multiple lines, but in my particular case it's just one line, and I'll explain what it is in a second. Here we're saying that, in order to go from a .dat file matching the wildcard in the target, plus a plot.py Python file, you run the Python file with -i, which is going to be the way we mark what the input is in our Python file — I'll show it to you later. $* is a special variable that is defined for you in Makefile rules and expands to whatever the % matched; so if I build plot-foo.png, then it's going to look for foo.dat, and $* expands to foo. This will produce the same file name as the one we matched here. And $@ is a special variable that means the name of the target — so, the output file — and hopefully what plot.py will do is take whatever the data is here, produce a PNG somehow, and write it into the file indicated by $@. All right, so now we have a Makefile. Let's see what happens
if the only file in this directory is the Makefile and we run make. It says "No rule to make target 'paper.tex', needed by 'paper.pdf'. Stop." What it's saying here is: first of all, it looked at the first rule of our file, the first target — when you give make no arguments, it tries to build whatever the first target is; this is known as the default goal. So in this case it tried to helpfully build paper.pdf for us, and then it looked up the dependencies and said: well, in order to build paper.pdf I need paper.tex and I need this PNG file, and I can't find paper.tex, and I don't have a rule for generating paper.tex, and therefore I'm gonna exit; there's nothing more I can do.
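Putting together what the lecture has described so far, the Makefile looks something like this (reconstructed from the spoken description; recipe lines must be indented with a tab):

```make
paper.pdf: paper.tex plot-data.png
	pdflatex paper.tex

plot-%.png: %.dat plot.py
	./plot.py -i $*.dat -o $@
```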
So let's try to make some files here. Let's just make an empty paper.tex and then type make. Now it says "No rule to make target 'plot-data.png', needed by 'paper.pdf'", right? So now it knows it has one dependency, but it doesn't know how to get the other one: it has a target pattern that matches, but it can't actually find that target's dependencies, and so it ends up doing nothing at all. It still needs us to generate the input for the PNG. So let's
actually put some useful stuff into these files. Luckily, I have a plot.py from earlier that I can copy in here. This TeX file is what TeX looks like — it's not very pretty — but see, I'm defining an empty document, and I'm using \includegraphics, which is the way you include an image file, to include plot-data.png; this is of course why we want paper.pdf to depend on the PNG file. plot.py is also not very interesting: it just imports a bunch of libraries, it parses the -i and -o arguments, it loads data from the -i argument, it uses a library called matplotlib, which is very handy for quickly plotting data, and it plots the first column of the data as x's and the second column as y's. So we're just going to have a data file with two columns, x and y, on every line, and then it saves that as a figure into whatever the given -o value is. Okay, so we need a data file, and it's going to be data.dat, because we want plot-data.png, and our rule said that the way you go from that pattern to the .dat file is by whatever follows plot-; so if we want plot-data, then we want data.dat. In this file we're just gonna put some linear coordinates, because why not... okay, that's not linear. All right, and now what happens if we run make? Well...
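For reference, a plot.py along the lines just described might look like the sketch below. The -i/-o flags, the two-column data format, and the use of matplotlib are from the lecture; everything else (function names, argparse usage) is guessed:

```python
#!/usr/bin/env python
# Sketch of plot.py: read two-column data from -i, plot it, save to -o.
import argparse

def load_xy(path):
    """Parse a data file with one "x y" pair per line."""
    xs, ys = [], []
    with open(path) as f:
        for line in f:
            x, y = line.split()
            xs.append(float(x))
            ys.append(float(y))
    return xs, ys

def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", dest="infile", required=True)
    parser.add_argument("-o", dest="outfile", required=True)
    args = parser.parse_args(argv)
    xs, ys = load_xy(args.infile)
    # matplotlib does the actual drawing; make passes the output file
    # name in as $@, which ends up in args.outfile.
    import matplotlib
    matplotlib.use("Agg")  # render without a display
    import matplotlib.pyplot as plt
    plt.plot(xs, ys)
    plt.savefig(args.outfile)

# Invoked by make as: ./plot.py -i foo.dat -o plot-foo.png
```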
Okay, so what just happened? Well, make first ran plot.py with the correct files to generate the PNG file, and then it ran pdflatex paper.tex, and all the stuff we see below is the output from that tool. If we wanted to, we could silence the output from these tools so they don't mess with all our output; but in general you notice that it ran the two commands — perhaps unsurprisingly — in the right order. And if we now do ls in the current directory, we see a bunch of files that were generated by pdflatex, but in particular we have the PNG file, which was generated, and we have paper.pdf. If we open the paper.pdf file, we see that it has one image, with the straight line — in and of itself not a very surprising or interesting result. But where this gets really handy is that I can do things like this: if I type make again, make just says "paper.pdf is up to date" and does no work. Whenever you run make, it tries to do the minimal amount of work needed to produce whatever you asked for; in this case none of the dependencies have changed, so there's no reason to rebuild the paper or the plot. If I now, say, edit paper.tex — I'm gonna add "hello" here — and rerun make, then if we scroll up we'll see it didn't run plot.py again, because it didn't need to; none of its dependencies changed. But it did run pdflatex again, and indeed if we open the paper it now says hello over there. On the other hand, if I change, say, the data file — make this point an 8 — and rerun make, then now it plots again, because the data changed, and it regenerates the PDF, because the plot changed, and indeed the paper turns out the way we expected. That's not to say that this particular pipeline is very interesting, because it's not; it's
only two very straightforward targets and rules. But this can come in really handy when you start building larger pieces of software, where there might be many dependencies. You might even imagine that, if you're writing a paper, one of your targets might be producing this data file in the first place — so one of the Makefile targets might be "run my experiment": run my benchmark, stick the data points that come out into this file, then plot the results, and then, and then, and then, all the way until you end up with a final paper. What's nice about this is, first of all, you don't have to remember all the commands to run or write them down anywhere; but also, the tool takes care of doing the minimal amount of work needed. Often you'll find there are subcommands to make, like make test, which compiles your entire piece of software and also runs the tests; there might be things like make release, which builds with optimizations turned on, creates a tarball, and uploads it somewhere — it does the whole pipeline for you. The idea is to reduce the effort that you have to expend on any part of your build process. Now, what we saw here was a very
straightforward example of dependencies, right? We saw that you can declare files as dependencies, but you can also declare sort of transitive dependencies: I depend on this thing, which is generated by this other target. Very often, when you work with dependencies in the larger world of software, you'll find that your system ends up having many different types of dependencies. Some of these are files, like we saw here; some are programs — this example sort of implicitly depends on Python being installed on my machine; some might be libraries, like matplotlib, which we depend on here; and some might be system libraries, like OpenSSL or OpenSSH, or low-level crypto libraries. And you don't necessarily declare all of them; very often there's an assumption about what is installed on the given system. What you'll find is that, for most places where you have dependencies, there are tools for managing those dependencies for you, and very often the things you might depend on are stored in what are known as repositories. A repository is just a collection of things, usually related, that you can install — that's basically all a repository is — and you might be
familiar with some of them already. Some examples of repositories are PyPI, which is a well-known repository for Python packages; RubyGems, which is similar for Ruby; crates.io for Rust; and npm for Node.js. But other things have repositories too, right? There are repositories for cryptographic keys, like Keybase, and there are repositories for system-installed packages: if you've ever used the apt tool in Ubuntu or Debian, you are interacting with a package repository where people who have written programs and libraries upload them so that you can then install them. Similarly, some repositories are entirely open: the Ubuntu repositories, for example, are usually curated by the Ubuntu developers, but in Arch Linux there is something called the Arch User Repository, where users can just share their own libraries and packages themselves. Very often, repositories are either sort of managed or entirely open, and you should be aware of which, because if you're using an entirely open repository, the security guarantees you get may be weaker than what you get from a controlled repository. One thing you'll notice if you start using repositories
is that very often software is versioned. And what do I mean by version? Well, you might have seen this for stuff like browsers, where there might be something like Chrome version 64.0.20190324 — this is a version number; there are dots in it; this is one kind of version number. But sometimes, if you start, I don't know, Photoshop or any other tool, there might be other kinds of versions, like 8.1.7. These version numbers are usually numerical, but not always; sometimes they have hashes in them, for example to refer to git commits. But you might wonder: why do we have these? Why is it even important to add a number to software that you release? The primary reason is that it enables me to know whether my software would break. Imagine that I have a dependency on a library that Jose has written, and Jose is constantly making changes to his library because he wants to make it better, and he decides that one of the functions his library exposes has a bad name, so he renames it. My software suddenly stops working, because my library calls a function in Jose's library that no longer exists — depending on which version of Jose's library people have installed. Versioning helps, because I can say I depend on this version of Jose's library, and there have to be some rules around what Jose is allowed to do within a given version: if he makes a change that I can no longer rely on, his version has to change in some way. There are many thoughts on exactly how
this should work — what the rules for publishing new versions are, and how the version numbers change. Some of them are just dictated by time. For example, if you look at browsers, they very often have time-based versions that look like this: a version number on the far left that's just which release it is, then sort of an incremental number that is usually zero, and then a date at the end — so this is March 24th, 2019, for some reason — and usually that indicates this is version 64 of Firefox from this date. Then if they release patches or hotfixes for security bugs, they might increment the date but keep the version on the left the same. People have strong, strong opinions on exactly what the scheme should be, and you sort of depend on knowing what schemes other people use: if I don't know what scheme Jose is using for changing his versions, maybe I just have to say you have to run exactly 8.1.7 of Jose's software, otherwise I cannot build mine. But this is a problem, too, right? Imagine that Jose, as a responsible developer of his library, finds a security bug and fixes it, but it doesn't change the external interface of the library — no functions changed, no types changed. Then I want people to be building my software with his new version, and it just so happens that building mine works just fine with his new version, because that particular release didn't change anything I depended on. So one attempted
solution to this is something called semantic versioning. In semantic versioning, we give each of the numbers separated by dots in a version number a particular meaning, and we give a contract for when you have to increment the different numbers. In particular, we call the first the major version, the second the minor version, and the third the patch version, and the rules around them are as follows. If you make a change to your software, and the change is entirely backwards compatible — it does not add anything, it does not remove anything, it does not rename anything externally; it is as if nothing changed — then you only increment the patch number, nothing else. So security fixes, for example, will usually increment the patch number. If you add something to your library — I'm just gonna call them libraries, because libraries are usually where this matters — you increment the minor version and set the patch to zero; so in this case, if we were to do a minor release, the next version number would be 8.2.0. The reason we do this is that I might have a dependency on a feature Jose added in 8.2.0, which means you can't build my software with 8.1.7 — that would not be okay. Even though, had I written it against 8.1.7, you could run it with 8.2.0, the reverse is not true, because the feature might not have been added yet. And finally, the major version you increment if you make a backwards-incompatible change: if my software used to work with whatever version you had, and you make a change that means my software might no longer work — such as removing a function or renaming it — then you increment the major version and set minor and patch to zero. So the next major version here would be 9.0.0. Taken together, these allow us to do really nice things when
setting what our dependencies are. In particular, if I depend on a particular version of someone's library, rather than saying it has to be exactly this version, what I'm really saying is: it has to be the same major version, at least the same minor version, and the patch can be whatever. This means that if I have a dependency on Jose's software, then any later release that is still within the same major is fine — and that includes, keep in mind, an earlier version, assuming the minor is the same. Imagine you're on some older computer that has version 8.1.3: in theory, my software should work just fine with 8.1.3 as well. It might have whatever bug Jose fixed in between — whatever security issue — but this has the nice property that you can now share dependencies between many different pieces of software on your machine. If you have version 8.3.0 installed, and there's a bunch of different software where one requires 8.1.7, one requires 8.2.4, and one requires 8.0.1, all of them can use the same installed copy of that dependency; you only need it once. One of the most familiar examples of this kind of semantic versioning is Python's versioning. Many of you may have come across the fact that Python 3 and Python 2 are not compatible with one another — they're not backwards compatible. If you write code in Python 2 and try to run it in Python 3, it might not work; there are some cases where it will, but that is more accidental than anything else. And Python actually follows semantic versioning, at least mostly: if you write software that runs on Python 3.5, it should also work on 3.6, 3.7, and 3.8. It will not necessarily work on Python 4, although that will hopefully be a long time away, and code written for Python 3.5 will possibly not run on Python 3.4. One thing you will see many software projects do is try to bring their version requirements as low as possible: if you can depend on a major version with minor and patch at zero, that is the best possible dependency you can have, because it is completely liberal as to which version of that major you're depending on. Sometimes this is hard — sometimes you genuinely need a feature that was added later — but the lower you can get it, the better it is for those who want to depend on your software in turn.
When working with these sorts of dependency management systems, or with versioning in general, you'll often come across the notion of lock files. You might have seen this where you try to do something and it says "cannot reconcile versions", or you get an error like "lock file already exists" — these are often somewhat different topics, but in general the point of a lock file is to make sure you don't accidentally update something. The lock file, at its core, is really just a list of your dependencies and which version of each you are currently using. So my version requirement might be 8.1.7, and the latest version on the internet somewhere might be 8.3.0, but whatever is installed on my system is not necessarily one of those two — it might be 8.2.4 or something like that — and the lock file will then say: dependency Jose, version 8.2.4. There can be many reasons to want a lock file. One is that you might want your builds to be fast: if every single time you try to build your project, whatever tool you're using downloaded the latest version of every dependency and then compiled it and then compiled your thing, you might wait a really long time each time, depending on the release cycle of your dependencies. If you use a lock file, then unless you've updated the version in your lock file, it'll just use whatever it built previously for that dependency, and your development cycle can be a lot faster. Another reason to use lock files is to get reproducible builds. Imagine that I produce some kind of security-related software, I very carefully audit my dependencies, and I produce a signed binary — here is a sworn statement from me that this version is secure. If I didn't include a lock file, then by the time someone else installs my program, they might get a later version of a dependency, and maybe that later version has been hacked somehow, or just has some other security vulnerability that I haven't had a chance to look at yet. A lock file basically allows me to freeze the ecosystem as of the versions that I have checked. The extreme version of this is something called vendoring. When you vendor your dependencies, it really just means you copy-paste them: vendoring means taking whatever dependency you care about and copying it into your project, because that way you are entirely sure you will get that version of that dependency. It also means you can make modifications to it on your own, but it has the downside that you no longer get the benefits of versioning: you no longer have the advantage that, if there are newer releases of that software, your users might get them automatically — like, for example, when Jose fixes his security issues (not that he has any, of course). One
thing you'll notice is that, in talking about this, I've been talking about bigger processes around your systems: things like testing, things like checking your dependency versions, and things like setting up build systems. Often you don't just want a local build system; you want a build process that includes other types of systems, or you want it to run even when your computer is not necessarily on. This is why, as you start working on larger and larger projects, you will see people use this idea of continuous integration. Continuous integration systems are essentially a cloud build system: the idea is that you have your project stored on the internet somewhere, and you have set it up with some kind of service that runs things on an ongoing basis for your project, whatever that might be. Continuous integration can be all sorts of stuff: releasing your library to PyPI automatically whenever you push to a particular branch, running your test suite whenever someone submits a pull request, or checking your code style every time you commit. There are all sorts of things you can do with continuous integration, and the easiest way to think about them is as event-triggered actions: whenever a particular event happens for your project, a particular action takes place, where the action is usually some kind of script — some sequence of programs that get invoked and do something. This is really an umbrella term that encapsulates a lot of different types of services. Some continuous integration services are very general: Travis CI, Azure Pipelines, and GitHub Actions are all very broad CI platforms, built to let you express what you want to happen whenever any event you define happens. There are also more specialized systems that deal with things like coverage testing — annotating your code and showing you that you have no tests covering this piece of code — and are built only for that purpose, or only for testing browser-based libraries, or something like that. So often you can find CI tools built for the particular kind of project you're working on, or you can use one of the broader providers. One thing that's nice is that many of them are actually free, especially for open source software, and if you're a student you can often get them for free as well. In general, the way you use a CI system is that you add a file to your repository, often known as a recipe, and what the recipe specifies is this sort of dependency structure again — sort of like what we saw with Makefiles. It's not quite the same: the events, instead of being files, might be something like "when someone pushes a commit", or "when a commit contains a particular message", or "when someone submits a pull request", or just continuously. One example of a continuous integration service that's not tied to any particular change to
your code is something called the dependable you can find this on github and the
dependent bots is something that you hook up to your your repository and it will just scan whether there are newer
versions available of your dependencies that you're not using so for example if I was depending on eight one seven and I
had a lock file that locked it to eight two four and then eight three zero is released the dependable will go you
should update your log file and then submit the pull request to your repository with that update this is a
continuous integration service it's not tied to me changing anything but to the ecosystem at large changing often these
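As a rough sketch of what such a recipe might look like, here is a hypothetical GitHub Actions workflow; the file path, job name, and the `make test` target are illustrative assumptions, not from the lecture:

```yaml
# .github/workflows/test.yml (hypothetical recipe)
# The "event" is any push or pull request; the "action" is running the tests.
name: test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # fetch the repository contents
      - run: make test              # assumes the Makefile has a `test` target
```

The same event/action shape covers most of the uses mentioned here: swap the `run` step for a linter, a deploy script, or a documentation build.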
Often, these CI systems integrate back into your project as well. Very often these CI services provide things like little badges. Let me give an example: here's a project I've worked on recently that has continuous integration set up. If you look at its README (if I can zoom in here... nope, that's much larger than I wanted), you'll see that at the top of the repository's page there are a bunch of these badges, and they display various types of information. You'll notice that I have Dependabot running, so the dependencies are currently up to date; it tells me whether the test suite is currently passing on the master branch; it tells me how much of the code is covered by tests; and it tells me what the latest version of this library is, and where the latest version of the library's documentation is available online. All of these are managed by various continuous integration services.
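Badges are usually just Markdown image links at the top of the README, pointing at an image URL that the CI service keeps up to date. A sketch, with hypothetical user/repo names and services:

```markdown
[![build](https://github.com/user/repo/workflows/test/badge.svg)](https://github.com/user/repo/actions)
[![coverage](https://codecov.io/gh/user/repo/branch/master/graph/badge.svg)](https://codecov.io/gh/user/repo)
```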
Another example, which some of you might find useful or might even be familiar with, is the notion of GitHub Pages. GitHub Pages is a really nice service that GitHub provides which lets you set up a CI action that builds your repository as a blog, essentially. It runs a static site generator called Jekyll, and Jekyll just takes a bunch of Markdown files and produces a complete website. As a part of GitHub Pages, it will also upload that website to GitHub's servers and make it available at a particular domain. This is actually how the class website works: the class website is not a bunch of HTML pages that we manage. Instead, there's a repository, missing-semester, and if you look at the missing-semester repository you will see, if I zoom out a little here, that it just has a bunch of Markdown files. It has the 2020 metaprogramming.md file, for example; if I go to "raw" here, this is the raw Markdown for today's lecture. This is the way that I write the lecture notes: I commit them to the repository we have, and I push, and whenever a push happens, the GitHub Pages CI runs the build script for GitHub Pages and produces the website for our class, without me having to do any additional steps to make that happen. Sorry, go ahead. Yeah, so, Jekyll: it's a tool that takes a directory structure that contains Markdown files and produces a website; it produces HTML files. Then, as a part of the action, it takes those files and uploads them to GitHub's servers at a particular domain, usually a domain under github.io that they control, and then I have set missing-semester to point at that GitHub domain. I want to give you one aside on testing, because it's something
that many of you may be familiar with from before: you have a rough idea of what testing is, you've run tests before, you've seen a test fail, you know the basics of it. (Or maybe you've never seen a test fail, in which case: congratulations.) As you get to more advanced projects, though, you'll find that people have a lot of terminology around testing, and testing is a pretty deep subject that you could spend many, many hours trying to understand the ins and outs of. I'm not going to go through it in excruciating detail, but there are a couple of words that I think it's useful to know the meaning of. The first of these is a test suite. A test suite is a very straightforward name for all of the tests in a program: it's really just a suite of tests, a large collection of tests that are usually run as a unit. There are different types of tests that often make up a test suite.
The first of these is what's known as a unit test. A unit test is usually a fairly small, self-contained test that tests a single feature. What exactly a "feature" means is a little bit up to the project, but the idea is that it should be a sort of micro-test that only tests one very particular thing. Then you have the larger tests that are known as integration tests. Integration tests try to test the interaction between different subsystems of a program. For example, if you're writing an HTML parser, a unit test might be "test that it can parse an HTML tag", while an integration test might be "here's an HTML document; parse it". That tests the integration of multiple of the subsystems of the parser.
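To make the distinction concrete, here is a sketch in Python; the toy tag parser and both tests are invented for illustration, not taken from any real project:

```python
import re

def parse_tag(s):
    """Parse a single opening tag like '<p>' and return its name."""
    m = re.fullmatch(r"<\s*([a-zA-Z][a-zA-Z0-9]*)\s*>", s)
    if m is None:
        raise ValueError(f"not a tag: {s!r}")
    return m.group(1)

def parse_document(doc):
    """Parse a whole document into a list of opening-tag names (toy version)."""
    return [parse_tag(t) for t in re.findall(r"<[^/>]+>", doc)]

# Unit test: one small, self-contained feature.
def test_parse_single_tag():
    assert parse_tag("<p>") == "p"

# Integration test: multiple subsystems (tokenizing + tag parsing) together.
def test_parse_whole_document():
    assert parse_document("<html><body></body></html>") == ["html", "body"]

test_parse_single_tag()
test_parse_whole_document()
```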
You also have a notion of regression tests. Regression tests are tests that test things that were broken in the past. Imagine that someone submits an issue to you and says "your library breaks if I give it a marquee tag", and that makes you sad, so you want to fix it. You fix your parser to now support marquee tags, but then you want to add a test to your test suite that checks that you can parse marquee tags. The reason for this is so that in the future you don't accidentally reintroduce that bug. That is what regression tests are for, and over time your project is going to build up more and more of these, and they're nice because they prevent your project from
regressing to earlier bugs. The last one I want to mention is a concept called mocking. Mocking is the idea of being able to replace parts of your system with a sort of dummy version of itself that behaves in a way that you control. A common example of this: say you're writing a tool that does, oh I don't know, file copying over SSH. There are many things you might want to mock here. For example, when running your test suite you probably don't actually care that there's a real network there; you don't want to have to set up TCP ports and stuff, so instead you're going to mock the network. The way this usually works is that, somewhere in your library, you have something that opens a connection, or reads from the connection, or writes to the connection, and you're going to overwrite those functions, internally in your library, with functions that you've written just for the purposes of testing, where the read function just returns canned data, and the write function just drops the data on the floor, or something like that. Similarly, you can write a mock for the SSH functionality. You could write something that does not actually do encryption and doesn't talk to the network: it just takes bytes in on one side and they magically pop out the other side, and you can ignore everything that's in between, because for the purpose of testing the file-copying functionality, the stuff below doesn't matter, and you might mock it away. Usually, in any given language, there are tools that let you build these kinds of mocking abstractions pretty easily.
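As a sketch of this idea in Python, using the standard library's `unittest.mock` (the `Connection` class and `copy_over_network` function are hypothetical stand-ins for the real tool):

```python
from unittest import mock

class Connection:
    """Hypothetical network connection the tool normally uses."""
    def read(self):
        raise RuntimeError("would touch the real network")
    def write(self, data):
        raise RuntimeError("would touch the real network")

def copy_over_network(conn):
    """Toy 'file copy': read bytes from one end, write them to the other."""
    data = conn.read()
    conn.write(data)
    return len(data)

conn = Connection()
# Overwrite read/write with controlled dummies for the duration of the test:
# the mocked read returns canned data; the mocked write just records its input.
with mock.patch.object(Connection, "read", return_value=b"hello"), \
     mock.patch.object(Connection, "write") as fake_write:
    copied = copy_over_network(conn)

assert copied == 5
fake_write.assert_called_once_with(b"hello")
```

The code under test never changes; only the environment it runs in does, which is exactly the point of mocking.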
That is the end of what I wanted to talk about for metaprogramming, but this is a very, very broad subject. For things like continuous integration and build systems, there are so many tools out there that can let you do so many interesting things with your projects, so I highly recommend that you start looking into them a little. The exercises are sort of all over the place, and I mean that in a good way: they're intended to show you the kinds of possibilities that exist for working with these kinds of processes. For example, the last exercise has you write one of these continuous integration actions yourself, where you decide what the event is and you decide what the action is, but try to actually build one; this can be something that you might find useful in your own projects. The example I give in the exercises is to try to build an action that runs write-good or proselint (linters for the English language) on your repository, and if you do, we could enable it for the class repository so that our lecture notes are actually well written. This is one other thing that's nice about this kind of continuous integration tooling: you can collaborate between projects. If you write an action, I can use it in my project, and in this way you can build an ecosystem of improving everything. Any questions about any of the stuff we covered today? Yeah, so the
question is: why do we have both make and CMake, what do they do, and is there a reason for them to work together? So, CMake: I don't actually know what the tagline for CMake is anymore, but it's sort of like a better make for C. As the name implies, CMake generally understands the layout of C projects a little bit better than Makefiles do. It's built to try to work out what the structure of your dependencies is and what the rules for going from one to the other are. It also integrates a little bit more nicely with things like system libraries: CMake can do things like detect whether a given library is available on your computer, and if it's available at multiple different paths, it tries to find which of those paths it's present at on this system and then link it appropriately. So CMake is a little bit smarter than make is; make will only do whatever you put in the Makefile. (Not entirely true: there are things called implicit rules, which are built-in rules in make, but they're pretty dumb.) CMake, by contrast, tries to be a larger build system that is opinionated by default to work for C projects. Similarly, there's a tool called Maven; Maven and Ant, which is another project, are both built for Java projects. They understand how Java code interacts, how you structure Java programs, and they're built for that task.
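As a sketch of the kind of thing CMake handles for you, here is a minimal, hypothetical CMakeLists.txt; `find_package` is what locates a library wherever it happens to live on the system:

```cmake
cmake_minimum_required(VERSION 3.10)
project(example C)

add_executable(example main.c)

# CMake detects whether zlib is available and where, then links it in.
find_package(ZLIB REQUIRED)
target_link_libraries(example PRIVATE ZLIB::ZLIB)
```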
Very often, at least when I use make, I use make sort of at the top, and then my Makefile calls other tools that build whatever subsystem they know how to build. My Makefile might call cargo to build a Rust program, and then call CMake to build some C dependency of that, but then at the top I'm going to do some stuff at the end after the programs have been built; that might just be "run a benchmark", which is in the Rust code, and then plot it using the C code, or something like that. So for me, make is sort of the glue at the top that I might write. Usually, if your Makefile gets very large, there's a better tool. What you'll find at big companies, for example, is that they often have one build system that manages all of their software. If you look at Google, for example, they have this open-source system called Bazel. I don't think Google literally uses Bazel inside of Google, but it's sort of based on what they use internally, and it's really intended to manage the entire build of everything Google has. Bazel in particular is built to be what I think they call a "polyglot" build framework: the idea is that it works for many different languages. There's a module for Bazel for this language, and that language, and that language, but they all integrate with the same Bazel framework, which then knows how to integrate dependencies between different libraries and different languages. Another question? Sure.
So when you say "expressions", you mean the things in this file? Yeah, so: Makefiles are their own language, and it's a pretty weird language. It has a lot of weird exceptions; in many ways it's weird just like bash is weird, but in different ways, which is even worse, because when you're writing a Makefile you can sort of think you're writing bash, but you're not, since it's broken in different ways. But it is its own language, and the way Makefiles are generally structured is that you have a sequence of, I think they're called, directives. This thing here is a directive, and this is a directive, and every directive has a colon somewhere. Everything to the left of the colon is a target, everything to the right of the colon is a dependency, and all of the lines below that line are the sequence of operations, known as the rule, for how, once you have the dependencies, you build the target. Notice that make is very particular that you must use a tab to indent the rules; if you do not, make will not work. They must be tabs; they cannot be four or eight spaces, they must be tabs. And you can have multiple operations here: I can add "echo hello" or whatever, and then make would first run the first command and then run that one. There's an exercise for today's lecture that has you try to extend this Makefile with a couple of other targets that you might find interesting; it goes into a little bit more detail. There's also some ability to execute external commands to determine what the dependencies might be, if your dependencies are not a static list of files, but it's a little limited. Usually, once you've started needing that sort of stuff, you might want to move to a more advanced build system.
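The structure just described (target, colon, dependencies, then tab-indented rule lines) looks roughly like this; the file names echo the paper-building example from earlier in the lecture, and the exact commands are illustrative:

```make
# target: dependencies
#     rule lines (each MUST begin with a tab, not spaces)
paper.pdf: paper.tex plot.png
	pdflatex paper.tex
	echo hello

plot.png: plot.py data.dat
	./plot.py -i data.dat -o plot.png
```

Running `make paper.pdf` rebuilds plot.png first if data.dat or plot.py changed, and skips both steps entirely if nothing did.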
Yeah, so the question is: what happens if, let's say, I have library A and library B, and they both depend on library C, but library A depends on, like, 4.0.1 and library B depends on 3.4.7? They both depend on C, so ideally we'd like to reuse C, but they depend on different major versions of C. What do we do? Well, what happens in this case depends entirely on the system that you're using and the language that you're using. In some cases the tool will just say "well, I'll just pick 4", which sort of implies that they're not really using semantic versioning. In some cases the tool will say this is not possible: if you do this, it's an error, and the tool will tell you that you either need to upgrade B (have B use a newer version of C) or you need to downgrade A; you do not get to do this, and compilation will fail. Some tools are going to build two versions of C: when building A they will use the major-4 version of C, and when building B they will use the major-3 version of C. One thing you end up with there is really weird conditions, where if C has dependencies, then now you have to build all of C's dependencies twice, one set for 3 and one set for 4, and maybe they share and maybe they don't. You can end up in particularly weird situations: imagine that library C writes to some file on disk, caching stuff, say. If you run your application now, and A does something that calls C.save, and B does something that calls C.load, then suddenly your application at the bottom is not going to work, because the format is different. So these situations are often very problematic, and most tools that support semantic versioning will reject this kind of configuration for exactly that reason: it's just so easy to shoot yourself in the foot.
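To see why "just pick 4" violates semantic versioning, here is a pure-Python sketch of a caret-style compatibility rule (same major version, and no older than the requested version); no real package manager is involved:

```python
def parse(version):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))

def compatible(required, available):
    """Caret-style rule: same major version, and at least the required version."""
    req, avail = parse(required), parse(available)
    return avail[0] == req[0] and avail >= req

# Library A wants C 4.0.1; library B wants C 3.4.7.
assert compatible("4.0.1", "4.0.1")
assert not compatible("3.4.7", "4.0.1")  # "just pick 4" breaks B
assert not compatible("4.0.1", "3.4.7")  # picking 3.x breaks A
assert compatible("3.4.7", "3.9.0")      # a newer 3.x would still satisfy B
```

Since no single version of C satisfies both constraints, a semver-respecting tool must either fail the build or build both majors side by side, with the shared-state hazards described above.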
All right, we will see you again tomorrow for security. Keep in mind, again, if you haven't done the survey: the question I care the most about in the survey is what you would like us to cover in the last two lectures. The last two lectures are for you to choose what you want us to talk about, and to ask any questions you want us to answer. So please add that if you can. And that's it; see you tomorrow!