And Understanding How Clojure Works

If you have been in a field tangential to Computer Science or Software Engineering for some time, you have more than likely heard some fairly passionate opinions about the functional programming language Lisp. Lisp (short for List Processing) is a functional programming language known for its fully parenthesized prefix notation and is the second-oldest high-level programming language still in use, Fortran being its senior [1]. Because of its age, Lisp now refers to a family of programming languages, called dialects, that have developed over the years. All of these languages, however, share the purpose of being a practical mathematical notation for computers and of making it easy to work with data in a list-like manner. Lisp dialects are used in many fields, such as Artificial Intelligence, and in databases at companies such as Walmart and Netflix. Even so, many individuals are turned away from Lisp languages in general because of their sharp contrast to Object Oriented Programming (OOP) languages such as Python and because of their steep learning curves [2]. However, there is one dialect which is an exception.

Clojure: What Is It and Why Should You Care?

Clojure, voted the second most liked language in 2021’s Stack Overflow Developer Survey, is a Java Virtual Machine-based language (JVM-based language) and “a dynamic, general-purpose programming language, combining the approachability and interactive development of a scripting language with an efficient and robust infrastructure for multithreaded programming” [3], [4]. Essentially, Clojure is everything great about Lisp without (most of) the worst parts. Rich Hickey, the creator of Clojure, aimed to make a programming language which could stand as a modern take on a Lisp dialect that was designed for concurrency [5], [6].

If you are a coder who wants to overcome a disgust/hatred for Lisp in general, wants to dip your toes into the world of functional programming languages, is learning AI and beginning to learn how to perform Evolutionary Local Search, or simply needs to learn Clojure for a class, you have come to the right place. In this post, we are going to brave the wild west of functional programming and Clojure. By the end of this post, you will understand the basic syntax of Clojure and how macros and functions operate in general, be able to read the documentation of the language, and have additional resources for testing out the simple yet powerful nature of Clojure.

How to Clojure

How to Set Up a Clojure Environment

Now this is a pretty demanding task itself. There are a lot of resources available on how to do this, so I will not waste your time. But, I will point you in the right direction and give you my personal preferences on what to use and how to use them.

The first things you need to do are install Java (if it is not already on your device) and Leiningen. Once you have those requirements, I highly recommend using VSCode as your development environment, as it has several well-supported, easy-to-install extensions which (if you want to continue your journey in Clojure into the distant future) can be extremely useful. If you decide to do so, you will want to install the Calva extension, which helps with creating Read-Eval-Print Loops (REPLs), used for in-the-moment testing of your code (more on that later), and gives you fancy rainbow-colored parentheses (which you will value IMMENSELY later on). Once you have done all of that, go ahead and open your terminal of choice and navigate to the directory where you like to play around with code on your device. Then run the following command:

lein new app project_title

while replacing ‘project_title’ with whatever you want to name it. This will create a new Clojure environment for you to test and play around with.

You are very close to writing your first line of code and executing it! Now we just have to learn the basics of executing our code.

Once you have created your Clojure project, open it in VSCode and open the ‘core.clj’ file found in the ‘src’ directory. Consider this file to be the main file of your project which will execute when you want to test the project as a whole. You should see something like this if you followed all of the steps above:

A directory tree can be seen on the left side of the screen with 'core.clj' selected as the current file to look at. In the center of the screen, automatically generated Clojure code is found which contains a Clojure main function which, when executed, prints "Hello, World!". — After completing the steps above, a variant of this view should be found in the generated source directory. This image is taken from the main file found in the source tree of the entire project (the main file that should be executed when executing the program). The window on the right is automatically generated sample Clojure code.

As you see, Leiningen was kind enough to already set the file up for us with a namespace declaration for your project (the ns line at the top; in my example, my project is called ‘clojure-noob’) and an example main function. Once you are here, press “Ctrl+Alt+C” then “Ctrl+Alt+J” on Windows or “Ctrl+Option+C” then “Ctrl+Option+J” on Mac to open a dropdown menu of project types to initiate (this may take a few seconds to load fully). Select “Leiningen” from the choices and you will see a separate window open.

At the top of the image, a window tab titled "output.calva-repl" is displayed. In the center of the screen, a REPL window shows the prompt "clj:clojure-noob.core:>" on lines 28, 41, and 43. Lines 29 through 40 contain messages indicating that the REPL session is being set up, with the first line reading "; Jacking in...". Generic REPL window navigation tips follow but are cut off in the image.
This is a sample REPL (Read-Eval-Print Loop) window immediately after beginning an instance. The prompt shows the namespace the REPL starts in, which is this project’s core namespace.

This is a REPL, also known as a Read-Eval-Print Loop. This is a programming environment where you can test and modify your Clojure code in real-time (despite Clojure being a compiled language). This is where any returned values or printed statements can be found from your code.

How to Comment and Print “Hello, World!”

There are two types of comments you can make in Clojure: the “traditional” way and the comment macro way. The traditional comment is what you are most likely used to in coding. Single-line comments begin with a semicolon; everything that comes after the semicolon on that line is considered a comment. Multi-line comments are typically written as a string (a docstring) placed at the start of a function definition, right after its name, or at the top of the file.

"I am a multi-line comment!
Can you tell that I am multi-lined?"

;; This is a single-line comment!
(-main) ; This is also a comment!
The string at the top is an example of a Clojure multi-line comment, while the text following the semicolons is an example of a single/in-line comment.

The comment macro way of commenting is extremely Clojure-like: you wrap executable code in a comment form, and Clojure skips that code when the file is loaded. You will use this to test and debug your code as you develop it, and you can leave it in your project when you are done to give examples of how to use an implementation. For now, we are going to use it to print the “Hello, World!” string.

(comment
  (print "Hello, World!")
  )
The *comment* form is used to contain the *print* call that follows it, which takes the string parameter “Hello, World!”

By typing out the above example (do not worry about understanding the syntax, we will go into that a bit later), you have successfully commented out a line of code which, when evaluated on its own, will print the string “Hello, World!” to your REPL. To test your code out, put your cursor immediately after the set of parentheses containing the segment of code you want to test (in this example, right after the closing parenthesis of the print call), then on both Windows and Mac press Ctrl+Enter (that is the keyboard shortcut for evaluating a segment of code in Clojure, which you will use all the time).

clj:clojure-noob.core:>
Hello, World!
nil
clj:clojure-noob.core:>
After running the code in the core.clj file, the REPL instance will have the string “Hello, World!” printed and return *nil*.
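As a small sketch of what the comment form does when a file is loaded as a whole: everything inside it is skipped, and the form itself evaluates to nil. The names comment-result and captured-output are mine, used only for illustration.

```clojure
;; The body of a comment form is never evaluated when the file is loaded.
(comment
  (print "Hello, World!"))

;; Binding the comment form shows that it evaluates to nil.
(def comment-result
  (comment (print "This is skipped")))

;; Capturing printed output shows that nothing inside it was printed.
(def captured-output
  (with-out-str
    (comment (print "This is skipped"))))

(println (nil? comment-result))     ; prints true
(println (empty? captured-output))  ; prints true
```

This is why the print call has to be evaluated on its own from the editor: loading the file alone would never run it.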

Congratulations! You have successfully created your first line of executable code in Clojure!

Basic Syntax

In the above example, you probably noticed that the way we used the print and comment functions is a little different from how functions are traditionally called in OOP languages. This is because Lisp code is written in prefix notation: the operator (i.e. ‘+’, ‘/’, ‘my_func’, etc.) comes first, followed by its parameters, all enclosed inside parentheses [13]. Written this way, the code mirrors its own Abstract Syntax Tree (AST): the operator is the parent node while everything else is a child node. This is the format everything in Lisp languages uses. For example, if you wanted to represent the C++ expression 2 * (5 + my_num), you would type out (* 2 (+ 5 my_num)).

An example of a function call within Clojure. The name of the function comes first as is shown by the yellow color then the parameters for that function follow.
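A minimal sketch of the prefix notation described above, assuming a hypothetical value of 4 for my-num (written with a hyphen, as Clojure names usually are):

```clojure
;; my-num is a hypothetical value chosen for this example.
(def my-num 4)

;; C++: 2 * (5 + my_num)
;; The inner form (+ 5 my-num) is evaluated first, then multiplied by 2.
(def product (* 2 (+ 5 my-num)))

;; Because the operator comes first, it can take any number of arguments.
(def total (+ 1 2 3 4))

(println product) ; prints 18
(println total)   ; prints 10
```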

Immutability of All Data

One of the many things that is off-putting to many individuals when learning Clojure is that pliable variables/data structures do not exist—at least, not in the OOP sense. Let’s look at C++ code for a moment to understand how variables work in Clojure.

int main() {
    int my_int = 7;
    const int my_const_int = 8;
    return 0;
}
Example C++ code where two variables are declared. The first variable *my_int* is the integer 7 and is not constant while the second variable *my_const_int* is the integer 8 and is constant.

In the above example, I have declared two variables. The first one ‘my_int’ is what we are all used to in OOP languages. The second one, ‘my_const_int’ is what we know as a constant or immutable variable in OOP. Therefore, we can manipulate what value is being held in ‘my_int’ but not in ‘my_const_int’.

// This is a valid line of code.
my_int++;
// This; not so much.
my_const_int++;
Following from the code above, the first two lines show a valid line of C++ code which could be written while the last two lines show an example of an invalid line of code. The first line is valid because *my_int* is not a constant variable while *my_const_int* is. This program will not compile.

But, if we want to use ‘my_const_int’ when making calculations or when declaring a new, different-valued variable, we can use it as long as we do not attempt to manipulate what the original value of ‘my_const_int’ is.

// These are completely valid moves with a constant integer variable
cout << my_const_int * my_const_int << endl;
const int my_int_squared = my_const_int * my_const_int;
Lines 2 and 3 show valid C++ code that performs an operation with the constant variable found in the code above. This program will compile.

A constant variable in C++ is how every piece of data behaves in Clojure. Every. Piece. Of. Data. This also extends to data structures like Linked Lists: once you have bound a Linked List to store data out of order, that is it! It cannot change what it stores. However, you can use that Linked List to create a new version of it that is sorted and goes by a different name or is returned for later use (more on returning things later).
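As a rough sketch of the list example above (the binding names are mine, for illustration only), sorting or adding to a list produces a new value and leaves the original exactly as it was:

```clojure
;; A list bound out of order.
(def unsorted-numbers '(3 1 2))

;; sort returns a new, sorted sequence; the original is untouched.
(def sorted-numbers (sort unsorted-numbers))

;; conj also returns a new list instead of changing the old one.
(def with-zero (conj unsorted-numbers 0))

(println unsorted-numbers) ; prints (3 1 2)
(println sorted-numbers)   ; prints (1 2 3)
(println with-zero)        ; prints (0 3 1 2)
```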

Why is Immutability of All Data a Good Thing?

Why would we want all of our data to work like constants do in C++? Because of the confidence it grants us in using our data. We no longer need to worry whether a function changed one of our parameters via a reference, whether a class attribute is still what we declared it to be when constructing it, whether an operation we call on data ruins (“trashes”) what was there, and so on. These are all benefits that, without immutable data, we would never have in Clojure. So, while it feels weird and like a restriction at first, I promise that after some time in the language, and after seeing how things like if-statements, looping, and variables work, this fact will give you greater confidence in performing operations on any data you can see.

How to Set “Variables”

While Clojure does not have variables in the sense we are used to in OOP, named values are still very useful to have in complex code. There are two ways to declare “variables” in Clojure.

(def my-variable 100)
(let [my-other-variable 600]
  (print my-variable "\n")
  (print my-other-variable))
Two examples of variable bindings are displayed. The first variable, named *my-variable*, is an example of a global binding, while the second, named *my-other-variable*, is an example of a local binding. When this code is run, the numbers 100 and 600 will each be displayed on their own line in a REPL instance.

The first method is to define a global binding, as seen with ‘my-variable’. This uses def to bind the number 100 to ‘my-variable’. Again, this is a global binding: it belongs to the namespace it is defined in and can be used by every function in that namespace (and by other files that refer to that namespace), even if the def itself appears inside a specific local scope. The second method is a local binding. Using the macro let, inside of the square brackets, you can bind data to a name by using the format above. This binding, unlike the def binding, is only recognized in the scope of the let statement. Thus, ‘my-other-variable’ will only be recognized between the opening parenthesis just before let and its matching closing parenthesis. One important thing to note regarding bindings overall is that a name can be bound to just about anything that produces a value: the result of an if-statement, the result of a loop, a function; you name it, it can probably be bound.
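To make the difference in scope concrete, here is a minimal sketch. The function names read-bindings and shadowed-value are hypothetical helpers of my own, not part of Clojure.

```clojure
(def my-variable 100)

(defn read-bindings
  "Returns the global value and a let-bound local value in a vector."
  []
  (let [my-other-variable 600]
    ;; Both names are visible inside the body of the let.
    [my-variable my-other-variable]))

(defn shadowed-value
  "Returns a local value that temporarily hides the global name."
  []
  (let [my-variable 1]
    my-variable))

(println (read-bindings))  ; prints [100 600]
(println (shadowed-value)) ; prints 1
(println my-variable)      ; prints 100, the global binding is unchanged
```

Note that the local binding inside shadowed-value does not modify the global one; it only hides it until the let form ends.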

The Abstract Data Structure “Sequence”

Now that we can bind values to names/bindings, you may want to bind certain data structures for use in your programs, such as Huffman trees for file compression, priority queues for your AI uniform cost search algorithms, and so on. However, in Clojure, most of these data structures are not built in (remember, we are in a Functional Programming language, so the class-and-object style of building them that Object Oriented Programming relies on is not the default here). Instead, Clojure has a small handful of traditional data structures such as lists, vectors, hash-maps, and sets, plus an abstraction called a Sequence. A Sequence is an abstract data structure which follows three rules: 1) you can request the first piece of data from the structure (first), 2) once you have acquired that piece of data, you still have the rest of the data structure (rest), and 3) if you decide to add anything to the data structure, it will be added to the “front” of it (cons) [12]. Now, I know what you are thinking, “This does not seem very useful or optimal,” and, in the OOP sense, you would be right. But there are two major benefits of this data structure abstraction in Clojure: polymorphism and data structure auto-adjustment.
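The three rules can be tried directly in a REPL. This is a minimal sketch using first, rest, and cons from Clojure's core library; the binding names are mine, for illustration.

```clojure
(def my-list   '(1 2 3))
(def my-vector [1 2 3])
(def my-map    {:a 1})

;; The same three functions work on a list...
(println (first my-list))    ; prints 1
(println (rest my-list))     ; prints (2 3)
(println (cons 0 my-list))   ; prints (0 1 2 3)

;; ...and on a vector.
(println (first my-vector))  ; prints 1
(println (rest my-vector))   ; prints (2 3)
(println (cons 0 my-vector)) ; prints (0 1 2 3)

;; A map is treated as a sequence of key-value pairs.
(println (first my-map))     ; prints [:a 1]
(println (rest my-map))      ; prints ()
```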

Polymorphism is a coding concept you may already be familiar with from Object Oriented Programming: you can manipulate several data structures in similar ways through a single interface. In Clojure, this means you do not have to memorize a separate method or function for each structure just to access its first piece of data [13]. Data structure auto-adjustment is the idea that surprises most newcomers, although it is more modest than it may sound. Clojure does not watch how you use your data and swap in a different structure for you. What it does is convert whatever collection you hand to a sequence function into a sequence behind the scenes, so the same functions work on lists, vectors, maps, and sets alike [12]. Picking the structure that suits your access pattern is still your job. For example, if you keep checking whether certain numbers exist in a collection, put them in a set; if you keep needing entries in order of a weight used as the key, use a sorted map (which Clojure implements as a balanced tree under the hood, in Java).

All of this is possible because every one of these data structures can, in some way, obey the first, rest, cons properties of sequences. And the best part about these properties is that any new data structure you create in Clojure that supports them will work with every sequence function and gain the benefits that come with them. This holds for nested data as well (a vector of maps of lists of functions, for example), so you can benefit from the same handful of functions without having to think about what is stored where.
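As a hedged sketch of sequence functions working across collection types, and of picking a set or a sorted map explicitly when membership checks or ordered keys matter (all binding names here are mine, for illustration):

```clojure
;; The same sequence function works on a list and on a vector.
(println (map inc '(1 2 3))) ; prints (2 3 4)
(println (map inc [1 2 3]))  ; prints (2 3 4)

;; Nested collections work the same way: first of the first vector.
(def grid [[1 2] [3 4]])
(println (first (first grid))) ; prints 1

;; Membership checks: a set chosen explicitly.
(def seen-numbers #{4 8 15})
(println (contains? seen-numbers 8)) ; prints true

;; Ordered keys: a sorted map chosen explicitly, with weights as keys.
(def weighted (sorted-map 3 :heavy 1 :light 2 :medium))
(println (first weighted)) ; prints [1 :light]
(println (last weighted))  ; prints [3 :heavy]
```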

What to Return (Everything)

Because all data is immutable, this leads to an interesting dilemma: how do I write if-statements that set certain values, for-loops that sum all the entries in a list, code that sorts a list, and so on? The answer: everything returns something. You may have noticed when you printed “Hello, World!” that just below the printed string in your REPL was ‘nil’. This is because the function print returned the value ‘nil’ (equivalent to ‘None’ in Python) after performing some operations to print the text to the screen. Thus, an if-statement will similarly return a value rather than only executing lines of code when its condition is true. This means two things for you as of now: 1) a binding can be bound to anything as long as it returns something in some way (you will see an example of this in the section below), and 2) code needs to be re-evaluated in the REPL after edits are made. Remember the key shortcut you used to evaluate and print “Hello, World!”? That same key shortcut will need to be used every time you make an edit and want that edit to take effect. This is especially important to remember if you change anything inside of a function, as you will have to re-evaluate the whole function definition for the function to update.
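A minimal sketch of "everything returns something", using reduce as one possible way to sum a list without a mutable counter (the binding names are mine, for illustration):

```clojure
;; print does its work, then returns nil.
(def print-result (print "Hello, World!"))

;; Summing the entries of a collection without changing any of them.
(def total (reduce + [1 2 3 4]))

;; An if expression hands back a value that can be bound directly.
(def size-label (if (> total 5) "big" "small"))

(println)              ; ends the line started by print
(println print-result) ; prints nil
(println total)        ; prints 10
(println size-label)   ; prints big
```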

How If-Statements (and Basic Flow-Control Macros/Functions) Work

(let [foo 3
      if-statement-result (if (= foo 3)
                            5
                            foo)]
  (* foo if-statement-result)) ; => 15
Two local variable bindings are made so that *foo = 3* and *if-statement-result* contains the result of the if statement that follows it. As seen after evaluating the expression (indicated by the "=> 15" at the end of the last line), the result is the integer 15.

Take this snippet of code for example. Inside of this let statement, I have two local bindings I want to declare: ‘foo’ and ‘if-statement-result’. We know ‘foo’ will be bound to 3. However, ‘if-statement-result’ has an if-statement as its binding (which we know we can do from reading the “How to Set ‘Variables’” section of this blogpost). This means, based on what ‘foo’ is, the if-statement will return a certain value for ‘if-statement-result’ to be bound to. In this case, because ‘foo’ is 3, the first branch is evaluated, which returns 5. Thus, ‘if-statement-result’ is bound to 5. Therefore, in the main body of this let-statement, the result of multiplying ‘foo’ by ‘if-statement-result’ is 15, and the let-statement as a whole evaluates to 15. It is important to note that, unlike some high-level programming languages, Clojure is not whitespace sensitive: line breaks and indentation are there for readability. What does matter is the order of the forms inside the parentheses. Thus, when you look up a function or macro and its documentation shows certain parts on their own lines, that layout is a convention showing which argument goes where. The annoying part is that each function/macro expects its arguments in its own particular order and shape (if you want to make your head spin, just look at the argument layouts for the myriad of looping functions/macros in Clojure), so you will need to be extremely careful with your syntax in these cases. Also, if a function/macro (such as an if-statement) gives you space for only one expression in a branch and you need more than one to execute when that branch is hit, use do. It is extremely important, simple to use, and well documented [15].
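Here is a short sketch of do inside an if-statement. The function check-foo is a hypothetical helper of my own that mirrors the example above.

```clojure
(defn check-foo
  "Returns 5 when foo is 3, otherwise returns foo unchanged."
  [foo]
  (if (= foo 3)
    ;; do groups several expressions and returns the value of the last one.
    (do
      (println "foo is 3")
      5)
    foo))

(println (* 3 (check-foo 3))) ; prints "foo is 3", then 15
(println (check-foo 7))       ; prints 7
```

Only the value of the last expression inside do is returned; the earlier ones are there for their side effects, such as printing.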

Functions

According to Alan Perlis, “It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures” [14]. Clojure lives and breathes by this philosophy. Functions have a very simple setup to them.

(defn foo
  "This prints word and returns 5"
  [word]
  (print word)
  5)
An example of a Clojure function with a docstring which takes in a parameter which it prints then returns the integer 5.

Every function is surrounded by parentheses (a very common theme at this point), with the defn macro beginning the function, followed by the name of the function. The string after the name is an optional docstring describing the function. Inside of the square brackets is where the function’s parameters are declared (in this example, ‘word’ is the parameter for this function). Finally, anything after the square brackets is part of the body of the function. The reason there is no return statement in any function in Clojure is that a function returns the value of the last expression evaluated in its body.

(defn fii
  "Performs a series of calculations with number and returns nil"
  [number]
  (let [squared-number (* number number)
        half-squared (* squared-number 1/2)]
    half-squared)
  nil)
Another Clojure function which performs a series of complex computations but returns nil as it is the last piece of data referenced in the function.

In this example, the function fii will square a number given to it and divide the result by two. Thus, the let statement returns ‘number^2 / 2’. However, this is not what the function will return: ‘nil’ is the last expression in the function body; therefore, nil is returned instead. Remember, when you are done writing a function, you need to evaluate it for the changes to take effect.
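For contrast, here is a hedged sketch of the same calculation written so that the computed value is what comes back. The names half-square and half-square-then-nil are hypothetical, chosen for this example.

```clojure
(defn half-square
  "Returns half of number squared."
  [number]
  (let [squared-number (* number number)]
    ;; The let form is the last expression in the body, so its value is returned.
    (/ squared-number 2)))

(defn half-square-then-nil
  "Computes the same value but returns nil, because nil is the last expression."
  [number]
  (half-square number)
  nil)

(println (half-square 4))          ; prints 8
(println (half-square-then-nil 4)) ; prints nil
```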

Anonymous Function

Anonymous Functions act as stand-ins for helper functions. They are declared much like a regular function, but they start with fn instead of defn and do not need a name of their own. Instead, they can be given global or local bindings, which helps to eliminate possible clutter in your code.

(let [formula-for-apex-of-ball (fn
                                 [speed-of-ball]
                                 (/ speed-of-ball 9.81))]
  (formula-for-apex-of-ball 19.62)) ; => 2.0
An anonymous function is stored in the local binding *formula-for-apex-of-ball* and is used to calculate how long a ball thrown upwards at 19.62 m/s takes to reach its apex.

Take this segment of code for example. In our let statement, I have declared a binding called ‘formula-for-apex-of-ball’ which has an anonymous function bound to it. This function (assuming the speed of the ball is given in meters per second and the ball is thrown directly upwards) divides the speed by 9.81 m/s², which gives us the time in seconds the ball takes to reach the apex of its trajectory. This function only exists inside of this let statement and can only be used inside this let statement, however.
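Two more places an anonymous function can live, as a rough sketch: passed straight to another function, or given a global binding with def. The names squares and divide-by-gravity are mine, for illustration; the second one performs the same division as the example above.

```clojure
;; An anonymous function passed directly to map, never given a name.
(def squares (map (fn [n] (* n n)) [1 2 3]))

;; An anonymous function given a global binding with def.
(def divide-by-gravity
  (fn [speed-of-ball]
    (/ speed-of-ball 9.81)))

(println squares)                   ; prints (1 4 9)
(println (divide-by-gravity 19.62)) ; prints 2.0
```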

Additional Resources

As mentioned before, the aim of this blog post is to help you begin your journey into Clojure so you can read the documentation of other functions and understand how they are used and why things behave the way they do. This is only the beginning of Clojure. There are so many more things to dive into and utilize, such as Lazy Sequences, Tail Recursion, Namespaces, how basic macros such as for-loops function, and so much more. To learn more about Clojure, I recommend looking into the following resources.

Clojure for the Brave and True: A phenomenal free online book by Daniel Higginbotham which goes from the very beginning of Clojure to some of the more advanced topics. Picking up from Chapter 3 would be a continuation and further explanation of what has been covered in this post.
Clojure Cheat Sheet: A constantly updated list of all Clojure’s functions with documentation on how to use them.
The Official Clojure Website: A great place for anything related to Clojure.

If this post helped to get your foot in the door with regard to coding in Clojure, then using the above resources will have you well on your way to joining us Clojurists in our coding journeys. I would love to hear about your experiences with Clojure/Lisp languages or tips you may have regarding them. You can contact me at jgfrazie@hamilton.edu. Start memorizing some Clojure functions by heart, do not forget which set of parentheses you are in, and good luck!