# Why Does Julia Work So Well

There is an obvious to choose Julia:

it's faster than other scripting languages, allowing you to have the rapid development of Python/MATLAB/R while producing code that is as fast as C/Fortran

Newcomers to Julia might be a little wary of that statement.

1. Why not just make other scripting languages faster? If Julia can do it, why can't others?
2. How do you interpret Julia benchmarks to confirm this? (This is surprisingly difficult for many!)
3. That sounds like it violates the No-Free-Lunch heuristic. Is there really nothing lost?

Many people believe Julia is faster because it is Just-In-Time (JIT) compiled (i.e. every statement is run using compiled function which are either compiled right before they are used, or cached compilations from before). This leads to questions about what Julia gives over JIT'd implementations of Python/R (and MATLAB by default uses a JIT). These JIT compilers have been optimized for far longer than Julia, so why should we be crazy and believe that somehow Julia quickly out-optimized all of them? However, that is a complete misunderstanding of Julia. What I want show, in a very visual way, is that Julia is fast because of its design decisions. The core design decision, type-stability through specialization via multiple-dispatch is what allows Julia to be very easy for a compiler to make into efficient code, but also allow the code to be very concise and "look like a scripting language". This will lead to some very clear performance gains.

But what we will see in this example is that Julia dose not always act like other scripting languages. There are some "lunches lose" that we will have to understand. Understanding how this design effects the way you must code is crucial to producing efficient Julia code.

To see the difference, we only need to go as far as basic math.

## Arithmetic in Julia

In general, math in Julia looks the same as in other scripting languages. One detail to note is that numbers are "true number", as in a Float64 is truly the same thing as a 64-bit floating point number or a “double" in C.A Vector {Float64} is the same memory layout as an array of doubles in C, both making interop with C easy (indeed, in some sense "Julia is a layer on top of C") and it leads to high performance (the same is true for Numpy arrays).

Some math in Julia:

a = 2+2
b = a/3
c = a÷3 #\div tab completion, means integer division
d = 4*5
println([a;b;c;d])

output: [4.0, 1.33333, 1.0, 20.0]

Note here that I showed off Julia's unicode tab completion. Julia allows for unicode characters, and these can be used by tab completing Latex-like statements. Also, multiplication by a number is allowed without by a number is allowed without the $*$ if followed by a variable. For example, the following is allowed Julia code:

α = 0.5
∇f(u) = α*u; ∇f(2)
sin(2π)
output: -2.4492935982947064e-16

## Type-stability and Code Introspection

Type stability is the idea that there only 1 possible type which can be outputted from a method. For example, the reasonable type to output from *(::Float64, ::Float64) is a Float64. No matter what you give, it will spit out a Float64. This right here is multiple-dispatch: the $*$ operator calls different method depending on the types that it sees floats , it will spit out floats. Julia provides code introspection macros so that way you can see what your code actually compiles to. Thus Julia is not just a scripting language, It's a scripting language which lets you deal with assembly! Julia, like many languages, compiles to LLVM(LLVM is a type portable of portable assembly language).

@code_llvm 2*5

; Function *
; Location: int.jl:54
define i64 @"julia_*_33751"(i64, i64) {
top:
%2 = mul i64 %1, %0
ret i64 %2
}

This output is saying that a floating point multiplication operation is performed and the answer is returned. We can even look at the assembly

@code_llvm 2*5

.text
; Function * {
; Location: int.jl:54
imulq   %rsi, %rdi
movq    %rdi, %rax
retq
nopl    (%rax,%rax)
;}

This shows us that the $*$ function has compiled down to exactly the same operation as what happens in C/Fortran, meaning it achieves the same performance (even though it's defined in Julia). Thus it is possible to not just get "close" to C, but actually get the same C code out. In what cases does this happen?

The interesting thing about Julia is that, asking which cases this happens is not the right question. the right question is, in what cases does the code not compile to something as efficient as C/Fortran? The key here is type-stability. If a function is type-stable, then the compiler can know what the type will be at all points in the function and smartly optimize it to the same assembly as C/Fortran. If it is not type-stable, Julia has to add expensive "boxing" to ensure types are found/known before operations.

This is the key difference between Julia and other scripting languages

The upside is that Julia's functions, when type stable, are essentially C/Fortran functions. Thus ^(exponentiation) is fast. However, ^(::Int64,::Int64) is type-stable, so what type should it output?

2^5
output: 32
2^-5
output: 0.03125

Here we get an error. In order to guarantee to the compiler that ^ will give an Int64 back, it has to throw an error. If you do this in MATLAB, Python, or R, it will not throw an error. That is because those languages do not have their entire language built around type stability.

What happens when we don't have type stability? Let's inspect this code:

@code_native ^(2,5)

.text
; Function ^ {
; Location: intfuncs.jl:220
pushq   %rax
movabsq $power_by_squaring, %rax callq *%rax popq %rcx retq nop ;} Now let's define our own exponentiation on integers. Let's make it "safe" like the form seen in other scripting languages: function expo(x,y) if y>0 return x^y else x = convert(Float64,x) return x^y end end output: expo (generic function with 1 method) Let's make sure it works: println(expo(2,5)) expo(2,-5) output: 32 0.03125 What happens if we inspect this code? @code_native expo(2,5) .text ; Function expo { ; Location: In[8]:2 pushq %rbx movq %rdi, %rbx ; Function >; { ; Location: operators.jl:286 ; Function <; { ; Location: int.jl:49 testq %rdx, %rdx ;}} jle L36 ; Location: In[8]:3 ; Function ^; { ; Location: intfuncs.jl:220 movabsq$power_by_squaring, %rax
movq    %rsi, %rdi
movq    %rdx, %rsi
callq   *%rax
;}
movq    %rax, (%rbx)
movb    $2, %dl xorl %eax, %eax popq %rbx retq ; Location: In[8]:5 ; Function convert; { ; Location: number.jl:7 ; Function Type; { ; Location: float.jl:60 L36: vcvtsi2sdq %rsi, %xmm0, %xmm0 ;}} ; Location: In[8]:6 ; Function ^; { ; Location: math.jl:780 ; Function Type; { ; Location: float.jl:60 vcvtsi2sdq %rdx, %xmm1, %xmm1 movabsq$__pow, %rax
;}
callq   *%rax
;}
vmovsd  %xmm0, (%rbx)
movb    \$1, %dl
xorl    %eax, %eax
; Location: In[8]:3
popq    %rbx
retq
nopw    %cs:(%rax,%rax)
;}

That's a very visual demonstration on why Julia achieves such higher performance than other scripting languages by how it uses type inference.

# Core Idea: Multiple Dispatch + Type Stability => Speed + Readability

Type stability is core crucial feature which separates Julia apart from other scripting languages. In fact, the core idea of Julia is the following statement:

Multiple dispatch allows for a language to dispatch function calls onto type-stable functions.

This is what Julia is all about, so let's take some time to dig into it.If you have type stability inside of a function (meaning, any function call within the function is also type-stable), then the compiler can know the types of the variables at every step. Therefore it can compile the function with the full amount of optimizations since at this point the code is essentially the same as C/Fortran code. Multiple-dispatch works into this story because it means that * can be a type-stable function: it just means different things for different inputs. But if the compiler can know the types of a and b before calling *, then it knows which * method to use, and therefore it knows the output type of c=a*b. Thus it can propogate the type information all the way down, knowing all of the types along the way, allowing for full optimiziations. Multiple dispatch allows * to mean the "right thing" every time you use it, almost magically allowing this optimization.

There are a few things we learn from this. For one, in order to achieve this level of optimization, you must have type-stability. This is not featured in the standard libraries of most languages, and was choice that was made to make the experience a little easier for users. Secondly, multiple dispatch was required to be able to specialize the functions for types which allows for the scripting language syntax to be "more explicit than meets the eye". Lastly, a robust type system is required. In order to build the type-unstable exponentiation (which may be needed) we needed functionalities like convert. Thus the language must be designed to be type-stable with multiple dispatch and centered around a robust type system in order to achieve this raw performance while maintaining the syntax/ease-of-use of a scripting language. You can put a JIT on Python, but to really make it Julia, you would have to design it to be Julia.

## The Julia Benchmarks

The Julia benchmarks, featured on the Julia website, test components of the programming language for speed. This doesn't mean it's testing the fastest implemention. That is where a major misconception occurs. You'll have an R programmer look at the R code for the Fibonacci calculator and say "wow, that's terrible R code. You're not supposed to use recursion in R. Of course it's slow". However, the Fibonacci problem is designed to test recursion, not the fastest implementation to the the ith Fibonacci number. The other problems are the same way: testing basic components of the langauge to see how fast they are.

Julia is built up using multiple-dispatch on type-stable functions. As a result, even the earliest versions of Julia were easy for compilers to optimize to C/Fortran efficiency. It's clear that in almost every case Julia is close to C. Where it is not close to C actually has a few details. The first is the Fibonacci problem where Julia is 2.11x from C. This is because it is a test of recursion, and Julia does not fully optimize recursion (but still does very well on this problem!). The optimization which is used to receive the fastest times for this type of problem is known as Tail-Call Optimization. Julia can at any time add this optimization, though there are reasons why they choose not to. The main reason is: any case where tail-call optimization is possible, a loop can also be used. But a loop is also more robust for optimizations (there are many recursive calls which will fail to tail-call optimize) and thus they want to just recommend using loops instead of using fragile TCO.

The other cases where Julia doesn't do as well are the rand_mat_stat and the parse_int tests. However, this is largely due to a feature known as bounds checking. In most scripting languages, you will receive an error if you try to index an array outside of its bounds. Julia will do this by default:

function test1()
a = zeros(3)
for i=1:4
a[i] = i
end
end
test1()

BoundsError: attempt to access 3-element Array{Float64,1} at index [4]
Stacktrace:
[1] setindex! at ./array.jl:769 [inlined]
[2] test1() at ./In[11]:4
[3] top-level scope at In[11]:7

However, Julia allows you to turn this off using the @inbounds macro:

function test2()
a = zeros(3)
@inbounds for i=1:4
a[i] = i
end
end
test2()

This gives you the same unsafe behavior as C/Fortran, but also the same speed (indeed, if you add these to the benchmarks they will speed up close to C). This is another interesting feature of Julia: it lets you by default have the safety of a scripting language, but turn off these features when necessary (after testing and debugging) to get full performance.

## Small Expansion of the Idea: Strict Typing

Type-stability is not the only necessity. You also need strict typing. In Python you can put anything into an array. In Julia, you can only put types of T into a Vector{T}. To give generality, Julia offers various non-strict forms of types. The biggest example is Any. Anything satisfies T:<Any (hence the name). Therefore, if you need it, you can create a Vector{Any}. For example:

a = Vector{Any}(undef,3)
a[1] = 1.0
a[2] = "hi!"
a[3] = :Symbolic
a
output:
3-element Array{Any,1}:
1.0
"hi!"
:Symbolic

A less extreme form of an abstract type is a Union type, which is just what it sounds like. For example:

a = Vector{Union{Float64,Int}}(undef,3)
a[1] = 1.0
a[2] = 3
a[3] = 1/4
a
output:
3-element Array{Union{Float64, Int64},1}:
1.0
3
0.25

This will only accept floating point numbers and integers. However, it is still an abstract type. A function which is called on an abstract type cannot know the type of any element (since, in this example, any element can be either a float or an integer). Thus the optimization that was achieved by multiple-dispatch, knowing the type each step of the way, is no longer present. Therefore the optimizations are gone and Julia will slow down to the speed of other scripting languages.

This leads to the performance principle: use strict typing whenever possible. There are other advantages: a strictly typed Vector{Float64} is actually byte-compatible with C/Fortran, and so it can be used directly by C/Fortran programs without conversion.

# Lunch Money

It's clear that Julia made clever design decisions in order to achieve its performance goals while still being a scripting language. However, what exactly is lost? Next I will show you a few pecularities of Julia that come from this design decision, and the tools Julia gives you to handle them.

## Performance as Optional

One thing I already showed is that Julia gives many ways to achieve high performance (like @inbounds), but they don't have to be used. You can write type-unstable functions. It will be as slow as MATLAB/R/Python, but you can do it. In places where you don't need the best performance, it's nice to have this as an option.

# Conclusion

Julia is fast by design. Type stability and multiple dispatch is necessary to do the specialization that is involved in Julia's compilation to make it work so well. The robust type system is required to make working with types at such a fine level in order to effectively achieve type-stability whenever possible, and manage optimizations when it's not totally possible.

## Discussion / Project

Here's a good learning project: how would you design a new type EasyFloats to build MATLAB/Python/R arithmetic into Julia? How would you designed "arrays with NAs" to mimic R? Time the results and see what the difference from optimal is.

WhyJulia (ucidatascienceinitiative.github.io)