I am going to tell you some things that may sound counter-intuitive.
I am going to suggest that Julia code is so reusable not just because the language has good features, but because it has weak and missing features.
Missing features like strict namespace conventions, easy-to-use local modules, and a type system that can enforce correctness.
These gaps are countered by, or make room for, other features.
Common advice, when loading code from another module, in most language communities is to only import what you need, e.g.:
using Foo: a, b, c
Common practice in Julia, by contrast, is to do:
using Foo
which imports everything the author of Foo marked as exported.
You don't have to, but it's common.
But what happens if one has the package Bar, exporting predict(::BarModel, data), and Foo, exporting predict(::FooModel, data), and one does:
using Foo
using Bar
training_data, test_data = ...
mbar = BarModel(training_data)
mfoo = FooModel(training_data)
evaluate(predict(mbar), test_data)
evaluate(predict(mfoo), test_data)
If you have multiple usings trying to bring the same name into scope, then Julia raises an error when you use that name, since it can't work out which module's version you mean.
As the user, you can resolve it yourself by qualifying the name:
evaluate(Bar.predict(mbar), test_data)
evaluate(Foo.predict(mfoo), test_data)
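Here is a minimal, self-contained sketch of that behaviour; Foo and Bar below are toy modules defined inline, standing in for real packages:

module Foo
    export predict
    struct FooModel end
    predict(m::FooModel) = "Foo's prediction"
end

module Bar
    export predict
    struct BarModel end
    predict(m::BarModel) = "Bar's prediction"
end

using .Foo, .Bar             # both export `predict`

# predict(Foo.FooModel())    # would error: both Foo and Bar export `predict`, so the unqualified name is ambiguous
Foo.predict(Foo.FooModel())  # fine: the qualified name always works
Bar.predict(Bar.BarModel())  # fine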
There is no name collision, though, if both packages overload the same function from a shared namespace.
If both Foo and Bar are overloading StatsBase.predict, everything just works:
using StatsBase # exports predict
using Foo # overloads `StatsBase.predict(::FooModel)`
using Bar # overloads `StatsBase.predict(::BarModel)`
training_data, test_data = ...
mbar = BarModel(training_data)
mfoo = FooModel(training_data)
evaluate(predict(mbar), test_data)
evaluate(predict(mfoo), test_data)
Name collisions thus encourage package authors to come together and create base packages (like StatsBase), and to agree on what the shared functions mean.
They don't have to, since the user can always resolve collisions themselves, but it nudges them towards it.
Thus you get package authors thinking about other packages that might be used with theirs.
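Here is a minimal runnable sketch of that pattern, with inline modules standing in for StatsBase, Foo, and Bar (all names below are stand-ins):

module MyStatsBase
    export predict
    function predict end     # an empty generic function: a shared name with an agreed meaning
end

module FooModels
    using ..MyStatsBase
    export FooModel
    struct FooModel end
    MyStatsBase.predict(m::FooModel) = "Foo's prediction"   # extend the shared function
end

module BarModels
    using ..MyStatsBase
    export BarModel
    struct BarModel end
    MyStatsBase.predict(m::BarModel) = "Bar's prediction"
end

using .MyStatsBase, .FooModels, .BarModels
predict(FooModel())   # "Foo's prediction"; no collision, there is only one `predict`
predict(BarModel())   # "Bar's prediction"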
One can even overload the same-named function from multiple namespaces if you want, e.g. all of MLJBase.predict, StatsBase.predict, and SkLearn.predict, which might have slightly different interfaces targeting different use cases.
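As a sketch of that (with hypothetical stand-in modules; the real packages' signatures differ from these simplified ones), a single model type can extend predict from several namespaces at once:

module InterfaceA            # stand-in for e.g. StatsBase
    function predict end
end

module InterfaceB            # stand-in for e.g. MLJBase
    function predict end
end

struct MyModel end
InterfaceA.predict(m::MyModel, data) = "A-style prediction"
InterfaceB.predict(m::MyModel, data) = "B-style prediction"

InterfaceA.predict(MyModel(), [1, 2, 3])
InterfaceB.predict(MyModel(), [1, 2, 3])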
Many languages have one module per file, and you can load that module from your current directory, e.g. via import Filename.
You can make this work in Julia too, but it is surprisingly fiddly.
What is easy, however, is to create and use a package.
Packages have a standard structure: a src directory, a test directory, and so on, and the standard tooling knows what to do with it, e.g.:
pkg> test MyPackage
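More fully, a hypothetical session for creating, developing, and testing a package might look like this (MyPackage is a placeholder name; press ] at the julia> prompt to enter pkg mode):

pkg> generate MyPackage      # creates MyPackage/Project.toml and MyPackage/src/MyPackage.jl
pkg> dev ./MyPackage         # make the local package available in the current environment
pkg> test MyPackage          # runs MyPackage/test/runtests.jl, once you have written one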
Julia has a JIT compiler, so even compilation errors don't arrive until run-time; and it is a dynamic language, so the type system says nothing about correctness.
Testing Julia code is therefore important, and it's good to have CI etc. all set up.
The recommended way to create packages also ensures that.
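For example, a minimal hypothetical test/runtests.jl using the Test standard library (MyPackage is a placeholder):

using Test
using MyPackage

@testset "MyPackage" begin
    @test 1 + 1 == 2     # replace with real tests of MyPackage's API
end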
Assume it walks like a duck and talks like a duck, and if it doesn't, fix that.
Another closely related factor is Open Classes.
I'm not going to dwell on them today, but they are the prerequisite: you need to be able to add new methods to existing classes in the first place.
Consider: one might have a type from the Ducks library.
struct Duck end
walk(self) = println("🚶 Waddle")
talk(self) = println("🦆 Quack")
raise_young(self, child) = println("🐤 ➡️ 💧 Lead to water")
raise_young (generic function with 1 method)
And I have some code that I wrote, which I want to run:
function simulate_farm(adult_animals, baby_animals)
for animal in adult_animals
walk(animal)
talk(animal)
end
parent = first(adult_animals)
for child in baby_animals
raise_young(parent, child)
end
end
simulate_farm (generic function with 1 method)
simulate_farm([Duck(), Duck(), Duck()], [Duck(), Duck()])
🚶 Waddle
🦆 Quack
🚶 Waddle
🦆 Quack
🚶 Waddle
🦆 Quack
🐤 ➡️ 💧 Lead to water
🐤 ➡️ 💧 Lead to water
Ok, now I want to extend it with my own type: a Swan.
struct Swan end
# Lets test with just 1 first:
simulate_farm([Swan()], [])
🚶 Waddle
🦆 Quack
The Waddle was right, but Swans don't Quack.
We did some duck-typing -- Swans walk like ducks, but they don't talk like ducks.
We can solve that with single dispatch.
talk(self::Swan) = println("🦢 Hiss")
talk (generic function with 2 methods)
# Lets test with just 1 first:
simulate_farm([Swan()], [])
🚶 Waddle
🦢 Hiss
# Now the whole farm
simulate_farm([Swan(), Swan(), Swan()], [Swan(), Swan()])
🚶 Waddle
🦢 Hiss
🚶 Waddle
🦢 Hiss
🚶 Waddle
🦢 Hiss
🐤 ➡️ 💧 Lead to water
🐤 ➡️ 💧 Lead to water
That's not right. Swans do not lead their young to water; they carry them on their back.
# Same thing again:
raise_young(self::Swan, child::Swan) = println("🐤 ↗️ 🦢 Carry on back")
raise_young (generic function with 2 methods)
# Now the whole farm
simulate_farm([Swan(), Swan(), Swan()], [Swan(), Swan()])
🚶 Waddle
🦢 Hiss
🚶 Waddle
🦢 Hiss
🚶 Waddle
🦢 Hiss
🐤 ↗️ 🦢 Carry on back
🐤 ↗️ 🦢 Carry on back
Now I want a Farm with mixed poultry.
simulate_farm([Duck(), Duck(), Swan()], [Swan(), Swan()])
🚶 Waddle
🦆 Quack
🚶 Waddle
🦆 Quack
🚶 Waddle
🦢 Hiss
🐤 ➡️ 💧 Lead to water
🐤 ➡️ 💧 Lead to water
What happened?
We had a Duck raising a baby Swan, and it led it to water.
But Ducks given baby Swans to raise will just abandon them.
How would we code this? One option would be to rewrite the Ducks library's method to special-case Swans:
function raise_young(self::Duck, child::Any)
if child isa Swan
println("🐤😢 Abandon")
else
println("🐤 ➡️ 💧 Lead to water")
end
end
Another option would be to subclass Duck. (NB: this example is not valid Julia code; concrete types cannot be subtyped.)
struct DuckWithSwanSupport <: Duck end
function raise_young(self::DuckWithSwanSupport, child::Any)
if child isa Swan
println("🐤😢 Abandon")
else
raise_young(upcast(Duck, self), child)
end
end
Doing this would require:
- Replacing every Duck in my code-base with DuckWithSwanSupport.
- If other packages create Ducks, I have to deal with that also.
- If someone else implements DuckWithChickenSupport, and I want to use both their code and mine, what do I do? Do I need DuckWithChickenAndSwanSupport?
With multiple dispatch, though, this is clean and easy:
raise_young(parent::Duck, child::Swan) = println("🐤😢 Abandon")
raise_young (generic function with 3 methods)
simulate_farm([Duck(), Duck(), Swan()], [Swan(), Swan()])
🚶 Waddle
🦆 Quack
🚶 Waddle
🦆 Quack
🚶 Waddle
🦢 Hiss
🐤😢 Abandon
🐤😢 Abandon
Does this kind of problem come up outside of contrived examples about poultry? Turns out it does.
The need to extend operations to act on new combinations of types shows up all the time in scientific computing.
I suspect it shows up more generally, but we've learned to ignore it.
If you look at a list of BLAS methods you will see just this, encoded in the function names, e.g.:
- SGEMM: matrix-matrix multiply
- SSYMM: symmetric-matrix matrix multiply
- ZHBMV: complex hermitian-banded-matrix vector multiply
And it turns out people keep wanting to make more and more matrix types.
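Before getting to those, note the contrast: in Julia every one of these operations is written with the same generic *, and dispatch on the argument types selects a specialized method. A rough sketch using wrapper types from the standard library:

using LinearAlgebra

A = rand(Float32, 4, 4)
S = Symmetric(rand(Float32, 4, 4))
H = Hermitian(rand(ComplexF64, 4, 4))
v = rand(ComplexF64, 4)

A * A    # dense * dense             (cf. SGEMM)
S * A    # symmetric * dense         (cf. SSYMM)
H * v    # hermitian matrix * vector (cf. ZHEMV; ZHBMV is the banded variant)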
# creates a 3×3-blocked 12×12 matrix whose blocks are sparse Bool matrices
# (uses BlockArrays.jl and the SparseArrays standard library)
b = BlockArray(0.5 .< sprand(3*4, 3*4, 0.9), [4,4,4], [4,4,4])
3×3-blocked 12×12 BlockArray{Bool,2,Array{SparseMatrixCSC{Bool,Int64},2},Tuple{BlockedUnitRange{Array{Int64,1}},BlockedUnitRange{Array{Int64,1}}}}:
 1 0 0 0 │ 0 1 0 0 │ 1 0 0 0
 0 1 1 1 │ 1 1 1 1 │ 1 1 0 0
 0 0 1 0 │ 1 0 0 0 │ 1 0 0 1
 0 0 0 0 │ 0 0 1 0 │ 1 0 0 0
 ────────────┼──────────────┼────────────
 1 0 0 0 │ 1 1 0 0 │ 1 1 0 1
 0 1 1 1 │ 1 1 1 0 │ 0 1 0 0
 0 1 0 1 │ 1 0 0 0 │ 0 1 1 0
 1 0 0 0 │ 0 0 1 1 │ 1 1 0 0
 ────────────┼──────────────┼────────────
 0 0 1 0 │ 1 1 0 0 │ 0 0 1 0
 1 0 0 0 │ 0 0 0 0 │ 0 1 1 0
 0 1 0 0 │ 1 0 0 0 │ 1 0 1 1
 1 1 0 1 │ 0 1 1 1 │ 0 1 1 1
# creates a banded matrix of ones, with l sub-diagonals and u super-diagonals
# (BandedMatrices.jl; Ones comes from FillArrays.jl)
l, u = 2, 1   # bandwidths chosen to match the output shown below
BandedMatrix(Ones{Int}(10,10), (l,u))
10×10 BandedMatrix{Int64,Array{Int64,2},Base.OneTo{Int64}}:
 1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 1  1  1  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 ⋅  1  1  1  1  ⋅  ⋅  ⋅  ⋅  ⋅
 ⋅  ⋅  1  1  1  1  ⋅  ⋅  ⋅  ⋅
 ⋅  ⋅  ⋅  1  1  1  1  ⋅  ⋅  ⋅
 ⋅  ⋅  ⋅  ⋅  1  1  1  1  ⋅  ⋅
 ⋅  ⋅  ⋅  ⋅  ⋅  1  1  1  1  ⋅
 ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1  1  1  1
 ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1  1  1
# creates a block-banded matrix with ones in the non-zero entries (BlockBandedMatrices.jl)
rows = cols = [1, 2, 3, 4]   # block sizes chosen to match the output shown below
x = BlockBandedMatrix(Ones{Int}(sum(rows),sum(cols)), rows,cols, (l,u))
4×4-blocked 10×10 BlockSkylineMatrix{Int64,Array{Int64,1},BlockBandedMatrices.BlockSkylineSizes{Tuple{BlockedUnitRange{Array{Int64,1}},BlockedUnitRange{Array{Int64,1}}},Fill{Int64,1,Tuple{Base.OneTo{Int64}}},Fill{Int64,1,Tuple{Base.OneTo{Int64}}},BandedMatrix{Int64,Array{Int64,2},Base.OneTo{Int64}},Array{Int64,1}}}:
 1 │ 1 1 │ ⋅ ⋅ ⋅ │ ⋅ ⋅ ⋅ ⋅
 ───┼────────┼───────────┼────────────
 1 │ 1 1 │ 1 1 1 │ ⋅ ⋅ ⋅ ⋅
 1 │ 1 1 │ 1 1 1 │ ⋅ ⋅ ⋅ ⋅
 ───┼────────┼───────────┼────────────
 1 │ 1 1 │ 1 1 1 │ 1 1 1 1
 1 │ 1 1 │ 1 1 1 │ 1 1 1 1
 1 │ 1 1 │ 1 1 1 │ 1 1 1 1
 ───┼────────┼───────────┼────────────
 ⋅ │ 1 1 │ 1 1 1 │ 1 1 1 1
 ⋅ │ 1 1 │ 1 1 1 │ 1 1 1 1
 ⋅ │ 1 1 │ 1 1 1 │ 1 1 1 1
 ⋅ │ 1 1 │ 1 1 1 │ 1 1 1 1
# creates a banded-block-banded matrix with ones in the non-zero entries
λ, μ = 1, 2   # sub-block bandwidths chosen to match the output shown below
y = BandedBlockBandedMatrix(Ones{Int}(sum(rows),sum(cols)), rows,cols, (l,u), (λ,μ))
4×4-blocked 10×10 BandedBlockBandedMatrix{Int64,PseudoBlockArray{Int64,2,Array{Int64,2},Tuple{BlockedUnitRange{Array{Int64,1}},BlockedUnitRange{Array{Int64,1}}}},BlockedUnitRange{Array{Int64,1}}}:
 1 │ 1 1 │ ⋅ ⋅ ⋅ │ ⋅ ⋅ ⋅ ⋅
 ───┼────────┼───────────┼────────────
 1 │ 1 1 │ 1 1 1 │ ⋅ ⋅ ⋅ ⋅
 1 │ 1 1 │ 1 1 1 │ ⋅ ⋅ ⋅ ⋅
 ───┼────────┼───────────┼────────────
 1 │ 1 1 │ 1 1 1 │ 1 1 1 ⋅
 1 │ 1 1 │ 1 1 1 │ 1 1 1 1
 ⋅ │ ⋅ 1 │ ⋅ 1 1 │ ⋅ 1 1 1
 ───┼────────┼───────────┼────────────
 ⋅ │ 1 1 │ 1 1 1 │ 1 1 1 ⋅
 ⋅ │ 1 1 │ 1 1 1 │ 1 1 1 1
 ⋅ │ ⋅ 1 │ ⋅ 1 1 │ ⋅ 1 1 1
 ⋅ │ ⋅ ⋅ │ ⋅ ⋅ 1 │ ⋅ ⋅ 1 1
And that is before all the other things you might like to do to a matrix, which you would also like to encode in its type.
These are all important and show up in crucial applications.
When you start applying things across disciplines, they show up even more.
Like advancements in Neural Differential Equations, which need several of these array and matrix types at once, and want to use them together.
So it's not reasonable for a numerical language to claim that it has enumerated all the matrix types you might ever need.
When the compiler generates code for a function tailored to the particular types it is being called with, that is called specialization.
This is pretty good: it's a reasonable assumption that the types actually being used are the important case.
Multiple dispatch goes a step further: it lets a human tell the compiler how that specialization should be done, which can add a lot of information.
We have:
- *(::Dense, ::Dense): general dense matrix multiply (what BLAS gemm does)
- *(::Dense, ::Diagonal) or *(::Diagonal, ::Dense): just scale the columns or rows
- *(::OneHot, ::Dense) or *(::Dense, ::OneHot): just select a row or column
- *(::Identity, ::Dense) or *(::Dense, ::Identity): no work at all, return the dense matrix unchanged
A sketch of one such hand-written specialization follows.
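Here is a sketch of what such a specialization can look like. OneHot below is a toy type defined on the spot for illustration, not the type from any real package:

# A toy one-hot vector: false everywhere except a single true entry.
struct OneHot <: AbstractVector{Bool}
    len::Int
    hot::Int
end
Base.size(v::OneHot) = (v.len,)
Base.getindex(v::OneHot, i::Int) = i == v.hot

# Without the method below, A * v already works via the generic fallback.
# With it, the "multiplication" is just column selection: no arithmetic at all.
Base.:*(A::AbstractMatrix, v::OneHot) = A[:, v.hot]

A = [1 2 3;
     4 5 6]
A * OneHot(3, 2)    # == A[:, 2] == [2, 5]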
But not every language has array types that are parametric on their scalar types, and the ability to be just as fast whichever scalar type is used.
Without this, your array code and your scalar code cannot be disentangled.
BLAS, for example, does not have this.
It has a separate routine for every combination of scalar and matrix type.
With this separation, one can add new scalar types without ever having to touch the array code, except as a late-stage optimization.
Otherwise, one needs to build array support directly into one's scalar types just to get reasonable performance at all.
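As a hedged sketch of what that separation buys: define a new scalar type and a handful of arithmetic methods on it, and the existing generic array code works with it unchanged. (MyNumber is a toy stand-in for real examples such as dual numbers, numbers with uncertainties, or numbers with units.)

# A toy scalar type: note there is no array code anywhere in its definition.
struct MyNumber <: Number
    value::Float64
end
Base.:+(a::MyNumber, b::MyNumber) = MyNumber(a.value + b.value)
Base.:*(a::MyNumber, b::MyNumber) = MyNumber(a.value * b.value)
Base.zero(::Type{MyNumber}) = MyNumber(0.0)

A = [MyNumber(1.0) MyNumber(2.0);
     MyNumber(3.0) MyNumber(4.0)]

A .+ A     # broadcasting works
sum(A)     # reductions work
A * A      # even generic matrix multiply works, with no MyNumber-specific array code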
People need to invent new languages. It's a good time to be inventing new languages, and it's good for the world.
I'd just really like those new languages to please have the features discussed above.