I’m a big admirer of the Julia programming language: it’s a fast general-purpose language with a nice syntax, macros, and a decent package manager.
No respectable up-and-coming language should be without good editor support. I’ve been polishing the Emacs mode, and learnt a lot about the language. If you’re writing Julia code, or integrating an editor, this should be interesting to you.
Syntax highlighting
Syntax highlighting is hard, and it’s rather challenging in Julia. We’ll look at some corner cases of syntax highlighting in Julia, so I’ll show code snippets along with screenshots of how this code is currently highlighted in Emacs.
I’ve written a complete Julia syntax highlighting test file which exercises all the different syntactic features of the language. You can use this to test Julia support in your editor of choice.
Highlighting function calls
Julia supports two ways of declaring functions, which the docs describe as ‘basic’ and ‘terse’.
function f(x,y)
x + y
end
f(x,y) = x + y
We want to highlight keywords (such as function
and end
) and to
highlight function names (f
in this example). This is pretty
straightforward: we can write a regular expression that spots either
the keyword function
or a symbol followed by something roughly like
(.*?) =
.
We can also define functions in an explicit namespace. This is also straightforward, we just highlight the last symbol after the dot.
function Foo.bar(x, y)
x + 1
end
A function definition may also include type variables. This isn’t too difficult to handle either, we just need to adjust our terse regular expression to step over the curly brackets.
elsize{T}(::AbstractArray{T}) = sizeof(T)
function elsize{T}(::AbstractArray{T})
sizeof(T)
end
However, highlighting gets harder with nested brackets.
cell(dims::(Integer...)) = Array(Any, convert((Int...), dims))
At this point, our naive regular expression falls down. We need to count brackets, or write a crude parser. The Emacs editing mode doesn’t yet handle this case.
Macro usage
Highlighting macros is easy. There are some awkward syntactic edge cases but these don’t affect highlighting.
@hello_world! foo
Built-in functions
Julia has a lot of built-in functions. After some discussion, we
felt that it wasn’t worth special-casing functions that are
keywords in other languages, such as throw
and error
.
throw(foo)
error("foo", bar, "baz")
Strings and characters
Julia has a lovely syntax here, but it takes a little care to highlight correctly.
For characters, Julia uses single quotes, but it also supports '
as
an operator. This gives very readable mathematical formulae.
# Characters
x = 'a'
y = '\u0'
# Not characters
a = b' + c'
Julia’s string syntax allows multi-line strings, triple-quoted strings, regular expression literals, byte array literals and (particularly nifty) version number literals.
x = "foo
bar"
x = """hello world"""
x = "hello $user"
x = r"foo.*"ismx
x = v"0.1"
x = b"DATA\xff\u2200"
We are handling most of this syntax in the Emacs mode, but it’s not perfect yet. I think we should highlight interpolated values in strings. See my test file for a full set of examples.
Comments
Julia’s comment syntax is also very nice. There are single-line and multi-line comments, and they support arbitrary nesting.
# I'm a comment.
#= I'm a
multi-line comment. =#
#= I'm a #= nested =# comment. =#
Emacs makes it easy for us to support all this different variants, so we’ve supported this for a long time.
Type declarations
You can declare your own types in Julia.
type Foo
x::Bar
end
immutable Foo
x::Bar
end
abstract Foo <: Bar
This is mostly a case of knowing all the keywords for type declaration, so it’s straightforward.
The operator <:
is particularly tricky. It is used in type
declarations to declare subtypes, but it’s also used a boolean
operator to see if one value is a subtype of another x <: y
. I
believe this is
impossible to highlight correctly in all cases.
# I can't see how to highlight the first 'T' here.
same_type_numeric{T<:Number}(x::T, y::T) = true
We can cheat by having a full list of built-in types in our highlighting code, so we highlight most subtype declarations correctly.
Type annotations
Julia supports optional type annotations in functions and on
variables. These are simple to highlight, but we need to get ::
right before dealing with quoted symbols.
f(x::FooBar) = x
function foo()
local x::Int8 = 5
x
end
Variable declarations
Julia has a local
keyword which lets you introduce local variable
bindings. I’d love to highlight this correctly too.
global x = "hello world", y = 3
let x = 1
x + 1
end
function foo()
local x = 5
x + 1
end
This requires parsing to handle correctly, so we don’t handle it yet. We can’t simply look for commas, as there may be arbitrary Julia expressions used.
# 'b' is not declared as a variable here.
global x = foo(a, b), y = 3
Colons and quoting
The hardest part of Julia’s syntax is :
. There have also been users
confused by this syntax.
# Quoted symbols
x = :foo
y = :function
foo[:baz]
[1 :foo]
# Not quoted symbols
foo[bar:end]
foo[bar:baz]
x = :123
for x=1:foo
print(x)
end
I’ve opened a pull request that enables Emacs to handle the most common usages correctly, but this is very hard to get right in all cases.
Numbers
Finally, Julia has a really neat numeric syntax. It supports all the
literals you could possibly want. It also lets you write 2x
as a
shorthand for 2 * x
, which makes many equations in Julia much more
similar to a maths textbook.
x = 0x123abcdef
x = 0o7
x = 0b1011
x = 2.5e-4
# Equivalent to '2 * x'
y = 2x
The Emacs mode currently doesn’t highlight these, but we probably
should. Some Emacs modes highlight numbers, some don’t, but for a
language with a focus on scientific computing, it would make sense to
highlight numbers. It’s particularly helpful to help readers see that
2x
is two separate symbols.
Conclusions
Julia’s syntax isn’t completely set in stone, but I doubt much of the syntax will change in ways that affect highlighting. The syntax favours readability over simple parsing (a great tradeoff), so writing a highlighter takes some careful thought.
Once you’ve got syntax highlighting working, it’s much easier to handle indentation. I think Emacs’ ability to indent Julia is pretty good (this blog post is plenty long enough without getting into indentation) and this is because it can fairly robustly identify block delimiters for highlighting.
Finally, it’s also desirable to have as-you-type syntax checking and linting. Flycheck will add support for this using Lint.jl as soon as Lint.jl/Julia performance is good enough to run on demand without a persistent process.
If you do encounter a bug with Emacs and Julia, there’s a ‘julia-mode’ issue label to track any bugs.
Happy hacking!