Tuesday, 22 April 2014

Julia notes 1

Some reminders to myself on working with Julia.
This is mostly for producing reports that I would otherwise have used Perl for.
When I test, I'm still getting significant performance improvements,
but as yet I'm a long way from the productivity I had with Perl.
Mostly that's my unfamiliarity with the syntax etc., but there are other things tripping me up.
Interestingly, with the speed-up and the data sizes I'm using, I'm still pretty much on par on the total time for doing the work.

Take great care with spaces in brackets
e.g.
a = {[1,2,3],1,3,[1,2,3,4]}
println(a[4][3]) # 3
println(a[4][end -1]) # ERROR: no method Array{T,N}(Array{Int64,1}, Int64, Int64)
println(a[4][end - 1]) # 3
Even Stefan Karpinski agrees it's annoyingly fiddly,
but it helps people shifting from Matlab, though not all of them. I hope Julia shifts away from it, or offers some configuration setting that switches it off and requires a comma or some other more visible separator.
If it looks like space separation, Julia treats the pieces as separate array elements, so "end -1" is read as two elements (end and -1) rather than as the expression end - 1.
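
The same rule is easy to see with plain array literals. A minimal sketch (the variable names are just for illustration): with a space before the minus sign but none after it, Julia reads two elements; with spaces on both sides, or none at all, it reads a subtraction.

```julia
# Space before the minus but not after: two elements, giving a 1x2 row
spaced = [1 -1]

# Spaces on both sides, or no spaces at all: one element holding 1 - 1
balanced = [1 - 1]
tight = [1-1]

println(size(spaced))      # dimensions of the two-element row
println(length(balanced))  # number of elements
println(length(tight))     # number of elements
```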

Scope
Variables in global scope don't always behave identically to those in local scope, so care is needed.
https://github.com/JuliaLang/julia/issues/6522
https://github.com/JuliaLang/julia/issues/423
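
One common case, which is not necessarily the one those issues describe: assigning to a name inside a function creates a new local variable, even if a global of the same name exists, unless the name is declared global. A small sketch with made-up names:

```julia
counter = 0            # global variable

function set_local()
    counter = 10       # new local variable; the global is untouched
    return counter
end

function set_global()
    global counter     # explicitly refer to the global
    counter = 10
    return counter
end

set_local()
println(counter)       # the global still has its original value
set_global()
println(counter)       # the global has now been changed
```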

Loading text files into arrays of arrays
This is something I ended up doing a lot with Perl: arrays of arrays of hashes etc.
You can do exactly the same using Arrays of Any in Julia, but the performance sucks compared to creating a type for the record with typed fields, e.g.
type Price_Record
    priced_at::Float64
    bid::Float64
    ask::Float64
end
and
price_array = Array(Price_Record,25000000)

and then loading it line by line took less than 1/15th of the time
of creating
price_array = Array(Any,25000000)
and loading that line by line.




Initialising blank dictionaries/hash arrays
newdict = (ASCIIString => ASCIIString)[]
or
newdict = Dict{ASCIIString, Int64}()
or
newdict = Dict{Any,Any}()
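
Once created, the dictionary is filled and queried much like a Perl hash. A minimal sketch using the Dict{Any,Any}() form; the keys and values are made up:

```julia
counts = Dict{Any,Any}()

counts["bid"] = 1                  # insert a new key
counts["ask"] = 2
counts["bid"] = counts["bid"] + 1  # update an existing key

println(haskey(counts, "bid"))     # is the key present?
println(get(counts, "mid", 0))     # look up with a default for missing keys

for (key, value) in counts         # iteration order is not guaranteed
    println(key, " => ", value)
end
```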

Windows specific
Multiple versions of Julia
If you have multiple versions of Julia installed, make sure they have separate package directories
e.g. in Julia.bat
set JULIA_PKGDIR=C:\apps\julia\Julia_2_1\packages
If 0.3 and 0.2 point to the same place, you run the risk of loading incompatible package versions.

If using Julia Studio, point it to the main version of Julia you are using, not the bundled one, as the supplied version is old.



Packages and a proxy
Some of the machines I use can only get to the internet via a proxy, which makes it a bit harder.
As package handling is done with Git, you need to configure Git to use the proxy.

I have the http_proxy environment variable set
e.g. set http_proxy=http://server.address.com:port
Julia packages are picked up with standard Git, so to use a proxy one can use
git config --global url."https://".insteadOf git://
and then something like
git config --global http.proxy http://proxyuser:proxypwd@proxy.server.com:8080
Stack Overflow and the Julia documentation helped out here.

You can run these from within the Julia prompt (the leading ; switches to shell mode), which ensures that they are picked up by Pkg.add()
; git config --global url."https://".insteadOf git://
; git config --global https.proxy http://192.168.1.189:8080
; git config --global http.proxy http://192.168.1.189:8080
where http://192.168.1.189:8080 is the proxy server.