Tuesday, 9 December 2014

Julia performance - avoid global variables as per performance docs

I'm still finding that I get significant speedups for all tasks when switching from Perl to Julia.
But it's quite easy to write code that is orders of magnitude slower if you write it the wrong way, so if the timing matters it's essential to read the documentation on performance.
E.g. here's some simple code that strips ASCII NULs out of a text file:


fileinname = "Log.txt"
filein = open(fileinname)
logfile = open(string(splitext(fileinname)[1],".PSV"), "w+")

# read and write
buffersize = 65536
#b::Vector{UInt8}
b = zeros(Uint8, buffersize)
readsize = buffersize
while readsize == buffersize
    readsize = readbytes!(filein, b, buffersize)
    # replace null with space
    for i in 1:readsize
        b[i] == 0 && (b[i] = 32)
    end
    write(logfile, b[1:readsize])
end
It takes about 42 seconds to run on an 80 MB text file (Windows 7, Julia 0.3.3).

perl -pe "tr/\000/ /;$_" log.txt > log.DLF

takes about 4 seconds.

But eliminating the global variables, by merely putting "let" at the beginning and "end" at the end of the code, takes the time down to 0.65 seconds, and @inbounds knocks another fraction of a second off.
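The performance docs usually suggest going one step further than let and putting the work in a function. Here is a rough sketch of the same loop written that way. It assumes a current Julia 1.x release rather than 0.3.3 (so UInt8 instead of Uint8), and the function name replace_nuls is made up for illustration. I have not timed this version.

```julia
# Hypothetical helper (not from the original post): copy `input` to `output`,
# replacing NUL bytes with spaces. Every variable is local to the function,
# so the compiler can infer concrete types for all of them.
function replace_nuls(input::IO, output::IO; buffersize::Int = 65536)
    b = zeros(UInt8, buffersize)
    readsize = buffersize
    while readsize == buffersize
        readsize = readbytes!(input, b, buffersize)
        @inbounds for i in 1:readsize
            if b[i] == 0x00
                b[i] = 0x20              # ASCII space
            end
        end
        write(output, view(b, 1:readsize))  # write only the bytes just read
    end
    return nothing
end

# Small in-memory check using IOBuffer instead of real files.
src = IOBuffer(UInt8[0x61, 0x00, 0x62])   # "a", NUL, "b"
dst = IOBuffer()
replace_nuls(src, dst; buffersize = 2)    # tiny buffer to force two reads
result = String(take!(dst))               # "a b"
```

For real files the same function should work with the streams returned by open, e.g. replace_nuls(filein, logfile) with the two handles from the code above.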

b[1:readsize] apparently makes a copy of the data, but I think Julia 0.4 is due to change this so that it just refers to the existing data.
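To make the copy-versus-reference difference concrete, here is a tiny sketch, assuming a current Julia 1.x release. There, as far as I know, a range slice like b[1:2] still allocates a new array, and view is the explicit way to refer to the existing data.

```julia
b = UInt8[1, 2, 3, 4]

c = b[1:2]         # slice: allocates a new array holding a copy
v = view(b, 1:2)   # view: a SubArray that shares memory with b

b[1] = 0x09        # change the original buffer

copy_unchanged = (c[1] == 0x01)   # the copy still holds the old value
view_updated   = (v[1] == 0x09)   # the view sees the new value
```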
