<greenbagels>
its a good sign when your code cant even finish benchmarks in the target time
<greenbagels>
really promising sign for the production dataset that is 100x larger
<companion_cube>
:D
<greenbagels>
but its ok, my goal with this project is to learn to profile and optimize code!
<greenbagels>
...even though the second highest overhead symbol in my code is pow() from libm...
<companion_cube>
that all in OCaml?
<greenbagels>
my program?
<companion_cube>
yeah
<greenbagels>
yeah, all in ocaml, just using the standard library
<greenbagels>
my intuition was that the slowest parts of the program would be like, memory allocation/copying
<greenbagels>
since i have huge lists of lists (a list of 2801 lists; im profiling as a function of inner list length, which for production code will be ~250k floats long)
<greenbagels>
i know i should *probably* use arrays for iterating, but i wanted to see how lists would perform
<greenbagels>
(since i dont need random access)
<companion_cube>
if you want to store 250k floats I strongly recommend bigarrays
<greenbagels>
companion_cube: if my use case is stride-1 iteration, is there a compelling reason other than space efficiency?
<discocaml>
<darrenldl> memory and cache locality
<discocaml>
<darrenldl> actually i dont know what stride-1 means, nvm
<greenbagels>
like always iterated front to back, not jumping around
<greenbagels>
intuitively i would think if my lists are going to mostly be built-up at (roughly) one time, and then just mapped and piped around, i would still more or less have decent locality of reference
<greenbagels>
maybe i should check my cache misses
<companion_cube>
a list of floats will be: 3 words per item for the list
<companion_cube>
+ 1 word per item for the (boxed) float
<companion_cube>
or is it 2 words? it's 2 words I think
<companion_cube>
with a bigarray, it's 1 word per float, that's it
<greenbagels>
oh so the cache will be heavily underutilized
<discocaml>
<darrenldl> then yeah, you'd benefit from spatial locality (greatly)
<companion_cube>
even just use 2MB instead of 10MB
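A rough way to check the per-element numbers above, assuming a 64-bit OCaml build with the default flat float arrays. `Obj.reachable_words` is a standard library function; the names `as_list` and `as_array` are made up for this sketch. A plain `float array` stands in for the bigarray, since bigarray data lives outside the OCaml heap and would not be counted.

```ocaml
(* Sketch only: assumes a 64-bit runtime, where one word is 8 bytes. *)
let n = 1_000

(* Built at runtime so each cons cell and each boxed float is a heap block. *)
let as_list : float list = List.init n float_of_int

(* A float array stores its elements unboxed: one header word plus the data. *)
let as_array : float array = Array.init n float_of_int

(* Total size in words, headers included, of everything reachable. *)
let list_words = Obj.reachable_words (Obj.repr as_list)
let array_words = Obj.reachable_words (Obj.repr as_array)

let () =
  Printf.printf "list: %d words, array: %d words\n" list_words array_words
```

With 3 words per cons cell and 2 per boxed float, the list should come out at about 5 words per element against roughly 1 for the array, which matches the 10MB versus 2MB estimate for 250k floats.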
<greenbagels>
for benchmarking im just using 512 floats instead of 250k
<greenbagels>
23% cache misses, yeowch
<companion_cube>
are you using valgrind to measure that?
<greenbagels>
perf
<greenbagels>
perf stat
<companion_cube>
oh nice
<greenbagels>
ill try bigarrays
<greenbagels>
pattern matching with cons is just so comfy ;_;
<companion_cube>
embrace the loops!
<greenbagels>
noooo
rgrinberg has joined #ocaml
<greenbagels>
companion_cube: haha wow you lose all the nice iterators!
<greenbagels>
guess i will have to embrace the loops
<companion_cube>
oh I mean you can write a fold or iter on bigarrays!
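A minimal sketch of what such helpers could look like for a one-dimensional float64 bigarray. The names `vec`, `iter` and `fold_left` are hypothetical and not part of the standard library; the element kind and layout are fixed in the type on the assumption that this lets the native compiler specialise the accesses.

```ocaml
open Bigarray

(* Hypothetical alias: a 1-D, C-layout bigarray of 64-bit floats. *)
type vec = (float, float64_elt, c_layout) Array1.t

(* Apply f to every element, front to back. *)
let iter (f : float -> unit) (a : vec) : unit =
  for i = 0 to Array1.dim a - 1 do
    f a.{i}
  done

(* Left fold with an explicit accumulator, again front to back. *)
let fold_left (f : 'acc -> float -> 'acc) (init : 'acc) (a : vec) : 'acc =
  let acc = ref init in
  for i = 0 to Array1.dim a - 1 do
    acc := f !acc a.{i}
  done;
  !acc

let () =
  let a = Array1.of_array float64 c_layout [| 1.0; 2.0; 3.0; 4.0 |] in
  Printf.printf "sum = %g\n" (fold_left ( +. ) 0.0 a)
```

The loops are still there, but they are written once and hidden behind the usual iterator interface.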
<greenbagels>
why isnt it just in the library at that point :p
<companion_cube>
big numerical computations isn't really something people have used OCaml for, really
<companion_cube>
it's not the focus of the stdlib _at all_
waleee has quit [Ping timeout: 264 seconds]
cocomo has quit [Remote host closed the connection]
trev has joined #ocaml
bartholin has joined #ocaml
szkl has quit [Quit: Connection closed for inactivity]
andrzejku has joined #ocaml
pi3ce has joined #ocaml
azimut has joined #ocaml
<greenbagels>
companion_cube: just seems strange to me for the stdlib to have bigarrays but not those then, but maybe there was good reason not to include it
<greenbagels>
i know ocaml tends to stay on the leaner side for standard library size
andrzejku has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
andrzejku has joined #ocaml
andrzejku has quit [Client Quit]
andrzejku has joined #ocaml
andrzejku has quit [Client Quit]
andrzejku has joined #ocaml
andrzejku has quit [Client Quit]
Tuplanolla has joined #ocaml
szkl has joined #ocaml
andrzejku has joined #ocaml
rgrinberg has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
rgrinberg has joined #ocaml
andrzejku has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<discocaml>
<commutativeconjecture> companion_cube: "Not trying to rain on your parade but doesn't coc mean you need both compile time evaluation, and some notion of erasure?" -> yes, exactly what i'm implementing
<discocaml>
<commutativeconjecture> (or more like, "typing time" evaluation, more than compile-time, as it can even be fully interpreted. but ye, pre-runtime / static-time / whatever this should be called)
andrzejku has joined #ocaml
<discocaml>
<commutativeconjecture> main part is trying to define a calculus with some stronger notion of reduction than just call-by-value (so that you can get some notion of equality on non-applied parametric-types / polymorphic expressions), but not necessarily _full_ beta-reduction (so that you don't loop infinitely on recursive types for instance)
<discocaml>
<commutativeconjecture> i'm thinking of some semi-equality on types ("for sure equal", "for sure diff", "unsure") that is refined as you pass in more parameters, and should converge on ground terms
<discocaml>
<commutativeconjecture> that way, you'd get the best of both worlds
<discocaml>
<commutativeconjecture> if anyone has inspiration for the above, pls ping/hl me
<discocaml>
<commutativeconjecture> more specifically, i'd be interested in a calculus that has a stronger notion of reduction than just call-by-value, but whose reduction order is still controllable by the programmer
<discocaml>
<commutativeconjecture> the advantage of call-by-value is that i can easily predict the order of evaluation of my programs, and control it (by using different `let in`s, etc.). not sure how to define a stronger calculus that has this nice property
azimut has quit [Remote host closed the connection]
azimut has joined #ocaml
rgrinberg has quit [Ping timeout: 268 seconds]
<rage>
does anyone use gdb for debugging ocaml? ive been using it to find a deadlock and it seems... fine
bartholin has quit [Quit: Leaving]
andrzejku has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
andrzejku has joined #ocaml
andrzejku has quit [Client Quit]
waleee has joined #ocaml
<discocaml>
<froyo> it is fine, just not the most fun for inspecting & traversing ocaml values
waleee has quit [Ping timeout: 256 seconds]
waleee has joined #ocaml
dnaq has quit [Remote host closed the connection]
dnaq has joined #ocaml
waleee has quit [Ping timeout: 240 seconds]
waleee has joined #ocaml
<rage>
makes sense, in this case it was an issue with a C library so gdb was rather handy
<greenbagels>
hmm
<greenbagels>
i feel like my computer architecture knowledge is depreciating
<greenbagels>
because for example i feel like my understanding of caching only gets worse with time, heh
waleee has quit [Ping timeout: 252 seconds]
waleee has joined #ocaml
average has quit [Quit: Connection closed for inactivity]
szkl has quit [Quit: Connection closed for inactivity]
andrzejku has joined #ocaml
noonien85 has joined #ocaml
azimut has joined #ocaml
rgrinberg has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
andrzejku has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<companion_cube>
😡 I also use it for deadlocks!!
<companion_cube>
Ugh
<companion_cube>
Rage : same
<rage>
SDL2 audio is deadlocking on me so I'm knee deep in C at the moment :(
dnh has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<discocaml>
<gabyfle> Would you recommend using OCaml for writing a language VM? I did a toy version of mine using a tree-walk algorithm for the moment, but I want to do things properly and write a VM for it, with proper bytecode so execution times can be reduced.
<discocaml>
<gabyfle> Fact is with OCaml, I never experienced any low-level control and haven't seen anything related to it. I'm wondering if OCaml has some things that can enable control over lower level stuff.
<companion_cube>
For a bytecode vm, not really, no
<discocaml>
<gabyfle> I wanted to make it in OCaml since the principal feature of my toy language is that I want it to be 100% interoperable with OCaml
<companion_cube>
I'd look into rust or C++ I think
<discocaml>
<gabyfle> Yeah ok, I just recently found rust <-> OCaml bindings so maybe I'll go for Rust
<rage>
what would you say makes OCaml unsuitable exactly?
<discocaml>
<bluddy5> You pay a cost for data structure layout, allocation, boxing. On top of that, your VM will have its own costs related to the language you're creating.
<discocaml>
<bluddy5> It's fine for a demo, not for a high performance language.
<discocaml>
<bluddy5> With Jane Street's compiler extensions though, it starts being feasible. But those aren't stable.
<rage>
makes sense, it's been working just fine for my chip8 VM but I can imagine it being infeasible for something more sophisticated
<companion_cube>
@bluddy5 I'm not entirely sure ocaml will be good for a VM ever, even with JST's extensions
<companion_cube>
You need really good control over machine words and arrays and all that, without memory barrier
dnh has joined #ocaml
<discocaml>
<commutativeconjecture> how recent are those compiler extensions? do they build upon the `[@noalloc]` idea? if so, where can i find a link to them?
<companion_cube>
You can look on their blog
<companion_cube>
There's stuff about memory regions, stack local variables, etc
<discocaml>
<Kali> these are the three that have been published so far
<companion_cube>
They also use flambda2 in prod
<discocaml>
<abstract.domain> compiling to tail recursive closures is faster than bytecode, you just have to be careful not to generate bigger closures than you need
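To make the closure idea concrete, here is a small sketch on a toy expression language. It is a simpler variant than the tail-recursive style mentioned above (no continuations), and every name in it (`expr`, `compile`, `IfZero`) is hypothetical.

```ocaml
(* Toy expression language; variables are indices into an environment array. *)
type expr =
  | Const of int
  | Var of int
  | Add of expr * expr
  | Mul of expr * expr
  | IfZero of expr * expr * expr

(* Walk the tree once and return a closure. Running the closure later does
   no pattern matching on the syntax tree and no bytecode dispatch. *)
let rec compile (e : expr) : int array -> int =
  match e with
  | Const n -> fun _ -> n
  | Var i -> fun env -> env.(i)
  | Add (a, b) ->
    let ca = compile a and cb = compile b in
    fun env -> ca env + cb env
  | Mul (a, b) ->
    let ca = compile a and cb = compile b in
    fun env -> ca env * cb env
  | IfZero (c, t, f) ->
    let cc = compile c and ct = compile t and cf = compile f in
    fun env -> if cc env = 0 then ct env else cf env

let () =
  (* (x + 2) * y with x = 3 and y = 4 *)
  let run = compile (Mul (Add (Var 0, Const 2), Var 1)) in
  Printf.printf "%d\n" (run [| 3; 4 |])
```

Each closure captures only the compiled sub-expressions it needs, which is the "don't generate bigger closures than you need" point in miniature.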
<discocaml>
<bluddy5> companion_cube: I think the key is allocation on the stack. That's where most of the speed lies, and once you're fast enough, it doesn't matter so much that you're allocating a little more.
<discocaml>
<bluddy5> I wrote a VM for a language in OCaml and it was just fine. Had allocation been on the stack, it would have been very fast.
<companion_cube>
What if you want int64 registers? Now you have to worry about boxing
<companion_cube>
OCaml and int64 is not the greatest mix in terms of perf
<discocaml>
<bluddy5> boxing isn't a problem so long as you allocate on the stack
<discocaml>
<bluddy5> once you allocate on the heap, it starts to hurt
<discocaml>
<abstract.domain> how does the compiler do with int64 from big arrays coding asm.js style?
<companion_cube>
Ah, there's bigarrays of int64
<companion_cube>
(you need to be careful to avoid polymorphism but it works)
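A sketch of the int64 bigarray idea as a tiny register file. The names `regs`, `make_regs` and `add` are hypothetical. The type is written out in full because, as far as I understand, accesses are only specialised when the kind and layout are known statically; with a polymorphic bigarray type they fall back to a generic, boxing path.

```ocaml
open Bigarray

(* Hypothetical register file: kind and layout are fixed in the type. *)
type regs = (int64, int64_elt, c_layout) Array1.t

(* Allocate n registers, all set to zero. *)
let make_regs (n : int) : regs =
  let r = Array1.create int64 c_layout n in
  Array1.fill r 0L;
  r

(* r.{dst} <- r.{a} + r.{b}, with the usual 64-bit wraparound. *)
let add (r : regs) (dst : int) (a : int) (b : int) : unit =
  r.{dst} <- Int64.add r.{a} r.{b}

let () =
  let r = make_regs 4 in
  r.{0} <- 40L;
  r.{1} <- 2L;
  add r 2 0 1;
  Printf.printf "%Ld\n" r.{2}
```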
<discocaml>
<abstract.domain> you could have a bigarray heap and write a gc for it, you could even have that part in c
<discocaml>
<abstract.domain> ive done that in chez scheme and it was competitive with my hand coded asm bytecode interpreter, because of the indirect threading there vs the direct threading i could do by compiling bytecodes to closures
<discocaml>
<abstract.domain> used more memory and generated garbage but it still executed faster in some cases
troydm has quit [Quit: What is Hope? That all of your wishes and all of your dreams come true? To turn back time because things were not supposed to happen like that (C) Rau Le Creuset]
<discocaml>
<abstract.domain> ive also done similar for compiling scheme to javascript, having to make my own stack frames with arrays, and this stuff is still an order of magnitude faster than stuff like Python which people seem to cope with
<discocaml>
<abstract.domain> if being faster than python is enough to satiate you, ocaml is definitely enough to write an interpreter in
<companion_cube>
Sure, that's not a very high bar :)
mima has quit [Ping timeout: 268 seconds]
troydm has joined #ocaml
pi3ce has joined #ocaml
dnh has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
szkl has joined #ocaml
azimut has joined #ocaml
azimut_ has joined #ocaml
azimut has quit [Ping timeout: 240 seconds]
<discocaml>
<Ada> i've written vms in ruby before for prototyping, if you're just doing a first demo you shouldn't worry too much about performance until you know what you actually want to do
<companion_cube>
If you're just prototyping a tree interpreter will work really well, and ocaml is great for that!
<discocaml>
<Ada> so if i were to need to store an int64 that is always above 0, would this be 63 bit fine?
<discocaml>
<froyo> can't you cheat by delegating most of your vm's operations to the existing ocaml runtime? especially since first-class interop is expected, you could just use the same data and pay no cost in transforming it.
<discocaml>
<Ada> so since int is 63-bit signed, to store a true 64 bit i would have to box it?
<discocaml>
<froyo> yep
<discocaml>
<froyo> I think nativeint is a little nicer in that if the compiler figures out it doesn't escape, it unboxes the value and puts it in registers
<discocaml>
<froyo> kinda like what happens with refs
<discocaml>
<Ada> but in the case of just storing a numeric ID from a sql query and returning it as text, the arithmetic penalty shouldn't apply?
average has joined #ocaml
<discocaml>
<froyo> you'd just pay memory penalty
azimut_ has quit [Remote host closed the connection]
<discocaml>
<froyo> but whether or not it's significant depends on your application
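For the numeric ID case, a small sketch of the int versus int64 trade-off. `Sys.int_size`, `Int64.of_string_opt` and `Int64.to_string` are standard library; `fits_in_int`, `id_of_text` and `text_of_id` are hypothetical helper names.

```ocaml
(* int is 63 bits on a 64-bit build and 31 on a 32-bit one;
   Sys.int_size reports the width for the current build. *)
let () = Printf.printf "int is %d bits, max_int = %d\n" Sys.int_size max_int

(* A non-negative 64-bit ID fits in a plain int only if it is at most max_int. *)
let fits_in_int (id : int64) : bool =
  Int64.compare id 0L >= 0 && Int64.compare id (Int64.of_int max_int) <= 0

(* Hypothetical round trip for an ID that arrives and leaves as text:
   no arithmetic is done on it, so the boxed int64 costs memory only. *)
let id_of_text (s : string) : int64 option = Int64.of_string_opt s
let text_of_id (id : int64) : string = Int64.to_string id
```

If every ID is known to pass `fits_in_int`, storing it as a plain `int` avoids the box entirely.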
<discocaml>
<0aty> if you are serializing the number to text and talking to a DB, i think the chances of one pointer worth of indirection being what kills you is very low
<discocaml>
<Ada> haha
<discocaml>
<0aty> so probably dont worry about it and profile if you have problems
<discocaml>
<Ada> yeah obviously, it's just interesting to know about
<discocaml>
<0aty> yea for sure :-D
azimut has joined #ocaml
<discocaml>
<Ada> it's not a real perf concern especially since at this stage one of my test machines is 32 bit haha
<discocaml>
<froyo> if you're keeping a million records in memory that's a whole extra 16 megabytes though :)
<discocaml>
<Ada> very frequently i find the performance problems are glaring, big and stupid
<discocaml>
<Ada> the other day i got really confused cause every request was taking a full two seconds
<discocaml>
<Ada> turns out windows just sucks
rgrinberg has joined #ocaml
andrzejku has joined #ocaml
<companion_cube>
?? That's a lot
azimut has quit [Ping timeout: 240 seconds]
azimut_ has joined #ocaml
<discocaml>
<Ada> yeah ik
<discocaml>
<Ada> wsl
<discocaml>
<Ada> port forwarding thing
<discocaml>
<Ada> doesn’t work
<discocaml>
<Ada> i think it’s systemd messing with something but idfk
rgrinberg has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
<companion_cube>
Oh right, that sounds messy
<discocaml>
<Ada> its supposed to automatically forward ports on localhost to the linux vm
<discocaml>
<Ada> but for some reason it has a huge penalty over just connecting to the ip of the vm
<discocaml>
<Ada> and has weird conflicts with microsoft's own systemd implementation!
<discocaml>
<Ada> and this is still by far the best development environment that has ever existed on windows
szkl has quit [Quit: Connection closed for inactivity]
andrzejku has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
mima_ has joined #ocaml
rgrinberg has joined #ocaml
andrzejku has joined #ocaml
andrzejku has quit [Client Quit]
andrzejku has joined #ocaml
andrzejku has quit [Client Quit]
andrzejku has joined #ocaml
andrzejku has quit [Client Quit]
dnh has joined #ocaml
mima_ has quit [Ping timeout: 260 seconds]
rgrinberg has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
dnh has quit [Client Quit]
trev has quit [Quit: trev]
rgrinberg has joined #ocaml
andrzejku has joined #ocaml
rgrinberg has quit [Quit: My Mac has gone to sleep. ZZZzzz…]