<discocaml>
<dubious245> *Looks up from mad programming spree.*
<discocaml>
<dubious245> I am unrolling the AST into some psuedo instructions that look like machine code, I guess thats my IR. I can go from there to proper assembly eventually.
<discocaml>
<dubious245> I was struck by the problem of translating the recursive AST into a linear data structure so I walked the tree in post order, and then read in that new list with a stack and started building my little pseduo instructions.
<discocaml>
<edhebi> ime it's work making sure those instruction are flexible enough that it doesn't get in the way any time you want to change something somewhere
<discocaml>
<dubious245> I was struck by the problem of translating the recursive AST into a linear data structure so I walked the tree in post order, and then read in that new list with the help stack and started building my little pseduo instructions.
<discocaml>
<edhebi> also a good number of optimizations work best on that ir, even some small stuff
<discocaml>
<edhebi> are you going for interpreted or compiled ? (or whatever's in between, esp reg. memory managment)
<discocaml>
<dubious245> I am eventually aiming to compile to raspberry pi 5 or that riscv machine on my desk.
<discocaml>
<dubious245> I feel like this isnt the right way to start converting the AST into a more linear data structure but I am not sure what I should use.
<discocaml>
<contificate> good to know CPS for this low key
<discocaml>
<dubious245> Unfortunately the prof who taught the compiler class at my uni quit so I cant ask him.
<discocaml>
<dubious245> -# (Or maybe he was fired. He left awfully quick.)
<discocaml>
<dubious245> :bThoughtful:
<discocaml>
<contificate> there's a trick to this
<discocaml>
<dubious245> Isnt... CPS what I am doing with tail recursion or are they not the same?
<discocaml>
<contificate> yes
<discocaml>
<contificate> let me explain
<discocaml>
<contificate> the problem is that you are writing a recursive algorithm where things deeper in the original tree need to become bound shallower in the output tree
<discocaml>
<contificate> so there's some threading of names that must happen
<discocaml>
<contificate> you can use CPS to do this, e.g. you normalise an expression, give it a fresh name, then push it to the context where it's used, via the continuation
<discocaml>
<contificate> this is the standard way ANF and CPS conversion is done
<discocaml>
<contificate> algorithmically
<discocaml>
<dubious245> ANF is new, whats that?
<discocaml>
<contificate> A-Normal Form
<discocaml>
<edhebi> for the record, CPS is a way to handle/abstract/cleanup variable names
<discocaml>
<contificate> a mid-level intermediary representation used in compilers for functional languages, although it is rather ad-hoc
<discocaml>
<contificate> it has the important properties one cares about
<discocaml>
<contificate> hate to link my own garbage, but I explain it in https://compiler.club/anf-conversion/ and https://compiler.club/compiling-lambda-calculus/ - the principles are the exact same for imperative languages as well, you just do a "leaders" algorithm (refer dragon book) to create basic blocks after that, after linearising all control flow to normalised (conditional) goto-like constructs
<discocaml>
<contificate> but yeah the post-order traversal with name stacks is a very imperative way of going about it and somewhat isomorphic to defunctionalising the CPS version, so it's somewhat similar but more error prone and tedious
<discocaml>
<edhebi> one thing that CPS does well, and it's kind of a recurring patterns in compiling, is that it makes everything look the same
<discocaml>
<edhebi> just like it's nicer when all your loops are sugar for the same underlying construct, this is doing this but for control flow in general
<discocaml>
<edhebi> just like it's nicer when all your loops are sugar for the same underlying construct, this is doing this but for control flow in general, or at least "nested" flow
<discocaml>
<contificate> I'm using CPS purely for threading of names in the normalisation algorithm, but yeah join points are effectively non-first-class continuations, and desirable in ANF to avoid duplication of evaluation contexts within conditional constructs
<discocaml>
<contificate> there are obviously compilers with CPS IRs (and some that try to lower this faithfully) but it's out of fashion, and of course the original 0-cfa work is phrased over CPS scheme
<discocaml>
<edhebi> on that topic for a really high level overview of the whole program to exec flow, I really recommand giving https://craftinginterpreters.com a read
<discocaml>
<edhebi> it's free, not too long, and Robery Nystrom is a pleasure to read
<dh`>
the advantage of CPS conversion is that it removes the difference between call and return; so is the disadvantage
<discocaml>
<contificate> very convenient for this algorithm
<discocaml>
<dubious245> Still reading the content and more of a style question: why do you put the recursive function inside of another function?
<discocaml>
<contificate> my preferred style, nested `go`
<discocaml>
<edhebi> same here, tho I admit I kind of stole it from other ocaml I've been reading
<discocaml>
<edhebi> I just like not putting more stuff at the top level than I need to
<discocaml>
<contificate> yeah, but many algorithms are also just partially applied variants - where you seed with an initial thing, I get so used to just writing `go` that it feels odd not to do that
<discocaml>
<edhebi> funnily enough it feels rly natural but I think I've only see it done that much in ocaml
<discocaml>
<contificate> I've seen it a decent amount in scheme
<discocaml>
<edhebi> I'd admit I've not used scheme a ton
<discocaml>
<edhebi> I've use racket a bit but not much
<discocaml>
<edhebi> I'll admit I've not used scheme a ton
<discocaml>
<edhebi> I've used racket a bit but not much
<discocaml>
<Kali> i see it in haskell pretty often
<discocaml>
<edhebi> I feel it's more common to have `f` call `f'` and both at the toplevel
<discocaml>
<contificate> it's just easier to retrofit auxilliary data structures if they are free variables of the nested function
<discocaml>
<contificate> otherwise you're stuck threading stuff around
<discocaml>
<contificate> via arguments
<discocaml>
<dubious245> I dont think I would be able to implement this and maintain it.
<discocaml>
<contificate> c'mon now
<discocaml>
<dubious245> I am very dumb.
<discocaml>
<contificate> nah
<discocaml>
<dubious245> You make a pretty compelling argument.
<discocaml>
<contificate> this is actually how I do it in C as well, but I defunctionalise it first
<discocaml>
<contificate> there's other ways to do it, but they're more awkward
<discocaml>
<contificate> you can have this notion of "embedding" contexts, e.g. you return like `Anf.value * (Anf.t -> Anf.t)` and then compose those on the way up to push contexts down
<discocaml>
<dubious245> I cam see that, might tree to list and then processing it in order with a stack doesn't feel quite right, like I am losing information.
<discocaml>
<contificate> CPS solves all this problem so neatly
<discocaml>
<dubious245> I can see that, my tree to list and then processing it in order with a stack doesn't feel quite right, like I am losing information.
<discocaml>
<contificate> this is actually why I originally learned CPS
<discocaml>
<contificate> this is how Andrew Appel does CPS conversion in his book "Compiling with Continuations"
<discocaml>
<edhebi> you know what, I'm currently working on a system with a ton of pretty specific queries about the state, all super dynamic (it's the core engine for a TCG) and I hadn't though about approching it from a compiler design perspective, and it actually might solve a bunch of design issues I've been having
<discocaml>
<contificate> everything is a compilers problem π
<discocaml>
<edhebi> all programmes are either compilers or databases
<discocaml>
<contificate> and SQLite is mostly a compiler
<discocaml>
<edhebi> the real secret is that compilers are secretly databases
<discocaml>
<edhebi> all programms are either compilers or databases
<dh`>
there are also operating systems
<dh`>
and tbf most "applications" are more like operating systems that compilers
<dh`>
at least, most of them most of the time
<discocaml>
<edhebi> I feel like that one's even more true when look at hardwares
<dh`>
as does libplacebo
<dh`>
woops
<discocaml>
<edhebi> modern cpus are more and more like operating systems with the machine code being an interpreted language
<qu1j0t3>
~ microcode has existed for ~70 years tho ~
<discocaml>
<edhebi> obviously, but also things where a lot less abstracted away 70y ago :p
<qu1j0t3>
less complex, yes.
<qu1j0t3>
but architecturally not so different.
<discocaml>
<edhebi> eh
<discocaml>
<edhebi> speculative exec and register renaming is what, from the 80s ?
<qu1j0t3>
yes, but microcode is older. those are in Hennessy and Patterson i think (so at least MIPS era, probably earlier)
<discocaml>
<edhebi> yeah, but I'd argue those looked _less_ like operating system than the post-80s ones
<discocaml>
<edhebi> so, there has been an evolution in there :p
ygrek has quit [Remote host closed the connection]
<dh`>
if the cpu is an interpreter, it's _not_ an operating system
<qu1j0t3>
i'm more worried about the actual o/s that are on colocated die/s though. :(
<qu1j0t3>
you can't trust anything any more :)
<discocaml>
<edhebi> ye, computing and architecture have mostly evolved the way that each level of complexity is an opaque box
<discocaml>
<edhebi> with all the goods and bads of encapsulation
myrkraverk_ has joined #ocaml
myrkraverk has quit [Ping timeout: 268 seconds]
euphores has quit [Ping timeout: 244 seconds]
euphores has joined #ocaml
mange has joined #ocaml
myrkraverk has joined #ocaml
myrkraverk_ has quit [Ping timeout: 268 seconds]
Serpent7776 has joined #ocaml
bartholin has joined #ocaml
Haudegen has joined #ocaml
Guest57 has joined #ocaml
Guest27 has joined #ocaml
Guest57 has quit [Quit: Client closed]
Guest27 has quit [Ping timeout: 240 seconds]
bartholin has quit [Quit: Leaving]
keyboard has quit [Quit: keyboard]
patrick_ is now known as patrick
dhil has joined #ocaml
patrick has quit [Changing host]
patrick_ has joined #ocaml
eilvelia has quit [Ping timeout: 252 seconds]
eilvelia has joined #ocaml
szkl has joined #ocaml
kit_ty_kate1 has joined #ocaml
<kit_ty_kate1>
avidseeker: are you around?
tremon has joined #ocaml
mange has quit [Quit: Zzz...]
Haudegen has quit [Quit: Bin weg.]
ygrek has joined #ocaml
Haudegen has joined #ocaml
szkl has quit [Quit: Connection closed for inactivity]
<discocaml>
<froyo> there's no way these job positions are still open. and there's no way these are the most recent ocaml postings. i wonder what's the weak link in this current job board data sourcing/updating
euphores has joined #ocaml
<discocaml>
<contificate> it's done via github and people forget
Haudegen has quit [Quit: Bin weg.]
<discocaml>
<dubious245> The weak link is always between the chair and the keyboard.
myrkraverk_ has joined #ocaml
myrkraverk has quit [Ping timeout: 248 seconds]
bartholin has joined #ocaml
<discocaml>
<yawaramin> using GitHub as a CMS
<discocaml>
<gooby_diatonic> Pragmatic
<discocaml>
<dubious245> In regards to the little programming language I am working on If I match ocaml's ABI then theoretical it should be easy to call into ocaml code or be called from ocaml code right?
<discocaml>
<dubious245> In regards to the little programming language I am working on If I match ocaml's ABI then theoretically it should be easy to call into ocaml code or be called from ocaml code right?
<dh`>
ffi'ing is never easy
<discocaml>
<contificate> you don't need to match the ABI of OCaml, only C
<discocaml>
<contificate> as you can already do those things to/from C
<discocaml>
<dubious245> It's not easy but I feel there are design decision one could be aware of up front that can make it harder than it needs to be.
<discocaml>
<contificate> you could target OCaml's IR if you want a little challenge
<discocaml>
<contificate> there's a project, albeit hacky, online around this idea
<discocaml>
<Kali> malfunction, right?
<discocaml>
<contificate> yeah
<discocaml>
<dubious245> What a name.
<discocaml>
<yawaramin> Chez Scheme is a pretty interesting compilation target imho
<discocaml>
<yawaramin> > Chez Scheme also includes extensive support for interfacing with C and other languages, support for multiple threads possibly running on multiple cores, non-blocking I/O, and many other features.
<discocaml>
<deepspacejohn> just compile to ocaml :ocaml:
Haudegen has joined #ocaml
<discocaml>
<dubious245> Dont look at me with those glowing red eyes.
<dh`>
ffi'ing to a managed language means interfacing the garbage collector, which is not impossible but a major headche
<dh`>
what are the characteristics of the backend target you want?
<discocaml>
<deepspacejohn> slightly more serious, the last time I wrote a toy language I found it very satisfying to use tagless-final style so I could write it with generic, abstract semantics and then have multiple implementations for it that meet the same interface.
<discocaml>
<deepspacejohn> like have an implementation that executes it in OCaml code, an implementation that prints javascript code, etc.
<discocaml>
<dubious245> I will target the QEMU Risc-V virtual board for the time being because its convenient and hopefully move on to the starfive development board in my closet.
<discocaml>
<dubious245> Or maybe a rasberry pi 4/5. It depends. Generally speaking I am looking to have the only thing running on a complete board being my code.
dhil has quit [Ping timeout: 246 seconds]
myrkraverk__ has joined #ocaml
myrkraverk_ has quit [Ping timeout: 245 seconds]
<dh`>
that's not what I meant, but if so why not just compile to asm?
<discocaml>
<dubious245> What did you mean?
<discocaml>
<dubious245> That's what I am working on now.
<dh`>
well, you were talking about ocaml's abi above
<dh`>
which prompted suggestions about compiling _to_ ocaml or other things
<dh`>
but if you're intending to run on bare metal you don't want anything garbage-collected
<dh`>
(you _can_ write a concurrent collector in asm, you just don't want to)
<discocaml>
<contificate> target armv8 (rpi) it's nicer than RISC-V
<discocaml>
<contificate> feel like baremetal is tedious to start with, easier to get started and have affirmative output by just linking against libc in userspace and printing stuff
<dh`>
erm. that's debatable
<discocaml>
<contificate> depends if you have taste or not
<discocaml>
<contificate> ARMv8 at least makes instruction selection somewhat interesting
<dh`>
*eye*
<dh`>
"interesting" instruction selection is not a feature
<dh`>
especially for someone new to compiling
<dh`>
or newish
<dh`>
(which I think is more or less the case, right?)
<dh`>
anyway, putting my kernel hat on for a moment what I'd recommend is what works best in qemu this year
<discocaml>
<contificate> my preference would be ARMv7 really, but RISC-V if I was teaching a uni course
<dh`>
sorry, armv8/9 has some redeeming properties but arm32 is disgusting :-p
<discocaml>
<contificate> not even comparable to x86
<discocaml>
<contificate> my first ISA was ARMv7, has a place in my heart
<dh`>
...weird instructions, global state, innumerable weirdness in instructions...
<dh`>
stuff like sign-magnitude immediates for no reason
<dh`>
or the weird rotating immediates
<discocaml>
<contificate> beautiful, just use a constant pool if worried
<dh`>
if it's what you started on, I can understand your position, but I still can't agree with it :-)
<dh`>
for teaching, definitely riscv
<discocaml>
<contificate> nicer than MIPS and x86 which are both used in academia more often
<dh`>
nothing is worse than x86, but mips is relatively clean
<discocaml>
<dubious245> I have always been interested in programming languages and embedded, basically my favorite things but I've always lacked the technical experience to do anything with that.
<discocaml>
<dubious245>
<discocaml>
<dubious245> Now with with half a dozens math classes and CS classes under my belt, I am taking a crack at this again for the first time in a few years and making serious progress. I spat out a simple lexer, parser, and crude compiler in two days and I have some basic assembly that I can feed into a GNU cross compiler. So compared to my past efforts I feel like I have the necessary tools to do it or learn how to do it.
<discocaml>
<dubious245> But yes I am some what new to actually getting anything working, I have heard lots of terms before from previous efforts.
<discocaml>
<contificate> I liked just running toolchains on target hardware, sshfs'ing my arch linux rpi
<dh`>
anyway if I were writing a compiler for fun I'd make it retargetable, no need to embrace any one of these follies
<discocaml>
<contificate> the worst of all worlds
<dh`>
dubious245: you can do it ;-)
<dh`>
contificate: yes and no
<discocaml>
<contificate> you still gotta target stuff with retargetable stuff, unless you mean not writing your own backends
<dh`>
oh sure, but the interesting part becomes framing the retargeting scheme
<dh`>
and then you aren't committed to any architecture's insanity in the real compiler
<discocaml>
<contificate> every large mainstream compiler maintains a pattern matching esolang for this and lots of handwritten special cases and predicates
<dh`>
yes, I know, and they suck, that's what makes it an interesting problem
<discocaml>
<contificate> they do the job, thing is there's an 80:20 rule and diminishing returns
<dh`>
well yes, gcc also does the job, then the question is why one is writing something else
<discocaml>
<dubious245> Ideally as I understand it I compile my source code to some sort of assembly like instruction set, which I am doing now if a limited subset for testing purposes. Then compile that to assembly for a specific machine. Once I have that pipeline working I can add other targets to take that IR to different assemblies. But in the meantime with that single target proven to work I can start actually developing things out and fleshing them out.
<discocaml>
<dubious245> One thing I have noticed is the tendency to spend a lot of time on one stage of the pipeline. Then when I move onto the next and start learning about it realize I made some mistakes in the output.format of the previous stage so I go back and fix it. Rinse and repeat.
<discocaml>
<dubious245>
<discocaml>
<dubious245> so I want to go as far as I can with the bare minimum proven out just to make sure my understanding of each stage is correct.
<discocaml>
<contificate> pretty much, be sure to model parallel moves - lest you commit the mistake every undergrad following Appel's Tiger books commit
<discocaml>
<contificate> honestly when I teach compilers, I don't even do lexing and parsing until later - so you've avoided the first self-gatekeeping endeavours already
<discocaml>
<dubious245> I noticed there is a lot of information online about parsing and not alot on what happens after that except for a basic "evaluate the AST in place."
<discocaml>
<contificate> yeah it's why a lot of resources are post order things that emit stack machine bytecode, it's not very well covered in a cohesive, practical, way
<discocaml>
<dubious245> -# *googles parallel moves. Gets an article you wrote.*
<discocaml>
<contificate> wow, what are the chances
<discocaml>
<gooby_diatonic> Colin taught me everything I know about compilers, which isn't much but π
<qu1j0t3>
lcc is an interesting case study of a "retargetable" compiler
<qu1j0t3>
a literate program published as a book)
_ohm has joined #ocaml
_ohm has left #ocaml [Konversation terminated!]
agentcasey has quit [Quit: ZNC 1.10.x-git-38-e8c4cda0 - https://znc.in]
remexre has quit [Ping timeout: 245 seconds]
remexre has joined #ocaml
remexre has quit [Ping timeout: 276 seconds]
Tuplanolla has joined #ocaml
<discocaml>
<contificate> LCC is cute because of iburg, really have to read the code as well - literate programming is tedious with C
<qu1j0t3>
Knuth did switch from Pascal to C completely though
<qu1j0t3>
(i don't find it a particularly pleasing change for his work lol)
remexre has joined #ocaml
<dh`>
literate program is tedious in general, cannot recommend
<dh`>
s/program/programming/
<dh`>
<dubious245> so I want to go as far as I can with the bare minimum
<dh`>
^ yes, absolutely]
<dh`>
I remember after looking through lcc and burg/lburg that I'd decided it wasn't the right way to do it, but I can't remember why
<qu1j0t3>
i did a pdp-11 target with lcc but the obstacles weren't lburg approach but rather the design assumptions of a more modern, 32 bit processor
<qu1j0t3>
even pdp-11 was slightly too exotic, but the original project was word addressed archs: And forget that lol
<discocaml>
<contificate> iburg is quite limited, bottom up matching is quite tedious honestly - way more effort than top-down
Serpent7776 has quit [Ping timeout: 276 seconds]
remexre has quit [Ping timeout: 260 seconds]
ced2 has joined #ocaml
ced2 has quit [Changing host]
ced2 is now known as cebd
cebd is now known as cedb
kron has quit [Quit: kron]
remexre has joined #ocaml
bartholin has quit [Quit: Leaving]
Exa has quit [Quit: see ya!]
kron has joined #ocaml
remexre has quit [Ping timeout: 252 seconds]
Exa has joined #ocaml
remexre has joined #ocaml
remexre_ has joined #ocaml
remexre has quit [Ping timeout: 265 seconds]
remexre_ is now known as remexre
remexre_ has joined #ocaml
remexre has quit [Read error: Connection reset by peer]
remexre_ is now known as remexre
remexre has quit [Ping timeout: 252 seconds]
<avidseeker>
kit_ty_kate1: I'm online
<kit_ty_kate1>
avidseeker: oh i think i know! `opam reinstall ocamlfind`