sorear changed the topic of #riscv to: RISC-V instruction set architecture | https://riscv.org | Logs: https://libera.irclog.whitequark.org/riscv | Backup if libera.chat and freenode fall over: irc.oftc.net
<xentrac> I think L4's approach to space allocation is inspiring
jn has quit [Remote host closed the connection]
jn has joined #riscv
<dh`> it is, in a certain way, but it's also extremely elaborate
<sorear> L4 or seL4?
* sorear wants to rank kernels by per-page VM overhead now
* jrtc27 hides
dionysos is now known as zoombie
<sorear> a sel4 dynamic system needs a Page and an Untyped for every page to support memory reuse, add the PTE for a uniquely mapped page and you have 9 words before the VM server's userspace allocation _starts_
zoombie is now known as dionysos
<xentrac> well, I was thinking of seL4 really
<xentrac> but I'm not entirely clear on the evolution of the mechanism through the illustrious and sordid history of L4
<xentrac> the thing I thought was inspiring about it, in particular, was the total elimination of allocation failure from the kernel, because (as I understand it) all the pages are owned by userland code, even the ones the kernel uses
<xentrac> let's see
<xentrac> I've read this paper before but forgotten all of it
vagrantc has quit [Quit: leaving]
<xentrac> it says the mechanism in question was specifically invented in seL4
<xentrac> > This led us to a radically new resource-management model, where all spatial allocation is explicit and directed by user-level code, including kernel memory (Elkaduwe et al. 2008).
<xentrac> (§2.2)
<xentrac> I should actually try using seL4 to see what it's like
Sos has quit [Quit: Leaving]
<xentrac> I have a hypothesis about seL4 that I am uncertain about: I think there is no way for a process accepting a mapping of a memory page from another process to guarantee that the grantor has not retained access to the page. do you know if that is true, sorear?
<sorear> that's correct
<sorear> in a bunch of different ways sel4 requires acceptors of capabilities to trust their sources; if you want to set up a channel between two user processes you need a mutually trusted server to create the needed resources
<xentrac> if you have a mutually trusted server, can they then safely pass a memory region back and forth in a way that guarantees that only one of them has access to it at a time?
<xentrac> my motivating example here is sending a frame of video from a windowed application to a window server, which retains the frame until it has copied the visible part of it into the hardware framebuffer, then sends it back to the application for recycling
<xentrac> it's not a very good motivating example, I admit, because it only guards against accidental bugs rather than security violations
<xentrac> but there are historically lots of cases where fairly similar communication patterns have given rise to TOCTOU vulnerabilities
<sorear> xentrac: sure, the MTS can unmap from a process (if the process was never given the frame cap) or revoke the frame cap (if it was)
<sorear> although I think the idea is to use TOCTOU-safe ring buffers for most things and avoid remapping during operation
<xentrac> "avoid remapping"? well, I do want the page to get unmapped from process A and mapped in process B, and then vice versa; isn't that "remapping"?
FluffyMask has quit [Quit: WeeChat 2.9]
<sorear> yes, that's what you have stated you want, but it's not super well optimized for in either sel4 (to map 50 pages into another address space you need to make 50 syscalls) or hardware (TLB flushing)
<xentrac> hmm, I wonder if there's a way to get around the TLB flushing problem. obviously the 50-syscalls problem is a thing you could fix
<xentrac> without custom hardware :)
<xentrac> a typical graphics window is on the order of 4 megabytes, so in that case a small number of huge pages might be a reasonable solution; copying those 4 megabytes into the framebuffer is going to take tens or hundreds of microseconds, which would swamp the cost of even a full TLB flush
<xentrac> but it's not clear that that's a good solution for things like Unix text pipelines
<xentrac> the cases where zero-copy communication matters (which is sort of what I'm really after) are the cases where the volume of data is fairly high and the computational intensity (in the HPC sense) is low
<sorear> back when wayland was new they did tests and found that the breakeven point for memory remapping was consistently around 256kb, but that was with Linux and (now) 10 year old hw
<xentrac> which way do you suppose it's gone?
<sorear> hardware side? up. tlbs have gotten bigger and the cost of a TLB miss is a bunch of extra round trips to memory, which have been getting more expensive in cycle terms
<sorear> linux vs sel4? could go either way
<xentrac> if the computational intensity is high then you might as well just copy. but I'm maybe unreasonably infatuated by this vision that I can map in a FlatBuffers file and navigate it selectively
<xentrac> follow pointers
<sorear> if you're using a marshalling system like flatbuffers it can be made toctou-safe without unmapping
<sorear> you just need to guarantee that you're accessing each word once
<xentrac> ?
<sorear> "if there's a way to get around the TLB flushing" theoretically it's straightforward to make the TLB coherent with an inclusive L2 cache, this would have adverse effects on TLB reach that are probably large but hard for me to estimate without trying it
<sorear> if you never access a word in the buffer more than once at the asm level (in particular you are using volatile/atomic reads), then it doesn't matter whether it's being concurrently modified or not because you will see the old version or the new version
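A minimal sketch of the read-each-word-once rule sorear describes, assuming a C11 compiler with `<stdatomic.h>`. The `shared_msg` layout, `MSG_CAP`, and `copy_msg_once` are hypothetical names made up for illustration; they are not part of FlatBuffers or seL4. The length word is loaded exactly once into a private variable, so the bounds check and the copy loop cannot disagree even if the sender rewrites the buffer concurrently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_CAP 64

/* Hypothetical message layout living in memory the sender can still write. */
struct shared_msg {
    _Atomic uint32_t len;            /* byte count claimed by the sender */
    _Atomic uint8_t  data[MSG_CAP];  /* payload */
};

/* Copy the payload into private memory, loading each shared word at most once.
   Returns the number of bytes copied, or -1 if the claimed length is out of range. */
int copy_msg_once(struct shared_msg *m, uint8_t *out, size_t out_cap)
{
    assert(m != NULL && out != NULL);

    /* Single load of len: the check and the loop both use this private snapshot. */
    uint32_t len = atomic_load_explicit(&m->len, memory_order_relaxed);
    if (len > MSG_CAP || len > out_cap)
        return -1;

    /* Each payload byte is loaded once; later decisions use only `out`. */
    for (uint32_t i = 0; i < len; i++)
        out[i] = atomic_load_explicit(&m->data[i], memory_order_relaxed);

    return (int)len;
}
```

The copied bytes may still be a mix of old and new data if the sender writes during the copy; the sketch only guarantees that the receiver never validates one value and then acts on a different one.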
<jrtc27> would your coherent TLB roll back speculative TLB hits?
<jrtc27> or would you still want some form of fence that doesn't flush the TLB, just the pipeline
<sorear> the latter is simpler since the ISA already has sfence.vma (and fence.i)
<jrtc27> yeah
<jrtc27> I don't think you want to have it as speculative state for every instruction...
<jrtc27> :P
<sorear> the Fun Part here is that each TLB entry is in the worst case pinning 3 cache lines
<sorear> or much worse if you have sv48+H
<jrtc27> yeah... though you could do exclusive on the proviso that DMA isn't coherent
<jrtc27> (because wth are you doing DMA'ing to page tables)
<sorear> you have a memory access instruction spanning two pages and accessing a third and suddenly you need like 50 associativity to guarantee forward progress
<jrtc27> 3*4*2 is "only" 24 :)
<sorear> 3*4^4
<jrtc27> can make it 32 if it's an unaligned access supported in hardware
<sorear> 3*4^2 rather
<jrtc27> why squared?
<sorear> because each level of the stage 1 PTW requires a separate stage 2 PTW
adjtm has joined #riscv
<jrtc27> oh
<jrtc27> oh right
<jrtc27> those are guest physical
<jrtc27> is it not 5*4 then?
<jrtc27> 4 PTEs and the actual page you want
<sorear> get the cursed thing working and then realize you have 3/4 of a HTM
<jrtc27> 5 guest physical addresses
<sorear> I think so but didn't want to do that much math that quickly
<jrtc27> pretty sure it's 3*5*4=60 then :D
<jrtc27> which I believe is what the kids call a "big oof"
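The tally jrtc27 arrives at can be written out as a small function. This only reproduces the arithmetic from the chat, under the assumptions stated there: four-level page tables in both stages (sv48 with the hypervisor extension), three pages touched by one instruction, and every stage-2 PTE landing in its own cache line. The function name is made up for illustration.

```c
#include <assert.h>

/* Stage-2 PTE fetches for `pages` distinct pages under two-stage translation.
   Stage 1 produces one guest-physical address per level plus the final page
   address, and each of those needs a full stage-2 walk. */
unsigned two_stage_walk_lines(unsigned pages, unsigned s1_levels, unsigned s2_levels)
{
    assert(pages > 0 && s1_levels > 0 && s2_levels > 0);
    return pages * (s1_levels + 1) * s2_levels;
}
```

With `pages = 3` and four levels in each stage this gives 3*5*4 = 60; a fourth page for a hardware-supported unaligned access would raise it to 80.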
<sorear> not sure if I've seen a fleshed out design for a variable-associativity victim cache (in hw - qemu's tlb doesn't count)
<sorear> really though you just mark the TLB entries as "potentially stale" when their underlying cache lines are evicted, and ignore that bit until the sfence
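A toy software model of the "potentially stale" bit, offered as one reading of sorear's remark rather than a hardware design. Everything here is an assumption for illustration: the struct layout, the function names, the fixed-slot fill, and the simplification that each entry depends on a single PTE cache line (a real walk touches several). Lookups ignore the bit; the fence drops only the entries whose backing line left the coherent cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 4

struct tlb_entry {
    bool     valid;
    bool     maybe_stale;  /* backing PTE line was evicted since the fill */
    uint64_t vpn;
    uint64_t ppn;
    uint64_t pte_line;     /* cache line address holding the leaf PTE */
};

struct tlb { struct tlb_entry e[TLB_ENTRIES]; };

/* Install a translation in a given slot. */
void tlb_fill(struct tlb *t, size_t slot, uint64_t vpn, uint64_t ppn, uint64_t pte_line)
{
    assert(slot < TLB_ENTRIES);
    t->e[slot] = (struct tlb_entry){ true, false, vpn, ppn, pte_line };
}

/* Lookups deliberately ignore maybe_stale. */
bool tlb_lookup(const struct tlb *t, uint64_t vpn, uint64_t *ppn)
{
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (t->e[i].valid && t->e[i].vpn == vpn) {
            *ppn = t->e[i].ppn;
            return true;
        }
    }
    return false;
}

/* Cache eviction hook: coherence can no longer vouch for these entries. */
void tlb_on_line_evicted(struct tlb *t, uint64_t line)
{
    for (size_t i = 0; i < TLB_ENTRIES; i++)
        if (t->e[i].valid && t->e[i].pte_line == line)
            t->e[i].maybe_stale = true;
}

/* Fence: drop only the entries that lost their coherence guarantee. */
void tlb_sfence(struct tlb *t)
{
    for (size_t i = 0; i < TLB_ENTRIES; i++)
        if (t->e[i].valid && t->e[i].maybe_stale)
            t->e[i].valid = false;
}
```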
<jrtc27> hence my comment about exclusivity being fine so long as DMA isn't a thing
<sorear> I don't understand that bit. what is being exclusive of what, and how does DMA come in?
<jrtc27> as in, what you said
<jrtc27> you're allowed to evict from the L2
<jrtc27> without evicting from the TLB
<jrtc27> exclusive is probably the wrong term, but it's not an inclusive cache
<jrtc27> and the DMA is just because if it's not in the L2 you probably aren't snooping DMA to make it coherent with the TLBs
<jrtc27> (seems NINE, Non-Inclusive Non-Exclusive, is the term...)
riff_IRC has quit [Read error: Connection reset by peer]
ovh has joined #riscv
davidlt has joined #riscv
<xentrac> sorear: true, I suppose it's impossible to distinguish a concurrent modification from a previous incomplete modification if you're only accessing each word in the potentially shared region at most once. but of course it doesn't ensure that what you see is consistent, which is the property of interest in my framebuffer example; but relying on the data you see to be consistent in order to, say, not crash or disclose secrets, means you are relying on its sender!
<xentrac> 60 is horrifyingly bad, like VAX bad
<sorear> exactly, if you're trusting the sender you can trust the sender to respect synchronization, if you're *not* trusting the sender then a "torn" page is no worse than what the sender might otherwise have sent
<xentrac> I think this is pointing out some important holes in my thinking, and I really appreciate these insights
<xentrac> yeah. echoing my previous whining about hypervisor shibboleths, I've been trying to shift from "trust" terminology to "rely on" terminology for a couple of reasons:
<xentrac> 1. "trust" has lots of warm fuzzies associated with it which are extremely counterproductive in security discussions, since "trusting" other components to function properly is the thing we want to minimize
<xentrac> 2. the whole template is something like "in order to do W, X relies on Y to do/not do Z" and "rely on" seems to encourage people to at least *mention* W in a way that "trust" does not (although maybe it should; I think it was Alan Karp who said he trusts his relatives to watch his kids but not keep his money, but he trusts his bank to keep his money but not watch his kids)
<xentrac> what do you think?
<sorear> i agree with the importance of specifying WXYZ. *bangs drum* security is meaningless without a threat model!
<xentrac> heh
<xentrac> amen brother
<xentrac> or sister
<xentrac> amen sibling!
<TwoNotes> VAX was a product of its times
<xentrac> so was the 68010 and it didn't have a way to cause 60 page faults in one instruction
<xentrac> another of my sort of motivating examples is that I'd like to be able to run very short processes to control information flow, and doing this efficiently sort of suggests mapping in most of the data the process could want in a FlatBuffers-like form. maybe only into its virtual address space, though, rather than necessarily prefetching it from your SSD
<xentrac> and you don't necessarily want that stuff mapped read/write
<TwoNotes> VAX put the Complex into CISC
<xentrac> Linux takes about 100 μs to fork+exit+wait, which is a pretty discouragingly large amount of overhead for a fundamental security isolation primitive
<TwoNotes> Trouble was, they let the high-level language people and mathematicians design the instruction set. I know - I was there
<xentrac> yeah? what were you working on?
<TwoNotes> Bliss compilers
<xentrac> ever write a bliss-86?
<xentrac> by contrast a context switch between existing processes is down below 10 μs
<xentrac> I've never written anything in bliss-*, closest I've come is various Forths
<sorear> the last time i benchmarked context switches on linux, on an ultra low power laptop, i was getting 5 µs for pipes or futexes and 20 µs for TCP sockets
<xentrac> sounds about right
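A rough sketch of how a pipe-based figure like the one sorear quotes could be measured, assuming a POSIX system. The function name is made up, and this is not the benchmark sorear ran. A parent and child bounce one byte over two pipes; on a multicore machine the two may run on different CPUs, in which case the result reflects wakeup latency more than a pure context switch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Average microseconds per one-way hop of a one-byte ping-pong over two pipes.
   Returns a negative value on failure. */
double pipe_pingpong_us(int round_trips)
{
    assert(round_trips > 0);
    int to_child[2], to_parent[2];
    if (pipe(to_child) != 0 || pipe(to_parent) != 0)
        return -1.0;

    pid_t pid = fork();
    if (pid < 0)
        return -1.0;

    char byte = 'x';
    if (pid == 0) {
        /* Child: echo each byte back until the parent closes its write end. */
        close(to_child[1]);
        close(to_parent[0]);
        while (read(to_child[0], &byte, 1) == 1)
            if (write(to_parent[1], &byte, 1) != 1)
                break;
        _exit(0);
    }

    close(to_child[0]);
    close(to_parent[1]);

    struct timespec start, end;
    int ok = 1;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < round_trips && ok; i++)
        ok = write(to_child[1], &byte, 1) == 1 &&
             read(to_parent[0], &byte, 1) == 1;
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(to_child[1]);   /* child sees EOF and exits */
    close(to_parent[0]);
    waitpid(pid, NULL, 0);
    if (!ok)
        return -1.0;

    double ns = (end.tv_sec - start.tv_sec) * 1e9 + (end.tv_nsec - start.tv_nsec);
    return ns / 1e3 / (2.0 * round_trips);   /* two hops per round trip */
}
```

Calling `pipe_pingpong_us(100000)` from a small `main` and printing the result gives a per-hop average; pinning both processes to one CPU (for example with `taskset`) would bring it closer to a switch cost.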
<xentrac> the order-of-magnitude performance difference makes it tempting to keep a process running longer than would be ideal for security purposes. Lucet can start and stop a wasm "process" in more like 10 μs, so there's less incentive to reuse possibly-corrupted memory state across, for example, data received from multiple identities
<xentrac> (Linux with glibc is closer to 600 μs for fork+exit+wait!)
<xentrac> of course in the Lucet case, you're relying on the Lucet compiler as well as the CPU, and these days the CPU is already bad enough
<xentrac> (to provide isolation between the "processes")
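A minimal way to measure fork+exit+wait cost on a POSIX system, as a sketch rather than the method behind xentrac's figures. The function name is made up for illustration. The child calls `_exit` immediately, so no `atexit` handlers or stdio flushing are included in the timing.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Average wall-clock microseconds for one fork + _exit + waitpid cycle.
   Returns a negative value if fork or waitpid fails. */
double fork_exit_wait_us(int iterations)
{
    assert(iterations > 0);
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1.0;
        if (pid == 0)
            _exit(0);               /* child: leave immediately */
        int status;
        if (waitpid(pid, &status, 0) != pid)
            return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    double ns = (end.tv_sec - start.tv_sec) * 1e9 + (end.tv_nsec - start.tv_nsec);
    return ns / 1e3 / iterations;
}
```

The result depends heavily on the size of the parent's address space and on how the binary is linked, so a number from a tiny test program is a lower bound for a real application.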
<xentrac> TwoNotes: what's your favorite language these days?
ovh is now known as riff-IRC
<TwoNotes> I like the odd ones
<TwoNotes> Erlang, for example
<TwoNotes> Forth is cool.
<TwoNotes> Bliss was in a battle against Pascal to be THE system programming language. They chose Bliss at DEC
<TwoNotes> Then in the end it was C that won out
<TwoNotes> But programming in RISC-V AS is really fun. Reminds me a lot of IBM 360 BAL
<TwoNotes> I spent way too long programming in Java to ever want to look at it again
<TwoNotes> Another one I have dabbled in is Common Lisp
<xentrac> C is kind of like a cross between BLISS and Pascal
* sorear , knowing pascal and C, tries to extrapolate
davidlt has quit [Ping timeout: 245 seconds]
<TwoNotes> The Bliss compiler had a VERY powerful macro package. Most people could not make use of it because it was quite complicated. I was quite good at it because I maintained the part of the compiler that implemented it.
<sorear> C superior due to its support for lowercase letters? /s
<sorear> TwoNotes: isn't that how it always goes
<TwoNotes> Just about all of the compilers for the VAX were written in Bliss
rvalles has quit [Read error: Connection reset by peer]
rvalles has joined #riscv
TwoNotes has quit [Quit: Leaving]
<sorear> ~rust~ bliss compiler written in bliss and making "excessive" use of advanced language features? the more things change...
<xentrac> heh
<xentrac> https://compilers.iecc.com/comparch/article/87-08-003 talks a little about what I mean by C being a cross between BLISS and Pascal
rvalles has quit [Read error: Connection reset by peer]
rvalles has joined #riscv
smartin has joined #riscv
zjason` has joined #riscv
zjason has quit [Read error: Connection reset by peer]
davidlt has joined #riscv
frost has joined #riscv
cmuellner has quit [Ping timeout: 245 seconds]
Sos has joined #riscv
adjtm has quit [Ping timeout: 272 seconds]
valentin has joined #riscv
hendursaga has quit [Ping timeout: 252 seconds]
hendursaga has joined #riscv
davidlt has quit [Ping timeout: 272 seconds]
cmuellner has joined #riscv
TMM_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM_ has joined #riscv
choozy has joined #riscv
usama has joined #riscv
choozy has quit [Remote host closed the connection]
TwoNotes has joined #riscv
zjason` is now known as zjason
choozy has joined #riscv
choozy has quit [Ping timeout: 272 seconds]
tgamblin has quit [Remote host closed the connection]
tgamblin has joined #riscv
<hendursaga> TwoNotes: Common Lisp you say? Have you heard of the Nyxt browser? You might like it...
wingsorc has joined #riscv
Andre_H has joined #riscv
davidlt has joined #riscv
<TwoNotes> A browser written in Lisp?
choozy has joined #riscv
jotweh has quit [Ping timeout: 268 seconds]
<leah2> can i see the cpu frequency from freedom-sdk userland?
<leah2> (on an Unmatched)
<enthusi> with freedom-sdk you mean not as described in the SiFive forum thread?
<leah2> i mean some official image
<leah2> cpupower doesn't seem to support it, and the dmesg doesn't show it i think
jotweh has joined #riscv
frost has quit [Quit: Connection closed]
choozy has quit [Ping timeout: 245 seconds]
Andre_H has quit [Ping timeout: 272 seconds]
<jimwilson> leah2, cpupower frequency driver not ported yet, you can get an estimate by using "perf stat /bin/ls", for exact value you can read clock config with devmem2 and decode it as I mentioned in the forums
<leah2> cute trick, thanks
usama has quit [Quit: Leaving.]
usama has joined #riscv
<hendursaga> TwoNotes: Well, the renderer isn't in CL, but the rest is pretty much all CL. Really crazy stuff.
iorem has quit [Quit: Connection closed]
Andre_H has joined #riscv
<TwoNotes> From the appropriate PLL CSR data you can figure out the clock frequency provided you know what the installed XTAL is.
cwebber has joined #riscv
TwoNotes has quit [Remote host closed the connection]
TwoNotes has joined #riscv
wingsorc has quit [Quit: Leaving]
FluffyMask has joined #riscv
vagrantc has joined #riscv
psydroid has quit [Quit: node-irc says goodbye]
llamp[m] has quit [Quit: node-irc says goodbye]
demostanis[m] has quit [Quit: node-irc says goodbye]
khem has quit [Quit: node-irc says goodbye]
ahs3[m] has quit [Quit: node-irc says goodbye]
llamp[m] has joined #riscv
demostanis[m] has joined #riscv
psydroid has joined #riscv
ahs3[m] has joined #riscv
khem has joined #riscv
choozy has joined #riscv
TMM_ has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM_ has joined #riscv
usama has quit [Quit: Leaving.]
riff_IRC has joined #riscv
riff-IRC has quit [Ping timeout: 252 seconds]
riff_IRC has quit [Remote host closed the connection]
riff_IRC has joined #riscv
usama has joined #riscv
mahmutov has joined #riscv
ats has quit [Ping timeout: 272 seconds]
ats has joined #riscv
usama has quit [Ping timeout: 264 seconds]
jeancf has joined #riscv
jeancf has quit [Ping timeout: 272 seconds]
<xentrac> have a somber Tiananmen Square Day
davidlt has quit [Ping timeout: 264 seconds]
<riff_IRC> ^ lol
smartin has quit [Quit: smartin]
mahmutov has quit [Read error: Connection reset by peer]
mahmutov has joined #riscv
valentin has quit [Quit: Leaving]
mhorne has quit [Ping timeout: 245 seconds]
mhorne has joined #riscv
choozy has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
ahs3 has quit [Ping timeout: 252 seconds]
ahs3|afk has joined #riscv
SwitchToFreenode has quit [Remote host closed the connection]
KREYREEN has joined #riscv
Sos has quit [Quit: Leaving]
ahs3|afk has quit [Ping timeout: 268 seconds]
TwoNotes has quit [Quit: Leaving]
ahs3|afk has joined #riscv
ahs3|afk has quit [Ping timeout: 245 seconds]
ats_ has joined #riscv
ats has quit [Ping timeout: 268 seconds]
Andre_H has quit [Ping timeout: 265 seconds]
elastic_dog has quit [Quit: elastic_dog]
elastic_dog has joined #riscv
jellydonut has quit [Quit: jellydonut]