scoobydoo has quit [Read error: Connection timed out]
scoobydoo has joined #osdev
[itchyjunk] has quit [Ping timeout: 268 seconds]
wootehfoot has joined #osdev
[itchyjunk] has joined #osdev
<heat>
geist, what kind of crap are you pulling in zircon for faster TLB shootdowns?
wootehfoot has quit [Ping timeout: 256 seconds]
<heat>
I'm collecting A bits when deciding if I want to shootdown and in the worst case (let's imagine AAAAA-Clean-AAAAA) you end up doing two shootdowns to save 100ns off an invlpg
<heat>
it's... not ideal
<heat>
or a pathological case (A-Clean-A-Clean-A-Clean)
<heat>
also super unclear to me why linux's tlb single page cutoff is at 33. like, you'll end up flushing the whole tlb instead, is 33 really the golden number where you don't gain much time by invlpg'ing directly versus thrashing the tlb
<zid>
It looked like a fairly adhoc heuristic tbh
<zid>
and I think the x86 version is fudged, mainly things calling the _mm() version that just makes a fake huge range and flushes the entire tlb
<dh`>
magic numbers like that are generated by whoever wrote the code running one workload with ad hoc data collection, not actually collecting enough data to be statistically significant, and then fudging based on what they see
<dh`>
at least normally
<geist>
heat: nothing particularly fancy aside from the usual stuff
<geist>
on x86 we gather N pages and then do a cross cpu shootdown to only cpus that have the aspace mapped
<geist>
past some point it just does a global
<heat>
oh wow you're also using 33 as the cutoff
<heat>
I love it
<zid>
do 37 just to be different heat
<heat>
but then it's not optimal and I can't do that
<zid>
it's prime so probably works way better anyway. That's how primes work right?
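A minimal sketch of the batch-then-give-up heuristic described above, assuming x86 and hypothetical helper names (shootdown_range, flush_all_tlb); the 33-page cutoff is just the value mentioned in the discussion, not a measured optimum:

    #include <cstddef>
    #include <cstdint>

    constexpr size_t FLUSH_ALL_CUTOFF = 33;   // the cutoff discussed above

    // Invalidate a single page translation.
    inline void invlpg(uintptr_t va) {
        asm volatile("invlpg (%0)" :: "r"(va) : "memory");
    }

    // Drop all non-global TLB entries by reloading CR3.
    inline void flush_all_tlb() {
        uintptr_t cr3;
        asm volatile("mov %%cr3, %0" : "=r"(cr3));
        asm volatile("mov %0, %%cr3" :: "r"(cr3) : "memory");
    }

    // Run on every CPU that has the address space mapped (e.g. from the
    // shootdown IPI handler).
    void shootdown_range(uintptr_t va, size_t pages, size_t page_size = 4096) {
        if (pages > FLUSH_ALL_CUTOFF) {
            flush_all_tlb();             // past the cutoff a full flush is assumed cheaper
            return;
        }
        for (size_t i = 0; i < pages; i++)
            invlpg(va + i * page_size);
    }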
computerfido has joined #osdev
<dh`>
OS/161 semi-uses 16 by default, but I don't provide the coalesce code and I'm pretty sure students mostly don't write it
<dh`>
also because OS/161 doesn't have threaded processes it doesn't normally come up much
<geist>
i'm guessing 33 is because it's one more than 32
<geist>
as in 32 is a common thing, so dont skip that, but effectively it's > 32
<geist>
or maybe it causes the wrapping data structure to be a nice size
vdamewood has joined #osdev
[itchyjunk] has quit [Remote host closed the connection]
saltd has quit [Ping timeout: 268 seconds]
saltd has joined #osdev
saltd has quit [Remote host closed the connection]
gxt has quit [Ping timeout: 268 seconds]
gxt has joined #osdev
saltd has joined #osdev
freakazoid333 has quit [Ping timeout: 244 seconds]
saltd has quit [Remote host closed the connection]
saltd has joined #osdev
gmacd_ has joined #osdev
wgrant has quit [Quit: WeeChat 2.8]
gmacd_ has quit [Quit: WeeChat 3.6]
<mrvn>
geist: I have to totally disagree with you on the invlpg for recursive page tables. Not because you don't have to invlpg but because if you don't have recursive page tables and no phys map you have to invlpg the page tables already. The recursive part saves you from having to allocate extra page table for the higher levels but changes nothing else. So it's a phys map vs. not phys map thing.
wgrant has joined #osdev
<mrvn>
geist: similar having per core page tables or shared page tables is independent of recursive mapping.
<mrvn>
Being able to manipulate page tables without having to map and unmap/invlpg them is a big simplification you get from phys map.
<mrvn>
Drawback is that any exploit to read kernel memory means all physical ram can be read.
<mrvn>
A feast for all the speculative execution exploits.
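To make the phys-map vs. recursive-mapping trade-off concrete, a hypothetical x86-64 sketch of the two ways to reach a PTE; PHYS_MAP_BASE and the recursive PML4 slot are illustrative assumptions, not anyone's actual layout:

    #include <cstdint>

    constexpr uintptr_t PHYS_MAP_BASE  = 0xffff'8000'0000'0000;  // assumed direct map base
    constexpr uintptr_t RECURSIVE_SLOT = 510;                    // assumed recursive PML4 entry

    // Phys map: any page table is visible through the direct map, so walking
    // and editing tables never requires a temporary mapping or an invlpg of
    // the table itself.
    uint64_t* pte_via_physmap(uint64_t table_phys, unsigned index) {
        auto* table = reinterpret_cast<uint64_t*>(PHYS_MAP_BASE + table_phys);
        return &table[index];
    }

    // Recursive mapping: the PTE covering va sits at a fixed, computable
    // virtual address inside the recursive window.
    uint64_t* pte_via_recursive(uintptr_t va) {
        uintptr_t pte_va = (RECURSIVE_SLOT << 39) | ((va >> 9) & 0x0000'007f'ffff'fff8);
        if (pte_va & (1ull << 47))
            pte_va |= 0xffff'0000'0000'0000;   // canonical sign extension
        return reinterpret_cast<uint64_t*>(pte_va);
    }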
the_lanetly_052 has joined #osdev
<geist>
and it only works on x86
<geist>
and thus you have to figure it out again for other arches
<geist>
or at least i dont think it's been proven to work on other page table arches fully. riscv i think might... depends on how the table -> page encoding is done, which iirc on riscv is done via V=1 and RWX=0
<mrvn>
true. It's really nice though not having to deal with allocating, mapping and initializing page tables when trying to map a page. no recursion in the alloc/map path.
<geist>
i think that probably blows it up, because an inner table mapping would be interpreted differently
<geist>
arm64 might get screwed up with things like contig mappings and whatnot being interpreted differently at different levels
<geist>
but it might work if some of the more exotic features of the page tables are not used
<geist>
an interesting exercise if nothing else
<mrvn>
Hmm, not sure about that. You wouldn't recurse into those contig mappings looking for page tables.
<mrvn>
You might get some really strange blobs of memory mapped somewhere but nothing in the kernel would access that range.
<geist>
yah, would have to think about it a bit. i think the bigger problem is the upper page table attributes being fairly differently interpreted on PTEs vs inner nodes
<geist>
i thought i remember looking at it once and it was pretty much going to be a no-go, at least with some of the newer features enabled other than the basic v8.0 stuff
<mrvn>
An inner node has to also function as a leaf but a leaf doesn't have to function as an inner node as you never recurse into that.
<geist>
yeah i think the trouble is the inner nodes have a different set of attribute bits in the upper attributes that mean something incompatible
<mrvn>
no idea if arm64 fits that requirement.
<geist>
(on arm64)
<mrvn>
You also have to consider that you might not use all the combinations of bits in your kernel. Maybe there is a subset of bits that is sufficient for your needs and works recursively.
<geist>
as it is i still dont completely understand how bit 7, which is the part that terminates a higher level entry as a large page, doesn't get messed up when mapping in a recursive thing
<geist>
since it gets interpreted as a PAT bit
<geist>
(on x86)
<geist>
oh i see, on x86-64 (PSE level stuff) bit 7 is reserved for terminal entries i think
<geist>
oh wait, no that's only level 4 and 5
<mrvn>
geist: because if the table is a page then you never access that entry to find a page table. It's one level too deep.
<geist>
yeah the PAT stuff got moved around between x86-32 and -64
<geist>
at least according to some unverified picture on the internet
<geist>
on x86-32 bit 7 (the page size bit) gets recycled for PAT on the terminal level, according to the interwebs, but then that's not an issue because there are only 2 levels
<geist>
though i dunno that might cause some problems where it interprets the bit 7 of an inner entry (would be 0) as a PAT 0. would have to think of how that works. haven't thought about PAT in a while
<mrvn>
Manipulating 2 address spaces with recursive mapping becomes tricky to reason about.
<mrvn>
As said the leaf tables are never used to map page tables so it doesn't matter what's in them.
<mrvn>
PAT is only in leaves, right?
<geist>
yah but if a bit is used in leaf *differently* than in an inner node, then you have to consider how the same bits in an inner node are interpreted as a leaf
<mrvn>
Ahh yeah, but there you have it 0 iirc.
<geist>
in the case of x86-32 it looks like bit 7 is reused. but in x86-64 or -32 3 level paging it seems that PAT is bit 12? or something like that
<geist>
but then probably unused in inner stuff. so you'd have to arrange for PAT == 0 to be 'okay'
<mrvn>
Been a decade since I used it on x86/x86_64. I only remember that it worked just fine.
* geist
nods
<mrvn>
and doesn't on ARM.
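For reference, the x86 entry bits being argued about, as commonly documented (worth double-checking against the SDM): bit 7 is the page-size bit in non-terminal entries but is reused as PAT in a terminal 4K PTE, and PAT moves to bit 12 in large-page entries.

    #include <cstdint>

    constexpr uint64_t X86_PTE_PRESENT   = 1ull << 0;
    constexpr uint64_t X86_PTE_PS        = 1ull << 7;   // "page size" in a PDE/PDPTE: terminates the walk
    constexpr uint64_t X86_PTE_PAT_4K    = 1ull << 7;   // same bit reused as PAT in a terminal 4K PTE
    constexpr uint64_t X86_PTE_PAT_LARGE = 1ull << 12;  // PAT for 2M/1G entries (where bit 7 is PS)

    // In a recursive mapping an inner table entry (PS=0) is also interpreted
    // as a 4K PTE with PAT=0, which only works out if PAT index 0 is left as
    // a normal write-back type -- the "arrange for PAT == 0 to be okay" point.
    inline bool is_large_page(uint64_t entry, int level) {
        return level > 1 && (entry & X86_PTE_PS);
    }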
wootehfoot has joined #osdev
<geist>
i guess i should really sit down and formally map it out on arm32, arm64, and riscv. just to know
<geist>
maybe this weekend sometime if i'm feeling especially masochistic
<clever>
map out which parts?
<mrvn>
What's the smallest you can make a page table on ARM64? On ARM I have 256 byte for the first level and 1k for the second level as a minimum. But I think ARM64 you need a few pages.
nyah has joined #osdev
<geist>
whether or not the recursive page table mapping things works on those architectures
<geist>
mrvn: yeah i think the multiple page size on arm32 outright blows it up
<geist>
that the top level table is 16k and the lower levels are 1k. :(
<geist>
mrvn: arm64 is much closer to x86 in that all page tables at all levels are the same size, and it's the base page granule
<mrvn>
hmm, that just means you need 16 slots to map
<bslsk05>
github.com: rpi-open-firmware/mmu.c at master · librerpi/rpi-open-firmware · GitHub
<geist>
which may be 4k, 16k, or 64k. but setting the base page granule scales everything
<mrvn>
geist: that sucks (for me). That would mean I need 5 pages for a process.
<geist>
hmm? what do you mean?
<mrvn>
4 page tables and one for the struct Thread + heap.
<geist>
well, unless you're willing to make the process smaller
<geist>
if you restrict the size of the address space it will 'turn off' levels
<geist>
the first turn off is at 39 bits, then at 30
<geist>
that gets you 3 and 2 levels, respectively
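A small sketch of why shrinking the address space drops levels: with a 4K granule each level resolves 9 VA bits on top of the 12-bit page offset, so 48/39/30-bit spaces need 4/3/2 levels.

    #include <cstdint>

    constexpr unsigned PAGE_SHIFT     = 12;  // 4K granule
    constexpr unsigned BITS_PER_LEVEL = 9;   // 512 eight-byte entries per 4K table

    constexpr unsigned levels_for(unsigned va_bits) {
        return (va_bits - PAGE_SHIFT + BITS_PER_LEVEL - 1) / BITS_PER_LEVEL;
    }

    static_assert(levels_for(48) == 4);  // TxSZ = 16
    static_assert(levels_for(39) == 3);  // TxSZ = 25: first "turn off"
    static_assert(levels_for(30) == 2);  // TxSZ = 34: second "turn off"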
<mrvn>
Each process only gets <4k of memory so that is viable.
wootehfoot has quit [Ping timeout: 252 seconds]
<geist>
yeah, and you can do that dynamically per process, so it's pretty much just the tracking data structure
<geist>
oh... now *that* might be interesting in a recursive situation, since it changes the number of levels that you're then mapping into the kernel's say 4 level system
<geist>
but probably not a big thing, if it works
<mrvn>
Let me start over with what I have on ARM32. I have micro processes that have one 4K page. That page contains the minimum level 1 table, one level 2 table, the process struct (prev, next, id, registers, ...) and what remains is heap+stack.
* geist
nods
<clever>
:O
<mrvn>
With that I start millions of processes, each one renders one pixel on a framebuffer.
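A hypothetical C++ rendering of that single-page layout; field names and exact sizes are illustrative, with the 256-byte L1 and 1K L2 tables being the ARM32 minimums mentioned earlier:

    #include <cstdint>

    struct Thread {                      // illustrative bookkeeping only
        Thread*  prev;
        Thread*  next;
        uint32_t id;
        uint32_t regs[16];
    };

    struct alignas(4096) MicroProcessPage {
        uint32_t l1_table[64];           //  256 bytes: minimum first-level table
        uint32_t l2_table[256];          // 1024 bytes: one second-level table
        Thread   thread;                 // prev, next, id, registers, ...
        uint8_t  heap_and_stack[4096 - 256 - 1024 - sizeof(Thread)];
    };

    static_assert(sizeof(MicroProcessPage) == 4096);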
<clever>
i'll need to re-read the specs, because the first level is 4096 slots of 32bits each....
<geist>
sure, on arm64 you could get the same thing, if you just set the address space of that process as being 0-1GB (30 bits)
<geist>
oh oh i see. yeah no you'd need 3 pages
<geist>
2 levels of page tables with full 4K pages and then the 3rd page has your code/data/etc
<mrvn>
Maybe have a 64bit hypervisor and then 32bit kernel with the old page table format?
<mrvn>
Start a million kernels so to speak.
<clever>
ok, so TTBCR, N bits 2:0, a value from 0 to 7, controls how big of a virtual range TTBR0 covers, uses bits 31:(14-N) from the virtual addr i think?
<geist>
i think it's more bits than that
<clever>
> Indicate the width of the base address held in TTBR0. In TTBR0, the base address field is bits[31:14-N]. The value of N also determines:
<clever>
> Whether TTBR0 or TTBR1 is used as the base address for translation table walks.
<clever>
> The size of the translation table pointed to by TTBR0.
<geist>
ah yes, that's a different thing
<clever>
ah, so i'm in the wrong part of the docs
<geist>
it lets you specify up to 8 contiguous pages as your top level page table
<geist>
a little used feature
<geist>
TCR_EL1 is where you get the address space size stuff
* clever
switches to aarch64 docs
<j`ey>
(TxSZ in TCR_EL1)
<geist>
yah. anyway, going to bed. ttyl!
<mrvn>
Can I do 8 64k pages as top level?
<clever>
j`ey: yep, i see those, but this also reminds me, aarch64 with things like 48bit virtual addressing, will have a lower chunk, an upper chunk, and a giant chasm of unmappable addresses
<clever>
but arm32, can map every single byte of the addr space
<clever>
so they function in very different ways
<mrvn>
clever: yeah. But you can create a chasm in arm32.
<clever>
what i know of arm32, is that you just set what bit the cut-off is at, in the addr?
<mrvn>
The top page table can be anywhere from 256 byte to 16k in powers of 2.
<clever>
linux exposes 1g/3g and 2g/2g, but i assume other splits are possible
<mrvn>
Setting one to 16k covers all. Both at 8k covers all. Anything smaller leaves a chasm.
<geist>
clever: the split works differently in arm32. in arm64 they're intrinsically separated by a gap, in arm32 there is a moving line where you set where the cpu starts fetching from one TTBR or the other one
<mrvn>
you can't do 1/3 on ARM
<geist>
alas it works badly, yeah
<clever>
geist: yeah, that fits with what i thought as well
<mrvn>
or maybe you can if you waste 4k in the bigger page table
<dh`>
I thought the middle was fixed at 0x80000000
<clever>
there is no gap on arm32, and you must manually create a "gap" by just tagging pages as unmapped in the paging tables
foudfou_ has joined #osdev
<geist>
dh`: no, it's configurable but in a dumb way
<clever>
if you want one
<dh`>
(on arm32)
<dh`>
I guess I missed the critical bits
* dh`
really hates arm docs
<mrvn>
clever: no, you can reduce the size of the top page table making the gap implicitly unmapped.
foudfou has quit [Ping timeout: 268 seconds]
<geist>
you specify the power of 2 split
<clever>
mrvn: and the lower table doesnt expand to fill the gap?
<geist>
dh`: iirc you can set it like 512/3.5 1/3 2/2 3/1 3.5/.5, etc
<mrvn>
clever: you set the size not to fill the gap
<geist>
or the default where it is 4/0 and thus the feature is disabled and there's a single page table structure
<mrvn>
clever: e.g. 4k size for TTB0 will handle 0-1G
<clever>
geist: strange, my basic identity paging setup, is only using TTBR0, so the higher half is 0 length, and the lower is the full 4gig?
<geist>
exactly, that's where it gets dumb, because the top level is 16k; when you divide it you start reducing the top level size
<geist>
clever: yeah, that's the default, because the split mechanism came along in i believe later armv6, so it was a backwards compatible feature
<geist>
ie, if you didn't enable the split, then TTB0 was simply TTB (previously before there was a 0 and 1)
<clever>
ah, so until i set a split, it acts like TTBR1 doesnt exist
<geist>
yep
<geist>
and then it works just like x86/riscv/etc. so it was an opt in feature
foudfou_ has quit [Remote host closed the connection]
foudfou has joined #osdev
<clever>
re-reading the TTBR0 description, the N from TTBCR, sets how many bits of address are available for TTBR0, and the alignment requirements (due to lower order bits being absent)
<geist>
yeah
<clever>
at the default of N=0, then the address is in bits 31:14, which is where the 16kb alignment comes from
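A sketch of that relationship, assuming the short-descriptor format: TTBCR.N both shrinks the TTBR0 table (and its alignment requirement) and sets the boundary above which walks use TTBR1.

    #include <cstddef>
    #include <cstdint>

    struct Ttbr0Split {
        size_t   table_bytes;   // size and required alignment of the TTBR0 table
        uint32_t ttbr1_start;   // first VA translated via TTBR1 (0 = TTBR1 unused)
    };

    constexpr Ttbr0Split ttbcr_n(unsigned n) {           // n = TTBCR.N, 0..7
        return {
            size_t(16u * 1024u) >> n,                    // 16 KB at N=0, halving each step
            n == 0 ? 0u : (1u << (32 - n)),              // N=0: TTBR0 covers the whole 4 GB
        };
    }

    static_assert(ttbcr_n(0).table_bytes == 16 * 1024);  // default 16 KB table, no split
    static_assert(ttbcr_n(2).ttbr1_start == 0x40000000); // 4 KB table handling 0-1G, as above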
<geist>
anyway this was one of the things that was greatly cleaned up in arm64 (or really the paging extensions in later armv7+hyp)
<clever>
if i wanted 8k alignment, i would need to use N=1
<clever>
but i dont think this has anything to do with the high/low split
<geist>
it does, but i'm too tired
<clever>
yeah, go to bed!
<geist>
i think mrvn gets it though
<clever>
i'll just read more docs
<geist>
it's annoying, and one of these old warts they cleaned up a bit (and added like 10x the complexity on top of it) in arm64
xenos1984 has quit [Read error: Connection reset by peer]
<bslsk05>
developer.arm.com: Documentation – Arm Developer
<clever>
yeah, it's a mix of backwards compat and a far simpler split rather than 2 sizes in a giant chasm
<geist>
only arch i really know of that enables the full 64bit is itanium. maaaaybe some of the later POWERs do
<geist>
but itanium was like oh you think arm64 is complex? haha
<mrvn>
clever: There is TTBCR.T0SZ and TTBCR.T1SZ but that might just be for the long mode format.
<geist>
'long mode' format in the later arm32 stuff just got extended to arm64. so really that was where they developed it
<geist>
it's almost precisely the same pattern as x86-32 PAE 3 level paging -> x86-64 4 level paging being an extension of PAE
<clever>
mrvn: ah thats what i got wrong
<clever>
mrvn: i only read the first half of TTBCR, for the short descriptor format!
<geist>
two different modes so you have to be careful. a bunch of the features change when you enable long mode
<clever>
yeah, that explains my confusion
<mrvn>
So at least in some page table formats you can shrink both TTBR0 and TTBR1 to leave a chasm in the middle.
foudfou has quit [Remote host closed the connection]
foudfou has joined #osdev
<clever>
T1SZ and T0SZ are each 3bits...
<clever>
and despite being a 32bit core, the TTBR0 and TTBR1 are 64bit registers, in long-descriptor mode
<clever>
so the paging tables can live above 4gig
<clever>
though, you would need an initial paging table <4gig, to even map >4gig and begin writing that high
<mrvn>
bootstrapping sucks
<mrvn>
would there be any benefit to have the page tables in high memory? Protect it from 32bit DMA?
<Mutabah>
Put them there because they can?
<clever>
so your allocator doesnt cry when you run out of free pages <4gig
<Mutabah>
Thus saving low memory for 32-bit devices
<clever>
and suddenly, you have free ram, but cant create paging tables
<mrvn>
being able to put page tables above 4G is nice. But is there a reason why you would force that?
<clever>
so i would just start out with one arena for all <4gig ram, create paging tables in lowmem, turn on the mmu, and then add highmem to the allocator
<mrvn>
(other than saving ram for 32bit DMA)
<clever>
and now paging tables can just live anywhere the allocator wants
<clever>
mrvn: i wouldnt force it, but just allow it
<clever>
so running out of lowmem doesnt cripple the pagetable allocator
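Roughly what that bootstrap order looks like, with hypothetical pmm_*/mmu_* names: seed the allocator with RAM below 4 GB, build the initial tables from that, enable the MMU, and only then add the high ranges so later page tables can live anywhere.

    #include <cstdint>

    struct PhysRange { uint64_t base, len; };

    void pmm_add_arena(PhysRange r);       // hypothetical physical allocator API
    void mmu_build_initial_tables();       // allocates its tables from the pmm
    void mmu_enable();

    void bootstrap_memory(const PhysRange* ranges, int count) {
        constexpr uint64_t FOUR_GB = 1ull << 32;
        for (int i = 0; i < count; i++)    // pass 1: low memory only (ranges assumed not to straddle 4 GB)
            if (ranges[i].base < FOUR_GB)
                pmm_add_arena(ranges[i]);
        mmu_build_initial_tables();        // everything allocated so far is addressable pre-MMU
        mmu_enable();
        for (int i = 0; i < count; i++)    // pass 2: hand over everything above 4 GB
            if (ranges[i].base >= FOUR_GB)
                pmm_add_arena(ranges[i]);
    }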
<mrvn>
I'm more and more trending to having a page table in boot.S with an ELF loader that maps the kernel and jumps to it. The kernel then allocates its own page tables and frees boot.S
<clever>
that can work
<clever>
something ive seen somebody asking about on the osdev discord earlier, is what to do with the init thread
<clever>
more specifically, its stack
<mrvn>
what init thread?
<clever>
the first stack, that boot.S pulled from .bss and used when entering C
<clever>
which isnt on the heap
<mrvn>
You can free it
<clever>
in LK, that first stack becomes the stack for the idle thread
smach has quit [Ping timeout: 268 seconds]
<clever>
so you would need to either change the SP, or terminate that bootstrap thread, and spawn a new one for idle
<mrvn>
indeed
<clever>
and yeah, you then have the issue of trying to organize all of the .bss.init into one block that you can free
<mrvn>
Note that you have to create new idle threads for the other cores.
<clever>
thats something ive seen signs of in linux
<clever>
yeah
<clever>
so core0 is running from a .bss stack, while all others are running from a heap stack
<mrvn>
no reason the initial stack needs to be in .bss. Just have to be NOLOAD
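One hedged way to do that: put the boot stack (and any other one-shot data) in a dedicated section that the linker script brackets with symbols, then return that range to the page allocator once nothing runs on it any more. Section and symbol names here are assumptions, not LK's or Linux's actual layout.

    #include <cstddef>
    #include <cstdint>

    #define BOOT_INIT __attribute__((section(".bss.init"), aligned(16), used))

    BOOT_INIT static uint8_t boot_stack[16 * 1024];   // what boot.S points SP at

    extern "C" uint8_t __init_start[], __init_end[];  // assumed linker-script symbols

    void pmm_free_range(uintptr_t start, size_t len); // hypothetical allocator hook

    void free_boot_init() {
        // Only safe once no CPU is still executing on boot_stack (e.g. after
        // the idle thread has been switched to a heap-allocated stack).
        pmm_free_range(reinterpret_cast<uintptr_t>(__init_start),
                       static_cast<size_t>(__init_end - __init_start));
    }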
<clever>
and those sizes may differ
<clever>
i initially thought you could make the idle thread stack super tiny
<clever>
but its also the interrupt stack
<clever>
if you get interrupted while idle
<mrvn>
only if idle runs in kernel mode
<clever>
i'm mostly thinking in terms of LK, where user mode is optional
<mrvn>
On ARM you have the IRQ/FIQ stack
<clever>
yeah, but i believe LK switches back to SVC mode as quickly as possible
<clever>
so it behaves more like other platforms
<mrvn>
also means you don't need a per process IRQ/FIQ stack.
<clever>
but you do have to kill the redzone
<geist>
(note that is also gone in arm64)
<mrvn>
redzone is stupid anyway
<geist>
(all the separate stacks)
<clever>
mrvn: i can kinda see redzone as being useful in leaf functions, why update the sp/framepointer when you're not going to call anything else?
<mrvn>
clever: only leaf function can use the redzone. A function call overwrites it.
<clever>
as long as interrupts/signals dont steal the stack from under you, nobody is going to know
<clever>
a non-leaf function could still get ugly, and not update the stack until it makes a conditional call
xenos1984 has joined #osdev
<mrvn>
clever: but think what that means: The function has all the caller saved registers to work with but it's so complex that it needs more registers. So it spills them to the redzone. Think about how complex that function has to be. How is a single "add sp, sp, #size" going to make that function slower?
<clever>
yeah
<mrvn>
Actually scratch spilling registers in that argument. That would use push/pop. You have some local variables like "char buf[32];" that you put in the red zone.
<mrvn>
Something you access via pointer so it has to be in memory.
<clever>
and you just want to write directly to it, without having to bump and then un-bump sp
<mrvn>
I just can't think of good examples of a function that is 1) complex enough the compiler needs to use the stack, 2) small enough that bump/un-bump of the SP matters.
<mrvn>
3) the compiler can't push/pop registers and make the function work with only registers
<mrvn>
It kind of has to be something that uses an array and pointer to the array so the compiler can't optimize away the memory access.
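An example of the shape being described: a leaf function with a small array that has to live in memory. On the x86-64 SysV ABI the compiler may place buf below rsp in the 128-byte red zone and never adjust rsp; whether it actually does depends on the optimizer, and kernels build with -mno-red-zone precisely because interrupts reuse the same stack.

    #include <cstddef>

    // Leaf function: no calls, so nothing legitimately writes below rsp
    // except a signal/interrupt handler -- which is exactly what the red
    // zone rules forbid in userspace.
    int reversed_sum(const char* src, size_t n) {   // caller guarantees n <= 32
        char buf[32];                               // small enough for the 128-byte red zone
        for (size_t i = 0; i < n; i++)
            buf[i] = src[n - 1 - i];
        int sum = 0;
        for (size_t i = 0; i < n; i++)
            sum += buf[i];
        return sum;
    }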
<mrvn>
Is leaking memory (new without delete) UB in c++?
<moon-child>
I can't imagine so
<clever>
ok, lets see...., first i need to get this pi booting...
* kof123
tilts head. good question, what are the minimum requirements of space you can grab (or malloc() for c ) ...nonexistent/undefined?
<kof123>
i mean, is there any rule that you eventually run out?
<kof123>
not quite the same, but the trajectory..
<moon-child>
there are no guarantees
<clever>
and the scope on either gpio5 or 21
<kof123>
yeah :D so i would say assuming new or malloc succeeds is pushing your luck :D
<kof123>
before it gets to "can i leak" or not
<clever>
21 looks far simpler, so i need 21=ALT5
<moon-child>
kof123: what?
<moon-child>
no
<mrvn>
kof123: minimum is 0
<moon-child>
the question was: suppose we successfully allocated some memory. Given that we did that, is it UB to not free that memory?
<mrvn>
It's certainly unspecified what happens. But it would be nice if it were undefined.
<moon-child>
if you were unable to allocate the memory, then you have a false antecedent, and can say whatever you like, but no one will find it particularly interesting
<moon-child>
why?
<mrvn>
kof123: are you thinking of `auto p = new int; if (p) delete p;` where new/delete is unbalanced if the allocation fails?
<kof123>
no to both of you, just being pedantic.
<mrvn>
kof123: then what does `new` failing to allocate have anything to do with the question?
<mrvn>
moon-child: if the compiler detects a memory leak it could fail to compile the code.
<mrvn>
or at least that's where I'm heading
<moon-child>
maybe I'm doing a quick sketch; want to make sure the thing runs, and then figure out how to free stuff
<moon-child>
or maybe I have a massive legacy app that I know uafs, and I want to start by being conservative
<mrvn>
moon-child: sure way to end up with memory leaks because you forget to later fix the code
<moon-child>
either way, I would prefer a warning
<moon-child>
mrvn: the compiler is not going to be able to diagnose all memory leaks anyway
<mrvn>
moon-child: obviously
<moon-child>
per rice
<moon-child>
so 'compiler errors when it sees a memory leak' is not a _solution_ to memory leakage regardless
<kof123>
simply the predictable effects of a leak. leaks can cause a later request to fail. so it is quite obvious what will practically happen down the road with leaks -- new stops working, malloc() stops working. maybe im just not seeing any significance. what do you mean undefined? whats supposed to happen (or not)? what else can possibly happen, or practically?
<mrvn>
It's more than a memory leak though as the constructor is called but not the destructor. That can leak other resources too.
<moon-child>
ah, yes, c++
<mrvn>
kof123: in a non windows/unix system a leak can also not free the memory at program exit.
<kof123>
good point ^^^
<kof123>
ok, just was trying to get to the meat
<clever>
mrvn: unix can do that too, if you're creating temp files and not pre-deleting them
<mrvn>
it's unspecified whether a leak will free the memory on exit or not. But it probably isn't UB.
<moon-child>
mrvn: another problem is it's not obvious how to define a memory leak
<mrvn>
moon-child: a new without delete
<mrvn>
(in this case)
<mrvn>
clever: or an IPC resource
<moon-child>
is it a leak if an object is reachable but dead? If so, suppose object x has fields y and z, and x.y is dead but x.z is not; is that a leak?
<clever>
mrvn: oh yeah, ive had issues with wpa supplicant like that
<clever>
wpa supplicant (the daemon) tracks which unix sockets have connected in datagram mode
<clever>
and that tracking survives the client restarting
<mrvn>
moon-child: if it's reachable the code might always still delete it.
<clever>
and if a client connects twice from the same socket, it gets 2 copies of every message!
<mrvn>
moon-child: at exit it won't be reachable
<moon-child>
mrvn: if it's reachable but _dead_, definitionally, it will not
<moon-child>
mrvn: new without a delete, _ever_? So I could put all my allocations into one global chain, and make an atexit callback to free them, and that's ok by your definition?
<mrvn>
usually reachable is the definition of not dead
<moon-child>
no
<moon-child>
reachability is often used as a _proxy_ for liveness
<moon-child>
but strictly speaking an object is alive at some point in the program if it will ever actually be accessed again, and dead otherwise
<mrvn>
moon-child: if you have a static amount of allocations the compiler is free to do one "new" call at the start and one "delete" at exit.
wootehfoot has joined #osdev
<mrvn>
moon-child: it can put everything in .rodata, .data or .bss
<moon-child>
mrvn: my example was meant to cover the dynamic case
<moon-child>
(because reachability is trivial to determine, and liveness is undecidable in general)
<mrvn>
moon-child: Not sure if this answers your question but you could have a C++ runtime where new just calls sbrk() and delete never frees memory and at the end all the heap is freed. Wait, that's what c++ does on every unix (except for large allocations using mmap)
<moon-child>
'that's what c++ does on every unix' ...what??
<mrvn>
moon-child: delete just returns the memory to the libc allocator and doesn't free it to the kernel.
<moon-child>
... yes it does?
<kof123>
allocators all the way down...
<mrvn>
now even polymorphic allocators
<jjuran>
Memory leaks /can't/ be UB unless you define exactly what a memory leak is.
<mrvn>
jjuran: as said I'm talking about new without delete
<bslsk05>
en.wikipedia.org: Sewer alligator - Wikipedia
<kof123>
i thought that was a zidlist
<jjuran>
mrvn: Are you asking if calling new is UB if delete is never eventually called? How would anything know in advance that delete won't be called?
<mrvn>
jjuran: yes. And something can be UB even if it's not obvious or even possible to prove.
<mrvn>
A lot of UB exists because it's impossible for the compiler to consistently detect and reject such code.
<jjuran>
I'm not saying it can't be /proven/, but that it's fundamentally undecidable.
<mrvn>
Take for example this code: void foo() { auto p = new int; } Is the compiler allowed to eliminate the new?
<mrvn>
void bla() { auto p = new Bla{}; } Still allowed to eliminate it even if Bla{} has side effects?
GeDaMo has joined #osdev
<jjuran>
I'm not sure about eliding `new int`, but if it's allowed, it's not because of UB.
<jjuran>
Obviously a side-effectful constructor call can't be elided.
<mrvn>
jjuran: unless it's UB it's not allowed.
<moon-child>
oh, no we are on to this issue? The notion that the compiler should be permitted to wilfully misinterpret you because of 'ub' is insidious
<mrvn>
moon-child: agreed :)
<moon-child>
well, that's something at any rate!
<mrvn>
clang at least emits "ud2". gcc just does whatever.
<moon-child>
hmm? As I recall, gcc was the one that would put ud2 at the end of a function if you forgot to return from it
<mrvn>
huh? no, it just returns whatever happens to be in eax
<jjuran>
mrvn: The compiler has to produce an executable that /behaves/ as specified by your program. It doesn't have to follow your exact instructions.
<mrvn>
void foo() { auto p = new int(); delete p; } gets elided by the way.
<moon-child>
hmm, just tried, and neither of them do
<moon-child>
but I could have sworn I saw gcc putting ud2 in a related situation
<bslsk05>
godbolt.org: Compiler Explorer
<jjuran>
There's no guarantee that allocation will fail, so you can't rely on new throwing an exception. Since you don't do anything with p, nothing depends on the memory existing, nor by extension on calling new at all. Hence, it can be elided.
<jjuran>
The executable behaves "as if" new were called successfully without actually doing so.
<mrvn>
jjuran: new has side effects. The compiler is explicitly allowed to elide it despite that.
<jjuran>
What side effects? Calling constructors, or consuming memory?
<mrvn>
jjuran: could be overloaded to do whatever.
<jjuran>
The compiler knows if it's been overloaded.
<mrvn>
the compiler also knows it's allowed to elide it no matter what
<mrvn>
IF it can pair up the new and delete
<jjuran>
Interesting. Can you provide a citation?
<mrvn>
It's required to do that for constexpr. Otherwise you couldn't use something like a vector or string in a constexpr even if you free it before the end.
<bslsk05>
en.cppreference.com: new expression - cppreference.com
<mrvn>
> New-expressions are allowed to elide or combine allocations made through replaceable allocation functions. In case of elision, the storage may be provided by the compiler without making the call to an allocation function (this also permits optimizing out unused new-expression). In case of combining, the allocation made by a new-expression E1 may be extended to provide additional storage for another
<mrvn>
new-expression E2 if all of the following is true:
<jjuran>
Okay, sure. It observes memory being created, used, and freed in the same scope, so it can elide the actual allocation call.
<mrvn>
> During an evaluation of a constant expression, a call to an allocation function is always omitted. Only new-expressions that would otherwise result in a call to a replaceable global allocation function can be evaluated in constant expressions.
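A compilable illustration of both quoted rules (C++20 for the constexpr part): the paired new/delete below may be elided outright, and inside constant evaluation the vector's transient allocation must be omitted as long as nothing allocated escapes.

    #include <numeric>
    #include <vector>

    int no_observable_allocation() {
        int* p = new int(42);   // the optimizer is allowed to drop this new/delete pair...
        int v = *p;
        delete p;
        return v;               // ...and simply return 42
    }

    constexpr int sum_first(int n) {
        std::vector<int> v(n);                          // allocation during constant evaluation
        std::iota(v.begin(), v.end(), 1);
        return std::accumulate(v.begin(), v.end(), 0);  // freed before the constant expression ends
    }

    static_assert(sum_first(4) == 10);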
<bslsk05>
de.wikipedia.org: CL (Programmiersprache) – Wikipedia
<j`ey>
common lisp?
<moon-child>
common lisp
<mrvn>
that was my first thought too but then google found IBM i CL
<mrvn>
In c++ I find it stupid that known values aren't constants, not even during constant evaluation. Like when you write "int x = 3;". That's a "3". the compiler can clearly see that.
<mrvn>
constexpr should be implicit imho.
wootehfoot has quit [Ping timeout: 256 seconds]
<mrvn>
.oO(Who came up with an aspect ratio of 1280x536 at Netflix?)
wootehfoot has joined #osdev
wootehfoot has quit [Ping timeout: 256 seconds]
<kazinsal>
That's 2.39:1
<kazinsal>
So, standard widescreen cinema at 1280w
<kazinsal>
Less a Netflix problem and more a "why are you streaming cinema at such a low resolution" problem
<mrvn>
A "why film cinema aspect for streaming services?" problem
<kazinsal>
Presumably, for cinema
<kazinsal>
Though I've seen 2.35:1 used in prestige television for dramatic/storytelling purposes
<kazinsal>
Season 4 of The Expanse hard cuts from 16:9 to 2.35:1 for scenes on Ilus
<clever>
ive seen some shows using 4:3 for flashback scenes
<clever>
just because other shows were forced to, by reusing old footage
<clever>
and it kind of became a theme for flashbacks to be 4:3
<mrvn>
Just like flashbacks used to be B&W. Because they are from before we invented color.
<kof123>
yeah, the answer of many mysteries is many times "but mommmmmmm...he did it first!"
<kazinsal>
Still haven't watched the final season of The Expanse... something about the whole six-episode prestige TV thing makes me think it's going to be horribly paced and unwatchable
<mrvn>
no spoilers
<kazinsal>
"prestige" television needs to stop doing this 8-10 episodes per season crap
<kazinsal>
You can't pace a whole arc in 10 episodes properly
<clever>
i had also seen a spin-off movie, where the main character is diving into an alternate reality version of themself
<mrvn>
and you can barely fill an afternoon binge watching it
<clever>
and then taking over themself, in a scene from the original 4:3 show
<kazinsal>
16 is the perfect number imo, 13 episodes of pure plot with 3 episodes of breather stories with light arc stuff in the background
<clever>
and they switch to 4:3 for those scenes
<kazinsal>
Nobody wants to fund 26 episode seasons anymore, especially not HBO etc
<mrvn>
kazinsal: 1h or 45m episodes?
<saltd>
1km
<kazinsal>
That's another problem
<kazinsal>
The ridiculous variability of episode length
<mrvn>
kazinsal: they don't have to stick to 42m show, 18m commercials anymore.
<kazinsal>
Bosch on Amazon Video managed to mostly get 10-episode seasons of 44 minute episodes working right because they were adapting at most a single 250-page mystery novel per season
<kazinsal>
But then on the other extreme you get weird shit like Stranger Things S4 where you have like, 10 episodes of 60 minutes and then a followup "volume 2" of two episodes with a combined total of a Lord of the Rings film
<mrvn>
covid has screwed with shows a lot too
<clever>
mrvn: i think some scenes from `everything everywhere all at once` had been filmed over a zoom call, lol
<kazinsal>
I still need to watch that
<kazinsal>
I've heard it's really good
<clever>
kazinsal: its amazing
<kazinsal>
I've watched a few things with Michelle Yeoh and she's a great actress
<mrvn>
didn't really blow my mind
<kazinsal>
For that I can probably just sit my ass down and watch it
<kazinsal>
My biggest problem is I start watching some kind of series and it triggers the "this reminds me of X" bit of my brain
<kazinsal>
And suddenly I'm rewatching The Wire for the fiftieth time
<mrvn>
never watched that
<kazinsal>
Definitely a good watch, it's one of the few cop shows that takes an objective lens at the shittiness of American policing
* kof123
scribbles down on clipboard "kazinsal's in the wild naturally arrive close to the magic 1618"
<kazinsal>
And the interpersonal ruthlessness of the urban drug trade
* saltd
throws a baseball bat at kof123
<kazinsal>
The main character is an asshole, he knows he's an asshole, everyone else knows he's an asshole, but he's good at his job so they pretend to tolerate him
<kazinsal>
Because deep down, if you can turn names from red to black, who cares if you're an asshole, the stats are what matters
<saltd>
tolerantee meeee
<kazinsal>
Also, Lance Reddick stars in it, and you know
<kazinsal>
Lance Reddick fuckin owns
<mjg>
the wire? i had only seen the first season so far
* saltd
looking for miles
<mjg>
i liked it
<kazinsal>
Yeah, it's a phenomenal show
<mjg>
but later seasons have much lower ratings and i don't know if i want to continue
<kazinsal>
Go through it all
<mjg>
the first season decently wraps itself up imo
<saltd>
4km
<kazinsal>
S2 is a bit weird your first run through
<kazinsal>
Unless you've got family who've been historically involved in labour unions etc
<kazinsal>
In which case it hits real hard
<kazinsal>
S3 is kind of like more of S1
<mjg>
i watched the first episode, got really bored man
<mjg>
of s2
<kazinsal>
S2 picks up a few episodes in, there's a couple slow episodes at the start while things start back up and the major crimes unit gets their shit together again and McNulty gets off the boat
<mjg>
i love mcnulty fwiw
<kazinsal>
S4 is mixed between the drug trade and the cataclysm of inner-city schooling
<kazinsal>
S5 is McNulty's descent into madness and the death throes of print news
<mrvn>
/me recommends Sledge Hammer!
<mrvn>
"A parody of the ultimate tough cop with a big gun, typified by Clint Eastwood in the Dirty Harry films, who always goes for the most violent solution to any problem."
<kazinsal>
Jamie Hector takes a recurring role in S3 and becomes a starring role in S4 and S5