klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
mctpyt has quit [Ping timeout: 272 seconds]
dmh has joined #osdev
[itchyjunk] has joined #osdev
pretty_dumm_guy has quit [Quit: WeeChat 3.4]
isaacwoods has quit [Quit: WeeChat 3.4]
sdfgsdfg has joined #osdev
masoudd has quit [Ping timeout: 240 seconds]
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
Mutabah has quit [Ping timeout: 256 seconds]
Mutabah has joined #osdev
theruran has joined #osdev
sdfgsdfg has quit [Quit: ayo yoyo ayo yoyo hololo, hololo.]
relipse has joined #osdev
<relipse> I am coding a 2D rpg puzzle game anyone want to see it?
gwizon has quit [Quit: Lost terminal]
gog has quit [Quit: byee]
dude12312414 has quit [Remote host closed the connection]
nyah has quit [Ping timeout: 256 seconds]
sdfgsdfg has joined #osdev
elastic_dog has quit [Ping timeout: 240 seconds]
elastic_dog has joined #osdev
ElectronApps has joined #osdev
Jari-- has joined #osdev
<Jari--> morning all
sdfgsdfg has quit [Quit: ayo yoyo ayo yoyo hololo, hololo.]
rustyy has quit [Quit: leaving]
rustyy has joined #osdev
rustyy has quit [Client Quit]
MiningMarsh has quit [Ping timeout: 240 seconds]
MiningMarsh has joined #osdev
rustyy has joined #osdev
rustyy has quit [Quit: leaving]
rustyy has joined #osdev
rustyy has quit [Quit: leaving]
rustyy has joined #osdev
srjek has quit [Ping timeout: 252 seconds]
bradd has quit [Remote host closed the connection]
<moon-child> relipse: no, not really
<klange> not unless it's getting a PonyOS release
[itchyjunk] has quit [Read error: Connection reset by peer]
bradd has joined #osdev
xenos1984 has quit [Remote host closed the connection]
xenos1984 has joined #osdev
k8yun has joined #osdev
k8yun has quit [Quit: Leaving]
ThinkT510 has quit [Quit: WeeChat 3.4]
ThinkT510 has joined #osdev
vdamewood has quit [Read error: Connection reset by peer]
vdamewood has joined #osdev
wolfshappen has quit [Ping timeout: 256 seconds]
wolfshappen has joined #osdev
the_lanetly_052_ has joined #osdev
the_lanetly_052_ has quit [Remote host closed the connection]
mlombard has quit [Quit: Leaving]
nyah has joined #osdev
Patater has quit [Quit: Explodes into a thousand pieces]
theruran has quit [Quit: Connection closed for inactivity]
iceneko has joined #osdev
GeDaMo has joined #osdev
Jari-- has quit [Ping timeout: 256 seconds]
iceneko has quit [Ping timeout: 256 seconds]
mepy has joined #osdev
_xor has quit [Quit: brb]
mepy has left #osdev [Leaving]
j00ru has quit [Ping timeout: 260 seconds]
j00ru has joined #osdev
j00ru has quit [Ping timeout: 256 seconds]
ElectronApps has quit [Remote host closed the connection]
zaquest has quit [Remote host closed the connection]
zaquest has joined #osdev
j00ru has joined #osdev
the_lanetly_052 has joined #osdev
heat has joined #osdev
the_lanetly_052 has quit [Remote host closed the connection]
dennis95 has joined #osdev
Jari-- has joined #osdev
<heat> what's a good heuristic on consolidating per-thread free lists into the shared free list, for a memory allocator?
<mrvn> heat: keep a counter per thread how many pages are free. Then calculate the average and if you are way above return some pages to a common pool.
<Jari--> You can also make non-virtually mapped memory allocation.
<heat> that doesn't make sense if the memory allocator is only being hit by a single CPU
<Jari--> Just allocate it at the physical memory, in case some process will have access to this.
<heat> i.e doing networking RX on a single CPU, for performance
<mrvn> or use a work stealing list (lock-free). if one thread runs out steal some pages from one above average.
<mrvn> Jari--: that makes no sense.
<mrvn> heat: you can balance memory on each context switch or something that happens regularly
<heat> stealing pages defeats part of the purpose of having a per-cpu/per-thread list
<heat> and balancing memory on context switches seems like a great way to add a shit ton of latency
<mrvn> heat: a work stealing list is tuned for stealing not happening often.
<mrvn> Do you have shared memory and filesystem caches?
<heat> and most of the question is: what's a good way to know when to steal memory from the per-cpu free lists to the shared list?
<heat> yes
<heat> balancing lists based on the average of pages/memory chunks each list has seems really suboptimal
[itchyjunk] has joined #osdev
<mrvn> In what way?
<heat> you want to let memory stay inside the per-cpu lists that are getting hit the most often for as long as possible
<heat> only if you have a bunch of pages inside a per-cpu list that has been mostly stale for a while does that make sense
<mrvn> then have a decaying usage/s variable.
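A decaying usage variable of the kind mrvn suggests can be sketched as an exponentially decayed hit counter kept per CPU list. This is only an illustration: the class name `DecayingRate`, the half-life parameter, and the `is_stale` threshold are all assumptions, not anything from an existing kernel.

```python
import math


class DecayingRate:
    """Exponentially decayed count of allocator hits for one per-CPU list.

    Hypothetical helper: timestamps are plain seconds and must not go backwards.
    """

    def __init__(self, half_life_s: float):
        # After half_life_s seconds without hits the value halves.
        self.decay = math.log(2) / half_life_s
        self.value = 0.0
        self.last = 0.0

    def _age(self, now: float) -> None:
        # Apply the decay for the time elapsed since the last update.
        self.value *= math.exp(-self.decay * (now - self.last))
        self.last = now

    def hit(self, now: float, weight: float = 1.0) -> None:
        # Called on every alloc/free that touches this list.
        self._age(now)
        self.value += weight

    def read(self, now: float) -> float:
        self._age(now)
        return self.value


def is_stale(rate: DecayingRate, now: float, threshold: float) -> bool:
    # A list whose decayed usage fell below the threshold is a candidate
    # for draining into the shared free list.
    return rate.read(now) < threshold
```

A list that is hit constantly keeps a high value and is left alone; one that has been idle for several half-lives drops below the threshold and can give its pages back.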
<mrvn> Do you need something for 4 cores or for 1024?
<heat> yes
<heat> :)
<GeDaMo> 64K cores should be enough for anyone :P
<mrvn> For 4 cores you can just average some metric every time you need to. For N cores you want some tree or graph form to average only a bunch of cores and let that propagate over time.
<mrvn> Won't all your memory be in the filesystem cache except for the local per-thread lists?
<clever> i might just use an uint32_t[4] (or native wordsize) to hold the free-count for each core
<clever> and just read them lock-less
<mrvn> clever: that was my suggestion above.
<clever> the only problem i can see there, is cache line theft
<clever> every time you free, you check those counts, and steal that cache line from the other cores
<heat> mrvn, no? why would it
<mrvn> as long as there is FS activity, not using a free page for cache is wasteful
<clever> and if the other cores are all updating the counts in the same line, they are also stealing it from each other
<clever> so maybe have a per-core avg, and update it less often
<clever> and spread those counters out into separate cachelines?
<heat> all memory is page cache memory except the anonymous memory and the shared memory and the kernel internal allocations
<heat> and those kernel internal allocations are done by a kernel allocator which probably wants a percpu cache as well
<mrvn> So the normal state will be that all memory except a little reserve is in use by processes, kernel or cache.
<clever> thats something that a lot of windows users have trouble with when moving to linux
<heat> that's not really the case
<heat> you need a good amount of memory in free lists
<clever> they see something like 64mb free and 7gig used, and freak out about it being a problem
<mrvn> that's just a matter of adjusting what "little" means.
<clever> and 99% of the time, 6gig of that is just the fs cache
<clever> and they still think flushing the fs cache improves performance
<heat> clever, windows does the same, but it's designed for human beings
<clever> maybe task manager is just lying a bit, and reporting fs cache as free?
<clever> ive not taken such a close look at it
<heat> the task manager now shows a detailed bar
<clever> maybe the same is for those new users? and its just their first time on a slower arm sbc?
<clever> and the first time it slows down, they check ram, and see its low
<heat> compressed memory, page cache(I think?), actually free memory, anonymous memory, etc
<mrvn> My point is that the free lists, per-core and global, should be small. So I would create some metric how much memory a core should hold. If the free list is twice that then return half to the global list. If it's half that check if you can get some from the global list. If it's empty run an IPI or steal some.
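The watermark rule mrvn describes (twice the target: return half; half the target: refill from the global list; global list empty: IPI or steal) could look roughly like the sketch below. The names `PerCpuFreeList` and `rebalance` and the returned action strings are made up for illustration, and pages are modelled as plain list entries.

```python
class PerCpuFreeList:
    """Hypothetical per-CPU free list with a target size (the 'metric')."""

    def __init__(self, target: int):
        self.target = target
        self.pages = []


def rebalance(local: PerCpuFreeList, global_pool: list):
    """Apply the watermark rule once and report what was done.

    Returns (action, count), where action is one of
    "return", "refill", "steal" or "none".
    """
    n = len(local.pages)
    if n >= 2 * local.target:
        # Way above target: hand half of the pages back to the global list.
        give = n // 2
        for _ in range(give):
            global_pool.append(local.pages.pop())
        return ("return", give)
    if n <= local.target // 2:
        if not global_pool:
            # Global list is empty: the caller would have to send an IPI
            # or steal from another CPU (not modelled here).
            return ("steal", 0)
        # Below half the target: top up from the global list.
        take = min(local.target - n, len(global_pool))
        for _ in range(take):
            local.pages.append(global_pool.pop())
        return ("refill", take)
    return ("none", 0)
```

Between the two watermarks nothing happens, so a CPU that allocates and frees at a steady rate never touches the shared list.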
ElectronApps has joined #osdev
<mrvn> clever: Linux dumbed it down for users now too. Top shows "available Memory"
<heat> i understand what you mean but that's not really the case I'm afraid
<heat> kernel memory allocation is way more complex than "page cache uses everything, so just use a tiny list"
<clever> mrvn: `free -m` also shows available, but users still complain when free is low
<heat> turns out most users aren't kernel devs or system devs so they just want to know if there's enough memory for their programs to run
<clever> yeah
<mrvn> clever: but now we can point those to "available". Didn't use to show that.
<clever> another issue, is bloat in chrome, it can no longer run on just 512mb of ram
<clever> and the recent pi02 model only has 512mb
<heat> forcing users to learn kernel concepts just to find out if they can have 30 tabs open on chrome is exactly why linux is not going anywhere on desktop
<heat> the pi zero 2 w isn't meant for desktop usage anyway
<heat> source: i have one
<mrvn> heat: I still don't see a problem. Over time cache will eat all memory till you hit some lower limit of what should be free. At that point you have to balance free lists.
<clever> heat: i cant even have 1 tab open on about:blank, lol
<clever> the system nearly deadlocks
<heat> yeah
<clever> and some users arent using it for desktop, but just kiosk type applications
<heat> compilation is also decently slow
<clever> where it just boots to a static url and auto-refreshes
<clever> i am wondering where all of that ram is even going
<heat> i think chrome has a special build for android (for lower memory usage)
<clever> yeah
<heat> maybe that's proprietary? dunno
<mrvn> clever: it caches rendered frame buffers so switching tabs is fast.
<clever> mrvn: but with a single open tab, thats not much
<heat> chrome does plenty of stuff to make sure things are safe and fast
<bslsk05> ​chromium.googlesource.com: Checking out and building Chromium for Android
<heat> running everything on a single process is more efficient than spreading it out over multiple, for instance
<dmh> i do plenty to remain safe and fast
<mrvn> heat: and then throws it out the window because with more than 6 tabs that would eat too much memory.
<clever> oh, that reminds me, spectre changes make things worse
<clever> in the old days, a given worker process was mixing different domains together
<clever> but to limit the scope of spectre exploits, each proc is now restricted to servicing a single domain
<clever> so you can only ever steal data from yourself
<mrvn> Like each tab gets their own javascript VM for security. Except tab 7 then shares because can't have O(tabs) memory usage
<clever> mrvn: if i pop open shift+escape on my desktop, i can see 8 youtube tabs are sharing a single proc, which is using 850mb of ram
<clever> 251mb just for js
<mrvn> So your youtube tabs can all steal from each other.
joe9 has quit [Quit: leaving]
<clever> kinda, there are at least 2 procs for youtube
<mrvn> what about other pages that just have a youtube video in them?
<clever> it randomly decides to spawn a new one
<mrvn> why does it have so much state that each tab can't be their own process?
<clever> let me check subframes...
<clever> yeah, if i play a youtube video inside discord, a process containing "subframe: https://youtube.com" spikes in cpu
<bslsk05> ​youtube.com <no title>
<clever> so its RPC'ing things between the discord process and the youtube process
<clever> which makes sense, JS doesnt give you very much control over the iframe
<clever> when cross-domain
<clever> mrvn: checking a heap snapshot....
<mrvn> gotten better then.
<clever> 35% snapshotted
<mrvn> Personally I think tabs that haven't been visible for say an hour can be reduced to minimum state. No need to keep a snapshot of the framebuffer, run any java scripts or keep any processes around.
<clever> that framebuffer is actually key to how android performance works
<clever> when your scrolling thru the open tabs, the only state you have is the url and the framebuffer
<clever> the entire dom and js heap just doesnt exist
<clever> and those framebuffers get saved to disk
<clever> so it can give you the illusion of the tabs still being live
<mrvn> then how does mouse-over or any other javascript trigger work?
<clever> android, no hover events
<clever> and you cant click anything until you set focus to that tab, which then reloads the page
<mrvn> oh, I see what you mean, that tab scrolling. I thought you meant scrolling inside the active tab.
<clever> yeah, that one
<clever> the snapshot for a random youtube tab is done
<clever> 73mb shallow size for (system)
<mrvn> that picture is just the visible portion though. I believe tabs cache a lot larger section so scrolling inside the tab is fast
<clever> yeah, when live
<clever> under heavy strain, i can scroll faster than it can render, and that gets exposed
<heat> i bet most of the memory usage isn't O(ntabs)
<heat> a regular web browser uses a gig of memory with what, 4-6 tabs?
<mrvn> 15485 mrvn 20 0 531208 259532 19028 R 32.1 0.4 8782:11 MainThread
<mrvn> I haven't used my browser today at all. Why is it using 32% cpu time?
<clever> mrvn: hit shift+escape, and sort by CPU
<heat> render?
<mrvn> Another 30% for Web Content
<GeDaMo> GIFs?
<clever> GeDaMo: yeah, i have caught imgur eating cpu before
<bauen1> video encoding / decoding of some sort in the background ?
<mrvn> clever: firefox
<GeDaMo> GIFs in particular are bad
<mrvn> The visible tab has a paused video on it.
<heat> also av1 has no gpu acceleration
<heat> and you need to explicitly enable GPU acceleration on linux
<GeDaMo> What annoys me about all the JS on Youtube is that none of it has anything to do with decoding video :|
<heat> because linux
<bauen1> GeDaMo: just be happy that you can block a crap ton of it in a proper browser :(
<heat> if you want to do something without getting a BSc in computer science you're absolutely wrong
<clever> GeDaMo: oh, have you seen blob url's?
<GeDaMo> You mean like data: ?
<clever> nope, blob:
<clever> its a better api
<GeDaMo> Better for whom? :|
<clever> the browser performance
<clever> basically, you can take a JS byte-array, and pass it to the browser
<clever> the browser then returns an opaque token like 'blob:https://www.youtube.com/c6bdc7d8-45f9-47c2-bf7e-fc0e77eb810d' back to you
<bslsk05> ​redirect -> consent.youtube.com: Before you continue to YouTube
<clever> you can then use that in any network based operation
<clever> and the browser will just refer to the byte-array
<clever> no need to convert the bytes to hex, and then back to bytes
<clever> no waste generating a string 2-3x the size of the blob
<clever> GeDaMo: this also works with <input type=file> boxes
gog has joined #osdev
<GeDaMo> How does this cut down on all the JS on YT?
<clever> it doesnt, but it makes some of their crazy methods use less ram/cpu
<clever> i dont know why, but the data for the video, isnt just a plain http(s) request
<clever> its ajax and js based
<GeDaMo> Yeah, I've watched the network console :P
<heat> >i don't know why
<clever> and this lets the blobs be passed into a <video> tag without much more overhead
<heat> copyright?
<heat> obfuscation?
* clever points to youtube-dl
<clever> hows that working? :P
<heat> RE
<mrvn> And who thought up that any javascript you load from some obscure ads URL can throw a transparent box over the whole browser and capture any mouse clicks?
<heat> not saying it works, just saying it's probably needed
<clever> if you grab an element for a <input type=file>, you can then do URL.createObjectURL(element.files[0]); and youll get a blob: token like above
<clever> you can then use that token in anything that expects a url (ajax, img tag, others), and get the contents of a local file the user selected
<mrvn> .oO(or not selected)
<mrvn> Having the browser load file:///etc/passwd still scares me
<bslsk05> ​developer.mozilla.org: File - Web APIs | MDN
<clever> mrvn: but the cross-domain policies wont allow JS to actually read the contents
<clever> only if the user intentionally chooses that file in a <input type=file> will that be possible
<heat> fuchsia is meant to solve that issue
<mrvn> how does that work with uploading the blob to a server and getting it send back?
<heat> can't look at etc/passwd unless you give chrome a handle
<clever> mrvn: how is the file uploaded?
<mrvn> as "blob:file:///etc/passwd" if I followed the discussion right
<clever> that wont work
<clever> the browser generates an opaque token, like blob:https://www.youtube.com/c6bdc7d8-45f9-47c2-bf7e-fc0e77eb810d
<clever> which has your domain, and a randomly generated string
<clever> it then looks that up in a table, to find the original byte-array you used to create it
<clever> so you can only access blobs you already had access to
<mrvn> and keeps that token in memory forever in case it someday comes back to the tab?
<clever> probably tied to the lifetime of that tab
<bslsk05> ​developer.mozilla.org: URL.createObjectURL() - Web APIs | MDN
<clever> > The URL lifetime is tied to the document in the window on which it was created.
<clever> yep
<mrvn> I guess that breaks the back button then
<clever> > Each time you call createObjectURL(), a new object URL is created, even if you've already created one for the same object. Each of these must be released by calling URL.revokeObjectURL() when you no longer need them.
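The token-to-bytes lookup clever describes can be modelled with a small table. This is a toy model only, not how any browser implements it: `BlobRegistry` and its method names are hypothetical and merely mirror the `URL.createObjectURL()` / `URL.revokeObjectURL()` pair quoted above.

```python
import uuid


class BlobRegistry:
    """Toy per-document table mapping opaque blob: tokens to byte arrays."""

    def __init__(self, origin: str):
        self.origin = origin
        self._table = {}

    def create_object_url(self, data: bytes) -> str:
        # Every call mints a fresh token, even for the same object.
        token = f"blob:{self.origin}/{uuid.uuid4()}"
        self._table[token] = data
        return token

    def resolve(self, token: str):
        # Only tokens minted by this registry resolve; anything else,
        # e.g. a hand-written "blob:file:///etc/passwd", yields None.
        return self._table.get(token)

    def revoke_object_url(self, token: str) -> None:
        # Drop the entry so the bytes can be released.
        self._table.pop(token, None)
```

Because the token is just a key into a table the page itself filled, it only ever gives access to bytes the page already had.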
<bslsk05> ​developer.mozilla.org: MediaSource - Web APIs | MDN
<clever> mrvn: and i think youtube is using the MediaSource api, rather than the Blob api
<clever> it looks like some kind of js managed ringbuffer, to allow access to a seekable media file, in chunks
<bslsk05> ​w3c.github.io: Media Source Extensions™
<dmh> browsers might be both the easiest and worst os to develop for
<clever> heat: oh, another feature that blob: and MediaSource offer, is changing the bitrate
<clever> if the remote server slices your mp4 file up cleanly at keyframes and whole container level packets
<clever> then the JS can dynamically switch between different bitrates, and append chunks to the stream as it gets them
<clever> and the <video> tag will just deal with it
<clever> > Define a splicing and buffering model that facilitates use cases like adaptive streaming, ad-insertion, time-shifting, and video editing.
<clever> oh, and ad's, ive got an adblocker, so i rarely think of that case
<bslsk05> ​en.wikipedia.org: Dynamic Adaptive Streaming over HTTP - Wikipedia
<heat> i should port chromium
<clever> GeDaMo: ah yeah, ive used dash and hls before, with the nginx-rtmp plugin
<bslsk05> ​arut/nginx-rtmp-module - NGINX-based Media Streaming Server (3314 forks/11669 stargazers/BSD-2-Clause)
<clever> it can accept an rtmp input (from obs or ffmpeg for example), and then provide rtmp/hls/dash streams to viewers
<heat> >43 deps
<heat> i'm fucked
<heat> why does it need pciutils
<clever> and based on the MediaSource stuff above, and some MDN example code, i'm assuming xhr.responseType = 'arraybuffer'; could be used to fetch chunks of the dash/hls video
<GeDaMo> Pfft! Just write your own browser :P
<clever> which nginx-rtmp is pre-slicing
<clever> https://developer.mozilla.org/en-US/docs/Web/API/SourceBuffer has example code on how to shove that arraybuffer into a mediasource
<bslsk05> ​developer.mozilla.org: SourceBuffer - Web APIs | MDN
<clever> heat: webusb is why chrome has libusb as a dep
<heat> i would rather rewrite my kernel in java than write my own browser
<bslsk05> ​platform.html5.org: The Web Platform: Browser technologies
<clever> ive used that to prod the rpi bootloader from js
<heat> something that I could actually use though
<heat> mesa
pretty_dumm_guy has joined #osdev
<heat> if I got mesa I could get opengl, vulkan, then a compositor on top of that
<heat> chromium shouldn't be too hard after that
<clever> and webgl needs mesa anyways :P
<heat> but chromium doesn't need webgl
<heat> fuchsia has a chromium build but has no opengl
<clever> ah, some parts are probably optional then
<heat> actually if fuchsia can run chromium maybe it's not that hard to port
<heat> it's just "hard" instead of "extremely hard"
* clever heads off to bed
Patater has joined #osdev
Patater has quit [Client Quit]
Patater has joined #osdev
masoudd has joined #osdev
Vercas has quit [Remote host closed the connection]
Vercas has joined #osdev
Patater has quit [Client Quit]
Patater has joined #osdev
X-Scale has joined #osdev
m3a has joined #osdev
ElectronApps has quit [Remote host closed the connection]
mahmutov has joined #osdev
theruran has joined #osdev
terminalpusher has joined #osdev
srjek has joined #osdev
terminalpusher has quit [Remote host closed the connection]
dude12312414 has joined #osdev
the_lanetly_052 has joined #osdev
elastic_dog has quit [Ping timeout: 240 seconds]
elastic_dog has joined #osdev
* geist yawns
<geist> good morning folks
dennis95 has quit [Quit: Leaving]
<geist> actually got up a while ago, just forgot to yawn in this direction
* mjg yawns back
<mjg> do we have any locals in .ua?
<mrvn> geist: got any plans for a web browser that doesn't need gigabytes of ram?
<geist> get more gigabytes!
<GeDaMo> I suspect geist has shares in RAM manufacturers :|
<mjg> the address space is there to be used
<geist> but yeah web browsers soaking up ram is pretty annoying
<geist> to be fair i think i blame a lot of web apps for it. i think the browsers are largely growing in size due to the size of the individual pages and how much state they keep going
<mjg> also have fun keeping long uptime without restarting them
<geist> if you poke around in chrome's task manager you can really see a wide disparity based on individual pages
<geist> can easily have a few hundred MB of just javascript heap/etc
<mrvn> geist: That's a problem of the underlying design of the web
<geist> assuming it's GCing reasonably efficiently, i dont think the browser can do much about it
<mrvn> used to be the web was for hypertext. Now it's a VM running server driven games.
<mrvn> geist: it could not run the crap
<geist> sure, but then that's not a modern web browser
<geist> you are free to disable that crap, but then that's a different thing
<mrvn> geist: run it in tabs the user looks at.
<GeDaMo> Web browser's just another operating system :|
dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]
<geist> might be worse, because then it has to keep reloading it, etc
<geist> but yes theres' a fair amount of that sort of thing going on in the background
<graphitemaster> dammit, who yawned - that shit be contagious
<mrvn> geist: used to be you could hit ESC on a page and it would stop. I really miss that.
<mrvn> The annoying bit is that 99.9% of the crap going on in the background is for ads. It's not for the benefit of the user.
<geist> indeed
<graphitemaster> An empty `Map()` object in V8 JS engine is already 4 MiB of physical RAM - The web is not very 'ram' optimized.
<GeDaMo> graphitemaster: where do you see that? :|
<graphitemaster> I did some memory profiling awhile back on a node app
<GeDaMo> Ouch
<graphitemaster> Was surprised how much memory Map and Set used compared to regular JS object dictionaries
<mrvn> graphitemaster: does Map() take an argument for the expected map size?
<geist> guess 4MB is the default hash table size?
<geist> i guess it could at least demand fault it in
<geist> though i suppose the runtime probably memsets it
<geist> that's definitely something we've discovered in fuchsia. there's a tremendous amount of code out there in various languages that allocate something out of their heap and then memset it to zero and never touch it again
<graphitemaster> There's no way to set the capacity of a Map in JS. This has more to do with how V8 has different object heaps to keep things from trampling over another, and the largeish alignment they need so they can hack stuff into the pointer bits
<geist> so much of a problem we actually added to the VM a pass that does a quick zero check of pages that were recently faulted in
<geist> a lot of that extraneous zeroing is basically security minded code that trusts nothing
<mrvn> geist: Is memory allocated from fuchsia guaranteed to be zeroed?
<geist> yep, no other option
<geist> but the code at that level doesn't know, since it's probably pulled a block out of its heap
<geist> which may or may not be fresh from the OS
<graphitemaster> geist, Good memset implementations used by memory allocation should just sit in a loop comparing say 64-bit words at a time to check if they're all zero first, before actually zeroing memory, just to avoid those page-faults - since largish allocations already pull in zero page - like musl for instance does this in calloc, as does glibc iirc.
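The check-before-zeroing idea graphitemaster mentions can be illustrated as follows. A real allocator would do this in C on machine words; this sketch works on a `bytearray` one page at a time, and the function name `zero_if_dirty` and the 4096-byte page size are assumptions.

```python
def zero_if_dirty(buf: bytearray, page_size: int = 4096) -> int:
    """Zero only the pages that contain a non-zero byte.

    Returns the number of pages that were actually written, so pages that
    are already zero (e.g. fresh from the OS) are read but never dirtied.
    """
    view = memoryview(buf)
    zero_page = bytes(page_size)
    written = 0
    for off in range(0, len(buf), page_size):
        chunk = view[off:off + page_size]
        zeros = zero_page[:len(chunk)]
        if chunk != zeros:
            # Only now do we write, which is what would trigger the
            # page fault / copy of the shared zero page in a real system.
            chunk[:] = zeros
            written += 1
    return written
```

Reading an untouched page is cheap; writing it forces the kernel to back it with private memory, which is the cost the check avoids.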
<geist> and since C++ (or rust or whatnot) has no equivalent of calloc
<mrvn> geist: that code should calloc instead of malloc
<graphitemaster> Yeah, bingo.
<geist> C++
<graphitemaster> Fuckin' Rust *shrug*
<geist> or rust
<graphitemaster> Systems language, yet has no "zero alloc" optimization XD
<mrvn> geist: new() should have a calloc flavour
<geist> but whats really going on is it's a modern policy to have your class or object zero out pretty much everything in its constructor
<geist> which is highly encouraged at work, etc for security reasons
<geist> so doesn't matter anyway, by the time the object is made, boom another zero
<mrvn> The compiler should know about allocated memory being zero and about initializations filling in zero and skip them.
<geist> not sure how that's feasible since it doesn't know where the pointer came from
<geist> you'd have to tag the pointer somehow with a source
<mrvn> geist: or a different new() that ensures zeroed out memory.
<graphitemaster> Having worked in the games industry long enough, zeroing memory has a pretty insane cost associated with it. Take for instance a simple vec3 class of x,y,z floats, zeroing them out doesn't seem too egregious, until you end up with vector<vec3> which is the most commonly used object in an engine (or some flavor of it) and well zeroing the vec has a large enough cost that in a physics library like Bullet, turning it off in the class (which
<graphitemaster> you can do with a define) makes the library almost 30% faster at physics calculations.
<geist> graphitemaster: yah totally
<geist> mrvn: sure there are plenty of solutions if you're willing to break the language
<mrvn> geist: wouldn't break anything. The object is still properly initialized.
<geist> and as graphitemaster is pointing out, this sort of stuff is probably not tolerated in the games industry
<mrvn> graphitemaster: vector<vec3> should not zero anything. Only used parts of the vector should get initialized.
<geist> graphitemaster: yah we *also* have an extra safety thing in clang in the kernel that fills all new locals with a nonzero pattern
<graphitemaster> it's worse in c++ because constructors are zeroing objects, object at a time, so something like vector<vec3>::resize(1 million) is making 1 million function calls assigning this->*... to zero
<geist> also has some performance hits
<graphitemaster> Rather than just using memset(0)
<mrvn> that's more like it.
<geist> worse when the constructor isn't inlined (which is probably filling it with zeros), since the compiler will fill the space with garbage, and then run the constructor which overwrites that with zeros
<geist> but yeah i definitely remember some regressions in speed when we moved some code in the zircon kernel from C to C++
<geist> the memset -> O(N) constructor for example as we switched the vm_page struct from C to C++
<geist> in C can just memset over the whole array and make sure 0 is the default state
<mrvn> graphitemaster: A vec3 should have the default constructor and if the std::allocator had a flag saying memory is zeroed then the default constructor would just not do anything.
<geist> but then when it grew a constructor, boom you have a loop and constructors. inlined of course but still less efficient than a memset across the whole thing
<graphitemaster> I don't think security is at odds with high performance code though. I think these are exclusively problems with modern language design and the pursuit of value-semantics and the lack of good allocation information within the design of the language. Like if you didn't just treat the allocator as a global concept that allocated an object (like both Rust and C++ do), but had language-concept of an allocator and designed different types of
<graphitemaster> allocations (zeroed alloc, non-zeroed alloc, zeroed-resize, non-zeroed resize, the aligned versions of all of those, sized-free, etc) you can do a billion times better job while also being secure.
<geist> yah totes
<j`ey> rust has those concepts now
<mrvn> C++ has gone a lot more towards speed with move semantics.
<bslsk05> ​doc.rust-lang.org: Allocator in std::alloc - Rust
<graphitemaster> Yeah Rust is getting a bit better here, there's still a lot missing from the design I'd say. One major gripe I have is no language really provides allocation interfaces over virtual memory specifically. Like if you want to create a ring-buffer for instance where you just tie the ends together with virtual memory rather than wrapping the buffer with modular arithmetic.
<mrvn> graphitemaster: My idea would have the allocator (or the underlying memory resource) zero out the memory. So vector<vec3> would do a big memset(), or even skip that when the OS gives zeroed memory.
<geist> yah the ring buffer would probably go against the whole model of ownership, i guess
<geist> since you'd now officially have two views of the same thing
<mrvn> You mean have a uint32_t buf[2048]; and use the same physical page for both halfs to make a 1024 element ring?
<geist> i have heard folks talk about things involving multiple mappings of the same object and how to deal with it in rust
<graphitemaster> A lot of optimizations can be done at the virtual memory layer, for instance one of the engines I worked on owns the entire heap - and it has a memcpy implementation which is capable of determining if the addresses are in the managed engine heap, and it'll actually only copy up to a page on each side of the memcpy for page-alignment, then hack at its heap data structures to perform the inner-copy with virtual memory shenanigans, so it can
<graphitemaster> copy gigabytes of data at the cost of 2 pages basically (max), and let page-faults deal with the rest. If you convert it back to regular C memcpy ... the performance is lost and stuff like level loads take almost 20x longer.
<mrvn> but then you assume the copied memory isn't all used. Otherwise all the page faults cost you more.
<mrvn> mmap/mremap should have a flag to COW a block of memory.
<graphitemaster> Right, I think it's just more about the amortization of the page-faults. A memcpy is an expensive inline operation and on large data copies it can block for a long time, while something like a ton of page-faults is spread more evenly in the frame where it's needed, it hides better
<graphitemaster> Most of what you care about in realtime is hiding spikes, you're more latency focused
<mrvn> totally. In a realtime system you can't memcpy() 1GB.
<graphitemaster> Unless you're an M1 mac :P
<graphitemaster> Then you apparently can and that's not fair XD
<mrvn> "If the value of old_size is zero, and old_address refers to a shareable mapping (see mmap(2) MAP_SHARED), then mremap() will create a new mapping of the same pages.
<mrvn> " so it seems you can copy pages. Any way to turn those pages into COW semantic?
<graphitemaster> Who knows, a lot of platforms don't even have good ways to do copy-on-write.
<graphitemaster> Like Windows for instance, getting the POSIX behavior where you just get the zero page over and over for your allocation and then it page-faults on writes and actually commits physical memory is just not a thing really.
<graphitemaster> On Windows you can allocate virtual memory and you can physically commit it when needed, but you can't get the implicit behavior of COW as far as I know
terminalpusher has joined #osdev
<graphitemaster> The commit charge is always billed at the VirtualAlloc(MEM_COMMIT) call and the call can fail if you run out of commit
<graphitemaster> Which sucks because the only sensible way to do something like this is to allocate a virtual region large enough that you never need to worry about running out in the case of say a ring buffer
<graphitemaster> But you obviously cannot commit a terabyte :P
<mrvn> graphitemaster: and instead of doing % you want to free pages at the front while faulting in pages at the end?
<graphitemaster> I mean ideally I just want to virtually allocate large everything so I never need to worry. Like if every std::vector<T> just used like 1 TiB of virtual memory :P
<graphitemaster> Then every resize is in place
<graphitemaster> Basically I envision an OS where there really aren't any actual memory allocation interfaces in user-space. You just get the entire virtual address range and the C allocator just bumps along that space never needing to make a system call to allocate, only ever to free pages which can be done in bulk
<graphitemaster> Then let page-faults and cow deal with the rest
<mrvn> did that, except no syscall to free pages.
<mrvn> basically what you have then is a system with (s)brk and #define brk(size) NOP
<mrvn> #define brk(size) 0 I mean
<geist> you could functionally do that in fuchsia by just creating a huge VMO and mapping it over most of your address space
<geist> and then letting it fault in wherever you touch it
<geist> other OSes too, with a gigantic anonymous mapping
<mrvn> If you do demand page faulting then the malloc() call becomes totally obsolete other than protecting against runaway pointers.
<mrvn> It's too bad there is no MAP_GROWSUP analog to MAP_GROWSDOWN
<mrvn> But you can map something to 0x80....000 - 0x1000 with MAP_GROWSDOWN and have malloc grow the heap downwards.
terminalpusher has quit [Remote host closed the connection]
<graphitemaster> We should also bump the stack size considerably
<mrvn> 4k isn't enough for you?
<gog> 4k should be enough for anybody
<mrvn> My mikrotasks only have 4k memory in total. That includes the page table, struct Task, Heap and Stack.
<mrvn> Running 1 million tasks on a RPi was fun.
the_lanetly_052 has quit [Ping timeout: 240 seconds]
<bslsk05> ​os.phil-opp.com: Writing an OS in Rust
<jimbzy> I found that on OSNews and it looked pretty interesting.
<geist> noice. i've seen a few so far but haven't seen a really good next level one
<geist> lots of simple taskers, but not yet one that's got all the major pieces in place
<geist> not saying this one doesn't (haven't looked at it yet)
<jimbzy> I think I saw one on osdev that's re-working the ARM tutorials.
<geist> yah i'm in no way saying rust is insufficient for kernel work, but there are some data structure challenges i'd love to see solved efficiently
<jimbzy> I'll save it for a later look. I've been working on a little game pack using the Godot game engine.
<j`ey> the phil-opp one doesn't go super deep
<jimbzy> j`ey, Just another "bare-bones" style kernel?
<j`ey> a little bit more than that
<jimbzy> Yeah, I'll give it a look soon.
<j`ey> geist: you can always fall back to unsafe+pointers :P
<geist> of course. that's the question to me: how unsafe do you realistically need to be able to build an actually performant rust kernel
<geist> stuff like data structures accessed in interrupt mode, etc
<geist> global thread lists, etc. there are slow but correct solutions for that that i fear would be not good enough
<jimbzy> Pssh. Safety? That's why we have the "hold harmless clause" ;)
<j`ey> yah stuff like that will likely be unsafe, borrowing rules can't really work across interrupts and stuff
<geist> right. of course if you just turn off all the safeties and have to stick to a smaller subset of the language you start to get into the 'why bother with rust' question
dude12312414 has joined #osdev
<geist> vs say having a core system in C/C++ linked with rust drivers or subsystems, which seems like would be a pretty good solution
<j`ey> if you're starting from scratch anyway *shrug*
<mrvn> does that include having threads sleeping till an IRQ happens?
<geist> that might be the solution, running all irqs in thread context
<mrvn> turning IRQs into messages sent to threads?
<geist> or having dedicated threads that wake up on irq. not an uncommon solution of course, even in monokernels
<geist> but that might avoid the problem of having driver code running in interrupt context and thus have more of the code in the system exposed to that sort of locking issue
dude12312414 has quit [Remote host closed the connection]
<mrvn> jimbzy: The helm is so nothing flies into your eyes while you are blind?
<jimbzy> Exactly!
<bslsk05> ​'r6mx0vrqm5j81' by [idk] (--:--:--)
<jimbzy> I'm pretty sure that's a Speedglas auto-darkening hood, too.
<mrvn> I want a hood with a screen on the inside and camera on the outside that maintains a constant brightness on the screen.
<jimbzy> That's easy. The hard part is making one that doesn't get nuked due to the EMF.
<mrvn> EMF?
<jimbzy> Electromagnetic field.
<mrvn> do you mean EMP?
<jimbzy> You wish it was just a pulse :p
<mrvn> should be lots of pulses
<bslsk05> ​'1000 AMPS Stick Welding' by WeldTube (00:06:15)
<mrvn> see how the light fluctuates? Lots and lots of sparks.
<jimbzy> Yeah
<jimbzy> You can manipulate the properties of the weld by manipulating the lead angle and arc length.
<mrvn> Is the power supply even a steady current or pulses?
<jimbzy> Stick welding is funky like that, though, because of the shielding material covering the metal. As the electrode is consumed it creates a little gas pocket around the molten pool.
<GeDaMo> Maybe charging/discharging capacitors?
<jimbzy> It depends on the machine and process.
<jimbzy> There are also constant amperage or constant voltage machines. I believe stick welding like that is constant current.
<mrvn> With 1000 A you certainly don't want to pay the electricity bill if that's a steady current.
<geist> i forget, is the current exceptionally high or very low? one of the two
<jimbzy> High
<geist> but fairly low voltage
<jimbzy> Yeah
<GeDaMo> Seems like "capacitive discharge welder" is a thing
<jimbzy> At 200amps I was usually pushing around 14-15v iirc.
<mrvn> geist: you are creating a short with 2 metals. Lots and lots of current.
<geist> i remember being surprised that the total W is not as high as you'd think
<jimbzy> Yeah, basically just an out of control heating element.
<geist> though i guess still a few kW
gxt has quit [Ping timeout: 240 seconds]
<mrvn> Your power in is limited to 16A at 240V so do the math.
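Doing that math with the figures quoted in the channel (16 A at 240 V on the supply side, roughly 200 A at 14 to 15 V at the arc), and using a 14.5 V midpoint as an assumption:

```latex
\begin{align*}
P &= V \cdot I \\
P_{\text{supply}} &= 240\,\mathrm{V} \times 16\,\mathrm{A} = 3840\,\mathrm{W} \approx 3.8\,\mathrm{kW} \\
P_{\text{arc}} &\approx 14.5\,\mathrm{V} \times 200\,\mathrm{A} = 2900\,\mathrm{W} \approx 2.9\,\mathrm{kW}
\end{align*}
```

High current at low voltage still lands at a few kilowatts.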
<jimbzy> Usually like 6.5kw in and 4 out.
<mrvn> unless you have one that needs 3 phase current.
<jimbzy> And it's not constant. They have a duty cycle.
gxt has joined #osdev
<geist> well, okay thats fairly high
goncalo has joined #osdev
<geist> i do remember being happy when i bought my house that it has a pretty solid set of 220 circuits in the garage
<geist> and 400A of power into the house, so could do that sort of thing if wanted
<jimbzy> Yeah that's pretty tight.
<jimbzy> No 220 in the garage here, but the laundry room is there, so I could make it happen if needed.
<geist> and there used to be a hot tub on the deck so there's this whole 50A 220 circuit dedicated to it
<jimbzy> That's really awesome.
<geist> no idea what to do with it, but it's there
<mrvn> I have 240V everywhere.
<jimbzy> geist, Tesla coil?
<geist> was waiting for someone in europe to point out they have 240V everywhere
<geist> jimbzy: hmmm!
<mrvn> We had 220V like 50 years ago.
<GeDaMo> Crazy Americans with their 120V :P
<mrvn> too inefficient so they started to raise it a bit every year.
<geist> but when we do have 220 (or 240 i forget) we usually have fairly high amperage circuits for it
<gog> yeah like 50amp
<mrvn> In germany we have 3 phase power for that.
<jimbzy> We have 3phase here, too.
<jimbzy> 440v
<gog> the one thing i don't like is that the receptacles here don't have a standard neutral
<GeDaMo> In the UK, I think we have 30 amp for things like cookers
<mrvn> And if 3 phases of 240V isn't enough there is also 400V iirc.
<gog> it's whichever
<mrvn> In the UK they also have this strange wiring where the wire forms a loop and you can draw more A from the sockets than the wire should be able to do if you spread it around the room because it then flows from both sides.
<jimbzy> You guys are 50hz too aren't you?
<geist> yah i read about that one time and didn't completely grok it
<gog> oh yeah loop circuits
<mrvn> geist: R = I * V, which basically gives you heat. Every power outlet has two wires providing power so I gets split into left and right.
joe9 has joined #osdev
<jimbzy> Doesn't that require load balancing?
<mrvn> I think they stopped using that because it has some odd properties (because you don't load balance) and it's better to just have more non-looped wires.
<jimbzy> Interesting.
<mrvn> I think UK + Ireland is also the only place where they have fuses in every outlet.
<eryjus> mrvn: V = I * R
<geist> yah i understood the basics i guess i just didn't grok how it'd go back to the fuse panel and whatnot
<geist> but also didn't try too hard
<mrvn> eryjus: hey, at least I got all the letters :)
<eryjus> lol
<jimbzy> We're sorry, but your answer has to be in the form of a vector.
GeDaMo has quit [Remote host closed the connection]
<graphitemaster> We call them operating systems but they don't operate anymore.
<gog> makes ya think
<heat> bootloaders don't load boots either
<gog> :'(
<mrvn> heat: except on bootos
rustyy has quit [Quit: leaving]
<graphitemaster> The entirety of computers is a lie
<graphitemaster> They don't even compute
rustyy has joined #osdev
<gog> it's ok
<mrvn> what are they? Too hot for a fridge, too cold for a stove, not enough airflow for a cooling fan, ....
<heat> reject compoter become m
Clockface has joined #osdev
<Clockface> i have noticed for the mod/rm byte that some of the offsets are stuff like disp16^2 or disp8^3
<Clockface> does this mean that the value stored should be exponentiated before being used?
<mrvn> hardly
<mrvn> is that footnote 2 and 3?
<Clockface> oh...
<Clockface> nevermind then
<Clockface> thank you
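For reference, a small sketch of how the ModRM byte splits into fields and how many displacement bytes follow it in 16-bit addressing mode. The superscripts on disp16 and disp8 in the manual's table are footnote markers, so the displacement is simply added to the base (disp8 is sign-extended first). The struct and function names here are made up for illustration.

```c
#include <stdint.h>

struct modrm {
    uint8_t mod;  /* bits 7..6: addressing mode */
    uint8_t reg;  /* bits 5..3: register or opcode extension */
    uint8_t rm;   /* bits 2..0: register or memory operand */
};

/* Split a ModRM byte into its three fields. */
static struct modrm modrm_decode(uint8_t b)
{
    struct modrm m = { (uint8_t)(b >> 6), (uint8_t)((b >> 3) & 7), (uint8_t)(b & 7) };
    return m;
}

/* Number of displacement bytes after ModRM in 16-bit addressing mode. */
static int disp_bytes_16(struct modrm m)
{
    switch (m.mod) {
    case 0:  return m.rm == 6 ? 2 : 0;  /* rm=110 means a bare disp16 */
    case 1:  return 1;                  /* disp8, sign-extended */
    case 2:  return 2;                  /* disp16 */
    default: return 0;                  /* mod=3: register operand */
    }
}
```

For example, the byte 0x46 decodes to mod=1, reg=0, rm=6, which is [BP + disp8] with one displacement byte following.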
troseman has joined #osdev
mahmutov has quit [Ping timeout: 240 seconds]
pretty_dumm_guy has quit [Ping timeout: 240 seconds]
pretty_dumm_guy has joined #osdev
[itchyjunk] has quit [Remote host closed the connection]
[itchyjunk] has joined #osdev
orthoplex64 has quit [Ping timeout: 240 seconds]
crm has joined #osdev
orthoplex64 has joined #osdev
crm has quit [Ping timeout: 272 seconds]
emartinez has joined #osdev
heat has quit [Remote host closed the connection]