klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books
<geist> ah just a temporary breakage i guess
<heat> not temporary, I dunno what happened
<heat> maybe LTO doesn't work anymore for older clang versions
gog has quit [Ping timeout: 260 seconds]
wgrant has quit [Quit: WeeChat 3.5]
wgrant has joined #osdev
vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
nyah has quit [Ping timeout: 252 seconds]
frkzoid has joined #osdev
FreeFull has quit []
tarel2 has joined #osdev
mzxtuelkl has quit [Quit: fBNC - https://bnc4free.com]
dennis95 has quit [Quit: fBNC - https://bnc4free.com]
friedy has quit [Quit: fBNC - https://bnc4free.com]
andreas303 has quit [Remote host closed the connection]
terrorjack has quit [Quit: The Lounge - https://thelounge.chat]
<wxwisiasdf> where can i find a list of functions gcc requires
terrorjack has joined #osdev
<zid> ..what?
<zid> To build it?
elastic_dog has quit [Ping timeout: 244 seconds]
<wxwisiasdf> zid: No, to run it
dennis95 has joined #osdev
<zid> you mean syscalls then? depends on OS
<wxwisiasdf> no i meant as in
<wxwisiasdf> what does cc need to do a successful C compilation
<wxwisiasdf> cc piped into as
<zid> where do the functions come in?
<moon-child> wxwisiasdf: memcpy, memmove, memset, memcmp, and abort
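A minimal sketch of the freestanding support routines moon-child lists: even with -ffreestanding, GCC-generated code may emit calls to memcpy, memmove, memset and memcmp (and may call abort), so a kernel has to supply them. Unoptimized, illustrative implementations only:

    #include <stddef.h>

    void *memcpy(void *dst, const void *src, size_t n)
    {
        unsigned char *d = dst;
        const unsigned char *s = src;
        while (n--)
            *d++ = *s++;
        return dst;
    }

    void *memmove(void *dst, const void *src, size_t n)
    {
        unsigned char *d = dst;
        const unsigned char *s = src;
        if (d < s)
            while (n--)
                *d++ = *s++;
        else
            while (n--)
                d[n] = s[n];    /* copy backwards so overlapping ranges work */
        return dst;
    }

    void *memset(void *dst, int c, size_t n)
    {
        unsigned char *d = dst;
        while (n--)
            *d++ = (unsigned char)c;
        return dst;
    }

    int memcmp(const void *a, const void *b, size_t n)
    {
        const unsigned char *x = a, *y = b;
        for (; n--; x++, y++)
            if (*x != *y)
                return *x - *y;
        return 0;
    }

    void abort(void)
    {
        for (;;)
            ;                   /* a real kernel would panic here */
    }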
<wxwisiasdf> sorry my bad, yes to build it
<bslsk05> ​gcc.gnu.org: Standards (Using the GNU Compiler Collection (GCC))
<wxwisiasdf> everything from the ansi c language?
<wxwisiasdf> i guess gnulib would cover the rest no
heat has quit [Ping timeout: 260 seconds]
<zid> I'm still incredibly confused as to what you're asking
<wxwisiasdf> to build gcc
<zid> Okay. "To build gcc, ..." now fill in the ... part
<Griwes> the answer is "it's whatever you get an error about missing when you try to compile it"
<wxwisiasdf> and run it?
<wxwisiasdf> fair
<zid> it doesn't need any 'functions', or it needs '50 million' of them, depending on what you're asking
<wxwisiasdf> i am asking what gnulib won't cover
<wxwisiasdf> and what i would need to implement to successfully build it
<zid> That question is much different, and is "gcc, coreutils, binutils, make, sh"
<wxwisiasdf> last time i tried i did implement a good chunk of the c library but gnulib expected some regex shit
<wxwisiasdf> zid: no, just gcc, building gcc with a libc and getting it running on one's os
<wxwisiasdf> well not gcc, more like just cc
elastic_dog has joined #osdev
<wxwisiasdf> i think i will just grab pdpclib and see what happens
andreas303 has joined #osdev
wxwisiasdf has quit [Quit: Lost terminal]
tarel2 has quit [Quit: Client closed]
[itchyjunk] has quit [Remote host closed the connection]
[itchyjunk] has joined #osdev
opal has quit [Ping timeout: 258 seconds]
opal has joined #osdev
<CompanionCube> gnulib isn't even a c library, that's glibc
[itchyjunk] has quit [Remote host closed the connection]
<clever> CompanionCube: making some progress on my zfs code after adding lz4 into the mix, i can now parse the root dnode, which i suspect contains an array of more dnode's, for every object in the pool
mavhq has joined #osdev
<CompanionCube> clever: iirc it's not that simple
<clever> so far, the main thing ive figured out, is that the root block pointer, is just pointing to a single raw dnode
<clever> and that dnode, i believe refers to object 0, the mos
<CompanionCube> yes, and the mos refers to other object sets
<clever> is the mos then an array of dnode's?
<CompanionCube> yes but also no
<clever> where the L0 pointers, point to a block full of dnode's?
<bslsk05> ​utcc.utoronto.ca: Chris's Wiki :: blog/solaris/ZFSBroadDiskStructure
<clever> ive mostly been going off of http://www.giis.co.in/Zfs_ondiskformat.pdf, and its not clear on several points
<clever> like the fact that the root ptr is pointing to a dnode
<clever> reading your latest link...
<clever> > and things aren't always immediately clear from it.
<clever> yep
<clever> CompanionCube: and each dnode is 1024 bytes long i believe?
<clever> oh right, i saw a prop for dnode size...
<CompanionCube> yep, there's a prop for it
<clever> yeah, `man zfsprops`
<CompanionCube> part of the large_dnode feature
<clever> yep, the man page says so
<clever> i assume thats so you can have a larger bonus value on the dnode?
<CompanionCube> i think that was the motivation yes, for things like selinux
<clever> but that then also means, you need to know how large a dnode is on this dataset, and its matching object set
<clever> to translate an object-id into an offset within the L0 blocks, and then with the shifts, an index thru the indirection tree
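A hedged sketch of the translation clever is describing (names hypothetical, not actual ZFS code): the object set's contents are a flat array of dnodes, so an object id times the per-dataset dnode size gives a byte offset, which splits into an L0 block number (to walk the indirection tree with) and an offset inside that block:

    #include <stdint.h>

    struct objset_geom {
        uint32_t dnode_size;   /* 512 for legacy, larger with large_dnode */
        uint32_t data_blksz;   /* data block size of the object-set dnode, e.g. 16K */
    };

    static void objid_to_location(const struct objset_geom *g, uint64_t objid,
                                  uint64_t *l0_blk, uint64_t *off_in_blk)
    {
        uint64_t byte_off = objid * g->dnode_size;
        *l0_blk     = byte_off / g->data_blksz;  /* which L0 block holds the dnode */
        *off_in_blk = byte_off % g->data_blksz;  /* where inside that block */
    }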
<bslsk05> ​github.com: lk/io.c at master · littlekernel/lk · GitHub
<clever> CompanionCube: LK's ext2 driver has a similar function, that takes a blocknr, and then computes the index for every indirection level
<clever> slightly complicated, by ext2 storing a few L0, a few L1, and a few L2 in the inode
<clever> so reading block0 can skip the indirection tree always
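A sketch of the computation clever attributes to LK's ext2 driver, for the standard ext2 layout (12 direct block pointers in the inode, then one single-, one double- and one triple-indirect pointer); this is not LK's actual code, just the idea:

    #include <stdint.h>

    #define EXT2_NDIR 12   /* direct block pointers in the inode */

    /* Returns the indirection depth (0 = direct) and fills idx[] with one
     * index per level, innermost last. ptrs = block_size / 4. */
    static int ext2_block_to_path(uint64_t blocknr, uint64_t ptrs, uint64_t idx[3])
    {
        if (blocknr < EXT2_NDIR) {
            idx[0] = blocknr;                  /* index into i_block[0..11] */
            return 0;
        }
        blocknr -= EXT2_NDIR;
        if (blocknr < ptrs) {                  /* single indirect */
            idx[0] = blocknr;
            return 1;
        }
        blocknr -= ptrs;
        if (blocknr < ptrs * ptrs) {           /* double indirect */
            idx[0] = blocknr / ptrs;
            idx[1] = blocknr % ptrs;
            return 2;
        }
        blocknr -= ptrs * ptrs;                /* triple indirect */
        idx[0] = blocknr / (ptrs * ptrs);
        idx[1] = (blocknr / ptrs) % ptrs;
        idx[2] = blocknr % ptrs;
        return 3;
    }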
orccoin has joined #osdev
<clever> but then ext4 went extent based, so you cant predict your path thru the tree at all
<CompanionCube> mhm
<clever> ext2 storing the first few L0's in the inode, is one area where i can see it being slightly better than zfs, in terms of reading the start of a file faster
<clever> at the cost of increased complexity
<clever> with zfs, the number of IO ops needed to read the 1st block of a file, depends on how big the file is
<clever> just as a random test, i created a 1PB file (and shoved a new pool into it for dummy data), then ran zdb on that 1PB file, and i can see it has a single L4 pointer in the dnode
<clever> with a pair of L3's inside that
GeDaMo has joined #osdev
<clever> and a large number of L1's only have a single L0, so this sparse file is very overhead heavy
<clever> CompanionCube: ah, i see the critical line in what you linked, so the MOS will refer to another object set, with a blkptr to a single naked dnode, same as the root blkptr did
<clever> as it said, the object set itself, lacks an object#
<clever> so within the MOS, will be some kind of tree, describing every dataset in the pool
<clever> and block-ptr's for the object set of each
<clever> and yep, i see that objectid 1, is an object directory, which has a zap on it, and some L0's
<clever> CompanionCube: but then how does zfs know where object 1 in the MOS is located, that root dnode must somehow have its own object set?
<CompanionCube> clever: uberblock points to it iirc
<bslsk05> ​gist.github.com: gist:8c24fddc92b41c26542ce2521b05cfb3 · GitHub
<clever> the 1st file, is a snip from zdb, where its showing the MOS and object 0
<clever> the 2nd file is my own code, parsing the uberblocks
<clever> you can see the DVA's match up, so either the MOS is object 0, or the MOS is an object set, and object 0 is just the first element within that
<clever> i have loaded the node from 0:1600:200, but i have yet to parse its block ptr's
<CompanionCube> well, it is called the meta *object set* :p
<clever> yeah, so i'm thinking that the root dnode, is pointing directly to an object set
<clever> and like the other dataset objset's, it lacks an object#
<clever> and then there are some hard-coded object#'s within that set, that act as your starting point
<CompanionCube> you can zdb the MOS just like a regular dataset
<clever> zdb -ddddddddbbbbbbbb lk > big-old-dump
<clever> i ran this earlier, it recursively walks every single bit of metadata in the pool, and generates a text file of it all
<clever> CompanionCube: but i'm less clear on how to dump just the mos, and not recurse into every other object within it
<clever> oh, and the mos itself, also has its own indirection tree...
<clever> i dont see that within this dump
<clever> Dataset mos [META], ID 0, cr_txg 4, 106K, 58 objects, rootbp DVA[0]=<0:1600:200>
<clever> looking closer, 0:1600:200 is the location of the lz4's dnode on disk, 58 objects is what is contained within the L0's behind that dnode, and those L0's total to 106kb?
<clever> but its not showing that dnode and its own block ptr's
<CompanionCube> clever: i think zdb takes objset ids and the mos is id 0 so..;
<bslsk05> ​gist.github.com: gist:8c24fddc92b41c26542ce2521b05cfb3 · GitHub
<clever> CompanionCube: but this says object 0 is 208kb, and line 1 says the mos is 106kb?
<clever> but i'm fuzzy on how big a dnode is on this dataset, so i cant say how many objects should fit in 106kb and 208kb
<clever> let me parse the root dnode myself, and see what its DVA is....
<CompanionCube> hm, is 106kb logical or physical?
<clever> it doesnt specify
<clever> size=1000L/200P
<clever> oh wait, there
<clever> thats claiming 4kb, which just leads to more confusion
<clever> i think that 1000L size, is the root dnode itself
<clever> and the 106kb, is then within that dnode, which points to more blocks
<clever> CompanionCube: ah, i think i mis-understood dn_nblkptr, there only seems to be one block ptr in this dnode, but that has a value of 3
<clever> or perhaps, it can be 3, but 2 of the slots are allowed to be empty?
<bslsk05> ​gist.github.com: gist:8c24fddc92b41c26542ce2521b05cfb3 · GitHub
<clever> yeah, its looking like the root dnode, IS object 0
<clever> the ptr within that root dnode, is the L1 ptr from object 0
<clever> and with a max block# of 12, this dnode only has blocks 0-12, or 13 total, with a data blocksize of 16kb, the total size is 13*16, or 208kb, the lsize
<clever> oh, its right there in zdb, dnsize=512
<clever> so with 208kb worth of data, and 512 byte dnode's, then the object# can range from 0 to 415
<clever> which fits with 385 being the highest # i see
<clever> and if it was 1 dblock shorter, it would only hold 0-383
<clever> so now this all makes sense
<clever> i just dont see how zdb got a dnode size of 512
<CompanionCube> clever: ah, 512 bytes is the 'original' size for dnodes
<clever> the man page wasnt clear on what legacy meant
<clever> org.zfsonlinux:large_dnode = 0 shows up later in the zdb dump
<clever> object 51 is a zap containing that prop, and object1 has a features_for_read = 51
<CompanionCube> yeah the MOS contains its own copy of the pool config
<clever> object 63 is another zap with org.zfsonlinux:large_dnode = 4, and feature_enabled_txg = 63
<clever> yeah, i did also find a complete vdev_tree a few layers past the MOS
<clever> which clears up some of the mystery and guesswork in finding all vdev's
<CompanionCube> this is used when importing, that's what the 'untrusted' vs 'trusted' is in zdb
<clever> the nvlist at the start of a vdev, doesnt fully describe the layout of the pool
<clever> so if i have a mirror(a1,a2)+mirror(b1,b2), the nvlist on a1/a2, only describe mirror(a1,a2), and the 2nd mirror is entirely absent
<CompanionCube> (iirc this works differently than originally, let me find the article...)
<clever> there are only 2 hints that you need to keep searching
<clever> 1: vdev_children tells you how many top-level vdev's to expect
<clever> 2: top_guid is either a missing object (in the mos) that joins all top-level's, or a single top-level
<bslsk05> ​www.delphix.com: Turbocharging ZFS Data Recovery | Delphix
<clever> i also couldnt find a single reference, on how an nvlist is actually encoded on-disk
<CompanionCube> maybe it's somewhere in illumos-gate?
<bslsk05> ​utcc.utoronto.ca: Chris's Wiki :: blog/solaris/ZFSUberblockGUIDSumNotes
<clever> after staring at the hexdump and zdb enough, i did figure out some basics
<clever> and was able to get it to parse
<bslsk05> ​gist.github.com: gist:8c24fddc92b41c26542ce2521b05cfb3 · GitHub
<clever> this is my debug output, from parsing a single vdev pool
<clever> the nvlists, are kinda printed twice, first as a pointer+length, then again as key/value pairs
<clever> but unsupported value types, are just silently skipped
<CompanionCube> https://github.com/mharsch/libnvpair this turned up while googling
<bslsk05> ​mharsch/libnvpair - portable, userland-only subset of nvpair from illumos (3 forks/5 stargazers)
<clever> CompanionCube: i found several nvpair libraries on github, they often had the wrong magic# in the header
<clever> and one claimed to be a portable version of the freebsd nvpair, but it was just a bloody shell script that calls subversion, hosted in github, lol
<clever> thats not how version control works
<CompanionCube> yeah but that one has the illumos headers in it
<clever> i'll have to check that out next
<bslsk05> ​github.com: libnvpair/nvpair.h at master · mharsch/libnvpair · GitHub
<clever> taking a break for a bit
<clever> then i'll review all of those links
zaquest has quit [Remote host closed the connection]
zaquest has joined #osdev
frkzoid has quit [Ping timeout: 244 seconds]
\Test_User has quit [Quit: .]
bauen1 has quit [Ping timeout: 244 seconds]
CryptoDavid has joined #osdev
dude12312414 has joined #osdev
dude12312414 has quit [Client Quit]
biblio has joined #osdev
FreeFull has joined #osdev
gog has joined #osdev
biblio has quit [Quit: Leaving]
saltd has joined #osdev
CryptoDavid has quit [Quit: Connection closed for inactivity]
MiningMarsh has quit [Read error: Connection reset by peer]
MiningMarsh has joined #osdev
saltd has quit [Remote host closed the connection]
orccoin has quit [Ping timeout: 250 seconds]
saltd has joined #osdev
saltd has quit [Remote host closed the connection]
nyah has joined #osdev
saltd has joined #osdev
<dzwdz> how hard is it to implement ext2 from scratch?
<dzwdz> so far i've been considering porting libext2fs from e2fsprogs and using it to handle the on disk format but it doesn't seem to be very portable
<zid> Not especially I don't believe, depends if you wanna debug a host of corruption or not while you do it :p
<pitust> yeah its not very hard
<kazinsal> not terribly difficult -- you don't need to know a bunch of b-tree stuff or similar
<pitust> esp. for ro
<pitust> and ext2 has no journaling or anything like that
<pitust> ext4 is the real fun
<zid> if ext4 is so good, why is windows up to version 98
<pitust> i think i never did ext4 lol
<dzwdz> pitust: i'm mostly concerned about rw
<dzwdz> writing seems way more complicated than reading
<kazinsal> I however should not be an authoritative source on filesystem implementations as my most recent one is a fourth edition UFS
[itchyjunk] has joined #osdev
orccoin has joined #osdev
vdamewood has joined #osdev
Raito_Bezarius has quit [Ping timeout: 260 seconds]
freakazoid332 has joined #osdev
Raito_Bezarius has joined #osdev
freakazoid332 has quit [Ping timeout: 250 seconds]
freakazoid332 has joined #osdev
netbsduser has joined #osdev
xenos1984 has quit [Ping timeout: 248 seconds]
xenos1984 has joined #osdev
xenos1984 has quit [Ping timeout: 250 seconds]
xenos1984 has joined #osdev
heat has joined #osdev
<heat> dzwdz, getting rw is always way harder
<heat> BUT ext2 isn't particularly hard in that respect
<geist> the hard parts of ext4 are the btree bits they introduced for large directories
<geist> and then second to that is the journal, though i looked at it and it's fairly straightforward
<geist> (for write at least)
<geist> for read you dont really need to understand the btree much more than being able to follow its structure
<heat> I think you can ignore the journal
<heat> freebsd doesn't touch it afaik
<geist> sort of. if you try to mount it and it's partially replayed that's really bad idea to continue with writing. you'll trash the FS for sure
<heat> for read you don't really need to know what the htree really is
<geist> for reading it'd be easy enough to not replay the journal but just build up a list of blocks that haven't been replaced yet
<heat> you can follow the directory entries traditionally and it Just Works
* geist nods
<geist> yah hadn't looked *too* closely at it
<heat> it's backwards compatible
<geist> noice
<heat> you'll just lose performance when opening
<heat> i'm still "struggling" with ext4 extents writing
<heat> but I think I have things mostly figured out
<heat> it's literally a b+ tree of extents
<heat> so the same rules seem to apply
<geist> cool, yeah it should be simpler in the sense that extents are fairly natural way to think about things anyway, from an allocation point of view, but it's more difficult in the sense that it's less prescriptive precisely what to do in a given situation
<geist> especially the whole 'search the extent list for offset' stuff you gotta do all the time
<geist> oh? i thought it was just a linear list of extents?
<heat> no
<geist> oh. hrm. okay that makes things more complicated
<geist> problem with btrees is not that they're complicated to get right (they kinda are) but when you want to work against an existing implementation you have to do precisely what *they* do
<geist> so it's not as easy as busting out a btree implementation and then just working against existing things
<heat> it's a tree, internal nodes have "this covers from logical block N to N + L", you follow down that road, until you get the leaf nodes, which are the extents themselves
freakazoid332 has quit [Ping timeout: 264 seconds]
<geist> or at least, doing that is even harder, because now you have to build some sort of generic thing
<heat> yeah
<geist> ah but is the tree strictly a btree in that it self balances, or is it more of a direct/indirect/double indirect wthing?
<geist> i thought it was the second, as in the number of levels is determined directly by the size
<geist> ie, some sort of modified radix tree
<heat> no, not radix for sure
<heat> that would be for the ext2/3 block map
<geist> most extent based things i've seen are more of that, or like NTFS a linear list of blocks (spillover file records) with a header at the start that says what logical range this block is
<geist> sure, but my point is you can kinda extend the 2/3 block map into an extent tree
<geist> just replace <pointer to block> with <an extent>
<heat> that's not how they did things
<geist> okay
<heat> that whole u32 i_data[15]; field got union'd with an inline run of 5 extent structs
<heat> (I think it's 5? can't remember)
<geist> makes sense
<heat> then those are either leaf extents, or internal nodes and you need to go down the chain
<geist> ah and the extent itself says 'this is a pointer to more extents'?
<geist> does an extent structure in ext4 describe the logical range it covers?
<geist> or is that implied in the container its in (ie, tree block)?
<heat> not the extent itself, but a "header" structure
<heat> you can't have leaf extents in the middle of internal tree nodes
<geist> yah makes sense. means you still have to linearly scan through the block to find the extent you want, but thats not a big deal on modern machines
<heat> actually no
<heat> they're always sorted
<heat> so you use binary search for everything
<geist> sure, but once you get to a block of extents doesn't it say 'these extents cover logical range X through Y'?
<geist> and then say there's 20 extents, within that block
<heat> yes
<geist> you have to walk through 20 of them to find the exact one that covers your logical block?
<heat> sure. but since it's always sorted, you don't need to walk through the 20
<geist> how so?
<heat> binary search
<geist> i dont get the 'always sorted' part
<geist> how would you binary search those?
<geist> since they're variable sized, you can't just jump into the middle and know their offset into the overall block
<geist> you have to count them up from the start of the block
<bslsk05> ​github.com: Onyx/extents.cpp at pfalcato/ext4-fs · heatd/Onyx · GitHub
<geist> the Nth entry may be 10000 or 15 logical blocks in, based on what the 0...N-1 sizes add up to
<geist> that code helps me even less
<heat> fuck
<geist> anyway, whatever alas i gotta go
<geist> need to finish up this FAT driver and then maybe i'll tackle this for reals too
<geist> FS work is fun
jimbzy has quit [Quit: ZNC 1.7.5+deb4 - https://znc.in]
<heat> you basically do a "best-attempt" binary search, then at the end of the search you check if you found the correct thing (because this logic may end up giving you an extent that doesnt map the logical block you want)
<geist> ah. sounds easier to just linearly walk until you stop. cpus are even really good at that
<geist> anyway, will talk to ya later, really gotta go now
<heat> for a whole page?
<heat> aight
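A hedged sketch of the "best-attempt" binary search heat describes over a leaf block of sorted extents (struct fields are simplified, not the exact ext4 on-disk layout): find the last extent whose first logical block is <= the target, then verify the target actually falls inside its range, since a hole may sit between extents:

    #include <stdint.h>
    #include <stddef.h>

    struct extent {
        uint32_t first_lblk;   /* first logical block covered */
        uint16_t len;          /* number of blocks covered */
        uint64_t phys_blk;     /* first physical block */
    };

    /* Returns the extent mapping lblk, or NULL if it falls in a hole. */
    static const struct extent *extent_lookup(const struct extent *ext,
                                              size_t nent, uint32_t lblk)
    {
        size_t lo = 0, hi = nent;
        const struct extent *best = NULL;

        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;
            if (ext[mid].first_lblk <= lblk) {
                best = &ext[mid];     /* candidate; keep looking right */
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }

        if (best && lblk < best->first_lblk + best->len)
            return best;              /* lblk really is mapped by this extent */
        return NULL;                  /* hole: no extent covers lblk */
    }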
frkzoid has joined #osdev
jimbzy has joined #osdev
\Test_User has joined #osdev
\Test_User has quit [Quit: .]
\Test_User has joined #osdev
FatAlbert has joined #osdev
<FatAlbert> it's always a pleasure to rub shoulders with the pro's
<FatAlbert> im NOT into computers anymore ... i did some thinking ... and it's NOT me
<FatAlbert> i was lying to myself
<FatAlbert> now i slowly back to what i always liked to do
<FatAlbert> i feel good
<FatAlbert> i even ordered a new Thursday's and a new weightlifting shoes
<heat> why are you in #osdev then
<heat> sounds pretty computer to me
<Ermine> maybe they develop some musle which name can be abbreviated to OS
<Ermine> s/musle/muscle/
frkzoid has quit [Ping timeout: 268 seconds]
<FatAlbert> heat: even though i'm not fooling myself as a "computer guy" anymore
<FatAlbert> i still MIGHT be doing some computer stuff after all
<FatAlbert> plus, there are some interesting conversations going on here from time to time
<FatAlbert> and i learned alot of concepts, and stuff from it
<zid> known troll fyi
<FatAlbert> zid: how you doing ... still have hate in your heart i see
<FatAlbert> zid: im not always on-topic yeah ... but why you always de-legitimize me ?
<FatAlbert> nvm ..
frkzoid has joined #osdev
frkzoid has quit [Ping timeout: 250 seconds]
GeDaMo has quit [Quit: Physics -> Chemistry -> Biology -> Intelligence -> ???]
c2a1 has joined #osdev
<gog> hi
<c2a1> In a process's memory are libraries confined to a single area of physical memory and linked to multiple areas of virtual memory?
<gog> yeah basically
<gog> but it's more like they occupy a fixed number of physical pages and can be mapped into an arbitrary number of address spaces
<gog> the physical pages don't need to be contiguous really
<c2a1> What do you mean a fixed number of physical pages
<gog> size_of_library / page_size = number_of_pages
<c2a1> So the .code segment is located in a single physical address
<gog> roughly
<clever> lets say the .text for libc.so is 5 pages long, by random chance, those pages land into physical pages 6, 2, 10, 7, and 9
basil has quit [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]
<c2a1> But the data and bss sections have their own different physical memory locations
<gog> they don't know about physical memory, really
<clever> each time something needs to map libc's .text, it then punches 6,2,10,7,9 into the page tables for that process
<clever> and boom, its now showing up as 5 contiguous pages in the virtual space
basil has joined #osdev
<clever> and in general, every process can share those 5 pages
<clever> as long as they are mapped read-only
<gog> yes
<clever> but when low on ram, you may throw physical page 9 out, and blank that entry in every paging table
<clever> since you can always fetch it from disk again
<c2a1> Hmm ok
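A hedged, kernel-side sketch of what clever describes: the same scattered physical frames holding libc's .text are entered read-only at consecutive virtual pages in each process's page table; map_page() here is a hypothetical per-architecture helper, not any particular kernel's API:

    #include <stdint.h>
    #include <stddef.h>

    #define PAGE_SIZE 4096u

    /* hypothetical: install one virt->phys translation with the given flags */
    extern void map_page(void *page_table, uintptr_t virt, uintptr_t phys,
                         unsigned flags);
    #define PTE_PRESENT 0x1u
    #define PTE_USER    0x4u   /* no write bit: read-only, so the pages can be shared */

    static void map_shared_text(void *page_table, uintptr_t virt_base,
                                const uintptr_t *phys_pages, size_t npages)
    {
        for (size_t i = 0; i < npages; i++)
            map_page(page_table, virt_base + i * PAGE_SIZE, phys_pages[i],
                     PTE_PRESENT | PTE_USER);
    }

    /* e.g. the scattered frames 6, 2, 10, 7, 9 from the example above:
     *   uintptr_t frames[] = { 6*PAGE_SIZE, 2*PAGE_SIZE, 10*PAGE_SIZE,
     *                          7*PAGE_SIZE, 9*PAGE_SIZE };
     *   map_shared_text(pt, 0x7f0000000000, frames, 5);
     */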
<c2a1> What process or program usually manages the different memory segments of a given program
<c2a1> The linker?
<clever> co-operation between the runtime dynamic linker and the kernel
<c2a1> Of course. The kernel being mmap and other syscalls presumably?
<clever> the runtime linker may just call mmap(), and ask for libc's .text to be mapped anywhere
<clever> the kernel then picks a free virtual addr, and fills in the paging tables
<clever> the runtime linker then has to setup all of the relocation patching (sometimes via a .plt)
<c2a1> What is the difference between mmap and malloc
<clever> malloc just picks a free area from the heap, which is already part of your mapped virtual memory
<clever> mmap maps a new file into the virtual space
<c2a1> Hmm.
<FatAlbert> :)
<c2a1> I need to read more about this. Thank you all for your help.
<FatAlbert> heat: this is why im here
<FatAlbert> clever why do processes need to "share" the pages, instead of each one punching its own copy into physical memory like the first process?
<FatAlbert> ?
<clever> FatAlbert: the sharing is just done to reduce the overall memory usage
<\Test_User> they don't need to, but if they're all the same and read-only there's no need to make a copy and use the extra ram
<clever> yep
<heat> c2a1, so typical VM systems have a concept called copy-on-write (you could've read this in your local mmap(2) manpage as MAP_PRIVATE)
<heat> your programs, shared libraries are all mapped copy-on-write to a file
<heat> you have a master copy, and then if you write to them, it copies it to a new page and maps it (and now it's writable)
<heat> this means your .text, .data all share physical pages until you don't
<heat> this is optimized so you share the most possible and do the least amount of work until you need to actually do it
<clever> ive also noticed that mmap kinda behaves like mini swap files, where the kernel can push data into the mapped file and release ram at any time
<heat> yes, just like the page cache
<heat> for MAP_SHARED that is
<heat> in no way shape or form can that happen with MAP_PRIVATE
<clever> what if its MAP_PRIVATE and you havent modified the page, can it just discard and re-read later?
<heat> MAP_SHARED mappings just map the underlying file's page cache, so they share pages
<heat> clever, yup
<clever> so private acts like shared, until you write
<clever> and then cow's
<heat> well, I don't know if that's the case for linux, but it probably is
<heat> yes, shared = private until you write
<heat> the elf loader and ld.so just map everything MAP_PRIVATE
<heat> it entirely relies on MAP_PRIVATE using COW
<clever> mmap(0x7fc99edab000, 16384, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3000) = 0x7fc99edab000
<heat> in theory you can implement it by just allocating anonymous memory and read()ing, like Sortix does
<clever> ah yep, this is a random part of libc being mapped, when i start ls up
<heat> but it's bad and slow
<sortie> you're bad and slow
<sortie> :(
<clever> it also doesnt entirely match what i said earlier, ld.so used MAP_FIXED, so the runtime linker picked where it lands
<heat> :(
<heat> clever, oh yeah that's a fun problem
<sortie> Haha but yeah seriously I really did have a faster PROT_READ on my mind eaarlier so I can mmap binaries
<clever> heat: many years ago, i was digging into why wine was using a lot of ram for WoW, and then i noticed, it wasnt mmap'ing the main WoW.exe binary
<clever> heat: as best as i can tell, the pe32 headers caused the executable pages, to not be page-aligned within the file
<c2a1> What manages the heap, the C library?
<heat> you have to calculate the full size of the elf object, mmap(NULL, full_size), so you reserve a contiguous $full_size of space, then you MAP_FIXED chunks into it
<clever> so the page-cache then isnt aligned right for direct mapping
<c2a1> How is the heap different from a mmap'd file?
<sortie> c2a1, yeah libc
<clever> heat: wine then gives up, and just read()'s it into anon memory
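A hedged sketch of the reserve-then-place pattern heat describes a few lines up: one anonymous PROT_NONE mapping reserves the object's whole span, then each PT_LOAD segment is MAP_FIXED'd from the file on top of it. Error handling, honouring p_flags, and sub-page alignment fixups are all omitted:

    #include <elf.h>
    #include <sys/mman.h>
    #include <stddef.h>
    #include <stdint.h>

    static void *map_elf(int fd, const Elf64_Phdr *ph, size_t phnum)
    {
        uint64_t lo = UINT64_MAX, hi = 0;

        /* total span covered by all PT_LOAD segments */
        for (size_t i = 0; i < phnum; i++) {
            if (ph[i].p_type != PT_LOAD)
                continue;
            if (ph[i].p_vaddr < lo)
                lo = ph[i].p_vaddr;
            if (ph[i].p_vaddr + ph[i].p_memsz > hi)
                hi = ph[i].p_vaddr + ph[i].p_memsz;
        }

        /* one contiguous reservation, so the segments keep their relative layout */
        char *base = mmap(NULL, hi - lo, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        for (size_t i = 0; i < phnum; i++) {
            if (ph[i].p_type != PT_LOAD)
                continue;
            /* assumes p_vaddr and p_offset are already page-aligned */
            mmap(base + (ph[i].p_vaddr - lo), ph[i].p_filesz,
                 PROT_READ | PROT_WRITE | PROT_EXEC,   /* sketch: real loaders honour p_flags */
                 MAP_PRIVATE | MAP_FIXED, fd, ph[i].p_offset);
        }
        return base;
    }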
<heat> c2a1, it's anonymous memory
<gog> the heap might be a mmap'd anonymous
<c2a1> Ok. I remembered seeing the heap in some kernel tutorial(probably JamesM) so I had to ask.
<gog> older systems might use sbrk()
<c2a1> Thanks.
<heat> in unix tradition sbrk()
<sortie> c2a1: libc's malloc uses mmap to allocate (big) memory region and then inside those it's able to make arbitrary smaller allocations
<c2a1> Hmm.
<clever> last i looked, linux uses a mix of sbrk and mmap
<sortie> c2a1: You could use mmap for small allocations too but it's super inefficient
<clever> sortie: yep, that
<heat> clever, depends on the malloc
<clever> yeah
<c2a1> Is the heap slower than mmaping your own memory or usually faster due to optimizations?
<heat> some really like sbrk, others can't use it (like scudo IIRC)
<heat> c2a1, faster
<clever> c2a1: using the heap is faster, because you can often use it without any syscalls
<sortie> c2a1, it really varies on your workflow
<c2a1> Or is this a stupid question
<sortie> *workload
<c2a1> Ahh ok.
<c2a1> Thanks.
<clever> c2a1: but the heap can only shrink from one end, and never punch holes, so fragmentation can lead to extra memory usage
<heat> free() doesn't return memory back to the OS for instance
<gog> some applications might reimplement malloc or use their own allocation scheme
<sortie> c2a1, the heap can avoid system calls to allocating more memory and can recycle previously freed memory
<heat> that is super slow
<gog> but generally malloc() is fine for most purposes
<sortie> c2a1, but if you have a really big allocation, it might be faster to mmap it yourself rather than malloc, but you should usually always use malloc unless there's any reason to use mmap in particular
<heat> gog, depends on the malloc()
<gog> yes
<gog> that's why i said generally
<gog> :)
<heat> musl's malloc is fine for most allocations except when threads are involved
<heat> glibc's malloc is fine except when heavily multithreaded applications are involved
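A minimal bump-allocator sketch of the pattern sortie describes above: grab one big region with mmap() up front, then carve small allocations out of it with no further system calls. No free(), no thread safety, no growth; purely illustrative, not how a real malloc is built:

    #include <sys/mman.h>
    #include <stddef.h>
    #include <stdint.h>

    #define ARENA_SIZE (1u << 20)   /* 1 MiB arena, chosen just for the example */

    static char  *arena;
    static size_t arena_used;

    static void *tiny_malloc(size_t n)
    {
        if (!arena) {
            arena = mmap(NULL, ARENA_SIZE, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (arena == MAP_FAILED)
                return NULL;
        }
        n = (n + 15) & ~(size_t)15;          /* keep 16-byte alignment */
        if (arena_used + n > ARENA_SIZE)
            return NULL;                     /* a real malloc would grab another region */
        void *p = arena + arena_used;
        arena_used += n;
        return p;
    }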
<FatAlbert> it is the job of the dynamic linker to keep tabs on which shared libraries are already assigned to a process (in order to prevent redundant instances)?
<clever> ive also seen LD_PRELOAD based malloc's for replacing your active malloc
<clever> FatAlbert: yeah, you generally only want one instance of a given library within a given process
<heat> I've seen sizeable performance differences (+multiple-tens of percentage) by replacing glibc malloc with a tcmalloc or so
<clever> any more could lead to nasty problems
<heat> also tcmalloc for some reason uses fucking spinlocks
<heat> which are like spinlocks but not really
<heat> but they still pessimize the shit out of workloads
<heat> (sometimes)
orccoin has quit [Ping timeout: 250 seconds]
<raggi> greedy optimizations always bench well in microbenchmarks :(
<c2a1> Could one theoretically make a multi threaded application without using any assembly at all
biblio has joined #osdev
<gog> on an existing system or ?
<c2a1> Existing system
<zid> Depends what you count as 'using any assembly'
<gog> yeah multithreaded applications are made all the time without writing assembly code
<c2a1> Without libraries
<gog> ehhh
<c2a1> Or any header files for example.
<zid> In C you need to rely on the compiler having builtin atomics and barriers etc
<zid> but for a brand new platform you will need to write that assembly *for* gcc to use
<gog> yeah multithreading depends a lot on library support
<zid> but if someone else already wrote it, you don't ever need to
<c2a1> I wonder what system calls you need to implement threading itself
<gog> clone()
<gog> for linux
<c2a1> Hmm I guess everything is in sched.h
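A hedged sketch of the raw clone() route gog points at (Linux, using the glibc wrapper from <sched.h>); a real thread library would also set up TLS (CLONE_SETTLS), child-tid handling, and stack reclamation, all skipped here:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define STACK_SIZE (256 * 1024)

    static int thread_fn(void *arg)
    {
        (void)arg;
        /* write() rather than printf(): no TLS has been set up for this thread */
        write(1, "hello from the clone()d thread\n", 31);
        return 0;
    }

    int main(void)
    {
        /* the new thread needs its own stack; clone() takes a pointer to its top */
        char *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
        if (stack == MAP_FAILED)
            return 1;

        clone(thread_fn, stack + STACK_SIZE,
              CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_THREAD,
              NULL);

        sleep(1);   /* crude: let the thread run before main() exits */
        return 0;
    }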
<FatAlbert> gog: i thought you just #osdev cat :)
nisa has joined #osdev
stux has joined #osdev
<FatAlbert> you know shit ..
<FatAlbert> good for you
<FatAlbert> i remember the days when i actually slept at night ..
<FatAlbert> i miss these days
<c2a1> You aren't drinking enough caffeine
<FatAlbert> i frankenstein deep inside
<FatAlbert> it's probably what i've been through in life .. is coming back at me at the age of 30
<FatAlbert> im not far from the actor from "the mehanic" (no, not with jason statham)
<FatAlbert> i have more muscle definition than him, and i sleep at least once in three days
<FatAlbert> but other than that .. yeah .. it sucks
<FatAlbert> but you know what ?
<FatAlbert> I know i'll survive what is to come
<FatAlbert> i have the best surviving tool that i got from my ancestors
<FatAlbert> im sorry not "The Mechanic"
<FatAlbert> it's called "The Machinist" or something like that
<FatAlbert> c2a1: do you know how i know im going to survive WW3 ?
<FatAlbert> even though it will happen in my country ?
zid has quit [Read error: Connection reset by peer]
zid has joined #osdev
bauen1 has joined #osdev
bauen1 has quit [Ping timeout: 265 seconds]
<heat> raggi, have you tried out scudo in big, server-level super multithreaded environments?
<heat> i'm curious about its performance
<zid> heat can you nmap me
<heat> its clearly suited for desktop level tasks (at least Google trusts it is), but no idea about server stuff
<heat> zid, idk why
<zid> I wanna see what my filtered ports are
<heat> I guess
<heat> is it the ip on your irc whois?
<zid> yus
<heat> seems to be blocking everything?
<zid> yea filters should be rare
<zid> maybe only netbios
<raggi> heat: I have not, although the way you're comparing server and desktop there is not as common now that everyone containerizes everything
<mrvn> There is filtered and just not responding to it because the ISP blocks the control messages
vdamewood has quit [Quit: Life beckons]
<bslsk05> ​www.prnewswire.com: OpenText to Acquire Micro Focus International plc
Andrew has joined #osdev
<FatAlbert> im going to sleep !
<FatAlbert> im going to try
FatAlbert has quit [Quit: if you want peace prepare for war]
bauen1 has joined #osdev
<gog> bye
<heat> raggi, maybe. although I'm talking about big services which do lots of allocations and are /heavily/ multithreaded
<heat> I certainly saw a lot of services at cloudflare with hundreds if not thousands of threads at the same time, same process
<raggi> Yes, I understand, it's just more rare, and often people care more about performance than additional defense in depth there
<raggi> For example the thing probably most common for people in the container universe to look that way is storage like MySQL
<raggi> I've not seen anyone using scudo in those environments
<heat> yeah
<heat> but scudo is allegedly pretty damn fast
<heat> at least with crc32c instructions
<heat> i'm curious as to how fast it is
<heat> no one really seems to publish benchmarks on it
selene_ has joined #osdev
<bslsk05> ​rebeccaweekly.com: Best Hardware Engineering Papers – Rebecca Weekly
selene_ has quit [Client Quit]
netbsduser has quit [Remote host closed the connection]
linkdd has joined #osdev
biblio has quit [Quit: Leaving]
carbonfiber has joined #osdev