#osdev on 2021-10-24 — irc logs at libera.irclog.whitequark.org

2021-05-23 01:57 klange changed the topic of #osdev to: Operating System Development || Don't ask to ask---just ask! || For 3+ LoC, use a pastebin (for example https://gist.github.com/) || Stats + Old logs: http://osdev-logs.qzx.com New Logs: https://libera.irclog.whitequark.org/osdev || Visit https://wiki.osdev.org and https://forum.osdev.org || Books: https://wiki.osdev.org/Books

00:06 wootehfoot has quit [Read error: Connection reset by peer]

00:20 dude12312414 has joined #osdev

01:13 adachristine has joined #osdev

01:13 nyah has quit [Ping timeout: 260 seconds]

01:13 gog has quit [Killed (NickServ (GHOST command used by adachristine))]

01:18 adachristine has quit [Quit: byee]

01:18 gog has joined #osdev

01:22 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

01:25 ElectronApps has joined #osdev

01:39 sdfgsdfgs is now known as sdfgsdfg

02:23 zaquest has quit [Quit: Leaving]

02:23 pretty_dumm_guy has quit [Quit: WeeChat 3.3]

02:24 ElectronApps has quit [Remote host closed the connection]

02:33 sts-q has quit [Ping timeout: 260 seconds]

03:12 Oli has quit [Quit: leaving]

03:54 [_] has quit [Remote host closed the connection]

04:33 srjek has quit [Ping timeout: 264 seconds]

05:09 mahmutov has joined #osdev

05:21 ravan has joined #osdev

05:35 ravan has quit [Ping timeout: 244 seconds]

05:48 zaquest has joined #osdev

05:58 ElectronApps has joined #osdev

06:13 MiningMarsh has quit [Quit: ZNC 1.8.2 - https://znc.in]

06:29 MiningMarsh has joined #osdev

06:36 GeDaMo has joined #osdev

07:30 k0valski1 has quit [Ping timeout: 245 seconds]

07:30 <geist> oh hey i think i figured out what was wrong with my wyze50 terminal

07:31 <geist> it occasionally made sparking and pop sounds

07:31 <geist> looks like a diode in the power supply which was standing high off the motherboard for some reason was bent over and juuuuust touching one of the windings on a transformer right next to it

07:31 <geist> looks like there are some burnt spots there, so maybe it occasionally arced

07:56 darkstarx has quit [Remote host closed the connection]

07:57 darkstarx has joined #osdev

08:38 <klange> ugh, race in my pipes... too many layers in those

08:40 gog has quit [Quit: byee]

08:40 Vercas has quit [Remote host closed the connection]

08:41 Vercas has joined #osdev

09:12 k0valski1 has joined #osdev

09:32 sortie has joined #osdev

09:56 <klange> Well, I didn't fix the race but I did make pipes faster... I think I need to juggle a lock better here.

10:03 mahmutov has quit [Ping timeout: 244 seconds]

10:04 <sortie> I'm this close to strerror(EPIPE) = "Ceci n'est pas une pipe"

10:05 * sortie . o O (Is a broken pipe a pipe?)

10:07 <GeDaMo> "Ceci n'est pas une pipe"

10:10 <GeDaMo> Huh, I missed that you already said that :P

10:11 <sortie> This is not a "Ceci n'est pas une pipe"

10:11 mahmutov has joined #osdev

10:26 Coldberg has quit [Ping timeout: 260 seconds]

10:28 ElectronApps has quit [Quit: Leaving]

10:28 ElectronApps has joined #osdev

10:36 vai has quit [Ping timeout: 260 seconds]

10:37 Vercas has quit [Remote host closed the connection]

10:37 Vercas has joined #osdev

10:41 Oli has joined #osdev

11:01 m3a has quit [Quit: leaving]

11:24 xyh has joined #osdev

11:40 ElectronApps has quit [Quit: Leaving]

11:40 ElectronApps has joined #osdev

12:02 mahmutov has quit [Ping timeout: 244 seconds]

12:04 nyah has joined #osdev

12:08 <klange> During my last round of user testing, someone said they wanted a volume slider, so... here's a volume slider. https://klange.dev/s/Screenshot%20from%202021-10-24%2021-07-53.png

12:11 <zid> nice, does it crackle when you change volume?

12:12 <clever> zid: could that be avoided by waiting for a zero-crossing before you update the multiplier?

12:12 <dzwdz> what if there isn't one?

12:12 <zid> if there isn't one you have a DC bias

12:13 <zid> and you can die in a fire

12:13 <dzwdz> yes

12:13 <klange> I'm just frobbing the codec knobs, so that's entirely up to the chipset.

12:13 <clever> ahh

12:13 <zid> well the chipset is going to make it crackle

12:13 <dzwdz> only if it's a shitty chipset

12:13 <clever> in my case, there is no codec, i have to implement all of that myself

12:14 <zid> nah it doesn't matter if the chipset is shitty or not really

12:14 <zid> if you tell it to play +127, +127, +127, +127, .. as your samples, then cut the volume to 0

12:14 <zid> you're going to get a pop unless it's disobeying you
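
A minimal sketch of the zero-crossing idea clever raises, assuming signed samples in a plain Python list. The helper name `apply_gain_at_zero_crossing` is hypothetical, and this is not what any codec actually does. It keeps the old multiplier until the signal touches or crosses zero, so a constant (DC) buffer like zid's +127 example never switches at all.

```python
def apply_gain_at_zero_crossing(samples, old_gain, new_gain):
    """Scale samples, switching from old_gain to new_gain at the first zero crossing.

    Hypothetical helper: if the buffer never crosses zero (pure DC),
    the old gain is kept for the whole buffer.
    """
    out = []
    gain = old_gain
    prev = None
    for s in samples:
        # A zero crossing is a sample equal to zero, or a sign change
        # between two neighbouring samples.
        crossed = s == 0 or (prev is not None and (prev < 0) != (s < 0))
        if gain != new_gain and crossed:
            gain = new_gain
        out.append(s * gain)
        prev = s
    return out
```

With `[100, 50, -50, -100]` the new gain takes effect from the third sample on; with `[127, 127, 127]` it never does, which is the "what if there isn't one?" case.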

12:15 <clever> zid: i have been playing with FIR filters lately, i could use that to just reject any dc bias...

12:15 <clever> anything below say 20Hz is just gone
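
One simple FIR way to reject DC, sketched under assumptions: subtract a running average of the last `taps` samples from each sample. The function name and the tap count are made up for illustration, and the actual cutoff (20Hz or otherwise) depends on the tap count and sample rate, which this sketch does not compute.

```python
from collections import deque

def remove_dc(samples, taps=64):
    """FIR high-pass sketch: each output is the input minus a moving average."""
    window = deque(maxlen=taps)
    total = 0.0
    out = []
    for s in samples:
        if len(window) == taps:
            total -= window[0]      # this sample is about to fall out of the window
        window.append(s)            # deque drops the oldest entry when full
        total += s
        out.append(s - total / len(window))
    return out
```

A constant input comes out as all zeros, while an alternating signal riding on an offset keeps its alternating part.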

12:16 Jerjerbinks has joined #osdev

12:17 <Jerjerbinks> yoh

12:19 mahmutov has joined #osdev

12:24 gxt_ has quit [Remote host closed the connection]

12:24 xyh has quit [Quit: WeeChat 3.3]

12:24 gxt_ has joined #osdev

12:25 mahmutov has quit [Ping timeout: 244 seconds]

12:36 srjek has joined #osdev

12:41 Jerjerbinks has quit []

13:02 gog has joined #osdev

13:02 gog has quit [Client Quit]

13:02 gog has joined #osdev

13:16 Vercas has quit [Remote host closed the connection]

13:16 Vercas has joined #osdev

13:18 X-Scale` has joined #osdev

13:20 X-Scale has quit [Ping timeout: 244 seconds]

13:20 X-Scale` is now known as X-Scale

13:38 X-Scale` has joined #osdev

13:39 X-Scale has quit [Ping timeout: 265 seconds]

13:39 X-Scale` is now known as X-Scale

13:55 ElectronApps has quit [Read error: Connection reset by peer]

13:57 ElectronApps has joined #osdev

14:06 dude12312414 has joined #osdev

14:32 mahmutov has joined #osdev

14:33 kling has joined #osdev

14:41 ElementW_ is now known as ElementW

15:05 Jerjerbinks has joined #osdev

15:08 EtherNet has quit [Quit: WeeChat 3.4-dev]

15:08 EtherNet has joined #osdev

15:24 ElectronApps has quit [Remote host closed the connection]

15:31 [itchyjunk] has joined #osdev

15:32 Jerjerbinks has quit [Quit: have to go - GODSPEED!]

16:03 pretty_dumm_guy has joined #osdev

16:04 kling has quit [Quit: Lost terminal]

16:14 doppler has joined #osdev

16:16 wootehfoot has joined #osdev

17:20 wootehfoot has quit [Read error: Connection reset by peer]

18:42 vin has joined #osdev

18:44 <vin> Potentially off topic: Is there a reason for superblocks to contain same block offsets across planes in an SSD? Just easier management for FTL?

19:14 pretty_dumm_guy has quit [Ping timeout: 265 seconds]

19:15 pretty_dumm_guy has joined #osdev

19:40 vdamewood has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]

19:45 fkrauthan has quit [Quit: ZNC - https://znc.in]

19:47 fkrauthan has joined #osdev

19:59 Oli has quit [Remote host closed the connection]

20:00 GeDaMo has quit [Remote host closed the connection]

20:09 <geist> vin: hmm, can you elaborate?

20:13 Oli has joined #osdev

20:13 mctpyt has joined #osdev

20:50 Oli_ has joined #osdev

20:51 Coldberg has joined #osdev

20:51 PapaFrog has joined #osdev

20:52 LostFrog has quit [Ping timeout: 252 seconds]

21:02 <vin> geist: Or rather let me rephrase my question to be more generic. If you are interleaving pages/blocks across devices for larger throughput, where the granularity of a single operation to the client is the sum of all pages interleaved, can you do better than pages starting at the same offset (these pages form a logical block, and the block is the access granularity)?

21:07 <geist> ah so you're talking about at the SSD firmware level and devices in this case are nand banks?

21:09 <geist> i *think* on most modern high end SSDs the translation is pretty much at the page level, so probably 4K, so logical to physical is pretty much perfectly swizzled

21:10 <geist> the erase size is probably still much larger than a page though, so that's obviously the issue, but since the SSD is most likely writing things out in a journalled way it's generally just appending new writes somewhere else. also the whole SLC caching scheme that modern SSDs do really makes it more complicated too

21:10 <geist> since they're essentially temporarily writing new stuff to a SLC cache which is then flattened out

21:11 <geist> but since there are multiple devices i guess it keeps multiple outstanding journals, one per device? or i suppose it just always stripes across all devices, or at least a gang of them. unclear to me

21:12 <geist> right? you could basically treat all the nand planes as a huge raid1 style stripe, or you can treat them pretty much entirely independently, but balance all the writes across all of them and independently wear level each plane

21:12 <geist> i suppose they'd both have different performance characteristics, though presumably the latter would be much more complicated to track

21:12 <geist> that being said simple 'SSDs' like SD cards or MMC or whatnot probably do the former and just have a few planes and treat them in a striped way

21:13 Oli_ has quit [Ping timeout: 244 seconds]

21:13 <geist> not really answering your question, but it does have me thinking

21:13 Oli has quit [Ping timeout: 264 seconds]

21:13 <clever> ive also found that trim commands can still be handled async and resumed after a power loss

21:13 <clever> i issued a `blkdiscard` against an entire card, and ejected it immediately afterwards (on purpose)

21:14 <geist> i do actually wonder precisely what modern SSD algorithms look like. my only experience with real shipping SSD firmware was in the form of firmware for <redacted's> SD card about 10 years ago

21:14 <clever> and after re-inserting the card, it did show up as entirely blank

21:14 <clever> i feel like it both nuked the translation tables, so all reads just return null and dont even read

21:14 <geist> yah lots of SSDs just fiddle with the translation table and mark the page blank and not erase immediately

21:14 <clever> and also scheduled a full device erase in the background

21:15 <geist> doesn't even have to do that, there's nothing in trim that says you have to erase the device, that's what the secure erase command is for (on ATA at least)

21:15 Oli has joined #osdev

21:15 <clever> yeah, it could just keep say 2mb pre-erased

21:15 <clever> and erase on-demand as that runs out

21:15 <geist> it can just mark it for erase which still helps the 'find a new page' algorithm later on

21:15 <clever> and the discarded blocks, just give it far better choice on what to erase&use next

21:15 <geist> right

21:15 <clever> yep
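
A toy model of the trim behaviour described above, assuming a page-level map held in a dict. `TinyFtl` and its fields are hypothetical and not taken from any real firmware. Trim only drops the mapping and marks the old physical page as stale; reads of unmapped pages return zeros without touching the (simulated) flash, and erasing is left for later.

```python
class TinyFtl:
    """Toy flash translation layer: trim edits the map, erase is deferred."""

    def __init__(self, page_size=4096):
        self.page_size = page_size
        self.map = {}          # logical page -> physical page
        self.flash = {}        # physical page -> bytes (stand-in for NAND)
        self.stale = set()     # physical pages that may be erased later
        self.next_phys = 0
        self.flash_reads = 0   # counts reads that actually hit flash

    def write(self, lpn, data):
        old = self.map.get(lpn)
        if old is not None:
            self.stale.add(old)            # old copy becomes garbage
        self.flash[self.next_phys] = data  # always append somewhere new
        self.map[lpn] = self.next_phys
        self.next_phys += 1

    def trim(self, lpn):
        old = self.map.pop(lpn, None)      # just forget the mapping
        if old is not None:
            self.stale.add(old)            # better choice of erase victims later

    def read(self, lpn):
        phys = self.map.get(lpn)
        if phys is None:
            return bytes(self.page_size)   # unmapped: zeros, no flash access
        self.flash_reads += 1
        return self.flash[phys]
```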

21:16 <geist> but depending on the controller and firmware, some can take a lot longer to trim, so it's clearly a bit more complicated than that in some cases

21:16 <geist> almost all of my ssds are samsung nowadays which seem to trim *fairly* quickly

21:16 <clever> ive also read a research paper, where some sata ssd's corrupted the translation tables on simple power loss

21:16 <geist> early on i was using a lot of sandforce SSDs which were early to the game. they use actual data compression to help

21:16 <clever> and that then leads to total data loss

21:17 <geist> and those take *ages* to trim, since presumably they have to look through and deal with the compressed block you're actually trimming

21:17 <clever> ive also seen some repair guys on youtube, that dump the raw nand flash, recover the translation tables, and re-assemble the disk image

21:17 <geist> yah. the fun one is all the SLC caching stuff that modern stuff really does

21:18 <geist> the breakthrough in my brain is some blab about it on anandtech, because i was always wondering where the SLC cache came from. like is 10% of the device made differently?

21:18 <geist> answer is no, you can take MLC and TLC and QLC flash and erase it as SLC and back again

21:18 <clever> https://www.usenix.org/system/files/conference/fast13/fast13-final80.pdf

21:18 <geist> you just gang up a bunch of cells and treat them as the same thing

21:19 <geist> trick is you can erase and use a TLC/etc cell as SLC and the erase cycle is much faster so it has the performance of SLC, but of course is not space efficient

21:19 <clever> in this paper, they took 15 SSD's and 2 mechanical drives, and subjected them to a torture test

21:19 <geist> so modern SSD controllers, in the last few years, now dynamically switch cells back and forth between SLC and higher stuff

21:19 <clever> yanking the sata power in the middle of bulk write operations

21:19 <geist> so the translation stuff is even more complicated

21:20 elastic_dog has quit [Ping timeout: 264 seconds]

21:20 <clever> one drive, after only 8 power loss events, corrupted its translation tables

21:20 <clever> any read past 256gig into the disk, failed with IO errors

21:20 <geist> absolutely. do not power pull your device

21:20 <geist> i dont trust SSDs and never will in that situation

21:21 <geist> SD cards are also fairly easy to corrupt, but they're usually much simpler so they have less crap in flight

21:21 <clever> ive also heard reports on the rpi forums, that SD cards are more likely to corrupt if your voltage rail sags

21:21 <geist> it's almost certainly one of the big differences between enterprise and consumer SSDs: on board ram, supercaps to keep it going to complete the transaction, etc

21:21 <geist> absolutely

21:21 <clever> and a seemingly large number of users are having cards die, or just getting fake cards

21:22 <clever> so far, ive only murdered one card, it died in the middle of a gcc compile

21:22 <clever> but it knows its on the deathbed, so it just ignores all writes

21:22 <geist> yep. using SD cards as your root for your OS is fairly dangerous. back up and/or be ready to have to repave

21:22 <clever> reads still work perfectly and it can still boot

21:23 <geist> yep. 'good' SD card firmware goes into RO mode so you can at least get your shit off

21:23 <geist> bad firmware just goes dead

21:23 <clever> so things get really funky, when writes randomly revert (when the read cache expires), and then it just starts crashing

21:23 <clever> yep

21:23 <geist> i used to have a drawer at my work desk full of dead sd cards

21:23 <geist> fairly easy to corrupt if you're working on a SD stack. lots of times they dont gracefully handle bad or corrupt commands or lots of power cycles as you load firmware on your board

21:24 <clever> ive also been digging into volk and gnu-radio recently

21:25 <clever> https://github.com/gnuradio/volk/blob/main/kernels/volk/volk_32f_x2_dot_prod_32f.h#L934-L971

21:25 <bslsk05> github.com: volk/volk_32f_x2_dot_prod_32f.h at main · gnuradio/volk · GitHub

21:26 <clever> this is a routine for neon based dot product with floats

21:26 <clever> from that, i can see that neon appears to be based on float[4] vectors, and it has an `a = b + (c*d)` opcode, but no way to sum every element in a vector

21:26 elastic_dog has joined #osdev

21:27 <clever> it also looks like there is some data dependency issues? where `a = b + (c*d); e = a + (f*g);` would stall out, waiting for the previous opcode
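
The dependency concern can be illustrated in plain Python that only mimics the shape of a SIMD kernel; it is not the volk code. The idea is to keep several independent 4-lane accumulators so consecutive multiply-adds do not wait on each other, then do the horizontal sum once at the end. The function name, lane count, and accumulator count are assumptions for illustration.

```python
def dot_product_lanes(a, b, lanes=4, accumulators=2):
    """Dot product structured like a vector kernel with independent accumulators."""
    n = len(a)
    acc = [[0.0] * lanes for _ in range(accumulators)]
    step = lanes * accumulators
    i = 0
    while i + step <= n:
        for k in range(accumulators):
            base = i + k * lanes
            # Models one `acc[k] = acc[k] + (a * b)` vector op; each acc[k]
            # forms its own dependency chain.
            for lane in range(lanes):
                acc[k][lane] += a[base + lane] * b[base + lane]
        i += step
    # Horizontal sum, done once after the loop instead of per iteration.
    total = sum(sum(v) for v in acc)
    # Scalar tail for lengths that are not a multiple of the step.
    for j in range(i, n):
        total += a[j] * b[j]
    return total
```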

21:27 <geist> gosh i kinda wish you'd finish up that pending stack for LK

21:27 <geist> i'm about to just start pulling pieces out of it myself and finishing it off

21:27 <geist> ext4 in particular

21:27 <clever> ah, the ext4 stuff? yeah, i should just finish it off in qemu

21:28 <clever> where would i find better docs on what arm neon can do exactly? the arm site is a bit tricky to navigate when you dont know what things are named

21:28 <geist> do you mean arm32 or arm64?

21:29 <geist> they basically renamed it to ASIMD in arm64

21:29 <geist> may be why you're not finding the NEON docs

21:29 <geist> also it's simply part of the ARMv8 ARM

21:29 <clever> interested in both, but i can start at either

21:30 <clever> let me check my armv8 docs...

21:30 <geist> basically same thing, just different ISA to get to it, so the mnemonics are not the same

21:30 <geist> also ARM64 redid how vector registers are mapped to lower level ones so it's far more straightforward

21:30 <geist> since arm32 had a pretty dumb way of mapping registers

21:30 mahmutov has quit [Ping timeout: 252 seconds]

21:30 <geist> ie, '[s0, s1] = d0' '[s2, s3] = d1'

21:31 <geist> [d0, d1] = v0, etc

21:31 <geist> arm64 does what you expect and s0 is the bottom of d0 is the bottom of v0
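
The two mappings can be written down as index arithmetic. This is a sketch with hypothetical helper names; note that on arm32 the 128-bit registers are named q0-q15, and only d0-d15 have s-register views.

```python
def aarch32_aliases(q):
    """Return the D and S register numbers that overlay AArch32 quad register q."""
    d_regs = [2 * q, 2 * q + 1]
    # Only D0-D15 are also visible as pairs of S registers.
    s_regs = [s for d in d_regs if d < 16 for s in (2 * d, 2 * d + 1)]
    return d_regs, s_regs

def aarch32_s_location(s):
    """Return (quad register, byte offset) where AArch32 S register s lives."""
    return s // 4, (s % 4) * 4

def aarch64_s_location(s):
    """On AArch64, S register s is simply the low 32 bits of vector register s."""
    return s, 0
```

So `s5` sits at byte offset 4 of `q1` on arm32, but at the bottom of `v5` on arm64.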

21:32 <vin> geist | right? you could basically treat all the nand planes as a huge raid1 style stripe, or you can treat them pretty much entirely independently, but balance all the writes across all of them and independently wear level each plane. || Most modern SSDs use the planes in RAID0 and send P/E cycles to all blocks part of the array.

21:32 <vin> Of course the writes to these blocks are actually made in parallel.

21:32 <geist> yah

21:33 <geist> or maybe they have N sets of raid1

21:33 <ZetItUp> https://i.imgur.com/BByUKlx.png :|

21:33 <vin> But using greedy victim selection for GC forces good blocks to also be invalidated in the array.

21:33 <geist> oh dont get me started. i sold my bitcoins at the absolute lowest it ever got to

21:33 <geist> OTOH i'm basically okay with it because fuck bitcoins

21:34 <ZetItUp> geist how much coins did you have? :P

21:34 <geist> 2.5

21:34 <ZetItUp> oh :D

21:34 <vin> Do SSD FTLs keep their address translation tables in host RAM?

21:35 <clever> vin: ive heard that more expensive nvme drives have dedicated ram on the nvme module for that

21:35 <clever> vin: while cheaper ones can steal some host ram, and just dma into it

21:35 <geist> yep. very low end SSDs use a thing called... crap what is it

21:36 <clever> vin: ive also heard that some ssd's, will not bother saving the translation table back to flash, and a supercap will then fuel a mad dash to commit things upon power-loss

21:36 <geist> it's a nvme feature that lets you (the host) bequeath a block of host ram to the card and it puts its translation table there

21:36 <geist> but 'good' SSDs have 512MB-1GB+ onboard DRAM

21:36 <clever> xhci has a similar feature, and xhci calls it scratch space

21:36 <geist> mostly holding the translation table live

21:36 <geist> for example the WD blue low end nvme i have does the nvme host thing

21:37 <vin> Right clever , I wonder what the durability guarantees are for these tables during a crash. Let's say an update is made to the table (to remap a few blocks for fresh writes) in RAM and when committing this to the drive, there is a power failure. How does the SSD recover from a faulty table?

21:37 <geist> if the host doesn't partake you get reduced performance since the nvme controller is paging the translation table in and out of its own internal sram

21:37 <clever> vin: i would assume that if using host ram, the drive wont claim the write is complete until it has also saved the new translation tables

21:38 <clever> and the host ram is purely a read cache, to make lookups faster

21:38 <geist> sure, it can treat it basically like a write through cache, and journal the updates to the on flash structure

21:38 <geist> which is probably distributed around the part

21:38 <geist> so it can be made safe

21:39 <vin> hmm GC should update the table as well correct? Since the GC happens on drive cpus it needs to update the RAM copy of ATT (Address translation table)

21:39 <geist> yep. think of it as just a WT cache of the translation table. speeds up reads immensely
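
A sketch of the write-through idea, assuming the durable table can be modelled as a dict. The class and method names are hypothetical, and real firmware would journal updates rather than rewrite a table. Every update hits the durable copy before the cached copy, so losing the cache (host RAM going away, power loss) costs only lookup speed.

```python
class WriteThroughMapCache:
    """Translation-table cache where the backing store is always up to date."""

    def __init__(self, backing):
        self.backing = backing    # stand-in for the on-flash table
        self.cache = {}           # stand-in for the DRAM / host memory copy
        self.backing_reads = 0    # counts slow lookups

    def lookup(self, lpn):
        if lpn in self.cache:
            return self.cache[lpn]        # fast path
        self.backing_reads += 1
        phys = self.backing.get(lpn)      # slow path: go to flash
        if phys is not None:
            self.cache[lpn] = phys
        return phys

    def update(self, lpn, phys):
        self.backing[lpn] = phys          # durable copy first
        self.cache[lpn] = phys            # then the cache

    def drop_cache(self):
        self.cache.clear()                # e.g. the host took its RAM back
```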

21:39 <geist> that's why nvmes that have the host memory feature, if they dont get the ram they still work, just reduced performance

21:40 <geist> anandtech did some performance reviews a while back on some newer devices with the feature

21:40 <vin> I see

21:40 <geist> and exactly as you expect the random read performance dropped immensely

21:41 <geist> and you can actually kinda do the math

21:41 <geist> if you have say a 2^40 sized device, split into 2^9 sized blocks, then that's 2^31 blocks. if you then have a translation table with 4 byte pointers, then that's 2^33 worth of table

21:42 <geist> that's assuming you translate at 512 bytes

21:42 <geist> if you use say 4K pages then the size goes down by 2^3

21:42 <vin> Yup
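
The arithmetic above, written out so other granularities can be tried. It assumes a flat table with one fixed-size entry per translation unit; `table_bytes` is a hypothetical helper name.

```python
def table_bytes(device_bytes, unit_bytes, entry_bytes=4):
    """Size of a flat translation table with one entry per translation unit."""
    return (device_bytes // unit_bytes) * entry_bytes

GIB = 2**30
MIB = 2**20

size_512 = table_bytes(2**40, 512)       # 2^31 entries * 4 bytes = 2^33 bytes
size_4k = table_bytes(2**40, 4096)       # 2^28 entries * 4 bytes = 2^30 bytes
size_2m = table_bytes(2**40, 2 * MIB)    # 2^19 entries * 4 bytes = 2^21 bytes

print(size_512 // GIB, "GiB at 512 bytes")
print(size_4k // GIB, "GiB at 4K")
print(size_2m // MIB, "MiB at 2 MiB")
```

Going from 512-byte to 4K units shrinks the table by the 2^3 factor geist mentions, from 8 GiB to 1 GiB for a 2^40 byte device.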

21:42 dude12312414 has quit [Quit: THE RAM IS TOO DAMN HIGH]

21:42 <geist> i fiddled with the WD blue before i put it to use and reformatted it as 4K (sadly you can't do this on samsung devices) and lo and behold the amount of stolen memory went down a lot

21:43 <geist> and actually added up as i expected

21:43 <clever> ive talked to a guy in #raspberrypi that re-formatted a mechanical drive from 4k to 512 mode

21:43 <clever> turns out, the rpi boot firmware, cant deal with 4k drives

21:43 <geist> so TL;DR it's worth your time to use 4K native block sizes especially if your nvme is doing stolen memory mode

21:44 <geist> a SSD with dedicated 1GB ram or so probably benefits less because it already has the ram

21:46 <vin> I didn't understand why having 4K block size improves performance, because smaller ATT? Then wouldn't 2MB do even better?

21:47 <geist> tradeoffs but yeah

21:48 <geist> actually the SD card i got to see the firmware to actually translated at 2MB iirc

21:48 <geist> ie an erase zone size

21:48 <geist> but it had a whole journalling system such that it didn't have to do a 2MB RMW on every write

21:48 <geist> it would journal up N pages across M 2MB blocks, then GC those

21:48 <vin> Yup, erases are always block sized, individual pages can be programmed specifically

21:48 <geist> so the full translation was 2MB translation + all of the outstanding journals

21:48 <clever> ive heard of filesystems in linux for raw nand flash, where it just treats the entire device like one big journaled ringbuffer

21:49 <clever> every time you write, you just write something like inode#, block#, and data, appending to a ringbuffer

21:49 <clever> and you gc the other end, to consolidate free space, and copy still in-use blocks to the new write pointer
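
A toy version of the ring-buffer scheme clever describes, assuming one slot stands in for one erasable unit (real flash erases whole blocks of many pages). `RingLog` is a made-up name and this is not modelled on any particular Linux filesystem. Writes append `(inode, block, data)` at the head; garbage collection frees the oldest slot and re-appends its record only if it is still the newest copy.

```python
class RingLog:
    """Log-structured store: append at head, garbage-collect from tail."""

    def __init__(self, slots):
        self.slots = [None] * slots   # each slot holds (inode, block, data)
        self.head = 0                 # next write position
        self.tail = 0                 # oldest record
        self.used = 0
        self.index = {}               # (inode, block) -> slot of newest copy

    def _append(self, inode, block, data):
        if self.used == len(self.slots):
            raise RuntimeError("log full of live data")
        self.slots[self.head] = (inode, block, data)
        self.index[(inode, block)] = self.head
        self.head = (self.head + 1) % len(self.slots)
        self.used += 1

    def write(self, inode, block, data):
        if self.used == len(self.slots):
            self.gc()                 # try to free the oldest slot first
        self._append(inode, block, data)

    def read(self, inode, block):
        slot = self.index.get((inode, block))
        return None if slot is None else self.slots[slot][2]

    def gc(self):
        """Free the oldest slot; copy its record forward if it is still live."""
        slot = self.tail
        record = self.slots[slot]
        self.slots[slot] = None
        self.tail = (self.tail + 1) % len(self.slots)
        self.used -= 1
        if record is not None and self.index.get(record[:2]) == slot:
            self._append(*record)
```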

21:49 <geist> right, i suspect at some point pretty much anything that deals with nand has some sort of journal, if not just part of a transaction, since the way the physical device works is very much amenable to that

21:49 <geist> yep, the SD card thing i was talking about was basically exactly that

21:50 <vin> clever: reminds me of https://web.stanford.edu/~ouster/cgi-bin/papers/lfs.pdf

21:50 <geist> except the whole thing wasn't a journal, the journal itself floated around the device

21:50 <geist> a new block was picked as a journal, it was the current 'head' basically, and new page writes were scribbled down there. as the 2MB block filled up it picked a new one as the journal

21:50 <geist> and then journal blocks were GCed in the background

21:50 <clever> something else i was wondering about flash recently, which does more "damage", the erase cycle, or flipping a bit from the default to non-default state?

21:51 <geist> so it was a combination of an ever increasing journal + 'cold storage' blocks translated at 2MB boundaries
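
A rough sketch of that two-level lookup, with heavy assumptions: the 4K page size, the dict-based tables, and every name here are invented for illustration and are not taken from the firmware geist saw. A read first checks whether a newer copy of the page sits in a journal block, and otherwise falls back to the coarse 2MB map.

```python
BLOCK = 2 * 1024 * 1024   # coarse translation unit (the erase zone size)
PAGE = 4096               # assumed page size, for illustration only

class HybridMap:
    """Coarse 2MB block map plus a page-level map for outstanding journal writes."""

    def __init__(self):
        self.coarse = {}    # logical block number -> physical block number
        self.journal = {}   # logical page number -> (journal block, page slot)

    def record_write(self, lpn, journal_block, slot):
        # New page writes are scribbled into the current journal block.
        self.journal[lpn] = (journal_block, slot)

    def lookup(self, byte_addr):
        lpn = byte_addr // PAGE
        if lpn in self.journal:
            # The newest copy lives in a journal block.
            jblock, slot = self.journal[lpn]
            return jblock * BLOCK + slot * PAGE + byte_addr % PAGE
        pbn = self.coarse.get(byte_addr // BLOCK)
        if pbn is None:
            return None                      # never written
        return pbn * BLOCK + byte_addr % BLOCK
```

Garbage collection would fold journal entries back into whole 2MB blocks and then delete them from `journal`; that step is left out here.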

21:51 <clever> could you do less wear, by programming a given region multiple times, and even the same bytes

21:51 <geist> erase cycle

21:51 <clever> ah, so i could massively increase the lifetime, by having a sort of bitfield array, and only flip one bit at a time

21:51 <geist> and yes you can program more than once, but usually can only make bits go to 0 or 1 (depending on how you interpret the cells being 'full' or 'empty' of charge)

21:52 <clever> and once i exhaust every bit in the erase block, then i erase and repeat

21:52 <geist> so there's some trickery there too where you can write an entry, and then go back later and overwrite it, as long as you're only flipping bits in one direction

21:52 <clever> yep
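
The bitfield idea can be sketched as a tally counter, assuming a part where erase sets every bit to 1, programming can only clear bits, and the same byte may be programmed repeatedly (whether that is allowed, and how much wear it causes, depends on the device). `EraseBlock` and `TallyCounter` are hypothetical names, and the `base` value would itself need persistent storage in a real design.

```python
class EraseBlock:
    """Simulated flash block: erase sets bits to 1, program can only clear them."""

    def __init__(self, size):
        self.data = bytearray([0xFF] * size)
        self.erase_count = 0

    def program(self, offset, value):
        # Programming ANDs the new value into the cell contents.
        self.data[offset] &= value

    def erase(self):
        self.data = bytearray([0xFF] * len(self.data))
        self.erase_count += 1

class TallyCounter:
    """Counter that clears one bit per increment and erases only when out of bits."""

    def __init__(self, block):
        self.block = block
        self.base = 0   # counts carried over from earlier erase cycles

    def value(self):
        cleared = sum(8 - bin(b).count("1") for b in self.block.data)
        return self.base + cleared

    def increment(self):
        for i, b in enumerate(self.block.data):
            if b:                                  # at least one bit still set
                lowest = b & -b                    # isolate the lowest set bit
                self.block.program(i, b & ~lowest & 0xFF)
                return
        # Every bit is used up: carry the count over, erase once, start again.
        self.base += 8 * len(self.block.data)
        self.block.erase()
        self.increment()
```

A 2-byte block gives 16 increments per erase in this model.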

21:52 <vin> clever: programming a particular region multiple times will for sure saturate the block's P/E cycles.

21:52 <clever> the OTP in the rpi plays by the same rules (only 0->1 i believe)

21:52 <clever> and the SPI flash i last read the datasheet on, is 1->0 only

21:52 <geist> an erase cycle is the slow part since it involves basically heating up the cell i believe and 'filling' it with charge

21:52 <clever> with erase returning it to 0xff

21:53 <geist> yah can be 0 -> 1 or 1 -> 0 depending on how you or the controller interprets a full cell

21:53 <geist> some controllers i've seen let you set that

21:53 <clever> yep

21:53 <geist> also the erase cycle is the thing they list as say only can handle 3000 or 1000 times

21:53 <clever> that would basically just be xor'ing a config flag with every bit you read out
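
Both rules fit in a few lines. This sketch assumes the 1->0 programming direction of the SPI flash mentioned above; the function names are invented. The first helper checks whether new data can be written over old data without an erase, and the second applies the polarity flag by XORing every byte on readout.

```python
def can_program_in_place(old, new):
    """True if `new` can overwrite `old` by only clearing bits (1 -> 0)."""
    if len(old) != len(new):
        return False
    # Every bit set in the new byte must already be set in the old byte.
    return all((o & n) == n for o, n in zip(old, new))

def read_with_polarity(raw, inverted):
    """Apply a polarity config flag by XORing each byte that is read out."""
    mask = 0xFF if inverted else 0x00
    return bytes(b ^ mask for b in raw)
```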

21:54 <geist> as i was talking about before the current hotness where they can take a MLC/TLC/QLC flash and erase it as SLC i think the reason it's faster is it doesn't go through as deep of an erase cycle

21:55 <geist> since it doesn't have to be as 'precise' about how much it fills the cells, since it's ganging up multiple cells into a single SLC cell

21:55 <clever> do any flash types use analog levels?

21:55 <geist> also why the multilevel cells (MLC/TLC/QLC) is slower, i believe

21:55 <clever> ah, you beat me to it, lol

21:56 <clever> https://www.winbond.com/resource-files/w25x40cl_f%2020140325.pdf

21:56 <geist> yah i dunno how the sense logic works there, but i guess it's effectively analog

21:56 <clever> i was reading this one earlier, and i believe it stated 100k erase cycle limit

21:56 <clever> down on page 4

21:56 <geist> yep. low density stuff like that can handle a lot more erase cycles

21:56 <geist> the high density MLC/TLC/QLC stuff the number of cycles it handles has been going down somewhat exponentially

21:56 <clever> and the smallest erase block is 4k, but it also has commands for 32k, 64k, and whole-chip erase

21:56 <geist> i think 1k is the current QLC limit

21:57 <geist> that SPI flash thing you've linked is potentially NOR flash, which also handles a lot of erase cycles

21:59 <vin> To avoid redundant invalidation of good blocks when greedy victim selection is done in a superblock, maybe the superblock array should be a set of blocks at different offsets in each plane
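
A small comparison of the two policies, assuming the FTL tracks a valid-page count per block as `valid[plane][offset]` (a made-up layout). Greedy selection picks the victim with the fewest valid pages, since those pages must be copied elsewhere before the erase.

```python
def pick_fixed_offset(valid):
    """Greedy victim when a superblock is the same block offset in every plane."""
    offsets = range(len(valid[0]))
    best = min(offsets, key=lambda off: sum(plane[off] for plane in valid))
    return [best] * len(valid), sum(plane[best] for plane in valid)

def pick_per_plane(valid):
    """Greedy victim when each plane may contribute a block at its own offset."""
    choice = [min(range(len(plane)), key=plane.__getitem__) for plane in valid]
    return choice, sum(plane[off] for plane, off in zip(valid, choice))

# valid[plane][offset] = number of still-valid pages in that block
valid = [
    [0, 60, 10],   # plane 0
    [50, 5, 40],   # plane 1
]
print(pick_fixed_offset(valid))
print(pick_per_plane(valid))
```

In this made-up example the fixed-offset policy has to relocate 50 valid pages, while per-plane selection relocates 5. The cost not modelled here is the extra bookkeeping of remembering which offset each plane contributes to a superblock.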

21:59 <clever> the protection bits are also surprisingly limited, it can protect none/64k/128k/256k/all, starting at either the top or the bottom of the chip

22:00 <clever> and the protection itself, is configured via a dedicated config register, that is basically just another 8bits of flash memory

22:00 <clever> and the physical write-protect pin, only protects that config register, and nothing else

22:02 <clever> geist: another thing, is erasing flash via UV light

22:02 <clever> ive seen a hackaday article, where somebody decapped an AVR MCU with acid, masked off the program flash with tape, and then used UV light to erase the "fuse bits"

22:02 <clever> to remove the protections that stop you from reading the program memory

22:03 <clever> would that work on all types of flash? would it even go to the same level as a proper erase?

22:03 <geist> ah cute

22:03 <geist> reminds me, i keep meaning to pick up a UV eraser from ebay

22:03 <geist> though mostly deal with eeproms when i have them

22:03 <clever> in the avr case, there is a metal layer over the fuses, for just this reason

22:04 <clever> but if you fire the uv in diagonally, it will bounce between metal layers like a fiber-optic cable, and still hit the cell

22:04 <geist> noice

22:05 <clever> but that makes me wonder, how physically large might the OTP cells in an rpi soc be?

22:05 <clever> if i blast one with UV, will it return to 0 or 1? (1's are permanent normally)

22:05 <vin> What would be good workloads to evaluate FTL policies?

22:05 <clever> i can see some security exploits if i can target a specific OTP register

22:11 <clever> geist: the docs for the new CM4 secureboot were recently found on github, and there are OTP flags to disable developer keys, and disable vpu jtag

22:11 * Ameisen_ sees AVR discussion

22:11 <clever> but the docs didnt mention anything about signing the bootcode.bin blob, so i can only assume that broadcoms RSA key is still part of the trust root

22:11 <Ameisen_> Re: erasing the fuse bits: but why?

22:12 <clever> Ameisen_: to dump a protected program

22:12 <Ameisen_> ah

22:12 <Ameisen_> I was trying to contextualize it as why would _I_ want to do that

22:12 <Ameisen_> forgot that other people have different motivations

22:12 <clever> Ameisen_: normally, the LOCK flag protects the code, and you must do a full chip-erase to unlock it, but by erasing fuses with UV light, you can unlock without a full erase

22:13 <Ameisen_> I am 99.9% sure if I tried to do any of that, I would just break the chip permanently.

22:14 <clever> Ameisen_: well, it does involve melting the top off the chip with acid, without breaking any bond wires....

22:14 <clever> https://hackaday.com/2011/06/27/bunnies-archives-unlocking-protected-microcontrollers/

22:14 <bslsk05> hackaday.com: [Bunnie’s] Archives: Unlocking Protected Microcontrollers | Hackaday

22:14 <Ameisen_> I'm more likely to melt my fingers

22:15 <clever> my memory seems to also be questionable

22:15 <clever> i was certain that it was done to AVR's, but this post is about PIC's

22:15 <clever> am i getting old? lol

22:15 <Ameisen_> probably

22:16 <clever> Ameisen_: https://bunniestudios.com/blog/images/pic/fullchip_labels.pdf a photo of the raw die

22:17 <clever> the problem, is putting a piece of tape over the 8kb flash array, but leaving the security fuse array exposed

22:17 <Ameisen_> I decapped a Geforce 3 a long time ago. It was not intentional.

22:17 <Ameisen_> back when the GeForce 3 was top-of-the-line.

22:17 <Ameisen_> :(

22:18 <Ameisen_> performance-wise in basically every single aspect other than power usage, couldn't you put a lightweight AVR emulator onto a Cortex-M chip and still... beat any AVR chip?

22:18 <Ameisen_> I've been meaning to tinker with both AVR emulation on AVR and ARM

22:19 <clever> Ameisen_: possibly, the new rp2040 from RPF has 264kb of sram, and has a dual core 125mhz cortex-m0+

22:19 <Ameisen_> on AVR being I want to profile the performance of emulating itself to see how slow execution of AVR instructions from memory is.

22:19 <clever> so you could definitely try and emulate avr there, but getting deterministic execution out may be a bit tricky

22:19 <Ameisen_> this is true. AVR is entirely deterministic clock-wise

22:20 <Ameisen_> so your emulation layer would have to try to take things like that into account, particularly in regards to inputs

22:20 <Ameisen_> hrmm

22:20 <clever> the rp2040's cortex-m0+ is also deterministic, but if 2 bus masters fight over a bus slave (like a ram bank), one of them will have to stall

22:20 <clever> but you can set a priority, so a certain master always wins

22:21 gxt_ has quit [Remote host closed the connection]

22:23 <Ameisen_> it'd still be an interesting project

22:23 <Ameisen_> though I want to get an emulator running on the AVR itself, first

22:23 <Ameisen_> as I want to execute AVR machine code from RAM

22:24 gxt_ has joined #osdev

22:28 <Ameisen_> Mainly I'm curious what the performance would be like (awful is expected, but _how_ awful)

22:28 <Ameisen_> plus it will be interesting to microoptimize such an emulator - it's easier to do that on AVR than, say, x86

22:29 <clever> but platforms like the rp2040 dont need such hacks, since it can just run code from ram directly

22:30 pretty_dumm_guy has quit [Quit: WeeChat 3.3]

22:30 <Ameisen_> Right.

22:31 <Ameisen_> Such an emulator can just do a static recompile of the AVR program

22:31 <Ameisen_> can't do that when running AVR on AVR though

22:34 <Ameisen_> I'm basically thinking something like x86emu

22:34 <Ameisen_> though I absolutely don't want to emulate x86 on AVR. Though that'd be an interesting experiment; but I don't think the emulator would fit in AVR's program memory... you'd have to first have an AVR emulator, then the x86 emulator running in the AVR emulator on the AVR...

22:35 <clever> Ameisen_: oh, that reminds me, of another avr emulator project

22:36 <clever> Ameisen_: https://spritesmods.com/?art=avrcpm a z80 emulator, complete with SD and DRAM bit-banging, and a CP/M bios, so it can boot full CP/M

22:36 <Ameisen_> I want to add an AVR target to vemips, but I have zero idea how to handle program memory separation there

22:36 <Ameisen_> I had a crazy idea to let vemips load binaries of basically any supported target and let them interact.

22:36 <Ameisen_> but I'd have to figure out a way for it to know the difference between an address to 'program memory' and to normal memory

22:37 <Ameisen_> particularly when addresses might get passed to functions that were originally from a target that had no such concept

22:37 <Ameisen_> best I can think is some sort of prefix or suffix with the address

22:37 <clever> Ameisen_: i think the avr-gcc toolchain, just uses some extra bits in the 32bit addr, to denote if its flash or ram

22:37 <Ameisen_> Yeah, but I cannot rely on that for this.

22:37 <clever> and its up in the range where those bits dont exist on real hw
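
A sketch of the "extra bits in the address" approach. The choice of bit 23 as the RAM tag is an assumption made for this example, as are the helper names. An interpreter could only rely on it if every pointer entering from the AVR side were tagged this way, which is exactly the part that generic binaries would not do.

```python
RAM_TAG = 1 << 23   # assumption: this bit set means "RAM", clear means "flash"

def tag_ram(addr16):
    """Wrap a bare 16-bit RAM address in a tagged pointer."""
    return RAM_TAG | (addr16 & 0xFFFF)

def tag_flash(addr):
    """Wrap a program-memory address; the tag bit stays clear."""
    return addr & (RAM_TAG - 1)

def decode(ptr):
    """Split a tagged pointer into (address space, offset)."""
    if ptr & RAM_TAG:
        return "ram", ptr & 0xFFFF
    return "flash", ptr & (RAM_TAG - 1)
```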

22:37 <Ameisen_> I'm saying that this would be able to load a MIPS32r6 library, an 8-bit AVR binary, and they could interact in the vemips environment

22:38 <Ameisen_> but if the AVR binary passed an address to a function that originally came from MIPS, and the MIPS-side, say, called it

22:38 <Ameisen_> the MIPS-side would have to know in the interpreter that it needs to point to a virtualized program memory space

22:38 <Ameisen_> setting some flags in the internals of the interpreter for the value could work

22:39 <Ameisen_> just tricky

22:39 <Ameisen_> the values from the AVR-side wouldn't be universal pointers; they'd probably be normal bare 16-bit ones

22:40 <Ameisen_> I'm just not sure how the interpreter would actually know that it's a program memory address if it's being passed as, say, an argument

22:40 <Ameisen_> the idea breaks down at that point

22:40 <clever> that gets into the printf vs printf_P stuff i think?

22:40 <clever> avr-libc has variants of most functions, that expect the pointer to be pointing to flash instead of ram

22:41 <clever> so you can do printf_P(PSTR("foo %d bar %d\n"), foo, bar);

22:41 <Ameisen_> Yes; but as said, in this case I'm talking about an interpreter that can load both AVR and MIPS binaries and have them interact

22:41 <clever> and it wont waste a dozen bytes of ram on the string

22:41 <Ameisen_> without having any real awareness of one another

22:41 <clever> you would likely need a type code on each function you can pass to the interpreter

22:41 <Ameisen_> the binaries shouldn't need to know about the interpreter ;P

22:42 <Ameisen_> that's where it breaks down

22:42 <Ameisen_> generic AVR binaries wouldn't provide enough information to the interpreter to resolve this

22:42 <Ameisen_> they'd have to be ones specifically built for the purpose

22:42 <Ameisen_> and that's sorta lame

23:11 Oli has quit [Read error: Connection reset by peer]

23:11 Oli has joined #osdev

23:31 mctpyt has quit [Ping timeout: 252 seconds]