<clever>
but there will be some slight mismatch, it's not perfect
<clever>
and each slot will be mismatched differently
<zid>
it doesn't have to be, just within a clock
<clever>
isn't training there to deal with sub-clock differences?
<zid>
no
<zid>
training is to find out what speeds the module can handle
<zid>
without going wrong
<zid>
original xbox had really variable quality ram and some of it would clock 200MHz and some would clock 150MHz out of the POST and make games lag, whoopsie
<clever>
ah, i thought training was adjusting clock phasing for each data pin
<clever>
yeah, rather than just clocking everything at 150mhz, it would dynamically overclock it to the edge of failure
<zid>
they were 'rated' for 200
<zid>
but the quality control was bad from their supplier
<clever>
and some game devs got lucky with a 200mhz capable unit, and designed a game that can't run at 150mhz
<clever>
ah
<clever>
something i heard about with hdmi, is that you have a 10x PLL, to convert the pixel clock into a bit clock
<clever>
and it has 4? 90 degree shifted bit clocks coming out of it
<clever>
and the receiver will dynamically try each shifted clock, and pick whichever has the lowest error rate
<clever>
to account for the color channels being skewed by up to half a clock relative to the clock channel
<mcrod`>
ok so...
<clever>
and i had assumed dram training, is the same thing?
<mcrod`>
I think the bios flashback failed, but failed as in, "i couldn't find the EFI file"
<mcrod`>
so, I'll let this go for 10 minutes
<clever>
zid: but if it's purely changing the clock speed, won't that result in variation in performance, like with the xbox? and how do you know what clock it actually picked?
<zid>
Usually it aims for a specific mhz
<zid>
and only trains the cas# ras# etc
<zid>
so the speed difference is marginal
<zid>
9-9-9-24 800MHz vs 11-9-9-24 800MHz etc
<clever>
ah
<zid>
makes it cost 2 extra clocks on a cold cache miss or whatever
<mcrod`>
i wonder how long this training will take
<clever>
the pi4 lpddr4 controller somehow involves 8 files, named memsys00.bin thru memsys07.bin, each 21kb in size
Burgundy has quit [Ping timeout: 240 seconds]
<clever>
the pi5 appears to have the identical lpddr4 controller, but only has memsys00.bin thru memsys03.bin, but now 41kb each
<zid>
ddr5 is also weird and different
<zid>
to ddr1-4
<clever>
i was thinking maybe there was one file per potential dram chip, 1gig, 2gig, 3gig, 4gig, 5gig ... 8gig
<zid>
ddr4 only had like, a-die and b-die and worked just like ddr3 for the most part
<zid>
yea or maybe 8 suppliers
<clever>
but now that i notice, it's half the files, but double the size....
<zid>
'samsung b-die file' etc
<clever>
i'm not sure anymore
<clever>
ELLI LAGDAOLOTS/E ERPECXNOIT:CP PE
<clever>
but, the byte order thing is still present
<clever>
if you swap the byte order on each 32bit word, the above turns into
<clever>
ILLEGAL LOAD/STORE EXCEPTION PC:
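[editor's note] A minimal Python sketch of the per-word byte swap described above. The log does not say what tool was actually used, so the helper name `swap32` is made up for illustration. It only handles whole 32-bit words, so the trailing two bytes of the pasted string are left out of the input.

```python
def swap32(data: bytes) -> bytes:
    """Reverse the byte order inside each 4-byte word of `data`."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    out = bytearray()
    for i in range(0, len(data), 4):
        # Take one 32-bit word and append it with its bytes reversed.
        out += data[i:i + 4][::-1]
    return bytes(out)


# First 32 bytes of the string pasted in the log (the final "PE" is a partial word).
garbled = b"ELLI LAGDAOLOTS/E ERPECXNOIT:CP "
print(swap32(garbled).decode("ascii"))
```

Applying the swap twice gives back the original bytes, so the same helper converts in either direction.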
<clever>
i dont see those strings doubling up
<clever>
so somehow, these files have doubled in size, and there are half as many......
<zid>
did it drop from 2 channels to 1 channel?
<zid>
or 2 dimms to 1 dimm?
<zid>
(or simms, I guess, if they're soldered)
<clever>
CHA/CHB CA/CS/CK PD_STREN set to picked val shown above
<clever>
i found this string in the pi5 memsys, after i byte-swap it
<clever>
CHA converged at DAC=
<clever>
CHB converged at DAC=
<clever>
and these
<clever>
legend: 11:6=CHB 5:0=CHA. Add
<clever>
that sounds like it still operates in dual channel mode
<clever>
i had also found some datasheets for lpddr4 chips of various sizes, and after going over things, i can now see how rank/channel all work
<clever>
the 4gig part, is just a pair of 2gig dies, in 2 channel mode, a dedicated 16bit lpddr4 bus to each die, operating entirely independently
<clever>
the 8gig part, is 4 2gig dies, still 2 channels, but each channel has 2 chip-selects, giving you CS0_A/CS0_B/CS1_A/CS1_B
<clever>
and the 16gig part, is 8 2gig dies, 2 channels, 2 CS per channel, but now each die is running in 8bit mode, so a single 16bit access gets split in half, like a raid stripe
<clever>
because a bank/row/col addr is now referencing half as much data, the row# in the 16gig part doubles
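[editor's note] A rough sketch of the arithmetic behind the row count doubling. The geometry numbers below (die capacity, bank count, column count) are illustrative assumptions, not taken from any datasheet mentioned in the log. The point is only that halving the data width at fixed capacity doubles the rows.

```python
def row_count(die_bits: int, banks: int, columns: int, width_bits: int) -> int:
    """Rows per bank needed to cover `die_bits` with the given geometry."""
    bits_per_row_index = banks * columns * width_bits
    if die_bits % bits_per_row_index:
        raise ValueError("geometry does not divide the capacity evenly")
    return die_bits // bits_per_row_index


# Hypothetical die: same capacity in both cases, only the bus width changes.
DIE_BITS = 16 * 2**30   # assumed capacity, for illustration only
BANKS = 8               # assumed
COLUMNS = 1024          # assumed

rows_x16 = row_count(DIE_BITS, BANKS, COLUMNS, width_bits=16)
rows_x8 = row_count(DIE_BITS, BANKS, COLUMNS, width_bits=8)
print(rows_x16, rows_x8, rows_x8 // rows_x16)
```

With these assumed numbers the 8-bit configuration needs one more row address bit than the 16-bit one. That matches the claim that a controller with a fixed row address width cannot reach the whole part.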
<clever>
and the pi4 isnt able to deal with row#'s that large, so the 16gig chip only has 8gig addressable
<clever>
zid: does that all make sense?
<zid>
I wasn't paying any attention
<clever>
do you happen to know how x86 ddr4 dimms differ?
vdamewood has joined #osdev
tacco has quit [Remote host closed the connection]
<zid>
not in depth enough to explain anything beyond basic timings and that multiple manufacturers exist
<zid>
and the timings they tend to like vary a little
<clever>
ah
<zid>
people liked samsung b-die
<clever>
my rough understanding, is that there is a real time component, like for example 6ns to open a row, and thats working in the analog domain
<zid>
that's ras
<clever>
but the dram controller needs an integer number of clock cycles to wait
<zid>
column select is CAS
<zid>
those are the 7-7-7-24 part
<clever>
and you then have to select a RAS that is over 6ns at the current clock
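[editor's note] The rounding step described here can be sketched as follows. The 6ns figure is the example from the chat; the clock speeds and the function name `cycles_needed` are assumptions for illustration. Fractions are used so that exact multiples do not get rounded up by floating point error.

```python
import math
from fractions import Fraction


def cycles_needed(t_ns: int, clock_mhz: int) -> int:
    """Smallest whole number of clock cycles lasting at least `t_ns`."""
    period_ns = Fraction(1000, clock_mhz)  # one clock period in nanoseconds
    return math.ceil(Fraction(t_ns) / period_ns)


for mhz in (500, 800, 1000):
    print(f"{mhz} MHz: wait {cycles_needed(6, mhz)} cycles for a 6 ns delay")
```

At 800 MHz a 6 ns requirement works out to 4.8 clocks, so the controller has to wait 5. The extra 0.2 of a clock is latency lost to rounding.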
<clever>
yeah
<zid>
cas, ras, trcd, etc
<zid>
it means that faster ram can be slower if the timings are loose enough, which is fun
<clever>
yeah
<zid>
like, is 667 @ 4-4-4-20 better than 800 @ 9-9-9-20? depends if you want latency or b/w
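[editor's note] A quick sketch of that comparison, assuming the 667 and 800 figures are clock rates in MHz. If they are transfer rates instead, both latencies double but the ordering stays the same. The function name `latency_ns` is made up for illustration.

```python
def latency_ns(cycles: int, clock_mhz: float) -> float:
    """Convert a timing given in clock cycles to nanoseconds."""
    return cycles * 1000.0 / clock_mhz


# (clock in MHz, CAS latency in cycles) for the two kits from the chat.
for clock_mhz, cl in ((667, 4), (800, 9)):
    print(f"{clock_mhz} MHz CL{cl}: {latency_ns(cl, clock_mhz):.2f} ns")
```

Under that assumption the 667 kit has a CAS latency of roughly 6 ns against 11.25 ns for the 800 kit, while the 800 kit moves about 20% more data per second once a burst is under way.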
<clever>
but ram with a high bank count, and a good dram controller, can hide some of that
<clever>
you can start opening a row in bank 0
<clever>
then go off and do a different request in bank 1
<clever>
the request in bank0 still has high latency, but the bus isnt going to waste and it finds something else to do
<clever>
but that requires a deeper fifo on the dram controller, and re-ordering things
<clever>
and also some fancy raid like stuff, to ensure requests go to different banks and dont conflict
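[editor's note] A toy model of the overlap described above, not a model of any real controller. It assumes every request needs a fresh row open, all requests are queued at time 0, the data bus carries one burst at a time, and it ignores command bus limits and other timing constraints. All names and numbers are hypothetical.

```python
def total_time(n_requests: int, banks: int, row_open: int, burst: int) -> int:
    """Cycles until the last burst finishes, spreading requests round-robin over banks."""
    bank_free = [0] * banks  # time at which each bank can start its next row open
    bus_free = 0             # time at which the shared data bus is next idle
    for i in range(n_requests):
        b = i % banks
        ready = bank_free[b] + row_open  # row is open and data can be read
        start = max(ready, bus_free)     # wait for the bus if another burst is running
        bus_free = start + burst
        bank_free[b] = bus_free          # bank is busy until its burst completes
    return bus_free


print("1 bank :", total_time(4, banks=1, row_open=6, burst=2))
print("4 banks:", total_time(4, banks=4, row_open=6, burst=2))
```

In this model each request still waits the full row-open time, but with several banks those waits overlap, so the bus only idles during the first one.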
<zid>
fancy raid = dual channel :p
<clever>
but, if it's dumb, all of the addresses you're hitting may land on the same channel :P
<clever>
and now the other channel goes to waste
<clever>
the mapping from address to channel has to fit the typical request pattern
<zid>
dual channel = cloning all the shit to another kit and treating it as HI and LO
<clever>
yeah, but how do you map things, does word 0 go to channel 0, word 1 to channel 1, word 2 to channel 0?
xenos1984 has quit [Read error: Connection reset by peer]
<clever>
so reading 4 words in a row, would hit up both channels, and do 2 words each?
<clever>
like, lets say i just have an uint32_t[1024], which channel does each index land on?
<zid>
yea it's just raid 0
<zid>
so you get straight up double b/w
<clever>
what happens if i'm only accessing every even element?
<zid>
nothing?
<clever>
now the 2nd channel goes to waste
<zid>
it's either hi/lo or straight interleaved, don't remember
<clever>
because the even elements are on channel 0, and the odd elements on channel 1
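[editor's note] A sketch of the two mappings being argued about, for the uint32_t[1024] example. The chat does not establish how any real controller maps addresses. The 4-byte granule is the word interleave hypothesised above, and the 64-byte granule is an assumed cache-line-sized interleave. Real controllers may also hash several address bits together.

```python
def channel_of(addr: int, granule: int, channels: int = 2) -> int:
    """Channel for a byte address when memory is interleaved every `granule` bytes."""
    return (addr // granule) % channels


WORD = 4  # sizeof(uint32_t)
even_addrs = [i * WORD for i in range(0, 1024, 2)]  # byte addresses of the even elements

word_interleave = {channel_of(a, granule=4) for a in even_addrs}
line_interleave = {channel_of(a, granule=64) for a in even_addrs}
print("4-byte interleave :", sorted(word_interleave))
print("64-byte interleave:", sorted(line_interleave))
```

With a 4-byte granule the even elements only ever touch channel 0, which is the waste being described. With a 64-byte granule the same access pattern still alternates between both channels.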
<zid>
you can't make it 'go wrong' any more than you could make raid0 go wrong by writing 1 byte files
<zid>
you get half the bits on each drive
<zid>
parity drive = ecc
<clever>
if you splice each byte across both channels, then the channels are always doing the same action
<zid>
if memory was bit addressable you could fuck it up, but.. it isn't
<clever>
and they can't operate independently
<clever>
i feel you would lose the benefit of having 2 channels?
<zid>
you write back entire cache lines on an actual machine anyway, even if you TRIED to fuck it up, in software
<clever>
it would have been better to just have one wider channel
<zid>
they actually made them even thinner for ddr5
<zid>
wide channels are a compromise, not what you ant
<zid>
want
<clever>
yeah
<zid>
it hides more cas latency
<clever>
it sounds like its better to have many narrow channels?
<zid>
but it's worse for cpu perf
<zid>
yes, but you can always just split the row for free with raid0
<clever>
and the more channels you have, the more parallel things can be
<zid>
and it's entirely transparent
<mcrod`>
fun fact
<mcrod`>
memory training took 10 minutes for me on 128GB of RAM
<clever>
it stops being transparent when rowhammer shows up :P
<clever>
and then you need to know who your neighbors are
<zid>
octochannel ram could literally do a bit per dimm, I think it does bytes though instead
<zid>
64byte row gets written back -> 8 bytes per dimm get written back, and you get the next few rows with 0 cas latency as a bonus
<zid>
cas latency is still the same, because you're still only doing one huge parallel CAS for each module, but technically you only wrote 8 bytes of the row, instead of all 64, so your sequential write speed now has 0 cas latency for the next 7 rows
<clever>
i can see that helping for sequential writes
<clever>
but what about random writes?
<zid>
there's nothing "better" you can do
<zid>
random writes require a cas
<clever>
lets say i'm accessing 4 different arrays, each in a different region of ram
<zid>
unless you happen to also have 16 memory ports
<zid>
on your cpu
<clever>
if you dont stripe a row like you said
<zid>
it's a big deal when an intel cpu has two
<clever>
then there is a chance each row could be in a different bank, or different channel
<clever>
and the row for all arrays would stay open
<clever>
(ignoring the L1/L2 cache)
<clever>
which now that i bring that up, nearly everything is going to be huge sequential read/write, due to a cache miss or evict....
<zid>
I don't actually know how it stripes the data, anyway
<zid>
per row, per bit, per byte, etc
<zid>
one of those is presumably best, and what it does
<zid>
but idk what it is
<clever>
another thing i'm not entirely sure about
<clever>
a cheap motherboard, might only have 1 channel, and wire every dimm slot in parallel?
<clever>
and just use chip-selects to pick which module
<clever>
while an expensive motherboard (and cpu i guess), would have enough channels to drive every slot at once
<clever>
and what was that whole deal with needing matched dimms in certain slots?
CryptoDavid has quit [Quit: Connection closed for inactivity]
nicesj has joined #osdev
<nicesj>
I've read some news about RISE, it looks fun
xenos1984 has joined #osdev
Burgundy has joined #osdev
Burgundy has quit [Ping timeout: 245 seconds]
netbsduser has quit [Ping timeout: 260 seconds]
<geist>
RISERISERISE
goliath has quit [Quit: SIGSEGV]
<kof123>
> seriously i refuse to believe that 9 fans is not counterproductive https://9fans.github.io/plan9port/ yeah it seems suboptimal :D
<bslsk05>
9fans.github.io: Plan 9 from User Space
riverdc has quit [Ping timeout: 272 seconds]
edr has quit [Quit: Leaving]
pretty_dumm_guy has quit [Quit: WeeChat 3.5]
dude12312414 has joined #osdev
dude12312414 has quit [Remote host closed the connection]
nicesj has quit [Ping timeout: 272 seconds]
duderonomy has quit [Ping timeout: 260 seconds]
heat_ has quit [Ping timeout: 258 seconds]
gbowne1 has quit [Read error: Connection reset by peer]
sbalmos has quit [Ping timeout: 264 seconds]
sbalmos has joined #osdev
Matt|home has joined #osdev
bliminse has quit [Read error: Connection reset by peer]
bliminse has joined #osdev
agent314 has joined #osdev
xvmt has quit [Remote host closed the connection]
rpnx has joined #osdev
xvmt has joined #osdev
<zid>
fuck it's cold
Matt|home has quit [Quit: Leaving]
mkwrz has quit [Ping timeout: 240 seconds]
mkwrz has joined #osdev
eck has quit [Quit: PIRCH98:WIN 95/98/WIN NT:1.0 (build 1.0.1.1190)]
eck has joined #osdev
agent314 has quit [Ping timeout: 255 seconds]
rpnx has quit [Quit: My laptop has gone to sleep.]
Yoofie6 has joined #osdev
Yoofie has quit [Ping timeout: 240 seconds]
Yoofie6 is now known as Yoofie
<SophiaNya>
how cold is cold
<zid>
it is 7 colds
<zid>
or was, it's warmer now
<zid>
I need neighbours that are doing illegal grow ops
<vai>
zid: 110 day without smokes...
<vai>
I stopped with Pall Mall Red
GeDaMo has joined #osdev
agent314 has joined #osdev
gog has joined #osdev
<sham1>
Illegal ops!?
xenos1984 has quit [Read error: Connection reset by peer]
<sham1>
Oh no
xenos1984 has joined #osdev
<zid>
Speaking of illegal acts, see you tonight sham1
pretty_dumm_guy has joined #osdev
[_] has joined #osdev
[_] has quit [Remote host closed the connection]
[_] has joined #osdev
[itchyjunk] has quit [Ping timeout: 260 seconds]
gildasio has quit [Remote host closed the connection]
gildasio has joined #osdev
gildasio has quit [Remote host closed the connection]
gildasio has joined #osdev
Burgundy has joined #osdev
err has quit [Remote host closed the connection]
Left_Turn has joined #osdev
err has joined #osdev
gildasio has quit [Remote host closed the connection]
gildasio has joined #osdev
netbsduser has joined #osdev
zxrom has quit [Quit: Leaving]
bauen1 has quit [Ping timeout: 255 seconds]
bauen1 has joined #osdev
bauen1 has quit [Ping timeout: 264 seconds]
<mcrod`>
hi
<mcrod`>
i saw a sign on the road that said “EXCELLEBT RUST PROTECTION” but i didn’t take a picture
<mcrod`>
EXCELLENT*
Turn_Left has joined #osdev
Left_Turn has quit [Ping timeout: 260 seconds]
goliath has joined #osdev
agent314 has quit [Remote host closed the connection]