f_ changed the topic of ##raspberrypi-internals to: The inner workings of the Raspberry Pi (Low level VPU/HW) -- for general queries please visit #raspberrypi -- open firmware: https://librerpi.github.io/ -- VC4 VPU Programmers Manual: https://github.com/hermanhermitage/videocoreiv/wiki -- chat logs: https://libera.irclog.whitequark.org/~h~raspberrypi-internals -- bridged to matrix and discord
jcea has quit [Ping timeout: 264 seconds]
HerculeP is now known as Herc
Bitweasil has quit [Remote host closed the connection]
Bitweasil has joined ##raspberrypi-internals
jcea has joined ##raspberrypi-internals
ungeskriptet7 has joined ##raspberrypi-internals
ungeskriptet has quit [Ping timeout: 264 seconds]
ungeskriptet7 is now known as ungeskriptet
<f_ridge> <x​2x6_/D> Hi. Had too much work these days. Going to try again now.
<f_ridge> <x​2x6_/D> As soon as I write to the CMDTM register, I get this value in the interrupt register: intr:00208001
<f_ridge> <x​2x6_/D> bit 15 = ERROR, bit 21 = DATA CRC ERROR.
<f_ridge> <x​2x6_/D> that's immediately after I have done the switch to 4 bit DAT, I will try to do it a bit earlier
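The value quoted above can be picked apart bit by bit. The following is a minimal sketch, assuming the bit positions from the EMMC INTERRUPT register description in the BCM2835 ARM Peripherals datasheet; the macro names and the `intr_print` helper are hypothetical and not taken from anyone's driver in this log.

```c
#include <stdint.h>
#include <stdio.h>

/* INTERRUPT register bits (assumed from the BCM2835 EMMC chapter). */
#define INTR_CMD_DONE  (1u << 0)   /* command has finished */
#define INTR_DATA_DONE (1u << 1)   /* data transfer has finished */
#define INTR_ERR       (1u << 15)  /* summary bit: some error occurred */
#define INTR_CTO_ERR   (1u << 16)  /* command timeout */
#define INTR_CCRC_ERR  (1u << 17)  /* command CRC error */
#define INTR_DTO_ERR   (1u << 20)  /* data timeout */
#define INTR_DCRC_ERR  (1u << 21)  /* data CRC error */

struct intr_flag { uint32_t mask; const char *name; };

static const struct intr_flag intr_flags[] = {
    { INTR_CMD_DONE,  "CMD_DONE"  },
    { INTR_DATA_DONE, "DATA_DONE" },
    { INTR_ERR,       "ERR"       },
    { INTR_CTO_ERR,   "CTO_ERR"   },
    { INTR_CCRC_ERR,  "CCRC_ERR"  },
    { INTR_DTO_ERR,   "DTO_ERR"   },
    { INTR_DCRC_ERR,  "DCRC_ERR"  },
};

/* Print the names of the known bits set in intr; returns how many matched. */
int intr_print(uint32_t intr)
{
    int count = 0;
    for (size_t i = 0; i < sizeof intr_flags / sizeof intr_flags[0]; i++) {
        if (intr & intr_flags[i].mask) {
            printf("%s ", intr_flags[i].name);
            count++;
        }
    }
    printf("\n");
    return count;
}
```

Calling `intr_print(0x00208001u)` reports three flags: besides the ERR and DCRC_ERR bits named in the chat, bit 0 (CMD_DONE) is also set, which suggests the command itself completed and the failure is on the data lines.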
<f_ridge> <c​lever___/D> is both the card and sdhci in 4bit mode?
<f_ridge> <x​2x6_/D> Ahh, no
<f_ridge> <x​2x6_/D> both are in 1 bit mode
<f_ridge> <c​lever___/D> i found that once i did the card in 4bit mode at the right time, all i got was crc errors, because the sdhost was still in 1bit mode
<f_ridge> <c​lever___/D> ah
<f_ridge> <x​2x6_/D> in that case I'll try to turn it on first
<f_ridge> <x​2x6_/D> No, same thing
<f_ridge> <x​2x6_/D> As for Response RESP0: 00000900, STATUS REGISTER: 0x01ff0202
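The RESP0 value can be read as an SD card status word. This is a sketch under the assumption that RESP0 holds an R1 response (the log does not say which command produced it); the field positions follow the card status layout in the SD Physical Layer specification, and the macro and function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* R1 card status fields (assumed: RESP0 carries an R1 response). */
#define R1_READY_FOR_DATA   (1u << 8)
#define R1_CURRENT_STATE(x) (((x) >> 9) & 0xFu)

static const char *const r1_state_names[] = {
    "idle", "ready", "ident", "stby", "tran", "data", "rcv", "prg", "dis"
};

/* Map the CURRENT_STATE field of a card status word to its name. */
const char *r1_state_name(uint32_t status)
{
    uint32_t s = R1_CURRENT_STATE(status);
    return s < sizeof r1_state_names / sizeof r1_state_names[0]
               ? r1_state_names[s]
               : "reserved";
}

/* Print a one-line summary of a card status word. */
void r1_print(uint32_t status)
{
    printf("state=%s ready_for_data=%u\n",
           r1_state_name(status),
           (status & R1_READY_FOR_DATA) ? 1u : 0u);
}
```

Under that assumption, `0x00000900` decodes to the `tran` state with READY_FOR_DATA set and no card-side error bits, so the card itself is not reporting a problem.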
<f_ridge> <c​lever___/D> something i'm curious about, can you try reproducing some of my benchmarks?
<f_ridge> <c​lever___/D> in here, i'm stepping thru every possible clock divisor for the sdhost
<f_ridge> <c​lever___/D> and then reading 1mb from the card each time, and reporting the mbit rate
<f_ridge> <c​lever___/D> and i can clearly see its able to transfer 1 bit per clock, until it hits a wall at ~26mbit
<f_ridge> <x​2x6_/D> What do you want me to run?
<f_ridge> <c​lever___/D> try doing a write benchmark, with different sdhci clock divisors
<f_ridge> <c​lever___/D> and see how the speed scales
<f_ridge> <x​2x6_/D> Do you change clock divisors on the fly?
<f_ridge> <c​lever___/D> yep
<f_ridge> <G​itHub Lines/D> ```c
<f_ridge> <G​itHub Lines/D> for (int i=3; i<=70; i++) {
<f_ridge> <G​itHub Lines/D>   rpi_sdhost_set_clock(i);
<f_ridge> <G​itHub Lines/D>   uint32_t start = *REG32(ST_CLO);
<f_ridge> <G​itHub Lines/D>   bio_read_block(dev, buf, 0, (1024*1024)/512);
<f_ridge> <G​itHub Lines/D>   uint32_t stop = *REG32(ST_CLO);
<f_ridge> <G​itHub Lines/D>   uint32_t interval = stop - start;
<f_ridge> <G​itHub Lines/D>   float bits = 1024*1024*8;
<f_ridge> <G​itHub Lines/D>   float delta = interval;
<f_ridge> <G​itHub Lines/D>   double mbit = bits/delta;
<f_ridge> <G​itHub Lines/D>   printf("%f MHz, \t", ((double)vpu_clock)/i);
<f_ridge> <G​itHub Lines/D>   printf("%d uSec to read 1MB\t", interval);
<f_ridge> <G​itHub Lines/D>   printf("%f mbits/sec\n", mbit);
<f_ridge> <G​itHub Lines/D> }
<f_ridge> <G​itHub Lines/D> ```
<f_ridge> <c​lever___/D> this is how i did it
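The arithmetic in the loop above can be isolated into small helpers. This is a sketch, assuming ST_CLO is the free-running 1 MHz system timer (so one tick is one microsecond) and that the ideal bus rate is one bit per data line per clock, as described in the chat; the function names are hypothetical.

```c
#include <stdint.h>

/* Elapsed ticks between two reads of a free-running 32-bit counter.
 * Unsigned subtraction stays correct across a single wraparound. */
uint32_t elapsed_us(uint32_t start, uint32_t stop)
{
    return stop - start;
}

/* Measured throughput: bits per microsecond equals Mbit/s (10^6 bits/s). */
double mbit_per_sec(uint32_t bytes, uint32_t usec)
{
    if (usec == 0)
        return 0.0;  /* too fast to measure; avoid dividing by zero */
    return ((double)bytes * 8.0) / (double)usec;
}

/* Ideal ceiling: one bit per data line per clock cycle. */
double ideal_mbit_per_sec(double clock_hz, unsigned bus_width)
{
    return clock_hz * (double)bus_width / 1e6;
}
```

Comparing `mbit_per_sec()` against `ideal_mbit_per_sec()` at each divisor shows where the measured rate stops tracking the clock, which is the kind of wall described above at roughly 26 Mbit.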
<f_ridge> <x​2x6_/D> Ok, but I don't have sdhost. Do you mean I should use clock division on SDHCI, or compile lk-overlay and run it on my sdcard?
<f_ridge> <x​2x6_/D> I was thinking of doing your tricky PLL tweaks in undocumented registers, but I am not there yet
<f_ridge> <c​lever___/D> try doing the same things with your sdhci driver
<f_ridge> <c​lever___/D> at the divisors you have available
<f_ridge> <x​2x6_/D> ok, running
<f_ridge> <x​2x6_/D> I am using the mapped clock divs
<f_ridge> <x​2x6_/D> results are pretty interesting to watch
<f_ridge> <x​2x6_/D> I am not going for graphs, I don't think the results will map onto yours
<f_ridge> <x​2x6_/D> These are 1mb reads starting from the same sector
<f_ridge> <x​2x6_/D> while(1) {
<f_ridge> <x​2x6_/D>   for (int i = 0; i < 7; ++i) {
<f_ridge> <x​2x6_/D>     for (int j = 0; j < 20; ++j) {
<f_ridge> <x​2x6_/D>       struct blockdev_io io = {
<f_ridge> <x​2x6_/D>         .dev = partition_dev,
<f_ridge> <x​2x6_/D>         .is_write = false,
<f_ridge> <x​2x6_/D>         .addr = buf_1mb,
<f_ridge> <x​2x6_/D>         .start_sector = 0,
<f_ridge> <x​2x6_/D>         .num_sectors = 1024 * 1024 / 512,
<f_ridge> <x​2x6_/D>         .cb = NULL
<f_ridge> <x​2x6_/D>       };
<f_ridge> <x​2x6_/D>
<f_ridge> <x​2x6_/D>       bcm2835_emmc_set_clock(clocks[i]);
<f_ridge> <x​2x6_/D>       blockdev_scheduler_run_io(&io, clocks[i]);
<f_ridge> <x​2x6_/D>     }
<f_ridge> <x​2x6_/D>   }
<f_ridge> <x​2x6_/D>   continue;
<f_ridge> <x​2x6_/D> The clocks part
<f_ridge> <x​2x6_/D> uint32_t clocks[] = {
<f_ridge> <x​2x6_/D>   0x80,
<f_ridge> <x​2x6_/D>   0x40,
<f_ridge> <x​2x6_/D>   0x20,
<f_ridge> <x​2x6_/D>   0x10,
<f_ridge> <x​2x6_/D>   0x08,
<f_ridge> <x​2x6_/D>   0x04,
<f_ridge> <x​2x6_/D>   0x02
<f_ridge> <x​2x6_/D> };
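The chat does not say how `bcm2835_emmc_set_clock` interprets these values. One possible reading, assumed here and not confirmed by the log, is that they are standard SDHCI divided-clock encodings, where a value N selects the base clock divided by 2*N and 0 selects the base clock itself. The helper name and the `base_hz` parameter below are hypothetical.

```c
#include <stdint.h>

/* SD clock for an SDHCI divided-clock encoding n (assumption: n selects
 * base / (2 * n), and n == 0 means the base clock is used undivided). */
uint32_t sdhci_divided_clock_hz(uint32_t base_hz, uint32_t n)
{
    if (n == 0)
        return base_hz;
    return base_hz / (2u * n);
}
```

Under that reading, with a hypothetical 100 MHz base clock, `0x80` would give about 390 kHz and `0x02` would give 25 MHz, so the seven table entries would span the range in power-of-two steps.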
<f_ridge> <c​lever___/D> gotta run, work stuff, can read more when i finish