narmstrong changed the topic of #linux-amlogic to: Amlogic mainline kernel development discussion - our wiki http://linux-meson.com/ - ml linux-amlogic@lists.infradead.org - official channel moved from Freenode - publicly logged on https://libera.irclog.whitequark.org/linux-amlogic
naoki has joined #linux-amlogic
vagrantc has quit [Quit: leaving]
jacobk has joined #linux-amlogic
jacobk has quit [Ping timeout: 248 seconds]
jacobk has joined #linux-amlogic
jacobk has quit [Ping timeout: 248 seconds]
vagrantc has joined #linux-amlogic
jacobk has joined #linux-amlogic
elastic_dog is now known as Guest9547
elastic_1 has joined #linux-amlogic
Guest9547 has quit [Killed (lead.libera.chat (Nickname regained by services))]
elastic_1 is now known as elastic_dog
jacobk has quit [Ping timeout: 240 seconds]
jacobk has joined #linux-amlogic
vagrantc has quit [Quit: leaving]
jacobk has quit [Ping timeout: 240 seconds]
naoki has quit [Quit: naoki]
<narmstrong> minute: i had some urgent stuff to do yesterday, I’ll look into the scdc issue, your fix is ok for now if you don’t need hdmi2.0 4k60
ldevulder has joined #linux-amlogic
f11f12 has joined #linux-amlogic
f11f12 has quit [Quit: Leaving]
camus has quit [Ping timeout: 240 seconds]
Danct12 has quit [Remote host closed the connection]
Danct12 has joined #linux-amlogic
camus has joined #linux-amlogic
buzzmarshall has joined #linux-amlogic
<minute> narmstrong: thanks!
<minute> xdarklight: so yeah, i can reproduce it, the wifi module/driver always creates these kinds of oopses (different traces though) under any real load like browsing
<minute> xdarklight: still crashes with that patch, uploading relevant dmesg
vagrantc has joined #linux-amlogic
jacobk has joined #linux-amlogic
jacobk has quit [Ping timeout: 250 seconds]
jacobk has joined #linux-amlogic
jacobk has quit [Ping timeout: 240 seconds]
jacobk has joined #linux-amlogic
jacobk has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
tsegers has quit [Remote host closed the connection]
tsegers has joined #linux-amlogic
<xdarklight> minute: strange, can you please change the line "if (rtw_chip_wcpu_11n(rtwdev)) {" to "if (true) {" in that patch: https://lore.kernel.org/linux-wireless/20230522202425.1827005-2-martin.blumenstingl@googlemail.com/ ?
<xdarklight> and then try again
<minute> xdarklight: btw i think what happens is that pkt_stat->pkt_len can be 0, but pkt_offset > 0
<minute> xdarklight: btw i already did what you suggested now
<xdarklight> minute: ah - can you please print pkt_stat->pkt_len ?
<xdarklight> or add some WARN_ON( ... )
<minute> xdarklight: yeah so i did this a few minutes ago http://dump.mntmn.com/rtw8822cs-crash-pktsize.txt
<minute> now i tried to discard (return) if pkt_len == 0, but now it crashes after a pkt_len 16129...
<xdarklight> minute: what RTL8822CS firmware version are you running?
<minute> oh
<minute> i grabbed it from linux from scratch... WOW Firmware version 9.9.4, H2C version 15. Firmware version 9.9.15, H2C version 15
<xdarklight> that seems like the correct one
<minute> btw in the bottom here you can see the new crash http://dump.mntmn.com/rtw8822cs-crash-pktsize2.txt
<minute> it happens when there is a big (rx_len) chunk received in the isr
<minute> not sure if i understand, but maybe the last pkt (or its len?) is corrupted?
<minute> (i'm saying that because 16129 looks bigger than the rx_len?)
<xdarklight> so what we've seen on RTL8723DS is that there's a) RX buffer length and b) RX buffer ready bit - before that patch from yesterday we only honored a), which is all that's needed on the 802.11ac hardware/firmware, but older hardware/firmware require checking the RX ready bit before reading the buffer length. when ignoring the RX ready bit we read some size from the RX length register but data is still written to the buffer and pkt_stat will thus
<xdarklight> point beyond the end of the buffer
<xdarklight> in other words: pkt_stat wasn't the problem but rather the incorrect RX length that we read
<xdarklight> is this on a 2.4GHz or 5GHz network?
<minute> 5ghz
<xdarklight> can you paste the output of: cat /sys/kernel/debug/mmc2/ios ?
<minute> yes. btw returning in rtw_sdio_rx_skb if (!pkt_stat->pkt_len || pkt_stat->pkt_len>1546) works around the crash
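[editor's sketch] The workaround in the message above amounts to a bounds check on the reported packet length before the packet is processed. Below is a minimal standalone sketch of that check, not the rtw88 driver code: the struct, the function name and the `bytes_left` parameter are hypothetical stand-ins, and 1546 is simply the threshold quoted in the message. Comparing against the bytes remaining in the RX buffer is an extra assumption, motivated by the observation that 16129 looked larger than rx_len.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-packet RX status; not the real rtw88 struct. */
struct rx_pkt_stat_sketch {
    uint16_t pkt_len; /* packet length as reported in the RX descriptor */
};

/* Threshold taken from the chat message; not a value confirmed by the driver. */
#define SKETCH_MAX_PKT_LEN 1546u

/*
 * Returns true if the reported length looks plausible:
 * non-zero, no larger than the quoted threshold, and (assumption)
 * no larger than the bytes still available in the RX buffer.
 */
static bool rx_pkt_len_is_sane(const struct rx_pkt_stat_sketch *stat,
                               size_t bytes_left)
{
    if (stat->pkt_len == 0)
        return false;
    if (stat->pkt_len > SKETCH_MAX_PKT_LEN)
        return false;
    if (stat->pkt_len > bytes_left)
        return false;
    return true;
}
```

A caller would drop the packet (return early) whenever this returns false. As discussed further down, this only hides the symptom; the underlying RX length read is still wrong.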
<xdarklight> minute: uh, all testing with 802.11ac chipsets we've done so far is with SDR-50 or SDR-104 iirc
<xdarklight> I have an arch/arm64/boot/dts/amlogic/meson-sm1-x96-air-gbit.dts which inherits the MMC controller bits from arch/arm64/boot/dts/amlogic/meson-sm1-ac2xx.dtsi where we have: cap-sd-highspeed and sd-uhs-sdr104
<minute> xdarklight: ahh, so the freq is wrong (too low i guess?)
<xdarklight> minute: 25MHz is *default speed* SDIO, not even high speed - so I'll answer with "most likely: yes"
<minute> haha ok, i did not explicitly configure it, that was the mistake then
<minute> hmm, so it was already set up in meson-g12b-bananapi-cm4.dtsi
<minute> sd-uhs-sdr104;
<minute> max-frequency = <50000000>;
<minute> unclear why it shows up at 25
<xdarklight> minute: please add cap-sd-highspeed;
<xdarklight> minute: I think UHS SDR-104 won't work because that requires 1.8V IO lines (but that board seems to have 3.3V IO lines)
<xdarklight> minute: if I read the schematics from https://drive.google.com/file/d/1IXXok1P2OLiW3p8tavkbfEPTGTrM3b-R/view right, then they considered using 1.8V for VDDIO_X but, according to page 6, left that option NC (not connected) and went with 3.3V instead. not sure why one would do it that way
<minute> hmm :/
<minute> ok, with cap-sd-highspeed i get "mmc2: error -84 whilst initialising SDIO card"
<minute> i have to say i'm not an expert on sdio intricacies (yet)
<xdarklight> can you run the vendor kernel somehow and dump /sys/kernel/debug/sdio/ios there?
<minute> hmm, can't do that right now unfortunately
<minute> i can first check vendor dts perhaps
<xdarklight> yep, that would be a good next step too
<xdarklight> if we don't find anything then I suggest writing an email to linux-wireless, Cc'ing Jernej, Ping-Ke and myself and summarizing the issue (seemingly RX corruption at non high-speed modes, -84 / CRC errors with high-speed mode). you can get the relevant email addresses from https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=65371a3f14e73979958aea0db1e3bb456a296149
<xdarklight> Jernej is the other developer that worked on rtw88 SDIO support code and Ping-Ke is from Realtek - he's been super helpful and knows the hardware well (so there's a chance that he has an idea what may be wrong)
ldevulder has quit [Quit: Leaving]
<minute> xdarklight: thank you!
<minute> i also get a bunch of "unused phy status page (x)" with different numbers for x
<xdarklight> which seems like another hint that there's an RX buffer issue (of some kind)
<minute> yeah
elastic_1 has joined #linux-amlogic
elastic_dog has quit [Killed (calcium.libera.chat (Nickname regained by services))]
elastic_1 is now known as elastic_dog
elastic_dog has quit [Read error: Connection reset by peer]
elastic_dog has joined #linux-amlogic
JohnnyonFlame has joined #linux-amlogic
JohnnyonF has quit [Ping timeout: 240 seconds]
vagrantc has quit [Quit: leaving]
JohnnyonF has joined #linux-amlogic