NishanthMenon changed the topic of #openocd to: this is the place to discuss all things OpenOCD | Logs: https://libera.irclog.whitequark.org/openocd/
Helmholtz has joined #openocd
<Fleck> https://p.rullz.lv/goyodulale.cpp << so anyone can explain, why LINE 3, LDR R3 [PC, #4] is actually 080002E8 and not 080002E4? And why gdb does not show .word section and actually shows an instruction at that address? https://p.rullz.lv/kixonivono.cpp
<Hawk777> Fleck: IIRC on ARM PC-relative addressing is relative to the address of the instruction plus four, in other words what PC *would be* after execution completed if the instruction were four bytes long (as they all were in original ARM).
<Hawk777> Fleck: As for why it shows an instruction instead of .word, I would guess that some kind of ELF metadata is missing to tell it that location contains data and not code, or whatever tool you used to make this file doesn’t read that metadata, so it disassembled the bytes because it’s in a code section—as long as the bytes are what you want them to be, it doesn’t matter if they happen to also decode to a valid instruction.
tomtastic has quit [Ping timeout: 260 seconds]
tomtastic has joined #openocd
tsal has quit [Ping timeout: 252 seconds]
tsal has joined #openocd
boru` has joined #openocd
boru has quit [Killed (NickServ (GHOST command used by boru`))]
boru` is now known as boru
thinkfat has quit [Ping timeout: 265 seconds]
thinkfat_ has joined #openocd
[itchyjunk] has joined #openocd
emeb has quit [Quit: Leaving.]
[itchyjunk] has quit [Read error: Connection reset by peer]
rgr has joined #openocd
nerozero has joined #openocd
crabbedhaloablut has quit [Remote host closed the connection]
crabbedhaloablut has joined #openocd
Hawk777 has quit [Quit: Leaving.]
Haohmaru has joined #openocd
<Fleck> aaand Hawk777 just left ;)
<Fleck> I just got here :P
rgr has quit [Ping timeout: 265 seconds]
<Fleck> so if I understand Hawk777, in my example, E2 address would also execute (to get 4 bytes from E0) which gives PC=080002E4 and then add #4 as instructed? Confused...
tarekb has joined #openocd
tarekb has quit [Ping timeout: 260 seconds]
[itchyjunk] has joined #openocd
<karlp> or just count #4 as 4 words, not four bytes...
<karlp> or.... go and read the ISA.
<karlp> it's freely available, if this sort of thing is interesting to you.
<karlp> the instruction vs data is what hawk said though, you can't tell data and instructions apart from just binaries.
<karlp> but it still flags it as $data.
<Fleck> karlp: I am sorry, maybe I am really dumb and stupid... but: https://pasteboard.co/RVNWi3O84ycC.png
<Fleck> don't get angry at me, I am really trying here
<Fleck> but as I understand, it only works out if #4 is 4 halfwords...
<Fleck> or if each address is 32bits, that can't be true? How can we get unaligned access then...
<Fleck> also if each address is 32bit, then, #4 also doesn't work out...
tomtastic has quit [Ping timeout: 245 seconds]
gnom has quit [Read error: Connection reset by peer]
[itchyjunk] has quit [Remote host closed the connection]
<Fleck> OK, I actually found the answer: Calculate the PC or Align(PC, 4) value of the instruction. The PC value of an instruction is its address plus 4 for a Thumb(I am using thumb...) instruction, or plus 8 for an ARM instruction.
<Fleck> Example for ARM instruction set: https://pasteboard.co/BtSea3S2jHQJ.png +8... and #4 are just bytes, makes sense: 0x080002E0+4+4 = 0x080002E8!
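A minimal sketch of the rule Fleck quotes, assuming the LDR sits at 0x080002E0 as in his arithmetic. The function name `ldr_literal_address` and its parameters are made up for illustration; they are not from any tool mentioned in the channel.

```python
def ldr_literal_address(instr_addr: int, imm: int, thumb: bool = True) -> int:
    """Address read by `LDR Rt, [PC, #imm]` (hypothetical helper)."""
    # PC reads as the instruction address plus 4 in Thumb state, plus 8 in ARM state.
    pc = instr_addr + (4 if thumb else 8)
    # Literal loads use Align(PC, 4): clear the two low bits before adding the offset.
    base = pc & ~0x3
    return base + imm


print(hex(ldr_literal_address(0x080002E0, 4)))  # 0x80002e8
print(hex(ldr_literal_address(0x080002E2, 4)))  # 0x80002e8 as well, because of Align(PC, 4)
```

The second call shows why the alignment step matters: a 16-bit Thumb instruction at ...E2 with the same offset would read the same word.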
<Fleck> I wonder if ARM manual has error in it, #40 is decimal 40 or 0x40 as they say?
tomtastic has joined #openocd
<Haohmaru> arm .svd file format allows writing numbers with # prefix, but i think it was for binary
<Haohmaru> yeah, #01001110
wingsorc has joined #openocd
[itchyjunk] has joined #openocd
emeb has joined #openocd
[itchyjunk] has quit [Remote host closed the connection]
<Fleck> Haohmaru: well, in asm it's a constant
<Fleck> but usually in decimal AFAIK
* Haohmaru doesn't speak asm
<Fleck> same :)
<Haohmaru> but weren't there different syntaxes
Haohmaru has quit []
<Fleck> usually it's 0x for hex. w/o for decimal, b for binary, but, IDK...
jerwd has joined #openocd
nerozero has quit [Ping timeout: 265 seconds]
akaWolf has quit [Ping timeout: 252 seconds]
rudar has joined #openocd
akaWolf has joined #openocd
wingsorc has quit [Quit: Leaving]
<PaulFertser> jerwd: hey! I see you have some nice patches there, thank you for contributing!
<jerwd> Hey @PaulFertser! Thanks! I'm really happy to contribute. I've got even more on the way actually. Mostly around the ARM Coresight stuff.
<PaulFertser> jerwd: I wonder if you know some article/guide/success story/conference talk about using those advanced features. I guess most developers, even those who frequently contribute to kernel have no clue how to approach that.
<jerwd> I wish I had more information like that I could share. If I get the opportunity to share more things like that, I would be happy to though.
<PaulFertser> jerwd: you probably can participate in some conference yourself and give an interesting talk. E.g. on ELC
<PaulFertser> ELCE
<jerwd> that's a great idea. I'll look into that.
<jerwd> oh wow. that would be really cool. have to see if I could swing that with work.
zjason` has joined #openocd
rgr has joined #openocd
<PaulFertser> jerwd: you can make a talk how to use those facilities to do something like http://www2.futureware.at/~philipp/ssd/TheMissingManual.pdf ;)
zjason has quit [Ping timeout: 265 seconds]
<jerwd> yep. ARM gives tons of these tools, but they have to be wired up in the ASIC at design time. It's pretty neat what you can do with it though.
<jerwd> I think some of the ASICs out there do show how they have things connected in their docs if they are used.
<PaulFertser> jerwd: yes, and some have proper "ROM table".
<jerwd> Key word being _proper_ :)
<jerwd> Wow, that manual you linked is quite extensive.
<PaulFertser> Too bad the vendors do not provide essential info like that :(
<jerwd> Yeah, that's very rare these days unfortunately.
<jerwd> Not like the good 'ol apple ii where they even gave you the schematics. :)
<clever> jerwd: ive also seen some devices, where the jtag port appears to be wired to an internal gpio-ish pin, that can be bit-banged with mmio from a 2nd cpu
<clever> i still need to investigate how exactly that thing works
<jerwd> oh yeah, I've seen some variations on that theme over the years.
<clever> jerwd: there is also a config flag, on if the jtag is routed to gpio or mmio
<clever> and once the arm core has come out of reset, you cant change any of those flags
<jerwd> that's cool, which parts have you seen that on?
<clever> jerwd: all of the rpi soc's
<jerwd> ah sweet!
<jerwd> I have quite a few of those things laying around.
<jerwd> although, if you use the CPU over JTAG MMIO to stop the core, could be problematic :)
<clever> a non-arm cpu
<clever> it has 2 entirely independent jtag ports
<jerwd> ah, gotcha.
<clever> line 266-260 will configure the pinmux array to expose the arm jtag
<jerwd> oh so they have helper cpus in that part then? besides the normal
<jerwd> ARMs that is?
<clever> line 310 maps the arm jtag to gpio
<clever> yeah, the entire rpi soc lineup is backwards
<clever> the arm is a secondary cpu core
<jerwd> I know that's pretty common in Xilinx UltraZynqs too.
<clever> the master cpu core is the "gpu" core
<jerwd> Yeah, the Xilinx UltraZynq's have a couple R5's plus the A53's and some hard core uBlazes in there for good measure too.
<clever> on powerup, the VPU runs code from a maskrom, and the arm is entirely off
<clever> the VPU is a variant of a synopsys ARC DSP
<jerwd> Ah yep. So it's a bootstrap/security processor then sounds like.
<jerwd> Cool. Didn't know that.
<clever> yeah
<clever> in the old days, the VPU was the only cpu core on the chip
<jerwd> hehe, yep
<clever> and one of the engineers said "throw an arm in there, we may use it later"
<clever> and the bcm2835 was born!
<jerwd> hahaha.
<clever> there is a dedicated mmu between the arm and the main bus
<clever> so you can firewall off any chunk of ram, or even the whole mmio
<clever> arm physical, is not physical
<jerwd> Yep.
<jerwd> Zynq has SMMUs in their design for that purpose.
<clever> that mmu can also be configured to block MMIO from arm userland
<clever> i spent months trying to get /dev/mem to work in linux
<jerwd> Yep. I've definitely seen that in the Zynq
<clever> because the released header files, have that parameter labeled backwards
<jerwd> oh man
<jerwd> Yeah IOMMUs can make things extremely complicated.
<clever> these 4 constants, configure how the jtag functions
<clever> its either off, gpio, or "jtag bash"
<clever> and then down on line 380, you have the "jtag bash" mmio regs
<jerwd> huh, interesting.
<clever> 3 registers, for TMS, TDI, and TDO
<clever> and a less clear CONF register
<jerwd> yep, for bitbanging it.
<jerwd> TCK as well I assume?
<clever> lines 384-393 have bit lengths
<jerwd> oh, so it does the clocking for you then?
<clever> thats my guess
<jerwd> gotcha
<clever> i have yet to play with that
<jerwd> nice
<clever> this chunk of code configures the MMU mappings
<clever> the arm side is cut up into 64 pages, of 16mb each, covering exactly 1gig of the physical space
<jerwd> yeah, we used FPGA logic to do the clocking of data. I have that in another patchset for openocd here.
<clever> for each page, you can specify a bus address, with a 2mb resolution
<clever> the lack of 16mb alignment on the bus side, implies its got a full adder
<jerwd> interesting
<clever> rather than just dumb bit concats
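A rough sketch of the translation clever describes: 64 ARM-side pages of 16 MiB, each pointing at a bus address with 2 MiB resolution. The table layout (one integer per page, in 2 MiB units) and the name `arm_to_bus` are assumptions for illustration; the real register encoding is not given in the log.

```python
PAGE_SIZE = 16 * 1024 * 1024    # 16 MiB per ARM-side page
PAGE_COUNT = 64                 # 64 pages cover exactly 1 GiB
BUS_GRANULE = 2 * 1024 * 1024   # bus base addresses come in 2 MiB steps


def arm_to_bus(arm_phys, page_table):
    """Translate an ARM 'physical' address to a bus address.

    page_table[i] is a hypothetical entry: the bus base of page i,
    expressed in 2 MiB units.
    """
    if not 0 <= arm_phys < PAGE_SIZE * PAGE_COUNT:
        raise ValueError("address outside the 1 GiB ARM window")
    page, offset = divmod(arm_phys, PAGE_SIZE)
    bus_base = page_table[page] * BUS_GRANULE
    # The base need not be 16 MiB aligned, so base and offset can both have
    # bits set in positions 21..23. Concatenating (OR-ing) them would lose
    # the carry, which is why a real add is needed.
    return bus_base + offset


# Made-up table: every page shifted up by one 2 MiB granule.
table = [i * 8 + 1 for i in range(PAGE_COUNT)]
print(hex(arm_to_bus(0x00E00000, table)))   # 0x1000000
print(hex(0x00200000 | 0x00E00000))         # 0xe00000, what a bit concat would give
```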
<clever> and this PROT stuff, controls if userland can do mmio
<jerwd> on another project of mine, I ended up using the RPi PWM/DMA/FIFO to drive some of those WS281X LEDs. Bitbanging in Linux, with time critical frequencies wasn't working for obvious reasons.
<clever> jerwd: the thing that still makes zero sense to this day, is this "bresp" table.....
<clever> jerwd: the DPI can also bit-bang those kind of things, even harder than the pwm can
<clever> basically, the DPI is a digital video output port
<jerwd> yep
<clever> pixel clock, data-valid (non-blanking area), hsync, vsync, 24bits of raw digital color
<clever> who says it has to be image data?
<jerwd> definitely
<clever> this uses DRM to hijack the dpi port, and spit out 24 uart streams at once
<clever> the only problem, is that all 24 bits go low during the blanking period
<clever> and hsync/vsync can only go as low as 1
<jerwd> nice!
<clever> uart would consider a whole scanline worth of low as a break signal, and a 1 pixel low every line could be tricky to deal with
<clever> but if your protocol idles low to begin with, that is far less of a problem
<clever> or just use some transistors to both boost the voltage, and invert the signal
<jerwd> yeah, have to invert. uart typically idles high if I remember right.
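To make the "24 uart streams on a video port" idea concrete, here is a sketch of how one byte per lane could be packed into pixel words. It assumes 8N1 framing, one pixel per UART bit time, and that lane n maps to bit n of the 24-bit color word; none of these details are stated in the log, and the function names are invented.

```python
def uart_frame_bits(byte):
    """8N1 framing: start bit (0), 8 data bits LSB first, stop bit (1)."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]


def pack_lanes(lane_bytes):
    """Turn one byte per lane (up to 24 lanes) into 10 pixel words.

    Bit n of each word is the line level of lane n during one bit time.
    Unused lanes are held at the UART idle level (1).
    """
    if len(lane_bytes) > 24:
        raise ValueError("DPI has only 24 data bits")
    frames = [uart_frame_bits(b) for b in lane_bytes]
    frames += [[1] * 10] * (24 - len(frames))   # idle-high filler lanes
    words = []
    for t in range(10):
        word = 0
        for lane, frame in enumerate(frames):
            word |= frame[t] << lane
        words.append(word)
    return words


words = pack_lanes([0x55])
print([hex(w) for w in words])
# With an inverting transistor stage, the port would output the complement,
# so the all-low blanking period reads as idle on the wire:
inverted = [w ^ 0xFFFFFF for w in words]
```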
<clever> jerwd: every rpi model also has a full 2d composition engine, https://www.youtube.com/watch?v=JFmCin3EJIs is a demo i made of that
<jerwd> cool
<clever> and i couldnt stop making demos, https://www.youtube.com/watch?v=GHDh9RYg6WI
<clever> this is driving both the 2d and the 3d core, at the same time, and only using 2% of the cpu power
<clever> at only ~200mhz
<jerwd> Nice!
<clever> the arm core isnt even turned on in that demo
<clever> so you have 4 arm cores unused, 1 vpu unused, and 98% of a vpu just sitting idle
<jerwd> hehe, neat
<clever> where is it...
<clever> and in this video, i booted the system in just 0.61 seconds, to a working graphical console, with ext4 support
<clever> the entire application, fit inside a 109kb bootcode.bin
<clever> no other stages exist, no closed firmware involved
<clever> s/video/image/
<clever> i could possibly get the tga slideshow program to fit in there as well
<clever> and then it can load tga files from the SD card, and cycle thru them
<clever> and boot faster than the damn tv can turn on, lol
<jerwd> nice
<clever> jerwd: the hard part of developing all of this, is the major lack of documentation
<clever> the VPU jtag port is entirely undocumented, so my only way to debug things is with printf
<clever> and without docs for a lot of the chip, i'm basically debugging blind
<clever> jerwd: for example, i tried getting the camera port working lately, and the linux driver malfunctioned in ~4 different ways
<clever> 1: a lot of interrupt related registers, are not in what the engineer says is the reset default state
<clever> so the instant linux registers the irq handler, it gets interrupts, before its ready, and segfaults
<clever> and even fixing that, it gets interrupts endlessly, and can never do anything else
<clever> 2: the error handling around the clock management code was broken, and it never got noticed before, because it never actually errored out
<clever> 3: the power management code is entirely in the blob, so the analog PHY cant even turn on
<clever> jerwd: even fixing all of those, i still cant receive any image data
<jerwd> Isn't the camera already supported, or is that only in the proprietary driver?
<clever> jerwd: the camera is supported under both a closed and open camera stack
<jerwd> ah okay gotcha.
<clever> but the "open" camera stack still relies on the closed firmware to manage one clock freq, and the analog power domains
rudar has quit [Quit: Leaving]
<jerwd> ah, gotcha. so this is what you're reverse engineering then.
<clever> yeah
<jerwd> cool
<clever> this is the driver for the CSI peripheral
<clever> it takes raw CSI data from the camera, and shoves it into buffers in ram
<clever> jerwd: i suspect that i'm not turning the power domain on properly, because the registers in the unicam arent in the reset default
<clever> and i dont know which power domain its even in
<clever> i just know that i'm turning it on by accident while booting, lol
<clever> i think the MEMREP step, is what returns registers back to the reset default
<jerwd> My experience with reading/writing to an unpowered domain is that you'll just get a bus hang and complete lockup. Maybe the endpoint is in reset?
<clever> this platform has protections for that
<clever> reads from an unpowered device return 0xdeadbeef
<clever> and also raise an exception on the VPU
<clever> exceptions function basically the same as interrupts
<clever> jerwd: the vector table is an array of uint32_t[128]
<jerwd> ah cool
<clever> 32 slots for cpu exceptions, like divide by zero, or bus errors (caused by other bus masters)
<clever> 32 slots for software interrupts (like `int 0x80` on x86)
<clever> then 64 slots for peripheral interrupts
<clever> since all opcodes must be 16bit aligned, bit0 of the addr is assumed to be 0
<clever> the bit0 in the vector table, is then repurposed for a secure vs non-secure flag
<clever> the VPU has a (non)secure state flag somewhere
<clever> and you can configure any exception handler to be handled in the current? mode or the secure mode
<clever> the official firmware, runs in non-secure mode, and certain security related MMIO are firewalled off
<clever> there is then a syscall like api using software interrupts, to call certain special functions
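A small sketch of how such a vector table entry could be decoded. The 32/32/64 split and the use of bit0 come from clever's description; the order of the three groups in the table, the polarity of the flag, and the helper names are assumptions.

```python
FLAG_BIT = 0x1  # bit0 of each entry; which value means "secure" is an assumption


def decode_vector(entry):
    """Split a 32-bit vector table entry into (handler_address, flag)."""
    # Opcodes are 16-bit aligned, so bit0 of the real address is always 0.
    handler = entry & ~FLAG_BIT & 0xFFFFFFFF
    flag = bool(entry & FLAG_BIT)
    return handler, flag


def vector_kind(index):
    """Classify a slot in the 128-entry table using the 32/32/64 split."""
    if not 0 <= index < 128:
        raise ValueError("vector index out of range")
    if index < 32:
        return "exception"
    if index < 64:
        return "software interrupt"
    return "peripheral interrupt"


handler, flag = decode_vector(0x80001235)
print(hex(handler), flag, vector_kind(64))
```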
sbach has quit [Read error: Connection reset by peer]
sbach has joined #openocd
rgr has quit [Ping timeout: 245 seconds]
<clever> jerwd: also, all of the camera stuff i said so far, is purely just to get raw bayer images into ram
<clever> the agc/awb control loops, and converting bayer into rgb/yuv, involve a hw component that is still blob controlled, and the engineers have stated that it will never get proper drivers from them
<jerwd> ah bummer.
<clever> the ISP is that secret sauce
<clever> from what ive heard, it can do re-packing, debayer, lens shading, various stats collection, pixel format conversions, scaling, and several conversions to/from planar formats
<clever> for example, the camera returns 10bit image data, packed in a rather weird format
<clever> first, you have 4 bytes containing the upper 8 bits from 4 pixels
<clever> then you have 1 byte, containing the lower 2 bits, from the same 4 pixels
<clever> so, to get the full value for pixel 1, you need to do `uint16_t v = (buffer[0] << 2) | (buffer[4] & 0x3)`
<clever> and then pixel 2, is (buffer[1] << 2) | ((buffer[4] >> 2) & 0x3)
<jerwd> that's strange
<clever> jerwd: and because its in a bayer format (https://en.wikipedia.org/wiki/File:Bayer_pattern_on_sensor.svg), 3 out of every 4 pixels, is missing the blue component
<clever> 2 out of every 4 pixels is missing the green component
<jerwd> maybe they use that format somehow for hardware acceleration.
<clever> and 3 out of every 4, is missing the red
<clever> i think that wonky 10bit pattern, is for users that only want 8bit data
<clever> just ignore the 5th byte, and you're left with the upper 8bits from each 10bit int
<jerwd> yeah, maybe so
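The packing clever describes (4 bytes of upper bits, then 1 byte holding the low 2 bits of the same 4 pixels, first pixel in the lowest bits) can be unpacked like this. A sketch only; the function names are invented, and the input is assumed to be a whole number of 5-byte groups.

```python
def unpack_raw10(buffer):
    """Unpack 10-bit pixels stored as 4 high bytes plus 1 byte of low bits."""
    if len(buffer) % 5:
        raise ValueError("expected a multiple of 5 bytes")
    pixels = []
    for base in range(0, len(buffer), 5):
        low_bits = buffer[base + 4]
        for i in range(4):
            high = buffer[base + i] << 2
            # Shift the 2-bit field down before masking, so it lands in bits 1..0.
            low = (low_bits >> (2 * i)) & 0x3
            pixels.append(high | low)
    return pixels


def upper8(buffer):
    """Drop every 5th byte to keep just the upper 8 bits of each pixel."""
    return bytes(b for i, b in enumerate(buffer) if i % 5 != 4)


group = bytes([0xFF, 0x00, 0x80, 0x01, 0b11100100])
print(unpack_raw10(group))   # [1020, 1, 514, 7]
print(list(upper8(group)))   # [255, 0, 128, 1]
```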
<clever> when using the "open" stack, the agc loop involves a few steps
<clever> first, libcamera (open source) feeds the raw frame to the blob over an RPC, and the ISP does *magic*
<clever> and in addition to an rgb/yuv frame coming out, you also get a set of stats, saying what the average brightness of the frame was
<clever> a control loop in libcamera, then uses i2c to send commands to the csi sensor, to adjust both the analog gains, and the shutter exposure time
<clever> and the end result, is holding a desired brightness level
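For readers who have not seen such a loop, here is a deliberately simplified proportional controller. It is not libcamera's algorithm: the policy (raise exposure first, then analog gain), the limits, and the name `agc_step` are all assumptions, and the I2C writes to the sensor are left out.

```python
def agc_step(measured, target, exposure_us, gain,
             max_exposure_us=33000, max_gain=8.0):
    """One iteration of a simple exposure/gain controller (hypothetical)."""
    if measured <= 0:
        measured = 1e-6   # avoid dividing by zero on a fully black frame
    # Total "exposure x gain" needed to move the measured brightness to the target.
    wanted = exposure_us * gain * (target / measured)
    # Prefer a longer exposure and only raise the gain once exposure is maxed out.
    new_exposure = min(max(wanted, 1.0), max_exposure_us)
    new_gain = min(max(wanted / new_exposure, 1.0), max_gain)
    return new_exposure, new_gain


# Frame came out half as bright as desired: double the exposure, keep gain at 1.
print(agc_step(measured=50, target=100, exposure_us=10000, gain=1.0))
```

A real loop would also damp the correction and account for the frames of delay between writing the sensor registers and seeing the effect in the stats.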
<jerwd> interesting
<clever> jerwd: page 40, figure 17, shows each step of processing that happens within the ISP
<clever> the image data goes along the top path, going thru each stage, some of which output stats
<clever> others take data in from the stats
<clever> jerwd: i'm also starting to doubt whether the camera sensor is even on...
<clever> its supposed to be doing some 500mhz DDR stuff, but i'm not picking up any emi when i put a scope probe near the bus
<clever> jerwd: modified the sensor driver, to record every single i2c transaction, now to find a datasheet leak....
<clever> ah, this one doesnt seem like a leak
<clever> it feels more like a proper release
<clever> jerwd: ah, that looks useful: the camera sensor can be configured to just emit a static color bar pattern, entirely ignoring the actual image sensor element
<clever> that would at least eliminate the agc loop from the picture
[itchyjunk] has joined #openocd
jerwd has quit [Quit: Client closed]
emeb has quit [Quit: Leaving.]