f_ changed the topic of ##raspberrypi-internals to: The inner workings of the Raspberry Pi (Low level VPU/HW) -- for general queries please visit #raspberrypi -- open firmware: https://librerpi.github.io/ -- VC4 VPU Programmers Manual: https://github.com/hermanhermitage/videocoreiv/wiki -- chat logs: https://libera.irclog.whitequark.org/~h~raspberrypi-internals -- bridged to matrix and discord
jcea has quit [Ping timeout: 268 seconds]
Ad0 has quit [Ping timeout: 260 seconds]
bonda_000 has joined ##raspberrypi-internals
<bonda_000> clever: u here?
Ad0 has joined ##raspberrypi-internals
bonda_000 has quit [Ping timeout: 268 seconds]
bonda_000 has joined ##raspberrypi-internals
jcea has joined ##raspberrypi-internals
<juri_> clever is always here. :)
CompanionCube has quit [Quit: ZNC - http://znc.in]
CompanionCube has joined ##raspberrypi-internals
<f_[xmpp]> juri_ ;)
<f_[xmpp]> many are always there :P
f_ has joined ##raspberrypi-internals
<juri_> always here, but not always in this timezone. :)
Stromeko has quit [Ping timeout: 260 seconds]
Stromeko has joined ##raspberrypi-internals
bonda_000 has quit [Ping timeout: 268 seconds]
bonda_000 has joined ##raspberrypi-internals
<f_> juri_: ;)
<f_ridge> <x​2x6_/D> Hi
<juri_> Hi!
<bonda_000> are you guys working on reverse engineering the vpu code as well? juri: f_ridge:
<bonda_000> it's been quiet here since I joined, only talked to clever for the most part
f_ has quit [Quit: To contact me, send a memo using MemoServ, PM f_[xmpp], or send an email. See https://vitali64.duckdns.org/.]
<f_ridge> <x​2x6_/D> I do that from time to time. Especially now that Clever has pointed me to an unstripped version of start_x. Before that it was a much tougher experience
user_user has joined ##raspberrypi-internals
user_user has quit [Quit: leaving]
user_user has joined ##raspberrypi-internals
<user_user> I want to measure exact periods of time to understand how my functions are performing in bare-metal code on a Pi 3
<clever> user_user: i think the best option is to just read ST_CLO before and after running your function, but the resolution is only 1 µs
<clever> the arm timer might also work, if its not being reset by other things
<clever> oh, and the arm has dedicated performance counters you could look into
<user_user> I think I should use the system timer, but I remember seeing some controversial info somewhere that the videocore tweaks clocks for the ARM cpu.
<clever> i think those can count things like clock cycles elapsed, and cache misses
<clever> assuming you're not having thermal throttling, the VPU will only change the arm clock when you ask it to
<user_user> So if it does so I can not really rely on anything that is clock based?
<clever> usually via the linux cpufreq driver
<clever> ST_CLO is always clocked at 1 MHz, that never changes
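[editor's sketch] The ST_CLO approach clever describes could look roughly like the C below. The address 0x3F003004 assumes the usual Pi 3 ARM-side peripheral base (0x3F000000) plus the system timer's CLO offset; the function names are made up for illustration. The register is read through a pointer so the same code can be aimed at a fake counter when trying it on a host machine.

```c
#include <stdint.h>

/* Assumption: Pi 3 ARM physical peripheral base 0x3F000000,
 * system timer block at +0x3000, CLO (low 32 bits) at +0x04. */
#define ST_CLO_ADDR 0x3F003004u

/* Read the low word of the free-running 1 MHz counter. */
static inline uint32_t st_read(const volatile uint32_t *clo)
{
    return *clo;
}

/* Elapsed microseconds between two readings. Unsigned subtraction
 * gives the right answer across a single 32-bit wraparound. */
static inline uint32_t st_elapsed_us(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Hypothetical helper: time one call of fn, in microseconds. */
uint32_t time_call_us(const volatile uint32_t *clo, void (*fn)(void))
{
    uint32_t t0 = st_read(clo);
    fn();
    return st_elapsed_us(t0, st_read(clo));
}
```

On real hardware the call would be something like `time_call_us((const volatile uint32_t *)ST_CLO_ADDR, my_function)`. The 32-bit counter wraps roughly every 71.6 minutes, so this only suits intervals shorter than that, and anything under a microsecond needs the ARM cycle counter instead.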
user_user has quit [Quit: leaving]
<bonda_000> so far figured out
<bonda_000> the first program that is run after the init
<bonda_000> vmcs_app
<bonda_000> ends up in vmcs_task_alloc_ex that creates first 12 threads
<bonda_000> I'm not sure if you call those queues or threads
<bonda_000> clever: do you consider this a code obfuscation if values are swapped back and forth between lo regs, the stack, and high regs?
<bonda_000> I notice that the nested function calls of vcos/tx tend to reach back to the stack of the caller function to grab something
<clever> bonda_000: is it just doing sp + offset to go back in the stack, or is it being passed a pointer?
<bonda_000> one that I am at right now, had sp passed as an argument and yeah goes back up the stack
<clever> the only time ive noticed any real code obfuscation, is when dealing with some of the hmac keys in start.elf
<clever> it has a big array of the expected keys, but they have all been xor'd with a constant
<clever> so you have to xor each of them with that constant, to get the real key
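[editor's sketch] The XOR masking clever describes is easy to illustrate in C. Nothing here comes from start.elf: the function name is hypothetical, and the log does not say how long the constant is, so the sketch assumes a byte string that repeats across the key.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR every byte of key with a repeating mask, in place.
 * XOR is its own inverse, so the same call both masks and unmasks.
 * Assumption: the constant is a byte string repeated over the key;
 * the real firmware may use a different width or layout. */
void xor_unmask(uint8_t *key, size_t key_len,
                const uint8_t *mask, size_t mask_len)
{
    if (mask_len == 0)
        return; /* nothing to apply, and avoids a modulo by zero */

    for (size_t i = 0; i < key_len; i++)
        key[i] ^= mask[i % mask_len];
}
```

Running it twice with the same mask returns the original bytes, which is why storing the table masked costs the firmware almost nothing at runtime.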
<bonda_000> vcos_thread_create param_3 is a stack pointer of a caller, vcos_thread_create_classic
<clever> that sounds like its just a struct allocated on the local stack frame
<clever> so basically, `struct foo; bar(&foo);`
<clever> which is totally normal c stuff
<bonda_000> uh, sort of
<bonda_000> default_attrs
<bonda_000> got copied there
<clever> ah yes, the thread attributes, i saw that recently
<bonda_000> and then
<clever> thats just initializing the thread attributes to a default value
<bonda_000> yeah it has a pointer to 0x2000 bytes of malloced memory
<bonda_000> then a constant 2000h
<bonda_000> nvm, that's not a default
<bonda_000> that's for these first 12 threads
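[editor's sketch] The pattern being discussed, a caller copying default attributes into a struct on its own stack frame, overriding a few fields, and passing a pointer down, might look like this in C. The struct layout, field names, and default values are invented for illustration and are not the real VCOS definitions; only the 0x2000 stack size comes from the log.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical attribute block, NOT the real VCOS layout. */
struct thread_attrs {
    void  *stack;       /* base of the thread's stack */
    size_t stack_size;  /* size of that stack in bytes */
    int    priority;
};

/* Made-up defaults, standing in for default_attrs. */
static const struct thread_attrs default_attrs = { NULL, 0x1000, 10 };

/* Stand-in for the inner create call: it only sees a pointer, which
 * in a decompiler shows up as "reaching into the caller's stack". */
int thread_create(const struct thread_attrs *attrs)
{
    return (attrs->stack != NULL && attrs->stack_size != 0) ? 0 : -1;
}

/* Stand-in for the outer wrapper: attrs lives in this stack frame. */
int thread_create_classic(size_t stack_size)
{
    struct thread_attrs attrs = default_attrs; /* copy the defaults */

    attrs.stack = malloc(stack_size);          /* e.g. 0x2000 bytes */
    if (attrs.stack == NULL)
        return -1;
    attrs.stack_size = stack_size;

    int rc = thread_create(&attrs);            /* pointer to a local */
    free(attrs.stack); /* sketch only: a real thread keeps its stack */
    return rc;
}
```

Seen from inside `thread_create`, the argument is just an address in the caller's frame, which matches the "param_3 is a stack pointer of a caller" observation without any obfuscation being involved.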
<bonda_000> and then
<bonda_000> it has hidden two function pointers
<bonda_000> on a stack and into a thread_struct that is of size 1F4
<bonda_000> vmcs_app_message_handler() is the one you want to take a look at
<bonda_000> it's not a message handler but it's basically the meat of the VPU code
<bonda_000> it initiates everything
<bonda_000> arm, linux kernel, cameras, hdmi
<clever> yeah, vmcs_app_message_handler, seems to be starting nearly everything on the system
<bonda_000> so hopefully I can come across the place where the last entry of this struct is being referenced
<clever> bonda_000: oh, found something interesting, vmcs_initialise_auto_vchi_services
<clever> at cma_service_start_info, is an array of things, each 0xc bytes long
<clever> you have a byte telling vmcs_initialise_auto_vchi_services how to call a function, a function pointer, and an unknown 32bit int
<clever> correction, a pointer to a char*
<clever> that launches 3 services, cma, mmal, and wdog
<bonda_000> what is cma?
<clever> contiguous memory allocator
<bonda_000> oh yeah
<clever> mmal_server_start, is the thing we looked at yesterday!
<bonda_000> I dont quite understand what its doing
<bonda_000> adding 4 to a string and calling it as a code
<clever> its not code
<clever> correction
<clever> its not a string
<clever> bonda_000: https://imgur.com/a/2bAGxbD
<bonda_000> char *
<clever> its an array of 3 `struct { byte something; byte padding[3]; code *fun; char *name; }`
<bonda_000> pcVar3 = "\x01"
<bonda_000> char *pcVar3;
<clever> thats testing that the first byte is a 1
<clever> ghidra wrongly assumed its a string
<bonda_000> don't even know what "\x01" is, but it could be a 32 bit address of something
<clever> its just 1
<bonda_000> so then it does
<bonda_000> iVar1 = (**(code **)(pcVar3 + 4))(param_1,param_2,param_3,*(code **)(pcVar3 + 4));
<bonda_000> oh yeah
<bonda_000> how much we are stepping depends on what variable type it is
<bonda_000> but still
<bonda_000> char pointer plus four thats a 16 byte step forward
<bonda_000> ah yeah its the lea r8, cma_service_start_info
<bonda_000> that didn't make its way to the decompile window
<clever> if the byte is 0, it calls the function pointer with 3 args, the same 3 vmcs_initialise_auto_vchi_services received
<clever> if the byte is 1, it calls it with just 1 argument
<clever> i think technically, its loading the address of __VCHIQ_SERVICES_START and __VCHIQ_SERVICES_END
<clever> but there is then also a symbol for each element in that array
<clever> so cma_service_start_info == __VCHIQ_SERVICES_START
<clever> so it starts at __VCHIQ_SERVICES_START, reads a 12 byte thing, increments by 12, and repeats, until it hits __VCHIQ_SERVICES_END
<clever> fairly standard looping mechanics
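[editor's sketch] The table walk clever describes can be written out in C as below. The struct follows the layout quoted in the log (a byte, three padding bytes, a function pointer, a name pointer), which is 12 bytes on the 32-bit VPU but larger on a 64-bit host. Everything else is assumed: the type names, the demo services, and in particular which argument the one-argument call style receives.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*generic_fn)(void);
typedef int (*start3_fn)(void *, void *, void *);
typedef int (*start1_fn)(void *);

struct service_start_info {
    uint8_t     call_style;  /* 0: call with 3 args, 1: call with 1 */
    uint8_t     padding[3];
    generic_fn  fun;         /* cast back to its real type to call */
    const char *name;
};

/* Walk [start, end) one entry at a time, like the loop between
 * __VCHIQ_SERVICES_START and __VCHIQ_SERVICES_END.
 * Returns how many entries were called. */
int start_services(const struct service_start_info *start,
                   const struct service_start_info *end,
                   void *a, void *b, void *c)
{
    int started = 0;
    for (const struct service_start_info *s = start; s != end; ++s) {
        if (s->call_style == 0)
            ((start3_fn)s->fun)(a, b, c);
        else if (s->call_style == 1)
            ((start1_fn)s->fun)(a); /* assumption: first arg is passed */
        else
            continue;               /* unknown style: skip the entry */
        ++started;
    }
    return started;
}

/* Demo services so the sketch can be exercised on a host. */
static int calls3, calls1;
static int demo_three(void *a, void *b, void *c)
{
    (void)a; (void)b; (void)c;
    return ++calls3;
}
static int demo_one(void *a)
{
    (void)a;
    return ++calls1;
}

static const struct service_start_info demo_table[] = {
    { 0, {0, 0, 0}, (generic_fn)demo_three, "three_args" },
    { 1, {0, 0, 0}, (generic_fn)demo_one,   "one_arg"    },
};
```

In the firmware the start and end pointers would come from linker symbols rather than a C array, which is why Ghidra shows one symbol per element plus the bracketing pair.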
<bonda_000> so it's a contiguous memory allocator for a chip that has no MMU?
<bonda_000> what I think is, if you just remove arm_loader and kernel_load from that function, you have all the VPU to yourself
<clever> pretty much
<clever> but you would still need symbols and headers, to properly call the thread/malloc stuff
<clever> and its simpler to just port your own things, like i did with LK
<bonda_000> well good thing is that it fits into a bootcode.bin format
<bonda_000> if you go start_x.elf that's extra copying of bytes and slower startup time
<bonda_000> it takes a minute on Ethernet to get to the end of 2nd stage
<clever> that sounds like serious network problems
<clever> mine can boot in seconds
<bonda_000> "a minute" not literally
<bonda_000> but you can see it gives sd card a timeout
<bonda_000> not instantly jump to secondary options
<clever> ah, put an SD card in it, with no files on it
<clever> then it wont have timeouts
<clever> https://imgur.com/Yeh6wEr this is the bootcode-fast-ntsc target from lk-overlay
<clever> it boots (from sd) to a video output, in just 0.61 seconds
<clever> the tv itself takes longer than that to turn on, lol
<bonda_000> that's the audio jack to composite video you told me about?
<clever> yep
<bonda_000> and that's within the vec module correct? to overtake the audio jack
<clever> yeah