azonenberg changed the topic of #scopehal to: libscopehal, libscopeprotocols, and glscopeclient development and testing | https://github.com/glscopeclient/scopehal-apps | Logs: https://libera.irclog.whitequark.org/scopehal
t4nk_freenode has quit [Quit: ZNC 1.8.2 - https://znc.in]
t4nk_freenode has joined #scopehal
azonenberg has joined #scopehal
veegee has joined #scopehal
<azonenberg> Welp
<azonenberg> Back online after an unplanned lab shutdown lol
<azonenberg> Hit the UPS bypass switches in the wrong order and Bad Things(tm) happened
<azonenberg> Appears the only casualty was a HEPA air cleaner and the aging SBC I was using as a serial console server
<azonenberg> anyway, since everything was down I put the GPUs in the xen server
<azonenberg> Attached one to a VM and it enumerates but i'm still testing to see if i'll have full vulkan etc in the VM
<azonenberg> (gotta finish bringing other services online still)
benishor_ has joined #scopehal
benishor has quit [Quit: tah tah!]
benishor_ is now known as benishor
<Darius> oops
Degi_ has joined #scopehal
Degi has quit [Ping timeout: 246 seconds]
Degi_ is now known as Degi
<d1b2> <azonenberg> @johnsel
<d1b2> <johnsel> in a vm?
<d1b2> <azonenberg> That's a xen instance, yes
<d1b2> <johnsel> hurray
<d1b2> <johnsel> shame after all my hours you took this moment of glory away from me
<d1b2> <azonenberg> One card is currently attached to that vm, one is free. one of the amd cards is present in the chassis but i think that pcie slot is unavailable due to conflicts with the m.2 or something
<d1b2> <johnsel> but I'm not complaining
<d1b2> <johnsel> well, I am, lol, but still :p
<d1b2> <azonenberg> all i had to do was disable nouveau and throw the nvidia blob on it
<d1b2> <azonenberg> anyway that's one of my test instances that i'll shut down shortly. the second card is currently uncommitted
<d1b2> <azonenberg> let me know if you need me to attach a card to a vm for you, i forget if you can do that
<d1b2> <johnsel> yeah NVidia drivers are pretty great nowadays under Linux, they stuffed everything in the on-device firmware
<d1b2> <azonenberg> yeah i was impressed. i remember going amd originally for this box because of how anti-virtualization nvidia had been
<d1b2> <johnsel> yeah it's a shame AMD is lagging behind so much on this front
<d1b2> <azonenberg> anyway, one step closer to working CI
<d1b2> <azonenberg> let me know what support if any you need
<d1b2> <johnsel> they seem much more interested in putting their SoCs in Teslas
<d1b2> <johnsel> yes I do need your help getting it attached
<d1b2> <johnsel> I have permissions to them but can't see them due to the stupid UI
<d1b2> <johnsel> unless they patched that
<d1b2> <johnsel> which our repo based instance of the management software doesn't allow
<d1b2> <johnsel> let me get connected to the VPN real quick and let you know what I need
<d1b2> <johnsel> can you pm me the ip address of 'my' xoa instance?
<d1b2> <johnsel> and dns please
<d1b2> <johnsel> @azonenberg
benishor has quit [*.net *.split]
azonenberg has quit [*.net *.split]
veegee has quit [*.net *.split]
t4nk_freenode has quit [*.net *.split]
Bird|otherbox has quit [*.net *.split]
tnt has quit [*.net *.split]
florolf has quit [*.net *.split]
d1b2 has quit [*.net *.split]
syscall has quit [*.net *.split]
Stary has quit [*.net *.split]
mxshift has quit [*.net *.split]
lethalbit has quit [*.net *.split]
Ekho has quit [*.net *.split]
gruetzkopf has quit [*.net *.split]
elms has quit [*.net *.split]
anuejn has quit [*.net *.split]
Yamakaja has quit [*.net *.split]
electronic_eel has quit [*.net *.split]
juh has quit [*.net *.split]
davidc__ has quit [*.net *.split]
esden has quit [*.net *.split]
mithro has quit [*.net *.split]
welterde has quit [*.net *.split]
Stephie has quit [*.net *.split]
juri_ has quit [*.net *.split]
Fridtjof has quit [*.net *.split]
sgstair has quit [*.net *.split]
vup has quit [*.net *.split]
benishor has joined #scopehal
veegee has joined #scopehal
t4nk_freenode has joined #scopehal
azonenberg has joined #scopehal
Bird|otherbox has joined #scopehal
tnt has joined #scopehal
d1b2 has joined #scopehal
florolf has joined #scopehal
vup has joined #scopehal
lethalbit has joined #scopehal
syscall has joined #scopehal
mxshift has joined #scopehal
Ekho has joined #scopehal
Stephie has joined #scopehal
gruetzkopf has joined #scopehal
anuejn has joined #scopehal
Yamakaja has joined #scopehal
electronic_eel has joined #scopehal
elms has joined #scopehal
Stary has joined #scopehal
sgstair has joined #scopehal
Fridtjof has joined #scopehal
juh has joined #scopehal
juri_ has joined #scopehal
davidc__ has joined #scopehal
esden has joined #scopehal
mithro has joined #scopehal
welterde has joined #scopehal
<_whitenotifier-1> [scopehal-apps] azonenberg pushed 2 commits to master [+0/-0/±4] https://github.com/glscopeclient/scopehal-apps/compare/1e326a3735c2...9f5f998e1b45
<_whitenotifier-1> [scopehal-apps] azonenberg 3e60ab7 - Fixed missing space in appdate field
<_whitenotifier-1> [scopehal-apps] azonenberg 9f5f998 - Initial serialization work on trigger groups (and manage instruments dialog). Fixes #607.
<_whitenotifier-1> [scopehal-apps] azonenberg closed issue #607: Serialization support for trigger groups - https://github.com/glscopeclient/scopehal-apps/issues/607
bvernoux has joined #scopehal
bvernoux has quit [Read error: Connection reset by peer]
sgstair has quit [Server closed connection]
sgstair has joined #scopehal
bvernoux has joined #scopehal
<d1b2> <246tnt> I just switched my RX570 for a RX6600 and now I can run ngscopehal without kernel panic 😁
<d1b2> <246tnt> Also doesn't crash when zooming out 🤔
<t4nk_freenode> hey, @246tnt ... I had the same thing with my rx580 last time I tried, haven't dared try again since, as I like to keep my machine alive ;)
<t4nk_freenode> I was just about to ask how things were in this regard, but apparently the issues still exist
<d1b2> <246tnt> Yeah, me too, I tried a couple of times with that card, but didn't try to debug it much.
<t4nk_freenode> how's that rx6600 for you? is it much better than the rx570?
<d1b2> <246tnt> To be fair, I had a fairly old distro ( 20.04 ), so kernel ( drm ) and mesa ( radv vulkan driver ) weren't very fresh, so maybe it would work with more recent ones but ...
<t4nk_freenode> no man, I'm using gentoo with the latest of everything and I had the same
<d1b2> <246tnt> I literally put it in the PC like 3h ago and just finished upgrading the OS a bit so that I can actually use it, so don't have much impression yet 😅
<azonenberg> well we also fixed a bunch of bugs around AMD stuff recently
<d1b2> <david.rysk> userspace shouldn't be causing panics
<d1b2> <david.rysk> I feel like upstream would be interested in that
<d1b2> <246tnt> Well, I couldn't report anything, because with the versions I was running all upstream would say is "update, then report" 😅
<d1b2> <david.rysk> there are PPAs available with updated versions for ubuntu, I'm pretty sure
<d1b2> <246tnt> yeah, most don't go back to 18.04; even the kernel was a problem to update because of libc deps. Believe me, before I upgraded to 20.04 I tried some options to avoid that, because I didn't have enough disk space for the update, so I had to move to another drive, which was a pain ...
t4nk_freenode is now known as t4nk_fn
<_whitenotifier-1> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±3] https://github.com/glscopeclient/scopehal/compare/39aff5824827...956152729151
<_whitenotifier-1> [scopehal] azonenberg 9561527 - SubtractFilter: correctly handle trigger phase offset as long as sample rates are equal. Fixes #609.
<_whitenotifier-1> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±4] https://github.com/glscopeclient/scopehal-apps/compare/9f5f998e1b45...32e9c0d909d1
<_whitenotifier-1> [scopehal-apps] azonenberg 32e9c0d - Updated submodules, added restrict read/writeonly qualifiers to SSBOs for deskew shader
<_whitenotifier-1> [scopehal-apps] azonenberg closed issue #609: Subtract filter: support trigger phases - https://github.com/glscopeclient/scopehal-apps/issues/609
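The SubtractFilter fix above boils down to converting the trigger-phase difference between the two inputs into an integer sample shift before subtracting. A minimal sketch of that idea (hypothetical helper, not the actual scopehal code; assumes both inputs share the same timescale, as the commit requires equal sample rates):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: given two waveforms with equal sample rates but
// different trigger phases (in time units, e.g. femtoseconds), the phase
// difference divided by the sample period gives the sample offset to
// apply to one input so the two line up before subtraction.
int64_t PhaseToSampleOffset(int64_t phaseA, int64_t phaseB, int64_t timescale)
{
    // Positive result: input B's first sample is later in time, so B
    // must be shifted forward by this many samples relative to A.
    return (phaseB - phaseA) / timescale;
}
```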
<d1b2> <hansemro> ngscopeclient (master) should be fixed for RX4XX/RX5XX cards as well. Just tested with RX480.
<azonenberg> which reminds me, we still have a lot of shaders i need to refactor to support 2D work groups
<azonenberg> right now we assume we can run arbitrarily many groups in the X axis, which is true on nvidia (2^31 max groups)
<azonenberg> but most amd/intel cards have much lower limits
<azonenberg> so working with hundred megapoint to gigapoint waveforms will require 2D dispatches
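The 2D-dispatch refactor described above amounts to folding a too-large 1D group count into an X/Y grid so neither dimension exceeds the device's maxComputeWorkGroupCount. A minimal sketch of the split (hypothetical helper name, not existing scopehal code; the shader would then bounds-check its flattened global invocation ID against the real element count):

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: split a 1D compute dispatch of totalGroups work
// groups into a 2D grid (x, y) with x <= maxGroupsPerDim, for devices
// whose per-dimension group count limit is small (e.g. 65535 on some
// AMD/Intel drivers vs 2^31-1 on nvidia).
std::pair<uint32_t, uint32_t> SplitDispatch2D(uint64_t totalGroups, uint32_t maxGroupsPerDim)
{
    if(totalGroups <= maxGroupsPerDim)
        return { static_cast<uint32_t>(totalGroups), 1 };

    // Round Y up so x*y >= totalGroups; the extra invocations in the
    // last row are discarded by a bounds check in the shader.
    uint32_t y = static_cast<uint32_t>(
        (totalGroups + maxGroupsPerDim - 1) / maxGroupsPerDim);
    return { maxGroupsPerDim, y };
}
```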
<d1b2> <246tnt> @hansemro Mmmm ... do you know which commit fixed it ?
<t4nk_fn> @hansemro are we talking about the same thing? because I don't know what 'incorrect Vulkan queue type used' means
<t4nk_fn> but I was talking of a vicious system crash
<d1b2> <hansemro> Yes, this is the same issue. Mesa-radv driver would crash the kernel when using the wrong queue
<t4nk_fn> not just a black screen and unresponsive, but the whole system crashing, it was really scary
<d1b2> <hansemro> Interestingly, AMD's vulkan driver (amdvlk) does not crash, but would abort cleanly
<t4nk_fn> well, I'll try again sometime soon, see if I can rebuild and if it works for my 580 too
<d1b2> <hansemro> This is odd, but the amdvlk vulkan driver reports a larger max compute work group count than mesa-radv: 4294967295 vs 65535
<azonenberg> interesting
<azonenberg> so perhaps thats a mesa limitation. either way we need to support the smaller group sizes on intel
<t4nk_fn> so those 'amdvlk' and mesa-radv ... they are part of mesa or so?
<t4nk_fn> I think I have xf86-video-amdgpu and mesa master on my system
<t4nk_fn> + + vulkan : Add support for 3D graphics and computing via the Vulkan cross-platform API
<t4nk_fn> + + video_cards_radeonsi
<d1b2> <johnsel> I think mesa-radv is the love child of Valve and others wanting to support vulkan on AMD SoC for their projects
<d1b2> <johnsel> amdvlk is AMD's own driver
<d1b2> <johnsel> AMD drivers are tuurrrrible though regardless. There are many videos of geohot debugging crash after crash for his tinygrad project
<d1b2> <johnsel> that said, he met with lisa su and they are actively working to better it
<d1b2> <johnsel> but they are really focussing on specific SoC based stuff, Tesla, Valve's game thing, Xbox and PS5
<d1b2> <johnsel> and now AI
<azonenberg> johnsel: that in the VM?
<d1b2> <johnsel> it sure is
<azonenberg> :D
<azonenberg> awesome
<azonenberg> So what's left to be able to run full CI jobs with vulkan? and where are we on the linux instance?
<d1b2> <johnsel> very funny
<d1b2> <johnsel> honestly I have to review what we need to do myself, but we're fairly close to running our own CI jobs
<azonenberg> awesome :D
<azonenberg> I may buy some cheap siglent gear to use for hardware in loop tests eventually. i already have a SPD3303X-E power supply i'm not using most of the time
<t4nk_fn> holy smokes, that guy has issues
<t4nk_fn> other than a kernel panic
<d1b2> <johnsel> well look up who he is, he's a very respected hacker
<azonenberg> geohot?
<azonenberg> he's smart but he does in fact have issues :p
<d1b2> <johnsel> but personality wise yeah the dude is mental
<d1b2> <johnsel> oh yeah no doubt about it
<azonenberg> i've met him, he's a character
<azonenberg> why are the best hackers always absolutely insane?
<d1b2> <johnsel> comes with the territory
* azonenberg looks in general direction of chris tarnovsky
<d1b2> <johnsel> and to answer your other question azonenberg, can you swap the GPUs again?
<d1b2> <johnsel> also note I still have to test running stuff headless
<d1b2> <johnsel> I was able to get GPU acceleration over RDP but not sunshine (which hooks the GPU driver directly, normally, but could not now for some reason)
<azonenberg> swap the edid emu again? ok
<d1b2> <johnsel> but it's probably easier to get a good overview of all those tasks with the edid emulators in place, and with a good test protocol
<azonenberg> Done
<azonenberg> I have one for the other card coming any minute now, still shows out for delivery
<d1b2> <johnsel> we may also need that edid in place during boot
<d1b2> <johnsel> which is not ideal
<d1b2> <johnsel> but I guess it is what it is
<azonenberg> well long term there will be one dedicated to each card
<azonenberg> moving them is just temporary for the next few hours :p
<d1b2> <johnsel> yep yep, but it may need to be in there when the system boots
<d1b2> <johnsel> because the firmware might be onto us
<d1b2> <johnsel> the RDP working but sunshine not is kinda weird
<d1b2> <johnsel> there is another remote gaming app I can try that might work but it would be easiest to exclude all possible reasons for that happening
<d1b2> <johnsel> the internal console connects to the virtio gpu(?? or whatever else xen uses)
<d1b2> <johnsel> anyway if you could pretty please do a full server reboot once the edid emulators are in place then I can be sure I have to look on the software side
<d1b2> <johnsel> I think before the end of this year we will have CI fully operational
<d1b2> <johnsel> will that be usb or ethernet hardware by the way?
<d1b2> <johnsel> also normal trigger mode hangs ngscopeclient 😦
<d1b2> <johnsel> at least with demoscope
<d1b2> <johnsel> I can try with my RIgol later
<d1b2> <johnsel> I'm bouncing between tasks, can't get anything done this way
<d1b2> <johnsel> also azonenberg, suppose I were to bridge the VPN with my internal ethernet that is dedicated to the Rigol (I am grateful to them for sending it, but that thing is not going on my LAN lol), would the vpn give out an IP to it?
<d1b2> <johnsel> or do we need a full s2s config for that?
<d1b2> <johnsel> mm better to port forward
<d1b2> <louis8374> This would work for me
<azonenberg> a full host reboot?
<azonenberg> that would be a pain
<azonenberg> as far as vpn, the config i have right now hands out one IP to each client but i can add routing rules for subnets
<azonenberg> vpn endpoints go in 10.255.2.x/24 and actual site systems are 10.site.subnet.host
<azonenberg> with the CI environment living in site #2
<azonenberg> so i could assign a site number to your test network and add routing rules for it, open scpi port traffic to it, etc
<azonenberg> it'd take a bit of setup work but is absolutely doable and i have peering arrangements over the same vpn with other people
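The 10.site.subnet.host scheme described above can be illustrated with a trivial address builder (hypothetical helper for illustration only; the site/subnet numbers are examples from the conversation, e.g. the CI environment is site #2):

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical illustration of the addressing scheme described above:
// VPN endpoints live in 10.255.2.x/24, while actual site systems are
// numbered 10.<site>.<subnet>.<host>, so the CI environment (site #2)
// gets addresses of the form 10.2.x.y.
std::string SiteAddress(uint8_t site, uint8_t subnet, uint8_t host)
{
    return "10." + std::to_string(site) + "." +
           std::to_string(subnet) + "." + std::to_string(host);
}
```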
bvernoux has quit [Quit: Leaving]
<d1b2> <johnsel> yeah...
<d1b2> <johnsel> did you boot at least 1 GPU with edid emu?
<azonenberg> The one i had on fpgadev before. which i think is the one now attached to your linux builder
<azonenberg> whats sunshine?
<azonenberg> i'm just using ssh+vnc to the linux test system on my end
<d1b2> <johnsel> can you swap the GPUs between the VMs?
<azonenberg> I cant control which gpu goes to which vm
<d1b2> <johnsel> can kill both VMs
<azonenberg> i just ask for "a GPU of this type"
<azonenberg> and i have no idea which one gets attached
<d1b2> <johnsel> that sucks
<azonenberg> i guess i could force that a bit
<azonenberg> shut down one of yours, i attach to another instance
<azonenberg> shut down the other one, leaving one free
<azonenberg> now the next one to start has to get the free card
<azonenberg> etc
<d1b2> <johnsel> not sure that works, but you have more visibility into the pci endpoints
<azonenberg> but they're supposed to be equivalent, it shouldnt matter which gets what :p
<d1b2> <johnsel> sunshine is a remote desktop app
<azonenberg> if only one of the two cards is attached and the other is free
<d1b2> <johnsel> but it hooks into low level GPU driver paths
<azonenberg> any vm that starts will get the free one
<azonenberg> the idea is to keep exactly one card free at all times so you can "hand off" the card from one vm to another
<azonenberg> with a third vm it should be possible to exchange cards between two
<d1b2> <johnsel> well let's give it a try
<azonenberg> ok so shut down one instance
<d1b2> <johnsel> the problem I am facing is running things headless
<d1b2> <johnsel> can you do it? I'm eating
<d1b2> <johnsel> you can just kill the box
<azonenberg> Gimme a few. Or i could just wait until the edid emulators get here if you aren't in a rush?
<d1b2> <johnsel> I'm not in a rush, but the edid emulator may be necessary at boot for it to initialize fully
<azonenberg> We'll deal with that if we have to but i have enough other prod stuff on this same host i'd rather not reboot it if i can avoid it
<azonenberg> long term i kinda want to get a second xen server so i can do failover. the second one wouldn't even be running all the time
<azonenberg> i'd turn it on, migrate stuff onto it, shut down the primary, work on it
<azonenberg> then i could run both if i had demand scaling beyond what one could handle
<azonenberg> but thats expensive enough i'm not doing it yet :p
<d1b2> <johnsel> I understand, but I am having crashes via RDP and the sunshine tool and nvidia tool show the card is connected but I am not connected to the framebuffer
<d1b2> <johnsel> sunshine should be able to do that
<azonenberg> interesting
<d1b2> <johnsel> so it may only be partially functional over RDP
<d1b2> <johnsel> and RDP has its own display driver that does things
<d1b2> <johnsel> so it can talk to the nvidia driver and get it to do some things
<d1b2> <johnsel> render in blender works fine
<d1b2> <johnsel> so strange behavior all over, and my experience has taught me that that is the card or drivers only half initializing
<azonenberg> i assume you've rebooted the instance already?
<d1b2> <johnsel> yes
<azonenberg> Ok
<azonenberg> well i can try a host reboot once the edid emus get here i guess
<d1b2> <johnsel> anyway 1 card should have been booted with edid emu and should be golden, so we can try to differentially diagnose by swapping cards
* azonenberg pictures flock of large flightless birds with VESA mounts on them
<d1b2> <johnsel> I did not notice any change in behavior when you swapped the edid
<d1b2> <johnsel> lol
<d1b2> <johnsel> that took me a while
<d1b2> <johnsel> hmm
<d1b2> <johnsel> food gave me some ideas to test
<d1b2> <johnsel> we could disable xen GPU and see what happens
<d1b2> <johnsel> HAH
<d1b2> <johnsel> fuck