#scopehal on 2022-10-10 — irc logs at libera.irclog.whitequark.org

2022-03-25 21:41 azonenberg changed the topic of #scopehal to: libscopehal, libscopeprotocols, and glscopeclient development and testing | https://github.com/glscopeclient/scopehal-apps | Logs: https://libera.irclog.whitequark.org/scopehal

00:04 <azonenberg> @johnsel yeah in general the project is not yet big enough that we have multiple dev conversations going on at once stepping on each other

00:05 <azonenberg> so i think it makes sense to keep as much as possible in the main channel so everyone is on the same page

00:55 <azonenberg> But yes in general as the number of contributors grows, we do need to coordinate who's working on what to avoid wasted/duplicated effort

01:42 <azonenberg> lain: sooo

01:42 <azonenberg> re multi gpu and laptops

01:42 <azonenberg> i discovered something today

01:44 <azonenberg> The reason I only see one GPU at a time on my nvidia optimus setup is that if i do primusrun, the layer hides the discrete card (all rendering is done on the nvidia card to an offscreen texture then copied to a texture drawn by the integrated gpu)

01:44 <azonenberg> sorry i meant the layer hides the integrated card

01:44 <azonenberg> so i only see the discrete card and the copy to the surface run by the integrated card is done in the background

01:45 <azonenberg> And if I do not use primusrun, the nvidia card is in sleep mode so vulkan enumeration won't see it

01:45 <azonenberg> If i run a *different* app that uses the nvidia card, it's powered on

01:45 <azonenberg> at which point vulkan now sees both cards

01:45 <azonenberg> presumably with bumblebee commands you could do this manually

01:46 <azonenberg> This poses the potential for a more power efficient architecture on laptops

01:46 <d1b2> <david.rysk> Bumblebee doesn’t play nice with the nvidia drivers

01:46 <azonenberg> maybe its not bumblebee, whatever primus uses to turn on the nvidia card

01:46 <d1b2> <david.rysk> There’s another thing one can do with nvidia drivers that I have set up which is power efficient

01:46 <d1b2> <david.rysk> It only works with recent cards

01:46 <azonenberg> i havent looked under the hood to do it

01:46 <azonenberg> anyway, what i think might be worth looking into is, since we already are doing all waveform rendering and DSP in a headless compute shader flow

01:46 <azonenberg> run imgui on the integrated card

01:47 <azonenberg> run the heavy stuff on the discrete

01:47 <azonenberg> this way, the discrete can downclock or sleep if we're just clicking around the UI

01:48 <azonenberg> essentially we'd be doing our own implementation of primus, but instead of copying the entire window's pixels across GPUs we'd just be copying the rendered waveform bitmaps across gpus

01:48 <azonenberg> this is in fact what we do in glscopeclient now with the hybrid opengl/vulkan architecture

01:48 <azonenberg> Nothing stops those two halves of the UI from being on different cards

01:49 <azonenberg> Splitting the compute across multiple GPUs would enable scaling to even higher performance levels but the scheduling of the workloads to minimize device-to-device copies, stalls, etc would be nontrivial

01:49 <azonenberg> so that is a longer term feature

01:49 <azonenberg> I think a simple frontend/backend multi GPU split is likely doable in a few days of work

01:49 <azonenberg> Not saying we should do it now, but i think it is well within the realm of what could be done fairly easily

01:50 <azonenberg> and it answers a question i had had for a while WRT how vulkan handled multigpu

01:50 <azonenberg> i.e. there isnt actually any secret sauce, its all very clean
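
The frontend/backend split described above could be sketched roughly as follows. This is a minimal illustration in Python, not code from scopehal-apps: `PhysicalDevice` and `assign_roles` are hypothetical names, and the device list stands in for whatever Vulkan enumeration returns once both cards are powered on. Only the two device-type strings are taken from the Vulkan `VkPhysicalDeviceType` enum.

```python
from dataclasses import dataclass

# Names of two VkPhysicalDeviceType enumerants from the Vulkan spec.
INTEGRATED = "VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU"
DISCRETE = "VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU"


@dataclass(frozen=True)
class PhysicalDevice:
    """Hypothetical stand-in for one enumerated Vulkan physical device."""
    name: str
    device_type: str


def assign_roles(devices):
    """Return (ui_device, compute_device) for a list of visible devices.

    Assumption: the GUI prefers the integrated card and the heavy
    rendering/DSP work prefers the discrete card. If only one kind is
    visible (e.g. the dGPU is asleep or hidden by a layer), both roles
    fall back to the same device.
    """
    if not devices:
        raise ValueError("no devices enumerated")
    integrated = next((d for d in devices if d.device_type == INTEGRATED), None)
    discrete = next((d for d in devices if d.device_type == DISCRETE), None)
    ui = integrated or discrete or devices[0]
    compute = discrete or integrated or devices[0]
    return ui, compute


if __name__ == "__main__":
    visible = [
        PhysicalDevice("discrete card", DISCRETE),
        PhysicalDevice("integrated card", INTEGRATED),
    ]
    ui, compute = assign_roles(visible)
    print("UI on:", ui.name, "| compute on:", compute.name)
```

The single-device fallback matters here because, as noted above, enumeration may only show one card depending on whether the discrete GPU is awake.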

01:50 <d1b2> <david.rysk> It’s worth noting that the newer cards not only downclock the chip, they completely shut it off when it’s not needed

01:51 <azonenberg> Yes

01:51 <azonenberg> That is what my setup here does

01:51 <azonenberg> and it explains why i didnt see the dGPU at all

01:51 <d1b2> <david.rysk> Can you access the igpu when primusrun is used to run it on the dgpu?

01:51 <azonenberg> So, the way primus actually works is that the igpu is the sole output to the display

01:51 <d1b2> <david.rysk> dgpu operation cuts battery life to half or a third 🙃

01:51 <azonenberg> Primus turns on the dgpu

01:52 <azonenberg> then it hides the igpu

01:52 <azonenberg> via a vulkan layer

01:52 <azonenberg> your app then renders on the dgpu thinking it's on a single gpu system

01:52 <d1b2> <david.rysk> Ahhh. (I’ve seen it implemented a few ways, I believe that’s the most common with x86 machines)

01:52 <azonenberg> primus then takes the (invisible) surface and copies it to a vulkan surface on the igpu

01:52 <azonenberg> and displays it

01:52 <d1b2> <david.rysk> (Some older machines as well as intel macs with a dGPU use a mux instead)

01:52 <azonenberg> AFAIK this is the only way primus-vk does it

01:52 <d1b2> <david.rysk> Yeah that’s Optimus

01:52 <azonenberg> i cannot comment on opengl

01:52 <d1b2> <david.rysk> Which is what primus uses

01:53 <azonenberg> Anyway, my point is, if we managed that copy ourselves

01:53 <azonenberg> we could do it at the waveform level vs the whole app

01:53 <azonenberg> run the gui on the igpu

01:53 <azonenberg> only wake up the dgpu when a waveform has to be re-rendered or some heavy DSP is needed

01:54 <azonenberg> maybe even supporting a hybrid strategy, igpu only up to X samples of memory depth on screen or something

01:54 <azonenberg> then wake up the dgpu when things get too heavy

01:54 <azonenberg> point is, i think there are better options than igpu-only (slow) or using primus to push everything to the dgpu (power hungry)

01:55 <azonenberg> if we split the work intelligently we can likely get almost the same performance - or even better, since we can utilize igpu and dgpu simultaneously during bursty periods

01:55 <azonenberg> with greatly improved power efficiency
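
The hybrid strategy floated above (igpu only up to some memory depth on screen, dgpu beyond that) might look something like this sketch. Everything here is an assumption for illustration: the function name, the `Gpu` enum, and especially the threshold value, which the discussion leaves as "X samples".

```python
from enum import Enum


class Gpu(Enum):
    INTEGRATED = "igpu"
    DISCRETE = "dgpu"


def pick_render_device(samples_on_screen, igpu_sample_limit, dgpu_present=True):
    """Choose which GPU should rasterize the current view.

    Assumption: on-screen sample count is the only cost metric. A real
    policy would probably also weigh active DSP filters and battery state.
    """
    if samples_on_screen < 0:
        raise ValueError("sample count cannot be negative")
    if not dgpu_present:
        # Nothing to wake up, so the integrated card handles everything.
        return Gpu.INTEGRATED
    if samples_on_screen <= igpu_sample_limit:
        # Light workload: leave the discrete card asleep.
        return Gpu.INTEGRATED
    return Gpu.DISCRETE


if __name__ == "__main__":
    LIMIT = 1_000_000  # placeholder threshold, not a measured value
    for depth in (10_000, 1_000_000, 50_000_000):
        print(depth, "->", pick_render_device(depth, LIMIT).value)
```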

01:55 <d1b2> <david.rysk> Yeah. The question is whether the API lets you do this. macOS back in the hybrid GPU days had a flag you’d use when you create an off-screen buffer

01:55 <azonenberg> So thats the thing

01:55 <d1b2> <david.rysk> But again they used a mux

01:55 <azonenberg> we are not using vulkan rendering until the very end of the pipeline

01:55 <azonenberg> waveform rasterizing is done in a compute shader

01:55 <d1b2> <david.rysk> They did not use Optimus (where the igpu is always at the end of the chain)

01:55 <azonenberg> it's just a float[] SSBO

01:56 <azonenberg> all we'd have to do is add a shim in the code that, instead of passing the ssbo handle to the shader that tone maps it to RGBA, copies the buffer contents from dgpu internal memory to pinned memory on the CPU that is readable by the igpu

01:56 <azonenberg> then pass that new ssbo to the tone mapping shader

01:57 <azonenberg> and run the tone mapping shader on the igpu instead of the dgpu

01:57 <azonenberg> then have all of the imgui stuff living on the igpu

01:57 <azonenberg> it would be a very simple split

01:58 <azonenberg> waveform sample data would live entirely on the dgpu, gui geometry entirely on the igpu, and rasterized waveforms cross between them

01:58 <azonenberg> and like i said we do this in glscopeclient already

01:58 <azonenberg> with the exact same split

01:58 <azonenberg> the difference is, in glscopeclient the tone mapping and presentation is opengl vs vulkan

01:58 <azonenberg> while the dsp and rasterization is vulkan

01:58 <azonenberg> but the api boundary means we cant share any handles. so they're effectively separate cards even if they're the same physical gpu

01:59 <azonenberg> so basically we've already proven that the model works
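
The data flow of the proposed shim can be modeled on the CPU as below. This is only a sketch of the shape of the pipeline, not the actual shaders: `copy_to_host_visible` stands in for the dgpu-to-pinned-memory copy, and `tone_map_rgba` uses a simple clamp-to-alpha mapping that is an assumption, since the real tone mapping function is not described in the discussion.

```python
from array import array


def copy_to_host_visible(device_local):
    """Stand-in for copying the rasterized float buffer off the dGPU.

    Returns a new buffer so the consumer never touches the original,
    mirroring the fact that the two sides cannot share handles.
    """
    return array("f", device_local)


def tone_map_rgba(intensity, color=(255, 255, 0)):
    """Map per-pixel float intensity to RGBA8 bytes.

    Assumption: intensity is clamped to [0, 1] and used as alpha over a
    fixed channel color. The real shader may use a different curve.
    """
    out = bytearray()
    for value in intensity:
        alpha = min(max(value, 0.0), 1.0)
        out += bytes((color[0], color[1], color[2], round(alpha * 255)))
    return bytes(out)


if __name__ == "__main__":
    rasterized = array("f", [0.0, 0.25, 1.0, 3.0])  # produced on the "dGPU"
    staged = copy_to_host_visible(rasterized)        # crosses between cards
    pixels = tone_map_rgba(staged)                   # runs on the "iGPU"
    print(len(pixels), "bytes of RGBA for", len(staged), "pixels")
```

Only the staged float buffer crosses the boundary, which matches the split described above: sample data stays on one side, GUI geometry on the other.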

02:04 <_whitenotifier-7> [scopehal-apps] azonenberg labeled issue #534: Improve Primus/Optimus support - https://github.com/glscopeclient/scopehal-apps/issues/534

02:04 <_whitenotifier-7> [scopehal-apps] azonenberg opened issue #534: Improve Primus/Optimus support - https://github.com/glscopeclient/scopehal-apps/issues/534

02:04 <_whitenotifier-7> [scopehal-apps] azonenberg labeled issue #534: Improve Primus/Optimus support - https://github.com/glscopeclient/scopehal-apps/issues/534

02:07 <_whitenotifier-7> [scopehal-apps] azonenberg commented on issue #534: Improve Primus/Optimus support - https://github.com/glscopeclient/scopehal-apps/issues/534#issuecomment-1272707614

02:12 <azonenberg> @david.rysk: for reference the specific setup i have here for testing is a quadro rtx 3000 plus an Intel(R) UHD Graphics (CML GT2)

02:40 <lain> azonenberg: oh that's fun @ primus / bumblebee / dgpu vs igpu

02:40 <lain> I've seen a handful of implementations but I don't think I ran into that particular one before

02:40 <d1b2> <david.rysk> that sounds like regular optimus to me

02:40 <d1b2> <david.rysk> using the "current" proprietary nvidia drivers

02:40 <azonenberg> yes

02:41 <d1b2> <david.rysk> (a lot has changed in the past couple of years, it used to be really janky)

02:41 <azonenberg> yeah i never got it working pre 2020

02:41 <azonenberg> now it just works

03:35 Degi_ has joined #scopehal

03:36 Degi has quit [Ping timeout: 268 seconds]

03:36 Degi_ is now known as Degi

07:34 bvernoux has joined #scopehal

08:13 massi has joined #scopehal

10:15 bvernoux has quit [Read error: Connection reset by peer]

11:01 massi has quit [Remote host closed the connection]

15:26 <_whitenotifier-7> [scopehal-apps] azonenberg labeled issue #535: Allow switching to event driven GUI flow to improve power efficiency on laptops - https://github.com/glscopeclient/scopehal-apps/issues/535

15:26 <_whitenotifier-7> [scopehal-apps] azonenberg opened issue #535: Allow switching to event driven GUI flow to improve power efficiency on laptops - https://github.com/glscopeclient/scopehal-apps/issues/535

17:00 <_whitenotifier-7> [scopehal-apps] azonenberg pushed 5 commits to master [+1/-0/±18] https://github.com/glscopeclient/scopehal-apps/compare/05f35098da40...ba5f6ad88aaa

17:00 <_whitenotifier-7> [scopehal-apps] azonenberg e7dabc6 - Initial work on markers. Can create them via context menu and they show up (partially) under history view, but not rendered in display yet. See #511.

17:00 <_whitenotifier-7> [scopehal-apps] azonenberg 4092588 - PreferenceManager: save prefs on app exit. Added some debug log messages.

17:00 <_whitenotifier-7> [scopehal-apps] azonenberg 689ea99 - Preferences: fixed bug in serialization of color values

17:00 <_whitenotifier-7> [scopehal-apps] ... and 2 more commits.

18:36 <_whitenotifier-7> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±1] https://github.com/glscopeclient/scopehal-apps/compare/ba5f6ad88aaa...cdf53d6dcb0c

18:36 <_whitenotifier-7> [scopehal-apps] azonenberg cdf53d6 - Implemented power / performance settings backend. Fixes #535.

18:36 <_whitenotifier-7> [scopehal-apps] azonenberg closed issue #535: Allow switching to event driven GUI flow to improve power efficiency on laptops - https://github.com/glscopeclient/scopehal-apps/issues/535

20:20 <_whitenotifier-7> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±10] https://github.com/glscopeclient/scopehal-apps/compare/cdf53d6dcb0c...1abad99cfc07

20:20 <_whitenotifier-7> [scopehal-apps] azonenberg 1abad99 - Continued work on markers. Can now render them in WaveformGroup's, but not drag them around. See #511.

20:34 josuah has quit [Quit: WeeChat 3.4.1]

20:36 josuah has joined #scopehal

20:52 vup has quit [Ping timeout: 268 seconds]

20:52 anuejn has quit [Ping timeout: 268 seconds]

21:51 vup has joined #scopehal

21:51 anuejn has joined #scopehal