<azonenberg>
@johnsel yeah in general the project is not yet big enough that we have multiple dev conversations going on at once stepping on each other
<azonenberg>
so i think it makes sense to keep as much as possible in the main channel so everyone is on the same page
<azonenberg>
But yes in general as the number of contributors grows, we do need to coordinate who's working on what to avoid wasted/duplicated effort
<azonenberg>
lain: sooo
<azonenberg>
re multi gpu and laptops
<azonenberg>
i discovered something today
<azonenberg>
The reason I only see one GPU at a time on my nvidia optimus setup is that if i do primusrun, the layer hides the discrete card (all rendering is done on the nvidia card to an offscreen texture then copied to a texture drawn by the integrated gpu)
<azonenberg>
sorry i meant the layer hides the integrated card
<azonenberg>
so i only see the discrete card, and the copy to the surface owned by the integrated card happens in the background
<azonenberg>
And if I do not use primusrun, the nvidia card is in sleep mode so vulkan enumeration won't see it
<azonenberg>
If i run a *different* app that uses the nvidia card, it's powered on
<azonenberg>
at which point vulkan now sees both cards
<azonenberg>
presumably with bumblebee commands you could do this manually
<azonenberg>
This opens up the potential for a more power-efficient architecture on laptops
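As a minimal sketch of what that enumeration looks like (stock Vulkan C API only, nothing project-specific): on an Optimus laptop the discrete card only appears in this list once something has powered it on.
```cpp
// Minimal sketch: enumerate the Vulkan physical devices currently visible.
// On an Optimus laptop the discrete GPU only shows up here while it is powered on.
#include <vulkan/vulkan.h>
#include <cstdio>
#include <vector>

int main()
{
    VkApplicationInfo app{};
    app.sType = VK_STRUCTURE_TYPE_APPLICATION_INFO;
    app.apiVersion = VK_API_VERSION_1_0;

    VkInstanceCreateInfo ici{};
    ici.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
    ici.pApplicationInfo = &app;

    VkInstance instance;
    if(vkCreateInstance(&ici, nullptr, &instance) != VK_SUCCESS)
        return 1;

    uint32_t count = 0;
    vkEnumeratePhysicalDevices(instance, &count, nullptr);
    std::vector<VkPhysicalDevice> devices(count);
    vkEnumeratePhysicalDevices(instance, &count, devices.data());

    for(auto dev : devices)
    {
        VkPhysicalDeviceProperties props;
        vkGetPhysicalDeviceProperties(dev, &props);
        printf("%s (%s)\n", props.deviceName,
            props.deviceType == VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU   ? "discrete" :
            props.deviceType == VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU ? "integrated" : "other");
    }

    vkDestroyInstance(instance, nullptr);
    return 0;
}
```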
<d1b2>
<david.rysk> Bumblebee doesn’t play nice with the nvidia drivers
<azonenberg>
maybe it's not bumblebee, whatever primus uses to turn on the nvidia card
<d1b2>
<david.rysk> There’s another thing one can do with nvidia drivers that I have set up which is power efficient
<d1b2>
<david.rysk> It only works with recent cards
<azonenberg>
i haven't looked under the hood to see how it does it
<azonenberg>
anyway, what i think might be worth looking into is, since we already are doing all waveform rendering and DSP in a headless compute shader flow
<azonenberg>
run imgui on the integrated card
<azonenberg>
run the heavy stuff on the discrete
<azonenberg>
this way, the discrete can downclock or sleep if we're just clicking around the UI
<azonenberg>
essentially we'd be doing our own implementation of primus, but instead of copying the entire window's pixels across GPUs we'd just be copying the rendered waveform bitmaps across gpus
<azonenberg>
this is in fact what we do in glscopeclient now with the hybrid opengl/vulkan architecture
<azonenberg>
Nothing stops those two halves of the UI from being on different cards
<azonenberg>
Splitting the compute across multiple GPUs would enable scaling to even higher performance levels but the scheduling of the workloads to minimize device-to-device copies, stalls, etc would be nontrivial
<azonenberg>
so that is a longer term feature
<azonenberg>
I think a simple frontend/backend multi GPU split is likely doable in a few days of work
<azonenberg>
Not saying we should do it now, but i think it is well within the realm of what could be done fairly easily
<azonenberg>
and it answers a question i had had for a while WRT how vulkan handled multigpu
<azonenberg>
i.e. there isn't actually any secret sauce, it's all very clean
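A hedged sketch of how such a frontend/backend split could pick its two cards; this is not existing ngscopeclient code, just standard Vulkan device-type filtering, with fallback behavior left as a comment.
```cpp
#include <vulkan/vulkan.h>
#include <vector>

// Hedged sketch (not actual ngscopeclient device-selection logic): choose the
// integrated GPU for ImGui/presentation and the discrete GPU for the
// compute-shader DSP and waveform rasterization.
VkPhysicalDevice PickByType(
    const std::vector<VkPhysicalDevice>& devices, VkPhysicalDeviceType wanted)
{
    for(auto dev : devices)
    {
        VkPhysicalDeviceProperties props;
        vkGetPhysicalDeviceProperties(dev, &props);
        if(props.deviceType == wanted)
            return dev;
    }
    return VK_NULL_HANDLE;
}

// Usage, given an already-enumerated device list:
//   auto uiCard      = PickByType(devices, VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU);
//   auto computeCard = PickByType(devices, VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU);
//   // if either is VK_NULL_HANDLE, fall back to running everything on one card
```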
<d1b2>
<david.rysk> It’s worth noting that the newer cards not only downclock the chip, they completely shut it off when it’s not needed
<azonenberg>
Yes
<azonenberg>
That is what my setup here does
<azonenberg>
and it explains why i didn't see the dGPU at all
<d1b2>
<david.rysk> Can you access the igpu when primusrun is used to run it on the dgpu?
<azonenberg>
So, the way primus actually works is that the igpu is the sole output to the display
<d1b2>
<david.rysk> dgpu operation cuts battery life to half or a third 🙃
<azonenberg>
Primus turns on the dgpu
<azonenberg>
then it hides the igpu
<azonenberg>
via a vulkan layer
<azonenberg>
your app then renders on the dgpu thinking it's on a single gpu system
<d1b2>
<david.rysk> Ahhh. (I’ve seen it implemented a few ways, I believe that’s the most common with x86 machines)
<azonenberg>
primus then takes the (invisible) surface and copies it to a vulkan surface on the igpu
<azonenberg>
and displays it
<d1b2>
<david.rysk> (Some older machines as well as intel macs with a dGPU use a mux instead)
<azonenberg>
AFAIK this is the only way primus-vk does it
<d1b2>
<david.rysk> Yeah that’s Optimus
<azonenberg>
i cannot comment on opengl
<d1b2>
<david.rysk> Which is what primus uses
<azonenberg>
Anyway, my point is, if we managed that copy ourselves
<azonenberg>
we could do it at the waveform level vs the whole app
<azonenberg>
run the gui on the igpu
<azonenberg>
only wake up the dgpu when a waveform has to be re-rendered or some heavy DSP is needed
<azonenberg>
maybe even supporting a hybrid strategy, igpu only up to X samples of memory depth on screen or something
<azonenberg>
then wake up the dgpu when things get too heavy
<azonenberg>
point is, i think there are better options than igpu-only (slow) or using primus to push everything to the dgpu (power hungry)
<azonenberg>
if we split the work intelligently we can likely get almost the same performance - or even better, since we can utilize igpu and dgpu simultaneously during bursty periods
<azonenberg>
with greatly improved power efficiency
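A hypothetical policy sketch of that hybrid strategy; the function name and the threshold value are illustrative, not from the codebase.
```cpp
#include <cstddef>

// Hypothetical policy sketch (not actual ngscopeclient code): decide whether
// the waveform rasterization compute pass should run on the integrated or
// the discrete GPU, based on how many samples are currently on screen.
enum class RenderDevice { Integrated, Discrete };

RenderDevice ChooseRasterDevice(size_t visibleSamples)
{
    // Assumed threshold; in practice this would be tuned or exposed as a preference
    const size_t kDiscreteThreshold = 10'000'000;

    if(visibleSamples < kDiscreteThreshold)
        return RenderDevice::Integrated;    // light load: let the dGPU stay asleep
    return RenderDevice::Discrete;          // heavy load: worth waking the dGPU
}
```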
<d1b2>
<david.rysk> Yeah. The question is whether the API lets you do this. macOS back in the hybrid GPU days had a flag you’d use when you create an off-screen buffer
<azonenberg>
So thats the thing
<d1b2>
<david.rysk> But again they used a mux
<azonenberg>
we are not using vulkan rendering until the very end of the pipeline
<azonenberg>
waveform rasterizing is done in a compute shader
<d1b2>
<david.rysk> They did not use Optimus (where the igpu is always at the end of the chain)
<azonenberg>
it's just a float[] SSBO
<azonenberg>
all we'd have to do is add a shim in the code: instead of passing the ssbo handle straight to the shader that tone maps it to RGBA, copy the buffer contents from dgpu internal memory to pinned CPU memory that is readable by the igpu
<azonenberg>
then pass that new ssbo to the tone mapping shader
<azonenberg>
and run the tone mapping shader on the igpu instead of the dgpu
<azonenberg>
then have all of the imgui stuff living on the igpu
<azonenberg>
it would be a very simple split
<azonenberg>
waveform sample data would live entirely on the dgpu, gui geometry entirely on the igpu, and rasterized waveforms cross between them
<azonenberg>
and like i said we do this in glscopeclient already
<azonenberg>
with the exact same split
<azonenberg>
the difference is, in glscopeclient the tone mapping and presentation are opengl rather than vulkan
<azonenberg>
while the dsp and rasterization is vulkan
<azonenberg>
but the api boundary means we can't share any handles, so they're effectively separate cards even if they're the same physical gpu
<azonenberg>
so basically we've already proven that the model works
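A hedged sketch of the shim described above, using hypothetical handle names rather than the actual scopehal-apps API: it records the dGPU-side download of the rasterized SSBO into a host-visible staging buffer, then does the CPU-side hop into a buffer the iGPU's tone-mapping shader can read.
```cpp
#include <vulkan/vulkan.h>
#include <cstring>

// Hedged sketch of the proposed shim (handle names are assumptions, not the
// actual scopehal-apps API): after the rasterization compute shader runs on
// the discrete GPU, copy the rasterized float[] SSBO into host-visible memory,
// then memcpy it into a host-visible buffer owned by the integrated GPU's
// logical device so the tone-mapping shader can read it there.

// 1) On the dGPU: record a copy from the device-local SSBO to a staging buffer
//    whose memory was allocated HOST_VISIBLE | HOST_COHERENT.
void RecordDownloadFromDGpu(
    VkCommandBuffer cmdbuf, VkBuffer dgpuSsbo, VkBuffer dgpuStaging, VkDeviceSize size)
{
    VkBufferCopy region{};
    region.srcOffset = 0;
    region.dstOffset = 0;
    region.size = size;
    vkCmdCopyBuffer(cmdbuf, dgpuSsbo, dgpuStaging, 1, &region);

    // Barrier so the host does not read the staging buffer before the copy lands
    VkBufferMemoryBarrier barrier{};
    barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
    barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
    barrier.dstAccessMask = VK_ACCESS_HOST_READ_BIT;
    barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.buffer = dgpuStaging;
    barrier.offset = 0;
    barrier.size = size;
    vkCmdPipelineBarrier(
        cmdbuf, VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_HOST_BIT,
        0, 0, nullptr, 1, &barrier, 0, nullptr);
}

// 2) On the CPU: after waiting on the dGPU fence, move the data into a mapped
//    host-visible buffer created on the iGPU's logical device. That buffer is
//    then bound as the input SSBO for the tone-mapping shader, which runs on
//    the integrated card alongside the ImGui pass.
void CopyAcrossDevices(const void* dgpuStagingMapped, void* igpuSsboMapped, size_t size)
{
    memcpy(igpuSsboMapped, dgpuStagingMapped, size);
}
```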
<_whitenotifier-7>
[scopehal-apps] azonenberg e7dabc6 - Initial work on markers. Can create them via context menu and they show up (partially) under history view, but not rendered in display yet. See #511.
<_whitenotifier-7>
[scopehal-apps] azonenberg 4092588 - PreferenceManager: save prefs on app exit. Added some debug log messages.
<_whitenotifier-7>
[scopehal-apps] azonenberg 689ea99 - Preferences: fixed bug in serialization of color values
<_whitenotifier-7>
[scopehal-apps] ... and 2 more commits.
<_whitenotifier-7>
[scopehal-apps] azonenberg 1abad99 - Continued work on markers. Can now render them in WaveformGroup's, but not drag them around. See #511.