<azonenberg>
ok implemented most of the history system in ngscopeclient now
<azonenberg>
only thing missing is a handler to actually update waveforms when the history changes
Degi has quit [Ping timeout: 252 seconds]
Degi_ has joined #scopehal
Degi_ is now known as Degi
<azonenberg>
ok and that's done. only thing that's missing is updating the selection in the history view when a new waveform comes in during normal trigger mode
<azonenberg>
(not the one i just pushed a fix for)
<azonenberg>
we pack our bool's into uint32_t[]s in order to work on platforms that don't support byte addressable SSBOs
<azonenberg>
but we do not pad the SSBO up to a multiple of 4 bytes host side
<azonenberg>
easiest fix is for AcceleratorBuffer<bool> to ensure it always allocates capacity rounded up to next multiple of 4 bytes regardless of requested size
<azonenberg>
the padding bytes' value is unimportant as long as it's legal to read from
<_whitenotifier-7>
[scopehal] azonenberg 10af34e - AcceleratorBuffer: round digital allocations up to next multiple of 4 bytes to allow 32-bit reads from shaders
<d1b2>
<ehntoo> after some particularly aggressive includes refactoring to eliminate usage of scopehal.h and scopeprotocols.h entirely within the context of the library as well as eliminating usage of Gdk::Color to minimze the places we're including gtkmm, I've got a ~22-25% speedup in ngscopeclient compile times on my mac. (~130s -> ~100s). The really long pole remaining in the tent is the FlowGraphNode.h -> Waveform.h -> AcceleratorBuffer.h ->
<d1b2>
vulkan_raii.hpp include chain.
<d1b2>
<ehntoo> if we can add an abstract interface in there or otherwise break the include chain so that every individual filter and protocol decoder doesn't need to include the vulkan headers, that'll be a really massive improvement
<azonenberg>
yeah that would be nice but i'm not sure how viable it is. but 25% speedup is a good start already
<azonenberg>
but basically, every filter ultimately has to be able to create waveforms and manipulate them
<azonenberg>
and if those waveforms are allocated in vulkan memory i'm not sure how viable breaking that chain is
<d1b2>
<ehntoo> yeah, I'm experimenting with a couple different changes. I definitely do not want to sacrifice any runtime performance just to get a compile time reduction.
<azonenberg>
yeah. in particular AcceleratorBuffer has to mostly be in a header because it's a template
<azonenberg>
and the implementations of the AcceleratorBuffer functions have to do vulkan stuff
<azonenberg>
i'm not sure if it's even possible, performance aside, to avoid that
<d1b2>
<ehntoo> or, it turns, out, I could just precompile the vulkan headers and get most of the speedup that I was spending a lot of time chasing. ~100s -> ~84s just by precompiling vulkan_raii.hpp. I'm going to check out a clean copy of the project and see what just precompiling the vulkan headers and gtkmm gets me without all the changes I've made.
<d1b2>
<ehntoo> I might still clean up a local branch I've got that strips usage of scopehal.h and scopeprotocols.h to improve incremental compiles, but it's a lot lower priority after this change. 🙂
<azonenberg>
ehntoo: yeah scopeprotocols.h is definitely included in way more places than it needs to be
<azonenberg>
scopeprotocols.cpp with all of the module registrations is really the only thing that needs every single filter
<azonenberg>
ehntoo: also i want to move away from use of gtk in scopehal. how much do we actually have using gtk in scopehal now?
<azonenberg>
i feel like there isnt a lot
<azonenberg>
we use sigcxx for callbacks but that's much lighter than all of gtk
<azonenberg>
and something i'm willing to keep around indefinitely
<azonenberg>
Benchmarking a full build of glscopeclient + ngscopeclient on my machine now
<azonenberg>
before/after the PCH
<d1b2>
<ehntoo> off the top of my head, there's some import/export stuff and a lot of usage of gdk::color.
<d1b2>
<ehntoo> the gdk::color change is a pretty straightforward swap, though it touches a lot of files since it's part of the Filter.h interface.
<azonenberg>
Yes. i want to replace gdk::color with a std::string using html color codes
<azonenberg>
hex rgb color codes are straightforward to convert to Gdk::Color or ImU32
<azonenberg>
on the glscopeclient or ngscopeclient side
<azonenberg>
it is, like you said, an exhaustive refactoring
<azonenberg>
scopeexports is all gtk code and will likely get kept as a glscopeclient-only module
<azonenberg>
with new export code written as an imgui frontend in ngscopeclient for headless backend code in scopeprotocols
<_whitenotifier-7>
[scopehal] ehntoo d49833c - Precompile selected headers to improve compile times Precompiling these system/library headers takes a clean build of ngscopeclient from ~100-135s to ~40-45 seconds on my macbook.
<_whitenotifier-7>
[scopehal] azonenberg 18726f4 - Merge pull request #701 from ehntoo/compile-speed-improvements Precompile selected headers to improve compile times
<d1b2>
<ehntoo> if there's a ticket for removing usage of Gdk::Color I can take it - I've spent enough time looking at the relevant code recently that I should be able to knock it out pretty quickly.
<azonenberg>
ehntoo: let me check. it's on the list of things i want done
<azonenberg>
i just dont know if there's a formal issue
<azonenberg>
also making a few fixes to your PCH to compile on linux, we have to include <memory> and a few others before the vulkan headers
<azonenberg>
test building that and pushing in a minute or two
<d1b2>
<ehntoo> 👍
<azonenberg>
very, very interesting
<azonenberg>
real 222.89
<azonenberg>
user 2204.68
<azonenberg>
sys 632.33
<azonenberg>
it seems when you throw enough cores at the build, you end up saturating them less (under half the total cpu-sec consumed) but only getting a slight speedup in the overall build
<azonenberg>
because it's bottlenecked on serial steps like linking
<azonenberg>
This was with make -j64 on dual socket xeon 6144s
<azonenberg>
if you have less ludicrous amounts of CPU i expect the speedup is likely significant
<d1b2>
<ehntoo> yeah, if you have that many cores to throw at it, you'll have a whole bunch of cores waiting for the initial two or three files to finish so the headers will be available
<d1b2>
<ehntoo> I do expect it to be a very significant win on CI.
<azonenberg>
Yes exactly. because although the new vm server will be 26 cores / 52 threads, i won't be using all of them for one build
<azonenberg>
this is to be shared among all of my existing virtualization workloads and a minimum of one linux and one windows ci server
<azonenberg>
ehntoo: how much faster is a clean build here? pulling in a bunch of STL to the PCH too
<azonenberg>
needed for linux build and might speed things on mac too
<d1b2>
<ehntoo> the 60% speedup I'm seeing with clang is with 11 compile threads running.
<azonenberg>
(with this vs your original PR i mean)
<d1b2>
<ehntoo> one sec, I'll run a test
<azonenberg>
in an hour or so we should have build feedback from the github CI runners
<azonenberg>
The last push from last night took 41 min for linux and 149 min for windows
<azonenberg>
to run CI
<d1b2>
<ehntoo> a clean build of ngscopeclient with the additional STL inclusions is 42s, which is very well within this machine's thermal envelope variance of my initial PR time
<azonenberg>
Ok so substantially no change
<azonenberg>
thats about what i figured
<azonenberg>
the point of putting them in the pch was mostly because linux required them to build vulkan_raii
<azonenberg>
also, i was sorely tempted to name the file stdafx.h
<azonenberg>
does that make me old? :p
<d1b2>
<ehntoo> maybe a little. 😉 as I'm trying to tell myself as I'm starting to find an occasional gray hair, that's the experience showing. 😛
<azonenberg>
to be fair, i was like 9 years old when i started writing C with a borland compiler on my dad's old 386 running win 3.1
<azonenberg>
but i didn't get seriously started until around two years later when we got a box running windows ME with vc98 on it
<azonenberg>
(we skipped the 95/98 era entirely)
<azonenberg>
this was visual c++ '98 bought on a cd-rom in a store, in person
<d1b2>
<ehntoo> I was recompiling linux kernels in middle school, but I don't think I started really programming seriously until the end of high school.
<azonenberg>
I mean i wasn't exactly the best programmer around back then :p
<azonenberg>
tl;dr it's possible to get one extra trigger if you stop in normal-trigger mode, and then get a stray waveform next time you re-arm the trigger
<azonenberg>
lain: almost ready to merge the queue refactoring? what's left to do on that?
<azonenberg>
ok so, next up in ngscopeclient is a few small history fixes (allowing you to change the history depth and delete old waveforms)
<azonenberg>
that shouldn't take long
<d1b2>
<ehntoo> Ugh, windows. I'll take a look.
<lain>
azonenberg: merging master into my macos branch today so I can test everything out
<lain>
or at least that's my plan. still struggling to wake up, currently
<_whitenotifier-7>
[scopehal] ehntoo 30da18e - Hopefully fix windows build pre-compiled headers YAML_INCLUDES is not available becase we're not using FindPackage for yaml-cpp on windows. Hard-code the path to the mingw64 package's includes dir for now.
<_whitenotifier-7>
[scopehal] azonenberg 460530c - Merge pull request #702 from ehntoo/fix-windows-build-pch Hopefully fix windows build pre-compiled headers
<_whitenotifier-7>
[scopehal-apps] azonenberg c72d292 - Updated submodules. Replaced magic number with enum value.
<d1b2>
<ehntoo> 🤞
<d1b2>
<ehntoo> one other thing I'm noticing comparing the CI setups and my own build environment is that I'm using ninja rather than make. I wouldn't expect it to make a huge difference but maybe it's worth experimenting with?
<azonenberg>
ehntoo: new build failed too
<d1b2>
<ehntoo> ugh, windows failure again.
<d1b2>
<ehntoo> apparently I was not fully understanding how the msys shell interacts with cmake and path specification
nelgau has quit [Remote host closed the connection]
<d1b2>
<ehntoo> we should probably get to the bottom of why the find_package isn't working for yaml-cpp, but this'll get builds running in CI at least.
<d1b2>
<ehntoo> I hope.
<azonenberg>
This will break builds for anyone building locally? or does find_package work outside the CI environment? or what
<d1b2>
<ehntoo> find_package should work everywhere - this hack will only work for the CI environment, though. :/
<azonenberg>
So in theory it should still work
<azonenberg>
it'll just add a nonexistent path to the search path
<_whitenotifier-7>
[scopehal] ehntoo fd511f1 - Hardcode the windows yaml-cpp path even more.
<_whitenotifier-7>
[scopehal] azonenberg a9a7a72 - Merge pull request #703 from ehntoo/really-fix-windows-pch-maybe Hardcode the windows yaml-cpp path even more.
<_whitenotifier-7>
[scopehal-apps] azonenberg dff949a - Update scopehal with hopefully fixed CI build?
<d1b2>
<ehntoo> I'm pretty sure this'll break windows builds for anyone that doesn't have their mingw64 environment installed in D:/a/_temp - the target_include_directories adds a build target to produce the precompiled header, and cmake will bail if it can't find the yaml.h "source" file.
<azonenberg>
Well, we'll at least have working CI in the mean time
<azonenberg>
i want to move away from gtk and mingw as fast as possible long term
<d1b2>
<ehntoo> yeah. if this doesn't work again, I think we can just conditionally not put yaml.h in the PCH on windows
<d1b2>
<ehntoo> Looks like the more-recent ubuntu builds are getting a better speedup of 24m. I'll definitely take that.
<d1b2>
<ehntoo> stepping away from my computer for a few hours at minimum, but if the windows CI does fail again, you can delete the "D:/a/_temp/mingw64/include/yaml-cpp/yaml.h" from line 225 of lib/scopehal/CMakeLists.txt to leave $<$<BOOL:${WIN32}>:,${YAML_INCLUDES}/yaml-cpp/yaml.h>, and that will definitely fix the windows build.
<azonenberg>
i'm an idiot, yeah
<azonenberg>
i was about to say why not just remove the yaml
<d1b2>
<ehntoo> oh, yeah, the homebrew vulkan sdk issues. I'd forgotten that you'd mentioned hitting that. We've already got a couple search paths in there, probably not too damaging if we tack that one on as well if it's the only change you're carrying locally?
<d1b2>
<david.rysk> yeah I'd suggest tacking that one on, as long as the nonexistence of it doesn't cause an issue
<d1b2>
<david.rysk> the only other problem is that it can't find ffts.h even though I'm giving it a CMAKE_PREFIX_PATH
<azonenberg>
well FFTS i want gone in the next few weeks
<azonenberg>
so that will be a temporary nuisance
<lain>
azonenberg: I got most of master merged into my macos branch locally, there's still a couple files that I'm gonna definitely need to be awake and alert to finish merging properly. assuming I get decent sleep tonight, that should get done early tomorrow and then I can hopefully get macos further along :D
<azonenberg>
Great. i should have the memory leaks fixed shortly, debugging that now
<azonenberg>
also fixed some bugs in history and should have that pushed by later today
<d1b2>
<david.rysk> aha, the ffts issues are because a502ca55dc6fb2ce4ad8f0ca71355c9328e44d1d was prematurely added to master
<d1b2>
<david.rysk> ideally that commit would be reverted and then brought back after the dependence on FFTS is fully removed
<d1b2>
<ehntoo> I don't think that commit is on master?
<d1b2>
<david.rysk> oops that's my bad
<d1b2>
<david.rysk> I'm on the macos branch, lemme fix that
<azonenberg>
ah yeah that's lain's dev branch
<azonenberg>
it's not something i'd recommend for general use
<azonenberg>
(yet)
<d1b2>
<ehntoo> hrm. I was hoping that github would run the macos workflow in the PR for #509. I guess I understand why they wouldn't run arbitrary actions in PRs though.
<azonenberg>
I need to approve actions to run in PRs i think
<d1b2>
<ehntoo> woo! windows build step was successful. Looks like the build step took 32m 20s compared to the previous successful run's 47m 15s. Still not fast, but it's not nothing.
<_whitenotifier-7>
[scopehal] azonenberg 264b9aa - Made Rename() a virtual method of Waveform, not just derived classes. Fixed handle leak in WaveformPool.
<_whitenotifier-7>
[scopehal-apps] azonenberg 54b9300 - HistoryDialog now allows deleting of individual items from context menu. Pin in history actually works. Fixed handle leak caused by adding waveforms to a pool after its destructor had already been called.
<azonenberg>
ehntoo: It's an improvement. i think using scopeprotocols.h less would be a big speedup all around
<azonenberg>
the full list of filters really only ever needs to be used in scopeprotocols.cpp
<azonenberg>
and nowhere else
<azonenberg>
IMO anything else including scopeprotocols.h is a bug
<d1b2>
<ehntoo> 👍 I'll take a look at that next. Now that I have CI running on my own fork can use those machines as a benchmark - some of the changes I was making were hard to measure on my local machine since the impact was low, but it'll be more significant on the slower runners.
<azonenberg>
Yeah. well once the new hardware comes in hopefully me and johnsel can figure out a good provisioning architecture to get things running locally and we'll see major speedups there
<azonenberg>
We'll also see big speedups by not rebuilding / installing dependencies every single commit longer term
<d1b2>
<david.rysk> lol of course it crashes in the version of ffts that I compiled
<d1b2>
<david.rysk> sigh ffts
<azonenberg>
there's a happy medium between "build isn't reproducible because it depends on custom environment and specific dependencies magically present in the CI environment" and "recompiling every dependency from source every commit"
<azonenberg>
that we're working to hit
<azonenberg>
david.rysk: yeah afaik ffts on arm64 is basically just nonfunctional
<azonenberg>
only the unit tests and spectrogram filter should be impacted
<azonenberg>
all of our other fft-using filters will preferentially use vkFFT
<azonenberg>
yeah there's also non backward compatible changes in the most recent vulkan sdk (1.3.229) i need to support
<d1b2>
<david.rysk> right...
<d1b2>
<ehntoo> there's a bunch of changes between catch2 v2 and v3. I've got a patch floating around my tree to update the test suite for it
<azonenberg>
file a ticket and we'll work on it when we get a chance. or send a PR
<azonenberg>
for now, we support catch2 actual 2
<azonenberg>
not catch2-but-really-3 lol
<d1b2>
<david.rysk> do you need to support both 2 and 3?
<azonenberg>
As a general rule we should be able to build on debian stable with as few from-source dependencies as possible
<azonenberg>
vulkan sdk is not in the debian repos but is available as a ppa or similar
<azonenberg>
we're moving away from ffts
<azonenberg>
we submodule vkfft and imgui
<d1b2>
<david.rysk> catch2 might be something worth doing as a submodule. not sure
<azonenberg>
yes, potentially worth looking into
<d1b2>
<david.rysk> I've gone over this quite a bit, but debian stable is rather unworkable for a desktop for many people
<azonenberg>
well it's what i use, so it has to run there :p
<d1b2>
<david.rysk> since the packages are just too old for a lot of stuff to work
<azonenberg>
that's not negotiable
<azonenberg>
generally speaking stable, or oldstable if stable came out <6 months ago, is what i target as e.g. minimum cmake version for building, etc
<d1b2>
<ehntoo> if you want it to work locally you can cherry-pick this patch. either submoduling the library or ifdef'ing it to compatibility with both should be doable as a more-permanent solution
<d1b2>
<david.rysk> ehntoo: is it possible to support both 2 and 3?
<azonenberg>
anything that requires a user of a stable distro to install a newer version of a package to work is a decision i make carefully and after exhausting other options
<lain>
yes
<d1b2>
<ehntoo> I'd just wait for lain's change to land. 🙂
<d1b2>
<david.rysk> IMO, requiring users of rolling-release distros that supply current non-beta packages to install an older version is equally as bad, if not worse.
<d1b2>
<david.rysk> right, that will hurt performance on unified memory systems
<azonenberg>
yep. open ticket that is probably next on lain's plate after her current work lands
<d1b2>
<david.rysk> switching to fullscreen works, but is not hidpi-aware
<azonenberg>
switching to fullscreen is not hidpi aware?
<azonenberg>
the app fullscreened vs maximized should look ~the same
<d1b2>
<david.rysk> yeah, everything becomes tiny
<azonenberg>
or is it not handling hidpi when maximized either?
<d1b2>
<david.rysk> maximized is fine
<azonenberg>
interesting. file a bug
<d1b2>
<david.rysk> specifically when I use view menu -> fullscreen
<d1b2>
<david.rysk> rather than the macos fullscreen
<d1b2>
<david.rysk> which is different
<d1b2>
<ehntoo> huh. I guess I haven't tried that yet.
<azonenberg>
yeah
<azonenberg>
well fullscreen works on windows and it didnt before
<azonenberg>
so its progress :p
<azonenberg>
scaling issues are not surprising and i can work on that
<azonenberg>
ehntoo: btw, you should not be -mavx512f globally
<azonenberg>
that will result in the generated binaries autovectorizing using avx512 everywhere and creating apps that won't run on anything without
<azonenberg>
what we do instead is compile with default machine settings and then enable vectorization for specific hot loops with dynamic dispatch for *those*
<d1b2>
<david.rysk> macOS has a weird way of doing fullscreen -- OS-aware fullscreen involves the app getting its own virtual desktop
<d1b2>
<david.rysk> when I click the titlebar button that does that, it works fine, with no hidpi issues
<d1b2>
<david.rysk> however, when I do the in-app view -> fullscreen, it seems to change my screen resolution
<d1b2>
<david.rysk> okay this was fun
<d1b2>
<david.rysk> I first did OS-aware fullscreen, then I did view->fullscreen
<azonenberg>
lol
<d1b2>
<david.rysk> and then switched back to my desktop, and now everything is tiny
<d1b2>
<david.rysk> it changed the overall screen resolution for the whole system 😮
<azonenberg>
would it perhaps make sense to just hide the fullscreen button / menu in the app on macos if there's already a working, OS-integrated way to do it?
<d1b2>
<david.rysk> yeah
<azonenberg>
Send a PR :)
<d1b2>
<david.rysk> though this all leads to the question of whether it's possible to use the proper macOS menubar on macOS
<d1b2>
<david.rysk> what were you using for I/O instead of SDL?
<azonenberg>
glfw
<azonenberg>
the menus are handled by imgui which i think does support a top level OS-integrated menu on macos
<azonenberg>
iirc we just arent using that
<d1b2>
<david.rysk> oh it does support that?
<azonenberg>
let me check
<d1b2>
<david.rysk> BeginMainMenuBar() but does this actually integrate with the OS or does it fake the menu bar
<azonenberg>
I think there was a flag somewhere to say use the os integrated menu
<azonenberg>
thats what i'm looking for
<d1b2>
<ehntoo> hm... the -mavx512f was added to avoid ABI incompatibility warnings about "AVX vector return of type '__m512i' ... without 'avx512f' enabled changes the ABI". Does clang not support attribute((target))? I'll investigate.
<azonenberg>
yeah look into it. any function that returns a vector type is using that ABI
<azonenberg>
but anything else should be using a no-avx abi
<azonenberg>
anyway if imgui can't do native menus on macos yet i'm sure that will come soon and not need code changes on our end
<d1b2>
<david.rysk> it may, it may not
<d1b2>
<david.rysk> we can live with built-in menus for now
<d1b2>
<ehntoo> did a little checking into menubar support while waiting to see if I found the magic incantation to fix the AVX ABI errors. It looks like it's something we'd have to support at the glfw layer with some custom platform code. https://discourse.glfw.org/t/modifying-glfw-window-menu-bar/1291
<d1b2>
<ehntoo> not something I'd personally prioritize though, the in-app menubar is good enough for me.