azonenberg changed the topic of #scopehal to: libscopehal, libscopeprotocols, and glscopeclient development and testing | https://github.com/azonenberg/scopehal-apps | Logs: https://libera.irclog.whitequark.org/scopehal
<azonenberg> you sure? i thought that was the newer one
<azonenberg> in any case that's the one we use
<azonenberg> cl2.hpp
<dang`r`us> yeah, check the README at https://github.com/KhronosGroup/OpenCL-CLHPP
<azonenberg> oh interesting
<azonenberg> Now that i think about it
<azonenberg> I think there was a reason we couldn't use opencl.hpp on some platform
<azonenberg> so maybe the correct solution is to detect which is present
<azonenberg> and use opencl.hpp if available?
<dang`r`us> sounds like a good idea
<dang`r`us> mental note
<dang`r`us> currently still learning how to properly find stuff with cmake, I might have something ..
<azonenberg> aaand yeah i'm looking on debian
<azonenberg> I see cl.h, cl.hpp, cl2.hpp
<azonenberg> so cl.hpp must be the really old bindings
<azonenberg> cl2.hpp the middle, and opencl.hpp is too new for debian stable to have yet
<dang`r`us> °_o
<azonenberg> So we should use opencl.hpp if it's there, and if not fall back to cl2.hpp
<dang`r`us> the fun never stops when versioning apis
<azonenberg> well I think opencl.hpp and cl2.hpp are the same API, cl.hpp is the older incompatible one
<dang`r`us> yeah maybe plus fixes
<azonenberg> yeah
<azonenberg> so we should be able to target the new api and then just have cmake figure out which file is present
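A minimal sketch of the "use opencl.hpp if present, else cl2.hpp" idea, done with the C++17 preprocessor check `__has_include` rather than in CMake. The macro `CL_BINDINGS_HEADER` and the function `SelectedClBindingsHeader` are hypothetical names for illustration, not anything from scopehal. The actual include is left as a comment so the sketch compiles on machines without OpenCL installed.

```cpp
#include <string>

// Pick the newest OpenCL C++ bindings header the compiler can see.
// CL_BINDINGS_HEADER is a hypothetical macro name used only in this sketch.
#if defined(__has_include)
#  if __has_include(<CL/opencl.hpp>)
#    define CL_BINDINGS_HEADER "CL/opencl.hpp"
#  elif __has_include(<CL/cl2.hpp>)
#    define CL_BINDINGS_HEADER "CL/cl2.hpp"
#  endif
#endif

// Neither header found (or no __has_include support): report an empty name.
#ifndef CL_BINDINGS_HEADER
#  define CL_BINDINGS_HEADER ""
#endif

// A real build would continue with a computed include, for example:
//   #include CL_BINDINGS_HEADER
// It is omitted here so the sketch does not need the OpenCL C headers.

// Returns the header this translation unit would have pulled in.
std::string SelectedClBindingsHeader()
{
    return CL_BINDINGS_HEADER;
}
```

Since both headers are believed to expose the same API, code written against the newer one should not need to care which was selected.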
<dang`r`us> so I guess opencl.hpp is the one to *vendor*, but the *search* checks the system for both cl2.hpp and opencl.hpp, uses the newer of those, and *falls back to vendor/opencl.hpp* if neither is found
<dang`r`us> I mean, or just always vendor it ..
<dang`r`us> though, maybe that then conflicts with older c++ compilers, who knows
<azonenberg> I dont think we need to vendor it
<azonenberg> if osx has opencl.hpp
<dang`r`us> it does not
<azonenberg> So it only has the C headers?
<dang`r`us> yup.
<azonenberg> Rather than vendoring (which risks becoming out of date easily), how about we just pull the khronos header in as a submodule and point to it?
<azonenberg> you can still put the submodule in a vendor directory
<azonenberg> but pull KhronosGroup/OpenCL-CLHPP in directly rather than the single file
<azonenberg> It appears they considered this
<dang`r`us> that sounds great
<azonenberg> In fact, at that point we dont even need to depend on the system opencl.hpp at all
<dang`r`us> cool
<dang`r`us> was about to ask
<azonenberg> Pull in this repo as a submodule, *always* use it
<azonenberg> then all we depend on is the system OpenCL C headers
<dang`r`us> done
<dang`r`us> still fighting a yaml linker error, no idea where that's coming from
<dang`r`us> and wow this intel mac sure is slow compared to the m1
<dang`r`us> (a fully decked out macbook pro, last intel one they released ... ouch)
<dang`r`us> ok linker error seems to be the same as over here: https://github.com/NaluCFD/Nalu/issues/274
<dang`r`us> on macos it's going to be a .dylib, but the generated linker command does not respect that
<dang`r`us> ok, might have a simple fix..
<dang`r`us> yup.
<dang`r`us> so I found this in glscopeclient's CMakeLists.txt: https://dpaste.org/AixO -- similar problem under macos, what works there is to add GLEW::GLEW to target_link_libraries (instead of ${GLEW_LIBRARIES} I suppose? leaving the latter there doesn't seem to interfere however)
<dang`r`us> apart from that, the clean intel mac port looks "done" (as in, it links, starts and trips over like the dirty m1 port). Will prepare PRs tomorrow, then do the m1 port, then tackle OpenCL
<dang`r`us> almost 4am here X_X
<Bird|ghosted> yeah, GLEW::GLEW is the more modern form AIUI
<azonenberg> dang`r`us: Great. We still have the 500 pound gorilla that is porting the renderer to OpenCL
<dang`r`us> might be able to remove the windows special case then (I don't have the time to set up a windows build tho)
<dang`r`us> azonenberg, a matter of simple find and replace!
<azonenberg> And then the glue that figures out whether to use GL 4.1 + CL or GL 4.3
<dang`r`us> #include <GL/ue.h>
<azonenberg> lool
<dang`r`us> I think that one defines glDrawElephants
<azonenberg> We don't need more glue, we already have GLEW and GLU
<azonenberg> :p
<azonenberg> What about the teapotohedron?
<dang`r`us> the elusive 6th platonic solid
<dang`r`us> anyways, n8!
<_whitenotifier-7> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±5] https://git.io/JG8LV
<_whitenotifier-7> [scopehal-apps] azonenberg cfa50b5 - Added vertical trigger position line. Fixes #362.
<_whitenotifier-7> [scopehal-apps] azonenberg closed issue #362: Enhancement: Trigger position cursor - https://git.io/JGYND
Degi_ has joined #scopehal
Degi has quit [Ping timeout: 272 seconds]
Degi_ is now known as Degi
<azonenberg> Have neither the budget nor lab space
<azonenberg> but if anybody is looking for a 50 GHz sampling scope
<azonenberg> lecroy is getting rid of an agilent scope from their lab
<_whitenotifier-7> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JG8Dh
<_whitenotifier-7> [scopehal] azonenberg b0a831b - Oscilloscope: unrolled Convert16BitSamplesAVX2 loop to 32 samples per iteration
<_whitenotifier-7> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±1] https://git.io/JG8QN
<_whitenotifier-7> [scopehal] azonenberg 9abfb59 - PicoOscilloscope: convert samples for each analog channel in parallel
<GyrosGeier> dang`r`us, my fork only exists to submit PRs :P
<azonenberg> GyrosGeier: anthonix seems to have dropped off the radar
<azonenberg> I don't think anything submitted to upstream will ever be merged
<azonenberg> We need to make a fork that will become the authoritative new upstream
<GyrosGeier> mmh
<azonenberg> They've had no activity on github in four years
<GyrosGeier> I just wanted to do a bit of packaging :/
<azonenberg> iirc
<azonenberg> Yeah, last activity was june 2017
<azonenberg> He does have a linkedin page with a mutual contact though
<azonenberg> I might reach out to him
<_whitenotifier-7> [scopehal-apps] azonenberg pushed 1 commit to master [+0/-0/±3] https://git.io/JG484
<_whitenotifier-7> [scopehal-apps] azonenberg 510b518 - Cursors now snap to digital and protocol edges if within 5 pixels. Fixes #357.
<_whitenotifier-7> [scopehal-apps] azonenberg closed issue #357: When moving a cursor over a digital channel, it should snap to closest transition if within some radius - https://git.io/JsMhO
someone--else has joined #scopehal
bvernoux has joined #scopehal
<_whitenotifier-7> [scopehal] spookyvision opened pull request #489: fix #474 (BSD build support) - https://git.io/JGBT8
<dang`r`us> *ding*
<_whitenotifier-7> [scopehal-apps] spookyvision opened pull request #366: fix BSD build support (#345) - https://git.io/JGBIw
<dang`r`us> *ding*
<dang`r`us> can't ever quite remember how to format these PRs
<GyrosGeier> whee
<_whitenotifier-7> [scopehal] ericonr reviewed pull request #489 commit - https://git.io/JGBLV
<_whitenotifier-7> [scopehal] ericonr reviewed pull request #489 commit - https://git.io/JGBLr
<GyrosGeier> hm
<_whitenotifier-7> [scopehal-apps] ericonr reviewed pull request #366 commit - https://git.io/JGBLN
* ericonr is against ifdef forests
<GyrosGeier> the CL path could be done with a conditional include_directories()
<GyrosGeier> possibly target_include_directories
<_whitenotifier-7> [scopehal] spookyvision synchronize pull request #489: fix #474 (BSD build support) - https://git.io/JGBT8
<_whitenotifier-7> [scopehal] spookyvision reviewed pull request #489 commit - https://git.io/JGBtu
<ericonr> dang`r`us: I guess you mixed up the pthread_setname stuff with the stat stuff?
<dang`r`us> GyrosGeier, which ifdef in particular?
<dang`r`us> ericonr, yes, thanks for spotting
<ericonr> also, fucking apple, man
<dang`r`us> ah sorry GyrosGeier I was mixing up your line and ericonr's
<dang`r`us> ericonr, I like the utility function idea
<dang`r`us> let's see if I can whip something up that's not grossly offensive to more experienced C++ devs
someone--else has quit [Quit: Connection closed]
<dang`r`us> ericonr, how would you resolve the ifdef cascade? Spell all conditions out 2 times?
<ericonr> dang`r`us: IMO a utility function is enough
<ericonr> keep the ifdef'ing contained :)
<dang`r`us> heh, ok
<ericonr> dang`r`us: tooting my own horn, but I really like how I did https://github.com/ericonr/purr-c/blob/master/compat.c
<ericonr> have a single file with the platform specific concerns
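A rough sketch of what such a single compatibility file could look like for thread naming, assuming the platform differences are the usual ones: macOS takes only the name and applies it to the calling thread, glibc takes a thread handle plus a name, and FreeBSD/OpenBSD provide `pthread_set_name_np` in `<pthread_np.h>`. The function name `thread_setname_compat` is hypothetical and this is not the wrapper that ended up in the PR.

```cpp
#include <pthread.h>
#if defined(__FreeBSD__) || defined(__OpenBSD__)
#include <pthread_np.h>
#endif

// Hypothetical helper: names the calling thread, returns true on success.
// All platform #ifdefs live here so callers stay free of them.
bool thread_setname_compat(const char* name)
{
#if defined(__APPLE__)
    // macOS: only the calling thread can be named, so no handle is passed.
    return pthread_setname_np(name) == 0;
#elif defined(__FreeBSD__) || defined(__OpenBSD__)
    // BSD variant returns void, so success is assumed.
    pthread_set_name_np(pthread_self(), name);
    return true;
#elif defined(__linux__)
    // Linux: fails if the name is longer than 15 characters plus the NUL.
    return pthread_setname_np(pthread_self(), name) == 0;
#else
    // Unknown platform: do nothing and report failure.
    (void)name;
    return false;
#endif
}
```

Callers then just write `thread_setname_compat("worker")` with no conditionals of their own. On older glibc versions the build may need `-pthread`.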
<dang`r`us> aye
<dang`r`us> hm, I've added my compat.c to add_executable(glscopeclient; it gets built and its artefact is included in the linker call, but: symbol not found. Do I need to do some kind of special dance here or is my C just rusty?
<ericonr> dang`r`us: can you pastebin git diff
<ericonr> also basic check, you didn't make the functions static, right? :P
_whitelogger has joined #scopehal
<dang`r`us> ericonr, https://dpaste.org/K7EU
bvernoux1 has joined #scopehal
<ericonr> nit, I think pthread_* might be reserved, so just "thread_setname" or whatever sounds better to me
bvernoux has quit [Ping timeout: 268 seconds]
<ericonr> dang`r`us: try a different order
<ericonr> in CMakeLists.txt
<ericonr> usually symbols needed by object A are only searched in the arguments passed to the linker after A
<dang`r`us> hhhnnnrrgh
<dang`r`us> I was assuming the exact opposite
<dang`r`us> didn't help
<dang`r`us> mangling maybe?
<dang`r`us> linker says it can't find "pthread_setname_np_compat(char const*)"
<dang`r`us> that's a very odd symbol name
<dang`r`us> with the parentheses and everything
<ericonr> oooh right
<ericonr> mangling sounds reasonable
<ericonr> just make it a cpp file :P
<ericonr> easier than extern "C" {}
<dang`r`us> so just rename .c to .cpp ?
<ericonr> yeah
<dang`r`us> [100%] Built target glscopeclient
<dang`r`us> tadaaah
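For reference, the `extern "C"` alternative mentioned above could look roughly like this. A C++ caller looks for a mangled symbol such as `pthread_setname_np_compat(char const*)`, while a `.c` file exports the plain unmangled name, so the declaration seen by C++ has to ask for C linkage. The function `compat_name_length` is a made-up stand-in, and here the definition sits in the same file only so the sketch is self-contained.

```cpp
#include <cstring>

// What the shared header would contain: the guard gives the declaration
// C linkage when a C++ compiler reads it, and vanishes for a C compiler.
#ifdef __cplusplus
extern "C" {
#endif

int compat_name_length(const char* name);   // hypothetical helper

#ifdef __cplusplus
}
#endif

// The definition would normally live in compat.c. Because the declaration
// above already has C linkage, this definition exports the unmangled symbol.
int compat_name_length(const char* name)
{
    return static_cast<int>(std::strlen(name));
}
```

Renaming the file to `.cpp` avoids the guard entirely, at the cost of the helper no longer being callable from C.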
<_whitenotifier-7> [scopehal-apps] spookyvision synchronize pull request #366: fix BSD build support (#345) - https://git.io/JGBIw
<ericonr> yay
bvernoux has joined #scopehal
bvernoux1 has quit [Ping timeout: 272 seconds]
someone--else has joined #scopehal
<dang`r`us> ericonr, thoughts on the following? I'm doing the m1 port now which leads to a few of those:
<dang`r`us> #ifdef __x86_64__
<dang`r`us> #include <immintrin.h>
<dang`r`us> #endif
<dang`r`us> should I instead create a platform_intrinsics.h which wraps those 3 lines?
<dang`r`us> probably not a bad idea, repetition and all
<ericonr> dang`r`us: I guess it depends on how intrinsics are used inside the code, but that's a small enough change that I'd ask azonenberg about it too
<ericonr> s/change/burden/
<dang`r`us> aye - going ahead with my suggestion for now, can change back later
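A sketch of what a `platform_intrinsics.h` style wrapper plus one caller might look like, assuming the compiler defines `__x86_64__` (GCC/Clang, lowercase) or `_M_X64` (MSVC) on 64-bit x86. `PLATFORM_HAS_X86_SIMD` and `AddFloats` are hypothetical names. The SSE path uses only baseline x86_64 instructions, and every other target (such as the M1) takes the scalar loop.

```cpp
#include <cstddef>

// --- what platform_intrinsics.h would contain ---
#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
#define PLATFORM_HAS_X86_SIMD 1
#else
#define PLATFORM_HAS_X86_SIMD 0
#endif

// --- example caller ---
// Adds two float arrays element by element.
void AddFloats(const float* a, const float* b, float* out, std::size_t n)
{
    std::size_t i = 0;

#if PLATFORM_HAS_X86_SIMD
    // Four floats per iteration using unaligned SSE loads and stores.
    for(; i + 4 <= n; i += 4)
    {
        const __m128 va = _mm_loadu_ps(a + i);
        const __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif

    // Scalar tail on x86_64, or the whole array on other targets.
    for(; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Callers test `PLATFORM_HAS_X86_SIMD` instead of repeating the architecture check in every file.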
<dang`r`us> ok, that was smooth sailing otherwise.
bvernoux has quit [Quit: Leaving]
bvernoux has joined #scopehal
<GyrosGeier> mmintrin.h is a bit more common
<GyrosGeier> at least that exists on ppc, arm, amd64 and ia32
<dang`r`us> according to my research it's not that clear cut
<dang`r`us> or, hm, I confused this with another intrinsic. anyway I don't see any use for immintrin.h outside of x86_64 targets
<dang`r`us> so, hm.. not sure this is bug report worthy, buutttt the g_searchPaths.push_back("..."); approach falls over when building with -DCMAKE_INSTALL_PREFIX=/some/custom/path
<dang`r`us> in happier news I got clFFT to link and now (using a symlink hotfix) got rid of all the file not founds
* dang`r`us (staring at WaveformArea.cpp and suddenly having GLSL boilerplate horror flashbacks)
<GyrosGeier> yes, the "does it actually find all the shaders" bit is why I asked for testers
<dang`r`us> also - reading some light intro material on OpenCL --> oof
<dang`r`us> azonenberg, so .. I'm not giving up yet but I don't think I'll be able to do this all by myself. mostly a workload thing, I think I understand the code well enough by now; OpenCL looks pretty daunting though. especially unpleasant is that according to intel docs I found performance characteristics of certain OpenCL paradigms are exactly opposite depending on whether the CL device is a CPU or a
<dang`r`us> GPU ... (I'd be inclined to optimize just for GPUs here, we're interoperating with GL buffers anyway)
<dang`r`us> so in terms of execution model my current understanding is in openCL we'd have a single 2D NDRange(w,h) instead of n(=64) gpu threads each doing a column?
Stary has quit [Changing host]
Stary has joined #scopehal
<GyrosGeier> CL has n-dimensional working sets, and splits that across the number of units
<GyrosGeier> the dimensionality is just convenience so the user doesn't have to care about the number of available units
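To make the two execution models concrete, here is a serial CPU emulation, not the actual shader or kernel. `Shade` is a made-up per-pixel function, `RenderByColumns` mimics a fixed pool of workers each walking whole columns, and `RenderByNDRange` mimics one work item per (x, y) as a 2D NDRange(w, h) would enumerate them. All names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for whatever is computed per pixel.
static int Shade(std::size_t x, std::size_t y)
{
    return static_cast<int>(x * 1000 + y);
}

// Fixed number of workers; worker t handles columns t, t+workers, ...
// Each outer iteration plays the role of one GPU thread.
std::vector<int> RenderByColumns(std::size_t w, std::size_t h, std::size_t workers)
{
    std::vector<int> img(w * h);
    for(std::size_t t = 0; t < workers; t++)
        for(std::size_t x = t; x < w; x += workers)
            for(std::size_t y = 0; y < h; y++)
                img[y * w + x] = Shade(x, y);
    return img;
}

// One work item per pixel; the runtime, not the kernel, decides how the
// (x, y) index space is split across the available compute units.
std::vector<int> RenderByNDRange(std::size_t w, std::size_t h)
{
    std::vector<int> img(w * h);
    for(std::size_t y = 0; y < h; y++)
        for(std::size_t x = 0; x < w; x++)
            img[y * w + x] = Shade(x, y);
    return img;
}
```

Both fill the same image. The difference is only in who owns the loop over columns: the kernel author in the first form, the runtime in the second.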
<GyrosGeier> directly transporting buffers from CL to GL is a CL extension, "GL interop"
<GyrosGeier> in general, the performance characteristics are different because the architecture is different
<GyrosGeier> GPUs don't like control flow instructions, CPUs don't like keeping wide vectors in registers
<GyrosGeier> good CL implementations should fix that up though
<GyrosGeier> basically "compute a; compute b; select(p, a, b);" can be rewritten as "if(p) compute a; else compute b;" and vice versa
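The equivalence described here can be sketched in plain C++, assuming both computations are free of side effects (otherwise always evaluating both would change behavior). `ComputeA`, `ComputeB`, `BranchForm`, and `SelectForm` are hypothetical names. Note that the real OpenCL C built-in `select(a, b, c)` takes the condition last and returns `b` when `c` is true, so the shorthand above is not its literal argument order.

```cpp
// Two arbitrary side-effect-free computations.
static int ComputeA(int x) { return x * 2; }
static int ComputeB(int x) { return x + 100; }

// Control-flow form: only one side is evaluated. Tends to suit CPUs.
int BranchForm(bool p, int x)
{
    if(p)
        return ComputeA(x);
    else
        return ComputeB(x);
}

// Select form: both sides are evaluated, then one is picked without
// branching on p. Tends to suit GPUs, which dislike divergent branches.
int SelectForm(bool p, int x)
{
    const int a = ComputeA(x);
    const int b = ComputeB(x);
    return p ? a : b;
}
```

A compiler that knows both sides are pure is free to turn either form into the other, which is the fix-up a good CL implementation is expected to do.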
<GenTooMan> finally debian updated cmake to 3.16.3 from 3.13.4 (2015) so now I can build other stuff (instead of hope and pray)
<dang`r`us> granted, this is specific to *intel* GPUs, but I'd still not be surprised if this held more or less generally
<dang`r`us> regarding the working sets specifically I'm not sure what to do with your explanation, it seems to be in line with my understanding
<dang`r`us> I was going for the difference to the current compute shader impl which does a row/column separation
<dang`r`us> regarding extensions, clCreateFromGLTexture doesn't seem to inherently require any, depending on what exactly you want to do with it (or am I missing something?)
<azonenberg> dang`r`us: We specifically filter opencl devices for GPU right now
<azonenberg> Because none of the kernels are optimized for CPU targets
<dang`r`us> makes sense
<azonenberg> If it can't find a GPU it just gives up and reports no opencl available
<ericonr> are there situations where optimizing an opencl kernel for CPU is worth it more than writing a CPU impl directly?
<dang`r`us> azonenberg, I've already located the CL related globals, I suppose they would need to move slightly higher up so they can be accessed by the scope renderer as well?
<dang`r`us> ericonr, maybe a bit of a pipe dream where one would abstract over different flavors of SIMD
<ericonr> heh
<azonenberg> ericonr: faster development perhaps? but we have few enough of them i dont see it being worth it
<ericonr> intel has their own CPU opencl impl, are there any others?
<azonenberg> dang`r`us: i dont think they will, they're global in libscopehal and glscopeclient includes scopehal.h
<azonenberg> i assume amd does
<azonenberg> not sure if anyone has made a NEON opencl backend
<dang`r`us> azonenberg, oh, I thought they were only accessible from inside scopehal.cpp
<azonenberg> No, they're used all over libscopehal and libscopeprotocols for compute acceleration
<azonenberg> check out e.g. the de-embed filter
<dang`r`us> but yeah, they are in the .h too, doh
<azonenberg> the only stuff in scopehal.cpp is the glue that detects available devices
<azonenberg> which only has to be done once and is already called by glscopeclient's startup code
<dang`r`us> aye
<dang`r`us> so I suppose the branch to CL would be in WaveformArea.cpp around line 542 (check for GL4.2)?
<dang`r`us> or is glBindImageTexture also required for non-compute passes
<azonenberg> You'd also want to copy some of the check on 551
<azonenberg> Some of those extensions are used in the non-compute rendering
<azonenberg> the majority are for compute though
<azonenberg> But if any of the non-compute stuff is missing we'll have to find replacements for it
<azonenberg> The M1 has GL_EXT_blend_equation_separate and GL_EXT_framebuffer_object
<azonenberg> It also has GL_APPLE_vertex_array_object and I'm unsure if this is interchangeable with GL_ARB_vertex_array_object
<azonenberg> so you'll have to do some reading
<azonenberg> GL_ARB_arrays_of_arrays, GL_ARB_compute_shader, and GL_ARB_shader_storage_buffer_object are compute only
<dang`r`us> the fun never stops when you GL_EXT
<dang`r`us> personally I'd like to keep extension use to a minimum (both in GL and CL) but what can you do sometimes ...
<azonenberg> These are pretty much minimums
<azonenberg> Most of those extensions are core in GL 4.3, i switched from 4.3 to 4.2 plus those extensions to gain support for older intel integrated cards
<azonenberg> What i don't know is whether anything in 4.1 -> 4.2 we're missing that we care about
<azonenberg> glBindImageTexture i think is only used so we can render to a texture from a compute shader
<dang`r`us> ok! :)
<dang`r`us> GL_APPLE_vertex_array_object seems absolutely *ancient* (2002?) compared to ARB (2008-2012). Oh well, we'll see
<azonenberg> It's entirely possible apple had it all the time and ARB adopted it more recently
<dang`r`us> true
<azonenberg> But also, the VAO usage is pretty minimal
<azonenberg> we use it to draw a total of like three fullscreen textured quads (2 triangles each)
<dang`r`us> ok that sounds pretty manageable ;D
<azonenberg> almost all of our rendering is compute based
<dang`r`us> aye
<azonenberg> the remainder is basically "draw textured quad with a simple shader to mess with colors or alpha blending a bit"
<dang`r`us> is VAO even worth it with so few verts
<azonenberg> It might well be possible to remove entirely
<azonenberg> If the non-compute rendering can be overhauled i'm all for it
<dang`r`us> it's been ages since I've last done anything serious in GL, so take everything I say with a mine of salt
<dang`r`us> but I'll have a look
<_whitenotifier-7> [scopehal] azonenberg closed issue #474: BSD build support - https://git.io/JsTtr
<_whitenotifier-7> [scopehal] azonenberg closed pull request #489: fix #474 (BSD build support) - https://git.io/JGBT8
<_whitenotifier-7> [scopehal] azonenberg pushed 3 commits to master [+0/-0/±9] https://git.io/JGRsf
<_whitenotifier-7> [scopehal] spookyvision 1bd3a59 - fix #474 (BSD build support)
<_whitenotifier-7> [scopehal] spookyvision ed6b3c8 - remove wrong freebsd special case
<_whitenotifier-7> [scopehal] azonenberg 1bae7a2 - Merge pull request #489 from spookyvision/lib_intel_mac fix #474 (BSD build support)
<dang`r`us> \o/
<dang`r`us> I did manage to pull in clFFT btw, should I make a separate PR for that? currently it lives in my m1 port but it's strictly speaking not related to it
<dang`r`us> (it's just one additional fallback search for the lib, since homebrew doesn't install a pkg-config file for it)
<azonenberg> Yeah do that separately
<azonenberg> right now i'm trying to figure out why my submodules didn't add the new directory properly
<_whitenotifier-7> [scopehal-apps] azonenberg closed issue #345: BSD build support - https://git.io/JsTtP
<dang`r`us> I might have messed that one up - not a submodule pro
<_whitenotifier-7> [scopehal-apps] azonenberg pushed 3 commits to master [+4/-0/±11] https://git.io/JGRZC
<_whitenotifier-7> [scopehal-apps] spookyvision 613118e - fix #345
<_whitenotifier-7> [scopehal-apps] spookyvision cb8482a - pthread compat wrapper
<_whitenotifier-7> [scopehal-apps] azonenberg 15a189d - Merge pull request #366 from spookyvision/intel_mac fix BSD build support (#345)
<_whitenotifier-7> [scopehal-apps] azonenberg closed pull request #366: fix BSD build support (#345) - https://git.io/JGBIw
<azonenberg> yeah i'm investigating that
<azonenberg> gimme a bit
<azonenberg> I think it's just some kind of stale caching or something
<_whitenotifier-7> [scopehal] azonenberg pushed 1 commit to master [+1/-0/±0] https://git.io/JGRnJ
<_whitenotifier-7> [scopehal] azonenberg b46450c - Added OpenCL CLHPP as submodule
<_whitenotifier-7> [scopehal-apps] azonenberg pushed 2 commits to master [+0/-0/±4] https://git.io/JGRnU
<_whitenotifier-7> [scopehal-apps] azonenberg 945ccd5 - pthread_compat: minor coding style fixes to curly braces, added header comments
<_whitenotifier-7> [scopehal-apps] azonenberg cda6586 - Updated to latest scopehal
<azonenberg> I think that did it
<dang`r`us> wheee
<azonenberg> Yeeeeesh
<azonenberg> thousands of warnings in opencl.hpp from -Wmissing-field-initializers
<azonenberg> We're going to have to tell gcc that's a system include so it ignores it
<dang`r`us> fun times yeah
<dang`r`us> clang complains too
<azonenberg> Yeah gimme a minute
<_whitenotifier-7> [scopehal] azonenberg pushed 1 commit to master [+0/-0/±2] https://git.io/JGRcg
<_whitenotifier-7> [scopehal] azonenberg b695cd3 - Disable warning spam in opencl.hpp
<_whitenotifier-7> [scopehal-apps] azonenberg pushed 2 commits to master [+0/-0/±2] https://git.io/JGRcS
<_whitenotifier-7> [scopehal-apps] azonenberg 486d390 - Updated scopehal
<_whitenotifier-7> [scopehal-apps] azonenberg 708a6b0 - pthread_compat.cpp: include pthread_compat.h so we have the prototype
<azonenberg> That should take care of it
<dang`r`us> oh, neat
<dang`r`us> didn't know the SYSTEM trick
<azonenberg> the SYSTEM thing SHOULD eliminate all warnings but wasnt enough
<dang`r`us> did I really forget my own .h include
<azonenberg> that was just so you could include <CL/opencl.hpp>
<dang`r`us> jeez
<dang`r`us> ah
<azonenberg> the pragmas in scopehal.h actually disable the warnings
<azonenberg> Which should work on both gcc and clang
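A sketch of the pragma approach, assuming it follows the usual push/ignore/pop pattern (the actual contents of scopehal.h are not shown in this log). Clang also honors `#pragma GCC diagnostic`. `VendorConfig` and `MakeVendorConfig` are made-up stand-ins for the third-party header that would normally sit between the pragmas.

```cpp
// Save the current warning state and silence one specific warning.
#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmissing-field-initializers"
#endif

// In real code this region would be:  #include <CL/opencl.hpp>
// The stand-in below leaves two fields uninitialized in the brace list,
// which typically triggers the warning under -Wextra.
struct VendorConfig
{
    int width;
    int height;
    int flags;
};

inline VendorConfig MakeVendorConfig()
{
    VendorConfig c = { 640 };   // height and flags are zero-initialized
    return c;
}

// Restore the previous warning state so our own code is still checked.
#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic pop
#endif
```

The push/pop pair keeps the suppression scoped to the vendor code, so the same warning still fires for project sources included afterwards.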
<azonenberg> And yes you forgot your own include, you also didn't put the curly brace on its own line (coding style)
<azonenberg> i wasn't going to reject the PR for those though, trivial to fix myself
<dang`r`us> mental note: it helps to read PRs till the end (re pragmas)
<azonenberg> And it's failing CI tests on windows, hmm
<dang`r`us> whoops
<azonenberg> it's not you, i think
<dang`r`us> intriguing
<dang`r`us> I mean I did change things related to yaml
<azonenberg> Well, please investigate
<azonenberg> it looks like something is getting multiply defined
<dang`r`us> mmhmmm
<dang`r`us> I guess yaml-cpp is dynamically linked on macos
<azonenberg> it should be on linux too
<azonenberg> $ ldd src/glscopeclient/glscopeclient | grep yaml
<azonenberg> libyaml-cpp.so.0.6 => /usr/lib/x86_64-linux-gnu/libyaml-cpp.so.0.6 (0x00007fd8a4cb4000)
<dang`r`us> only linking related change I made though was to switch from target_link_libraries(... yaml-cpp ...) to (... ${YAML_LIBRARIES} ...)
<azonenberg> is it possible you somehow end up linking to the lib twice on windows?
<dang`r`us> I also added find_package(Yaml REQUIRED) to scopehal
<dang`r`us> but that on its own shouldn't change linkage ... or is that possible?
<azonenberg> don't know. you broke it, investigate :p
<dang`r`us> I am
<dang`r`us> I'm however out of ideas
<dang`r`us> sec, idea..
<dang`r`us> hm, no
<dang`r`us> why is it even trying to link yaml-cpp into scopeprotocols
<dang`r`us> the one thing that's fishy is that GLEW libraries are now maybe added twice. But I see no yaml there either. Still .. lemme prepare something
bvernoux has quit [Quit: Leaving]
<_whitenotifier-7> [scopehal-apps] spookyvision opened pull request #367: remove potential glew duplication - https://git.io/JGR0Y
<dang`r`us> let's see.
<dang`r`us> hm, no.
<dang`r`us> a problem for future dang. Very tired rn
esden has quit [Read error: Connection reset by peer]
elms has quit [Ping timeout: 264 seconds]
elms has joined #scopehal
esden has joined #scopehal