<set_> oops.
nparafe has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
nparafe has joined #beagle
<set_> Up, up, and otay!
<set_> Took me long enough...sheesh.
<set_> This git server is not getting any younger!
kaelin has joined #beagle
kaelin has quit [Client Quit]
kaelin has joined #beagle
<kaelin> Hey all, I have a somewhat random question; I figure someone here might be able to give me some pointers. What library/vendor is "libsrv_um.so"? It's invoked by OpenCL, which is in turn invoked by OpenCV, in my current case. I'm on an AI-64 with a recent Bullseye image. I see mentions of it as a vendored ARM-related binary but I am not seeing clear origins in search results.
<kaelin> Context:
<kaelin> Thread 1 "python3" received signal SIGABRT, Aborted.
<kaelin> __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
<kaelin> 50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
<kaelin> (gdb) bt
<kaelin> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
<kaelin> #1  0x0000fffff7cf6aa0 in __GI_abort () at abort.c:79
<kaelin> #2  0x0000ffffe17cc47c in  () at /usr/lib/libsrv_um.so
<kaelin> #3  0x0000ffffe183e860 in  () at /usr/lib/libOpenCL.so
<kaelin> #4  0x0000ffffe184dec0 in  () at /usr/lib/libOpenCL.so
<kaelin> #5  0x0000ffffe186c674 in  () at /usr/lib/libOpenCL.so
<kaelin> #6  0x0000fffff359cffc in cv::ocl::haveOpenCL() () at /usr/lib/aarch64-linux-gnu/libopencv_core.so.4.5
<kaelin> #7  0x0000fffff35b3e54 in cv::ocl::useOpenCL() () at /usr/lib/aarch64-linux-gnu/libopencv_core.so.4.5
<kaelin> #8  0x0000fffff3604ce8 in cv::UMat::getStdAllocator() () at /usr/lib/aarch64-linux-gnu/libopencv_core.so.4.5
<kaelin> #9  0x0000fffff3605580 in cv::UMat::create(int, int const*, int, cv::UMatUsageFlags) () at /usr/lib/aarch64-linux-gnu/libopencv_core.so.4.5
<kaelin> #10 0x0000fffff357da04 in cv::_OutputArray::create(int, int const*, int, int, bool, cv::_OutputArray::DepthMask) const () at /usr/lib/aarch64-linux-gnu/libopencv_core.so.4.5
<kaelin> #11 0x0000fffff34df720 in cv::Mat::copyTo(cv::_OutputArray const&) const () at /usr/lib/aarch64-linux-gnu/libopencv_core.so.4.5
<kaelin> #12 0x0000fffff4f5d2c0 in  () at /usr/lib/python3/dist-packages/cv2.cpython-39-aarch64-linux-gnu.so
* set_ usually uses pastebin.com.
<kaelin> Not asking for help debugging the crash (yet, at least...) just for pointers on the binaries.
<kaelin> Good point, will use an online sharing service next time.
kaelin has quit [Client Quit]
<set_> Yea. I get a lot of people not chatting b/c of my odd behavior.
kaelin has joined #beagle
kaelin has quit [Client Quit]
kaelin has joined #beagle
<set_> Git. Sheesh. This is harder than porting sound from the kernel to the BBB i2s.
demirok has quit [Quit: Leaving.]
Shadyman has quit [Ping timeout: 272 seconds]
<kaelin> re: my query, it seems that binary might be from PowerVR, which would make sense given the context. I still haven't found a definitive reference. And, more importantly, my question becomes: if I believe I've identified a bug in this OpenCV/OpenCL/PowerVR stack, who is it most appropriate to report to? All I'm really asking is where the BeagleBoard images get the binary from. If it's provided through TI then I'll report it to them.
<set_> The people that deal w/ the mfg. of the SGX.
<set_> I would think they would have a forum and support.
<set_> I know they do. I was just thinking if it is a bug in the stack, people may be more inclined to pitch in!
<set_> Imag. Tech.
<set_> I am pretty sure they mfg. the chip on the BBB. SGX.
<set_> kaelin: That is difficult stuff to handle. I am glad someone stuck it out!
<set_> It beat me.
<set_> I failed until my face turned Purple. My hair went already w/ the purple hue.
<kaelin> Yeah, I guess it depends on how much the GPU manufacturer is invested in supporting all the software that others have built on top. I'm fairly sure this is an oversight in the interface between either OpenCV/OpenCL or OpenCL/PowerVR drivers, but I don't know where it is. They might just throw it back to me as "can't repro".
<set_> Right...that is always a concern.
<set_> kaelin: Maybe wait around for someone else who deals w/ the SGX more often these days...
<set_> I know there was an image from beagleboard.org and they handled the port. I actually got some graphics displayed but not my own.
<kaelin> I really am not doing anything low-level with the SGX hardware. This is high-level libraries internally dispatching to OpenCL and the hardware accelerators. In this case I was fully ready to blame TI's deep learning library for memory corruption issues at first :D  I did some digging and experimentation and it turns out that this happens any time you pass floating-point pixel coordinates into OpenCV drawing functions. Usually OpenCV would detect it and error but something about this accelerated backend isn't checking for that case. Someone gets passed a float when they didn't expect it...
<kaelin> I'll wait to see if anyone has more precise suggestions and otherwise I'll ask the imagination guys and go from there.
<set_> Right, when I consumed my time around the SGX, I found them very helpful. At least the remainder of the men/women who were around that knew the chip.
<set_> That last sentence makes no sense. Sorry.
<set_> ... could have helped or could help. That should have been the ending to that sentence.
Shadyman has joined #beagle
<kaelin> haha I get the point, yeah that company/IP has been around a long time
<set_> I have been following the beagleboard.org people and their moves in the field for some time, e.g. getting the books, learning Linux, and doing hardware type things. I was surprised they wanted to go ahead w/ it all.
<set_> The BBAI has another type on it.
<set_> And then, the BBAi_64, I believe may have another type on it as well.
<set_> I can check...
<set_> PowerVR® Rogue™ 8XE GE8430 3D GPU
kaelin has quit [Quit: Client closed]
<set_> That is what is on the AI-64.
Shadyman has quit [Ping timeout: 260 seconds]
starblue1 has quit [Ping timeout: 246 seconds]
starblue1 has joined #beagle
Shadyman has joined #beagle
Shadyman has quit [Ping timeout: 255 seconds]
Shadyman has joined #beagle
<zmatt> TI says they don't support OpenCL for the GPU on the TDA4VM
kaelin has joined #beagle
<kaelin> Yeah, I can't speak to whether they officially support it -- the error message definitely seems to suggest they don't. But in this case it's a stock build of OpenCV (either pre-installed or installed via pip, I don't recall) and other libraries. So if there's a normal API where a user can unwittingly invoke the unsupported codepath, it's a bug in one of the libraries in the stack.
<kaelin> This one is caused by a call to "cv2.rectangle" (i.e., draw a rectangle). Nothing special, no custom builds or options, etc.
<kaelin> The only requirement as far as I can tell is that at least one of the pixel coordinates is a float
kaelin has quit [Quit: Client closed]
<zmatt> kaelin: ah, I see in the traceback that they're trying to probe whether openCL is available, and this triggers an abort in the powervr library... yeah that's pretty rude
kaelin has joined #beagle
<zmatt> kaelin: wb
<zmatt> kaelin: where does this libOpenCL.so come from? (dpkg-query --search libOpenCL.so)
<kaelin> Haha I was just writing to ponder where it came from
<zmatt> you could try temporarily renaming/moving it to see if that fixes the problem but my guess would be that other libs have a dep on it
<kaelin> I'm going to assume it's from a bb.org-vendored package, I'll need to check when I get back home in a couple hours.
<zmatt> kaelin: ah, I found something useful
<zmatt> it seems you can set the environment variable OPENCV_OPENCL_RUNTIME=disabled
<zmatt> to forcibly disable opencv support
<zmatt> opencl support in opencv I mean
<kaelin> Ooh nice, I'll definitely try that
<zmatt> the alternative would have been to just create a trivial clGetPlatformIDs() function in your program, overriding the one in libOpenCL.so
<kaelin> Haha yeah I didn't do too much digging after what I posted above (I've been out of the house) so I hadn't looked into the logic that invokes OpenCL. In this particular case, I'm working in Python so I don't have an easy way to override linker symbols, but the idea is reasonable.
kaelin has quit [Quit: Ping timeout (120 seconds)]
<zmatt> kaelin: it's surprisingly easy to override shared library symbols in any program, you don't even need the source code, you can use LD_PRELOAD: https://pastebin.com/34tjhKt5
<zmatt> kaelin: btw, in case it's of interest, TI does support OpenVX for hw-acceleration for vision stuff, but I don't know if that's useful for opencv or your particular application
ikarso has joined #beagle
kaelin has joined #beagle
<kaelin> OpenCV gives the _wrong_ error message (I think), but at least it's an appropriate parameter error, when disabling the OpenCL acceleration.
<kaelin> It's not a blocker for me since giving fractional pixel coords to OpenCV drawing functions is "incorrect" (and warrants an error), it's just an inscrutable error condition to arrive at which doesn't point you to the root cause. If OpenCL is going to crash on the BB then IMO it shouldn't be included/all backends should be disabled.
<kaelin> Yeah it looks like OpenCV has a codepath in its Python binding logic which, when it detects invalid parameters, ends up attempting to copy a matrix which in turn probes for OpenCL. So any time you pass an invalid parameter type to a Python OpenCV function on the BB it'll give this silly OpenCL abort assertion error and crash.
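Since the trigger described above is a float among the pixel coordinates, one way to sidestep it on the caller's side is to coerce coordinates to plain integers before they reach any drawing call. The following is only a minimal sketch under that assumption; the helper names `to_pixel` and `rectangle_args` are hypothetical and not part of OpenCV, and the `cv2.rectangle` call is left as a comment so the snippet runs without OpenCV installed.

```python
def to_pixel(point):
    """Round an (x, y) pair of numbers to a tuple of plain Python ints."""
    x, y = point
    return (int(round(x)), int(round(y)))


def rectangle_args(top_left, bottom_right):
    """Return both rectangle corners as integer pixel coordinates."""
    return to_pixel(top_left), to_pixel(bottom_right)


# Hypothetical float coordinates, e.g. from a detector's bounding box.
pt1, pt2 = rectangle_args((10.4, 20.6), (99.7, 80.0))

# With OpenCV installed, the integer corners could then be passed on, e.g.:
#   cv2.rectangle(img, pt1, pt2, (0, 255, 0), 2)
```

This does not fix the abort itself; it only keeps invalid parameter types away from the binding code that ends up probing for OpenCL.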
<zmatt> kaelin: I'm a bit worried about that other OpenCL.so.1, I wonder if there's shared library confusion happening
kaelin has quit [Read error: Connection reset by peer]
<zmatt> kaelin: anyway, OPENCV_OPENCL_RUNTIME=disabled looks like a good enough workaround. you can probably even set that environment variable in your python code (before importing any opencv library)
kaelin has joined #beagle
<zmatt> (os.environ['OPENCV_OPENCL_RUNTIME'] = 'disabled')
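A minimal sketch of the workaround suggested above, assuming the variable has to be in place before OpenCV first probes for OpenCL, which is why it is set ahead of the `cv2` import. The helper name `disable_opencv_opencl` is hypothetical, and the import is guarded so the snippet also runs on a machine without OpenCV.

```python
import os


def disable_opencv_opencl():
    """Ask OpenCV to skip its OpenCL runtime (hypothetical helper name)."""
    # OPENCV_OPENCL_RUNTIME=disabled is the setting quoted in the chat above.
    os.environ["OPENCV_OPENCL_RUNTIME"] = "disabled"
    return os.environ["OPENCV_OPENCL_RUNTIME"]


# Must run before the first "import cv2" anywhere in the process.
runtime_setting = disable_opencv_opencl()

try:
    import cv2  # imported only after the variable is set
except ImportError:
    cv2 = None  # OpenCV not installed; the environment variable is still set
```

Setting the variable in the shell or service unit that launches the script would have the same effect without touching the code.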
kaelin has quit [Read error: Connection reset by peer]
kaelin has joined #beagle
kaelin has quit [Read error: Connection reset by peer]
kaelin has joined #beagle
kaelin has left #beagle [#beagle]
kaelin has joined #beagle
demirok has joined #beagle
kaelin has quit [Read error: Connection reset by peer]
kaelin has joined #beagle
<kaelin> zmatt: Yeah, I'm not really worried about my own use-case anymore -- I can work around it -- I'm more worried about impact on other users. It's a pretty big flaw if a user encounters this without doing anything particularly strange on a stock BB image and board. I'll take another look tomorrow. But I've now run into this in a few different cases today while working with OpenCV which means it'd be really easy for an
<kaelin> up blocked by it.
kaelin has left #beagle [#beagle]
starblue1 has quit [Quit: WeeChat 3.0]
<zmatt> kaelin: removing ti-sgx-j721e-ddx-um entirely might also be an option assuming you don't actually intend to use the gpu
ikarso has quit [Quit: Connection closed for inactivity]
Shadyman has quit [Quit: Leaving.]
starblue has joined #beagle
demirok has quit [Quit: Leaving.]
xet7 has quit [Quit: Leaving]
ikarso has joined #beagle
otisolsen70 has joined #beagle
otisolsen70 has quit [Remote host closed the connection]
otisolsen70 has joined #beagle
CrazyEddy has quit [Ping timeout: 256 seconds]
starblue has quit [Ping timeout: 260 seconds]
CrazyEddy has joined #beagle
ikarso has quit [Quit: Connection closed for inactivity]
Shadyman has joined #beagle
vagrantc has joined #beagle
CrazyEddy has quit [Ping timeout: 260 seconds]
CrazyEddy has joined #beagle
brook has joined #beagle
ds2 has quit [Ping timeout: 260 seconds]
xet7 has joined #beagle
vagrantc has quit [Quit: leaving]
zjason` has joined #beagle
zjason has quit [Ping timeout: 264 seconds]
brook has quit [Remote host closed the connection]
vagrantc has joined #beagle
brook has joined #beagle
ikarso has joined #beagle
demirok has joined #beagle
vagrantc has quit [Quit: leaving]
starblue has joined #beagle
starblue has quit [Ping timeout: 246 seconds]
starblue has joined #beagle
starblue has quit [Ping timeout: 256 seconds]