aindilis_ has quit [Quit: ZNC 1.8.2+deb3.1+deb12u1 - https://znc.in]
aindilis has joined #ruby
jasfloss has joined #ruby
CRISPR has quit [Ping timeout: 245 seconds]
Jado has quit [Ping timeout: 252 seconds]
Jado has joined #ruby
Jado has quit [Ping timeout: 268 seconds]
CRISPR has joined #ruby
eax_ has left #ruby [#ruby]
ih8u has quit [Quit: ih8u]
wbooze has quit [Read error: Connection reset by peer]
Inline has quit [Ping timeout: 268 seconds]
Jado has joined #ruby
dionysus69 has quit [Quit: dionysus69]
Jado has quit [Ping timeout: 244 seconds]
Jado has joined #ruby
Jado has quit [Ping timeout: 268 seconds]
CRISPR has quit [Quit: WeeChat 3.8]
Jado has joined #ruby
Jado has quit [Ping timeout: 265 seconds]
Jado has joined #ruby
xokia_ has quit [Ping timeout: 246 seconds]
patrick has quit [Ping timeout: 248 seconds]
patrick_ is now known as patrick
grenierm has joined #ruby
patrick_ has joined #ruby
patrick has joined #ruby
patrick has quit [Changing host]
patrick_ is now known as patrick
patrick_ has joined #ruby
Jado has quit [Ping timeout: 248 seconds]
konsolebox has joined #ruby
Jado has joined #ruby
dalan03822833508 has quit [Quit: dalan03822833508]
Jado has quit [Ping timeout: 265 seconds]
dalan03822833508 has joined #ruby
rdsm has joined #ruby
xokia_ has joined #ruby
konsolebox has quit [Ping timeout: 260 seconds]
xokia_ has quit [Read error: Connection reset by peer]
fantazo has joined #ruby
Jado has joined #ruby
xokia_ has joined #ruby
xokia_ has quit [Read error: Connection reset by peer]
Jado has quit [Ping timeout: 244 seconds]
Jado has joined #ruby
konsolebox has joined #ruby
nirvdrum7 has quit [Ping timeout: 252 seconds]
Jado has quit [Ping timeout: 248 seconds]
Jado has joined #ruby
rvalue has quit [Read error: Connection reset by peer]
rvalue has joined #ruby
hwpplayer1 has joined #ruby
gemmaro has quit [Read error: Connection reset by peer]
Jado has quit [Ping timeout: 252 seconds]
konsolebox has quit [Ping timeout: 248 seconds]
Jado has joined #ruby
Stenotrophomonas is now known as brokkoli_origin
hwpplayer1 has quit [Remote host closed the connection]
Linux_Kerio has joined #ruby
Tempesta has quit [Quit: See ya!]
gemmaro has joined #ruby
Jado has quit [Ping timeout: 252 seconds]
hwpplayer1 has joined #ruby
gemmaro has quit [Client Quit]
gemmaro_ has joined #ruby
Jado has joined #ruby
konsolebox has joined #ruby
Jado has quit [Ping timeout: 246 seconds]
hwpplayer1 has quit [Remote host closed the connection]
gemmaro_ has quit [Ping timeout: 252 seconds]
Jado has joined #ruby
Tempesta has joined #ruby
hwpplayer1 has joined #ruby
hwpplayer1 has quit [Remote host closed the connection]
hwpplayer1 has joined #ruby
gemmaro has joined #ruby
Jado has quit [Ping timeout: 248 seconds]
hwpplayer1 has quit [Read error: Connection reset by peer]
hwpplayer1 has joined #ruby
<o0x1eef>
havenwood: Right. But on huggingface.co you have a lot of different types of models. AFAIK llama is specialized towards LLMs specifically. On huggingface.co you could use a model at a lower level of abstraction, more like a programmer's API and expose that over HTTP.
<o0x1eef>
For example there's models specifically for text-to-speech, and so on.
<o0x1eef>
Usually you'd want a decent GPU though.
Jado has joined #ruby
hwpplayer1 has quit [Remote host closed the connection]
grenierm has quit [Ping timeout: 240 seconds]
Jado has quit [Ping timeout: 265 seconds]
brokkoli_origin has quit [Ping timeout: 265 seconds]
hwpplayer1 has joined #ruby
brokkoli_origin has joined #ruby
brokkoli_origin has quit [Remote host closed the connection]
brokkoli_origin has joined #ruby
xokia_ has joined #ruby
Jado has joined #ruby
konsolebox has quit [Ping timeout: 276 seconds]
xokia_ has quit [Read error: Connection reset by peer]
Jado has quit [Ping timeout: 245 seconds]
konsolebox has joined #ruby
konsolebox has quit [Ping timeout: 268 seconds]
Jado has joined #ruby
KoUmas has joined #ruby
konsolebox has joined #ruby
Jado has quit [Ping timeout: 244 seconds]
Jado has joined #ruby
Jado has quit [Ping timeout: 246 seconds]
xokia_ has joined #ruby
Jado has joined #ruby
hwpplayer1 has quit [Remote host closed the connection]
xokia_ has quit [Read error: Connection reset by peer]
dviola has joined #ruby
Jado has quit [Ping timeout: 260 seconds]
Jado has joined #ruby
user71 has joined #ruby
Jado has quit [Ping timeout: 252 seconds]
xokia has joined #ruby
Starfoxxes has joined #ruby
Jado has joined #ruby
wbooze has joined #ruby
Inline has joined #ruby
hwpplayer1 has joined #ruby
Jado has quit [Ping timeout: 252 seconds]
hwpplayer1 has quit [Ping timeout: 252 seconds]
hwpplayer1 has joined #ruby
Jado has joined #ruby
rvalue- has joined #ruby
rvalue has quit [Ping timeout: 252 seconds]
Jado has quit [Ping timeout: 246 seconds]
Jado has joined #ruby
rvalue- is now known as rvalue
TomyLobo has joined #ruby
r3m has quit [Quit: WeeChat 4.6.0-dev]
r3m has joined #ruby
Jado has quit [Ping timeout: 268 seconds]
sweeTarts is now known as swee
Jado has joined #ruby
<havenwood>
o0x1eef: As a tool, llama.cpp does support some multimodal models and it ships with a REST API server and the ability to download huggingface models directly as GGUF.
<havenwood>
At least it supports multiple VL models. It lacks image and video generation support generally, AFAIK. An advantage is that it's flexible to run on CPU or GPU and is easy to use. There are other inference runtimes that are better at running across multiple machines or specializing for certain hardware, but llama.cpp is a fine one to quickly get working
<havenwood>
with Ruby via REST.
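The "Ruby via REST" part is easy to sketch with the standard library alone. A minimal, hedged example: it assumes you've started `llama-server` yourself (e.g. `llama-server -m model.gguf`) and that it's listening on its default localhost:8080 with the OpenAI-compatible API; the model name is a placeholder.

```ruby
require "json"
require "net/http"

# Build the JSON body for an OpenAI-style chat completion request.
def chat_request_body(prompt)
  JSON.generate(
    model: "local", # llama-server serves whatever model it was started with
    messages: [{ role: "user", content: prompt }]
  )
end

# Only perform the HTTP call when a server URL is actually provided.
if (base = ENV["LLAMA_SERVER_URL"])
  uri = URI("#{base}/v1/chat/completions")
  res = Net::HTTP.post(uri, chat_request_body("Say hi from Ruby"),
                       "Content-Type" => "application/json")
  puts JSON.parse(res.body).dig("choices", 0, "message", "content")
end
```

Run with `LLAMA_SERVER_URL=http://localhost:8080` to hit a live server.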
<wbooze>
anybody using rvm ?
<havenwood>
Just for exploration I think it's fine. :) I'd use a separate tool for stable diffusion or flux.1 or whatever.
<havenwood>
wbooze: As one of the lingering RVM maintainers who uses chruby, I'd not recommend RVM unless you have a compelling reason.
<havenwood>
wbooze: Running into RVM issues or just considering it?
<wbooze>
how am i supposed to install iruby? i did a `gem install iruby`, but the iruby console complains about not finding bundler
<havenwood>
wbooze: What operating system?
<wbooze>
SUSE
<havenwood>
The modern choices are chruby/ruby-install, rbenv/ruby-build, asdf or mise.
<havenwood>
Both asdf and mise install languages other than Ruby.
<wbooze>
bundle install --gemfile=<path> works, but unless i cd to that dir, the iruby console complains about not being able to find bundler
<havenwood>
The simplest thing that can possibly work beyond installing yourself and setting env vars is chruby/ruby-install.
<wbooze>
afterwards, when i `iruby register` from there, the kernel is unable to find bundler either, and it all doesn't work
<wbooze>
so it all works when i cd to that dir, at least for the console; but even when registering a kernel from there, the kernel does not run correctly
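One possible explanation for the directory-dependence described here (an assumption, not something confirmed in the channel): Bundler locates the Gemfile relative to the current directory unless told otherwise via the BUNDLE_GEMFILE environment variable, which a registered kernel won't inherit from your shell. A sketch, with a made-up path:

```ruby
# Bundler honors BUNDLE_GEMFILE, so a process started outside the project
# directory can still find the right Gemfile. This path is hypothetical;
# substitute your project's actual Gemfile.
ENV["BUNDLE_GEMFILE"] = File.expand_path("my_project/Gemfile", Dir.pwd)

require "bundler"
puts Bundler.default_gemfile # the Gemfile Bundler will now resolve to
# require "bundler/setup"    # would then activate that Gemfile's gems
```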
cappy has joined #ruby
<havenwood>
wbooze: Check `which ruby` and `ruby -v` from both directories. Is it the same?
<wbooze>
i only have 1 ruby
<havenwood>
wbooze: Then check `which bundle` and `gem which bundler`?
<havenwood>
wbooze: Modern Ruby ships with Bundler, so `bundle` being missing makes me suspect you're using an old Ruby. What version of Ruby?
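The checks being suggested can also be done from inside Ruby itself; run the same snippet from both directories and compare the output:

```ruby
require "rbconfig"

# Which interpreter and Bundler does this shell actually resolve to?
puts RUBY_VERSION              # e.g. "3.4.1"
puts RbConfig.ruby             # absolute path of the running ruby binary

require "bundler"
puts Bundler::VERSION          # the Bundler that `require` found
spec = Gem.loaded_specs["bundler"]
puts spec.full_gem_path if spec # where that Bundler lives on disk
```

If the two directories print different paths, something (like RVM) is switching Rubies under you.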
<wbooze>
.rvm/gems/ruby-3.4.1/bin/bundle and bundler
<havenwood>
wbooze: Oh, RVM. If you're `cd`ing into a directory it may be autoswitching your Ruby.
<havenwood>
Sanity check `rvm list default` and `rvm list`. Does the directory you're cding into have a `.ruby-version` or `Gemfile` file? If so, what version of Ruby do they specify?
joako has quit [Quit: quit]
<wbooze>
it only has a Gemfile no .ruby-version
<wbooze>
both rvm list default and rvm list list only 1 thing, because i currently have only 1 ruby
Jado has quit [Ping timeout: 252 seconds]
<havenwood>
wbooze: Does the `Gemfile` have a `ruby` directive? Like: ruby "3.2.7"
<o0x1eef>
havenwood: I still think if I was to develop a product, I'd go with a Python web service that exposed whatever model you may want to use underneath. That seems like a much better *developer* environment. I don't really see llama in the same light.
<havenwood>
o0x1eef: Hmm, I think of "Llama" as the Meta models, like Llama 3.3 70b, and llama.cpp as needing the ".cpp" part to mean the runtime.
<havenwood>
o0x1eef: Yeah, fair that using an HTTP interface to stream JSON isn't what you'd want to do in prod. Handy for prototyping.
Exa has quit [Ping timeout: 244 seconds]
<o0x1eef>
I wouldn't bother with llama.cpp if I was developing a product or piece of software. It's too limiting.
<havenwood>
o0x1eef: For exploring models it's handy as a hand wave, I think. I've tried alternatives like Rust MLX C bindings but I end up spending a lot of time getting one model working where with llama.cpp or an MLX client I can try out a bunch quickly and figure out tooling before I commit.
<havenwood>
o0x1eef: Fair that Python makes it generally easier to get bindings unless you want to go to C ones.
<havenwood>
o0x1eef: Or with MLX they ship Swift ones too.
<havenwood>
I guess it depends on what you're doing.
<havenwood>
o0x1eef: I kinda agree the overhead and complexity is unacceptable beyond prototyping. I still think GGUF and MLX wrappers that just make it easy have use for prototyping.
<o0x1eef>
I mean, if you want to build software, then the best path IMO is to use Python, and interface with models that way. You can continue to train them, you can expose a light web interface, etc. You have complete control. You can deploy at scale. llama.cpp falls down for anything that isn't just a hobby project IMO.
<havenwood>
Kinda meant to be simple and run anywhere more than the best choice for production.
<o0x1eef>
And if you know Ruby, Python is not that big of a jump. You can pick it up quickly.
Exa has joined #ruby
<havenwood>
Yeah, modern Python smooths over many of the issues I used to have with it compared to Ruby.
<havenwood>
You can even `exit` the REPL without it refusing while knowing what you mean!
<havenwood>
o0x1eef: But then if you're doing a GUI, suddenly Python is sus and you probably want something compiled.
<o0x1eef>
As a Rubyist I think it is a nice alternative approach. You can still have your web stack in Ruby / Rails, and then the AI part is just another web service that happens to be implemented in a different language.
<havenwood>
You'd expect the venture-backed tools to use Rust or Zig, but they tend to use Electron with JavaScript and thinly wrap a REST API, deferring the runtime to an upstream tool.
Jado has joined #ruby
<havenwood>
o0x1eef: I think you could argue for llama.cpp for a "runs on one machine" app meant to be portable. As soon as you're running some LLM SaaS it's a terrible choice.
<havenwood>
vLLM or whatever is much better, for an off-the-shelf tool.
<havenwood>
Sometimes the foot-in-the-door tools have a place, even if it's just a stepping stone you leave behind.
<o0x1eef>
I'm more so thinking of a model for text to speech, text to image, or image to video. These are all models that could fit nicely behind a web service, and you could certainly build a product on them, with the main web application in Rails.
joako has joined #ruby
<havenwood>
o0x1eef: Yeah, totally. Makes me think of Open WebUI. Again, they just thinly wrap llama.cpp but. 🤷
<havenwood>
Qwen's chat just uses Open WebUI straight up. I dunno if they changed the backend, but if not it's an example at some scale. https://chat.qwenlm.ai
Jado has quit [Ping timeout: 246 seconds]
<o0x1eef>
It's cool indeed and a great option for self-hosting. I could see how it would be cool to set that up to be used within a LAN.
<o0x1eef>
I think it's a different usecase to what I'm getting at though.
Jado has joined #ruby
Linux_Kerio has quit [Ping timeout: 252 seconds]
<o0x1eef>
I will give you a more concrete example. I am working on a Rails application for taking bookmarks. I want an AI service that will help me with classification of those bookmarks. I don't need an LLM. I just need a model that's good at classification. I can pull one from huggingface.co, expose it over HTTP, and then it is a service my Rails app can use. The web service & interface with the model happens
<o0x1eef>
in Python.
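The Rails side of that could look something like this sketch. Everything here is hypothetical: the service URL, the `/classify` endpoint, and the `{"labels": [...]}` response shape are invented for illustration, not a real API.

```ruby
require "json"
require "net/http"

# Thin client for a hypothetical classification microservice (the Python
# side described above). Endpoint and response shape are assumptions.
class BookmarkClassifier
  def initialize(base_url = ENV.fetch("CLASSIFIER_URL", "http://localhost:8000"))
    @base_url = base_url
  end

  # POST the bookmark's title and URL; return the service's labels.
  def classify(title:, url:)
    uri = URI("#{@base_url}/classify")
    res = Net::HTTP.post(uri, JSON.generate(title: title, url: url),
                         "Content-Type" => "application/json")
    parse_labels(res.body)
  end

  # Split out so response handling is testable without a live service.
  def parse_labels(body)
    JSON.parse(body).fetch("labels", [])
  end
end
```

From the Rails app this is just `BookmarkClassifier.new.classify(title: ..., url: ...)`; no LLM, no third-party API.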
<o0x1eef>
Anyway, I think I've spammed the channel enough :)
<havenwood>
I think it's fair to talk about Rails connecting to LLMs and different strategies. Most GUIs, web or otherwise, are using simple HTTP: SSE or, in some cases, WebSockets.
<havenwood>
But that's I guess what you're talking about. Run your service from Python and figure out how to efficiently do discovery.
<havenwood>
You can use Tailscale, radio, bluetooth, whatever for that part. Low bandwidth. I think that's why folk get by with JSON.
<havenwood>
I agree we can do better.
<havenwood>
It'd be interesting to do a Rails integration using WebSockets. I know Shopify was rewriting GRPC in pure Ruby, but dunno if it's mature enough to use yet and the old C-ext is crufty. Gives me pause.
<o0x1eef>
It takes a lot of juice to run an LLM. At least before DeepSeek. IME it's less of an issue if you're focused solely on text classification. You could self-host the AI service and not rely on any third party. I haven't tried to deploy on Amazon or whatnot, but locally it is within the realm of the possible, and I don't need the internet for my app to run.
<havenwood>
Rails could handle the standard SSE streaming interface fine but it's pretty lousy and most don't implement ID for retries or anything. Pretty simplistic. Does work.
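The SSE wire format under discussion, including the `id:` field most implementations skip, is simple enough to sketch in plain Ruby. With an id, a reconnecting client sends a Last-Event-ID header and the server can resume from there; in Rails this string would be written to the response stream via ActionController::Live.

```ruby
# Format one text/event-stream event. Each field is a "name: value" line;
# a blank line terminates the event.
def sse_event(data, id: nil, event: nil)
  lines = []
  lines << "id: #{id}" if id          # lets clients resume via Last-Event-ID
  lines << "event: #{event}" if event # optional named event type
  lines << "data: #{data}"
  lines.join("\n") + "\n\n"
end

puts sse_event('{"token":"Hello"}', id: 1, event: "message")
```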
<havenwood>
o0x1eef: Even the real DeepSeek takes hundreds of GB of VRAM to run at speed. Just the Qwen and Llama distills can run on reasonable machines. I can barely partially offload the lowest 1.58-bit quant of DeepSeek R1 on a 128GB M4 Max. Funny when a cluster of Apple Silicon is the cheapest way to run a thing. >.>
<havenwood>
Interesting idea to run Rails in the cloud on prem with GPU.
<havenwood>
I've only been exploring locally.
<o0x1eef>
I bought a gamer's PC with a decent GPU so I could test and develop locally. For focused models that do one thing and do it well, such as classification, it is more than capable. LLMs? Not so much.
KoUmas has quit [Ping timeout: 268 seconds]
Jado has joined #ruby
cappy has quit [Quit: Leaving]
<o0x1eef>
The main takeaway is that there are a lot of models on huggingface.co; a lot of them are useful and don't require the resources of an LLM, so you can solve problems using AI and keep everything in-house. No need to talk to OpenAI. No need for the cloud. I feel like I'm rambling so I will take a break.
user71 has quit [Quit: Leaving]
<fantazo>
if people do freelancing for ruby, where are projects to be found?
hwpplayer1 has quit [Remote host closed the connection]
Jado has quit [Ping timeout: 245 seconds]
konsolebox has quit [Ping timeout: 250 seconds]
Starfoxxes has quit [Remote host closed the connection]
nirvdrum7 has joined #ruby
Jado has joined #ruby
<dorian>
anybody encounter a type validation/coercion gem that competes with dry-types?
<dorian>
because i am about to launch dry-types into the sun
gemmaro has quit [Read error: Connection reset by peer]
gemmaro_ has joined #ruby
wbooze has quit [Quit: Leaving]
Inline has quit [Quit: Leaving]
xokia has quit [Quit: Leaving]
xokia has joined #ruby
graywolf has joined #ruby
Inline has joined #ruby
balrog has quit [Quit: Bye]
mange has joined #ruby
balrog has joined #ruby
ruby[bot] has quit [Remote host closed the connection]
ruby[bot] has joined #ruby
TomyLobo has quit [Read error: Connection reset by peer]