juanfra has quit [Quit: Bridge terminating on SIGTERM]
sixecho has quit [Quit: Bridge terminating on SIGTERM]
jhass[m] has quit [Quit: Bridge terminating on SIGTERM]
mewfree[m] has quit [Quit: Bridge terminating on SIGTERM]
Bi[m] has quit [Quit: Bridge terminating on SIGTERM]
MeowcatWoofWoofF has quit [Quit: Bridge terminating on SIGTERM]
alex[m]1234567 has quit [Quit: Bridge terminating on SIGTERM]
hsiktas[m] has quit [Quit: Bridge terminating on SIGTERM]
work has joined #ruby
_ht has joined #ruby
jhass[m] has joined #ruby
TzilTzal has joined #ruby
<adam12>
jhass[m]: I generate and store all documentation, whereas rubydoc.info generates on demand and caches some.
sixecho has joined #ruby
hsiktas[m] has joined #ruby
mewfree[m] has joined #ruby
alex[m]123 has joined #ruby
Bi[m] has joined #ruby
MeowcatWoofWoofF has joined #ruby
juanfra has joined #ruby
<jhass>
is the plan to do it for all gems?
<adam12>
jhass[m]: Both, obviously, with their pros and cons, especially since some gems generate almost a gigabyte of docs. But one of my strategies is heavy compression for storage. And I don't use "the cloud" in the traditional sense; rather, I have my own platform and my own servers, so it's "reasonably" cheap for me to just rack a server with a bunch of SSDs.
<adam12>
I'd like to do it for all gems, perhaps through the RubyGems webhook that I was using originally. Before I started gemdocs.org, I ran an experiment to build docs for _every_ gem and record sizes, to see if it was feasible.
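A receiver for that webhook doesn't have to be much. A minimal config.ru-style sketch, assuming the rubygems.org push payload carries "name" and "version" fields, with a hypothetical BuildQueue standing in for the real build step (verifying the Authorization digest header is omitted here):

```ruby
# config.ru -- minimal sketch of a RubyGems push-webhook receiver.
require "json"
require "rack"

app = lambda do |env|
  req = Rack::Request.new(env)
  next [405, { "content-type" => "text/plain" }, ["POST only\n"]] unless req.post?

  payload = JSON.parse(req.body.read)
  name, version = payload.values_at("name", "version")

  # Enqueue instead of building inline so the webhook responds quickly.
  BuildQueue.push(name: name, version: version) # BuildQueue is hypothetical

  [200, { "content-type" => "text/plain" }, ["queued #{name}-#{version}\n"]]
end

run app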
<adam12>
But the current incarnation of the site is super simple, so I haven't added the webhook yet. I probably need to actually add a database (it uses the filesystem currently), and perhaps some sort of locking mechanism for parallel doc builds, and maybe a few other bits.
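For the locking, a per-gem flock is probably the simplest thing that works. A sketch, with LOCK_DIR and the one-lock-per-gem layout as assumptions rather than gemdocs.org's actual setup:

```ruby
# Per-gem build locking with flock; skips the build if another worker holds it.
require "fileutils"

LOCK_DIR = "/var/lib/gemdocs/locks" # assumed path

def with_build_lock(gem_name)
  FileUtils.mkdir_p(LOCK_DIR)
  File.open(File.join(LOCK_DIR, "#{gem_name}.lock"), File::CREAT | File::RDWR) do |f|
    if f.flock(File::LOCK_EX | File::LOCK_NB) # non-blocking: false if held
      yield
    else
      warn "#{gem_name} is already being built; skipping"
    end
  end
end

with_build_lock("azure_mgmt_network") do
  system("yard", "doc", exception: true) # build docs while holding the lock
end
```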
<jhass>
that's cool. I'd be curious about the ballpark numbers if you still have them
<adam12>
But I aggressively add gems as I see them.
<adam12>
I regret not timestamping each record (and not using sqlite up front), so it's hard to gauge the time span.
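The sqlite bookkeeping adam12 wishes he'd had could be as small as one timestamped row per build. A sketch using the sqlite3 gem; the table layout and the sample values are made up:

```ruby
# One row per doc build, with a timestamp so the time span can be gauged later.
require "sqlite3"

db = SQLite3::Database.new("builds.db")
db.execute <<~SQL
  CREATE TABLE IF NOT EXISTS builds (
    gem_name  TEXT NOT NULL,
    version   TEXT NOT NULL,
    doc_bytes INTEGER,
    built_at  TEXT DEFAULT CURRENT_TIMESTAMP
  )
SQL

# Hypothetical version number; doc_bytes here is the ~23MB figure from the chat.
db.execute(
  "INSERT INTO builds (gem_name, version, doc_bytes) VALUES (?, ?, ?)",
  ["oci", "2.18.0", 23 * 1024 * 1024]
)
```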
<adam12>
The largest gem for docs is `oci`; compressed, it's like 23MB. And the longest-building gem is `rbt`, which takes over an hour to run `yard` on.
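For context, a single doc build is roughly fetch, unpack, yard. A sketch with illustrative paths and flags, not gemdocs.org's actual pipeline:

```ruby
# Fetch a .gem, unpack it, and run yard over the source tree.
def build_docs(gem_name, version, out_root: "docs")
  dir = "#{gem_name}-#{version}"
  system("gem", "fetch", gem_name, "--version", version, exception: true)
  system("gem", "unpack", "#{dir}.gem", exception: true)
  Dir.chdir(dir) do
    system("yard", "doc", "--output-dir", File.join("..", out_root, dir),
           exception: true)
  end
end

build_docs("rbt", "0.5.0") # hypothetical version; this is the hour-long one
```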
<adam12>
The azure- gems are my biggest culprit right now tho. I shared some stats a few weeks back, where during the build it hit almost 1 GB of docs. I compress it down to ~60 MB tho, which I guess is better than nothing. Indicative of why Loren won't store _everything_ on rubydoc.info.
<jhass>
so about 5G when compressed?
<adam12>
Which?
<jhass>
all in all
<adam12>
Ahh! That build log is the experimental one, so it doesn't reflect gemdocs.org, just what I was building via webhooks.
<adam12>
But actually right now, it's even less than that on gemdocs.org. Only about 210MB compressed.
<jhass>
yeah I mean that's more than manageable. I think I'd look into a fast decompression algo, like zstd, and a compressing filesystem that supports it
<adam12>
Well right now I have a huge hack, where I always gzip it, and always serve the gzip version (sorry ancient IE users :P)
<adam12>
but it was my _plan_ to use btrfs or ZFS, eventually.
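That hack amounts to serving the stored .gz bytes directly with Content-Encoding: gzip, whether or not the client advertised support (hence "sorry ancient IE users"). A sketch, with DOC_ROOT assumed:

```ruby
# config.ru -- serve only pre-gzipped files, always with Content-Encoding: gzip.
require "rack"

DOC_ROOT = "/srv/gemdocs" # assumed path

app = lambda do |env|
  path = Rack::Utils.clean_path_info(env["PATH_INFO"])
  file = File.join(DOC_ROOT, "#{path}.gz")
  next [404, { "content-type" => "text/plain" }, ["not found\n"]] unless File.file?(file)

  headers = {
    "content-type"     => Rack::Mime.mime_type(File.extname(path), "text/html"),
    "content-encoding" => "gzip", # body is the stored .gz, sent as-is
    "content-length"   => File.size(file).to_s,
  }
  [200, headers, File.open(file, "rb")]
end

run app
```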
TzilTzal has quit [Remote host closed the connection]
TomyWork has quit [Remote host closed the connection]
<jidar>
zfs would support you throwing those docs uncompressed on a dataset with compression turned on, zstd/etc
<jidar>
oh, interesting you can't quite do the same on btrfs
<jidar>
> Can I set compression per-subvolume? no
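On ZFS that really is just dataset properties (shown here shelled out from Ruby; the pool name `tank` is assumed):

```ruby
# Create a zstd-compressed dataset, then check how well the docs compress.
system("zfs", "create", "-o", "compression=zstd", "tank/gemdocs", exception: true)
system("zfs", "get", "compressratio", "tank/gemdocs", exception: true)
```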
jpw has quit [Remote host closed the connection]
_ht has quit [Remote host closed the connection]
<nakilon>
adam12 have you tried to see what exactly consumes so much space? is it really just API method docs?
<nakilon>
I had a bash function somewhere on another machine that prints a summary by file type at least
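nakilon's bash function isn't shown here, but the same per-file-type summary is straightforward in Ruby. A sketch:

```ruby
# Total bytes per file extension under a directory, largest first.
require "find"

def size_by_extension(root)
  totals = Hash.new(0)
  Find.find(root) do |path|
    next unless File.file?(path)
    ext = File.extname(path)
    ext = "(none)" if ext.empty?
    totals[ext] += File.size(path)
  end
  totals.sort_by { |_, bytes| -bytes }
end

size_by_extension("azure_mgmt_network-0.26.1").first(10).each do |ext, bytes|
  printf "%-10s %8.1f MB\n", ext, bytes / (1024.0 * 1024)
end
```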
<weaksauce>
a gig seems like a lot
roshanavand has quit [Ping timeout: 265 seconds]
roshanavand has joined #ruby
dviola has joined #ruby
donofrio has quit [Remote host closed the connection]
roadie has quit [Ping timeout: 268 seconds]
<adam12>
nakilon: I never bothered to dig in, but azure_mgmt_network is _huge_ on its own, without docs. So my guess is that it's just huge**2 or something.
<adam12>
jidar: Yeah. My current implementation is just "one less piece", since I deploy from cloud images (like Debian) and they only come with a default filesystem. I'd have to auto-partition, mount another disk, or make a file on the existing filesystem, mount it loopback, and then format. But eventually I'll probably try to move towards that.
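The loopback option adam12 describes is only a few steps. A sketch with illustrative sizes and paths (all of this needs root):

```ruby
# File-backed btrfs volume mounted with zstd compression on a stock cloud image.
require "fileutils"

system("truncate", "-s", "50G", "/var/lib/gemdocs.img", exception: true)
system("mkfs.btrfs", "/var/lib/gemdocs.img", exception: true)
FileUtils.mkdir_p("/srv/gemdocs")
system("mount", "-o", "loop,compress=zstd", "/var/lib/gemdocs.img", "/srv/gemdocs",
       exception: true)
```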
<adam12>
adam@Adams-MacBook-Air tmp % du -sh azure_mgmt_network-0.26.1
<adam12>
140M    azure_mgmt_network-0.26.1
<adam12>
140MB of Ruby code...
<adam12>
They are deprecating these SDKs, so it might not matter as much in the future. But they definitely could have done with someone experienced in Ruby to build them.
roadie has joined #ruby
TzilTzal has quit []
CrazyEddy has joined #ruby
roshanavand has quit [Ping timeout: 265 seconds]
roshanavand has joined #ruby
lunarkitty has joined #ruby
ur5us has quit [Ping timeout: 245 seconds]
CrazyEddy has quit [Ping timeout: 245 seconds]
<nakilon>
140MB of Ruby code? how? maybe they've bundled other libs in?
CrazyEddy has joined #ruby
<weaksauce>
haha
<weaksauce>
i just looked at it and they autogenerated a bunch of code
<weaksauce>
# Code generated by Microsoft (R) AutoRest Code Generator.
<weaksauce>
AND instead of using git the way git is meant to be used, they created new folders for each release