> people don't need to work out which version of cuda/which gpu settings/librari...

aragilar · on Feb 19, 2025

That's a PyTorch issue. The solution is, as always, to build from source. You will understand how the system is assembled, and then you can build a minimal version that meets your specific needs (which, given that wheels are a well-defined format, you can then store on a server for reuse).

Cross-OS (especially with a VM like Java or JS) is relatively easy compared to needing specific versions for every single sub-architecture of a CPU and GPU system (and that's ignoring all the other bespoke hardware that's out there).
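To make the "specific versions for every sub-architecture" point concrete, here is a minimal sketch that prints the details a binary wheel is normally tied to. The function name `describe_build_target` is made up for illustration. The standard wheel tags cover interpreter, ABI, and platform, and have no field for a CUDA version or GPU model, which is roughly where the pain starts.

```python
import platform
import sys
import sysconfig


def describe_build_target():
    """Collect the interpreter and platform details a binary wheel is tied to.

    Hypothetical helper for illustration; not part of pip or PyTorch.
    """
    return {
        # Interpreter version, e.g. "3.12"
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        # Interpreter implementation, e.g. "cpython" or "pypy"
        "implementation": sys.implementation.name,
        # OS plus architecture string as reported by sysconfig
        "platform": sysconfig.get_platform(),
        # Raw CPU architecture as reported by the OS
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    for key, value in describe_build_target().items():
        print(f"{key}: {value}")
```

Anything beyond these fields (driver version, exact GPU, vendor libraries) has to be tracked outside the tags, for example by keeping separately built wheels on your own server.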

physicsguy · on Feb 19, 2025

Cross-platform Java doesn't have the issue because the JVM handles all of that for you. But if you want native extensions written in C, you get back to the same problem pretty quickly.

sieve · on Feb 19, 2025

> if you want native extensions written in C

The SQLite project I linked to is a JDBC driver that makes use of the C version of the library appropriate to each OS. LWJGL (https://repo1.maven.org/maven2/org/lwjgl/lwjgl/3.3.6/) is another project that relies heavily on native code. But distributing these, or using them as dependencies, does not result in hair-pulling like it does with Python.
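The general pattern can be sketched as: bundle one native library per OS/architecture pair and pick the right one at load time. This is a rough illustration in Python, not the actual code or layout of the SQLite JDBC driver or LWJGL; the `native/` directory, the file names, and `native_library_path` are all hypothetical.

```python
import platform
from pathlib import PurePosixPath

# Hypothetical file names; real projects choose their own layout.
_LIB_NAMES = {
    "linux": "libexample.so",
    "darwin": "libexample.dylib",
    "windows": "example.dll",
}

# Common aliases that different OSes report for the same architecture.
_ARCH_ALIASES = {"amd64": "x86_64", "arm64": "aarch64"}


def native_library_path(system=None, machine=None):
    """Return the bundled library path for an OS/architecture pair.

    Defaults to the machine the code is running on.
    """
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    machine = _ARCH_ALIASES.get(machine, machine)
    try:
        lib_name = _LIB_NAMES[system]
    except KeyError:
        raise RuntimeError(f"no bundled library for {system}/{machine}") from None
    return str(PurePosixPath("native") / system / machine / lib_name)
```

For example, `native_library_path("Windows", "AMD64")` resolves to `native/windows/x86_64/example.dll`. This works when the set of targets is a handful of OS/architecture pairs; it does not scale to a matrix that also includes driver and GPU variants.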

aragilar · on Feb 19, 2025

There's native code like SQLite which, assuming a sensible file system and C compiler, is quite portable, and then there's native code which cares about exact compiler versions, driver versions, and the exact model of your CPU, GPU and NIC. My suggestion is to go look at how to program a GPU using naive Vulkan/Metal, and then look for the dark magic that is used to make GPUs run fast. It's the latter you're encountering with the ML Python projects.