
It seems like one needs a big machine farm and a vast corpus of training data with a lot of manual curation to get started creating a competitive LLM, plus whatever technical expertise that I don't even know about. The stuff that makes LLMs exist now and not earlier.

It might be possible to organize all that with volunteers and some paid work, but how in practice? Stallman seems kind of out of the game at this point and there is no Linus Torvalds figure for this either, as of now.



> It seems like one needs a big machine farm and a vast corpus of training data with a lot of manual curation to get started creating a competitive LLM, plus whatever technical expertise that I don't even know about. The stuff that makes LLMs exist now and not earlier.

"big machine farm" reminds me of Folding@home, which needed the same and got it.

"manual curation" is what Wikipedia did, as well as the free software community.

"technical expertise" is present in the free software world too. It is sparse since it is sparse in the world as a whole, but it exists.

"no Linus Torvalds figure" might be the main problem ATM.


I also thought of these after writing my comment. The main problems that I see with these solutions are:

- Training seems to need a lot of data available at the same time, which is difficult to handle on commodity hardware.

- Manual curation can be a mind-numbing task; it might need to be gamified somehow.

There is a chance that the curation could be higher quality than the current corporate stuff. Pretty sure that it's not an intrinsic property of LLMs to write like TED talks.


> there is no Linus Torvalds figure for this either, as of now

Well yes there is. It's Karpathy.



