GPU support #1142
Replies: 1 comment
Hey @AmirAlasady, sorry for the late reply; things have been hectic with coursework and book updates.

I took a look at your atomic-autograd project, and I have to say it's a fantastic piece of work. I love how clean and intuitive it is. Projects like this often feel simple (simple is a good thing!) and natural once you understand them, but teaching these ideas to others is where the real challenge lies. I've run into the same issue myself: finding the right balance between detailed explanations and keeping things straightforward isn't easy, but your approach strikes a nice balance.

It's great that you're sharing this work, and I completely understand the difficulty of maintaining it alongside everything else. I've found that teaching helps me stay engaged, but I know how tough it can be to juggle everything. You'd be surprised how many hours and sleepless nights I've poured into https://mlsysbook.ai/ and the tinytorch code. So I'm happy to support you however I can.

As for your idea of adding a CuPy GPU-backed implementation alongside NumPy, I think it's a really compelling suggestion. The shared architecture and method signatures between the two libraries make it a natural fit. I'll spend some more time exploring your atomic-autograd implementation to see how we could integrate something similar. Multi-device execution is an exciting area, especially as we think about scaling up to larger and more complex models, so I definitely want to include it.

Thanks again for sharing your work. Keep building, and wishing you all the best!
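To illustrate why the shared method signatures make this a natural fit, here is a minimal sketch of a function written once against the common NumPy/CuPy array API. The function and variable names are illustrative only (not from tinytorch or atomic-autograd), and the snippet assumes CuPy may or may not be installed, falling back to NumPy when it is not:

```python
# Sketch: the same code path runs on CPU or GPU because CuPy mirrors
# NumPy's method signatures. Hypothetical names; CPU fallback if CuPy
# is not installed.
try:
    import cupy as xp  # GPU (CUDA) backend, if available
except ImportError:
    import numpy as xp  # CPU fallback

def dense_forward(x, w, b):
    """A linear layer + ReLU written once against the shared API."""
    return xp.maximum(x @ w + b, 0.0)

# The same call works whether xp is numpy or cupy.
x = xp.ones((2, 3))
w = xp.zeros((3, 4))
b = xp.ones(4)
out = dense_forward(x, w, b)
print(out.shape)  # (2, 4)
```

The design choice here is that the backend is selected once at import time; per-tensor device placement (as atomic-autograd describes) needs a small dispatch layer on top, but the kernel-level code stays identical.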
-
I would like to suggest an improvement to the current NumPy backend: introducing a CuPy-based GPU backend alongside it. Since both libraries share the same architecture and method signatures but execute on different devices, this addition would enable better scaling and significantly improve performance, opening the door to larger models.
This idea also comes from my own experience. I previously built a deep learning framework called atomic-autograd, published as an open-source project and an installable Python library available via pip. It is based on the same autograd concept I mentioned earlier and supports multi-device execution, CPU via NumPy and GPU (CUDA) via CuPy, with seamless dynamic device switching and consistent data types.
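The dynamic device switching described above could be sketched roughly as follows. This is a hypothetical illustration of the pattern, not code from atomic-autograd; the `Tensor` class, `to()` method, and device strings are assumptions, and the snippet degrades gracefully to CPU-only when CuPy is absent:

```python
import numpy as np

# Hypothetical per-tensor device dispatch, loosely modeled on the idea
# described above. CuPy is optional; without it, only "cpu" works.
try:
    import cupy as cp
    _HAS_CUPY = True
except ImportError:
    cp = None
    _HAS_CUPY = False

class Tensor:
    def __init__(self, data, device="cpu"):
        if device == "cuda" and not _HAS_CUPY:
            raise RuntimeError("CuPy not installed; 'cuda' unavailable")
        self.device = device
        xp = cp if device == "cuda" else np
        # Consistent dtype across devices, as the suggestion calls for.
        self.data = xp.asarray(data, dtype=np.float32)

    def to(self, device):
        """Move the underlying array between NumPy (CPU) and CuPy (GPU)."""
        if device == self.device:
            return self
        if device == "cuda":
            return Tensor(self.data, device="cuda")  # host -> device copy
        return Tensor(self.data.get(), device="cpu")  # .get() copies to host

    def __add__(self, other):
        assert self.device == other.device, "tensors must share a device"
        return Tensor(self.data + other.data, device=self.device)

a = Tensor([1.0, 2.0])
b = Tensor([3.0, 4.0])
c = a + b
print(c.data.tolist())  # [4.0, 6.0]
```

Because NumPy and CuPy agree on operator semantics, `__add__` needs no per-backend branching; only construction and host/device transfer are device-aware.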
You can find the library implementation here, if you are interested:
https://github.com/AmirAlasady/atomic-autograd.git
Please note that I have not maintained this repository for a long time, as I became busy with work and did not have time to keep an eye on it. That honestly makes me a bit sad, especially because I live in Iraq, where it is rare for work like this to be noticed, recognized for its value, or actually used.
I would truly love for it to receive the recognition it deserves from someone like you, sir, and from this community. It would be my pleasure.