Hylke
@hylkedonker #737615
Dutch machine learning enthusiast 🤖 with a love for programming.
10 Followers 63 Following
Looks like coding agents are already doing a pretty good job implementing scientific papers:
https://arxiv.org/abs/2504.01848
For those interested in the intersection of AI and statistics, I have written a blog post on how to build Bayesian attention:
https://medium.com/data-science-collective/exploiting-the-structured-state-space-duality-to-build-bayesian-attention-3883ab8bacd4
Looking at the nightly changelogs, the release of Mojo 24.6, which is supposed to ship with GPU support, is coming any day now.
Recurrent neural networks are transformers, are state space models, are convolutions?
Looks like we've come full circle, back to 2012, when deep learning made its first splash.
https://arxiv.org/abs/2405.21060
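The core of that duality can be seen in a few lines: causal linear attention, computed as one masked matrix multiply, gives exactly the same outputs as a recurrence over a key-value state. A minimal NumPy sketch of the idea (toy sizes, not the paper's actual algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4  # sequence length, feature dimension (toy values)
Q = rng.normal(size=(T, d))
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))

# "Transformer" view: causal linear attention as a masked matmul, O(T^2).
scores = Q @ K.T
mask = np.tril(np.ones((T, T)))       # causal mask: position t sees s <= t
out_attn = (scores * mask) @ V

# "RNN / state space" view: the same map as a recurrence over a d x d state.
state = np.zeros((d, d))
out_rnn = np.zeros((T, d))
for t in range(T):
    state += np.outer(K[t], V[t])     # accumulate key-value outer products
    out_rnn[t] = Q[t] @ state         # read out with the current query

assert np.allclose(out_attn, out_rnn)
```

Same function, two computation orders: the quadratic form parallelizes well on GPUs, the recurrent form streams in linear time.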
State space models can be used as drop-in replacements for attention, but with more favourable sequence length scaling. This video may well be the most lucid intro to state space models I've come across:
https://youtu.be/QJHA-PY8zDc?si=J5kGW87Yg0SAFdpR
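The favourable scaling comes from the recurrence itself: a diagonal SSM processes a length-T sequence in O(T·N) for N states, versus the O(T²) pairwise scores of full attention. A minimal single-channel sketch (toy parameters of my own choosing, not any particular published model):

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Discrete diagonal SSM: h_t = A * h_{t-1} + B * u_t, y_t = C . h_t.

    A, B, C are length-N vectors (diagonal transition), u is a length-T
    input sequence. Cost is linear in T, unlike attention's quadratic cost.
    """
    N = len(A)
    h = np.zeros(N)
    y = np.empty(len(u))
    for t in range(len(u)):
        h = A * h + B * u[t]   # elementwise update: diagonal transition
        y[t] = C @ h           # linear readout of the hidden state
    return y

# Hypothetical toy parameters, just to exercise the scan.
A = np.array([0.9, 0.5])       # stable decay rates (|A| < 1)
B = np.array([1.0, 1.0])
C = np.array([0.3, 0.7])
y = ssm_scan(A, B, C, np.ones(5))
```

In practice models like Mamba make A, B, C input-dependent and run the scan in parallel on GPU, but the linear-in-T structure is the same.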
Debugging neural nets is always a pain, but maybe penzai can bring some relief?
https://github.com/google-deepmind/penzai