papers-please

/papers-please

A channel to share interesting research papers. Papers only please.

Sweat analysis will be in the next iteration of wearable devices.
I wear an Oura and a Whoop, and they’re just scratching the surface of human biomarker monitoring.

https://www.nature.com/articles/nature16521
New day, new paper.

We work closely with Zhuo, who came up with an interesting classification of smart contract bugs that divides them into "machine auditable" and "machine unauditable".

"Machine" in this case doesn't take LLMs into consideration.
https://www.cs.purdue.edu/homes/zhan3299/res/ICSE23.pdf
A new channel to share research papers: /papers-please

Channel is now open to post in.

h/t to @nadav for the idea.
Good recent paper from Apple on LLMs' reasoning capabilities. Their claim is that LLM reasoning still looks more like sophisticated pattern matching, which makes sense intuitively.

We found it to be a pretty robust and intellectually honest evaluation.
https://arxiv.org/pdf/2410.05229
G-Eval is a framework presented by the cognitive research team at Microsoft that uses chain-of-thought (CoT) prompting and a form-filling paradigm for NLG evaluation. Metrics like BLEU and ROUGE have historically had low correlation with human judgements.
https://arxiv.org/pdf/2303.16634
“DisTrO (Distributed Training Over-the-Internet) a family of architecture-agnostic and network-agnostic distributed optimizers that reduces the inter-GPU communication requirements by 1000x to 10,000x without relying on amortized analysis, and matches AdamW+All-Reduce in convergence rates. This enables low-latency training of large neural networks on slow internet bandwidths with heterogeneous networking hardware.”



https://github.com/NousResearch/DisTrO/blob/main/A_Preliminary_Report_on_DisTrO.pdf (the PDF has issues loading on mobile)
I believe this is the paper Karpathy was referring to as the one that set OpenAI off course for years.

https://arxiv.org/abs/1312.5602

https://x.com/karpathy/status/1857980896776990830