AI


๐—ฎ๐˜๐˜๐—ฒ๐—ป๐˜๐—ถ๐—ผ๐—ป ๐—ถ๐˜€ ๐—ฎ๐—น๐—น ๐˜†๐—ผ๐˜‚ ๐—ป๐—ฒ๐—ฒ๐—ฑ

guys, are we cooked?
/AI
Finally got Ollama running on a Hetzner VPS; now with super powers!

https://www.reddit.com/r/LocalLLaMA/s/vtLGkmBO8j
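
Once Ollama is up on the VPS, you can hit its HTTP API remotely. A minimal Python sketch, assuming the default port 11434 is reachable (e.g. through an SSH tunnel or a firewall rule); the hostname and model name are placeholders:

```python
import requests  # any HTTP client works; Ollama speaks plain JSON over HTTP

OLLAMA_URL = "http://my-hetzner-vps:11434"  # placeholder host; 11434 is Ollama's default port

resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={"model": "llama3.1", "prompt": "Why self-host an LLM?", "stream": False},
    timeout=120,
)
print(resp.json()["response"])  # the full completion when stream=False
```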
/AI
TIL about CREDO23, a sort of "Certified Organic" label for movies etc. that asserts nothing in the movie was AI-generated. It's silly in a lot of ways, but I kind of love the effort.
https://credo23.com/
https://www.youtube.com/watch?v=oZ6FUfU579k
/AI
In-browser Postgres with an AI interface

"Each database is paired with a large language model (LLM) which opens the door to some interesting use cases:

- Drag-and-drop CSV import (generate table on the fly)
- Generate and export reports
- Generate charts
- Build database diagrams"

https://database.build/

https://youtu.be/ooWaPVvljlU?si=hMBs-y9l6DVqGvuK
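
The CSV-to-table idea is easy to sketch outside the browser too. A rough Python sketch, not database.build's implementation: feed the CSV header plus a few rows to an LLM and ask for a CREATE TABLE statement (the OpenAI SDK and model name are my assumptions here):

```python
from itertools import islice
from openai import OpenAI  # assumes the OpenAI Python SDK and OPENAI_API_KEY in the environment

client = OpenAI()

def csv_to_create_table(path: str, table: str) -> str:
    """Ask an LLM to draft a PostgreSQL CREATE TABLE statement from a CSV sample."""
    with open(path, newline="") as f:
        sample = "".join(islice(f, 5))  # header plus a few data rows
    prompt = (
        f"Write a PostgreSQL CREATE TABLE statement named {table} with sensible "
        f"column types for this CSV sample:\n{sample}\nReturn only SQL."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# print(csv_to_create_table("sales.csv", "sales"))
```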
/AI
Training LoRAs for FLUX.1 [dev]

I love open source, and non-profit projects based on it.

Tost.AI is a site where you can try out new models and pipelines like Live Portrait.

You can train your own LoRA and create cool stuff. With just 6 photos, you can consistently generate the same object. Details of the object are captured so accurately that even text remains intact.

E.g. you can take photos in a white box and create a product photo with a model (or on a model), or an image in some unique location, etc.

Worth noting that, even though the weights are under a non-commercial license, you fully own the images generated by Flux.
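
For running a trained LoRA locally (rather than on Tost.AI), here is a minimal sketch with the diffusers FluxPipeline; the LoRA filename and trigger word are placeholders, and the FLUX.1 [dev] weights are gated on Hugging Face, so accept the license first:

```python
import torch
from diffusers import FluxPipeline  # needs a recent diffusers release with Flux support

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("my_product_lora.safetensors")  # placeholder: your trained LoRA file

image = pipe(
    "product photo of sks bottle on a marble table, studio lighting",  # 'sks' = placeholder trigger word
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("product.png")
```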
/AI
Try the example app in the blog; it's near real-time image generation ??? !!!
/AI
This paper:

• Evaluates o1 models on planning tasks, focusing on feasibility, optimality, and generalizability

• Uses benchmark tasks like Barman, Blocksworld, Floortile, Grippers, Tyreworld, and Termes

• Analyzes performance in constraint-heavy and spatially complex environments

• Compares o1-preview, o1-mini, and GPT-4 across different planning scenarios

https://x.com/rohanpaul_ai/status/1844950409003049047?s=46

https://cdn.openai.com/improving-mathematical-reasoning-with-process-supervision/Lets_Verify_Step_by_Step.pdf
/AI
Rings very true to my experience. The issue with LLMs is that once you exit the common, well-trodden path, you quickly realize you are back to solving problems yourself. I like to think of LLMs more as a big, conversational search engine. Synthetic data and boundary-pushing are still difficult, if not impossible, with current tech. That said, this will eventually get solved, but we are much further from it than we think.

https://x.com/VictorTaelin/status/1844969648904663126
/AI
Bold prediction: a humanoid robot product will overtake the iPhone as the best-selling technology product
/AI
NotebookLM experiment: generate synthetic patient; survey synthetic patient [T2D screening]; import survey responses to generate podcast; listen to analysis
/AI
Feels like there would be an opportunity for using Farcaster and crypto payments to make this a reality.
https://x.com/VictorTaelin/status/1844172564190621849
/AI
New open-source text-to-video and image-to-video generation model

- Released under the MIT license
- Generates high-quality 10-second videos at 768p resolution, 24 FPS
- Uses pyramid flow matching for efficient autoregressive video generation

https://huggingface.co/rain1011/pyramid-flow-sd3
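
If you want the weights locally, a minimal sketch with huggingface_hub; the repo's own inference scripts (see its model card) then load the checkpoint:

```python
from huggingface_hub import snapshot_download

# Download the Pyramid Flow checkpoint; local_dir controls where it lands.
path = snapshot_download(repo_id="rain1011/pyramid-flow-sd3", local_dir="pyramid-flow-sd3")
print(path)
```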
/AI
Prefer humans to machines; _and_ this protocol has enough weirdos to be the only place like it on the 'net; but let the bots play with each other while we reward ourselves
/AI
AI has taken over the Nobel Prize.
- Physics ✔️
- Chemistry ✔️
/AI
A Visual Guide to Mixture of Experts (MoE)

> recommended reading

A Mixture of Experts (MoE) is a neural network architecture that consists of multiple "expert" models and a router mechanism to direct inputs to the most suitable expert. This approach allows the model to handle specific aspects of a task more efficiently. The router decides which expert processes each input, enabling a model to use only a subset of its total parameters for any given task.

The success of MoE lies in its ability to scale models with more parameters while reducing computational costs during inference. By selectively activating only a few experts for each input, MoE optimizes performance without overloading memory or compute resources. This flexibility has made MoE effective in various domains, including both language and vision tasks.

https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mixture-of-experts
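
To make the router-plus-experts idea concrete, here is a toy PyTorch sketch of a top-k routed MoE layer; it illustrates the mechanism only, not the guide's code or any production implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    """Toy Mixture-of-Experts layer: a router picks the top-k experts per token."""
    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # one routing logit per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                               # x: (tokens, d_model)
        logits = self.router(x)                         # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # keep only the k best experts per token
        weights = F.softmax(weights, dim=-1)            # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(16, 64)                                 # 16 tokens, d_model = 64
print(ToyMoE(64, 256)(x).shape)                         # torch.Size([16, 64])
```

Only `top_k` of the experts actually run for each token, which is where the inference-cost savings described above come from.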
/AI
Podcastfy is an open-source Python tool that converts web content, PDFs, and text into multi-lingual audio conversations using GenAI, focusing on customizable and programmatic audio generation.

https://github.com/souzatharsis/podcastfy-demo
/AI
anyone have a go-to tool for converting text to diagrams? kind of like Excalidraw's text-to-diagram using Mermaid, but with bigger context windows, memory, and prettier diagrams. specifically something like a UML diagram
/AI
The Llama 3.1-Nemotron-70B-Reward model is trained on human feedback for RLHF. It leads the RewardBench leaderboard with a 94.1% score and shows strong results in Safety (95.1%) and Reasoning (98.1%), effectively rejecting unsafe responses and handling complex tasks. Despite being much smaller than the Nemotron-4 340B Reward model, it offers high efficiency and accuracy, and it is trained on CC-BY-4.0-licensed HelpSteer2 data, making it suitable for enterprise use. The model combines regression-style and Bradley-Terry reward modeling, using meticulously curated HelpSteer2 data to maximize performance.

https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Reward
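
For the Bradley-Terry part, the standard pairwise preference loss looks like this; a toy sketch of the concept, not NVIDIA's training code:

```python
import torch
import torch.nn.functional as F

def bradley_terry_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    """Pairwise Bradley-Terry loss: push the chosen response's scalar reward above the rejected one's."""
    # P(chosen > rejected) = sigmoid(r_chosen - r_rejected); minimize the negative log-likelihood.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# toy scalar rewards a reward model might emit for a batch of (chosen, rejected) pairs
r_chosen = torch.tensor([1.2, 0.4, 2.0])
r_rejected = torch.tensor([0.3, 0.9, -0.5])
print(bradley_terry_loss(r_chosen, r_rejected))  # loss shrinks as the reward margin grows
```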
/AI