AI
/ai17280
Attention is all you need
guys, are we cooked?
Finally got Ollama running on a Hetzner VPS; now with super powers!
https://www.reddit.com/r/LocalLLaMA/s/vtLGkmBO8j
TIL about CREDO23, a sort of "Certified Organic" label for movies etc that asserts that nothing in the movie was AI generated. It's silly in a lot of ways, but I kind of love the effort.
https://credo23.com/
https://www.youtube.com/watch?v=oZ6FUfU579k
In-browser Postgres with an AI interface
"Each database is paired with a large language model (LLM) which opens the door to some interesting use cases:
✅ Drag-and-drop CSV import (generate table on the fly)
✅ Generate and export reports
✅ Generate charts
✅ Build database diagrams"
https://database.build/
https://youtu.be/ooWaPVvljlU?si=hMBs-y9l6DVqGvuK
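The CSV-import use case above boils down to inferring a schema from the file and generating a CREATE TABLE on the fly. A minimal pure-Python sketch of that idea (not database.build's actual implementation — their LLM does the inference; the type rules below are my own toy assumptions):

```python
import csv
import io

def infer_type(values):
    """Guess a Postgres column type from sample string values."""
    try:
        for v in values:
            int(v)
        return "integer"
    except ValueError:
        pass
    try:
        for v in values:
            float(v)
        return "double precision"
    except ValueError:
        return "text"

def create_table_from_csv(name, csv_text):
    """Emit a CREATE TABLE statement from a CSV header plus sample rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    cols = []
    for i, col in enumerate(header):
        col_type = infer_type([r[i] for r in samples])
        cols.append(f"{col} {col_type}")
    return f"CREATE TABLE {name} ({', '.join(cols)});"

sql = create_table_from_csv("sales", "id,amount,note\n1,9.5,first\n2,10,second")
print(sql)  # -> CREATE TABLE sales (id integer, amount double precision, note text);
```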
Training LoRAs for Flux 1.0 [dev]
I love open-source - and non-profit projects based on open-source.
Tost.AI is a site where you can try out new models and pipelines like Live Portrait.
You can train your own LoRA and create cool stuff. With just 6 photos, you can consistently generate the same object. Details of the object are captured so accurately that even text remains intact.
E.g. you can take photos in a white box and create a product photo with a model (or on a model), or an image in some unique location, etc.
Worth noting that, even though the weights are under a non-profit license, you fully own the images generated by Flux.
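For context on what a LoRA actually is: it freezes the base weight W and learns a low-rank update BA, so the adapted layer computes y = Wx + (α/r)·B(Ax). A toy pure-Python sketch of that forward pass (illustrative only — real training uses PyTorch/PEFT; the matrices here are made up):

```python
def matvec(M, x):
    """Plain matrix-vector product over nested lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha=1.0):
    """y = W x + (alpha / r) * B (A x): frozen weight plus a rank-r update."""
    r = len(A)  # rank = number of rows of A
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

# 2x2 frozen weight, rank-1 adapter (toy values)
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]        # r x d_in  = 1x2
B = [[0.5], [0.5]]      # d_out x r = 2x1
print(lora_forward(W, A, B, [2.0, 3.0]))  # -> [4.5, 5.5]
```

Only A and B are trained, which is why a handful of photos is enough to specialize a 12B-parameter model.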
INTELLECT-1: the first decentralized training of a 10B model
🔗: https://app.primeintellect.ai/intelligence?_gl=1*1fxs3kt*_gcl_au*MTI5NzY5NDM2MC4xNzI4ODQ4NDQ2
don't sleep.
Try the example app in the blog; it's near-real-time image generation ??? !!!
Solution in this Paper:
• Evaluates o1 models on planning tasks focusing on feasibility, optimality, and generalizability
• Uses benchmark tasks like Barman, Blocksworld, Floortile, Grippers, Tyreworld, and Termes
• Analyzes performance in constraint-heavy and spatially complex environments
• Compares o1-preview, o1-mini, and GPT-4 across different planning scenarios
https://x.com/rohanpaul_ai/status/1844950409003049047?s=46
https://cdn.openai.com/improving-mathematical-reasoning-with-process-supervision/Lets_Verify_Step_by_Step.pdf
Rings very true to my experience. The issue with LLMs is that once you exit the common path, you quickly realize you are back to solving problems yourself. I like to think of LLMs more as a big, conversational search engine. Synthetic data and boundary-pushing are still difficult, if not impossible, with current tech. That said, this will eventually get solved, but we are much further from it than we think.
https://x.com/VictorTaelin/status/1844969648904663126
Bold prediction: a humanoid robot product will overtake the iPhone as the best selling technology product
NotebookLM experiment: generate synthetic patient; survey synthetic patient [T2D screening]; import survey responses to generate podcast; listen to analysis
Feels like there would be an opportunity for using farcaster and cryptopayments to make this reality.
https://x.com/VictorTaelin/status/1844172564190621849
New open-source text and image video generation model
- released under MIT license
- generates high-quality 10-second videos at 768p resolution, 24 FPS
- uses pyramid flow matching for efficient autoregressive video generation
https://huggingface.co/rain1011/pyramid-flow-sd3
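Flow matching, which the model builds on, trains a network to predict the velocity along a straight-line path between noise x0 and data x1; sampling then integrates that velocity field from t=0 to t=1. A toy sketch of the objective and an Euler sampler (this is generic conditional flow matching, not the paper's pyramidal variant):

```python
def flow_matching_pair(x0, x1, t):
    """Linear interpolant x_t = (1-t) x0 + t x1 and its constant
    target velocity v = x1 - x0 (the regression target for the network)."""
    xt = [(1 - t) * a + t * b for a, b in zip(x0, x1)]
    v = [b - a for a, b in zip(x0, x1)]
    return xt, v

def euler_sample(x0, velocity_fn, steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with Euler steps."""
    x, dt = list(x0), 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# With the exact straight-line field v = x1 - x0, Euler recovers x1 from x0
# (up to float rounding).
x0, x1 = [0.0, 0.0], [1.0, 2.0]
print(euler_sample(x0, lambda x, t: [1.0, 2.0]))
```

The "pyramid" part of the paper is about doing this at progressively lower resolutions for most of the trajectory to save compute; the core objective is the same.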
Prefer humans to machines; _and_ this protocol has enough weirdos to be the only place like it on the 'net; but let the bots play with each other while we reward ourselves
AI has taken over the Nobel Prize.
- Physics ✔️
- Chemistry ✔️
A Visual Guide to Mixture of Experts (MoE)
> recommended reading
A Mixture of Experts (MoE) is a neural network architecture that consists of multiple "expert" models and a router mechanism to direct inputs to the most suitable expert. This approach allows the model to handle specific aspects of a task more efficiently. The router decides which expert processes each input, enabling a model to use only a subset of its total parameters for any given task.
The success of MoE lies in its ability to scale models with more parameters while reducing computational costs during inference. By selectively activating only a few experts for each input, MoE optimizes performance without overloading memory or compute resources. This flexibility has made MoE effective in various domains, including both language and vision tasks.
https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-mixture-of-experts
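The router-plus-experts idea in the excerpt fits in a few lines: score the experts, keep the top-k, and mix only those outputs. A toy pure-Python version (scalar "experts" and made-up router weights, purely for illustration):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, router_w, experts, k=2):
    """Route input x to the top-k experts by router score; mix their outputs.
    Experts outside the top-k are never evaluated -- that's the compute saving."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_w]
    probs = softmax(logits)
    topk = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in topk)  # renormalize over the chosen experts
    return sum(probs[i] / norm * experts[i](x) for i in topk)

# Toy setup: 3 scalar "experts" on a 2-d input; only the top-2 run.
experts = [lambda x: sum(x), lambda x: 2 * sum(x), lambda x: -sum(x)]
router_w = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
print(moe_forward([1.0, 1.0], router_w, experts, k=2))  # -> 3.0
```

In a real transformer the "experts" are feed-forward blocks and the router runs per token, but the control flow is exactly this.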
Podcastfy is an open-source Python tool that converts web content, PDFs, and text into multi-lingual audio conversations using GenAI, focusing on customizable and programmatic audio generation.
https://github.com/souzatharsis/podcastfy-demo
anyone have a go-to tool for converting text to diagram? kind of like excalidraw's text-to-diagram using mermaid but with bigger context windows, memory, and prettier diagrams. specifically something like a UML diagram
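Not a full answer, but if the bottleneck is getting structured text into Mermaid reliably, one pattern is to have the LLM produce structured data and let a small script emit the diagram source deterministically. A hypothetical helper for a Mermaid classDiagram (names and layout are my own; paste the output into mermaid.live or Excalidraw):

```python
def to_mermaid_class_diagram(classes, relations):
    """Render {class_name: [member lines]} plus (parent, child) inheritance
    pairs as Mermaid classDiagram source text."""
    lines = ["classDiagram"]
    for name, members in classes.items():
        lines.append(f"  class {name} {{")
        lines.extend(f"    {m}" for m in members)
        lines.append("  }")
    for parent, child in relations:
        lines.append(f"  {parent} <|-- {child}")
    return "\n".join(lines)

print(to_mermaid_class_diagram(
    {"Animal": ["+name: str", "+speak()"], "Dog": ["+speak()"]},
    [("Animal", "Dog")],
))
```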
Software is not just eating the world but also the accolades of other disciplines
https://x.com/NobelPrize/status/1843589140455272810?t=dOfTniUrOzMPwWdI3pVN2Q&s=19
The Llama 3.1-Nemotron-70B-Reward model was trained through reinforcement learning from human feedback (RLHF). It leads the RewardBench leaderboard with a 94.1% score, demonstrates proficiency in Safety (95.1%) and Reasoning (98.1%), and effectively rejects unsafe responses while solving complex tasks. Despite being smaller than the Nemotron-4 340B Reward model, it offers high efficiency and accuracy, and its use of CC-BY-4.0-licensed HelpSteer2 training data makes it suitable for enterprise use. The model combines regression-style and Bradley-Terry reward modeling, using meticulously curated data from HelpSteer2 to maximize performance.
https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Reward
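The Bradley-Terry part mentioned above is just a pairwise preference loss: the reward model scores a chosen and a rejected response, and training minimizes -log σ(r_chosen − r_rejected). A minimal sketch of that loss (generic formulation, not NVIDIA's training code):

```python
import math

def bradley_terry_loss(r_chosen, r_rejected):
    """-log sigmoid(r_chosen - r_rejected): the pairwise preference loss
    used to train reward models on chosen/rejected response pairs."""
    diff = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-diff)))

# The loss shrinks as the chosen response is scored further above the rejected one.
print(bradley_terry_loss(2.0, 0.0))   # ~0.127
print(bradley_terry_loss(0.0, 2.0))   # ~2.127
```

The regression-style component the card also mentions instead fits scalar quality ratings directly; combining both is what the HelpSteer2 data enables.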