Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
last post 3 hours ago
[D] It’s 2026. Can we finally admit TensorFlow is the "COBOL of Machine Learning"?
3 hours ago @ reddit.com
[D] Real-time Student Attention Detection: ResNet vs Facial Landmarks - Which approach for resource-constrained deployment?
5 hours ago @ reddit.com
[R] ACL ARR review desk rejected
6 hours ago @ reddit.com
[D] We audited LoCoMo: 6.4% of the answer key is wrong and the judge accepts up to 63% of intentionally wrong answers
7 hours ago @ reddit.com
[P] ClaudeFormer: Building a Transformer Out of Claudes — Collaboration Request
7 hours ago @ reddit.com
[D] Building a demand forecasting system for multi-location retail with no POS integration, architecture feedback wanted
7 hours ago @ reddit.com
[P] Deezer showed CNN detection fails on compressed audio, here's a dual-engine approach that survives MP3
9 hours ago @ reddit.com
[D] Looking for definition of open-world ish learning problem
10 hours ago @ reddit.com
[D] On conferences and page limitations
11 hours ago @ reddit.com
[R] Which place should I commit to ACL SRW or ICML workshop or AACL?
15 hours ago @ reddit.com
Retraining vs Fine-tuning or Transfer Learning? [D]
15 hours ago @ reddit.com
[R] Interested in recent research into recall vs recognition in LLMs
21 hours ago @ reddit.com
Pretrained ADAM v2 weights [D]
22 hours ago @ reddit.com
[D] Why evaluating only final outputs is misleading for local LLM agents
1 day ago @ reddit.com
[D] - 1M tokens/second serving Qwen 3.5 27B on B200 GPUs, benchmark results and findings
1 day, 1 hour ago @ reddit.com
Towards Data Science
last post 5 hours ago
Building a Production-Grade Multi-Node Training Pipeline with PyTorch DDP

We will build a complete, production-grade multi-node training pipeline from scratch using PyTorch’s DistributedDataParallel (DDP).

    model = MiniResNet(
        in_channels=config.in_channels,
        num_classes=config.num_classes,
    )
    return model.to(device)

def wrap_ddp(model: nn.Module, local_rank: int) -> DDP:
    """Wrap model with DistributedDataParallel."""

Performance Pitfalls and Tips: After building hundreds of distributed training jobs, these are the mistakes I see most often: forgetting sampler.set_epoch().
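To see why that pitfall matters, here is a hedged stdlib sketch of the mechanism (not PyTorch's actual implementation; `epoch_order` is a hypothetical name): DistributedSampler derives its shuffle from the base seed plus the epoch, so a forgotten `sampler.set_epoch(epoch)` replays the exact same sample order every epoch.

```python
import random

def epoch_order(num_samples: int, base_seed: int, epoch: int) -> list:
    # Mimics the behavior sampler.set_epoch() controls: the permutation
    # is seeded with (base_seed + epoch), so it only changes when the
    # current epoch is actually passed in.
    rng = random.Random(base_seed + epoch)
    order = list(range(num_samples))
    rng.shuffle(order)
    return order

# If set_epoch() is never called, every epoch sees epoch=0's order again.
repeat = epoch_order(8, base_seed=42, epoch=0) == epoch_order(8, base_seed=42, epoch=0)
```

Advancing the epoch argument reshuffles the data, which is exactly what the real sampler needs to be told to do.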

A clean, modular codebase (config → data → model → training → utils) is what makes scaling from 1 GPU to 100 GPUs feasible.

Distributed training often feels intimidating — not because it’s inherently…

5 hours ago @ towardsdatascience.com
A Beginner’s Guide to Quantum Computing with Python

Quantum Computing is the exploitation of properties of Quantum Mechanics to perform computations and solve problems that a classical computer can’t and never will.

For example, the word “Hi” is encoded as “H”: 01001000 and “i”: 01101001, i.e. 01001000 01101001. Quantum computers, on the other hand, process information with qubits (quantum bits), which can be both 0 and 1 at the same time.
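The classical encoding half of that example can be reproduced with a couple of lines of standard-library Python (an illustration; `to_bits` is not code from the article):

```python
def to_bits(text: str) -> str:
    # Each character becomes its 8-bit ASCII code point.
    return " ".join(format(ord(ch), "08b") for ch in text)

to_bits("Hi")  # -> "01001000 01101001"
```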

There is no exact public count of how many quantum computers are out there, but estimates are around 200 worldwide.

Setup: In Python, there are several libraries for working with quantum computers around the world. Qiskit by IBM is the most complete high-level ecosystem for running quantum programs on IBM quantum computers, perfect f…

8 hours ago @ towardsdatascience.com
How ElevenLabs Voice AI Is Replacing Screens in Warehouse and Manufacturing Operations

Illustration of an operator using voice picking (image by Samir Saci). When I was designing supply chain solutions in logistics companies, vocalisation was the default choice, especially for price-sensitive projects.

What if you could achieve similar results using a smartphone, a custom-made web application, and modern AI voice technology?

The solution was a web application connected to the Warehouse Management System (WMS) database that guides the operator through the warehouse.

This sequential flow is critical because it determines the walking path through the warehouse using the optimisation algorithms.

If your warehouse has WiFi and your operators have smartphones, you can prototype a v…

10 hours ago @ towardsdatascience.com
How to Make Your AI App Faster and More Interactive with Response Streaming

In the context of AI applications, HTTP streaming over SSE can support simple AI applications where we just need to stream the model’s response for latency and UX reasons.

In the rest of this post, we’ll take a closer look at streaming for simple AI apps using HTTP streaming over SSE. What about HTTP Streaming Over SSE?

HTTP Streaming over Server-Sent Events (SSE) is based on HTTP streaming. HTTP streaming means that the server can send whatever it has to send in parts, rather than all at once.

Fortunately, there are several frameworks and wrappers that simplify HTTP streaming, one of which is HTTP Streaming over Server-Sent Events (SSE).
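A minimal sketch of the SSE wire format the excerpt is describing (a hypothetical helper, not from the article; real apps would use a framework): each event is an optional "event:" field, one "data:" line per payload line, and a blank line that tells the client the event is complete.

```python
def sse_event(data: str, event: str = "") -> str:
    # Server-Sent Events wire format: optional "event:" field,
    # one "data:" line per line of payload, blank line terminates.
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.extend(f"data: {part}" for part in data.splitlines())
    return "\n".join(lines) + "\n\n"

sse_event("Hello")                 # "data: Hello\n\n"
sse_event("token", event="delta")  # "event: delta\ndata: token\n\n"
```

A server streaming a model's response would emit one such event per generated chunk.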

Streaming makes AI sys…

1 day, 5 hours ago @ towardsdatascience.com
Beyond Code Generation: AI for the Full Data Science Workflow

The bigger shift is that AI can now participate in a much more end-to-end data science workflow.

AI references my old GitHub code and writes a Python script to parse the raw data.

Essentially, AI handles nearly every step from data engineering to analysis, with me acting more as a reviewer and decision-maker.

Codex MCP set up screenshots by the author.

For example, AI noticed my activity level had declined significantly since 2020.

1 day, 8 hours ago @ towardsdatascience.com
What the Bits-over-Random Metric Changed in How I Think About RAG and Agents

But after reading the recent work on Bits over Random (BoR), I think they are incomplete for the Agentic systems many of us are now actually building.

Figure 1: In LLM systems, retrieval quality is not just about finding relevant information, but about how much irrelevant material comes with it.

If you show the model only 1 tool, random chance of surfacing a relevant one is 20 percent.

At 5 tools, random inclusion is already fairly strong.
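The arithmetic behind those numbers can be sketched as follows, assuming a uniformly random subset drawn from a pool of 5 tools with one relevant tool, which matches the excerpt's 20 percent figure (`p_random_inclusion` is a hypothetical helper, not from the article):

```python
from math import comb

def p_random_inclusion(pool: int, shown: int, relevant: int = 1) -> float:
    # Probability that a uniformly random subset of `shown` tools
    # contains at least one of the `relevant` tools in a pool of `pool`.
    p_miss = comb(pool - relevant, shown) / comb(pool, shown)
    return 1.0 - p_miss

p_random_inclusion(pool=5, shown=1)  # ~0.2, the excerpt's 20 percent
p_random_inclusion(pool=5, shown=4)  # ~0.8, random inclusion already strong
```

This is the baseline a retrieval or tool-selection system has to beat before its "accuracy" means anything.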

How this changes evaluation practice for me: The biggest practical shift is that I would now evaluate retrieval systems more like this: first, look at standard retrieval metrics.

1 day, 10 hours ago @ towardsdatascience.com
Following Up on Like-for-Like for Stores: Handling PY

Introduction: Following up on my last article about building the Like-for-Like (L4L) solution based on Power Query: the solution works as expected for the most part.

Look at the following two screenshots, which show two different cases that include the Retail Sales and the Retail Sales PY measures.

We see these results for the second case: the values for the Retail Sales PY measure appear for “Comparable” stores, but with an interruption between August and October.

Retail Sales (PY) =
CALCULATE(
    [Retail Sales],
    'Time Intelligence'[Time Measures] = "PY",
    USERELATIONSHIP('Bridge_L4L'[L4LKey_PY], 'DIM_L4L'[L4LKey])
)

Now the results are as expected.

Conclusion: The additional call of the USERELATIONSHIP() function…

2 days, 5 hours ago @ towardsdatascience.com
The Machine Learning Lessons I’ve Learned This Month

There is the main project (writing an MLOps pipeline, drafting a paper, upgrading the compute cluster).

And here we come back to the main project: time spent on other projects is not available for the main project.

Instead, by simply blocking parts of your calendar, you can dedicate sufficient time to the main project.

Essentially, it boils down to prioritization in 90% of cases: prioritize the main project.

Planning, planning, and keeping the plan the plan: Looking back on the month, and the previous two lessons learned, I think this all calls for an overarching lesson: planning.

2 days, 7 hours ago @ towardsdatascience.com
Building Human-In-The-Loop Agentic Workflows

On the contrary, human review remains important because of LLMs’ inherent probabilistic nature and the potential for errors.

In this article, we will walk through how to use LangGraph to set up a human-in-the-loop agentic workflow for content generation and publication on Bluesky.

(1) Primer to LangGraph: LangGraph (part of the LangChain ecosystem) is a low-level agent orchestration framework and runtime for building agentic workflows.

The web search node utilizes the Tavily tool to search online for articles matching the topic.

Let us see how the former works by setting up the human review node.

2 days, 8 hours ago @ towardsdatascience.com
My Models Failed. That’s How I Became a Better Data Scientist.

And I was going to be a Data Scientist.

Illustration by Round Icons on Unsplash. Data Scientist II: Following the successful implementation of a few other models, I was promoted to Data Scientist II.

The modern data scientist is the perfect candidate to be that translator.

In healthcare, especially with AI magic at our fingertips, a successful data scientist isn’t the one who can build the most complex models.

A successful data scientist is one who understands the environment the model is meant for.

2 days, 11 hours ago @ towardsdatascience.com
How to Make Claude Code Improve from its Own Mistakes

Continual learning for agents is still an unsolved challenge, but I’ll go through how I make my coding agents learn from their mistakes and improve over time.

Why do we need continual learning? We need continual learning because we always want to become better at the tasks we do.

The generalize knowledge command: The simplest and most effective approach to making Claude Code learn from its mistakes is a general knowledge command.

Skills: Skills are another concept I’d like to cover, which really helps contribute to continual learning and helps Claude Code learn from its mistakes.

Conclusion: In this article, I’ve covered how to make Claude Code and other coding agents learn from their mistakes.

3 days, 4 hours ago @ towardsdatascience.com
From Dashboards to Decisions: Rethinking Data & Analytics in the Age of AI

I attended the Gartner Data & Analytics (D&A) Summit 2026 in Orlando, Florida.

Across three days of hearing from data & analytics leaders, one idea stood out clearly: analytics is no longer just about asking questions and comprehending the past.

This shift redefines the role of analysts and data & analytics teams.

Analytics is evolving from AI copilots to autonomous, agent-driven decision systems that are powered by context, semantics, and real-world data.

Rashi is a data wiz from Chicago who loves to analyze data and create data stories to communicate insights.

3 days, 5 hours ago @ towardsdatascience.com
Production-Ready LLM Agents: A Comprehensive Framework for Offline Evaluation

But if we ask an LLM the same question twice, we’ll get different phrasings, different structures, sometimes different emphasis.

Different query types need different quality criteria.

You need offline evaluation to define your quality baseline before online evaluation can tell you whether you’re maintaining it.

Different agents need different evaluation criteria, and this isn’t always obvious upfront.

Context precision measures what fraction of retrieved documents were actually relevant — if you retrieved four documents and only two were useful, that’s 50% precision.
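That definition is simple enough to sketch in a few lines (a hedged illustration; the evaluation framework in the article may compute it differently, and `context_precision` here is a hypothetical helper):

```python
def context_precision(retrieved: list, relevant: set) -> float:
    # Fraction of retrieved documents that are actually relevant.
    if not retrieved:
        return 0.0
    hits = sum(doc in relevant for doc in retrieved)
    return hits / len(retrieved)

# Four documents retrieved, only two useful -> 50% precision.
context_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"})  # 0.5
```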

3 days, 7 hours ago @ towardsdatascience.com
The Complete Guide to AI Implementation for Chief Data & AI Officers in 2026

In this guide we’ll outline a framework for evaluating the different avenues Chief Data and AI Officers have for advancing AI in their companies.

A measure of the increase in productivity AI can facilitate. A framework for thinking about the impact of AI.

Given the technical nature of the work for Data and AI Teams, incorporating AI and automation into processes would appear of fundamental importance in 2026.

Summary | Good AI needs good Process: In this piece I outlined a framework for Chief Data and AI Officers to evaluate AI initiatives and to form a holistic AI strategy.

This should come as good news, not just for Chief Data and AI Officers, but for Data Practitioners generally.

3 days, 8 hours ago @ towardsdatascience.com
4 Pandas Concepts That Quietly Break Your Data Pipelines

Things like: how Pandas handles data types; how index alignment works; the difference between a copy and a view; and how to write defensive data manipulation code. These concepts don’t feel exciting when you’re first learning Pandas.

We have revenue values, some discounts, and a few missing entries.

Data Types: The Hidden Source of Many Pandas Bugs. The issue we just saw comes down to something simple: data types.

Index Alignment: Pandas Matches Labels, Not Rows. One of the most powerful, and confusing, behaviors in Pandas is index alignment.
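A plain-Python sketch of what label alignment means (this mimics the behavior, it is not pandas code; `aligned_add` is hypothetical, and pandas would produce NaN where None appears here):

```python
# When pandas adds two Series, values are matched by index label,
# not by row position; labels present in only one operand yield a
# missing result.
def aligned_add(a: dict, b: dict) -> dict:
    return {
        label: (a[label] + b[label]) if label in a and label in b else None
        for label in sorted(set(a) | set(b))
    }

s1 = {"a": 1, "b": 2}
s2 = {"b": 10, "c": 20}
aligned_add(s1, s2)  # {"a": None, "b": 12, "c": None}
```

This is why concatenating or combining frames with mismatched indexes can silently introduce missing values instead of raising an error.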

Defensive Data Manipulation: Writing Pandas Code That Fails Loudly. One thing I’ve slowly realized while working with data is that most problems don’t come f…

4 days, 4 hours ago @ towardsdatascience.com
Distill.pub
last post: none
The Gradient
last post 1 month ago
The Reasonable Effectiveness of Virtue Ethics in AI Alignment

I say she tries to be mathematically excellent, which involves promoting mathematical excellence through mathematical excellence, and that this structure is closely related to why 'mathematical excellence' can even be a concept.

This is not to suggest that ‘good mathematics causes future good mathematics’ is a full definition or even full informal description of good mathematics.

My claim is only that the fact that good mathematics has a disposition to cause future good mathematics reveals something essential about our concept of good mathematics (and about the material affordances enabling this concept).

‘mathematical excellence through mathematical excellence’ AI?

I believe ‘mathematical …

1 month ago @ thegradient.pub
AGI Is Not Multimodal

Despite this, scale maximalists have implicitly suggested that multimodal models can be a structure-agnostic framework for AGI.

While structure-agnostic scale maximalism has succeeded in producing LLMs and LVMs that pass Turing tests, a multimodal scale maximalist approach to AGI will not bear similar fruit.

Citation: For attribution in academic contexts or books, please cite this work as: Benjamin A. Spiegel, "AGI Is Not Multimodal", The Gradient, 2025.

@article{spiegel2025agi,
  author = {Benjamin A. Spiegel},
  title = {AGI Is Not Multimodal},
  journal = {The Gradient},
  year = {2025},
  howpublished = {\url{https://thegradient.pub/agi-is-not-multimodal}},
}

References: Andreas, Jacob.

“Language Models,…

9 months, 3 weeks ago @ thegradient.pub
TheSequence
last post 1 day, 9 hours ago
The Sequence Opinion #831: NVIDIA Is Quietly Building the Operating System of AI

A hardware company ships a breakthrough chip.

Then the hardware company realizes the software layer is where the real lock-in lives, and starts building it themselves.

NVIDIA is doing it right now with AI — except they’re doing it at a scale and speed that makes the previous examples look quaint.

He talked about an operating system for AI factories.

Buried under seven new chips, five rack-scale systems, and a software ecosystem so vertically integrated it makes Apple look open.

1 day, 9 hours ago @ thesequence.substack.com
The Sequence AI of the Week #830: The Quiet Ambush: Inside the Amazing MiMo-V2-Pro aka Hunter Alpha

Every so often, the artificial intelligence community experiences a collective double-take.

We are moving rapidly from the “Chat Era”—where models act as passive oracles answering static trivia and generating boilerplate—to the “Agent Era,” where models are embedded in complex scaffolds, utilizing tools, and driving open-ended, continuous software engineering workflows.

In this high-stakes transition, the most disruptive entry didn’t come with a press tour.

It came disguised as a nameless API endpoint on OpenRouter, executing a strategy of pure, blind telemetry.

The “Hunter Alpha” Anomaly

2 days, 9 hours ago @ thesequence.substack.com
The Sequence Knowledge #829: World Models and Physical AI

Today we will discuss: world models for physical AI.

Inside the amazing NVIDIA Cosmos model.

💡 AI Concept of the Day: World Models and Physical AI. While modern world models often focus on the “temporal prediction” of pixels, essentially hallucinating the next frame in a video, World Labs’ Marble represents a fundamental shift toward spatial intelligence.

Founded by Dr. Fei-Fei Li, World Labs has positioned Marble not as a mere video generator, but as a “Large World Model” (LWM) designed to reconstruct, generate, and simulate persistent 3D environments.

The Core Architecture: Lifting 2D into 4D

3 days, 9 hours ago @ thesequence.substack.com
The Sequence Radar #828: NVIDIA’s GTM Releases, Bezos’s $100B Bet, Xiaomi’s Ambush, and the Fracturing of OpenAI

The spotlight was stolen by their agentic software announcements, most notably Dynamo 1.0—an open-source distributed operating system for AI factories—and NemoClaw, an enterprise stack designed to build and orchestrate self-evolving agents.

By integrating these with the new Agent Toolkit, NVIDIA is positioning itself as the foundational orchestration layer for the agentic economy.

They aren’t just selling the silicon anymore; they are building the entire nervous system for enterprise AI.

AI Lab: Microsoft Research. Summary: This paper introduces Online Experiential Learning (OEL), a framework that enables large language models to continuously improve from real-world deployment experiences.

AI…

5 days, 9 hours ago @ thesequence.substack.com
The Sequence Opinion #827: Taming the Agentic Lobster: Learning from OpenClaw

The model forgets you, your context, and your goals the moment you close the browser tab.

Created by Austrian vibe-coder Peter Steinberger—who recently took these ideas with him to OpenAI—OpenClaw has rapidly become the defining architecture for local, autonomous AI agents.

Let’s dive under the hood and deconstruct the OpenClaw architecture.

Once you understand how OpenClaw works, you understand the blueprint for how all production-grade agentic systems will operate.

The Anatomy of OpenClaw: Component by Component

1 week, 1 day ago @ thesequence.substack.com
The Sequence AI of the Week #826: Sleep While it Computes: Inside Karpathy’s AutoResearch

For years, the frontier of artificial intelligence research has been constrained by a very specific bottleneck: the human being.

As Andrej Karpathy recently put it, AI research has historically been performed by “meat computers” who have to eat, sleep, and occasionally synchronize their findings using “sound wave interconnects” during painfully slow weekly group meetings.

The machine learning loop—forming a hypothesis, modifying code, launching a training run, waiting, and evaluating the results—operates strictly at human speed.

1 week, 2 days ago @ thesequence.substack.com
The Sequence Knowledge #825: Inside World Labs Marble

Today we will discuss: an overview of World Labs’ amazing Marble model.

A deep dive into the engine powering Marble.

💡 AI Concept of the Day: Inside World Labs’ Marble Model. While modern world models often focus on the “temporal prediction” of pixels, essentially hallucinating the next frame in a video, World Labs’ Marble represents a fundamental shift toward spatial intelligence.

Founded by Dr. Fei-Fei Li, World Labs has positioned Marble not as a mere video generator, but as a “Large World Model” (LWM) designed to reconstruct, generate, and simulate persistent 3D environments.

The Core Architecture: Lifting 2D into 4D

1 week, 3 days ago @ thesequence.substack.com
The Sequence Radar #824: Last Week in AI: Sovereign Lobsters, Self-Coding Agents, and Gigawatt Factories

Subscribe and don’t miss out. 📝 Editorial: Last Week in AI: Sovereign Lobsters, Self-Coding Agents, and Gigawatt Factories. Welcome to The Sequence.

It is the scientific method running at machine speed—a profound paradigm shift where AI models actively improve themselves while researchers sleep.

Anthropic is applying a similar multi-agent philosophy to software engineering with the launch of Claude Code Review.

Dubbed “raising lobsters” by locals (a nod to the project’s crustacean mascot), OpenClaw allows users to run persistent, locally-hosted AI agents capable of controlling operating systems and executing complex workflows.

🤖 AI Tech Releases: Claude Code Review. Anthropic released Claude Code …

1 week, 5 days ago @ thesequence.substack.com
The Sequence Opinion #823: SaaSmagedon, Is SaaS Dead?: Vibe Coding, Agentic Engineering, and the Collapse of the Code Moat

The software industry is currently navigating a phase transition of unprecedented magnitude, an event frequently characterized in financial and technical circles as SaaSmagedon or the SaaS-pocalypse.

This period is marked not by a mere cyclical downturn, but by a structural re-evaluation of the entire $1 trillion software ecosystem.

In early 2026, the market witnessed a massive sell-off where over $1 trillion in market capitalization was erased from software stocks in a single week.

This transition is not merely a change in tooling; it is a change in the biology of how software is born, distributed, and monetized.

The Evolution of the Computational Stack: From Syntax to Stochastic Simulatio…

2 weeks, 1 day ago @ thesequence.substack.com
The Sequence AI of the Week #822: Inside GPT-5.4: When Language Models Start Acting Like Operating Systems

For years, discussions about frontier AI models revolved around a familiar set of architectural questions.

The most interesting architectural innovations are no longer happening strictly inside the transformer.

The neural network is still the core intelligence, but it increasingly functions as the cognitive engine inside a much larger execution environment.

Reasoning, memory management, tool usage, multimodal perception, and agentic behavior are now tightly integrated into the model’s operational stack.

The result is a system that looks less like a chatbot and more like a general-purpose cognitive runtime.

2 weeks, 2 days ago @ thesequence.substack.com
The Sequence Knowledge #821: 4D and World Models and the Amazing DeepMind D4RT

Today we will discuss: world models in 4D.

Exploring the amazing D4RT from DeepMind.

💡 AI Concept of the Day: The evolution of “world models” has reached a pivotal inflection point: the shift from predicting pixels in 2D video to reconstructing the physical geometry of a dynamic world in 4D.

As highlighted in Part 4 of our series for The Sequence, this frontier is defined by Spatial Intelligence—the ability of an AI to not only see a scene but to perceive its volume, its occluded parts, and its temporal trajectory with mathematical precision.

The D4RT Breakthrough: From Fragments to Unity

2 weeks, 3 days ago @ thesequence.substack.com
The Sequence Radar #820: GPT-5.4, Cursor, and the New Desktop

Subscribe and don’t miss out. 📝 Editorial: When AI Takes the Mouse: GPT-5.4, Cursor, and the New Desktop. Welcome to this week’s edition of The Sequence.

While proprietary AI models are crossing the Rubicon into autonomous, always-on digital workers, the human engines driving open-weight breakthroughs are colliding head-first with corporate bureaucracy.

OpenAI & Cursor: The Agentic Desktop. OpenAI just redefined the frontier with the release of GPT-5.4 and its reasoning-focused counterpart, GPT-5.4 Thinking.

GPT-5.4 can interact directly with software interfaces—interpreting screenshots, clicking, scrolling, and executing multi-step workflows across native desktop applications.

AI Lab: Qwen Team…

2 weeks, 5 days ago @ thesequence.substack.com
The Sequence Opinion #819: How AI Chips are Made?

From Silicon to Intelligence: What Actually Goes Into Designing an AI Chip. A deep dive into the architecture, physics, and craft of building hardware for neural networks, from RTL and Verilog to tape-out.

Let me tell you something that took me an embarrassingly long time to truly internalize: the reason deep learning works as well as it does in 2026 is only maybe 40% algorithms.

We got lucky — spectacularly lucky — that the GPU, a chip originally designed to make triangles pretty in Quake III, turned out to be almost exactly the right computational substrate for training neural networks.

But “almost exactly right” is doing a lot of heavy lifting in that sentence.

The story of AI chips is the s…

3 weeks, 1 day ago @ thesequence.substack.com
The Sequence AI of the Week #818: You Cannot Miss Qwen 3.5

They launched the Qwen 3.5 series.

And just yesterday, in a move that signals their ambition to own the entire deployment stack, they released the Qwen 3.5 “Small” series—ranging from 0.8B to 9B parameters, aimed squarely at on-device edge computing.

The Qwen 3.5 family isn’t just bigger; it’s structurally different.

Let’s dive into the technical contributions that make Qwen 3.5 one of the most fascinating engineering feats in the open-weight ecosystem right now.

The Wall: Why Dense Models Don’t Scale Economically

3 weeks, 2 days ago @ thesequence.substack.com
The Sequence Knowledge #817: DeepMind Genie and Interactive World Models

Today we will discuss: the core principles of actionable world models.

A deep dive into the amazing DeepMind Genie models.

💡 AI Concept of the Day: Actionable World Models and DeepMind’s Genie. If the “Sora moment” proved that AI could dream a consistent world, the next frontier is proving that AI can control it.

This distinction is the defining battleground of World Models today.

This brings us to the rise of Actionable World Models—systems that transform static video data into interactive, playable environments.

3 weeks, 3 days ago @ thesequence.substack.com
Synced Review
last post None
📓 Cool Blogs
ODS.ai Habr
last post 1 month ago
Why I Became an IT Volunteer & A News Dataset on the Contradictions of Modern Society

A simple example with fuel prices: gasoline gets more expensive both when the price of oil rises and when it falls.

The realization that your work grows someone else's market capitalization but does not solve the real problems of society, visible in everyday life and in the news, pushed me to look for another kind of activity.

Besides, thanks to AMB a unique dataset of news about the contradictions of modern society has appeared on Kaggle and GitHub; more on it below.

A News Dataset on the Contradictions of Modern Society: AMB activists and volunteers from allied collectives collected and annotated a dataset of news items that highlight the very systemic contradictions I had been thinking about.

Example B: In 2023, one in every 11 people in the world went hungry, while …

1 month ago @ habr.com
[Translation] How Codex Works

A detailed look at how the OpenAI Codex team builds its coding agent, how engineers use it, and what this could mean for the future of software development.

To understand how Codex works, how teams inside OpenAI use it, and how it shapes engineering practices at the creators of ChatGPT, I spoke with three OpenAI employees: Thibault Sottiaux, head of Codex.

Both products launched in the spring: Codex CLI was announced in April 2025, and Codex in ChatGPT was introduced in May.

On the Codex team, these files tell the agent how to navigate the codebase, which commands to run for testing, and how to follow project standards.

Using Codex at OpenAI: Besides…

1 month ago @ habr.com
The Natural Language Processing & LLMs Course: A New Season

On February 10 we are once again launching our free online course on natural language processing.

What we will cover: the classics to start: Zipf's law, TF-IDF, RNNs, CNNs, and the Transformer; core NLP tasks: text classification, tagging, and generation; specialized areas: agents and vibe coding; LLMs and their applications.

If you are a student at ITMO, MIPT, or HSE, the course can be counted for academic credit.

I have worked in NLP for more than 12 years, with stints at Yandex and VK, and have defended a PhD dissertation.

If you have questions, bring them to the ODS Mattermost; that is where you will find all the answers, seminar times, and links.

1 month, 3 weeks ago @ habr.com
SWE-MERA: A New Dynamic Benchmark for Agentic Code-Generation Models

However, all the tasks in MERA CODE, as in SWE-bench and other benchmarks of this kind, follow the classic paradigm: a fixed training set and, more importantly, a fixed evaluation set.

But the large code LLMs we are trying to evaluate with this benchmark also train on GitHub, going back to the very first LLaMA model.

700 tasks may not sound like a lot, but it is already a very respectable number, and most importantly, these are new tasks.

Current behavior:
from sympy import ask, Q, Symbol
x = Symbol('x')
print(ask(Q.finite(x**-1), Q.real(x)))  # Output: True
Expected behavior: The function should return None to indicate uncertainty, as x**-…

6 months, 1 week ago @ habr.com
DRAGON: A Dynamic Benchmark for Evaluating RAG Systems in Russian

Answer: Keisuke Chiba

[excerpt: the "Simple" SPARQL query template, a nested SELECT DISTINCT ?s ?r ?o with GROUP BY/HAVING constraints]

Answer: National Payment Card System (NSPK), Center for Biometric Technologies (TsBT), EBS

[excerpt: further SPARQL templates combining GROUP_CONCAT, FILTER, and UNION clauses to generate multi-answer questions]

8 months ago @ habr.com
RKNN Toolkit2: Model Conversion and Rockchip NPU Simulation

In this article I want to share my experience converting a neural network to the rknn format using the rknn-toolkit2 library.

Here is what the weights of the PyTorch model look like in Netron. Important!

Converting the ONNX model to rknn: next, an RKNN object is created that manages the conversion process and model inference on the Rockchip platform.

At this stage the model is prepared for conversion to the RKNN format and for subsequent execution on the Rockchip NPU.

Creating and exporting the rknn model: at this stage the ONNX model is converted to the internal RKNN format, the graph is optimized, and everything is prepared to run on the Rockchip NPU.

8 months, 1 week ago @ habr.com
MERA Code: Comprehensive Evaluation of Code Generation in Applied Scenarios

🔗 MERA Code 🔗 GitHub with code and data 🔗 Collection on Hugging Face 🔗 Paper on arXiv 🔗 Project repository on GitVerse. What is MERA Code?

Modern code language models and general-purpose models (ChatGPT, Claude, Qwen, YandexGPT, GigaChat, and others)

The list of current MERA Code tasks and their characteristics: a catalog of the MERA Code tasks with detailed descriptions is available on the website.

In MERA Code, prompts are carefully tailored to each task and to correct answer selection.

In conclusion: MERA Code is an attempt to close an important gap in LLM testing: how useful these models really are in real, localized development.

8 months, 1 week ago @ habr.com
Machine Learning Mastery
last post 9 hours ago
LlamaAgents Builder: From Prompt to Deployed AI Agent in Minutes

In this article, you will learn how to build, deploy, and test a no-code document-processing AI agent with LlamaAgents Builder in LlamaCloud.

Introduction: Creating an AI agent for tasks like analyzing and processing documents autonomously used to require hours of near-endless configuration, code orchestration, and deployment battles.

This article unveils the process of building, deploying, and using an intelligent agent from scratch without writing a single line of code, using LlamaAgents Builder.

Building with LlamaAgents Builder: LlamaAgents Builder is one of the newest features in the LlamaCloud web platform, whose flagship product was originally introduced as LlamaParse.

Th…

9 hours ago @ machinelearningmastery.com
Vector Databases Explained in 3 Levels of Difficulty

How vector databases support nearest neighbor search, metadata filtering, and hybrid retrieval.

How indexing techniques such as HNSW, IVF, and PQ help vector search scale in production.

Vector databases answer a different one: which records are most similar to this?

Comparing a query vector against every stored vector means billions of floating-point operations at production data sizes, and that math makes real-time search impractical.

Production vector databases run ANN algorithms under the hood.
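The brute-force cost described above is easy to see in code: exact search does one dot product per stored vector, which is precisely the work that ANN indexes like HNSW, IVF, and PQ avoid. A minimal NumPy sketch of exact nearest-neighbor search (illustrative data and names, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)
db = rng.normal(size=(10_000, 64)).astype(np.float32)  # stored vectors
db /= np.linalg.norm(db, axis=1, keepdims=True)        # unit-normalize for cosine
query = db[42] + 0.01 * rng.normal(size=64).astype(np.float32)
query /= np.linalg.norm(query)

# Exact (brute-force) search: one dot product per stored vector,
# i.e. 10,000 x 64 multiply-adds for this single query.
scores = db @ query
top5 = np.argsort(-scores)[:5]
print(top5[0])  # the lightly perturbed copy of vector 42 is its own nearest neighbor
```

At billions of vectors this linear scan is exactly the "billions of floating-point operations" the article warns about, which is why production systems trade a little recall for an index that prunes most comparisons.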

1 day, 6 hours ago @ machinelearningmastery.com
5 Practical Techniques to Detect and Mitigate LLM Hallucinations Beyond Prompt Engineering

search(query_embedding, k=1)
retrieved_doc = documents[indices[0][0]]
# Step 7: Generate response using retrieved context
client = OpenAI()
response = client.chat.completions.create(model="gpt-4o-mini", messages=[{"role": "system", "content": "Answer using the provided context only."

Instead of relying on a single model response, you introduce additional steps that check, validate, or challenge what was generated before it reaches the user.

Here is a simple implementation:
from openai import OpenAI
client = OpenAI()
def get_answer_with_confidence(question):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": "Answer the ques…
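Beyond the article's own snippets, the "check before it reaches the user" idea can be as cheap as self-consistency: sample the same question several times and treat disagreement among answers as a hallucination signal. A toy sketch of that idea (the helper name and threshold are my own, not from the article):

```python
from collections import Counter

def self_consistency(answers, threshold=0.6):
    """Majority-vote over several sampled answers; low agreement
    suggests the model may be guessing rather than recalling."""
    counts = Counter(answers)
    best, n = counts.most_common(1)[0]
    agreement = n / len(answers)
    return best, agreement, agreement >= threshold

# Three samples of the same question: two agree, one dissents.
print(self_consistency(["Paris", "Paris", "Lyon"]))
```

In practice the `answers` list would come from repeated model calls at nonzero temperature; here it is hard-coded so the sketch runs standalone.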

2 days, 7 hours ago @ machinelearningmastery.com
Beyond the Vector Store: Building the Full Data Layer for AI Applications

In this article, you will learn why production AI applications need both a vector database for semantic retrieval and a relational database for structured, transactional workloads.

Topics we will cover include: what vector databases do well, and where they fall short in production AI systems.

Production AI applications need two complementary data engines working in lockstep: a vector database for semantic retrieval, and a relational database for everything else.

This is cheaper, faster, and more reliable than relying on vector search alone to return a perfectly scoped result set.

If you are building a production AI application, it would be a mistake to treat these as competin…

3 days, 11 hours ago @ machinelearningmastery.com
7 Steps to Mastering Memory in Agentic AI Systems

A Guide to Enhancing AI Learning and Recall | MongoDB. Step 2: Learning the AI Agent Memory Type Taxonomy. Cognitive science gives us a vocabulary for the distinct roles memory plays in intelligent systems.

Further reading: Beyond Short-term Memory: The 3 Types of Long-term Memory AI Agents Need and Making Sense of Memory in AI Agents by Leonie Monigatti. Step 3: Knowing the Difference Between Retrieval-Augmented Generation and Memory. One of the most persistent sources of confusion for developers building agentic systems is conflating retrieval-augmented generation (RAG) with agent memory.

Further reading: AI Agent Memory: Build Stateful AI Systems That Remember – Redis and Building Memory-Aware A…

4 days, 9 hours ago @ machinelearningmastery.com
Why Agents Fail: The Role of Seed Values and Temperature in Agentic Loops

In this article, you will learn how temperature and seed values influence failure modes in agentic loops, and how to tune them for greater resilience.

Topics we will cover include: how low and high temperature settings can produce distinct failure patterns in agentic loops.

However, two invisible steering mechanisms can also influence failure: temperature and seed value.

Regarding this setting, the main problem that usually causes failure in agent loops is using a fixed seed in production.

Basically, breaking out of failure in agentic loops often entails changing the seed value or temperature as part of retry efforts to seek a different cognitive path.
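The retry idea above can be sketched as a loop that raises temperature and draws a fresh seed on each failed attempt; `flaky_step` is a stub standing in for a real model call, and all names are illustrative rather than from the article:

```python
import random

def run_with_escape(step, max_retries=3, base_temperature=0.2):
    """Retry an agent step; each failure bumps temperature and draws a
    fresh seed so the next attempt can take a different cognitive path."""
    for attempt in range(max_retries):
        temperature = min(base_temperature + 0.3 * attempt, 1.0)
        seed = 1234 if attempt == 0 else random.randint(0, 2**31 - 1)
        result = step(temperature=temperature, seed=seed)
        if result is not None:
            return result, attempt
    return None, max_retries

def flaky_step(temperature, seed):
    # Stub: deterministic failure below temperature 0.4, mimicking an
    # agent stuck in a low-temperature repetition loop.
    return "done" if temperature > 0.4 else None

print(run_with_escape(flaky_step))
```

The fixed seed on the first attempt preserves reproducibility for the happy path; only the retries introduce variation, which mirrors the article's warning about leaving a single fixed seed in production.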

1 week ago @ machinelearningmastery.com
5 Production Scaling Challenges for Agentic AI in 2026

In this article, you will learn about five major challenges teams face when scaling agentic AI systems from prototype to production in 2026.

Introduction: Everyone’s building agentic AI systems right now, for better or for worse.

So let’s talk about the five biggest headaches teams are running into as they try to scale agentic AI in 2026.

As agentic systems start making decisions that affect customers directly, questions about accountability, auditability, and compliance become urgent.

If you’re scaling agentic systems right now, just know that the pain you’re feeling is universal.

1 week, 1 day ago @ machinelearningmastery.com
7 Readability Features for Your Next Machine Learning Model

Topics we will cover include: how Textstat can quantify readability and text complexity for downstream machine learning tasks.

By contrast, the Automated Readability Index (ARI) computes grade levels based on the number of characters per word.

# Calculate Automated Readability Index
df['ARI'] = df['Text'].apply(textstat.automated_readability_index)
print("Automated Readability Index:")
print(df[['Category', 'ARI']])

Output:
Automated Readability Index:
  Category        ARI
0   Simple  -2.288000
1 Standard  12.559412 …

1 week, 2 days ago @ machinelearningmastery.com
Everything You Need to Know About Recursive Language Models

In this article, you will learn what recursive language models are, why they matter for long-input reasoning, and how they differ from standard long-context prompting, retrieval, and agentic systems.

Introduction: If you are here, you have probably heard about recent work on recursive language models.

Recursive language models (RLMs) are a response to this problem.

How a Recursive Language Model Works in Practice: In an RLM setup, the prompt is treated as part of the external environment.

If you want to explore recursive language models in more depth, the following resources are useful starting points:

1 week, 3 days ago @ machinelearningmastery.com
Building Smart Machine Learning in Low-Resource Settings

Why Lightweight Machine Learning Is Actually a Power Move: The truth is that deep learning gets a lot of hype, but in low-resource environments, lightweight models are your best friend.

[code excerpt: seaborn plots titled 'Temperature Distribution', 'Humidity Distribution', and 'Rainfall Distribution']

In many ways, this captures the spirit of machine learning in low-resource environments.

Conclusion (image by author): Working in low-resource machine learning environments is possible.

2 weeks, 1 day ago @ machinelearningmastery.com
Setting Up a Google Colab AI-Assisted Coding Environment That Actually Works

Yes: Colab now incorporates tools for AI-assisted coding, such as code generation from natural language, explanations of written code, auto-completion, and smart troubleshooting.

Looking into Colab’s AI-Assisted Capabilities: First, we sign in to Google Colab with a Google account of our choice and click “New Notebook” to start a fresh coding workspace.

Now, there is a third type of cell, and it is not clearly identifiable at first glance: its name is the AI prompt cell.

Creating an AI prompt cell is simple: in the upper toolbar, right below the menus, click on the little dropdown arrow next to “Code” and select “Add AI prompt cell”.

Wrapping Up: Google Colab is continuously releasing new AI-as…

2 weeks, 3 days ago @ machinelearningmastery.com
From Text to Tables: Feature Engineering with LLMs for Tabular Data

Specifically, you can leverage pre-trained LLMs from providers like Groq (for example, models from the Llama family) to undertake data transformation and preprocessing tasks, including turning unstructured data like text into fully structured, tabular data that can be used to fuel predictive machine learning models.

import random
import time

random.seed(42)
categories = ["access", "inquiry", "software", "billing", "hardware"]
templates = {
    "access": ["I've been locked out of my account for {days} days and need urgent help!

…choice(categories)
# Injecting a random number of days into specific templates to foster variety
text = random. …

…append({"text": text, "account_age_days": rando…

2 weeks, 3 days ago @ machinelearningmastery.com
The 6 Best AI Agent Memory Frameworks You Should Try in 2026

Topics we will cover include: what “agent memory” means and why it matters for real-world assistants.

Practical project ideas to get hands-on experience with agent memory.

In this article, we’ll explore AI agent memory frameworks that are useful for: storing and retrieving conversation history; managing long-term factual knowledge; implementing semantic memory search; handling context windows effectively; and personalizing agent behavior based on past interactions. Let’s explore each framework.

It is designed specifically to give agents long-term memory that persists across sessions and evolves over time.

You can then look at Core Concepts and LLMs as Operating Systems: Agent Memory by DeepLearning.AI.

2 weeks, 4 days ago @ machinelearningmastery.com
Vector Databases vs. Graph RAG for Agent Memory: When to Use Which

To clarify their respective roles, this article explores the underlying theory, practical strengths, and limitations of both approaches for agent memory.

Vector Databases: The Foundation of Semantic Agent Memory. Vector databases represent memory as dense mathematical vectors, or embeddings, situated in high-dimensional space.

Vector databases are strong choices for agent memory.

Graph RAG: Structured Context and Relational Memory. Graph RAG addresses the limitations of semantic search by combining knowledge graphs with LLMs.

Developers would be well advised to adopt a layered approach: start agent memory with a vector database for basic conversational grounding.
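That layered recommendation can be sketched in a few lines: answer explicitly relational questions from a graph first, and fall back to embedding similarity otherwise (toy data and names, assumed for illustration):

```python
# Layer 1: explicit relations, exact lookup.
memory_graph = {("alice", "reports_to"): "bob"}

# Layer 2: semantic store as (embedding, text) pairs; brute-force cosine here.
semantic_store = [
    ([1.0, 0.0], "The user prefers dark mode."),
    ([0.0, 1.0], "The user's billing plan is Pro."),
]

def recall(subject=None, relation=None, query_vec=None):
    # Relational query: hit the graph for an exact, auditable answer.
    if subject and relation:
        hit = memory_graph.get((subject, relation))
        if hit is not None:
            return hit
    # Fallback: nearest stored memory by dot product.
    best = max(semantic_store, key=lambda m: sum(a * b for a, b in zip(m[0], query_vec)))
    return best[1]

print(recall(subject="alice", relation="reports_to"))
print(recall(query_vec=[0.9, 0.1]))
```

A real system would replace the dict with a graph store and the list with a vector database, but the routing logic (structured lookup first, semantic search as the catch-all) is the layered approach the article describes.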

3 weeks, 1 day ago @ machinelearningmastery.com
5 Essential Security Patterns for Robust Agentic AI

This article lists 5 key security patterns for robust AI agents and highlights why they matter.

Just-in-Time Tool Privileges: Often abbreviated as JIT, this is a security model that grants users or applications specialized or elevated access privileges only when needed, and only for a limited period of time.

In the realm of agentic AI, an example would be issuing short-term access tokens to limit the “blast radius” if the agent becomes compromised.
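A minimal illustration of the JIT idea: issue a token scoped to a single action with a short time-to-live, so a compromised agent can hold it only briefly (hypothetical helper, not tied to any particular library):

```python
import secrets
import time

def issue_jit_token(scope, ttl_seconds=60):
    """Grant a capability scoped to one action and expiring quickly."""
    return {
        "token": secrets.token_hex(16),
        "scope": scope,
        "expires_at": time.time() + ttl_seconds,
    }

def authorize(tok, scope):
    # Valid only for the exact scope it was issued for, and only until expiry.
    return tok["scope"] == scope and time.time() < tok["expires_at"]

tok = issue_jit_token("read:tickets", ttl_seconds=60)
print(authorize(tok, "read:tickets"))    # in scope and not expired
print(authorize(tok, "delete:tickets"))  # out of scope: denied
```

The short TTL and narrow scope together bound the blast radius: even a leaked token only authorizes one action for a minute.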

Bounded Autonomy: This security principle allows AI agents to operate independently within a bounded setting, meaning within clearly defined safe parameters, striking a balance between control and efficiency.

The AI Firewall: This refers to a dedicate…

3 weeks, 2 days ago @ machinelearningmastery.com
ML in Production
last post None
Sorta Insightful
last post 2 weeks, 2 days ago
Why I Signed The Amicus Brief for Anthropic v Department of War
Why I Signed The Amicus Brief for Anthropic v Department of War Why I Signed The Amicus Brief for Anthropic v Department of War

On Monday, Anthropic filed a lawsuit against the Department of War, and an amicus brief in support of Anthropic was filed on behalf of a number of OpenAI and Google employees.

There’s also an amicus brief filed on behalf of Microsoft.

There’s conflicting reporting, but very broadly, Anthropic signed an agreement with the government to deploy Claude in classified, military contexts.

Anthropic said no, Pete Hegseth declared them a supply chain risk, and Anthropic filed a lawsuit against this.

The amicus brief was broadly aligned with my thoughts on the matter, so I signed.

2 weeks, 2 days ago @ alexirpan.com
MIT Mystery Hunt 2026

This has spoilers for MIT Mystery Hunt 2026.

Pre-Hunt: The run-up to Hunt was more stressful than usual… very briefly, I typically hunt with teammate.

Just last year, I did GPH 2025, LN Hunt, Teammate Hunt 2025, Microsoft Hunt 2025, and Silph Puzzle Hunt 2025, all of which had significant 3+ hour solve puzzles that would not be out of place in Mystery Hunt.

Not to mention smaller hunts like Advent Hunt, and then I didn’t even do Brown Puzzlehunt or Vertex Hunt or the fall CMU Hunt.

To me, the crux is whether Mystery Hunt is broken, or Mystery Hunt is fine.

1 month, 3 weeks ago @ alexirpan.com
Authentic Imperfection

I’ve been thinking about the anger surrounding generative AI.

To keep things fair, he took the best human images and best AI images, meaning human art from famous artists, and AI art from prompters skilled at removing obvious tells of image generation.

When people complain about AI slop, I see it as a complaint against the deluge of default style AI images.

We’ve seen this happen in all forms: AI text, AI music, older forms of computer generated content like CGI.

As much as we celebrate imperfection, digital imperfection is a step too far.

4 months, 1 week ago @ alexirpan.com
Ten Years Later

Every now and then, someone asks me why I blog, and I don’t really know what to tell them.

That’s another reason I’m not celebrating 10 years with more gusto: I know I’ve been writing less.

Indiana Jones and the Great Circle: I don’t know how they did it, but Indiana Jones and the Great Circle was just fun all the way through.

My one complaint is that the hand-to-hand combat feels like the worst part of the game, so of course they put a bunch of upgrades behind learning parry timings you’ll never use later.

I have not tried Peak, but Another Crab’s Treasure was really good and is worth playing if you’re interested in a Souls-like.

7 months, 1 week ago @ alexirpan.com
Brony Musicians Seize The Means of Production: My Eyewitness Account to BABSCon 2025

A music concert in the evenings, typically set up as a rave with EDM or rock music made by brony musicians.

She has been involved in organizing pony music concerts for over a decade, for both BABSCon and other pony conventions.

Thank you, BABSCon Chairs. The brony musicians immediately jump into an emergency Discord call with Pinkaboo to get her side of the story.

Other conventions start tweeting in support of the brony musicians, with no one taking BABSCon’s side.

It’s hard for me to explain why I like MLP fan music, because brony music really isn’t accessible.

8 months, 1 week ago @ alexirpan.com
Lil'Log
last post None
inFERENCe
last post 1 month ago
The Future of Software

February 25, 2026. The world of software is undergoing a shift not seen since the advent of compilers in the 1970s.

How will humans tell AI agents what software artefacts we would like to create?

This future of software creation, in which our programming languages are abstracted away, raises two very important questions: What will the instruction/specification language look like?

This should be a clear layer of separation between the developer and the pool of AI agents working to maintain software.

1 month ago @ inference.vc
Deep Learning is Powerful Because It Makes Hard Things Easy - Reflections 10 Years On

Ten years ago this week, I wrote a provocative and bold post that blew up and made it to the top spot on Hacker News.

In hindsight: There is a lot of stuff in deep learning that we don't understand nearly enough.

Sometimes things work for reasons completely unrelated to why we thought they would work.

(Pop some 🍿 in the microwave and read till the end for more.)

🎯 "Deep learning is powerful exactly because it makes hard things easy"
Okay, this was a great insight.

🎯 Generative Modeling
In the post I suggested people learn "something harder" instead of - or in addition to - deep learning.

1 month, 3 weeks ago @ inference.vc
The Spectator
last post None
The Unofficial Google Data Science Blog
last post None
Off the Convex Path
last post None
Jay Alammar
last post None
Piekniewski's blog
last post None
fast.ai NLP
last post None
Sebastian Ruder
last post None
大トロ
last post None
🔬 Science
Papers With Code
last post 8 months, 1 week ago
/henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy

We present SpatialTrackerV2, a feed-forward 3D point tracking method for monocular videos.

Going beyond modular pipelines built on off-the-shelf components for 3D tracking, our approach unifies the intrinsic connections between point tracking, monocular depth, and camera pose estimation into a high-performing and feedforward 3D point tracker.

It decomposes world-space 3D motion into scene geometry, camera ego-motion, and pixel-wise object motion, with a fully differentiable and end-to-end architecture, allowing scalable training across a wide range of datasets, including synthetic sequences, posed RGB-D videos, and unlabeled in-the-wild footage.

By learning geometry and motion jointly from …

8 months, 1 week ago @ paperswithcode.com
/antof27/ Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation

Calisthenics skill classification is the computer vision task of inferring the skill performed by an athlete from images, enabling automatic performance assessment and personalized analytics.

Traditional methods for calisthenics skill recognition are based on pose estimation methods to determine the position of skeletal data from images, which is later fed to a classification algorithm to infer the performed skill.

This work proposes a direct approach to calisthenics skill recognition, which leverages depth estimation and athlete patch retrieval to avoid the computationally expensive human pose estimation module.

Using Depth Anything V2 for depth estimation and YOLOv10 for athlete localizat…

8 months, 1 week ago @ paperswithcode.com
/snowflakedb/ Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI

Inference is now the dominant AI workload, yet existing systems force trade-offs between latency, throughput, and cost.

Arctic Inference, an open-source vLLM plugin from Snowflake AI Research, introduces Shift Parallelism, a dynamic parallelism strategy that adapts to real-world traffic while integrating speculative decoding, SwiftKV compute reduction, and optimized embedding inference.

It achieves up to 3.4 times faster request completion, 1.75 times faster generation, and 1.6M tokens/sec per GPU for embeddings, outperforming both latency- and throughput-optimized deployments.

Already powering Snowflake Cortex AI, Arctic Inference delivers state-of-the-art, cost-effective inference for ent…

8 months, 1 week ago @ paperswithcode.com
/NVIDIA/ FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale

FourCastNet 3 advances global weather modeling by implementing a scalable, geometric machine learning (ML) approach to probabilistic ensemble forecasting.

The approach is designed to respect spherical geometry and to accurately model the spatially correlated probabilistic nature of the problem, resulting in stable spectra and realistic dynamics across multiple scales.

FourCastNet 3 delivers forecasting accuracy that surpasses leading conventional ensemble models and rivals the best diffusion-based methods, while producing forecasts 8 to 60 times faster than these approaches.

In contrast to other ML approaches, FourCastNet 3 demonstrates excellent probabilistic calibration and retains realis…

8 months, 1 week ago @ paperswithcode.com
/jingyanw/ Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work?

We study A/B experiments that are designed to compare the performance of two recommendation algorithms.

The bias arising from this type of data sharing is known as "symbiosis bias".

In this paper, we highlight that, for decision-making purposes, the sign of the GTE often matters more than its precise magnitude when selecting the better algorithm.

We formalize this insight under a multi-armed bandit framework and theoretically characterize when the sign of the expected GTE estimate under data sharing aligns with or contradicts the sign of the true GTE.

Our analysis identifies the level of exploration versus exploitation as a key determinant of how symbiosis bias impacts algorithm selection.

8 months, 1 week ago @ paperswithcode.com
/qqq-yi/ DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression

Task-agnostic prompt compression leverages the redundancy in natural language to reduce computational overhead and enhance information density within prompts, especially in long-context scenarios.

Existing methods predominantly rely on information entropy as the metric to compress lexical units, aiming to achieve minimal information loss.

However, these approaches overlook two critical aspects: (i) the importance of attention-critical tokens at the algorithmic level, and (ii) shifts in information entropy during the compression process.

Motivated by these challenges, we propose a dynamic attention-aware approach for task-agnostic prompt compression (DAC).

This approach effectively integrate…
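To make the baseline concrete: existing task-agnostic methods score lexical units by information entropy and drop the least informative ones. The sketch below (not DAC itself, which adds attention-awareness and tracks entropy shifts during compression) scores each token by its self-information under a simple unigram model of the prompt; the function name and the unigram model are illustrative assumptions.

```python
import math
from collections import Counter

def compress_by_entropy(tokens, keep_ratio=0.6):
    """Entropy-baseline sketch: keep the most informative fraction of
    tokens, preserving their original order. Frequent (redundant)
    tokens have low self-information and are dropped first."""
    counts = Counter(tokens)
    total = len(tokens)
    # self-information: -log2 p(token) under a unigram model of the prompt
    scores = [-math.log2(counts[t] / total) for t in tokens]
    n_keep = max(1, int(len(tokens) * keep_ratio))
    keep = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:n_keep]
    return [tokens[i] for i in sorted(keep)]
```

DAC's contribution is precisely what this baseline ignores: tokens that attract high attention can matter even when their entropy is low, and entropy scores drift as tokens are removed.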

8 months, 1 week ago @ paperswithcode.com
/lukasellinger/ Simplifications are Absolutists: How Simplified Language Reduces Word Sense Awareness in LLM-Generated Definitions

Large Language Models (LLMs) can provide accurate word definitions and explanations for any context.

However, the scope of the definition changes for different target groups, like children or language learners.

We investigate how simplification impacts homonym definition quality across three target groups: Normal, Simple, and ELI5.

Our results show that simplification drastically degrades definition completeness by neglecting polysemy, increasing the risk of misunderstanding.

Fine-tuning Llama 3.1 8B with Direct Preference Optimization substantially improves homonym response quality across all prompt types.

8 months, 1 week ago @ paperswithcode.com
/pspdada/ Mitigating Object Hallucinations via Sentence-Level Early Intervention

Multimodal large language models (MLLMs) have revolutionized cross-modal understanding but continue to struggle with hallucinations - fabricated content contradicting visual inputs.

Existing hallucination mitigation methods either incur prohibitive computational costs or introduce distribution mismatches between training data and model outputs.

We identify a critical insight: hallucinations predominantly emerge at the early stages of text generation and propagate through subsequent outputs.

To address this, we propose **SENTINEL** (**S**entence-level **E**arly i**N**tervention **T**hrough **IN**-domain pr**E**ference **L**earning), a framework that eliminates dependency on human annotations…

8 months, 1 week ago @ paperswithcode.com
/owos/ FLEXITOKENS: Flexible Tokenization for Evolving Language Models

Language models (LMs) are difficult to adapt to new data distributions through simple finetuning.

This is due to the rigidity of their subword tokenizers, which typically remain unchanged during adaptation.

This inflexibility often leads to inefficient tokenization, causing overfragmentation of out-of-distribution domains, unseen languages, or scripts.

In this work, we develop byte-level LMs with learnable tokenizers to make tokenization adaptive.

Our models include a submodule that learns to predict segment boundaries within the input byte sequence, encoding it into variable-length segments.
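Once such a predictor has emitted per-byte boundary probabilities, segmentation reduces to splitting the byte stream wherever the probability crosses a threshold. The sketch below is illustrative: `segment_bytes` and the fixed threshold are assumptions, whereas FLEXITOKENS learns the boundary predictor end-to-end.

```python
def segment_bytes(byte_seq, boundary_probs, threshold=0.5):
    """Split a byte sequence into variable-length segments wherever a
    (hypothetical, already-trained) boundary predictor fires above
    `threshold`. The predictor's outputs are passed in directly."""
    segments, current = [], []
    for b, p in zip(byte_seq, boundary_probs):
        current.append(b)
        if p >= threshold:
            segments.append(bytes(current))
            current = []
    if current:  # flush the trailing partial segment
        segments.append(bytes(current))
    return segments
```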

8 months, 1 week ago @ paperswithcode.com
/wojiufukele/ Graph-Structured Data Analysis of Component Failure in Autonomous Cargo Ships Based on Feature Fusion

To address the challenges posed by cascading reactions caused by component failures in autonomous cargo ships (ACS) and the uncertainties in emergency decision-making, this paper proposes a novel hybrid feature fusion framework for constructing a graph-structured dataset of failure modes.

A hierarchical feature fusion framework is constructed, using Word2Vec encoding to encode subsystem/component features, BERT-KPCA to process failure modes/reasons, and Sentence-BERT to quantify the semantic association between failure impact and emergency decision-making.

The dataset covers 12 systems, 1,262 failure modes, and 6,150 propagation paths.

In the label prediction results, the Shore-based Meteor…

8 months, 1 week ago @ paperswithcode.com
/YF-W/ Tri-Learn Graph Fusion Network for Attributed Graph Clustering

In recent years, models based on Graph Convolutional Networks (GCN) have made significant strides in the field of graph data analysis.

Although the Graph Transformer architecture has mitigated some of these issues, its performance is still limited when processing heterogeneous graph data.

To address these challenges, this study proposes a novel deep clustering framework comprising GCN, Autoencoder (AE), and Graph Transformer, termed the Tri-Learn Graph Fusion Network (Tri-GFN).

The tri-learning mechanism allows mutual learning among these modules, while the feature fusion strategy enables the model to capture complex relationships, yielding highly discriminative representations for gra…

8 months, 1 week ago @ paperswithcode.com
/mr-ravin/ APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation

We propose the APTx Neuron, a novel, unified neural computation unit that integrates non-linear activation and linear transformation into a single trainable expression.

The APTx Neuron is derived from the APTx activation function, thereby eliminating the need for separate activation layers and making the architecture both computationally efficient and elegant.

The proposed neuron follows the functional form $y = \sum_{i=1}^{n} ((\alpha_i + \tanh(\beta_i x_i)) \cdot \gamma_i x_i) + \delta$, where all parameters $\alpha_i$, $\beta_i$, $\gamma_i$, and $\delta$ are trainable.

We validate our APTx Neuron-based architecture on the MNIST dataset, achieving up to 96.69\% test accuracy in just 20 ep…
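The stated functional form can be transcribed directly. A minimal NumPy sketch of a single APTx neuron follows (forward pass only; in the paper all of $\alpha_i$, $\beta_i$, $\gamma_i$, $\delta$ are trained by backpropagation, which is omitted here, and the class name is an assumption):

```python
import numpy as np

class APTxNeuronLayer:
    """One APTx neuron: y = sum_i (alpha_i + tanh(beta_i * x_i)) * gamma_i * x_i + delta."""

    def __init__(self, n_inputs, rng=None):
        g = np.random.default_rng(rng)
        self.alpha = g.normal(size=n_inputs)
        self.beta = g.normal(size=n_inputs)
        self.gamma = g.normal(size=n_inputs)
        self.delta = 0.0

    def forward(self, x):
        # activation term (alpha + tanh(beta*x)) fused with the linear term gamma*x
        return np.sum((self.alpha + np.tanh(self.beta * x)) * self.gamma * x) + self.delta
```

Because activation and linear transform live in one expression, no separate activation layer is needed, which is the efficiency claim above.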

8 months, 1 week ago @ paperswithcode.com
/Rec4Fun/ A Reproducibility Study of Product-side Fairness in Bundle Recommendation

While this problem has been widely studied in traditional recommendation settings, its implications for bundle recommendation (BR) remain largely unexplored.

Existing fairness frameworks and metrics designed for traditional recommender systems may not directly translate to this multi-layered setting.

In this paper, we conduct a comprehensive reproducibility study of product-side fairness in BR across three real-world datasets using four state-of-the-art BR methods.

We analyze exposure disparities at both the bundle and item levels using multiple fairness metrics, uncovering important patterns.

Overall, our findings offer actionable insights for building fairer bundle recommender systems and…

8 months, 1 week ago @ paperswithcode.com
/cbobed/ OntView: What you See is What you Meant

However, the lack of tools that provide effective visualization is still a significant challenge.

In this paper, we present OntView, an ontology viewer that is designed to provide users with an intuitive visual representation of ontology concepts and their formal definitions through a user-friendly interface.

Building on the use of a DL reasoner, OntView follows a "What you see is what you meant" paradigm, showing the actual inferred knowledge.

One key aspect for this is its ability to visualize General Concept Inclusions (GCI), a feature absent in existing visualization tools.

OntView has been released with an open-source license for the whole community.

8 months, 1 week ago @ paperswithcode.com
/Rec4Fun/ RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction

These approaches fail to capture elaborate relations hidden in real-world bundle structures, resulting in suboptimal bundle representations.

To overcome this limitation, we propose RaMen, a novel method that provides a holistic multi-strategy approach for bundle construction.

RaMen utilizes both intrinsic (characteristics) and extrinsic (collaborative signals) information to model bundle structures through Explicit Strategy-aware Learning (ESL) and Implicit Strategy-aware Learning (ISL).

Integrating diverse strategies enables RaMen to learn more comprehensive and robust bundle representations.

Meanwhile, a Multi-strategy Alignment & Discrimination module is employed to facilitate knowledge tr…

8 months, 1 week ago @ paperswithcode.com
Papers With Code
last post 8 months, 1 week ago
/PrimisAI/ Adaptive Multi-Agent Reasoning via Automated Workflow Generation

The rise of Large Reasoning Models (LRMs) promises a significant leap forward in language model capabilities, aiming to tackle increasingly sophisticated tasks with unprecedented efficiency and accuracy.

However, despite their impressive performance, recent studies have highlighted how current reasoning models frequently fail to generalize to novel, unseen problems, often resorting to memorized solutions rather than genuine inferential reasoning.

In this paper, we introduce Nexus Architect, an enhanced iteration of our multi-agent system framework, Nexus, equipped with a novel automated workflow synthesis mechanism.

Given a user's prompt and a small set of representative examples, the Archi…

8 months, 1 week ago @ paperswithcode.com
/sharanya02/ Real Time Captioning of Sign Language Gestures in Video Meetings

One of the most tested ways to establish such communication is through the use of sign-based languages.

However, not many people are aware of the smaller intricacies involved with sign language.

Sign language recognition using computer vision aims at eliminating the communication barrier between deaf-mute and ordinary people so that they can properly communicate with others.

In recent studies, it has been found that people with hearing disabilities prefer to sign over typing during these video calls.

In this paper, we are proposing a browser extension that will automatically translate sign language to subtitles for everyone else in the video call.

8 months, 1 week ago @ paperswithcode.com
/alessiopittiglio/ Leveraging Context for Multimodal Fallacy Classification in Political Debates

In this paper, we present our submission to the MM-ArgFallacy2025 shared task, which aims to advance research in multimodal argument mining, focusing on logical fallacies in political debates.

Our approach uses pretrained Transformer-based models and proposes several ways to leverage context.

In the fallacy classification subtask, our models achieved macro F1-scores of 0.4444 (text), 0.3559 (audio), and 0.4403 (multimodal).

Our multimodal model showed performance comparable to the text-only model, suggesting potential for improvements.

8 months, 1 week ago @ paperswithcode.com
/RS2002/ One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms

On-demand ride-sharing platforms face the fundamental challenge of dynamically bundling passengers with diverse origins and destinations and matching them with vehicles in real time, all under significant uncertainty.

However, conventional MARL-based ride-sharing approaches heavily rely on the accurate estimation of Q-values or V-values, which becomes problematic in large-scale, highly uncertain environments.

To address these challenges, we propose two novel alternative methods that bypass value function estimation.

First, we adapt GRPO to ride-sharing, replacing the PPO baseline with the group average reward to eliminate critic estimation errors and reduce training bias.

Second, inspired b…
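The first alternative, replacing the PPO critic baseline with the group average reward, can be sketched in a few lines. This is the generic GRPO-style advantage computation, not the authors' full ride-sharing formulation:

```python
import numpy as np

def group_relative_advantages(rewards):
    """GRPO-style advantage: each trajectory's reward relative to the
    group average, normalized by the group's standard deviation.
    The group mean replaces a learned critic/baseline, so there are
    no critic estimation errors to propagate."""
    r = np.asarray(rewards, dtype=float)
    adv = r - r.mean()          # group-average baseline, no critic needed
    std = r.std()
    return adv / std if std > 0 else adv
```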

8 months, 1 week ago @ paperswithcode.com
/LiXinran6/ Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation

8 months, 1 week ago @ paperswithcode.com
/ShimSoonYong/ ZClassifier: Temperature Tuning and Manifold Approximation via KL Divergence on Logit Space

We introduce a novel classification framework, ZClassifier, that replaces conventional deterministic logits with diagonal Gaussian-distributed logits. Code: https://github.com/ShimSoonYong/ZClassifier
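A minimal sketch of the core idea, assuming the usual reparameterization trick and a standard-normal reference for the KL term (the paper's exact objective may differ; function names are illustrative):

```python
import numpy as np

def sample_gaussian_logits(mu, log_var, rng):
    """Reparameterized draw of Gaussian-distributed logits
    z ~ N(mu, diag(exp(log_var))) in place of deterministic logits."""
    eps = rng.standard_normal(np.shape(mu))
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over logit
    dimensions; a natural regularizer on the logit-space distribution."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)
```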

8 months, 1 week ago @ paperswithcode.com
/briziorusso/ On Gradual Semantics for Assumption-Based Argumentation

In this paper, we fill this gap and propose a family of novel gradual semantics for equipping assumptions, which are the core components in ABA frameworks, with dialectical strengths. Code: https://github.com/briziorusso/GradualABA

8 months, 1 week ago @ paperswithcode.com
/wumingqi/ Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination

8 months, 1 week ago @ paperswithcode.com
/IsaacYQH/ WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling

Despite rapid progress in end-to-end AI music generation, AI-driven modeling of professional Digital Signal Processing (DSP) workflows remains challenging. Code: https://github.com/IsaacYQH/WildFX

8 months, 1 week ago @ paperswithcode.com
/summer1278/ Addressing Data Imbalance in Transformer-Based Multi-Label Emotion Detection with Weighted Loss

This paper explores the application of a simple weighted loss function to Transformer-based models for multi-label emotion detection in SemEval-2025 Shared Task 11. Code: https://github.com/summer1278/semeval2025-task11
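A simple weighted loss of this kind is typically a per-label weighted binary cross-entropy. The sketch below assumes positive-class weights (a common choice is inverse label frequency), which may differ from the submission's exact weighting:

```python
import numpy as np

def weighted_bce(probs, targets, pos_weight):
    """Per-label weighted binary cross-entropy for multi-label targets.
    pos_weight up-weights positive examples of rare emotion labels to
    counter data imbalance."""
    eps = 1e-7
    probs = np.clip(probs, eps, 1 - eps)  # avoid log(0)
    loss = -(pos_weight * targets * np.log(probs)
             + (1 - targets) * np.log(1 - probs))
    return loss.mean()
```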

8 months, 1 week ago @ paperswithcode.com
/gabrielkmbo/ Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs

We present Step-wise Policy for Rare-tool Knowledge (SPaRK), a novel reinforcement learning framework that teaches large language models to explore diverse tool usage patterns beyond conventional high-temperature sampling. Code: https://github.com/gabrielkmbo/explore-rl

8 months, 1 week ago @ paperswithcode.com
/Cavendish518/ Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation

Service robots are increasingly deployed in diverse and dynamic environments, where both physical layouts and social contexts change over time and across locations. Code: https://github.com/Cavendish518/LE-Nav

8 months, 1 week ago @ paperswithcode.com
/MatteoFasulo/ AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles

8 months, 1 week ago @ paperswithcode.com
/VCA-EPFL/ SystolicAttention: Fusing FlashAttention within a Single Systolic Array

The frequent data swaps between the systolic array and external vector units result in low systolic array utilization. Code: https://github.com/VCA-EPFL/FSA

8 months, 1 week ago @ paperswithcode.com
/Buddhi19/ Precision Spatio-Temporal Feature Fusion for Robust Remote Sensing Change Detection

8 months, 1 week ago @ paperswithcode.com
Papers With Code
last post 8 months, 1 week ago
/fudanvi/ Beyond Task-Specific Reasoning: A Unified Conditional Generative Framework for Abstract Visual Reasoning

8 months, 1 week ago @ paperswithcode.com
/benedekrozemberczki/ PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training

Spatiotemporal graph neural networks (ST-GNNs) are powerful tools for modeling spatial and temporal data dependencies. Code: https://github.com/benedekrozemberczki/pytorch_geometric_temporal

8 months, 1 week ago @ paperswithcode.com
/chengxuphd/ DCR: Quantifying Data Contamination in LLMs Evaluation

8 months, 1 week ago @ paperswithcode.com
/gitter-lab/ Assay2Mol: large language model-based drug design using BioAssay context

Scientific databases aggregate vast amounts of quantitative data alongside descriptive text. Code: https://github.com/gitter-lab/Assay2Mol

8 months, 1 week ago @ paperswithcode.com
/hayatkhan8660-maker/ DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition

We employ forward Kullback-Leibler (KL) divergence alongside spatio-temporal focal modulation to effectively transfer both local and global context from the Video-FocalNet Base (teacher) to the proposed VFL-Net (student). Code: https://github.com/hayatkhan8660-maker/DVFL-Net
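The forward KL term used here can be sketched as a temperature-scaled softmax over teacher and student logits; this shows only the distillation loss, not the spatio-temporal focal-modulation context transfer:

```python
import numpy as np

def forward_kl(teacher_logits, student_logits, temperature=1.0):
    """Forward KL( teacher || student ) over temperature-scaled softmax
    distributions, averaged over the batch. The teacher distribution is
    the reference, so the student is pushed to cover the teacher's mass."""
    def softmax(z):
        z = z / temperature
        e = np.exp(z - z.max(axis=-1, keepdims=True))  # numerically stable
        return e / e.sum(axis=-1, keepdims=True)
    p = softmax(np.asarray(teacher_logits, dtype=float))
    q = softmax(np.asarray(student_logits, dtype=float))
    return np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean()
```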

8 months, 1 week ago @ paperswithcode.com
/JudyJuezhuLong/ Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows

8 months, 1 week ago @ paperswithcode.com
/joaojcorreia/ A Fuzzy Approach to Project Success: Measuring What Matters

This paper introduces a novel approach to project success evaluation by integrating fuzzy logic into an existing construct. Code: https://github.com/joaojcorreia/FuzzyLogic_ProjectSuccess

8 months, 1 week ago @ paperswithcode.com
/kunkunlin1221/ InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing

Extensive experiments demonstrate the effectiveness of InstructFLIP by outperforming SOTA models in accuracy and substantially reducing training redundancy across diverse domains in FAS. Code: https://github.com/kunkunlin1221/InstructFLIP

8 months, 1 week ago @ paperswithcode.com
/Linvyl/ Describe Anything Model for Visual Question Answering on Text-rich Images

Recent progress has been made in region-aware vision-language modeling, particularly with the emergence of the Describe Anything Model (DAM). Code: https://github.com/Linvyl/DAM-QA

8 months, 1 week ago @ paperswithcode.com
/abhijeet3922/ Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker

We propose a multi-step custom implementation utilizing widely adopted hybrid search (metadata & embedding) and a state-of-the-art late-interaction re-ranker to retrieve the best-matching pages. Code: https://github.com/abhijeet3922/vision-RAG

8 months, 1 week ago @ paperswithcode.com
/ziangcao0312/ PhysX: Physical-Grounded 3D Asset Generation

3D modeling is moving from virtual to physical. Code: https://github.com/ziangcao0312/PhysX

8 months, 1 week ago @ paperswithcode.com
/henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy

We present SpatialTrackerV2, a feed-forward 3D point tracking method for monocular videos. Code: https://github.com/henry123-boy/SpaTrackerV2

8 months, 1 week ago @ paperswithcode.com
/cncs-fit/ Emergence of Functionally Differentiated Structures via Mutual Information Optimization in Recurrent Neural Networks

Analysis of network performance, correlation patterns, and weight matrices reveals that mutual information minimization yields high task performance alongside clear functional modularity and moderate structural modularity. Code: https://github.com/cncs-fit/mio_rnn

8 months, 1 week ago @ paperswithcode.com
/coswindywang/ Making Language Model a Hierarchical Classifier and Generator

Language heads of the last layer are copied to different selected intermediate layers, and fine-tuned with different task inputs. Code: https://github.com/coswindywang/HdLM

8 months, 1 week ago @ paperswithcode.com
/ahmedehabb/ From Roots to Rewards: Dynamic Tree Reasoning with RL

Modern language models address complex questions through chain-of-thought (CoT) reasoning (Wei et al., 2023) and retrieval augmentation (Lewis et al., 2021), yet struggle with error propagation and knowledge integration. Code: https://github.com/ahmedehabb/From-Roots-to-Rewards-Dynamic-Tree-Reasoning-with-RL

8 months, 1 week ago @ paperswithcode.com
💼 University and corporation labs
DeepMind
last post 1 day, 5 hours ago
Gemini 3.1 Flash Live: Making audio AI more natural and reliable

Today, we’re advancing Gemini’s real-time dialogue capabilities with Gemini 3.1 Flash Live, our highest-quality audio and voice model yet.

It delivers the speed and natural rhythm needed for the next generation of voice-first AI, offering a more intuitive experience for developers, enterprises and everyday users.

3.1 Flash Live is available across Google products. For developers: robust reasoning and task execution. We've improved 3.1 Flash Live's overall quality, making it more reliable for developers and enterprises to build voice-first agents that can complete complex tasks at scale.

On ComplexFuncBench Audio, a benchmark that captures multi-step function calling with various constraints, i…

1 day, 5 hours ago @ blog.google
Protecting people from harmful manipulation

Why harmful manipulation matters: Consider two scenarios. One AI model gives you facts to make a well-informed healthcare decision that improves your well-being.

Another AI model uses fear to pressure you to make an ill-informed decision that harms your health.

Developing new evaluations for a complex challenge: testing the outcomes of AI harmful manipulation. Testing for harmful manipulation is inherently difficult because it involves measuring subtle changes in how people think and act, varying heavily by topic, culture and context.

Our findings show that success in one domain does not predict success in another, validating our targeted approach to testing for harmful manipulation in specific, …

2 days, 4 hours ago @ deepmind.google
Lyria 3 Pro: Create longer tracks in more

Vertex AI: Lyria 3 Pro is now in public preview on Vertex AI for businesses who require on-demand audio at scale.

Lyria 3 Pro is now available alongside Lyria RealTime in AI Studio.

Google Vids: Vids is an AI-powered video creation app that anyone can use.

This is rolling out to Google Workspace customers and Google AI Pro & Ultra subscribers starting this week.

Gemini app: Longer generations with Lyria 3 Pro are now available in the Gemini app, starting with paid subscribers.

2 days, 4 hours ago @ blog.google
Measuring progress toward AGI: A cognitive framework

Artificial General Intelligence (AGI) has the potential to accelerate scientific discovery and help solve some of humanity’s most pressing problems.

Tracking progress toward AGI will require a wide range of methods and approaches, and we believe cognitive science provides one important piece of the puzzle.

That’s why today, we’re releasing a new paper, “Measuring Progress Toward AGI: A Cognitive Taxonomy,” that presents a scientific foundation for understanding the cognitive capabilities of AI systems.

Deconstructing general intelligence: Our framework draws on decades of research from psychology, neuroscience and cognitive science to develop a cognitive taxonomy.

It identifies 10 key cogniti…

1 week, 3 days ago @ blog.google
From games to biology and beyond: 10 years of AlphaGo’s impact

Scientific collaboration: We are integrating the search and reasoning principles pioneered with AlphaGo into an AI co-scientist.

We’ve also used AI to better understand the genome, advance fusion energy research, improve weather prediction and more.

Future of intelligenceFor an AI to be truly general, it needs to understand the physical world.

We think the combination of Gemini’s world models, AlphaGo’s search and planning techniques, and specialized AI tool use will prove to be critical for AGI.

True creativity is a key capability that such an AGI system would need to exhibit.

2 weeks, 4 days ago @ deepmind.google
Gemini 3.1 Flash-Lite: Built for intelligence at scale

Today, we're introducing Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model.

Built for high-volume developer workloads at scale, 3.1 Flash-Lite delivers high quality for its price and model tier.

Starting today, 3.1 Flash-Lite is rolling out in preview to developers via the Gemini API in Google AI Studio and for enterprises via Vertex AI.

Cost-efficiency without compromise: Priced at just $0.25/1M input tokens and $1.50/1M output tokens, 3.1 Flash-Lite delivers enhanced performance at a fraction of the cost of larger models.
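At those list prices, per-request cost is simple arithmetic; a minimal sketch, taking the prices from the announcement and using illustrative token counts:

```python
# Illustrative cost estimate at the listed 3.1 Flash-Lite prices.
INPUT_PRICE_PER_M = 0.25   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 1.50  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed prices."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# e.g. a 10k-token prompt with a 2k-token response:
print(round(request_cost(10_000, 2_000), 4))  # → 0.0055
```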

This low latency is needed for high-frequency workflows, making it an ideal model for developers to build responsive, real-time experiences.

3 weeks, 3 days ago @ blog.google
Nano Banana 2: Combining Pro capabilities with lightning-fast speed

In August of last year, our Gemini Image model, Nano Banana, became a viral sensation, redefining image generation and editing.

Then in November, we released Nano Banana Pro, offering users advanced intelligence and studio-quality creative control.

Today, we’re bringing the best of both worlds to users across Google.

Introducing Nano Banana 2 (Gemini 3.1 Flash Image), our latest state-of-the-art image model.

Now you can get the advanced world knowledge, quality and reasoning you love in Nano Banana Pro, at lightning-fast speed.

4 weeks, 1 day ago @ blog.google
Gemini 3.1 Pro: A smarter model for your most complex tasks

Today, we’re releasing the upgraded core intelligence that makes those breakthroughs possible: Gemini 3.1 Pro.

We are shipping 3.1 Pro across our consumer and developer products to bring this progress in intelligence to your everyday applications.

Starting today, 3.1 Pro is rolling out: for developers, in preview via the Gemini API in Google AI Studio, Gemini CLI, our agentic development platform Google Antigravity, and Android Studio; for enterprises, in Vertex AI and Gemini Enterprise; for consumers, via the Gemini app and Notebook…

1 month ago @ blog.google
A new way to express yourself: Gemini can now create music

New audio verification capabilities: All tracks generated in the Gemini app are embedded with SynthID, our imperceptible watermark for identifying Google AI-generated content.

We are also giving you more tools to help identify AI content, broadening our verification capabilities in the Gemini app to include audio, along with image and video.

Simply upload a file and ask if it was generated using Google AI, and Gemini will check for SynthID and use its own reasoning to return a response.

And Google AI Plus, Pro and Ultra subscribers will enjoy higher limits.

Our goal with music generation in the Gemini app is to help you add a fun, custom soundtrack to your daily life.

1 month, 1 week ago @ blog.google
Accelerating discovery in India through AI-powered science and education

In the global AI transformation, India is showing exceptional leadership in applying the technology to tackle its own biggest challenges.

But India is going even further, playing a critical international role by convening the fourth global AI summit of governments, companies and civil society this week.

Partnership in India to broaden AI access: Our partnerships are designed to accelerate the pace of progress across India.

Ltd., a K-12 textbook publisher in India, Gemini will be used to transform two million static textbooks into AI-powered interactive journeys across more than 250 titles and 2,000 schools.

1 month, 1 week ago @ deepmind.google
Gemini 3 Deep Think: Advancing science, research and engineering

Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.

1 month, 1 week ago @ deepmind.google
Accelerating Mathematical and Scientific Discovery with Gemini Deep Think

Under direction from expert mathematicians and scientists, Gemini Deep Think is solving professional research problems across mathematics, physics, and computer science. In the summer of 2025, an advanced version of Gemini Deep Think achieved Gold-medal standard at the International Mathematics Olympiad (IMO) and, later, an updated version obtained similar results at the International Collegiate Programming Contest.

Since then, Gemini Deep Think mode has moved into science, engineering and enterprise workflows to tackle more complex, open-ended challenges.

In the last week, our teams published two papers (1, 2) detailing a cross-disciplinary effort to solve professional research problems usin…

1 month, 2 weeks ago @ deepmind.google
Project Genie: Experimenting with infinite, interactive worlds

Starting today, we're rolling out access to Project Genie for Google AI Ultra subscribers in the U.S. (18+).

This experimental research prototype lets users create, explore and remix their own interactive worlds.

How we're advancing world models: A world model simulates the dynamics of an environment, predicting how they evolve and how actions affect them.

Building on our model research with trusted testers from across industries and domains, we are taking the next step with an experimental research prototype: Project Genie.

How Project Genie works: Project Genie is a prototype web app powered by Genie 3, Nano Banana Pro and Gemini, which allows users to experiment with the immersive experiences…

1 month, 3 weeks ago @ blog.google
D4RT: Teaching AI to see the world in four dimensions

Introducing D4RT, a unified AI model for 4D scene reconstruction and tracking across space and time.

Today, we are introducing D4RT (Dynamic 4D Reconstruction and Tracking), a new AI model that unifies dynamic scene reconstruction into a single, efficient framework, bringing us closer to the next frontier of artificial intelligence: total perception of our dynamic reality.

How D4RT works: a query-based approach. D4RT operates as a unified encoder-decoder Transformer architecture.

The encoder first processes the input video into a compressed representation of the scene’s geometry and motion.

This makes D4RT extremely fast and scalable, whether it’s tracking just a few points or reconstructing …

2 months, 1 week ago @ deepmind.google
Veo 3.1 Ingredients to Video: More consistency, creativity and control

Today, Veo is getting more expressive, with improvements that help you create more fun, creative, high-quality videos based on ingredient images, built directly for the mobile format.

We’re excited to bring new creative possibilities for everyone from casual storytellers to professional filmmakers.

We're releasing: improvements to Veo 3.1 Ingredients to Video, our capability that lets you create videos based on reference images.

This update makes videos more expressive and creative, even with simple prompts. Native vertical outputs for Ingredients to Video (portrait mode) power mobile-first, short-form video creation. State-of-the-art upscaling to 1080p and 4K resolution for high-fidelity p…

2 months, 1 week ago @ blog.google
Google
last post 4 hours ago
How to build production-ready AI agents with Google-managed MCP servers

As developers build AI agents with more sophisticated reasoning systems, they require higher-quality fuel, in the form of enterprise data and specialized tools, to drive real business value.

To get the most out of that octane-rich mix, we offer Google-managed model context protocol (MCP) servers: an engine purpose-built for AI agents to interact securely with Google and Google Cloud services.

These Google-hosted, fully-managed endpoints allow AI agents to communicate with Google Maps, BigQuery, Google Kubernetes Engine, Cloud Run, and many other Google services.

As we boldly build AI agents, ensuring that we’re also building responsibly is critical.

In this guide, we demonstrate how to buil…

4 hours ago @ cloud.google.com
Easy as a green run: How Vail Resorts built an AI assistant to automate personalized recommendations

It’s a big part of the reason Vail Resorts launched My Epic Assistant during the 2024-2025 snow season.

My Epic Assistant is an AI-powered assistant that takes the expertise of Vail Resorts' IT, hospitality, and operational teams and feeds it into Google's powerful Gemini models.

Vail Resorts wanted more than a chatbot; they wanted a digital concierge that understands the nuance of a powder day at Whistler versus a family trip to Beaver Creek.

To take these offerings to the next level, the teams at Vail Resorts and Google Cloud specialist 66degrees updated My Epic Assistant with new features for the 2025-2026 season, including season pass recommendations and improved personalization.

In thi…

4 hours ago @ cloud.google.com
The new AI literacy: Insights from student developers

Contrary to fears that students use AI to cheat or are becoming intellectually lazy, our research with UC Berkeley students reveals something different.

Students treated AI as a learning partner rather than a shortcut, using it strategically for some tasks while deliberately turning it off for others.

The students at UC Berkeley are showing us one answer: with curiosity, caution, and a commitment to genuine learning that technology can support but never replace.

It's very important to understand and validate the code AI is generating and use it appropriately."

1 day, 3 hours ago @ cloud.google.com
How FM Logistic tackled the traveling salesman problem at warehouse scale with AlphaEvolve

The traveling salesman problem asks a deceptively simple question: What's the shortest route that visits every point exactly once?
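The post doesn't reproduce FM Logistic's evolved algorithm; as background, the classic greedy baseline for this problem is the nearest-neighbor heuristic, which a minimal sketch on illustrative 2-D points can show:

```python
import math

def nearest_neighbor_tour(points):
    """Greedy TSP heuristic: from each point, hop to the closest unvisited one.
    Fast but suboptimal -- the kind of baseline an evolved algorithm must beat."""
    unvisited = set(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_length(points, tour):
    """Total cycle length, returning to the start point."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

points = [(0, 0), (0, 1), (2, 1), (2, 0)]
tour = nearest_neighbor_tour(points)
print(tour, round(tour_length(points, tour), 2))  # → [0, 1, 2, 3] 6.0
```

An evaluation function like the 60-tour setup described above would score each candidate routine by the total tour length it produces on recorded warehouse picks.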

FM Logistic, a global logistics provider operating in over 14 countries, had already optimized their routing once.

FM Logistic designed a custom evaluation function using a representative dataset of 60 tours (over one hour of workforce data), letting the agent test thousands of generated algorithms against real-world conditions.

That efficiency gives FM Logistic room to handle larger order volumes with the same team and equipment, without adding headcount or expanding their fleet.

What’s nextThe Poland pilot demonstrated what evolutionary AI can do for complex r…

1 day, 4 hours ago @ cloud.google.com
DRA: A new era of Kubernetes device management with Dynamic Resource Allocation

This week at KubeCon Europe, NVIDIA donated its Dynamic Resource Allocation (DRA) Driver for GPUs to the Kubernetes community, and Google donated the DRA driver for Tensor Processing Units (TPUs).

Moving beyond static infrastructure: For years, Kubernetes' Device Plugin framework was the standard way to consume hardware accelerators.

However, Device Plugins only allow you to express hardware requirements as simple integers (e.g., gpu: 1) — no fractional GPUs!

As the new Kubernetes standard for resource management, DRA reached “stable” status in Kubernetes OSS 1.34.

This capability-based approach ensures that workloads are matched with the most suitable available hardware, improving both reso…

2 days, 4 hours ago @ cloud.google.com
Kubernetes as AI Infrastructure: Google Cloud, llm-d, and the CNCF

At Google Cloud, serving the massive-scale needs of large foundation model builders and AI-native companies is at the forefront of our AI infrastructure strategy.

As generative AI transitions to mission-critical production environments, these innovators require dynamic, relentlessly efficient infrastructure to overcome complex orchestration challenges and power an agentic future.

To meet this moment, we are thrilled to announce that llm-d has officially been accepted as a Cloud Native Computing Foundation (CNCF) Sandbox project.

Google Cloud is proud to be a founding contributor to llm-d alongside Red Hat, IBM Research, CoreWeave, and NVIDIA, uniting around a clear, industry-defining vision…

3 days, 11 hours ago @ cloud.google.com
A developer’s guide to training with Ironwood TPUs

To implement these FP8 training recipes, users can start with the Qwix library.

See our blog post, Inside the optimization of FP8 training on Ironwood, in the Google Developer forums for more details.

These tools allow for the adjustment of tile sizes and other configurations to align with the specific memory hierarchy of the Ironwood TPU.

Refer to TPU Pipelining for more on TPU memory architecture.

See the Optimizing Frontier Model Training on TPU v7x Ironwood post in the Developer forums for more detail on techniques 2-5 above.

4 days, 4 hours ago @ cloud.google.com
Introducing multi-cluster GKE Inference Gateway: Scale AI workloads around the world

With this release, the system uses Kubernetes Custom Resources to manage your distributed inference service.

These backends are exported and become visible as GCPInferencePoolImport resources in the "config cluster."

Standard Gateway and HTTPRoute resources in the config cluster define the entry point and routing rules, directing traffic to these imported pools.

For more information about GKE Inference Gateway core concepts, check out our guide.

Get started todayAs you scale your AI inference serving workloads to more users in more places, we're excited for you to try multi-cluster GKE Inference Gateway.

1 week, 3 days ago @ cloud.google.com
Google Cloud and NVIDIA expand AI innovation across industries at GTC 2026

By providing more granular access to advanced hardware, fractional G4 VMs let you optimize resource allocation and reduce overhead without sacrificing performance.

Delivering efficiency across the AI infrastructure stack: As part of our commitment to a fully open ecosystem, we are excited to announce the integration of Dynamo and GKE Inference Gateway.

These configurations show how to overcome memory and interconnect bottlenecks when running AI inference workloads on AI Hypercomputer.

Advancing Vertex AI training and Model Garden: We are meeting the demands of next-generation AI with two major infrastructure advancements to Vertex AI training clusters.

Empowering public sector AI startups: To fos…

1 week, 4 days ago @ cloud.google.com
Build Resilient LLM Applications on Vertex AI and Reduce 429 Errors

Default options: The default option with Gemini on Vertex AI is Standard Pay-as-you-go (Paygo).

For Standard Pay-as-you-go (Paygo) traffic, Vertex AI uses a system with Usage Tiers.

Reduce payload via context caching: An effective way to reduce the load on Vertex AI is to avoid making calls for repetitive queries.
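Vertex AI's context caching itself is a server-side feature; the client-side half of the idea, not re-paying for identical repeated queries, can be sketched with a simple memoized stand-in (the function below is hypothetical, not a Vertex AI SDK call):

```python
import functools

# Hypothetical stand-in for a model call; in a real app this would be an SDK
# request, and Vertex AI context caching would additionally cut repeated-prefix
# cost server-side. Here we only illustrate reuse of identical queries.
call_count = 0

@functools.lru_cache(maxsize=1024)
def cached_answer(prompt: str) -> str:
    global call_count
    call_count += 1          # each cache miss = one billable model call
    return f"answer to: {prompt}"

for _ in range(5):
    cached_answer("What is our refund policy?")  # 1 miss, then 4 cache hits

print(call_count)  # → 1
```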

Agent memory optimization: For agentic workloads, you can leverage Vertex AI Agent Engine Memory Bank.

Explore the Vertex AI samples on GitHub, jumpstart your next project with the Google Cloud Beginner's Guide or the Vertex AI quickstart, or start building your next AI agent with the Agent Development Kit (ADK).

2 weeks, 1 day ago @ cloud.google.com
The ultimate Nano Banana prompting guide

Built on the Gemini 3 family of models, Nano Banana models apply deep reasoning capabilities to fully understand your prompt before generating an image.

So we spent weeks testing Nano Banana 2 and Nano Banana Pro against every use case we could imagine, to push their limits.

What you'll learn in this guide: a model overview; a full breakdown of tech specs; best practices for effective prompting; prompting frameworks; and how Nano Banana works with other creative models, Veo and Lyria.

Most recently, we announced Nano Banana 2, which shines in three ways. More accurate visuals: Nano Banana 2 is powered by real-time information and images from web search.

Breakdown of tech specs for Nano Banana 2 and Nano Ban…

3 weeks, 1 day ago @ cloud.google.com
Small models, high quality: Inside BMW Group’s experiments evaluating domain-specific language models

One way of achieving better, more natural voice commands is by incorporating AI foundation models into vehicle systems, which offer more intelligence than traditional voice commands.

While large language models (LLMs) offer powerful capabilities, they present one considerable drawback, at least in automotive settings: their reliance on consistent network access makes LLMs impractical for in-vehicle use due to potential lag and interruption.

Small language models may offer an ideal balance — but finding the right trade-off between size and capability requires careful optimization.

These purpose-built, right-sized generative AI models can be run directly on edge devices, including vehicles.

S…

3 weeks, 2 days ago @ cloud.google.com
Designing private network connectivity for RAG-capable gen AI apps

The flexibility of Google Cloud allows enterprises to build secure and reliable architecture for their AI workloads.

In this blog we will look at a reference architecture for private connectivity for retrieval-augmented generation (RAG)-capable generative AI applications.

This architecture is for scenarios where communications of the overall system must use private IP addresses and must not traverse the internet.

RAG allows an application to retrieve relevant information from your documents, datasources, or databases in real time.

Design pattern exampleTo understand how to think about setting up your network for private connectivity for a RAG application in a regional design, let's look at …

3 weeks, 4 days ago @ cloud.google.com
PayPal's historically large data migration is the foundation for its gen AI innovation

To continue leading the next wave of innovation in financial services, we knew we had to modernize our data foundation.

The journey to unified dataAfter choosing Google Cloud as our data partner, we embarked on our historic data migration.

Discovery and analysis: Detailed inventories of data, workloads and inbound/outbound data streams are crucial for defining scope and effort and for forecasting budget.

Personalized financial insights that help merchants optimize their businesses.

Lessons for the AI eraWhile our migration may be extraordinary in its scale, we are not alone in our needs or ambitions.

4 weeks, 1 day ago @ cloud.google.com
Pro-level image generation gets faster and more accessible with Nano Banana 2

Today, we’re entering a new and vibrant era of generative creativity with Nano Banana 2.

What’s new: Nano Banana 2 is our state-of-the-art image generation and editing model.

Whether you're building the next major consumer app or marketing campaign, Nano Banana 2 delivers more accurate visuals: Nano Banana 2 is powered by real-time information and images from web search.

Where you can access Nano Banana 2: Developers can start building with Nano Banana 2 in preview using the Gemini API in Vertex AI.

Deep dive into how Nano Banana 2 sets a new bar for creative work. Advanced world knowledge: Nano Banana 2 is powered by real-time information and images from web search.

4 weeks, 1 day ago @ cloud.google.com
OpenAI
last post None
Microsoft
last post 1 day, 1 hour ago
AsgardBench: A benchmark for visually grounded interactive planning

At a glance: To successfully complete tasks, embodied AI agents must ground and update their plans based on visual feedback.

Spanning 108 controlled task instances across 12 task types, the benchmark requires agents to adapt their plans based on what they observe.

Evaluating AsgardBench: We tested several leading vision-capable models on AsgardBench and observed that high-performing models require visual grounding to consistently succeed.

Across the models, visual input substantially improved performance: most models more than doubled success rates when given images versus text-only descriptions of the scene.

AsgardBench is open source and available on GitHub, providing a fo…

1 day, 1 hour ago @ microsoft.com
GroundedPlanBench: Spatially grounded long-horizon task planning for robot manipulation

Video-to-Spatially Grounded Planning (V2GP) is a framework that converts robot demonstration videos into spatially grounded training data, enabling models to learn planning and grounding jointly.

Grounded planning improves both task success and action accuracy, outperforming decoupled approaches in benchmark and real-world evaluations.

We also built Video-to-Spatially Grounded Planning (V2GP), a framework that converts robot demonstration videos into training data to help VLMs learn this capability.

Decoupled vs. grounded planning, illustrating how ambiguous language causes actions to be grounded to the wrong objects.

In contrast, our approach, grounded planning, performs planning and groun…

1 day, 4 hours ago @ microsoft.com
Will machines ever be intelligent?

And the question we’re going to discuss is, are machines intelligent?

No, no, that’s right, that’s right.

I mean, in some sense, you could potentially have a super intelligent system, right, that’s far more intelligent than anything else on the planet.

BURGER: Right, right.

At the same time, I think, you know, transformers are not intelligent in the way that a three-year-old is, right?

4 days, 5 hours ago @ microsoft.com
Systematic debugging for AI agents: Introducing the AgentRx framework

Debugging AI agent failures is hard because trajectories are long, stochastic, and often multi-agent, so the true root cause gets buried.

As AI agents transition from simple chatbots to autonomous systems capable of managing cloud incidents, navigating complex web interfaces, and executing multi-step API workflows, a new challenge has emerged: transparency.

The challenge: Why AI agents are hard to debug. Modern AI agents are often long-horizon: they perform dozens of actions over extended periods.

LLM-based judging: Finally, an LLM judge uses the validation log and a grounded failure taxonomy to identify the Critical Failure Step—the first unrecoverable error.
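The post doesn't show AgentRx's internals; as a rough illustration, the "Critical Failure Step" notion can be pictured as a scan for the first failed, unrecoverable step in a validation log. In the sketch below, the log schema and function name are assumptions, not the AgentRx API (and AgentRx uses an LLM judge with a failure taxonomy rather than boolean flags):

```python
def critical_failure_step(validation_log):
    """Return the index of the first step marked both failed and unrecoverable,
    mimicking the 'Critical Failure Step' idea (log format is illustrative)."""
    for i, step in enumerate(validation_log):
        if not step["ok"] and not step["recoverable"]:
            return i
    return None  # trajectory had no unrecoverable failure

log = [
    {"ok": True,  "recoverable": True},   # step 0: fine
    {"ok": False, "recoverable": True},   # step 1: transient error, retried
    {"ok": False, "recoverable": False},  # step 2: first unrecoverable error
    {"ok": False, "recoverable": False},  # step 3: downstream fallout
]
print(critical_failure_step(log))  # → 2
```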

Together, we can build AI agents…

2 weeks, 1 day ago @ microsoft.com
From raw interaction to reusable knowledge: Rethinking memory for AI agents

It seems counterintuitive: giving AI agents more memory can make them less effective.

In our recent paper “PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents,” we introduce a plug-and-play memory system that transforms raw agent interactions into reusable knowledge.

Raw interactions are standardized and transformed into propositional knowledge (facts) and prescriptive knowledge (reusable skills).

One memory, any task: Most AI memory systems are built for one job.

Toward reusable memory for agents: As AI agents take on longer and more complex tasks, their memory needs to evolve from storing past interactions to actively supplying reusable knowledge.

2 weeks, 3 days ago @ microsoft.com
Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model

At a glance: Phi-4-reasoning-vision-15B is a compact and smart open‑weight multimodal reasoning model that balances reasoning power, efficiency, and training data needs.

This leads to several possible training pipelines: Non-reasoning LLM → reasoning multimodal training: Reasoning and multimodal capabilities are trained together.

Non-reasoning LLM → non-reasoning multimodal → reasoning multimodal training: Multimodal capabilities are learned first, then reasoning is added.

Reasoning LLM → reasoning multimodal training: A reasoning base is used, but all multimodal d…

3 weeks, 2 days ago @ microsoft.com
Trailer: The Shape of Things to Come

This is The Shape of Things to Come.

I manage Microsoft Research’s worldwide labs, and I’m excited to introduce this new Microsoft Research Podcast series.

I called the podcast The Shape of Things to Come because as researchers, the problems that we choose to solve and the technologies that we develop do change the shape of the future.

It’s very hard to say whether we’re in an inflection point because I see the advancement of technology accelerating.

But I don’t know what the inflection point is because all I’ve seen is a curve going up.

3 weeks, 3 days ago @ microsoft.com
CORPGEN advances AI agents for real work

To determine what a benchmark would need to test, we ran MHTEs at scale on some of today’s leading AI agents, exposing four weaknesses.

CORPGEN’s architecture: CORPGEN introduces digital employees: LLM-powered AI agents with persistent identities, role-specific expertise, and realistic work schedules.

Across 46 tasks, CORPGEN completed 15.2% of them, compared with 4.3% for the baselines, roughly 3.5 times more.

CORPGEN also opens a new lens on how AI agents collaborate.

Acknowledgments: This work is a result of a collaboration between the Office of the CTO at Microsoft and the Microsoft AI Development Accelerator Program (MAIDAP).

4 weeks, 1 day ago @ microsoft.com
Media Authenticity Methods in Practice: Capabilities, Limitations, and Directions

We refer to technologies aimed at helping viewers verify the source and history—that is, the provenance—of digital content as media integrity and authentication (MIA) methods.

Today, we are publishing our findings in the Media Integrity & Authentication: Status, Directions & Futures report.

The report distills lessons learned and outlines practical directions for strengthening media integrity in the years ahead.

Findings and directions forward: Our research recognizes that different media integrity and authenticity methods serve differing purposes and offer distinct levels of protection.

Important directions include in-stream tools that display provenance inf…

1 month ago @ microsoft.com
Project Silica’s advances in glass storage technology

At a glance: Microsoft Research publishes a breakthrough in Nature on glass-based data storage that could preserve information for 10,000 years.

Glass is a permanent data storage material that is resistant to water, heat, and dust.

Phase voxels, a new storage method: We invented a new type of data storage in glass called phase voxels, in which the phase change of the glass is modified instead of its polarization, showing that only a single pulse is necessary to make a phase voxel.

We demonstrated that these phase voxels can also be formed in borosilicate glass and devised a technique to read the phase information from phase voxels encoded in this material.

A research-grade Writer used to set t…

1 month, 1 week ago @ microsoft.com
Rethinking imitation learning with Predictive Inverse Dynamics Models

Predictive Inverse Dynamics Models (PIDMs) predict plausible future states, clarifying the direction of behavior during imitation learning.

Predictive Inverse Dynamics Models (PIDMs) offer a different take on imitation learning by changing how agents interpret human behavior.

By grounding the selection process in a plausible future, PIDMs provide a clearer basis for choosing an action during inference.

They then use an inverse dynamics model to predict the action required to move from the current state towards that future state.
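In spirit (though not the authors' actual architecture), the two-stage PIDM decision, first imagine a plausible future state, then infer the action that reaches it, can be sketched with toy 1-D dynamics where actions are displacements; all function names here are illustrative:

```python
def predict_future_state(state, goal, horizon=5):
    """Stand-in for the learned future-state predictor: step a fraction toward the goal."""
    return state + (goal - state) / horizon

def inverse_dynamics(state, future_state):
    """Toy inverse dynamics: with dynamics s' = s + a, the action is the displacement."""
    return future_state - state

def pidm_action(state, goal):
    # 1) imagine a plausible future state, 2) infer the action that reaches it
    return inverse_dynamics(state, predict_future_state(state, goal))

action = pidm_action(state=0.0, goal=10.0)  # → 2.0
```

Grounding the action in a predicted future, rather than regressing actions directly from the current state, is what gives the agent a clearer target during inference.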

A player (left) and a PIDM agent (right) playing the game Bleeding Edge side by side.

1 month, 2 weeks ago @ microsoft.com
Paza: Introducing automatic speech recognition benchmarks and models for low resource languages

At a glance: Microsoft Research releases PazaBench and Paza automatic speech recognition models, advancing speech technology for low-resource languages.

First-of-its-kind ASR leaderboard, starting with African languages: PazaBench is the first automatic speech recognition (ASR) leaderboard for low-resource languages.

It launches with initial coverage for 39 African languages and benchmarks 52 state‑of‑the‑art ASR and language models, including newly released Paza ASR models for six Kenyan languages.

Paza ASR Models: Built with and for Kenyan languages. The Paza ASR models consist of three fine-tuned ASR models built on top of state‑of‑the‑art model architectures.

Here is how Paza models compa…

1 month, 2 weeks ago @ microsoft.com
UniRG: Scaling medical imaging report generation with multimodal reinforcement learning

At a glance: AI-driven medical image report generation can help medical providers become more efficient and productive.

Universal Report Generation (UniRG) uses reinforcement learning to align model training with real-world radiology practice rather than proxy text-generation objectives.

Beyond the real-world benefits, report generation has also become a critical benchmark for evaluating multimodal reasoning in healthcare AI.

In this blog, we introduce Universal Report Generation (UniRG), a reinforcement learning–based framework for medical imaging report generation.

UniRG is a promising step toward scaling medical imaging report generation. UniRG introduces a reinforcement …

1 month, 4 weeks ago @ microsoft.com
Multimodal reinforcement learning with agentic verifier for AI agents

Argos is an agentic verification framework designed to improve the reliability of reinforcement learning in multimodal models.

Reinforcement learning is a training method where AI models learn by receiving rewards for desired behaviors and penalties for undesired ones, gradually improving their performance through trial and error.
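This reward-driven trial-and-error loop can be sketched with a toy multi-armed bandit and an epsilon-greedy learner; everything here is illustrative, not from the Argos work:

```python
import random

def run_bandit(true_means, steps=5000, eps=0.1, seed=0):
    """Trial-and-error learning: act, observe a reward, update value estimates."""
    rng = random.Random(seed)
    q = [0.0] * len(true_means)  # estimated value of each action
    n = [0] * len(true_means)    # times each action was tried
    for _ in range(steps):
        # Explore occasionally; otherwise exploit the best-looking action.
        if rng.random() < eps:
            a = rng.randrange(len(q))
        else:
            a = max(range(len(q)), key=q.__getitem__)
        reward = true_means[a] + rng.gauss(0, 0.1)  # noisy reward signal
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]  # incremental mean update
    return q

estimates = run_bandit([0.2, 0.8, 0.5])  # the learner should come to favour action 1
```

The quality of the reward signal is exactly what a verifier like Argos is meant to protect: a noisy or gameable reward corrupts every update in this loop.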

This design prevents unreliable feedback from dominating training and produces a stable reward signal for reinforcement learning.

Models trained with Argos also showed a substantial reduction of hallucinations compared with both standard chain-of-thought prompting and reinforcement learning baselines.

How Argos shapes reinforcement learning: To understand how Argos affe…

2 months ago @ microsoft.com
MIT AI
last post 1 day ago
Seeing sounds

Now he’s one of five master’s students in the Music Technology and Computation Graduate Program’s inaugural cohort.

When paired with a stimulus like music, these images can “show” sounds in action.

“This approach enables anyone to create music-driven visuals while leveraging the expressive and sometimes unpredictable dynamics of self-organized systems,” Salcedo says.

“He brings great energy and thoughtfulness to his work, and to supporting others in the [music technology and computation graduate] program,” Egozy notes.

“I want users to feel movement and explore sounds and their impact more fully,” he says.

1 day ago @ news.mit.edu
MIT engineers design proteins by their motion, not just their shape

“AI must go beyond analyzing static forms to understanding how structure and motion are fundamentally intertwined,” Buehler adds.

Now, MIT engineers have taken a major step toward closing the gap with the development of an AI model known as VibeGen.

VibeGen does something no protein design tool has done before.

By enabling researchers to specify motion as a direct design parameter, VibeGen treats proteins less like static shapes and more like programmable mechanical devices.

They also hope to integrate motion-aware design with other AI tools, building toward systems that can design proteins to be not just dynamic, but multifunctional: machines that sense their environment, respond to signal…

1 day ago @ news.mit.edu
AI system learns to keep warehouse robot traffic running smoothly

In simulations inspired by actual e-commerce warehouse layouts, this new approach achieved about a 25 percent gain in throughput over other methods.

Importantly, the system can quickly adapt to new environments with different quantities of robots or varied warehouse layouts.

“The planning system needs to be adaptive to these changes as the warehouse operations go on,” Zheng says.

“By interacting with simulations inspired by real warehouse layouts, our system receives feedback that we use to make its decision-making more intelligent.

While their system is still far away from real-world deployment, these demonstrations highlight the feasibility and benefits of using a machine learning-guided a…

1 day, 16 hours ago @ news.mit.edu
Augmenting citizen science with computer vision for fish monitoring

Monitoring fish movement and understanding population dynamics are essential for informing conservation efforts and supporting fisheries management.

A team of researchers from the Woodwell Climate Research Center, MIT Sea Grant, the MIT Computer Science and Artificial Intelligence Lab (CSAIL), MIT Lincoln Laboratory, and Intuit explored a new monitoring method using underwater video and computer vision to supplement citizen science efforts.

The open-access paper, “From snapshots to continuous estimates: Augmenting citizen science with computer vision for fish monitoring,” outlines how recent advancements in computer vision and deep learning, from object detection and tracking to species cla…

1 day, 23 hours ago @ news.mit.edu
Wristband enables wearers to control a robotic hand with their own movements

In demonstrations, the team has shown that a person wearing the wristband can wirelessly control a robotic hand.

Now, MIT engineers have designed an ultrasound wristband that precisely tracks a wearer’s hand movements in real time.

Some approaches use cameras to record a person’s hand movements as they manipulate objects or perform tasks.

Others involve having a person wear a glove with sensors, which records the person’s hand movements and transmits the data to a receiving robot.

They tested the algorithm on a new set of ultrasound images and found it correctly predicted the corresponding hand gestures.

2 days, 10 hours ago @ news.mit.edu
How to create “humble” AI

One way to prevent these mistakes is to program AI systems to be more “humble,” according to the researchers.

This new approach could allow doctors and AI systems to work as partners, the researchers say, and help prevent AI from exerting too much influence over doctors’ decisions.

Previous studies have found that ICU physicians defer to AI systems that they perceive as reliable even when their own intuition goes against the AI suggestion.

To create such a system, the consortium designed a framework that includes several computational modules that can be incorporated into existing AI systems.

“Of course, we cannot stop or even delay the development of AI, not just in health care, but in eve…

3 days, 16 hours ago @ news.mit.edu
Advancing international trade research and finding community

The event, part of the CIS Global Research and Policy Seminar, filled the venue with audience members from across MIT.

“My work is directly connected to what CIS faculty have previously done on international trade and security,” Park said afterwards.

Pursuing interdisciplinary research and connectionsBefore pursuing a tenure-track position, Park set his sights on conducting research at MIT.

He’s published prolifically along the way, including two articles in the Review of International Organizations and the Review of International Political Economy.

“I’ll never stop using the communication skills that I got here at MIT,” Park says.

3 days, 23 hours ago @ news.mit.edu
On algorithms, life, and learning

At MIT, Bertsimas is the vice provost for open learning, associate dean for online education and artificial intelligence, Boeing Leaders for Global Operations Professor of Management, and professor of operations research in the MIT Sloan School of Management.

Bertsimas delivered his lecture, titled “Algorithms for Life: AI and Operations Research Transforming Healthcare, Education, and Agriculture,” to an audience of over 300 MIT community members in Huntington Hall (Room 10-250) on campus.

Moving to MIT for his graduate work, he then earned his MS in operations research and his PhD in applied mathematics and operations research.

Bertsimas joined the MIT faculty after receiving his doctorat…

4 days, 3 hours ago @ news.mit.edu
What’s the right path for AI?

In her remarks, Hao outlined the astonishing size of datasets now being used by the biggest AI firms to develop large language models.

By contrast, Hao offered, an alternate path for AI might exist in the example of AlphaFold, the Nobel Prize-winning tool used to identify protein structures.

This represents the concept of the “small, task-specific AI model tackling a well-scoped problem that lends itself to the computational strengths of AI,” Hao said.

Ricaurte has also served on expert committees such as the Global Partnership for AI, UNESCO’s AI Ethics Experts Without Borders, and the Women for Ethical AI project.

“Part of the challenge in talking about AI is the complete lack of specific…

1 week ago @ news.mit.edu
MIT and Hasso Plattner Institute establish collaborative hub for AI and creativity

The following is a joint announcement from the MIT School of Architecture and Planning, MIT Schwarzman College of Computing, Hasso Plattner Institute, and Hasso Plattner Foundation.

The MIT Morningside Academy for Design (MAD), MIT Schwarzman College of Computing, Hasso Plattner Institute (HPI), and Hasso Plattner Foundation celebrated the launch of the MIT and HPI AI and Creativity Hub (MHACH) at a signing ceremony this week.

A steering committee composed of representatives from the MIT School of Architecture and Planning, MIT Schwarzman College of Computing, and Hasso Plattner Institute will facilitate the shared governance of MHACH.

This collaboration is made possible by the Hasso Plattn…

1 week ago @ news.mit.edu
A better method for identifying overconfident large language models

Large language models (LLMs) can generate credible but inaccurate responses, so researchers have developed uncertainty quantification methods to check the reliability of predictions.

Research has shown that epistemic uncertainty, or uncertainty about whether one is using the right model, can be a better way to assess true uncertainty when a model is overconfident.

But since it is impossible to build an ideal model, researchers use surrogates or approximations that often rely on faulty assumptions.

To improve uncertainty quantification, the MIT researchers needed a more accurate way to estimate epistemic uncertainty.
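One common surrogate for epistemic uncertainty, used here only as a baseline illustration and not as the MIT researchers' method, is disagreement across an ensemble of models:

```python
import statistics

def epistemic_uncertainty(predictions):
    """Proxy for epistemic uncertainty: disagreement (variance) across ensemble members.
    High disagreement suggests the models are unsure which hypothesis is right."""
    return statistics.pvariance(predictions)

# Hypothetical ensemble outputs (probability of class 1) for two different inputs
confident = [0.91, 0.90, 0.92, 0.89]  # members agree -> low epistemic uncertainty
uncertain = [0.10, 0.85, 0.55, 0.30]  # members disagree -> high epistemic uncertainty
```

Note that an overconfident single model can report low predictive uncertainty on both inputs; the ensemble disagreement is what exposes the second case.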

Once they had developed this method for estimating epistemic uncertainty, t…

1 week, 1 day ago @ news.mit.edu
Generative AI improves a wireless vision system that sees through obstructions

This new technique builds a partial reconstruction of a hidden object from reflected wireless signals and fills in the missing parts of its shape using a specially trained generative AI model.

The researchers also introduced an expanded system that uses generative AI to accurately reconstruct an entire room, including all the furniture.

The system utilizes wireless signals sent from one stationary radar, which reflect off humans moving in the space.

“What we’ve done now is develop generative AI models that help us understand wireless reflections.

In the new papers, they overcame that limitation by using a generative AI model to fill in parts that are missing from a partial reconstruction.

1 week, 1 day ago @ news.mit.edu
Sustaining diplomacy amid competition in US-China relations

He listed four primary areas for this competition: military, technology, trade and economics, and values.

Outside of North America, China is the United States’ largest trade partner.

“At one point, you’ll remember, 145 percent tariffs by the United States, and 125 percent by China on the United States.

In 2024 and 2025, the United States was not the only country to place tariffs on these products; India, Turkey, South Africa, Mexico, Canada, the EU, and others followed suit.

Burns also described the significant technological competition between the United States and China — an area of central importance.

1 week, 2 days ago @ news.mit.edu
MIT-IBM Watson AI Lab seed to signal: Amplifying early-career faculty impact

This includes building a research team, which demands innovative ideas and direction, creative collaborators, and reliable resources.

Shortly after joining MIT, Andreas jump-started his first major project through the MIT-IBM Watson AI Lab, working on language representation and structured data augmentation methods for low-resource languages.

For several other faculty members, timely participation with the MIT-IBM Watson AI Lab proved to be highly advantageous as well.

Merging expertiseThe nature of the MIT-IBM Watson AI Lab is that it not only brings together researchers in the AI realm to accelerate research, but also blends work across disciplines.

“I think these early-career projects [w…

1 week, 3 days ago @ news.mit.edu
Can AI help predict which heart-failure patients will worsen within a year?

Characterized by weakened or damaged heart musculature, heart failure results in the gradual buildup of fluid in a patient’s lungs, legs, feet, and other parts of the body.

Yet heart failure remains one of the leading causes of morbidity and mortality, placing a substantial burden on health-care systems across the globe.

“The biggest thing that distinguishes [PULSE-HF] from other heart failure ECG methods is instead of detection, it does forecasting,” says Yau.

The paper notes that to date, no other methods exist for predicting future LVEF decline among patients with heart failure.

While the model aims to forecast a patient’s ejection fraction, the labels for the training data weren’t alway…

2 weeks ago @ news.mit.edu
Berkeley AI
last post 2 weeks ago
Identifying Interactions at Scale for LLMs

Understanding the behavior of complex machine learning systems, particularly Large Language Models (LLMs), is a critical challenge in modern artificial intelligence.

Therefore, grounded or reality-checked interpretability methods must also be able to capture these influential interactions.

In this blog post, we describe the fundamental ideas behind SPEX and ProxySPEX, algorithms capable of identifying these critical interactions at scale.

SPEX and ProxySPEX Framework: To discover influential interactions with a tractable number of ablations, we have developed SPEX (Spectral Explainer).

We formalize this through two observations: sparsity (relatively f…
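As a baseline intuition for what an "interaction" means here, not the SPEX algorithm itself, which uses spectral machinery precisely to avoid exhaustive ablations, the pairwise interaction of two features under a value function f can be measured by inclusion-exclusion over ablation masks:

```python
def pairwise_interaction(f, i, j, n):
    """Inclusion-exclusion over ablation masks: nonzero only if features i and j
    jointly affect f beyond their individual contributions."""
    def mask(on):
        return tuple(1 if k in on else 0 for k in range(n))
    return f(mask({i, j})) - f(mask({i})) - f(mask({j})) + f(mask(set()))

# Toy value function with a genuine AND-style interaction between features 0 and 1
def f(m):
    return 2.0 * m[0] * m[1] + 0.5 * m[2]

interaction_01 = pairwise_interaction(f, 0, 1, n=3)  # → 2.0 (real interaction)
interaction_02 = pairwise_interaction(f, 0, 2, n=3)  # → 0.0 (purely additive)
```

Enumerating all 2^n masks this way is exponentially expensive, which is why scalable methods like SPEX exploit sparsity instead.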

2 weeks ago @ bair.berkeley.edu
Information-Driven Design of Imaging Systems

We developed a framework that enables direct evaluation and optimization of imaging systems based on their information content.

The first approach treated imaging systems as unconstrained communication channels, ignoring the physical limitations of lenses and sensors.

Our Information-Driven Encoder Analysis Learning (IDEAL) method uses gradient ascent on information estimates to optimize imaging system parameters.

The standard approach to computational imaging design, end-to-end optimization, jointly trains the imaging hardware and a neural network decoder.

The computational efficiency of IDEAL suggests possibilities for designing imaging systems that were previously intractable.

2 months, 2 weeks ago @ bair.berkeley.edu
RL without TD learning

In this post, I’ll introduce a reinforcement learning (RL) algorithm based on an “alternative” paradigm: divide and conquer.

We can do Reinforcement Learning (RL) based on divide and conquer, instead of temporal difference (TD) learning.

There are two classes of algorithms in RL: on-policy RL and off-policy RL.

We compared TRL with $n$-step TD learning with different values of $n$, from $1$ (pure TD) to $\infty$ (pure MC).
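The spectrum being swept here can be written down directly: the n-step target interpolates between bootstrapped TD (n = 1) and pure Monte Carlo (n → ∞). A generic sketch, not taken from the TRL paper:

```python
def n_step_target(rewards, values, t, n, gamma=0.9):
    """n-step TD target from time t: discounted rewards for up to n steps,
    then bootstrap from a value estimate (pure Monte Carlo past the episode end)."""
    horizon = min(t + n, len(rewards))
    g = sum(gamma ** (k - t) * rewards[k] for k in range(t, horizon))
    if horizon < len(values):  # bootstrap with V(s_horizon) if an estimate exists
        g += gamma ** (horizon - t) * values[horizon]
    return g

rewards = [1.0, 0.0, 2.0]      # one short episode
values = [0.5, 0.4, 0.9, 0.0]  # V(s_0..s_3); terminal state valued 0
one_step = n_step_target(rewards, values, t=0, n=1)         # ≈ 1.36 (pure TD)
monte_carlo = n_step_target(rewards, values, t=0, n=10**9)  # ≈ 2.62 (pure MC)
```

Small n leans on the (possibly biased) value estimate; large n leans on the (high-variance) sampled return, which is the trade-off the comparison probes.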

I still think one of the most important problems in RL (and even in machine learning) is to find a scalable off-policy RL algorithm.

4 months, 3 weeks ago @ bair.berkeley.edu
What exactly does word2vec learn?

What exactly does word2vec learn, and how?

In this framing, it’s clear that word2vec is a minimal neural language model.

As a result, the theory predicts exactly what features are learned in terms of the corpus statistics and the algorithmic hyperparameters.

We find that over the course of learning, word2vec builds these linear representations in a sequence of noisy learning steps, and their geometry is well-described by a spiked random matrix model.

6 months, 3 weeks ago @ bair.berkeley.edu
Whole-Body Conditioned Egocentric Video Prediction

Predicting Ego-centric Video from human Actions (PEVA).

We trained a model to Predict Ego-centric Video from human Actions (PEVA) for Whole-Body-Conditioned Egocentric Video Prediction.

We train an autoregressive conditional diffusion transformer on Nymeria, a large-scale dataset pairing real-world egocentric video with body pose capture.

We include some samples here, covering body movement actions (move forward, rotate left, rotate right), left-hand actions (up, down, left, right), and right-hand actions (up, down, left, right). Long rollout: Here you…

8 months, 4 weeks ago @ bair.berkeley.edu
AWS Machine Learning
last post 21 hours ago
Run Generative AI inference with Amazon Bedrock in Asia Pacific (New Zealand)

Today, we’re excited to announce that Amazon Bedrock is now available in the Asia Pacific (New Zealand) Region (ap-southeast-6).

Amazon Bedrock provides two types of cross-Region inference profiles: Geographic cross-Region inference – Routes requests within a specific geographic boundary.

Cross-Region inference types and example models: AU geographic cross-Region inference – Anthropic Claude Opus 4.6, Claude Sonnet 4.6, Claude Sonnet 4.5, Claude Haiku 4.5; Global cross-Region inference – Anthropic Claude Opus 4.6, Claude Sonnet 4.6, Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5. AU geographic cross-Region inference currently supports Anthropic Claude models, keeping inference processing within the…
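With cross-Region inference, you pass an inference profile ID where a plain model ID would go. A minimal sketch of the Converse request shape; the profile ID below is a placeholder, so check the Bedrock console for the exact identifiers available in ap-southeast-6:

```python
# Placeholder inference profile ID (illustrative, not a real identifier); real IDs
# carry a routing-scope prefix, e.g. "global." for global cross-Region inference.
INFERENCE_PROFILE_ID = "global.anthropic.claude-example-profile"

def build_converse_request(prompt):
    """Request shape for the Bedrock Converse API, targeting an inference profile."""
    return {
        "modelId": INFERENCE_PROFILE_ID,  # a profile ID goes where a model ID would
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": 256, "temperature": 0.2},
    }

request = build_converse_request("Summarise this document in one sentence.")
# With boto3: boto3.client("bedrock-runtime", region_name="ap-southeast-6").converse(**request)
```

Choosing a geographic profile versus a global one changes only the profile ID, not the request shape.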

21 hours ago @ aws.amazon.com
Building age-responsive, context-aware AI with Amazon Bedrock Guardrails

To address these challenges, we implemented a fully serverless, guardrail-first solution using Amazon Bedrock Guardrails and other AWS services that align with modern AI safety and compliance alignment needs.

Solution overviewThis solution uses Amazon Bedrock, Amazon Bedrock Guardrails, AWS Lambda, and Amazon API Gateway as core services for intelligent response generation, centralized policy enforcement, and secure access.

For production enterprise deployments, host the web interface using AWS Amplify, Amazon S3 and Amazon CloudFront, or container services like Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Kubernetes Service (Amazon EKS).

For detailed Amazon Bedrock Guar…

1 day, 3 hours ago @ aws.amazon.com
Accelerating LLM fine-tuning with unstructured data using SageMaker Unified Studio and S3

Last year, AWS announced an integration between Amazon SageMaker Unified Studio and Amazon S3 general purpose buckets.

The full end-to-end data ingestion, model development, and metric evaluation process will be orchestrated using Amazon SageMaker Unified Studio.

To achieve this process flow, we build an architecture that performs the data ingestion, data preprocessing, model training, and evaluation using Amazon SageMaker Unified Studio.

PrerequisitesTo prepare your organization to use the new integration between Amazon SageMaker Unified Studio and Amazon S3 general purpose buckets, you must complete the following prerequisites.

Grantees can access Amazon S3 data by using the AWS Command L…

1 day, 3 hours ago @ aws.amazon.com
Introducing Amazon Polly Bidirectional Streaming: Real-time speech synthesis for conversational AI

Today, we’re excited to announce the new Bidirectional Streaming API for Amazon Polly, enabling streamlined real-time text-to-speech (TTS) synthesis where you can start sending text and receiving audio simultaneously.

Stream text to Amazon Polly as it becomes available—no need to wait for complete sentences or paragraphs.

The bidirectional streaming API test sends each word to the stream as it arrives, allowing Amazon Polly to begin synthesis before the full text is available.

Complete example: streaming text from an LLM. Here’s a practical example showing how to integrate bidirectional streaming with incremental te…
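The incremental pattern, pushing tokens as the LLM emits them while consuming audio as it is produced, can be illustrated generically; the `synthesize` coroutine below is a hypothetical stand-in for the actual Polly bidirectional stream, not its real API:

```python
import asyncio

async def llm_tokens():
    """Stand-in for an LLM emitting words incrementally."""
    for word in ["Hello", "from", "a", "streaming", "pipeline."]:
        await asyncio.sleep(0)  # yield control, as a real token stream would
        yield word

async def synthesize(text_queue, audio_chunks):
    """Hypothetical synthesizer: consumes words as they arrive, emits 'audio' per word."""
    while (word := await text_queue.get()) is not None:
        audio_chunks.append(f"<audio:{word}>")  # synthesis starts before full text exists

async def main():
    q, audio = asyncio.Queue(), []
    synth = asyncio.create_task(synthesize(q, audio))
    async for token in llm_tokens():  # producer: forward each word immediately
        await q.put(token)
    await q.put(None)                 # signal end of text
    await synth
    return audio

audio = asyncio.run(main())
```

Because the producer and consumer overlap, time-to-first-audio depends on the first token, not on the full response length, which is the latency win the bidirectional API targets.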

1 day, 3 hours ago @ aws.amazon.com
Unlocking video insights at scale with Amazon Bedrock multimodal models

The evolution of video analysis: Traditional video analysis approaches rely on manual review or basic computer vision techniques that detect predefined patterns.

Three approaches to video understanding: Understanding video content is inherently complex, combining visual, auditory, and temporal information that must be analyzed together for meaningful insights.

Multimodal embedding for semantic video search: Multimodal embedding represents an emerging approach to video understanding, particularly powerful for video semantic search applications.

The solution offers workflows using Amazon Nova Multimodal Embedding and TwelveLabs Marengo models available on Amazon Bedrock.
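Semantic video search over embeddings reduces to nearest-neighbor lookup in vector space. A self-contained sketch with made-up 3-dimensional vectors and hypothetical clip IDs (real embeddings from models like Nova Multimodal Embedding or Marengo have hundreds of dimensions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical clip embeddings keyed by clip ID.
clip_index = {
    "clip_goal":      [0.9, 0.1, 0.0],
    "clip_crowd":     [0.2, 0.8, 0.1],
    "clip_interview": [0.1, 0.2, 0.9],
}

def search(query_vec, index, top_k=1):
    """Return the top_k clip IDs ranked by similarity to the query."""
    ranked = sorted(index, key=lambda cid: cosine(query_vec, index[cid]),
                    reverse=True)
    return ranked[:top_k]

print(search([0.85, 0.15, 0.05], clip_index))  # clip_goal is closest
```

In production the index lives in a vector store rather than a dict, but the ranking step is the same.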

Social Media Content Moderatio…

2 days, 1 hour ago @ aws.amazon.com
Deploy voice agents with Pipecat and Amazon Bedrock AgentCore Runtime – Part 1

In this series of posts, you will learn how streaming architectures help address these challenges using Pipecat voice agents on Amazon Bedrock AgentCore Runtime.

Amazon Bedrock AgentCore Runtime addresses these challenges by providing a secure, serverless environment for scaling dynamic AI agents.

When building voice agents, latency is a critical consideration, determining how natural and reliable a voice conversation feels.

Deploy Pipecat voice agents on AgentCore Runtime using persistent, bidirectional WebSocket connections for audio streaming between client devices and your agent logic.

This path is not supported on AgentCore Runtime as the runtime environment cannot be assigned to a pub…

2 days, 2 hours ago @ aws.amazon.com
Reinforcement fine-tuning on Amazon Bedrock with OpenAI-Compatible APIs: a technical walkthrough

In December 2025, we announced the availability of Reinforcement fine-tuning (RFT) on Amazon Bedrock starting with support for Nova models.

In Amazon Bedrock RFT, this could be Amazon Nova, Llama, Qwen, or other supported models.

How Amazon Bedrock RFT works: Amazon Bedrock RFT is built to make reinforcement fine-tuning practical at the enterprise level.

You can also authenticate using AWS SigV4 credentials, but in this walkthrough we use an Amazon Bedrock API key.

The full notebook with end-to-end code for both GPT-OSS 20B and Qwen3 32B is available on GitHub: github.com/aws-samples/amazon-bedrock-samples/tree/main/custom-models/bedrock-reinforcement-fine-tuning. For more details, see the Amazon…

2 days, 3 hours ago @ aws.amazon.com
Deploy SageMaker AI inference endpoints with set GPU capacity using training plans

Originally designed for training workloads, training plans now support inference endpoints, providing predictable GPU availability for time-bound inference workloads.

Once created, the training plan provides a set capacity that can be referenced when deploying SageMaker AI inference endpoints.

In this post, we walk through how to search for available p-family GPU capacity, create a training plan reservation for inference, and deploy a SageMaker AI inference endpoint on that reserved capacity.

Figure 1: Training Plans landing page with the Create training plan button. Step A – Search for training plan offerings: Under Target, select Inference Endpoint.

Conclusion: SageMaker AI training plans provide…

3 days ago @ aws.amazon.com
Accelerating custom entity recognition with Claude tool use in Amazon Bedrock

This post introduces a game-changing solution: Claude Tool use in Amazon Bedrock, which uses the power of large language models (LLMs) to perform dynamic, adaptable entity recognition without extensive setup or training.

Solution overview: In this post, we demonstrate how to extract custom fields from driver’s licenses using Claude Tool use in Amazon Bedrock.

Working with Claude Tool use schemas: Amazon Bedrock with Claude 4.5 Sonnet supports function calling using Tool use, where you define callable tools with clear JSON schemas.

input_schema: a JSON schema that defines required fields, types, and structure. Example Tool use definition: [{ "name": "extract_license_fields", "input_schema": { "typ…
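A Tool use definition of this shape can be exercised locally. The field names below are hypothetical (the post’s full schema is truncated), and the required-fields check uses only the standard library rather than a schema-validation package:

```python
# Hypothetical Tool use definition in the Anthropic/Bedrock style.
tool = {
    "name": "extract_license_fields",
    "description": "Extract custom fields from a driver's license.",
    "input_schema": {
        "type": "object",
        "properties": {
            "full_name": {"type": "string"},
            "license_number": {"type": "string"},
            "expiry_date": {"type": "string"},
        },
        "required": ["full_name", "license_number"],
    },
}

def validate(payload: dict, schema: dict) -> list:
    """Return the list of required fields missing from a tool call payload."""
    return [f for f in schema.get("required", []) if f not in payload]

print(validate({"full_name": "Jane Doe"}, tool["input_schema"]))  # ['license_number']
```

Because the model is constrained to emit arguments matching input_schema, downstream code can rely on this structure instead of parsing free text.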

3 days, 3 hours ago @ aws.amazon.com
How Reco transforms security alerts using Amazon Bedrock

Using Anthropic Claude in Amazon Bedrock, Reco tackles the challenge of machine-readable security alerts that SOC teams struggle to quickly interpret.

In this blog post, we show you how Reco implemented Amazon Bedrock to help transform security alerts and achieve significant improvements in incident response times.

The challenge: Making security alerts actionable. Modern security alerts are often highly technical, requiring security engineers to manually analyze raw event data, cross-reference indicators across multiple security alerts, determine potential impact and appropriate responses, derive actionable insights, and communicate findings to non-technical stakeholders.

Example outcome: The f…

4 days, 4 hours ago @ aws.amazon.com
Integrating Amazon Bedrock AgentCore with Slack

Integrating Amazon Bedrock AgentCore with Slack brings AI agents directly into your workspace.

Solution overview: This solution consists of two main components: the Slack integration infrastructure and the Amazon AgentCore Runtime with tools.

The integration infrastructure in this solution uses Amazon API Gateway, AWS Lambda, AWS Secrets Manager, and Amazon Simple Queue Service (Amazon SQS) for serverless integration.

It’s built with the Strands Agents SDK that integrates with Amazon Bedrock AgentCore Gateway for tool access and AgentCore Memory for conversation history.

Section B – AgentCore Components – Next, the WeatherAgentCoreStack CDK stack deploys the AgentCore Runtime, Gateway, Memory, and AWS …

4 days, 4 hours ago @ aws.amazon.com
Overcoming LLM hallucinations in regulated industries: Artificial Genius’s deterministic models on Amazon Nova

In this post, we’re excited to showcase how AWS ISV Partner Artificial Genius is using Amazon SageMaker AI and Amazon Nova to solve this challenge.

This approach uses the generative power of Amazon Nova to understand context but applies a deterministic layer to verify and produce output.

To create this third-generation capability, Artificial Genius uses SageMaker AI to perform a specific form of instruction tuning on Amazon Nova base models.

With this understanding, let’s explore the technical implementation of building a non-generative fine-tuning pipeline using Amazon Nova on SageMaker AI.

The team began with one model, identified a behavioral flaw (unwanted CoT), engineered a data-centri…

4 days, 4 hours ago @ aws.amazon.com
Run NVIDIA Nemotron 3 Super on Amazon Bedrock

Nemotron 3 Super is now available as a fully managed and serverless model on Amazon Bedrock, joining the Nemotron Nano models that are already available within the Amazon Bedrock environment.

With NVIDIA Nemotron open models on Amazon Bedrock, you can accelerate innovation and deliver tangible business value without managing infrastructure complexities.

Get Started with NVIDIA Nemotron 3 Super in Amazon Bedrock.

Try it now: Head over to the Amazon Bedrock Console to experiment with NVIDIA Nemotron 3 Super in the model playground.

1 week, 1 day ago @ aws.amazon.com
Use RAG for video generation using Amazon Bedrock and Amazon Nova Reel

Using Amazon Bedrock, Amazon Nova Reel, the Amazon OpenSearch Service vector engine, and Amazon Simple Storage Service (Amazon S3), the solution seamlessly integrates image retrieval, prompt-based video generation, and batch processing into a single automated workflow.

Solution overview: Our solution is designed to take a structured text prompt, retrieve the most relevant image, and use Amazon Nova Reel for video generation.

Prompt-based video generation – Users define an action prompt (for example, “Camera pans down”), which is combined with the retrieved image to generate a video using Amazon Nova Reel.
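Combining the retrieved image with the action prompt amounts to assembling an image-to-video request body. A sketch of that assembly step; the field names here (taskType, textToVideoParams, and the image envelope) are assumptions modeled on Bedrock-style request bodies and should be checked against the Nova Reel documentation before use:

```python
import base64
import json

def build_video_request(action_prompt: str, image_bytes: bytes) -> str:
    """Assemble a hypothetical image-to-video request body.
    Field names are illustrative, not a verified Nova Reel schema."""
    body = {
        "taskType": "TEXT_VIDEO",
        "textToVideoParams": {
            "text": action_prompt,
            "images": [{
                "format": "png",
                "source": {"bytes": base64.b64encode(image_bytes).decode()},
            }],
        },
    }
    return json.dumps(body)

req = build_video_request("Camera pans down", b"\x89PNG-fake")
print(json.loads(req)["textToVideoParams"]["text"])  # Camera pans down
```

In the full pipeline, the image bytes come from the retrieval step and the serialized body is what gets submitted for batch video generation.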

1 week, 1 day ago @ aws.amazon.com
Introducing V-RAG: revolutionizing AI-powered video production with Retrieval Augmented Generation

This post introduces Video Retrieval-Augmented Generation (V-RAG), an approach to help improve video content creation.

By combining retrieval augmented generation with advanced video AI models, V-RAG offers an efficient and reliable solution for generating AI videos.

Video generation customization: Text prompting can only get you so far with video generation.

V-RAG, an effective approach to video generation customization: Video Retrieval-Augmented Generation (V-RAG) builds upon image-to-video technology to expand video customization capabilities.

This approach offers a promising path for AI-powered video generation, enabling organizations to create compelling visual content.

1 week, 1 day ago @ aws.amazon.com
NVIDIA
last post 1 day, 5 hours ago
Into the Omniverse: NVIDIA GTC Showcases Virtual Worlds Powering the Physical AI Era

At the center of this shift are new frontier models for physical AI, including NVIDIA Cosmos 3, NVIDIA Isaac GR00T N1.7 and NVIDIA Alpamayo 1.5.

NVIDIA also released the NVIDIA Physical AI Data Factory Blueprint, designed to push the state of the art in world modeling, humanoid skills and autonomous driving, as well as the NVIDIA Omniverse DSX Blueprint for AI factory digital twin simulation.

Compute Is Data: Real-World Data Is No Longer the Moat. Real-world data used to function as a moat for physical AI, but it doesn’t scale.

To help address this, NVIDIA introduced at GTC its Physical AI Data Factory Blueprint, an open reference architecture that transforms compute into large-scale, high-q…

1 day, 5 hours ago @ blogs.nvidia.com
Game On: Five New Titles Now Streaming on GeForce NOW

This week, five new titles are ready to play instantly in the cloud gaming platform’s library.

Plus, Honkai: Star Rail Version 4.1, “Unraveled for Daybreak,” touches down.

Let’s Play Today: Honkai: Star Rail Version 4.1, “Unraveled for Daybreak,” is available now, bringing new adventures aboard the Astral Express.

The crew touches down at Star Rail FEST, a grand interstellar celebration packed with new zones, characters and challenges.

Play the latest Honkai: Star Rail update instantly on GeForce NOW — no installs, just starlight and action.

1 day, 7 hours ago @ blogs.nvidia.com
The Future of AI Is Open and Proprietary

As NVIDIA founder and CEO Jensen Huang told attendees at a special session on open frontier models at NVIDIA GTC, “Proprietary versus open is not a thing.

It’s proprietary and open.” That’s why the future of AI innovation isn’t about a single massive model.

Several Nemotron Coalition members joined other leaders building and consuming open models for a back-to-back panel session at GTC.

“There’s a flourishing ecosystem of powerful, closed models but equally capable open models that are going to be coming over the next couple years.” This combination of open and proprietary models drives advancements at frontier AI companies as well as in academia.

“I think the shape of AI is going to reflect …

2 days, 1 hour ago @ blogs.nvidia.com
Blowing Off Steam: How Power-Flexible AI Factories Can Stabilize the Global Energy Grid

In a recent white paper, Emerald AI — in collaboration with NVIDIA, EPRI, National Grid and Nebius — showcased how “power-flexible” AI factories can autonomously adjust their power usage during peak demand.

For AI factories, this could unlock significantly faster grid connections without waiting for massive, years-long infrastructure upgrades.

“With this technology, AI factories become friendly and helpful grid assets,” said Varun Sivaram, founder and CEO of Emerald AI.

Emerald AI recorded 100% alignment with over 200 power targets that EPRI and National Grid instructed the AI cluster to follow for this experiment.

“We’ve proved the value that this technology brings.” Scaling London’s Grid a…

2 days, 9 hours ago @ blogs.nvidia.com
Building NVIDIA Nemotron 3 Agents for Reasoning, Multimodal RAG, Voice, and Safety

Keep agents safe with Nemotron 3 Content Safety: As agents expand from text‑only to multimodal workflows, safety guardrails must evolve across inputs, retrieval, and outputs.

Nemotron 3 Content Safety is a compact 4B‑parameter multimodal safety model that detects unsafe or sensitive content across text and images.

With Nemotron and NVIDIA NeMo, you’re getting the building blocks for trustworthy, repeatable, and scalable digital assistants for your production agentic systems.

Get started today: Stay up-to-date on NVIDIA Nemotron by subscribing to NVIDIA news and following NVIDIA AI on LinkedIn, X, Discord, and YouTube.

Explore open Nemotron models and datasets on Hugging Face and Blueprints on …

3 days, 4 hours ago @ developer.nvidia.com
Advancing Open Source AI, NVIDIA Donates Dynamic Resource Allocation Driver for GPUs to Kubernetes Community

For the vast majority of enterprises, this workload runs on Kubernetes, an open source platform that automates the deployment, scaling and management of containerized applications.

“NVIDIA’s deep collaboration with the Kubernetes and CNCF community to upstream the NVIDIA DRA Driver for GPUs marks a major milestone for open source Kubernetes and AI infrastructure,” said Chris Aniszczyk, chief technology officer of CNCF.

In addition, NVIDIA announced at GTC new open source projects including the NVIDIA NemoClaw reference stack and NVIDIA OpenShell runtime for securely running autonomous agents.

In addition, following the release of NVIDIA Dynamo 1.0, NVIDIA is expanding the Dynamo ecosystem w…

3 days, 12 hours ago @ blogs.nvidia.com
How Autonomous AI Agents Become Secure by Design With NVIDIA OpenShell

Autonomous agents mark a new inflection point in AI.

The NVIDIA OpenShell runtime is being built to address this.

Part of NVIDIA Agent Toolkit, OpenShell is an open source, secure-by-design runtime for running autonomous agents such as claws.

OpenShell Provides an Enterprise-Grade Sandbox for Building Personal AI Assistants. NVIDIA NemoClaw is an open source reference stack that simplifies installing OpenClaw always-on assistants with the OpenShell runtime and NVIDIA Nemotron models in a single command.

Get started with NVIDIA OpenShell and launch a ready‑to‑use environment on NVIDIA Brev, or explore the open source project on GitHub.

4 days, 5 hours ago @ blogs.nvidia.com
NVIDIA GTC 2026: Live Updates on What’s Next in AI

For structured data, NVIDIA cuDF accelerates open source data processing engines such as Apache Spark, Presto, DuckDB, Polars and Velox, delivering up to 5x faster processing compared with CPU-only deployments.

At Snap, which serves more than 946 million active users, NVIDIA cuDF on GKE cut daily data processing costs by 76%.

In early experiments with Nestlé’s Order-to-Cash mart, watsonx.data workloads accelerated with NVIDIA cuDF ran five times faster, at 83% lower cost.

Monday, March 16, 1:30 p.m. PT 🔗 NVIDIA Launches cuEST for Accelerated Quantum Chemistry in Semiconductor Design. NVIDIA this week launched NVIDIA cuEST, a new NVIDIA CUDA-X library that shifts electronic-structure …

1 week ago @ blogs.nvidia.com
Smooth Moves: 90 Frames-Per-Second Virtual Reality Arrives on GeForce NOW

This week, GeForce NOW offers smoother sights in virtual reality (VR) and a sprawling new land to conquer.

And Crimson Desert, which recently surpassed 3 million wishlist additions on Steam, debuts in the cloud with GeForce RTX 5080‑class power.

With support for up to 90 fps for Ultimate members, gameplay feels smoother, movement more natural and action more comfortable.

Just fire up GeForce NOW on supported VR platforms, step into a virtual big screen and let the cloud handle the heavy lifting.

From chill sessions in a virtual theater to high‑octane firefights, 90 fps helps keep every moment looking sharp and feeling responsive.

1 week, 1 day ago @ blogs.nvidia.com
From Simulation to Production: How to Build Robots With AI

The next generation of robots will be generalist-specialists — capable of understanding instructions and learning broad skills while also trainable for specialized tasks.

To accelerate this shift, the open NVIDIA Isaac platform provides robotics developers with everything they need — models, data pipelines, simulation frameworks, runtime libraries — to build a robot and deploy it at scale with NVIDIA’s three-computer solution.

NVIDIA even provides an open VLA model, NVIDIA Isaac GR00T N, which gives developers a powerful foundation to bootstrap and post-train their own robotic intelligence.

With the latest agent-friendly NVIDIA Isaac GR00T models, Isaac robot simulation and learning framewo…

1 week, 2 days ago @ blogs.nvidia.com
More Than Meets the Eye: NVIDIA RTX-Accelerated Computers Now Connect Directly to Apple Vision Pro

At the NVIDIA GTC global AI conference running this week in San Jose, NVIDIA and Apple detailed how visionOS now supports NVIDIA CloudXR with foveated streaming, enabling apps to display high-resolution, low-latency immersive content on Apple Vision Pro.

This feature intelligently optimizes rendering resolution based on approximately where the user is looking, while strictly protecting user gaze data.
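Foveated streaming allocates full resolution only near the gaze point and degrades it with angular distance. A toy model of that tradeoff; the falloff curve and thresholds below are illustrative assumptions, not the actual CloudXR policy:

```python
def foveated_scale(angle_deg: float) -> float:
    """Rendering-resolution scale as a function of angular distance
    from the gaze point. Piecewise falloff; thresholds are illustrative."""
    if angle_deg <= 5.0:       # foveal region: full resolution
        return 1.0
    if angle_deg <= 20.0:      # parafoveal: linear falloff toward 0.5
        return 1.0 - 0.5 * (angle_deg - 5.0) / 15.0
    return 0.25                # periphery: quarter resolution

print([foveated_scale(a) for a in (0, 10, 30)])
```

The bandwidth win comes from the periphery covering most of the field of view while contributing little perceived detail.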

“Apple Vision Pro is redefining what professionals can do with spatial computing, enabling teams to visualize, collaborate and work with extraordinary fidelity in entirely new ways,” said Jeff Norris, senior director of the vision products group at Apple.

CloudXR for visionOS enables all three…

1 week, 3 days ago @ blogs.nvidia.com
NVIDIA, Telecom Leaders Build AI Grids to Optimize Inference on Distributed Networks

Global Operators Turn Distributed Networks Into AI Grids. Across six major operators, AI grids are moving from concept to reality.

This demonstrates how cell sites and mobile switching offices can support distributed edge AI workloads while continuing to deliver advanced 5G connectivity.

New AI‑Native Services Put Telecom AI Grids to Work. AI grids are becoming foundational to a new class of AI‑native applications — real‑time, hyper‑personalized, concurrent and token-intensive.

AI Grid Reference Design and Ecosystem. The NVIDIA AI Grid Reference Design defines the building blocks — including NVIDIA accelerated computing, networking and software platforms — for deploying and orchestrating AI acros…

1 week, 3 days ago @ blogs.nvidia.com
GTC Spotlights NVIDIA RTX PCs and DGX Sparks Running Latest Open Models and AI Agents Locally

These devices, like the NVIDIA DGX Spark desktop AI supercomputer or dedicated NVIDIA RTX PCs, are ideal for running personal agents — privately and for free.

New Open Models Bring Cloud-Level Quality to Local Agents. The next generation of local models — with increasingly large context windows — delivers the intelligence to run agents on PC.

The first features available in NemoClaw are NVIDIA Nemotron open models and the NVIDIA OpenShell runtime.

Check out other RTX AI Garage posts for more information on fine-tuning models with NVIDIA GeForce RTX GPUs.

Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

1 week, 3 days ago @ blogs.nvidia.com
Snap Decisions: How Open Libraries for Accelerated Data Processing Boost A/B Testing for Snapchat

To keep pace, its parent company Snap has adopted open data processing libraries from NVIDIA on Google Cloud services to boost development.

Snap runs thousands of these experiments each month — processing over 10 petabytes of data within a three-hour window each morning using the Apache Spark distributed framework.

The open library for accelerated data processing builds on the power of the NVIDIA cuDF GPU DataFrame library while scaling it for the Apache Spark distributed computing framework.

“The Spark accelerator is a perfect match for our workloads.” Looking ahead, the Snap team plans to integrate the Spark accelerator beyond the A/B team to a broader range of production workloads.
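At its core, each of those experiments compares a metric between control and treatment arms. A stdlib-only sketch of the per-experiment aggregation with toy data; engines like Spark with cuDF run the same groupby-style computation at petabyte scale:

```python
from collections import defaultdict

# Toy event log: (experiment_arm, converted) pairs.
events = [
    ("control", 1), ("control", 0), ("control", 0), ("control", 1),
    ("treatment", 1), ("treatment", 1), ("treatment", 0), ("treatment", 1),
]

def conversion_rates(rows):
    """Group events by arm and compute each arm's conversion rate."""
    totals, hits = defaultdict(int), defaultdict(int)
    for arm, converted in rows:
        totals[arm] += 1
        hits[arm] += converted
    return {arm: hits[arm] / totals[arm] for arm in totals}

rates = conversion_rates(events)
lift = rates["treatment"] - rates["control"]
print(rates, lift)  # control 0.5, treatment 0.75, lift 0.25
```

The acceleration story is about doing this aggregation across billions of events within a fixed morning window, not about changing the statistic itself.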

Read m…

1 week, 3 days ago @ blogs.nvidia.com
Roche Scales NVIDIA AI Factories Globally to Accelerate Drug Discovery, Diagnostic Solutions and Manufacturing Breakthroughs

Harnessing Drug Discovery for Breakthroughs at a New Scale. AI is already transforming Roche and Genentech’s discovery engine, and the new infrastructure will accelerate this process further.

With NVIDIA Blackwell and the NVIDIA BioNeMo platform running on its AI factories, Roche can train and fine‑tune biological and molecular foundation models, integrate proprietary datasets and expand AI‑driven lab automation, all powering the Lab-in-the-Loop.

This allows Roche to explore far larger swaths of the biological and chemical space — and to do it with greater speed.

Driving Pharmaceutical Manufacturing Powered by Digital Twins. Roche is also modernizing pharmaceutical manufacturing with NVIDIA Omn…

1 week, 4 days ago @ blogs.nvidia.com
Facebook
last post 1 week, 2 days ago
Friend Bubbles: Enhancing Social Discovery on Facebook Reels

Friend bubbles in Facebook Reels highlight Reels your friends have liked or reacted to, helping you discover new content and making it easier to connect over shared interests.

Friend bubbles enhance the social experience on Facebook Reels by helping you discover content your friends enjoy, creating a shared viewing experience and sparking new conversations.

Along with additional optimizations in the underlying method, this approach enabled us to ship friend bubbles while preserving core Reels performance.

Friend bubbles work because the signal is high value: It adds meaningful social context that helps people decide what’s worth watching.

Engagement also scales consistently with the number …

1 week, 2 days ago @ engineering.fb.com
Ranking Engineer Agent (REA): The Autonomous AI Agent Accelerating Meta’s Ads Ranking Innovation

Meta’s Ranking Engineer Agent (REA) autonomously executes key steps across the end-to-end machine learning (ML) lifecycle for ads ranking models.

Powering these interactions are highly sophisticated, complex and massively distributed machine learning (ML) models that continuously evolve to serve both advertisers and people who use the platforms.

Optimizing these ML models has traditionally been time-consuming.

To address this, Meta built the Ranking Engineer Agent, an autonomous AI agent designed to drive the end-to-end ML lifecycle and iteratively evolve Meta’s ads ranking models at scale.

ML training jobs run for hours or days, far beyond what any session-bound assistant can manage.

1 week, 3 days ago @ engineering.fb.com
Patch Me If You Can: AI Codemods for Secure-by-Default Android Apps

Nowhere is this more apparent than in mobile security, where a single class of vulnerability can be replicated across hundreds of call sites scattered throughout a sprawling, multi-app codebase serving billions of users.

Meta’s Product Security team has developed a two-pronged strategy to address this: designing secure-by-default frameworks that wrap potentially unsafe Android OS APIs and make the secure path the easiest path for developers, and leveraging generative AI to automate the migration of existing code to those frameworks at scale.

The result is a system that can propose, validate, and submit security patches across millions of lines of code with minimal friction for the engineers w…
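The "wrap the unsafe API so the secure path is the default" idea is framework-agnostic. A Python sketch of the pattern (Meta’s actual frameworks target Android APIs such as intent launching; the names and policy below are hypothetical):

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"https"}          # secure-by-default policy
ALLOWED_HOSTS = {"example.com"}      # hypothetical allow-list

class BlockedLaunchError(Exception):
    """Raised when a launch request fails the default policy."""

def secure_launch(url: str) -> str:
    """Wrapper around a hypothetical raw launch API: every caller gets
    validation for free, making the secure path the easiest path."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES or parsed.hostname not in ALLOWED_HOSTS:
        raise BlockedLaunchError(f"blocked: {url}")
    return f"launched:{url}"         # stand-in for the unsafe raw API call

print(secure_launch("https://example.com/page"))
```

The AI-assisted migration then amounts to rewriting each raw call site to go through the wrapper, which is mechanical enough to automate and validate at scale.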

2 weeks ago @ engineering.fb.com
RCCLX: Innovating GPU communications on AMD platforms

RCCLX is fully integrated with Torchcomms and aims to empower researchers and developers to accelerate innovation, regardless of their chosen backend.

We want to iterate on collectives, transports, and novel features quickly on AMD platforms.

With RCCLX, we have integrated CTran on AMD platforms, enabling AllToAllvDynamic, a GPU-resident collective.

These features provide significant performance improvements on AMD platforms and we are excited to share this with the community.

RCCLX Quick Start Guide: Install Torchcomms with the RCCLX backend by following the installation instructions in the Torchcomms repo.

1 month ago @ engineering.fb.com
The Death of Traditional Testing: Agentic Development Broke a 50-Year-Old Field, JiTTesting Can Revive It

A Catching JiTTest focuses specifically on finding regressions introduced by a code change.

Agentic development dramatically increases the pace of code change, straining the test-development burden and scaling the cost of false positives and test maintenance to the breaking point.

And since the JiTTest itself is LLM-generated, it can often infer the plausible intention of a code change and simulate possible faults that may result from it.

With them, engineers no longer have to spend time writing, reviewing, and testing complex test code.

READ THE PAPER: Just-in-Time Catching Test Generation at Meta

1 month, 2 weeks ago @ engineering.fb.com
Adapting the Facebook Reels RecSys AI Model Based on User Feedback

Our new User True Interest Survey (UTIS) model now helps surface more niche, high-quality content and boosts engagement, retention, and satisfaction.

Our paper, “Improve the Personalization of Large-Scale Ranking Systems by Integrating User Survey Feedback,” shares full details on this work.

The main candidate ranking model used by the platform is a large multi-task, multi-label model.

We trained a lightweight UTIS alignment model layer on the collected user survey responses using existing predictions of the main model as input features.

The UTIS model consistently outperformed the baseline, driving higher user engagement and retention.
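An alignment layer "on top of existing predictions" is just a small model whose input features are the main model’s outputs. A minimal sketch with fixed, made-up weights standing in for parameters that would actually be trained on survey responses; the feature names are hypothetical:

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical learned weights over the main model's task predictions.
WEIGHTS = {"p_like": 1.2, "p_share": 0.8, "p_watch_time": 2.0}
BIAS = -1.5

def utis_score(main_preds: dict) -> float:
    """Alignment-layer score approximating surveyed 'true interest',
    computed as a logistic function of the main model's predictions."""
    z = BIAS + sum(WEIGHTS[k] * main_preds[k] for k in WEIGHTS)
    return sigmoid(z)

s = utis_score({"p_like": 0.9, "p_share": 0.1, "p_watch_time": 0.7})
print(round(s, 3))
```

Keeping the layer lightweight is what makes it cheap to train on relatively scarce survey labels while reusing the full power of the main ranking model.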

2 months, 1 week ago @ engineering.fb.com
DrP: Meta’s Root Cause Analysis Platform at Scale

DrP’s key components include: Expressive SDK: The DrP SDK allows engineers to codify investigation workflows into analyzers.

Post-processing system: After an investigation, the post-processing system can take automated actions based on the analysis results.

Bootstrap code: The DrP SDK provides bootstrap code to create a template analyzer with pre-populated boilerplate code.

Data access and analysis: The SDK includes libraries for data access and analysis, such as dimension analysis and time series correlation.

This provides immediate analysis results to on-call engineers.

3 months, 1 week ago @ engineering.fb.com
How AI Is Transforming the Adoption of Secure-by-Default Mobile Frameworks

Generative AI and automation accelerate the adoption of secure frameworks at scale, enabling consistent security enforcement and efficient migration across Meta’s vast codebase.

How We Design Secure-by-Default Frameworks at Meta: Designing secure-by-default frameworks for use by a large number of developers shipping vastly different features across multiple apps is an interesting challenge.

There shouldn’t be one security framework that covers all security issues, and not every security issue is general enough to deserve its own framework.

Now that we’ve looked at the design philosophy behind our frameworks, let’s look at one of our most widely used Android security frameworks, SecureLinkLaun…

3 months, 1 week ago @ engineering.fb.com
Zoomer: Powering AI Performance at Meta’s Scale Through Intelligent Debugging and Optimization

Zoomer has delivered training-time reductions and significant QPS improvements, making it the de facto tool for AI performance optimization across Meta’s entire AI infrastructure.

Zoomer is Meta’s automated, one-stop-shop platform for performance profiling, debugging, analysis, and optimization of AI training and inference workloads.

AI Performance Optimization Using Zoomer

Zoomer is an automated debugging and optimization platform that works across all of our AI model types (ads recommendations, GenAI, computer vision, etc.).

Memory Analysis: Comprehensive analysis of GPU memory usage patterns, allocation tracking, and leak detection.

Realtime Memory Profiling: GPU memory allocation track…

4 months ago @ engineering.fb.com
Open Source Is Good for the Environment

But have you heard about open hardware?

And did you know open source can have a positive impact on the environment?

On this episode of the Meta Tech Podcast, Pascal Hartig sits down with Dharmesh and Lisa to talk about all things open hardware, and Meta’s biggest announcements from the 2025 Open Compute Project (OCP) Summit – including a new open methodology for leveraging AI to understand Scope 3 emissions.

You’ll also hear how AI and open hardware are helping Meta push to achieve net zero emissions in 2030, including how AI is being used to develop new concrete mixes for data center construction.

And if you’re interested in learning more about career opportunities at Meta visit the Meta C…

4 months, 1 week ago @ engineering.fb.com
Meta’s Generative Ads Model (GEM): The Central Brain Accelerating Ads Recommendation AI Innovation

We’re sharing details about Meta’s Generative Ads Recommendation Model (GEM), a new foundation model that delivers increased ad performance and advertiser ROI by enhancing other ads recommendation models’ ability to serve relevant ads.

GEM propagates its learnings, leveraging a suite of post-training techniques across the entire ads model fleet, enabling a paradigm shift in Meta’s Ads Recommendation system.

GEM leverages enhanced training scalability that efficiently utilizes thousands of GPUs for building and iterating on an LLM-scale ads foundation model.

The Generative Ads Recommendation Model (GEM) is Meta’s most advanced ads foundation model, built on an LLM-inspired paradigm and trained …

4 months, 2 weeks ago @ engineering.fb.com
Scaling LLM Inference: Innovations in Tensor Parallelism, Context Parallelism, and Expert Parallelism

At Meta, we are constantly pushing the boundaries of LLM inference systems to power applications such as the Meta AI App.

These metrics highlight the distinct computational demands of LLM inference: Prefill is compute-intensive, while decoding is memory bandwidth-intensive.
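
The prefill/decode distinction can be made concrete with a toy roofline-style estimate. Everything below is an illustrative assumption (model size, prompt length, fp16-style 2 bytes per parameter), not a figure from the post:

```python
# Toy estimate of arithmetic intensity (FLOPs per byte of weights moved)
# for prefill vs. decode. Numbers are illustrative assumptions only.

def arithmetic_intensity(tokens_processed, n_params, bytes_per_param=2):
    # Each token costs ~2 FLOPs per parameter (multiply + add in the matmuls),
    # while the weights are read from memory once per step regardless of how
    # many tokens share that step.
    flops = 2 * n_params * tokens_processed
    bytes_moved = n_params * bytes_per_param  # weight reads dominate traffic
    return flops / bytes_moved

n_params = 8e9  # hypothetical 8B-parameter model
prefill = arithmetic_intensity(tokens_processed=4096, n_params=n_params)  # whole prompt at once
decode = arithmetic_intensity(tokens_processed=1, n_params=n_params)      # one token per step

print(f"prefill intensity ≈ {prefill:.0f} FLOPs/byte")  # high -> compute-bound
print(f"decode  intensity ≈ {decode:.0f} FLOPs/byte")   # ~1  -> bandwidth-bound
```

With thousands of tokens amortizing each weight read, prefill lands far above typical accelerator ridge points, while single-token decode sits near 1 FLOP/byte, which is why decoding is limited by memory bandwidth.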

Communication: Communication latency increases when parallelizing across multiple hosts.

In EP-based inference, we utilize a two-shot, all-to-all communication pattern to exchange tokens between data parallelism and expert parallelism ranks based on routing.

We are committed to continuous innovation to ensure efficient and scalable LLM inference for millions of users worldwide.

5 months, 1 week ago @ engineering.fb.com
How Meta Is Leveraging AI To Improve the Quality of Scope 3 Emission Estimates for IT Hardware

We leveraged AI to help us improve this database and understand our Scope 3 emissions associated with IT hardware by:

Identifying similar components and applying existing PCFs to similar components that lack these carbon estimates.

Understanding the carbon footprint of IT racks and applying generative AI (GenAI) as a categorization algorithm to create a new and standard taxonomy.

If these similar components are not identified, their carbon footprint estimates will remain at a lower data quality.

These similar components can be mapped to a representative proxy PCF, allowing us to use high-quality PCF data in similar components.

For example, we can scale the carbon footprint calculation for a …

5 months, 2 weeks ago @ engineering.fb.com
OCP Summit 2025: The Open Future of Networking Hardware for AI

At Open Compute Project Summit (OCP) 2025, we’re sharing details about the direction of next-generation network fabrics for our AI training clusters.

At Meta, we believe that open hardware is a catalyst for innovation — especially as data center infrastructure increasingly supports new and emerging AI technologies.

Open hardware plays a crucial role in enabling disaggregation, allowing us to break down traditional data center technologies into their core components.

Today, through OCP, we continue to advance open network technologies for the next generation of AI applications.

Ethernet for Scale-Up Networking in OCP: Meta’s Industry Leadership

At Meta, we recognize that the future of AI and …

5 months, 2 weeks ago @ engineering.fb.com
LLMs Are the Key to Mutation Testing and Better Compliance

By leveraging LLMs we’ve been able to overcome the barriers that have prevented mutation testing from being efficiently deployed at scale.

Our presentations shared insights into how we’ve used LLMs to solve the major barriers that have prevented mutation testing at scale and highlighted new areas in automated software testing where LLMs can have a significant impact.

Mutation Testing Isn’t Scalable

Traditional mutation testing generates a very large number of mutants, making it computationally expensive and difficult to scale to large industrial codebases.

Mutation Testing Requires a Lot of Computational Resources

Mutation testing is costly in terms of computational resources and developer ef…

5 months, 4 weeks ago @ engineering.fb.com
Uber Engineering
last post: none
neptune.ai
last post: 3 months, 3 weeks ago
We are joining OpenAI

Piotr Niedźwiedź, CEO/CTO and founder of neptune.ai: I’m excited to share that we’ve entered into a definitive agreement to be acquired by OpenAI, subject to closing conditions.

We are thrilled to join the OpenAI team and help their AI researchers build better models faster.

“Neptune is a metrics dashboard company.” We’ve worked closely with OpenAI to create the metrics dashboard that helps teams building foundation models.

Our future with OpenAI

Neptune will join OpenAI and continue to support AI researchers with tools to monitor, debug, and evaluate frontier models.

We are looking forward to working with top AI researchers and supporting OpenAI’s mission of ensuring that AGI benefits all of hu…

3 months, 3 weeks ago @ neptune.ai
Synthetic Data for LLM Training

For instance, financial data is highly sensitive and protected by very strict regulations, and synthetic data mimics the real data distribution without revealing customer information.

Read more about how leading foundation model teams curate their training data and other topics in the State of Foundation Model Training Report 2025.

Choosing the right synthetic data generation technique depends on the type of data and its complexity.

Synthetic tabular data generation is a promising direction to overcome these challenges by learning the distribution of the tabular data.

Post-processing

As the distribution of tabular data is highly complex, synthetic tabular data generation is very ch…

4 months, 2 weeks ago @ neptune.ai
What are LLM Embeddings: All you Need to Know

TL;DR LLM embeddings are the numerical, vector representations of text that Large Language Models (LLMs) use to process information.

Unlike their predecessor word embeddings, LLM embeddings are context-aware and dynamically change to capture semantic and syntactic relationships based on the surrounding text.

What are the applications of LLM embeddings?

Word Embeddings

Sparse word embeddings: One-Hot Vectors (1970s), TF-IDF (1980s), Co-Occurrence Matrix

Static word embeddings: Word2Vec (2013), GloVe (2014)

Contextualized word embeddings: ELMo (2018), GPT-1 (2018), BERT (2018), LLAMA (2023), DeepSeek-V1 (2023), GPT-4 (2023)

Static word embeddings

Static word embeddings, such as word2vec in 2013, marked a significant development.…
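
As a toy illustration of the sparse, co-occurrence-based era in that timeline, here is a minimal co-occurrence matrix built from an invented two-sentence corpus (the corpus and window size are assumptions for the example, not from the article):

```python
# Build sparse "embeddings" as rows of a word co-occurrence matrix,
# the pre-Word2Vec approach mentioned in the timeline.
from collections import defaultdict

corpus = ["the cat sat on the mat", "the dog sat on the rug"]
window = 2  # count neighbors within 2 positions on either side

cooc = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                cooc[w][words[j]] += 1

vocab = sorted({w for s in corpus for w in s.split()})

def vec(w):
    # A word's "embedding": its co-occurrence counts over the whole vocab.
    return [cooc[w][v] for v in vocab]

print("vocab:", vocab)
print("sat ->", vec("sat"))  # one fixed vector per word, regardless of context
```

The key contrast with LLM embeddings described above: this vector for "sat" is identical in every sentence, whereas a contextual model would produce a different vector for each occurrence.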

4 months, 3 weeks ago @ neptune.ai
Detecting and Fixing ‘Dead Neurons’ in Foundation Models

TL;DR Dead neurons silently waste compute and reduce effective model capacity in foundation models.

Dead neurons’ impact

Recent studies into dead neurons in the context of foundation models show interesting, albeit worrying, results.

These large reported fractions of dead neurons in foundation models are a concern from a computational perspective.

Before we move on to discuss how to detect and fix dead neurons, let’s touch upon an important distinction between dead neurons and vanishing gradients.

Further reading: How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models

Visualizing activation distributions

Is your foundation model suffering from dead neurons?
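
One simple detection approach consistent with the idea above is to flag units whose post-ReLU activation is zero for every sample in a probe batch. The sketch below uses fabricated activation values for illustration; in practice you would capture real activations, for example with forward hooks:

```python
# Hedged sketch: a ReLU unit is "dead" if it never produces a positive
# activation across a probe batch. Activations here are made-up numbers.

def dead_units(activations, eps=0.0):
    # activations: list of per-sample vectors (post-ReLU outputs of one layer)
    n_units = len(activations[0])
    return [u for u in range(n_units)
            if all(sample[u] <= eps for sample in activations)]

batch = [
    [0.7, 0.0, 0.3, 0.0],
    [1.2, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.9, 0.0],
]
print(dead_units(batch))  # units 1 and 3 never fire on this batch
```

A small positive eps catches near-dead units as well; the larger the probe batch, the more confident the "dead" verdict.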

5 months ago @ neptune.ai
Part 2: Instruction Fine-Tuning: Evaluation and Advanced Techniques for Efficient Training

In the first part of this series, we covered the fundamentals of instruction fine-tuning (IFT).

def calculate_irs(instruction, output, reference_model):
    evaluation_prompt = f"""
    Instruction: {instruction}
    Model Output: {output}
    Rate how well the output follows the instruction on these criteria:
    1. …

HINT addresses a computational inefficiency in standard instruction fine-tuning: repeatedly reprocessing the same task instruction with every input example.

Read more about foundation model training infrastructure and other topics in Neptune’s 2025 State of Foundation Model Training Report.

First, during initial instruction fine-tuning across multiple diverse tasks, the model learns genera…

5 months ago @ neptune.ai
How to Optimize LLM Inference

Large Language Model (LLM) inference at scale is challenging as it involves transferring massive amounts of model parameters and data and performing computations on large tensors.

In the following, we’ll use the Llama model family architecture as a specific example to understand the LLM workload at inference.

For a far more detailed analysis of the LLM workload at inference, see the chapter All About Transformer Inference in the book How to Scale Your Model, published by Google DeepMind.

See also: How to Run LLMs Locally

A quick primer on hardware for LLM inference

A typical LLM inference cluster consists of several nodes, each with a multi-core CPU and multiple accelerator devices, …

5 months, 2 weeks ago @ neptune.ai
A Researcher’s Guide to LLM Grounding

In this article, we’ll explore the fundamental concepts of LLM grounding as well as strategies for optimally grounding models.

What is LLM grounding?

LLM grounding is analogous.

If relevant knowledge cannot be inferred from the data, then LLM grounding cannot yield more relevant responses.

When grounding LLMs using RAG, consider retaining only a few of the top hits (i.e., top-k) for your retrieval queries.
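
The top-k retention step can be sketched as scoring documents by cosine similarity and keeping only the k best hits. Vectors and document IDs below are toy values invented for the example:

```python
# Minimal sketch of top-k retrieval filtering before grounding an LLM.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k):
    # Keep only the k most similar documents for the prompt context.
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

docs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
print(top_k([1.0, 0.05], docs, k=2))  # discards the off-topic document "c"
```

Smaller k keeps the context focused and cheap; larger k improves recall at the cost of more irrelevant tokens in the prompt.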

6 months ago @ neptune.ai
Instruction Fine-Tuning: Fundamentals, Architecture Modifications, and Loss Functions

TL;DR Instruction fine-tuning (IFT) refines pre-trained large language models (LLMs) to follow specific task instructions by training on prompt-response pairs.

Instruction fine-tuning in a nutshell

IFT tailors LLMs to follow user instructions by bridging their inherent next-word prediction with human-defined objectives.

Related: LLM Fine-Tuning and Model Selection Using Neptune and Transformers

Parameter-efficient instruction fine-tuning

While major foundation models like GPT-4 or Llama-2 undergo full-parameter instruction fine-tuning during development, parameter-efficient fine-tuning (PEFT) methods have become widely adopted for instruction fine-tuning since the LoRA paper was publi…

6 months, 1 week ago @ neptune.ai
Understanding Prompt Injection: Risks, Methods, and Defense Measures

Prompt injection 101: When prompts go rogue

The term ‘Prompt Injection’ comes from SQL injection attacks.

Another claim of independent discovery suggests that Riley Goodside publicly demonstrated a prompt injection in a tweet back in September 2022.

Indirect prompt injection attacks are classified into active, passive, user-driven, and virtual prompt attacks.

Virtual prompt injection attacks

This injection type is closely related to the passive injection attacks described previously.

Prompt injection: current challenges & lessons learned

The arms race between prompt injection attacks and defenses is a challenge for researchers, developers, and users.

7 months, 3 weeks ago @ neptune.ai
SabiYarn: Advancing Low-Resource Languages With Multitask NLP Pre-Training [Paper Reflections]

This simple idea avoids computing loss on input prompt tokens the model already knows.

Prompt tokens are (too) expensive in low-resource settings

During pre-training, LLMs are trained in causal language modeling through a next-token prediction task.

Given the training example “Translate English to Yoruba: I love rice. => Mo fẹ́ràn ìrẹsì,” the model is trained to predict every token, from the prompt to the actual answer:

Step | Prompt so far | Next token
1 | Translate | English (static prompt)
2 | Translate English | to (static prompt)
3 | Translate English to | Yoruba: (static prompt)
4 | Translate English to Yoruba: | I
5 | Translate English to Yoruba: I | love
6 | Translate English to Yoruba: I love | rice.

This is straightforward to implement in PyTorch by masking out the prompt tokens in the label …
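
The usual way to express that masking is to copy the input ids into the labels and overwrite the prompt positions with -100, the ignore_index that PyTorch's cross-entropy (and Hugging Face trainers) skip by default. A plain-Python sketch with made-up token ids:

```python
# Sketch of prompt-token loss masking: positions set to IGNORE_INDEX
# contribute no loss, so training signal comes only from the answer tokens.
IGNORE_INDEX = -100  # PyTorch cross-entropy ignore_index convention

def mask_prompt_labels(input_ids, prompt_len):
    labels = list(input_ids)
    for i in range(prompt_len):
        labels[i] = IGNORE_INDEX  # no loss on prompt tokens
    return labels

input_ids = [21, 87, 402, 9, 1337, 55]  # [ prompt .......... ][ answer ]
labels = mask_prompt_labels(input_ids, prompt_len=4)
print(labels)  # [-100, -100, -100, -100, 1337, 55]
```

In actual PyTorch code the same thing is one line on a tensor, labels[:, :prompt_len] = -100, applied per example before the forward pass.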

7 months, 4 weeks ago @ neptune.ai
How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models

What gradient issues occur during foundation model training?

During training, gradient descent updates model parameters by computing the gradients of the loss function via forward and backward passes.

The green line corresponds to a learning rate of 10, while the orange line has a learning rate of 0.1.

The gradient norm for the orange line with LR = 0.1 is very high in the first steps, while the gradient norm of the green line with LR = 10 diverges to NaN after a few steps.

Techniques for gradient stabilization

Monitoring gradient norms and training loss provides insights into the learning dynamics of foundation models.
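
One widely used stabilization technique in this setting is global-norm gradient clipping. Below is a minimal sketch on plain lists mirroring the behavior of torch.nn.utils.clip_grad_norm_; the gradient values are toy numbers, not from the article:

```python
# Hedged sketch of global-norm gradient clipping: if the overall gradient
# norm exceeds max_norm, rescale all gradients by max_norm / norm.
import math

def clip_by_global_norm(grads, max_norm):
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm or total == 0.0:
        return grads, total
    scale = max_norm / total
    return [g * scale for g in grads], total

grads = [3.0, 4.0]  # global norm 5.0
clipped, norm = clip_by_global_norm(grads, max_norm=1.0)
print(norm, clipped)  # norm 5.0, clipped ≈ [0.6, 0.8]
```

Logging the pre-clip norm (returned above) is exactly the monitoring signal the article describes: a norm that spikes or diverges to NaN flags unstable learning rates.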

8 months, 3 weeks ago @ neptune.ai
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection]

Unstructured pruning removes individual weights, while structured pruning removes entire model components.

In the context of MoEs, the expert structures that emerge from training correspond to such patterns, so pruning experts is a natural fit for structured pruning.

Thus, structured pruning does not significantly decrease kurtosis, leaving plenty of margin for unstructured pruning.

Since structured pruning primarily reduces architectural redundancy rather than reshaping the underlying weight distribution, our two-phase approach—leveraging unstructured pruning after structured pruning—outperforms unstructured-only pruning.
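
The unstructured phase can be sketched as magnitude pruning: zero out the smallest-magnitude weights until a target sparsity is reached. This is a generic illustration under the assumption that the structured (expert-level) phase has already run; the weights are toy values, not from the paper:

```python
# Hedged sketch of unstructured magnitude pruning to a target sparsity.

def magnitude_prune(weights, sparsity):
    n_prune = int(len(weights) * sparsity)
    # indices of the n_prune smallest-magnitude weights
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
print(magnitude_prune(w, sparsity=0.5))  # three smallest magnitudes zeroed
```

Because expert removal leaves the surviving weight distribution (and its kurtosis) largely intact, plenty of small-magnitude weights remain for this second pass to remove, which is the margin the paragraph above refers to.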

Since STUN does not make any assumption about base MoE models, it is generaliza…

9 months, 3 weeks ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
last post: 2 weeks, 6 days ago
I BUILT A FULLY AUTOMATIC MANSPLAINER

All information about GTC and the DGX Spark Raffle is here: https://www.ykilcher.com/gtc Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

Ethereu…

2 weeks, 6 days ago @ youtube.com
Traditional X-Mas Stream

Letsgooo

2 months, 4 weeks ago @ youtube.com
Traditional Holiday Live Stream

https://ykilcher.com/discord Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https:/…

2 months, 4 weeks ago @ youtube.com
TiDAR: Think in Diffusion, Talk in Autoregression (Paper Analysis)

Paper: https://arxiv.org/abs/2511.08923 Abstract:

Diffusion language models hold the promise of fast parallel generation, while autoregressive (AR) models typically excel in quality due to their causal structure aligning naturally with language modeling. This raises a fundamental question: can we achieve a synergy with high throughput, higher GPU utilization, and AR level quality? Existing methods fail to effectively balance these two aspects, either prioritizing AR using a weaker model for sequential drafting (speculative decoding), leading to lower drafting efficiency, or using some form of left-to-right (AR-like) decoding logic for diffusion, which still suffers from quality degradation …

3 months ago @ youtube.com
Titans: Learning to Memorize at Test Time (Paper Analysis)

Paper: https://arxiv.org/abs/2501.00663 Abstract:

Over more than a decade there has been an extensive research effort on how to effectively utilize recurrent models and attention. While recurrent models aim to compress the data into a fixed-size memory (called hidden state), attention allows attending to the entire context window, capturing the direct dependencies of all tokens. This more accurate modeling of dependencies, however, comes with a quadratic cost, limiting the model to a fixed-length context. We present a new neural long-term memory module that learns to memorize historical context and helps attention to attend to the current context while utilizing long past information. We sh…

3 months, 1 week ago @ youtube.com
[Paper Analysis] The Free Transformer (and some Variational Autoencoder stuff)

https://arxiv.org/abs/2510.17558 Abstract:

We propose an extension of the decoder Transformer that conditions its generative process on random latent variables which are learned without supervision thanks to a variational procedure. Experimental evaluations show that allowing such a conditioning translates into substantial improvements on downstream tasks. Author: François Fleuret

4 months, 3 weeks ago @ youtube.com
[Video Response] What Cloudflare's code mode misses about MCP and tool calling

Theo's Video: https://www.youtube.com/watch?v=bAYZjVAodoo

Cloudflare article: https://blog.cloudflare.com/code-mode/

5 months, 1 week ago @ youtube.com
[Paper Analysis] On the Theoretical Limitations of Embedding-Based Retrieval (Warning: Rant)

Paper: https://arxiv.org/abs/2508.21038 Abstract:

Vector embeddings have been tasked with an ever-increasing set of retrieval tasks over the years, with a nascent rise in using them for reasoning, instruction-following, coding, and more. These new benchmarks push embeddings to work for any query and any notion of relevance that could be given. While prior works have pointed out theoretical limitations of vector embeddings, there is a common assumption that these difficulties are exclusively due to unrealistic queries, and those that are not can be overcome with better training data and larger models. In this work, we demonstrate that we may encounter these theoretical limitations in realist…

5 months, 2 weeks ago @ youtube.com
AGI is not coming!

Jack Morris's investigation into GPT-OSS training data: https://x.com/jxmnop/status/1953899426075816164?t=3YRhVQDwQLk2gouTSACoqA&s=09

7 months, 2 weeks ago @ youtube.com
Context Rot: How Increasing Input Tokens Impacts LLM Performance (Paper Analysis)

Paper: https://research.trychroma.com/context-rot Abstract:

Large Language Models (LLMs) are typically presumed to process context uniformly—that is, the model should handle the 10,000th token just as reliably as the 100th. However, in practice, this assumption does not hold. We observe that model performance varies significantly as input length changes, even on simple tasks.

In this report, we evaluate 18 LLMs, including the state-of-the-art GPT-4.1, Claude 4, Gemini 2.5, and Qwen3 models. Our results reveal that models do not use their context uniformly; instead, their performance grows increasingly unreliable as input length grows. Authors: Kelly Hong, Anton Troynikov, Jeff Huber Links:

8 months, 1 week ago @ youtube.com
Energy-Based Transformers are Scalable Learners and Thinkers (Paper Review)

Paper: https://arxiv.org/abs/2507.02092

Code: https://github.com/alexiglad/EBT

Website: https://energy-based-transformers.github.io/ Abstract:

Inference-time computation techniques, analogous to human System 2 Thinking, have recently become popular for improving model performances. However, most existing approaches suffer from several limitations: they are modality-specific (e.g., working only in text), problem-specific (e.g., verifiable domains like math and coding), or require additional supervision/training on top of unsupervised pretraining (e.g., verifiers or verifiable rewards). In this paper, we ask the question "Is it possible to generalize these System 2 Thinking approaches, and de…

8 months, 1 week ago @ youtube.com
Henry AI Labs
last post: none
3blue1brown
last post: 1 day, 23 hours ago
The subset sum puzzle

Part of a series of monthly puzzlers. Stay subscribed to see the solution.

1 day, 23 hours ago @ youtube.com
Escher's most mathematically interesting piece

Escher's Print Gallery, and the tour of complex analysis it invites.

Check out our virtual career fair: 3b1b.co/talent

Join channel supporters to see videos early: 3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Original paper by de Smit and Lenstra:

https://pub.math.leidenuniv.nl/~smitbde/papers/2003-de_smit-lenstra-escher.pdf Timestamps: 0:00 - The print gallery

13:04 - Conformal maps from complex analysis

21:41 - The complex exponential

25:56 - The complex logarithm

32:32 - 3b1b Talent

33:14 - Constructing the key function

40:16 - The deeper math behind Escher ------------------ These animations are largely made us…

5 days, 7 hours ago @ youtube.com
Bacteria Grid Puzzle Solution

Part of a monthly series of puzzlers, in collaboration with MoMath and Peter Winkler

6 days, 9 hours ago @ youtube.com
The most underappreciated formula | Exploring high-dimensional spheres

On the volumes of higher-dimensional spheres

Explore the 3b1b virtual career fair: See https://3b1b.co/talent

Become a supporter for early views of new videos: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Thanks to UC Santa Cruz for letting me film there, and special thanks to Pedro Morales-Almazan for arranging everything. My video on Numberphile with a fun application of this problem: https://youtu.be/6_yU9eJ0NxA Timestamps:

0:00 - Introduction

1:01 - Random puzzle

6:16 - Outside the box

14:35 - Setting up the volume grid

21:14 - Why 4πr^2

25:21 - Archimedes in higher dimensions

36:17 - The general formul…

4 weeks ago @ youtube.com
The lattice bacteria puzzle

Part of a series of monthly puzzles, done in collaboration with MoMath.

https://momath.org/mindbenders

1 month, 1 week ago @ youtube.com
Solution to the ladybug clock puzzle

Solution to last month's probability puzzle.

1 month, 1 week ago @ youtube.com
The Hairy Ball Theorem

Unexpected applications and a beautiful proof.

Looking for a new career? Check out https://3b1b.co/talent

Supporters get early access to new videos: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Credits:

Senia Sheydvasser: Co-writing and sphere deformation animations

Paul Dancstep: Those lovely fluffy sphere animations

Vince Rubinetti: Music

Timestamps:

0:00 - To comb a hairy ball

1:24 - Applications

8:46 - The puzzle of one null point

12:12 - The proof outline

16:41 - Defining orientation

21:44 - Why inside-out is impossible

25:59 - 3b1b Talent

27:44 - Final food for thought ------------------ These animati…

1 month, 3 weeks ago @ youtube.com
The ladybug clock puzzle

This is the first in a set of monthly puzzles, curated by Peter Winkler. This one was originally suggested by Richard Stanley. You can sign up to hear his description of the answer at http://momath.org/mindbenders

2 months, 1 week ago @ youtube.com
The most absurd product I've made

Because why not make a pi creature neck pillow?

Available at 3b1b.co/store

4 months ago @ youtube.com
How Laplace transforms solve differential equations

Studying the forced harmonic oscillator by taking a Laplace transform and studying its poles.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Chapter on the Laplace Transform:

https://youtu.be/j0wJBEZdwLs Chapter on the S-plane and Simple Harmonic Motion:

https://youtu.be/-j8PzkZ70Lg Timestamps:

0:00 - Opening puzzle

1:06 - Key properties of a Laplace Transform

3:29 - Qualitative analysis with Laplace Transforms

4:29 - The Laplace Transforms of a Derivative

6:06 - The forced oscillator

11:59 - Intuition from the transformed solution

1…

4 months, 3 weeks ago @ youtube.com
The dynamics of e^(πi)

A fuller version of this explanation, also including the reason we care about complex exponents in the first place: https://youtu.be/-j8PzkZ70Lg

5 months, 2 weeks ago @ youtube.com
But what is a Laplace Transform?

Visualizing the most important tool for differential equations.

Previous chapter: https://youtu.be/-j8PzkZ70Lg


Home page: https://www.3blue1brown.com
Artwork by Kurt Bruns
Engine animation borrowed with permission from this (excellent) blog: https://ciechanow.ski/internal-combustion-engine/
Timestamps:

0:00 - Understanding the engine

1:16 - Key background ideas

5:41 - Definition and intuition

10:43 - Complex integration

20:43 - Analytic continuation

23:52 - The transform of exponentials

26:15 - A deep look at cos(t)

32:59 - W…

5 months, 2 weeks ago @ youtube.com
Why complex exponents matter | Laplace Transform Prelude

How dynamics explain Euler's formula, and vice versa.

Early view of the Laplace Transform video: https://www.patreon.com/posts/laplace-early-140428165


Home page: https://www.3blue1brown.com
Timestamps:

0:00 - Intro

1:51 - Euler's formula explained dynamically

9:27 - The harmonic oscillator

21:08 - General linear equations

22:47 - Motivating the Laplace Transform
------------------
These animations are largely made using a custom Python library, manim. See the FAQ comments here: https://3b1b.co/faq#manim
Music by Vincent Rubin…

5 months, 3 weeks ago @ youtube.com
Why ruler and compass? | Guest video by ⁨@bensyversen⁩

What role were ruler and compass constructions really serving?

Check out Ben's channel: @bensyversen
Interview with the author of this video: https://youtu.be/VohYM99j8e0

Supporters get early views of new videos: https://3b1b.co/support
Written, produced, edited, and animated by Ben Syversen

Additional editing: Jack Saxon

3d Blender model: Jan-Hendrik Müller

Additional Blender help: Thibaut Modrzyk (@Deepia)

Illustrations: Alex Zepherin/DonDada Studio

Drums: Jeremy Gustin

Additional music from Epidemic Sound
Special thanks to Viktor Blåsjö: https://intellectualmathematics.com/opinionated-history-of-mathematics/
References/Recommended reading:
Euclid’s Elements:

Visual edition of Book 1: htt…

6 months, 1 week ago @ youtube.com
Two Minute Papers
latest post 4 hours ago
DeepMind’s New AI Just Changed Science Forever

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers
📝 The paper is available here: https://arxiv.org/abs/2602.10177
Source: https://www.youtube.com/watch?v=6evUpgCHtOQ
Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Charles Ian Norman Venn, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Shawn Becker, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi
My research: https://cg.tuwien.ac.at/~z…

4 hours ago @ youtube.com
The Algorithm That Made Me Cry

Free course on Ray Tracing: https://users.cg.tuwien.ac.at/zsolnai/gfx/rendering-course/

Thumbnail design: ht…

1 day, 5 hours ago @ youtube.com
DeepSeek Just Fixed One Of The Biggest Problems With AI

📝 The #DeepSeek paper is available here:
https://github.com/deepseek-ai/Engram
https://arxiv.org/abs/2601.07372
Larry Wheels: https://www.youtube.com/watch?v=7SM816P5G9s&lc=Ugz7yiDrr_8YD7w8gaN4AaABAg

3 days, 5 hours ago @ youtube.com
Honey Is Way More Complex Than You Think

📝 The paper is available here: https://xuan-li.github.io/pdf/publications/li2024dynamicduo.pdf
Source: https://www.youtube.com/watch?v=CfEg7fucVYg

2 weeks, 1 day ago @ youtube.com
NVIDIA’s New AI Just Cracked The Hardest Part Of Self Driving

📝 The paper is available here: https://github.com/NVlabs/alpamayo
Research panel I will be at GTC: https://www.nvidia.com/gtc/session-catalog/sessions/gtc26-s81810/
Sources:
https://www.youtube.com/watch?v=0aq4Wi2rsOk
https://www.youtube.com/watch?v=I0yPzZp6dM0

2 weeks, 3 days ago @ youtube.com
Most People Miss What Makes This Impossible

📝 The paper is available here: https://www.geometry.caltech.edu/pubs/LD23.pdf
Source: https://www.youtube.com/watch?v=VIV7GYOBTfM

2 weeks, 4 days ago @ youtube.com
DeepMind’s New AI Tracks Objects Faster Than Your Brain

📝 The paper is available here: https://d4rt-paper.github.io/
Our Gaussian Material Synthesis paper: https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/
Tweet link: https://x.com/GoogleDeepMind/status/2014352808426807527

2 weeks, 6 days ago @ youtube.com
Adobe & NVIDIA: 10,000,000 Sparkles At 280 FPS

📝 The paper is available here: https://perso.telecom-paristech.fr/boubek/papers/Glinty/
Demo: https://www.shadertoy.com/view/tcdGDl
#nvidia #adobe

1 month ago @ youtube.com
The Impossible Physics Of Fire

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers
📝 The paper is available here: https://helgewrede.github.io/firex/

1 month ago @ youtube.com
NVIDIA’s New AI Tells You When Photos Lie

📝 The paper is available here: https://research.nvidia.com/labs/sil/projects/ppisp/
#nvidia

1 month, 1 week ago @ youtube.com
Anthropic Found Why AIs Go Insane

📝 The paper is available here: https://www.anthropic.com/research/assistant-axis

1 month, 1 week ago @ youtube.com
This Is Now 66x Faster

📝 The paper is available here: https://sig25ddmpd.github.io/

1 month, 2 weeks ago @ youtube.com
NVIDIA’s New AI Just Leveled Up Video Editing

📝 The paper and the code are now available here:
https://dvirsamuel.github.io/omnimattezero.github.io/
https://github.com/dvirsamuel/OmnimatteZero

1 month, 2 weeks ago @ youtube.com
New DeepSeek Research - The Age of AI Is Here!

📝 The paper is available here: https://arxiv.org/abs/2501.12948
Sources:
https://x.com/awnihannun/status/1883276535643455790
https://x.com/bcjordan/status/1886825587097878826
https://x.com/izag82161/status/1906347576204640514

1 month, 3 weeks ago @ youtube.com
What A Time To Be Alive (Our First Ever Music Video)

Thank you so much everyone, this was amazing fun! Guitar tab is available here for free, no paywall nonsense: https://www.dropbox.com/scl/fi/oo1ny6i7mtgu3l006roa5/what-a-time-to-be-alive.gp?rlkey=n0c4xryfip7m15derrudnb8zl&st=fu51yljh&dl=1

1 month, 3 weeks ago @ youtube.com
DataFest Video
latest post: none
JetBrains Research Seminars
latest post: none
Yandex. Computer Science
latest post 2 days, 9 hours ago
The evolution of ML models in recommender systems

How are recommender systems evolving, and where should they end up? Danya Burlakov, head of the recommendation products group, explains. Watch the full video on the channel! #shorts #ml #ai #datascience #рекомендательныесистемы #нейросети

2 days, 9 hours ago @ youtube.com
How ARGUS speeds up experiment time severalfold

ARGUS is a new architecture for ML models in recommender systems. Danya Burlakov, head of the recommendation products group, tells the story. Watch the full video on the channel! #shorts #ml #ai #datascience #рекомендательныесистемы #нейросети

5 days, 9 hours ago @ youtube.com
Which methods we use in recommender systems

Similarity with other users, statistics, matrix factorization. Danya Burlakov, head of the recommendation products group, covered these and other ways of suggesting content to users. Watch the full video on the channel! #shorts #ml #ai #datascience #рекомендательныесистемы #нейросети

1 week, 2 days ago @ youtube.com
Recommender systems in one minute

How do recommender systems work? Danya Burlakov, head of the recommendation products group, explains. Watch the full video on the channel! #shorts #ml #ai #datascience #рекомендательныесистемы #нейросети

1 week, 4 days ago @ youtube.com
Tinder for LEGO labelers 😅

Andrey Tatarinov, CEO and CTO at Epoch8, presented the computer vision system behind the Brickit app, which scans a pile of LEGO pieces and suggests what can be built from them. In the talk Andrey explained how his team handled rare classes and optimized the pipeline for mobile devices. He also shared MLOps solutions for scalable fine-tuning and model maintenance. The full video is already on the channel! #Brickit, #Lego, #AI, #Нейросети, #ComputerVision, #DeepLearning, #MobileAI, #Tech, #Стартап, #Будущее, #MachineLearning, #Гаджеты, #Технологии, #WOW, #AIприложение

1 week, 6 days ago @ youtube.com
Inside recommender systems: the present and the future

Danya Burlakov, head of the recommendation products group, explains how these technologies learn to understand our preferences and why recent years have brought the biggest changes the industry has seen. How do recommender systems really work, and why are the largest changes in years happening right now? In the video Danya covers:
🔸 How recommender systems have evolved and where ML models fit in
🔸 How the new ARGUS architecture helps Yandex train neural networks faster
🔸 Why transformers were needed and what collaborative filtering is
🔸 How generative models will change the logic of recommendations
This video will be useful for ML engineers, data scientists, bac…

2 weeks ago @ youtube.com
What Brickit does when LEGO pieces are missing

Andrey Tatarinov, CEO and CTO at Epoch8, presented the computer vision system behind the Brickit app, which scans a pile of LEGO pieces and suggests what can be built from them. In the talk Andrey explained how his team handled rare classes and optimized the pipeline for mobile devices. He also shared MLOps solutions for scalable fine-tuning and model maintenance. The full video is already on the channel! #Brickit, #Lego, #AI, #Нейросети, #ComputerVision, #DeepLearning, #MobileAI, #Tech, #Стартап, #Будущее, #MachineLearning, #Гаджеты, #Технологии, #WOW, #AIприложение

2 weeks, 3 days ago @ youtube.com
Photographed a pile of LEGO and got build instructions in a second 🤯

Andrey Tatarinov, CEO and CTO at Epoch8, presented the computer vision system behind the Brickit app, which scans a pile of LEGO pieces and suggests what can be built from them. In the talk Andrey explained how his team handled rare classes and optimized the pipeline for mobile devices. He also shared MLOps solutions for scalable fine-tuning and model maintenance. The full video is already on the channel! #Brickit, #Lego, #AI, #Нейросети, #ComputerVision, #DeepLearning, #MobileAI, #Tech, #Стартап, #Будущее, #MachineLearning, #Гаджеты, #Технологии, #WOW, #AIприложение

3 weeks, 1 day ago @ youtube.com
Saturday ML Party 21.03.2026

Welcome to the evening meetup for ML engineers from Yandex. This time we talk about progress in image multimodality, book synthesis, and more. Join the Yandex for ML channel to stay up to date on all Yandex events: https://t.me/+LoohiDU_sVk4NGVi
Leave your questions for the speakers in the chat with the #вопрос tag: https://t.me/+OsKnLNG-7DE1ZTFi

1 month, 1 week ago @ youtube.com
How topical blocks in Yandex Search are built

Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group, looks under the hood of Yandex's search engine. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

1 month, 1 week ago @ youtube.com
How search mixes up the Hermitages

Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group, talks about irrelevant-query scenarios. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

1 month, 2 weeks ago @ youtube.com
A neural network should know that Barcelona is top-tier

A fragment of a talk by Alexey Kolesov, CTO at Yandex R&D, at Practical ML Conf 2025. Using a simple, "obvious" example from football, he breaks down a serious problem of large language models: why LLMs can be poor at remembering facts about the world, and how factual memory and the application of knowledge were improved in YandexGPT 5.1. The talk also shows how online reinforcement learning (RL) was launched stably in a real system for model training. On neural network training, LLM memory, and practical ML in production. The full talk is on the channel. #YandexGPT #LLM #AI #MachineLearning #ReinforcementLearning #DeepLearning #ML #ArtificialIntelligence #PracticalML #MLOps

1 month, 3 weeks ago @ youtube.com
What a bad search impression looks like

Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group, explains. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

1 month, 3 weeks ago @ youtube.com
What Yandex geo search is for

Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group, walks through the service's functionality. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

1 month, 4 weeks ago @ youtube.com
What the PMI Score is

This is a fragment of a talk by Vsevolod Paramonov and Andrey Shevtsov, developers in the pricing analytics group at Yandex Lavka. At Data Fest Siberia they explained what sits under the hood of dynamic pricing in the service. They also broke down how the demand and elasticity models are built and which mathematical formulas drive the process. Watch the full video on our channel! #ценообразование, #dynamicpricing, #pricingmodel, #dataanalytics, #datascience, #ML, #математика, #экономика, #аналитика, #бизнесметрики, #спрос, #эластичность, #YandexLavka, #DataFestSiberia, #ITконференция, #datadriven, #Shorts

2 months ago @ youtube.com
ML Trainings
latest post 4 days, 12 hours ago
Economics or users: a common mistake
4 days, 12 hours ago @ youtube.com
Code is no longer the value: Valentin Malykh's take
4 days, 12 hours ago @ youtube.com
How experienced engineers adopt coding agents
4 days, 12 hours ago @ youtube.com
The US is building more data centers than offices
4 days, 12 hours ago @ youtube.com
Agents at work: when they help and when they hurt
4 days, 12 hours ago @ youtube.com
Claude Code vs Copilot: why startups choose Claude Code
4 days, 12 hours ago @ youtube.com
Captain's Bridge #11: Kaggle for the military | AI regulation in Russia | Data centers ate the offices

0:00:00 Opening
0:00:39 Kaggle for the military
0:06:42 Programming with prompts
0:14:54 The US labor market
0:20:36 Cursor uses Kimi
0:23:51 AI regulation in Russia
0:31:16 An essay on AI for everyone
0:36:53 Agents slow you down
0:42:40 NVIDIA and new GPUs
0:46:45 Domestic TPUs
0:49:42 A coalition against OpenAI
0:54:59 Fighting scammers, and AI
0:58:40 OpenAI catches up with Claude
1:06:09 Data centers ate the offices
1:08:50 AI on Fridays

AI summary: a discussion of the latest news in artificial intelligence, including military applications, regulation, and new technologies; expert opinion on the future of AI and its impact on society and business. Discussion of the latest trends in artificial intelligence, regula…

5 days, 13 hours ago @ youtube.com
Meetup #2 | Data Fusion Contest 2026

Presenting the annual Data Fusion Contest series of machine learning competitions. Task page: https://ods.ai/tracks/data-fusion-2026-competitions
On Thursday, March 19 (19:00-20:00 Moscow time) we held the second meetup. Program:
The main item on the program: announcing the first wave of winners among the best public solutions in the Companion special track!
🌚️️️️️️ Q&A and discussion with participants.
_____
Our social media:
Telegram: https://t.me/datafest
VK: https://vk.com/datafest
Job listings channel on Telegram: https://t.me/odsjobs
Course updates channel: https://t.me/odscourses
How to join the ODS Mattermost community chat: https://ods.ai/tracks/mattermost

1 week ago @ youtube.com
Artificial intelligence and HR: where the difference lies
1 week, 4 days ago @ youtube.com
Valentin Malykh explains how to cook pelmeni
1 week, 4 days ago @ youtube.com
Agents are changing the internet: how it works
1 week, 4 days ago @ youtube.com
Agents on the internet: what is happening online
1 week, 4 days ago @ youtube.com
Captain's Bridge #10: Opting out of AI in Russia | LeCun vs AGI | A pelmeni résumé

0:00:00 Opening
0:00:56 Opting out of AI in Russia
0:12:14 Claude Cowork from Microsoft
0:17:51 Alisa will help you buy
0:30:30 ChatGPT creates jobs
0:36:52 Claude peeks at the web
0:40:57 Moltbook instead of VK
0:45:05 Amazon stands up for engineers
0:55:06 AI maintains code
1:01:35 LeCun vs AGI
1:16:54 Training without limits
1:20:30 A pelmeni résumé

AI summary: this episode discusses new laws on the right to opt out of AI-based services, the influence of large corporations such as Microsoft and Yandex on the AI market, and the latest news about AI models and services, including Amazon and Anthropic. Discussion of the latest trends in artificial intelligence, the pandemic's impact on the labor market, and the role of engine…

1 week, 5 days ago @ youtube.com
Tatiana Shavrina criticizes the approach to safety
2 weeks, 3 days ago @ youtube.com
Tatiana Shavrina talks about suspicious CIA videos
2 weeks, 3 days ago @ youtube.com
Primer
latest post 2 months, 3 weeks ago
Taking AI Doom Seriously For 62 Minutes

Patreon: https://www.patreon.com/primerlearning

80,000 Hours: 80000hours.org/primer
https://www.desmos.com/calculator/a5pfjtr4tr
Other connections:

Discord: https://discord.gg/NbruaNW

Twitch: https://www.twitch.tv/justin_helps

Store: https://store.dftba.com/collections/primer
Reddit: https://www.reddit.com/r/primerlearning/

Bsky: https://bsky.app/profile/justinhelps.bsky.social

Twitter: https://twitter.com/primerlearning
Links to other resources:

https://yoshuabengio.org/2024/07/09/reasoning-through-arguments-against-taking-ai-safety-seriously/

https://www.youtube.com/c/robertmilesai

https://www.youtube.com/@Siliconversations

https://www.youtube.com/@Go-Meta

https://www.youtube.com/@Dwarkes…

2 months, 3 weeks ago @ youtube.com
Simulating a single brain cell

Patreon: https://www.patreon.com/primerlearning

Helpful resources if you want to learn more about neural networks:

https://www.youtube.com/@AndrejKarpathy

https://course.fast.ai/

https://www.youtube.com/@WelchLabsVideo

https://www.youtube.com/@3blue1brown
Early papers. These probably aren't helpful for understanding the concepts in this video, but they're worth a look if you're interested in the history.

The Perceptron – A perceiving and recognizing automaton: https://bpb-us-e2.wpmucdn.com/websites.umass.edu/dist/a/27637/files/2016/03/rosenblatt-1957.pdf

The Perceptron: A probabilistic model for information storage and organization in the brain: https://www.ling.upenn.edu/courses/cogs501/Rosenblatt1958.pdf
A Logical…

6 months ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
last post 4 days, 4 hours ago
#494 – Jensen Huang: NVIDIA – The $4 Trillion Company & the AI Revolution

Jensen Huang is the co-founder and CEO of NVIDIA, the world’s most valuable company and the engine powering the AI computing revolution.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep494-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://drinkLMNT.com/lex
Fin: AI agent for customer service.

Go to https://quo.com/lex

OUTLINE:
(00:00) – Introduction
(00:26) – Sponsors, Comments, and Reflections
(06:34) – Extreme co-design and rack-scale engineering
(09:20) – How Jensen runs NVIDIA
(28:41) – AI scaling laws
(43:41) – Biggest blockers to AI scaling laws
(45:25) – Supply chain
(47:20) – Memory
(53:25) – Power…

4 days, 4 hours ago @ lexfridman.com
#493 – Jeff Kaplan: World of Warcraft, Overwatch, Blizzard, and Future of Gaming

Jeff Kaplan is a legendary Blizzard game designer of World of Warcraft and Overwatch, now preparing to launch a new game, The Legend of California, from his new studio Kintsugiyama – available to wishlist on Steam today, with alpha later in March.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep493-sc
See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://fin.ai/lex
Blitzy: AI agent for large enterprise codebases.

Go to https://blitzy.com/lex
BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex
Shopify: Sell stuff online.

2 weeks, 2 days ago @ lexfridman.com
#492 – Rick Beato: Greatest Guitarists of All Time, History & Future of Music

Rick Beato is a music educator, interviewer, producer, songwriter, and a true multi-instrument musician, playing guitar, bass, cello & piano.

His incredible YouTube channel celebrates great musicians & musical ideas, and helps millions of people fall in love with great music all over again.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep492-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lex
BetterHelp: Online therapy and counseling.

Go to https://drinkLMNT.com/lex
Fin: AI agent for customer service.

3 weeks, 5 days ago @ lexfridman.com
#491 – OpenClaw: The Viral AI Agent that Broke the Internet – Peter Steinberger

Peter Steinberger is the creator of OpenClaw, an open-source AI agent framework that’s the fastest-growing project in GitHub history.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep491-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://coderabbit.ai/lex
Fin: AI agent for customer service.

Go to https://fin.ai/lex
Blitzy: AI agent for large enterprise codebases.

Go to https://drinkLMNT.com/lex

OUTLINE:
(00:00) – Introduction
(03:51) – Sponsors, Comments, and Reflections
(15:29) – OpenClaw origin story
(18:48) – Mind-blowing moment
(28:15) – Why OpenClaw went viral
(32:12) – Self-modifying AI agent
(36:57)…

1 month, 1 week ago @ lexfridman.com
#490 – State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI

Nathan Lambert and Sebastian Raschka are machine learning researchers, engineers, and educators.

Sebastian Raschka is the author of Build a Large Language Model (From Scratch) and Build a Reasoning Model (From Scratch).

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep490-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

(25:11) – ChatGPT vs Claude vs Gemini vs Grok: Who is winning?

(36:11) – Best AI for coding
(43:02) – Open Source vs Closed Source LLMs
(54:41) – Transformers: Evolution of LLMs since 2019
(1:02:38) – AI Scaling Laws: Are they dead or still holding?

1 month, 3 weeks ago @ lexfridman.com
#489 – Paul Rosolie: Uncontacted Tribes in the Amazon Jungle

Paul Rosolie is a naturalist, explorer, author of a new book titled Junglekeeper, and is someone who has dedicated his life to protecting the Amazon rainforest.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep489-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://perplexity.ai/
BetterHelp: Online therapy and counseling.

Go to https://fin.ai/lex
Miro: Online collaborative whiteboard platform.

Go to https://miro.com/
MasterClass: Online classes from world-class experts.

2 months, 1 week ago @ lexfridman.com
#488 – Infinity, Paradoxes that Broke Mathematics, Gödel Incompleteness & the Multiverse – Joel David Hamkins

Joel David Hamkins is a mathematician and philosopher specializing in set theory, the foundations of mathematics, and the nature of infinity, and he’s the #1 highest-rated user on MathOverflow.

He is also the author of several books, including Proof and the Art of Mathematics and Lectures on the Philosophy of Mathematics.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep488-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://masterclass.com/lexpod

OUTLINE:
(00:00) – Introduction
(01:58) – Sponsors, Comments, and Reflections
(15:40) – Infinity & paradoxes
(1:02:50) – Russell’s paradox
(1:15:57) – Gödel’s…

2 months, 3 weeks ago @ lexfridman.com
#487 – Irving Finkel: Deciphering Secrets of Ancient Civilizations & Flood Myths

Irving Finkel is a scholar of ancient languages and a longtime curator at the British Museum, renowned for his expertise in Mesopotamian history and cuneiform writing.

He specializes in reading and interpreting cuneiform inscriptions, including tablets from Sumerian, Akkadian, Babylonian, and Assyrian contexts.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep487-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lex
Miro: Online collaborative whiteboard platform.

Go to https://miro.com/
Chevron: Reliable energy for data centers.

3 months, 2 weeks ago @ lexfridman.com
#486 – Michael Levin: Hidden Reality of Alien Intelligence & Biological Life

Michael Levin is a biologist at Tufts University working on novel ways to understand and control complex pattern formation in biological systems.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep486-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lex
Miro: Online collaborative whiteboard platform.

Go to https://miro.com/
MasterClass: Online classes from world-class experts.

(2:42:41) – Mind uploading
(3:01:22) – Alien intelligence
(3:16:17) – Advice for young people
(3:22:46) – Questions for AGI

3 months, 3 weeks ago @ lexfridman.com
#485 – David Kirtley: Nuclear Fusion, Plasma Physics, and the Future of Energy

David Kirtley is a nuclear fusion engineer and CEO of Helion Energy, a company working on building the world's first commercial fusion power plant by 2028.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep485-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Transcript: https://lexfridman.com/david-kirtley-transcript

CONTACT LEX:

Feedback - give feedback to Lex: https://lexfridman.com/survey

AMA - submit questions, videos or call-in: https://lexfridman.com/ama

Hiring - join our team: https://lexfridman.com/hiring

Other - other ways to get in touch: https://lexfridman.com/contact

EPISODE LINKS:

David's X: htt…

4 months, 1 week ago @ lexfridman.com
#484 – Dan Houser: GTA, Red Dead Redemption, Rockstar, Absurd & Future of Gaming

Dan Houser is co-founder of Rockstar Games and is a legendary creative mind behind Grand Theft Auto (GTA) and Red Dead Redemption series of video games.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep484-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://box.com/ai
UPLIFT Desk: Standing desks and office ergonomics.

Go to https://drinkLMNT.com/lex

OUTLINE:
(00:00) – Introduction
(01:29) – Sponsors, Comments, and Reflections
(11:32) – Greatest films of all time
(23:45) – Making video games
(26:36) – GTA 3
(29:55) – Open world video games
(32:42) – Character creation
(36:09) – Superintelligent AI in A Bette…

4 months, 3 weeks ago @ lexfridman.com
#483 – Julia Shaw: Criminal Psychology of Murder, Serial Killers, Memory & Sex

Julia Shaw is a criminal psychologist and author who in her books explores human nature, including psychopathy, violent crime, the psychology of evil, police interrogation, false memory manipulation, deception detection, and human sexuality.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep483-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lex
BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex
LMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lex
AG1: All-in-one daily nutrition drink.

5 months, 2 weeks ago @ lexfridman.com
#482 – Pavel Durov: Telegram, Freedom, Censorship, Money, Power & Human Nature

Pavel Durov is the founder and CEO of Telegram.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep482-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Transcript: https://lexfridman.com/pavel-durov-transcript

CONTACT LEX:
Feedback – give feedback to Lex: https://lexfridman.com/survey
AMA – submit questions, videos or call-in: https://lexfridman.com/ama
Hiring – join our team: https://lexfridman.com/hiring
Other – other ways to get in touch: https://lexfridman.com/contact

EPISODE LINKS:
Pavel’s Telegram: https://t.me/durov
Pavel’s X: https://x.com/durov
Telegram: https://telegram.org/
Telegram Contests: https://contest.c…

5 months, 3 weeks ago @ lexfridman.com
#481 – Norman Ohler: Hitler, Nazis, Drugs, WW2, Blitzkrieg, LSD, MKUltra & CIA

Norman Ohler is a historian and author of “Blitzed: Drugs in the Third Reich,” a book that investigates the role of psychoactive drugs, particularly stimulants such as methamphetamine, in the military history of World War II.

It is a book that two legendary historians, Ian Kershaw and Antony Beevor, have praised highly for its depth of research.

Norman also wrote “Tripped: Nazi Germany, the CIA, and the Dawn of the Psychedelic Age”, and he is working on a new book “Stoned Sapiens” looking at the history of human civilization through the lens of drugs.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep481-sc
See below for timestamps, transcript, and to give f…

6 months, 1 week ago @ lexfridman.com
#480 – Dave Hone: T-Rex, Dinosaurs, Extinction, Evolution, and Jurassic Park

Dave Hone is a paleontologist, expert on dinosaurs, co-host of the Terrible Lizards podcast, and author of numerous scientific papers and books on the behavior and ecology of dinosaurs.

He lectures at Queen Mary University of London on topics of Ecology, Zoology, Biology, and Evolution.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep480-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://go.lindy.ai/lex
BetterHelp: Online therapy and counseling.

Go to https://shopify.com/lex
LMNT: Zero-sugar electrolyte drink mix.

6 months, 3 weeks ago @ lexfridman.com
Microsoft Research Podcast
last post 4 days, 5 hours ago
Will machines ever be intelligent?

And the question we’re going to discuss is, are machines intelligent?

No, no, that’s right, that’s right.

I mean, in some sense, you could potentially have a super intelligent system, right, that’s far more intelligent than anything else on the planet.

BURGER: Right, right.

At the same time, I think, you know, transformers are not intelligent in the way that a three-year-old is, right?

4 days, 5 hours ago @ microsoft.com
Trailer: The Shape of Things to Come

Join Microsoft’s Doug Burger and guests as they dig into the fundamental truths about AI and how it will reshape the future.

Technical advances are moving at such a rapid pace that it can be challenging to define the tomorrow we’re working toward.

In The Shape of Things to Come, Microsoft research leader Doug Burger and experts from across disciplines tease out the thorniest AI issues facing technologists, policymakers, business decision-makers, and other stakeholders today.

It’s important to understand what the emerging shapes are and how we should respond.” – Doug Burger, Technical Fellow and Corporate Vice President, Microsoft Research

About Doug Burger
Doug Burger is a research leader in …

3 weeks, 3 days ago @ microsoft.com
Ideas: Community building, machine learning, and the future of AI

This week, machine learning researchers around the world will be attending the annual Conference on Neural Information Processing Systems, or NeurIPS.

In this series, we’ll explore the technologies that are shaping our future and the big ideas that propel them forward.

So around that time when I started my PhD at Penn, I was working in machine learning theory and algorithmic economics.

How had you experienced a lack of community or network of women in machine learning before the founding of WiML?

So particularly when working on topics related to fairness, I’ve ended up focusing a bunch on stuff to do with marginalized groups as part of my responsible AI work.

3 months, 3 weeks ago @ microsoft.com
Ideas: More AI-resilient biosecurity with the Paraphrase Project

Today, I’m excited to talk about the Paraphrase Project, an effort I co-led exploring how advances in AI tools for protein design might impact biosecurity.

These “patches,” akin to those in cybersecurity, have now been shared with organizations globally to strengthen biosecurity screening.

The project highlights that the same AI tools capable of incredible good can also be misused, requiring us to be vigilant, thoughtful, and creative so we continue to get the most benefit out of AI tools while working to ensure that we avoid costly misuses.

So things like, how similar is this to that template, wild-type protein structure that we used as our conditioning information?

But I feel like broadly…

5 months, 3 weeks ago @ microsoft.com
Coauthor roundtable: Reflecting on healthcare economics, biomedical research, and medical education

KOHANE: So I think you’ve “nerd sniped” me because you [LAUGHTER]—which is all too easy—but I think there’s a central issue here.

But I actually think this is dark matter of human organizational technology that is not well understood.

AZEEM AZHAR: We didn’t talk about, you know, AI in its ability to potentially do this, which is to extend the clinician’s presence throughout the week.

And so I think there’s always going to be an opening for either differences of opinion or agreeing with you too much.

And this gets into whether AI is really going to get almost to the ab initio understanding of human biology.

7 months, 1 week ago @ microsoft.com
Reimagining healthcare delivery and public health with AI
Reimagining healthcare delivery and public health with AI Reimagining healthcare delivery and public health with AI

We are sorry, the page you requested cannot be found.

The page you are looking for could not be found or is no longer available.

7 months, 3 weeks ago @ microsoft.com
Navigating medical education in the era of generative AI

Prior to med school, Daniel pursued experiences that cultivated his interest in the application of AI in medical practice and education.

Really, really looking forward to this chat.

There’s AI before ChatGPT and before, you know, generative AI really became a big thing, and then afterwards.

And then after we talk about what’s really happening, what do you think should happen in medical education given the reality of generative AI?

And I do agree [that] AI really gives us real hope that we can make it true.

8 months ago @ microsoft.com
AI Testing and Evaluation: Reflections

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

We have examples, like the pharmaceutical or medical device industry experts with whom you spoke, that’s really, you know, testing … there is a pre-deployment requirement.

And the third is just how rigid versus adaptive these testing and evaluation regimes or frameworks are in these different domains.

I really agree that there has been a lot of emphasis to date on, sort of, testing models upstream, the AI model evaluation.

You know, I think there’s been real progress already in the AI evaluation and testing ecosystem in the public-private partnership context.

8 months, 1 week ago @ microsoft.com
AI Testing and Evaluation: Learnings from cybersecurity

Absolutely, I really, really was.

As a principal director on the Microsoft AI Red Team, Tori leads all AI security and safety red team operations, as well as dangerous capability testing, to directly inform C-suite decision-makers.

This year, we’ve pulled a lot of those assets and insights into the Azure [AI] Foundry AI Red Teaming Agent.

So you can get a little taste of what we do day to day in the AI Red Teaming Agent.

WESTERHOFF: I think the most important takeaway from those lessons is that AI security is truly a team sport.

8 months, 2 weeks ago @ microsoft.com
How AI will accelerate biomedical research and discovery

Dr. Eric Topol is the executive vice president of the biomedical research non-profit Scripps Research, where he founded and now directs the Scripps Research Translational Institute.

Let’s continue our deep dive on AI and biomedical research with this conversation with Noubar Afeyan:LEE: Noubar, thanks so much for joining.

And there’s the origin story of contact with AI, you know, before the emergence of generative AI and afterwards.

What is going on today with respect to AI really being used for something meaningful in the design and development of drugs?

TOPOL: You would read about how, you know, data is the new oil and, you know, gold and whatnot.

8 months, 2 weeks ago @ microsoft.com
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

During the pre-market phase, medical testing establishes baseline safety and effectiveness metrics through bench testing, performance standards, and clinical studies.

SULLIVAN: So medical devices face a pretty prescriptive multi-level testing path before they hit the market.

We are looking into medical devices, as well, obviously, but also other technologies in advanced medical computing.

So we see Phase 3 trials as something that occurs in the medical devices and pharmaceuticals field.

8 months, 3 weeks ago @ microsoft.com
AI Testing and Evaluation: Learnings from genome editing

As generative AI continues to advance, Microsoft has gathered a range of experts—from genome editing to cybersecurity—to share how their fields approach evaluation and risk assessment.

CHARO: Well, you know, genome editing is both very old and very new.

Now the earliest forms of genome editing were very inefficient, and so we didn’t worry that much.

But the bottom-line thing to remember, the way to really think about it is, we don’t regulate genome editing; we regulate the things that use genome editing.

And she said, you know, we don’t regulate genome editing; we regulate the things that use genome editing.

9 months ago @ microsoft.com
AI Testing and Evaluation: Learnings from Science and Industry

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

And I think, really, there are two reasons why tech is so, kind of, representative of that kind of challenge that I’ve always found fascinating.

Continues to be a really important topic in the AI policy conversation right now, I think, for really good reason.

Testing is an important component for governance and AI and, of course, in all of these other domains, as well.

I think about almost, like, in the near to mid-term, like three issues that we need to address in the AI, kind of, policy and testing context.

9 months, 1 week ago @ microsoft.com
The AI Revolution in Medicine, Revisited: How AI is reshaping the future of healthcare and medical research

LEE: Yeah, yeah.

It cannot—as, you know, Bill was saying—it cannot learn from your document.

And I don’t know if the two of you remember, but I ended up doing a lot of tests.

I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3-mini.

Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.

9 months, 2 weeks ago @ microsoft.com
NLP Highlights
last post: none
Data Skeptic
last post 5 hours ago
Book Ratings and Recommendations

Goodreads star ratings can be misleading as measures of "book quality," and research from Hannes Rosenbusch suggests that for many professionally published books, differences between readers often matter more than differences between books. The episode also explores how to model reader preferences, why reviews often reveal more about the reviewer than the text, and how LLMs can aid computational literary research while still falling short of human editors in creative writing.

5 hours ago @ dataskeptic.com
Disentanglement and Interpretability in Recommender Systems
2 weeks, 3 days ago @ dataskeptic.com
Collective Altruism in Recommender Systems

Ekaterina (Kat) Filadova from MIT EECS joins us to discuss strategic learning in recommender systems—what happens when users collectively coordinate to game recommendation algorithms. Kat's research reveals surprising findings: algorithmic "protest movements" can paradoxically help platforms by providing clearer preference signals, and the challenge of distinguishing coordinated behavior from bot activity is more complex than it appears. This episode explores the intersection of machine learning and game theory, examining what happens when your training data actively responds to your algorithm.

4 weeks ago @ dataskeptic.com
Niche vs Mainstream

Anas Buhayh discusses multi-stakeholder fairness in recommender systems and the S'mores framework—a simulation allowing users to choose between mainstream and niche algorithms. His research shows specialized recommenders improve utility for niche users while raising questions about filter bubbles and data privacy.

1 month, 1 week ago @ dataskeptic.com
Healthy Friction in Job Recommender Systems

In this episode, host Kyle Polich speaks with Roan Schellingerhout, a fourth-year PhD student at Maastricht University, about explainable multi-stakeholder recommender systems for job recruitment. Roan discusses his research on creating AI-powered job matching systems that balance the needs of multiple stakeholders—job seekers, recruiters, HR professionals, and companies. The conversation explores different types of explanations for job recommendations, including textual, bar chart, and graph-based formats, with findings showing that lay users strongly prefer simple textual explanations over more technical visualizations. Roan shares insights from his "healthy friction" study, which tested …

1 month, 3 weeks ago @ dataskeptic.com
Fairness in PCA-Based Recommenders

In this episode, we explore the fascinating world of recommender systems and algorithmic fairness with David Liu, Assistant Research Professor at Cornell University's Center for Data Science for Enterprise and Society. David shares insights from his research on how machine learning models can inadvertently create unfairness, particularly for minority and niche user groups, even without any malicious intent. We dive deep into his groundbreaking work on Principal Component Analysis (PCA) and collaborative filtering, examining why these fundamental techniques sometimes fail to serve all users equally. David introduces the concept of "power niche users" - highly active users with specialized in…
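As a toy illustration of the failure mode described above, the sketch below runs a rank-1 truncated SVD (a common collaborative-filtering baseline) on a synthetic ratings matrix with a 90-user mainstream majority and a 10-user niche minority. The data and group sizes are invented for illustration; this is not David Liu's actual experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items = 20
v_mainstream = rng.normal(0, 1, n_items)  # shared "mainstream taste" item profile
v_niche = rng.normal(0, 1, n_items)       # shared "niche taste" item profile

# 90 mainstream users and 10 niche users, each a noisy scaling of their profile.
X = np.vstack([
    rng.normal(1.0, 0.1, (90, 1)) * v_mainstream,
    rng.normal(1.0, 0.1, (10, 1)) * v_niche,
])

# Rank-1 truncated SVD of the ratings matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_hat = s[0] * np.outer(U[:, 0], Vt[0])

# The single retained component aligns with the majority taste direction,
# so the niche group's ratings are reconstructed much less accurately.
err = np.linalg.norm(X - X_hat, axis=1)
print(f"mean mainstream reconstruction error: {err[:90].mean():.2f}")
print(f"mean niche reconstruction error:      {err[90:].mean():.2f}")
```

No malicious intent anywhere: the gap falls out of the objective, which minimizes total squared error and therefore favors the larger group.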

2 months ago @ dataskeptic.com
Video Recommendations in Industry

In this episode, Kyle Polich sits down with Cory Zechmann, a content curator working in streaming television with 16 years of experience running the music blog "Silence Nogood." They explore the intersection of human curation and machine learning in content discovery, discussing the concept of "algatorial" curation—where algorithms and editorial expertise work together. Key topics include the cold start problem, why every metric is just a "proxy metric" for what users actually want, the challenge of filter bubbles, and the importance of balancing familiarity with discovery. Cory shares insights on why TikTok's algorithm works so well (clean data and massive interaction volume), the crucial …

3 months ago @ dataskeptic.com
Eye Tracking in Recommender Systems

In this episode, Santiago de Leon takes us deep into the world of eye tracking and its revolutionary applications in recommender systems. As a researcher at the Kempelen Institute and Brno University, Santiago explains the mechanics of eye tracking technology—how it captures gaze data and processes it into fixations and saccades to reveal user browsing patterns. He introduces the groundbreaking RecGaze dataset, the first eye tracking dataset specifically designed for recommender systems research, which opens new possibilities for understanding how users interact with carousel interfaces like Netflix. Through collaboration between psychologists and AI researchers, Santiago's work demonstrate…

3 months, 1 week ago @ dataskeptic.com
Cracking the Cold Start Problem

In this episode of Data Skeptic, we dive deep into the technical foundations of building modern recommender systems. Unlike traditional machine learning classification problems where you can simply apply XGBoost to tabular data, recommender systems require sophisticated hybrid approaches that combine multiple techniques. Our guest, Boya Xu, an assistant professor of marketing at Virginia Tech, walks us through a cutting-edge method that integrates three key components: collaborative filtering for dimensionality reduction, embeddings to represent users and items in latent space, and bandit learning to balance exploration and exploitation when deploying new recommendations. Boya shares insigh…
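The three-part recipe above can be sketched in a few lines, with invented data and a simple epsilon-greedy rule standing in for the bandit component; this is a generic illustration of the pattern, not Boya Xu's published method.

```python
import numpy as np

rng = np.random.default_rng(1)
ratings = rng.integers(1, 6, size=(50, 30)).astype(float)  # users x items

# (1) Collaborative filtering via truncated SVD: rank-k latent embeddings.
k = 4
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
user_emb = U[:, :k] * s[:k]   # (50, k) user embeddings
item_emb = Vt[:k].T           # (30, k) item embeddings

def recommend(user_id: int, epsilon: float = 0.1) -> int:
    """Exploit the latent-space score most of the time; with probability
    epsilon, explore a random item (the bandit part)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(item_emb)))   # explore
    scores = item_emb @ user_emb[user_id]         # (2) score items in latent space
    return int(np.argmax(scores))                 # (3) exploit the best-scoring item

picks = [recommend(0) for _ in range(1000)]
print("most-recommended item for user 0:", max(set(picks), key=picks.count))
```

Epsilon-greedy is the crudest way to trade off exploration and exploitation; the same skeleton works with upper-confidence-bound or Thompson-sampling rules in place of the random coin flip.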

3 months, 2 weeks ago @ dataskeptic.com
Designing Recommender Systems for Digital Humanities

In this episode of Data Skeptic, we explore the fascinating intersection of recommender systems and digital humanities with guest Florian Atzenhofer-Baumgartner, a PhD student at Graz University of Technology. Florian is working on Monasterium.net, Europe's largest online collection of historical charters, containing millions of medieval and early modern documents from across the continent. The conversation delves into why traditional recommender systems fall short in the digital humanities space, where users range from expert historians and genealogists to art historians and linguists, each with unique research needs and information-seeking behaviors. Florian explains the technical challen…

4 months ago @ dataskeptic.com
DataRec Library for Reproducibility in Recommender Systems

In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich explores DataRec, a new Python library designed to bring reproducibility and standardization to recommender systems research. Guest Alberto Carlo Mario Mancino, a postdoc researcher from Politecnico di Bari, Italy, discusses the challenges of dataset management in recommendation research—from version control issues to preprocessing inconsistencies—and how DataRec provides automated downloads, checksum verification, and standardized filtering strategies for popular datasets like MovieLens, Last.fm, and Amazon reviews. The conversation covers Alberto's research journey through knowledge graphs, graph-based recommen…

4 months, 1 week ago @ dataskeptic.com
Shilling Attacks on Recommender Systems

In this episode of Data Skeptic's Recommender Systems series, Kyle sits down with Aditya Chichani, a senior machine learning engineer at Walmart, to explore the darker side of recommendation algorithms. The conversation centers on shilling attacks—a form of manipulation where malicious actors create multiple fake profiles to game recommender systems, either to promote specific items or sabotage competitors. Aditya, who researched these attacks during his undergraduate studies at SPIT before completing his master's in computer science with a data science specialization at UC Berkeley, explains how these vulnerabilities emerge particularly in collaborative filtering systems. From promoting a …

4 months, 3 weeks ago @ dataskeptic.com
Music Playlist Recommendations

In this episode, Rebecca Salganik, a PhD student at the University of Rochester with a background in vocal performance and composition, discusses her research on fairness in music recommendation systems. She explores three key types of fairness—group, individual, and counterfactual—and examines how algorithms create challenges like popularity bias (favoring mainstream content) and multi-interest bias (underserving users with diverse tastes). Rebecca introduces LARP, her multi-stage multimodal framework for playlist continuation that uses contrastive learning to align text and audio representations, learn song relationships, and create playlist-level embeddings to address the cold start prob…
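The contrastive alignment step described above can be sketched with a plain-Python InfoNCE-style loss: each text embedding should score highest against its own paired audio embedding among all candidates in the batch. This is a generic illustration of the objective family, not LARP's implementation:

```python
import math

def info_nce_loss(text_embs, audio_embs, temperature=0.1):
    """Average -log softmax probability of the true (text_i, audio_i) pair,
    with similarity measured by dot product scaled by `temperature`."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    loss = 0.0
    for i, t in enumerate(text_embs):
        logits = [dot(t, a) / temperature for a in audio_embs]
        m = max(logits)  # stabilize the log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        loss += log_z - logits[i]
    return loss / len(text_embs)
```

Minimizing this pulls matched text/audio representations together and pushes mismatched ones apart, which is what lets a new song with no play history still be placed near relevant playlists.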

4 months, 4 weeks ago @ dataskeptic.com
Bypassing the Popularity Bias

5 months, 1 week ago @ dataskeptic.com
Sustainable Recommender Systems for Tourism
Sustainable Recommender Systems for Tourism Sustainable Recommender Systems for Tourism

In this episode, we speak with Ashmi Banerjee, a doctoral candidate at the Technical University of Munich, about her pioneering research on AI-powered recommender systems in tourism. Ashmi illuminates how these systems can address exposure bias while promoting more sustainable tourism practices through innovative approaches to data acquisition and algorithm design. Key highlights include leveraging large language models for synthetic data generation, developing recommendation architectures that balance user satisfaction with environmental concerns, and creating frameworks that distribute tourism more equitably across destinations. Ashmi's insights offer valuable perspectives for both AI res…

5 months, 2 weeks ago @ dataskeptic.com
SuperDataScience
last post 9 hours ago
A Post-Transformer Architecture Crushes Sudoku (Transformers Solve ~0%)

A game millions of people solve over morning coffee is exposing a fundamental weakness in today’s most powerful AI models. In this Five-Minute Friday, Jon Krohn breaks down Pathway’s new Sudoku Extreme benchmark (roughly 250,000 of the hardest Sudoku puzzles available) and why leading LLMs like o3-mini, DeepSeek-R1, and Claude 3.7 Sonnet scored effectively zero percent, while Pathway’s post-transformer BDH architecture achieved 97.4% accuracy at a fraction of the cost. Listen to the episode to find out what BDH is doing differently, why Sudoku performance matters far beyond puzzles, and what this means for the future of AI reasoning. Additional materials: www.superdatas…

9 hours ago @ podtrac.com
977: Attention, World Models and the Future of AI, with Prof. Kyunghyun Cho

What’s going to be the next big step function that blasts us forward in AI capabilities? To find out, Jon Krohn sits down with Professor Kyunghyun Cho, whose 200,000 citations and co-authorship of the first paper on attention place him among the most influential AI researchers in the world. In this episode, Kyunghyun explains why today’s models have already captured most correlations in passive data, making the real challenge about actively choosing which data to collect. He also weighs in on the open debate around world models, whether AI needs high-fidelity, step-by-step imagination or whether a high-level latent representation that lets it skip ahead is sufficient and shares the surprisi…

3 days, 9 hours ago @ podtrac.com
976: NVIDIA’s Nemotron 3 Super: The Perfect LLM for Multi-Agent Systems

NVIDIA just dropped Nemotron 3 Super, a 120-billion-parameter open-weight model that only activates 12 billion parameters at a time and it’s built for the agentic AI era. In this Five-Minute Friday, Jon Krohn breaks down the model’s hybrid Mamba-Transformer architecture, its million-token context window, and why its combination of frontier-class reasoning with blazing-fast throughput matters for anyone building multi-agent systems. Find out how Nemotron 3 Super claimed the #1 spot on the DeepResearch Bench leaderboards, which companies are already adopting it, and where you can start using it today. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/976⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Intere…

1 week ago @ podtrac.com
975: Unmetered Intelligence is Heralding the Next Renaissance, with Zack Kass

Zack Kass speaks to Jon Krohn about his bestselling, tech-positive book, The Next Renaissance, that charts the rapid progress of humanity and the benefits that artificial intelligence will bring to us, as well as why a future where intelligence is a cheap and abundant resource will give humanity an edge. Elsewhere in the show, Zack discusses why it’s important to hold parents, teachers and students accountable for their education, why it is incumbent on us to build a healthier relationship with technology, and his 4 principles for thriving in the age of AI. This episode is brought to you by the⁠ ⁠⁠Cisco⁠, by ⁠Acceldata⁠ and by ⁠⁠ODSC, the Open Data Science Conference⁠⁠. Additional materials…

1 week, 3 days ago @ podtrac.com
974: When Will The AI Bubble Burst? How Bad Will It Be?

In this week’s Five-Minute Friday, Jon Krohn holds the AI bubble up to the light. He points to the deep greyzone found in AI startups like Cluely that are established on dubious ideas (Cluely’s tagline was “cheat on everything”) and funding bluster, as well as the staggering spending by companies on infrastructure and researcher salaries. Listen to the episode to hear about the historical precedents to the AI bubble that go all the way back to the invention of the railway, what to make of current investments in AI, and what you can do about these changes as an AI practitioner. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/974⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a Supe…

2 weeks ago @ podtrac.com
973: AI Systems Performance Engineering, with Chris Fregly

No one should be manually writing code in 2026, thinks Chris Fregly, Jon Krohn’s guest on this week’s episode. In this interview about Chris’ latest book, AI Systems Performance Engineering, he explains why it’s so important to consider memory bandwidth when evaluating GPU performance, that understanding the full hardware software stack is the most valuable skill for anyone working in AI development, and which shortcuts we still shouldn’t ever take when writing code, even though we might be outsourcing a great deal to generative AI. This episode is brought to you by the ⁠⁠Cisco, by Acceldata and by ⁠ODSC, the Open Data Science Conference⁠. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠…

2 weeks, 3 days ago @ podtrac.com
972: In Case You Missed It in February 2026

Jon Krohn recaps the month of February in this episode of In Case You Missed It. Across four interviews with Will Falcon (Episode 965), Tom Griffiths (Episode 969), Antje Barth (Episode 963), and Praveen Murugesan (Episode 967), Jon questions the brains behind some of the AI industry’s most innovative companies about launching a startup, developing a popular product, what artificial intelligence can still learn from human intelligence, and how AI might finally start to think on its own. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/972⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

3 weeks ago @ podtrac.com
971: 90% of The World’s Data is Private; Lin Qiao’s Fireworks AI is Unlocking It

Lin Qiao, CEO of Fireworks AI, talks to Jon Krohn about how she builds effective models quickly, why coding agents can perform at the level of a junior engineer, and what she attributes to the success of Fireworks AI: True to its name, the company exploded into the AI industry with over $300 million secured in venture capital, as well as netting a further $250 million Series C funding. For Lin, many enterprises miss out by not being familiar with open models. Open models give a lot of control to the user, offering customizability and at a much lower price point. Listen to hear how Fireworks AI helps companies continue to save money through AI. This episode is brought to you by the⁠ ⁠⁠⁠⁠Dell…

3 weeks, 3 days ago @ podtrac.com
970: The “100x Engineer”: How to Be One, But Should You?

Working with code-gen models and Claude Code: In this Five-Minute Friday, Jon Krohn addresses how AI superstars like Andrej Karpathy are using AI agents in their coding work, the outlook for code-gen in 2026, and how you can get started. Hear about Karpathy’s work as well as the soaring success of Peter Steinberger and how he managed to surpass the GitHub commit rate of teams as an individual working with AI agents. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/970⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

4 weeks ago @ podtrac.com
969: The Laws of Thought: The Math of Minds and Machines, with Prof. Tom Griffiths

Princeton Professor Tom Griffiths talks to Jon Krohn about his new book, The Laws of Thought, which grapples with the mathematical models behind biological and artificial intelligence, and what makes the human brain so fascinating for psychologists and computer scientists to study. In this episode, he details how the mathematical principles governing the external world can also be used to explore cognitive science, or “the internal world.” This episode is brought to you by the ⁠⁠Dell⁠⁠, by ⁠⁠Intel⁠⁠, by Cisco and by Acceldata. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/969⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natal…

1 month ago @ podtrac.com
968: Is AI Automating Away All Coding Jobs?

Now that AI agents can develop new apps from product development to delivery, do AI developers have reason to worry about their careers? Podcast host Jon Krohn addresses the stark predictions that AI could “eliminate half of all entry-level white-collar jobs” by going back to the data. Find out why the numbers show a very different picture, which in-demand occupations have increased by 40% since late 2022, and Jon’s advice on why technical professionals shouldn’t panic in this latest Five-Minute Friday. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/968⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship…

1 month ago @ podtrac.com
967: AI for the Physical World, with Samsara's Praveen Murugesan

VP of Engineering at Samsara Praveen Murugesan talks to Jon Krohn about processing 20 trillion data points covering 90 billion miles across private and public sectors, how the company helps truckers who operate long hours and travel for long stretches without cellphone signal, and who they’re looking to hire to help this agile startup keep on developing high-impact solutions for real-world problems. And, if you’re looking to work for the company, there’s no better time to apply, and you’ll want to listen to the end of the show to hear exactly what Praveen looks for in new hires. This episode is brought to you by the ⁠⁠Dell⁠⁠, by ⁠⁠Intel⁠⁠, by Acceldata and by the ODSC, the Open Data Science…

1 month, 1 week ago @ podtrac.com
966: The Moltbook Phenomenon: OpenClaw Unleashed

Jon Krohn gives Five-Minute Friday listeners all the details about the new social network causing a stir, Moltbook. What makes Moltbook so unique is that this is the first network designed just for AI agents. It’s an exclusive club, only its alleged 1.5 million registered agents can post, comment, and upvote, but we can watch this real-world experiment in agent ecology from the sidelines. Listen to the episode to hear the fascinating, if disturbing, story of Moltbook’s swift turn into facilitating a digital theocracy and forms of government, and whether this development is a sign of an approaching singularity or rather AI continuing to ape human thought and turn it into slop. Additional mat…

1 month, 1 week ago @ podtrac.com
965: From PhD Side Project to $500M ARR: Will Falcon’s PyTorch Lightning Story

CEO of Lightning AI Will Falcon speaks to podcast host and Lightning AI fellow Jon Krohn about the company’s merger with Voltage Park, and why Will has named it the “full-stack AI neo-cloud for enterprises and frontier labs”. Lightning AI’s offer is a secure, flexible, and collaborative environment that can run on the cloud, all essentials for early-stage startups. Listen to the episode to hear Will Falcon discuss Lightning AI Studio, founding PyTorch Lightning, and how he came to found his AI company. This episode is brought to you by the Dell⁠⁠, by ⁠⁠Intel⁠⁠, by Fabi and by Cisco. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/965⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in spons…

1 month, 2 weeks ago @ podtrac.com
SDS 964: In Case You Missed It in January 2026

In this first of the year ICYMI episode, Jon Krohn selects his favorite moments from January’s SuperDataScience interviews. Listen to why incentivizing workers is the best way to get them to disclose their use of AI tools and pave the way for an AI-forward future, how AI continues to mimic human development in its own evolution, the importance of evaluation in building AI systems, and how to keep your best employees (and also: how to know your value) with guests Sadie St. Lawrence, Ashwin Rajeeva, Sinan Ozdemir, Vijoy Pandey, and Ethan Mollick. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/964⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email n…

1 month, 2 weeks ago @ podtrac.com
Data Science at Home
last post 3 weeks, 1 day ago
There Is No AI. There’s a Stateless Function on 10,000 GPUs Pretending to Know You (Ep. 299)

Personal newsletter: https://defragzone.substack.com
📩 Newsletter: https://datascienceathome.substack.com
🎙 Podcast: Available on Spotify, Apple Podcasts, and more.

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
Discord Channel: https://discord.gg/4UNKGf3

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, intervi…

3 weeks, 1 day ago @ datascienceathome.com
Bias in the machine (edited)

The title of today’s episode is Bias in the machine.

C: Francesco, today we are starting with an infuriating discussion.

The failure of the medical community as a whole to recognise this obvious bias up to the 21st century is an example of how insidious the problem of bias is.

Three: The bias in your training sample: people put training samples together, and people have culture, experience, and prejudice.

These assumptions inform the way AI systems work—and fail—to this day.

When an algorithm is a black box and you can’t look inside, you have no way of analysing its bias.

3 weeks, 1 day ago @ datascienceathome.com
What is wrong with reinforcement learning? (Ep. 82)

Join the discussion on our Discord server. After reinforcement learning agents did great at playing Atari video games, mastering Go with AlphaGo, trading in financial markets, and modeling language, let me tell you the real story here. In this episode I want to shine some light on reinforcement learning (RL) and the limitations that every practitioner should consider before taking certain directions.

RL seems to work so well!

What is wrong with it?

Are you a listener of Data Science at Home podcast?

Or did you subscribe to the Artificial Intelligence at your fingertips newsletter?

1 month, 3 weeks ago @ datascienceathome.com
How to generate very large images with GANs (Ep. 76)

Join the discussion on our Discord server. In this episode I explain how a research group from the University of Lübeck tamed the curse of dimensionality for the generation of large medical images with GANs.

The problem is not as trivial as it seems.

Many researchers have failed in generating large images with GANs before.

One interesting application of this approach is in medicine, for the generation of CT and X-ray images. Enjoy the show!

References

Multi-scale GANs for Memory-efficient Generation of High Resolution Medical Images: https://arxiv.org/abs/1907.01376

1 month, 3 weeks ago @ datascienceathome.com
Training neural networks faster without GPU [RB] (Ep. 77)

Join the discussion on our Discord server. Training neural networks faster usually involves the use of powerful GPUs.

In this episode I explain an interesting method from a group of researchers from Google Brain, who can train neural networks faster by squeezing the hardware to their needs and making the training pipeline more dense.
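The core trick from that paper, data echoing, is simple enough to sketch: when the input pipeline is the bottleneck, reuse each expensively loaded batch for several training steps instead of waiting for the next one. A minimal generator-based sketch (names are illustrative, not the paper's code):

```python
def data_echoing(batches, echo_factor=2):
    """Yield each upstream batch `echo_factor` times, so downstream
    training steps are not starved while the (slow) input pipeline —
    disk reads, decoding, augmentation — catches up."""
    for batch in batches:
        for _ in range(echo_factor):
            yield batch
```

The trade-off is that echoed batches are correlated, so each repeated step contributes less than a fresh one; the paper studies where in the pipeline to echo and by how much.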

Enjoy the show!

References

Faster Neural Network Training with Data Echoing: https://arxiv.org/abs/1907.05550

1 month, 3 weeks ago @ datascienceathome.com
More powerful deep learning with transformers (Ep. 84) (Rebroadcast)

Some of the most powerful NLP models like BERT and GPT-2 have one thing in common: they all use the transformer architecture.

This architecture is built on top of another important concept already known to the community: self-attention. In this episode I explain what these mechanisms are, how they work, and why they are so powerful.
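The self-attention mechanism at the heart of the transformer can be sketched in a few lines of plain Python. For clarity this omits the learned W_q/W_k/W_v projections and multiple heads, so queries, keys, and values are the input vectors themselves; it is a teaching sketch, not a production implementation:

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of floats
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X, d_k):
    """Scaled dot-product self-attention over a sequence X of d_k-dim vectors.
    Each output is a convex combination of all inputs, weighted by how
    strongly each position 'attends' to every other."""
    out = []
    for q in X:
        # similarity of this position's query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in X]
        weights = softmax(scores)
        # weighted sum of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, X)) for j in range(d_k)])
    return out
```

With a single token the attention weights collapse to [1.0] and the output equals the input, which is a handy sanity check when reimplementing the mechanism.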

Don’t forget to subscribe to our Newsletter or join the discussion on our Discord server.

References

1 month, 3 weeks ago @ datascienceathome.com
Your Favorite AI Startup is Probably Bullshit (Ep. 298) [RB]

The brutal truth about why Silicon Valley is blowing billions on glorified autocomplete while pretending it’s the next iPhone.

We’re diving deep into the AI investment circus where VCs who can’t code are funding companies that barely understand their own technology.

From blockchain déjà vu to the “ChatGPT wrapper” economy—this episode will make you question every AI valuation you’ve ever seen.

Fair warning: We’re naming names and calling out the hype.

Don’t listen if you work at a “revolutionary AI startup” that’s just OpenAI’s API with a pretty interface.

1 month, 3 weeks ago @ datascienceathome.com
Why AI Researchers Are Suddenly Obsessed With Whirlpools (Ep. 297) [RB]

VortexNet uses actual whirlpools to build neural networks.

By borrowing equations from fluid dynamics, this new architecture might solve deep learning’s toughest problems—from vanishing gradients to long-range dependencies.

Today we explain how vortex shedding, the Strouhal number, and turbulent flows might change everything in AI.

Sponsors

This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics and AI methods practical and accessible.

Join thousands of researchers and professionals who’ve advanced their careers with Statistical Horizons.

1 month, 3 weeks ago @ datascienceathome.com
AGI: The Dream We Should Never Reach (Ep. 296)

Also on YouTube. Two AI experts who actually love the technology explain why chasing AGI might be the worst thing for AI’s future, and why the current hype cycle could kill the field we’re trying to save.

Head to datascienceathome.com for detailed show notes, code examples, and exclusive deep-dives into the papers we discuss.

Subscribe to our newsletter for weekly breakdowns of cutting-edge research delivered straight to your inbox—no fluff, just science!

Our Discord community is full of ML engineers, researchers, and AI enthusiasts discussing papers, sharing projects, and helping each other level up.

Whether you’re debugging your first neural net or training your tenth transformer, there’s a …

1 month, 3 weeks ago @ datascienceathome.com
When Data Stops Being Code and Starts Being Conversation (Ep. 297)

Mark Brocato built Mockaroo—the tool that taught millions of developers how to fake data.

Now, as Head of Engineering at Tonic.ai, he’s building the AI agent that’s making his own creation obsolete.

From the hidden failures of legacy mocks to the security implications of agent-driven synthesis, Mark reveals what happens when data generation becomes a conversation—not a pipeline.

Sponsors

Tonic.ai: Synthetic data solutions for software and AI development. Accelerate engineering velocity and ensure compliance with AI-powered data synthesis.

This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics…

3 months ago @ datascienceathome.com
Your AI Strategy is Burning Money: Here’s How to Fix It (Ep. 295)

Most companies don’t have an AI problem.

In this conversation, he breaks down when AI actually makes sense, where AWS costs spiral out of control, and why your “cool demo” keeps dying before launch.

If you’re tired of AI hype and ready for straight answers, hit play.

Our Discord community is full of ML engineers, researchers, and AI enthusiasts discussing papers, sharing projects, and helping each other level up.

Whether you’re debugging your first neural net or training your tenth transformer, there’s a place for you.

4 months ago @ datascienceathome.com
From Tokens to Vectors: The Efficiency Hack That Could Save AI (Ep. 294)

LLMs generate text painfully slow, one low-info token at a time.

Researchers just figured out how to compress 4 tokens into smart vectors & cut costs by 44%—with full code & proofs!
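The gist of compressing several tokens into one vector can be illustrated by the simplest possible merge: averaging each run of four token embeddings so the model attends over a sequence roughly 4x shorter. This is a hedged toy sketch of the idea, not the paper's actual method (which learns the compression); the function name and group size are assumptions:

```python
def merge_tokens(token_vectors, group_size=4):
    """Compress each run of `group_size` token embeddings into a single
    vector by component-wise averaging. A trailing shorter group is
    averaged as-is."""
    merged = []
    for i in range(0, len(token_vectors), group_size):
        group = token_vectors[i:i + group_size]
        dim = len(group[0])
        merged.append([sum(v[j] for v in group) / len(group) for j in range(dim)])
    return merged
```

Since attention cost grows quadratically with sequence length, even this crude 4-to-1 merge cuts the attention work by roughly 16x; the research question is how to compress without losing the information a plain average throws away.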

Sponsors

This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics and AI methods practical and accessible.

Join thousands of researchers and professionals who’ve advanced their careers with Statistical Horizons.

Get $200 off any seminar with code DATA25 at https://statisticalhorizons.com

4 months, 2 weeks ago @ datascienceathome.com
Why AI Researchers Are Suddenly Obsessed With Whirlpools (Ep. 293)

VortexNet uses actual whirlpools to build neural networks.

By borrowing equations from fluid dynamics, this new architecture might solve deep learning’s toughest problems—from vanishing gradients to long-range dependencies.

Today we explain how vortex shedding, the Strouhal number, and turbulent flows might change everything in AI.

Sponsors

This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics and AI methods practical and accessible.

Join thousands of researchers and professionals who’ve advanced their careers with Statistical Horizons.

4 months, 4 weeks ago @ datascienceathome.com
The Scientists Growing Living Computers in Swiss Labs (Ep. 292)

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

With a focus on dual-use innovation, Amethix is shaping a future where intelligent machines extend human capability, not replace it.

Discover more at https://amethix.com This episode is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

Learn more at intrepid.ai

References
Website: finalspark.com
Discord: / discord
Newsletter: https://finalspark.com/#newsletter
Topics: Biological computing • Neural engineering • Energy-effic…

5 months ago @ datascienceathome.com
When AI Hears Thunder But Misses the Fear (Ep. 291)

Sanjoy Chowdhury reveals AI’s hidden weakness: while systems can see objects and hear sounds perfectly, they can’t reason across senses like humans do.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at https://amethix.com. This episode is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

Whether it’s in the sky, on the ground, or in orbit—if it’s intelligent and mobile, Intrepid helps you build it.

5 months, 2 weeks ago @ datascienceathome.com