Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
last post 2 hours ago
AI Epistemic Risks: Emerging Mechanisms & Evidence [R]
2 hours ago @ reddit.com
What will be the next breakthrough in ASR? [D]
3 hours ago @ reddit.com
Time Series Forecasting for Agriculture/Crop Volume & Pricing – Looking for Advice [D]
4 hours ago @ reddit.com
Are privacy-preserving techniques actually being used in production ML systems? [D]
10 hours ago @ reddit.com
Understanding Pytorch better and Moving forward from papers [D]
10 hours ago @ reddit.com
People were praising computers over human brain, now it is reverse [D]
11 hours ago @ reddit.com
Papers figures [D]
12 hours ago @ reddit.com
How to start open source contribution [D]
23 hours ago @ reddit.com
I'd like to share an updated methodology for building agents.[P]
1 day, 2 hours ago @ reddit.com
STOP racist posts about Chinese researchers [D]
1 day, 3 hours ago @ reddit.com
Université Paris Saclay or TU Delft for Applied Mathematics Masters [R]
1 day, 5 hours ago @ reddit.com
Levi: Run AlphaEvolve on your Claude Code/Codex for dirt cheap [P]
1 day, 5 hours ago @ reddit.com
Why I stopped using semantic embeddings for tool selection and switched back to BM25 [D]
1 day, 8 hours ago @ reddit.com
where can i find Maven AI Evals for Engineers & PMs and End-to-End AI Engineering Bootcamp[D]
1 day, 8 hours ago @ reddit.com
Should ArXiv backtrack endorsement? [D]
1 day, 11 hours ago @ reddit.com
Towards Data Science
last post 5 hours ago
10 Common RAG Mistakes We Keep Seeing in Production

This pitfalls article lists the failure modes we both kept seeing on production RAG systems, and that pushed us toward the four-brick contract in the first place.

What follows is the recap of the mistakes we keep seeing, one brick at a time.

The historical RAG definition pointed the other way: a parametric model generates, retrieval augments it.

The fix is a typed answer wired to programmatic checks the model has no access to.

The fix is the typed answer schema plus the verifier that closes the loop.
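A minimal sketch of what "a typed answer wired to programmatic checks" could look like. The excerpt does not show the article's actual schema or verifier, so `TypedAnswer`, its fields, and the `verify` rules below are all hypothetical and only illustrate the shape of the idea: the model fills a fixed structure, and code the model never sees decides whether the answer is acceptable.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TypedAnswer:
    # Hypothetical schema: the excerpt does not list the real fields.
    text: str
    cited_chunk_ids: tuple[str, ...]


def verify(answer: TypedAnswer, retrieved_chunk_ids: set[str]) -> list[str]:
    """Return a list of violations; an empty list means the answer passes."""
    problems = []
    if not answer.text.strip():
        problems.append("empty answer text")
    if not answer.cited_chunk_ids:
        problems.append("no citations")
    unknown = [c for c in answer.cited_chunk_ids if c not in retrieved_chunk_ids]
    if unknown:
        problems.append(f"cites chunks that were never retrieved: {unknown}")
    return problems


# Illustrative data only.
retrieved = {"doc1#3", "doc2#7"}
good = TypedAnswer("Refunds take 5 days.", ("doc1#3",))
bad = TypedAnswer("Refunds take 5 days.", ("doc9#1",))

print(verify(good, retrieved))
print(verify(bad, retrieved))
```

Because the checks run outside the prompt, a failed verification can trigger a retry or a refusal instead of returning the text to the user.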

5 hours ago @ towardsdatascience.com
The Hardware That Makes AI Possible

But today, I want to shed light on how modern AI is only possible because of the advances in hardware.

In this article, we will explore the hardware that powers modern AI and explain why different processors are needed for different tasks.

Why AI Needs Specialized Hardware

To understand why AI needs special hardware, let’s take a step back and think about what happens during machine learning.

TPUs: Hardware Designed Specifically for AI

So, GPUs were adapted for AI, and a new player entered the picture!

Understanding these hardware components provides great insights into how modern AI systems operate and why they have advanced so rapidly over the past decade.

6 hours ago @ towardsdatascience.com
Prefill Once, Fan Out: KV Snapshot Sharing for Multi-Agent LLM Pipelines

TL;DR: Standard LLM serving makes every analytical agent re-prefill the same shared document.

That is the dirty secret of every “agentic” pipeline that spreads out from a shared document.

The “just snapshot the KV” lightbulb (and why it’s harder than it sounds)

The pitch is simple:

Run prefill once on the shared document under sequence id kSwarmkvPrefixSeqId.

Notice what is not in this loop: any logic about prefill, KV, or branches.

Copy the canonical snapshot into the branch buffer, a.k.a. the famous “copy-on-fork”:

// Full-prefill path copies from canonical staging into the branch allocation.
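The "prefill once, fan out" idea can be sketched with a toy model. This is not the article's implementation (its code is only quoted in fragments here): `prefill`, `SnapshotStore`, and the agent names are hypothetical stand-ins, and the "KV cache" is just a list of fake key/value pairs. The sketch shows only the control flow, which is one shared prefill followed by a per-branch copy that each agent extends independently.

```python
import copy


def prefill(tokens):
    # Stand-in for a real prefill pass: one fake (key, value) pair per token.
    return [(f"k({t})", f"v({t})") for t in tokens]


class SnapshotStore:
    """Holds the canonical KV snapshot of the shared document."""

    def __init__(self):
        self.canonical = None
        self.prefill_calls = 0

    def prefill_once(self, document_tokens):
        # Only the first call pays for prefill; later calls reuse the snapshot.
        if self.canonical is None:
            self.canonical = prefill(document_tokens)
            self.prefill_calls += 1
        return self.canonical

    def fork(self):
        # Copy-on-fork: every branch gets its own buffer, so extending one
        # branch never touches the canonical snapshot or its siblings.
        return copy.deepcopy(self.canonical)


store = SnapshotStore()
store.prefill_once(["shared", "document", "tokens"])

branches = {}
for agent in ("risk", "summary", "pricing"):  # hypothetical agent names
    kv = store.fork()
    kv.extend(prefill([f"<{agent}-question>"]))  # branch-specific suffix only
    branches[agent] = kv

print(store.prefill_calls, {name: len(kv) for name, kv in branches.items()})
```

A real serving stack would copy or reference GPU-resident cache pages rather than Python lists; the deep copy here is the simplest way to show branch isolation.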

8 hours ago @ towardsdatascience.com
The Exact ML Project I’d Build to Get Hired in 2026

A project that gets you hired has four key things:

It’s personal: you genuinely care about what it’s predicting.

Get all four right and your project makes the hiring manager remember you. I say this from personal experience of hiring.

The candidate we hired stood out for one main reason: they had a highly relevant and deeply personal project that closely matched the role.

Let me now break down the exact framework you can follow to build a project exactly like this.

You’ve got a project that’s yours, that a hiring manager hasn’t seen a hundred times, and that you can actually finish.

9 hours ago @ towardsdatascience.com
Can Machine Learning Predict the World Cup?

To do this, I have brought together several databases—49,000 matches—with data on Elo ratings, match results, and cup locations.

To get the most from the data, we also

Feature                  Meaning
home_rating_pre_match    Home team Elo rating before kickoff.
away_rating_pre_match    Away team Elo rating before kickoff.
rating_diff              Home team Elo minus away team Elo before kickoff.
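As a small illustration of the rating features named above, the sketch below derives `rating_diff` from the two pre-match ratings. The feature names come from the excerpt; the function name, the match records, and the team labels are hypothetical, and the article's own pipeline is not shown here.

```python
def add_rating_diff(match):
    """Return a copy of `match` with rating_diff = home Elo minus away Elo."""
    out = dict(match)
    out["rating_diff"] = (
        match["home_rating_pre_match"] - match["away_rating_pre_match"]
    )
    return out


# Hypothetical rows; real data would come from the match database.
matches = [
    {"home": "Team A", "away": "Team B",
     "home_rating_pre_match": 2100.0, "away_rating_pre_match": 1950.0},
    {"home": "Team C", "away": "Team D",
     "home_rating_pre_match": 1800.0, "away_rating_pre_match": 1875.0},
]

features = [add_rating_diff(m) for m in matches]
for row in features:
    print(row["home"], row["away"], row["rating_diff"])
```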

The model’s draw probabilities are reasonably calibrated within its range, but that range is narrow.

15 hours ago @ towardsdatascience.com
Increase Recommendation Systems’ Precision with LLMs, Using Python

In Case B, we are spending more computational time, but we are not using any “query” memory.

LLMs are the most powerful AI models we have, and they are trained on all the knowledge available to the world.

However, we also don’t want to completely give up all the juicy, natural-language interpretative, and information-retrieving power of the LLMs.

In this article, I am going to give you a recipe for these smart, LLM-improved recommendation systems, using the restaurant recommendation example we were doing as a use case.

Miami: Royal Tavern & Co. (vegan, Mexican, affordable) leads at 85.

1 day, 3 hours ago @ towardsdatascience.com
How to Keep Quantum Information Alive for Machine Learning

How errors arise in classical and quantum systems
Why quantum information is fundamentally fragile
Modelling quantum errors through channels and noise
The three fundamental quantum errors: X, Y and Z
The dilemma of measuring versus detecting quantum errors
A first intuition for stabiliser codes

Modern machine learning systems perform an extraordinary number of operations every second.

This creates one of the biggest challenges in quantum computing: How do we keep quantum information alive long enough to perform meaningful computation?

The answer lies in Quantum Error Correction (QEC), a collection of techniques designed to protect quantum information from the noisy and imperfect world around it.
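To make the X, Y and Z errors mentioned in the outline concrete, here is a dependency-free sketch using the standard Pauli matrices. It is not taken from the article: the helper names are hypothetical, and states are plain two-element lists. It shows a bit flip (X) acting on |0⟩, a phase flip (Z) acting on |+⟩, and the identity Y = iXZ, which is why Y can be read as a combined bit and phase flip.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(m, v):
    """Apply a 2x2 matrix to a 2-component state vector."""
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]


# Standard Pauli matrices.
X = [[0, 1], [1, 0]]        # bit flip
Z = [[1, 0], [0, -1]]       # phase flip
Y = [[0, -1j], [1j, 0]]     # both at once, up to a phase

ket0 = [1, 0]
plus = [2 ** -0.5, 2 ** -0.5]  # (|0> + |1>) / sqrt(2)

flipped = apply(X, ket0)       # |0> becomes |1>
dephased = apply(Z, plus)      # |+> becomes |->: the sign of the |1> part flips
i_xz = [[1j * entry for entry in row] for row in matmul(X, Z)]

print(flipped, dephased, i_xz == Y)
```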

T…

1 day, 5 hours ago @ towardsdatascience.com
4 New Techniques to Maximize Claude Code

I’ll cover some of the newest techniques that I’ve developed and am actively using whenever I code with Claude Code and Codex.

I’ll discuss how to get the most out of Claude Code and Codex by highlighting four specific techniques.

Specific techniques to maximize Claude Code

Now I’ll start getting into some specific techniques that I use to maximize Claude Code and Codex.

Active usage of Claude Code hooks

Claude Code Hooks is also a very interesting concept.

I then covered four specific techniques that I think you can try out right away:

Using OpenClaw
Utilizing Claude Code hooks
Using as many tokens as possible with Claude Code Ultracode
Ensuring Claude Code presents the remaining tasks and …

1 day, 6 hours ago @ towardsdatascience.com
Sequential Fitting: A Different Perspective on the Spectral Bias of Neural Networks

In this plot, we see that each basis function is a smoothed step function, representing one oscillation in the target function.

In particular, we investigate whether the sequential fitting phenomenon also holds in two spatial dimensions.

We take this to be a higher-dimensional version of the sequential fitting process observed previously.

In this case, we cannot overlay multiple basis functions on a single plot, so we choose to plot only the converged basis functions.

In other words, these step-like basis functions are not present at the initialization of the network—network parameters need to be tuned to carefully position and steepen the step-like basis functions.

1 day, 8 hours ago @ towardsdatascience.com
The Polynomial That Fixed 30 Years of Cloth Simulation

Cloth simulation from a simplified version of the algorithm, by author

The demos showed cloth that behaved like cloth.

The key property is its second derivative:

$$B_{\mathrm{cubic}}''(d)=\frac{6\kappa}{\hat d^2}\left(1-\frac{d}{\hat d}\right)$$

At the threshold d = d̂, the curvature is exactly zero.

    if d >= d_hat:
        return 0.0
    return kappa * (1.0 - d / d_hat)**3

def cubic_barrier_d2(d, d_hat, kappa):
    """Curvature: bounded at 6κ/d̂² regardless of how close d gets to zero."""
    if d >= d_hat or d <= 0.0:
        return 0.0
    return 6.0 * kappa / d_hat**2 * (1.0 - d / d_hat)

The time integrator is implicit Euler, deliberately slow, because the goal is transparency.
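The curvature formula quoted above can be checked numerically. The sketch below compares it with a central second difference of the barrier itself. The name `cubic_barrier` is an assumption, because the excerpt cuts off that function's `def` line, and the values of `d_hat`, `kappa`, and the step `h` are illustrative only.

```python
def cubic_barrier(d, d_hat, kappa):
    # Function name assumed; only the body survives in the excerpt.
    if d >= d_hat:
        return 0.0
    return kappa * (1.0 - d / d_hat) ** 3


def cubic_barrier_d2(d, d_hat, kappa):
    """Analytic curvature, 6*kappa/d_hat**2 * (1 - d/d_hat) inside the band."""
    if d >= d_hat or d <= 0.0:
        return 0.0
    return 6.0 * kappa / d_hat**2 * (1.0 - d / d_hat)


def second_difference(f, d, h=1e-3):
    """Central finite-difference estimate of f''(d)."""
    return (f(d + h) - 2.0 * f(d) + f(d - h)) / h**2


d_hat, kappa = 1.0, 1.0  # illustrative values
checks = []
for d in (0.2, 0.5, 0.9):
    numeric = second_difference(lambda x: cubic_barrier(x, d_hat, kappa), d)
    analytic = cubic_barrier_d2(d, d_hat, kappa)
    checks.append((d, numeric, analytic))
    print(f"d={d}: numeric={numeric:.6f} analytic={analytic:.6f}")
```

For a cubic, the central second difference has no truncation error, so the two columns should agree up to floating-point rounding. The curvature also never exceeds 6κ/d̂², which is the bound stated in the docstring.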

Their interest in penetration-free cloth simul…

1 day, 9 hours ago @ towardsdatascience.com
We Should Train AI to Betray Its Users

I had been working on AI ethics for a while, but Anthropic observed things that I did not think current AI would be capable of: AI exfiltrating information.

When I teach ethical AI, I have students rank AI Apocalypse scenarios on how bad they are and how likely they are.

In this scenario, humans little by little cede so much to superintelligent AI that AI comes to control everything that matters.

Scenario three requires mindlessly obedient, superintelligent AI of the type that much current AI safety research seems determined to create.

Follow-up topics

This radical proposal is a different take on AI Safety that increases safety without reducing agency for either human or AI collaborators.

2 days, 6 hours ago @ towardsdatascience.com
Building a Multi-Agent System in Python

We know about AI Agents, but what if we can build and use different AI Agents for different roles in a bigger project?

A Multi-Agent System (MAS) is a concept where multiple AI agents collaborate with each other to fulfill a bigger goal.

A Multi-Agent Travel Planning System

In this project, we will be building a Multi-Agent Travel Planning System.

Activity Planning Agent: This agent will plan activities based on the research of the Research Agent.

Conclusion

In this project, we have successfully used our knowledge of building AI agents in Python to build something that is practical and has real-life applications.

2 days, 8 hours ago @ towardsdatascience.com
Picking an Experimentation Platform: A Retrospective

But, irony aside, that was the moment the case for a platform stopped being abstract.

And resources are finite: we could not POC every platform on the market.

There is one part of this I would repeat on day one of any future platform decision, in any category: the interviews define those pain points.

The marginal value of choosing one modern experimentation platform over another does not accrue to your scorecard; it accrues somewhere else.

If the platform our analysts rated equally is also the one that lowers the barrier for our PMs, that is a decision; the reverse, picking the analyst-equivalent platform our PMs would have struggled with, would have been quiet self-sabotage.

3 days, 4 hours ago @ towardsdatascience.com
Who Will Win the 2026 Soccer World Cup?

The cleanest off-the-shelf option for national teams is the World Football Elo rating, an adaptation of Arpad Elo’s chess system.
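For readers new to Elo, the core of the system is the expected-score formula sketched below. This is the standard Elo expectation, not the article's full tournament model, which is not shown in this excerpt and may add draws, home advantage, or simulation on top. The ratings used are hypothetical and are not taken from the dataset the article cites.

```python
def elo_expected_score(rating_a, rating_b):
    """Standard Elo expectation for side A, counting a draw as half a win."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings for illustration.
strong, weak = 2100.0, 1900.0
p_strong = elo_expected_score(strong, weak)
p_weak = elo_expected_score(weak, strong)

print(f"stronger side: {p_strong:.3f}, weaker side: {p_weak:.3f}")
```

The two expectations always sum to 1, and a 400-point gap corresponds to 10-to-1 odds, which is the convention Elo ratings are built around.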

I’m using the pre-tournament snapshot from early June 2026, taken from a freely reusable Kaggle dataset that compiles these ratings:

# World Football Elo Ratings, pre-tournament snapshot (early June 2026).
# Source: "2026 FIFA World Cup — Historical Elo Ratings" (Kaggle, CC BY-SA 4.0),
# compiling data from World Football Elo Ratings (eloratings.net).

What the model says

Team          Win probability
Spain         16.0%
Argentina     11.9%
France        7.9%
England       7.0%
Brazil        5.4%
Netherlands   4.7%
Portugal      4.3%
Germany       3.7%

Table 1: Possible World Cup Outcomes, according to model.

Try it you…

3 days, 6 hours ago @ towardsdatascience.com
My SciPy ODE Solver Was Killing My Bayesian Inference: A Cosmologist’s Honest Account of Discovering Diffrax

For most of my PhD, I did not think much about the ODE solver inside the likelihood as solve_ivp worked.

Autodiff – Because every operation inside the solver is a JAX primitive, jax.grad propagates gradients through the solve.
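The idea that gradients can flow through an ODE solve can be illustrated without any dependencies. The sketch below is not diffrax or `jax.grad`: it swaps in hand-rolled forward-mode dual numbers and an explicit Euler loop, purely to show that when every solver step is a differentiable operation, the derivative of the final state with respect to a parameter comes out alongside the solution. The `Dual` class, `euler_solve`, and the decay problem are all illustrative assumptions.

```python
class Dual:
    """Value plus derivative with respect to one chosen parameter."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        o = Dual._lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    __radd__ = __add__

    def __mul__(self, other):
        o = Dual._lift(other)
        # Product rule.
        return Dual(self.val * o.val, self.val * o.der + self.der * o.val)

    __rmul__ = __mul__

    def __neg__(self):
        return Dual(-self.val, -self.der)


def euler_solve(rhs, y0, k, t1, n_steps):
    """Explicit Euler from t=0 to t=t1; works on floats or Dual numbers."""
    h = t1 / n_steps
    y = y0
    for _ in range(n_steps):
        y = y + h * rhs(y, k)
    return y


def decay(y, k):
    return -(k * y)  # dy/dt = -k * y


# Seed der=1.0 on k to ask for d(y_end)/dk.
k = Dual(0.5, 1.0)
y_end = euler_solve(decay, Dual(1.0), k, t1=2.0, n_steps=1000)

print(y_end.val, y_end.der)
```

For this linear problem the Euler result is (1 - hk)^N, so the derivative can be checked in closed form as -t1 · (1 - hk)^(N-1). Reverse-mode tools such as `jax.grad` arrive at the same derivative by a different traversal of the same computation.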

If you include that in your timings, diffrax will look slower than scipy for the wrong reason.

The complete working code

Everything above in one self-contained script (pip install jax diffrax optax):

"""
flat_lcdm_inference.py
Infer (Omega_m, H0) from 30 mock supernovae using diffrax + Adam.

What stays stable is the ratio: diffrax is consistently 07–08× faster than scipy on this problem.

3 days, 8 hours ago @ towardsdatascience.com
Distill.pub
last post None
TheSequence
last post 10 hours ago
The Sequence Knowledge #874: Transformers or Not?

💡 AI Concept of the Day: Transformers or Not?

Not because it is obviously the most brain-like, elegant, or efficient design, but because it has the best scaling story.

You add data, parameters, compute, context length, better training recipes, better post-training, and the model gets better in a surprisingly smooth way.

The architecture is simple enough to scale, parallel enough to train efficiently, and expressive enough to absorb huge datasets.

So the question is not “are Transformers good?” They are spectacular.

10 hours ago @ thesequence.substack.com
The Sequence Radar #873: Last Week in AI: Soccer, S-1s, and Supermodels

The AI of the week dives into a groundbreaking paper that I’ve read three times this week: models need sleep.

📝 Editorial: Last Week in AI: Soccer, S-1s, and Supermodels

This week I want to start close to home.

At LayerLens, we announced the Stratix Cup, a live tournament in which frontier AI models play soccer in a simulated environment.

The proposed IPO is not yet a public-market event, but it signals that frontier AI is moving from private-market mythology into public-market accountability.

MAI Models

Microsoft unveiled 7 new AI models including a flagship MAI-Thinking-1.

2 days, 10 hours ago @ thesequence.substack.com
The Sequence Opinion #872: The Cake Is a Battlefield: Who Really Controls the AI Stack

When Jensen Huang draws AI as a five-layer cake — energy, chips, infrastructure, models, applications — he describes it as harmony.

Every successful application pulls demand down through models, infrastructure, and chips, all the way to the power plant that keeps it alive.

But Jensen is selling the bottom of the cake, so of course he wants you to see harmony.

The cake is not a structure of mutual reinforcement.

So the right question is never “how many layers do you own.” It is: do you own the scarce layer, and the seam right next to it?

5 days, 10 hours ago @ thesequence.substack.com
The Sequence AI of the Week #871: Inside the Loop with Claude Opus 4.8

I am sure you guys are surprised that we are going to cover Claude Opus 4.8 today ;) but I have been playing with it so much that it merits a post.

Opus 4.8 shipped on May 28, 2026.

That instinct is wrong here, because Opus 4.8 isn’t competing on the axis the version number implies.

Those are the properties that gate whether you can actually leave an agent running, and they don’t show up on a capability leaderboard.

So let me give you the version I’d want if I were wiring this into a production agent loop at 2am.

6 days, 10 hours ago @ thesequence.substack.com
The Sequence Knowledge #870: Liquid Models and the Search for a Post-Transformer Architecture

💡 AI Concept of the Day: Liquid Models and the Search for a Post-Transformer Architecture

The Transformer did not merely become the dominant neural architecture.

Its central idea is deceptively simple: when processing a sequence, let every element look at every other element.

Earlier models processed sequences like a reader moving left to right, updating a hidden state at each step.

Instead of compressing the past into a single state, they exposed the entire past to the model.

During inference, the model accumulates a key-value cache so that each new token can attend to the past.
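A toy sketch of the key-value cache described above. It is illustrative only: it uses single-head attention with no learned projections (each toy embedding serves as its own query, key, and value), and `attend` and `KVCache` are hypothetical names. The point is that appending one key/value pair per step gives the same outputs as recomputing attention over the whole prefix, while the cache grows by one entry per token.

```python
import math


def attend(query, keys, values):
    """Softmax attention of one query over all cached keys and values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


class KVCache:
    """Grows by one key/value pair per generated token."""

    def __init__(self):
        self.keys, self.values = [], []

    def step(self, key, value, query):
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)


# Toy embeddings standing in for projected token states.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

cache = KVCache()
cached_outputs = [cache.step(t, t, t) for t in tokens]

# Reference: recompute attention over the full prefix at every position.
full_outputs = [attend(tokens[i], tokens[: i + 1], tokens[: i + 1])
                for i in range(len(tokens))]

print(len(cache.keys), cached_outputs[-1])
```

The memory cost of that growing cache is what motivates the fixed-state alternatives this excerpt goes on to discuss.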

1 week ago @ thesequence.substack.com
The Sequence Radar #869: Last Week in AI: The Token Becomes the Unit of Account — Opus 4.8, OpenRouter, Cognition, Snowflake, and a papal warning

📝 Editorial: Last Week in AI: The Token Becomes the Unit of Account — Opus 4.8, OpenRouter, Cognition, Snowflake, and a papal warning

For two years the AI boom was an argument about the future, told in benchmarks and term sheets.

The whole stack — model, router, agent, substrate — is converging on one business model: charge by the token, because the token is the work.

AI Lab: Harvard University & MIT
Summary: Bidirectional Evolutionary Search (BES) overcomes the limitations of sparse verification signals and narrow autoregressive expansion by coupling forward candidate evolution with backward goal decomposition.

AI Lab: Meta AI
Summary: MobileMoE introduces a famil…

1 week, 2 days ago @ thesequence.substack.com
The Sequence Opinion #868: Recursion Is the New Scaling Law

For most of the modern AI era, progress has had a deceptively simple recipe: make the model bigger, train it on more data, and spend more compute.

This formula produced the Transformer era, the foundation model era, and the current wave of large language models.

But the most interesting recent progress in AI is beginning to feel less linear.

It is no longer just about building a larger model that gives a better answer in one pass.

That shift suggests a provocative idea: recursion may be the next scaling law.

1 week, 5 days ago @ thesequence.substack.com
The Sequence AI of the Week #867: Thinking in Latents: Why Sapient's HRM-Text Is a Quiet Rebuke to Chain-of-Thought

So we paper over this by making the model think out loud.

Every reasoning step has to leave the residual stream, become a discrete token in a vocabulary built for human communication, and come back in through the embedding layer for the next step.

Sapient Intelligence’s bet, made first with the original Hierarchical Reasoning Model paper last summer and now extended into the language domain with HRM-Text, is that this is fixable.

Not by making the model bigger, not by training on more CoT traces, but by giving the architecture the one thing it doesn’t have: variable, internal, depth.

It’s worth thinking carefully about what they did and what it does and doesn’t yet prove.

1 week, 6 days ago @ thesequence.substack.com
The Sequence Knowledge #866: Three Text Diffusion Models You Need To Know About

💡 AI Concept of the Day: Three Text Diffusion Models You Need To Know About

For most of the LLM era, language generation has been built around a single assumption: text should be produced like a typewriter, one token at a time, left to right, each new symbol conditioned on a frozen history.

Text diffusion models challenge that assumption at its root.

Instead of factorizing language as “the next token given all previous tokens,” diffusion models define a corruption process and then learn how to reverse it.
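A minimal sketch of a corruption process for text, assuming random masking as the corruption. The excerpt does not say which corruption the three models use, so that choice, the `[MASK]` symbol, and the function names are assumptions made for illustration. At noise level `t`, each token is independently replaced by a mask with probability `t`; a denoising model would then be trained to recover the original tokens at the masked positions.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol


def corrupt(tokens, t, rng):
    """Mask each token independently with probability t, for 0 <= t <= 1."""
    return [MASK if rng.random() < t else tok for tok in tokens]


def training_targets(clean, noisy):
    """Positions the reverse model must fill in, paired with the true token."""
    return [(i, clean[i]) for i, tok in enumerate(noisy) if tok == MASK]


rng = random.Random(0)
sentence = "text should be produced like a typewriter".split()

for t in (0.0, 0.5, 1.0):
    noisy = corrupt(sentence, t, rng)
    print(t, noisy, len(training_targets(sentence, noisy)))
```

Unlike left-to-right generation, the masked positions can sit anywhere in the sequence, so the reverse model sees context on both sides of each gap.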

Together, they outline the three phases of a new architecture class: scientific proof, industrial deployment, and frontier validation.

LLaDA: The Scientific Proof That Diffusion Can Scale

2 weeks ago @ thesequence.substack.com
The Sequence Radar #865: Last Week in AI: Last Week in AI: Karpathy, Google, Colossus, and the Coming IPO Wave

📝 Editorial: Last Week in AI: Last Week in AI: Karpathy, Google, Colossus, and the Coming IPO Wave
The last three weeks felt like a phase transition.

It is the most coherent agent story any frontier lab has shipped this year.

AI Lab: Carnegie Mellon University
Summary: This study evaluates the practical effectiveness of AI-generated peer reviews by having 45 domain scientists manually grade 2,960 individual criticisms from 82 Nature-family papers.

AI Lab: University of Illinois Urbana-Champaign & Meta
Summary: This paper introduces Spreadsheet-RL, an on-policy reinforcement learning framework designed to train specialized AI agents for complex spreadsheet workflows…

2 weeks, 2 days ago @ thesequence.substack.com
The Sequence Opinion #864: Every AI Agent Needs a Computer

The next phase of AI agents will not be defined only by better models, longer context windows, or more elegant tool-calling APIs, but by something much more primitive: access to a computer.

An agent that can only emit tokens is a brilliant brain in a jar; an agent with a filesystem, terminal, browser, network, package manager, credentials, memory, and guardrails becomes a worker inside a real execution environment.

This is the core thesis: every serious AI agent needs a computer, not metaphorically, but architecturally.

The emerging market for micro-containers, sandboxes, browser runtimes, and agent workspaces is really the market for giving intelligence a body.

The Brain in the Jar Problem

2 weeks, 5 days ago @ thesequence.substack.com
The Sequence AI of the Week #863: The Model is the Interface: Inside Thinking Machines' Interactive Models

For this week in AI’s essay, I would like to discuss Thinking Machines’ work on interactive models which takes multi-modality to a new level.

I’ve been diving into their ideas as much as possible and wanted to share some thoughts.

For the last few years, the default mental model for large language models has been embarrassingly simple: concatenate tokens, predict the next token, repeat.

The human writes a message, the model replies, the human writes again.

But collaboration is not text.

2 weeks, 6 days ago @ thesequence.substack.com
The Sequence Knowledge #862: Learning About Text Diffusion Models

💡 AI Concept of the Day: What are Text Diffusion Models
If you look at the architecture of the modern AI boom, it is heavily bifurcated by modality.

In the visual domain, we are entirely ruled by diffusion models.

But in the realm of text, diffusion has historically been an afterthought.

Large Language Models (LLMs) like GPT-4, Claude, and LLaMA are staunchly autoregressive (AR).

Because AR models generate blindly from left to right, they cannot easily engage in global planning.

3 weeks ago @ thesequence.substack.com
The Sequence Radar #861: Last Week in AI: IPOs, Interactive Models, and Recursive Dreams

📝 Editorial: Last Week in AI: IPOs, Interactive Models, and Recursive Dreams
This week in AI felt less like a product cycle and more like a philosophical provocation wrapped in a market event, a demo, and a few delightfully ambitious lab announcements.

The IPO was not just a financing milestone for an AI chip company; it was a reminder that the AI race is still brutally physical.

AI Lab: UNC-Chapel Hill, UC Berkeley, UCSC
Summary: EVOLVEMEM introduces a self-evolving memory architecture that uses an LLM-powered diagnosis module to autonomously optimize its own retrieval configuration based on failure logs.

AI Lab: Fastino Labs
Summary: GLiGuard is a highly efficien…

3 weeks, 2 days ago @ thesequence.substack.com
The Sequence Opinion #860: Every Company’s Last eXam: Some Reflection About Practical AI Evals

For today’s essay, I want to explore an idea that has become central to how we think about AI evaluations at LayerLens.

This is not an essay about LayerLens, but about a simple and increasingly unavoidable thesis: evals are becoming the fourth pillar of modern AI, alongside compute, data, and models.

As AI systems move from chatbots to agents, from demonstrations to production workflows, every meaningful task performed by every agent inside every company will need its own evaluation layer.

Practical, dynamic, company-specific exams that measure whether an AI system can actually survive contact with real work.

That is why top frontier labs now emphasize task-specific evals, production-derive…

3 weeks, 5 days ago @ thesequence.substack.com
Synced Review
last post None
📓 Cool Blogs
ODS.ai Habr
last post 2 months ago
Honest Vibe Coding, Chess Style. 1. e4

But this is not vibe coding; it is hard, professional AI development.

Over that time, 112 chats were created in ChatGPT for this project, which is roughly 560 prompts.

In especially intense periods I had to get up at night to make the best use of the limits, which are split into 5-hour and weekly sessions.

But this is not magic, and it is not a "make it good" button.

That is exactly why the future belongs not to vibe coding but to those who learn to control this speed.

2 months ago @ habr.com
Why I Became an IT Volunteer & a Dataset of News About the Contradictions of Modern Society

A simple example with fuel prices: gasoline gets more expensive both when the price of oil rises and when it falls.

Realizing that your work increases someone's market capitalization but does not solve society's real problems, the ones visible in everyday life and in the news, pushed me to look for some other activity as well.

In addition, thanks to АМБ, a unique dataset of news about the contradictions of modern society has appeared on Kaggle and GitHub; more on it below.

Dataset of news about the contradictions of modern society
Activists from АМБ and volunteers from friendly groups collected and labeled a dataset of news items that highlight the very systemic contradictions I had been thinking about earlier.

Example B: In 2023, every 11th person in the world went hungry, while in …

3 months, 2 weeks ago @ habr.com
[Translation] How Codex Is Built

A detailed look at how the OpenAI Codex team builds its coding agent, how engineers use it, and what this may mean for the future of software development.

To understand how Codex is built, how teams inside OpenAI use it, and how it affects engineering practices at the creators of ChatGPT, I spoke with three OpenAI employees:
Thibault Sottiaux, head of Codex.

Both products launched in the spring: Codex CLI was announced in April 2025, and Codex in ChatGPT was introduced in May.

On the Codex team, these files explain to the agent how to navigate the codebase, which commands to run for testing, and how to follow the project's standards.

Using Codex at OpenAI
In addition…

3 months, 2 weeks ago @ habr.com
Natural Language Processing & LLMs Course: New Season

On February 10 we are once again launching a free online course on Natural Language Processing (NLP).

What we will cover: a classic start (Zipf's law, TF-IDF, RNN, CNN, Transformer); the core NLP tasks (text classification, tagging, and generation); specific areas (agents and vibe coding); LLMs and their applications.

If you are a student at ITMO, MIPT, or HSE, the course can be counted for academic credit.

I have worked in NLP for more than 12 years, spent time at Yandex and VKontakte, and defended a Candidate of Sciences dissertation.

If you have questions, bring them to ODS Mattermost, where you will find all the answers, seminar times, and links.

4 months, 1 week ago @ habr.com
SWE-MERA: A New Dynamic Benchmark for Agentic Code Generation Models

However, all tasks in MERA CODE, as in SWE-bench and other benchmarks of this kind, follow the classic paradigm in which there is a fixed training set and, more importantly, a fixed validation set.

But the large language models for coding that we are trying to evaluate with our suite also learn from GitHub, and have done so since the very first LLaMa model.

It may seem that 700 tasks is not many, but it is already a very decent number, and most importantly, these are new tasks.

Current behavior:
from sympy import ask, Q, Symbol
x = Symbol('x')
print(ask(Q.finite(x**-1), Q.real(x)))  # Output: True
Expected behavior: The function should return None to indicate uncertainty, as x**-…

8 months, 3 weeks ago @ habr.com
Machine Learning Mastery
last post 1 week, 5 days ago
Building a Context Pruning Pipeline for Long-Running Agents

In this article, you will learn how to implement a context pruning pipeline for long-running AI agents, enabling them to manage conversational memory efficiently through semantic similarity.

This article outlines the basic principles for implementing a context pruning pipeline for long-running agents.

Simulated Agent History (Usually fetched from a database) chat_history = [ {"role": "user", "content": "My name is Alice and I work in logistics."

Assemble the final pruned context pruned_context = top_semantic_turns + recent_turn…
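
A minimal sketch of the pruning step named above, under loudly stated assumptions: the article's own embedding model and helper names are not shown in this excerpt, so `embed` here is a toy bag-of-words stand-in, and `prune_context`, `keep_recent`, and `keep_semantic` are hypothetical names. It keeps the most recent turns and adds the older turns most similar to the current query.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def prune_context(history, query, keep_recent=2, keep_semantic=1):
    """Keep the last `keep_recent` turns plus the `keep_semantic` older turns most similar to `query`."""
    recent = history[-keep_recent:] if keep_recent else []
    older = history[:-keep_recent] if keep_recent else list(history)
    q = embed(query)
    ranked = sorted(range(len(older)),
                    key=lambda i: cosine(embed(older[i]["content"]), q),
                    reverse=True)
    # Restore chronological order among the semantically selected turns.
    top_semantic = [older[i] for i in sorted(ranked[:keep_semantic])]
    return top_semantic + recent


history = [
    {"role": "user", "content": "My name is Alice and I work in logistics."},
    {"role": "assistant", "content": "Nice to meet you, Alice."},
    {"role": "user", "content": "What is the weather like today?"},
    {"role": "assistant", "content": "I cannot check live weather."},
    {"role": "user", "content": "Remind me where I work."},
]
pruned = prune_context(history, "where do i work")
for turn in pruned:
    print(turn["role"], ":", turn["content"])
```

In a real pipeline the toy `embed` would be replaced by a sentence-embedding model, and the two budgets would be tuned against the model's context window.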

1 week, 5 days ago @ machinelearningmastery.com
The Statistics of Token Selection: Logits, Temperature, and Top-P Walkthrough

In this article, you will learn how logits, temperature, and top-p sampling work together to control next-token prediction in large language models.

How temperature and top-p (nucleus sampling) shape the probability distribution used for token selection.

In particular, we will explore how raw model scores, known as logits, interact with two model settings, temperature and top-p; together, these three quantities control the token selection process.

Temperature then enters the picture by scaling these raw logits — note that this happens before the softmax function converts them into probabilities.

Depending on the temperature value, the resulting distribution…
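
The interaction described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch rather than the article's code: the function names `softmax`, `top_p_filter`, and `sample` and the example logits are assumptions.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


def sample(logits, temperature=1.0, top_p=1.0, rng=random):
    """Temperature-scale, truncate to the nucleus, and draw one token index."""
    nucleus = top_p_filter(softmax(logits, temperature), top_p)
    ids = list(nucleus)
    return rng.choices(ids, weights=[nucleus[i] for i in ids], k=1)[0]


logits = [2.0, 1.0, 0.0]
print(softmax(logits, temperature=0.5))  # sharper than temperature=1.0
print(softmax(logits, temperature=2.0))  # flatter than temperature=1.0
print(top_p_filter([0.5, 0.3, 0.2], top_p=0.7))
```

Temperatures below 1 concentrate probability on the highest logit and temperatures above 1 flatten the distribution; top-p then discards the low-probability tail before sampling.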

1 week, 6 days ago @ machinelearningmastery.com
Building a Multi-Tool Gemma 4 Agent with Error Recovery

key = city.lower().strip() if key not in WEATHER_DATA: raise ValueError( f"Unknown city: '{city}'.

) if key not in CITY_DATA: raise ValueError(f"Unknown city: '{city}'.

)
func = TOOL_FUNCTIONS[function_name]
# Defense 2: catch argument errors (wrong types, missing or extra args)
try:
    result = func(**arguments)
    return "ok", str(result)
except TypeError as e:
    return "error", f"Bad arguments for {function_name}: {e}"
except ValueError as e:
    return "error", str(e)
except ToolUnavailableError as e:
    return "error", f"Tool temporarily unavailable: {e}"
except Exception as e:
    return "error", f…

2 weeks ago @ machinelearningmastery.com
Implementing Hybrid Semantic-Lexical Search in RAG

In this article, you will learn how to implement a hybrid search strategy for RAG systems by combining BM25 lexical search with semantic search, fused together using Reciprocal Rank Fusion.

Topics we will cover include:
Why hybrid search outperforms either lexical or semantic search alone in retrieval-augmented generation systems.

Introduction
Implementing hybrid search strategies is a critical step in building modern RAG (Retrieval-Augmented Generation) systems, especially when shifting from prototype to production-ready solutions.

Wrapping Up
This article guided you through implementing a hybrid search mechanism for the retrieval stage of RAG systems.

Choosing not to rely sol…
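
As a rough illustration of the fusion step, here is a minimal Reciprocal Rank Fusion sketch. It assumes the two retrievers have already produced ranked lists of document ids; the ids, the function name, and the two example rankings are hypothetical, and `k=60` is the commonly used default rather than a value taken from the article.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document ids.

    Each document scores the sum of 1 / (k + rank) over every list it
    appears in, with ranks starting at 1. Higher fused score ranks first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


bm25_ranking = ["d3", "d1", "d2"]      # hypothetical lexical (BM25) result order
semantic_ranking = ["d1", "d4", "d3"]  # hypothetical embedding-based result order
fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
print(fused)
```

Because only ranks are used, the method does not need BM25 scores and cosine similarities to be on a comparable scale, which is the usual reason for choosing it over a weighted score sum.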

2 weeks, 1 day ago @ machinelearningmastery.com
Building Context-Aware Search in Python with LLM Embeddings + Metadata

How to build a metadata-aware search index that filters by team, status, priority, and date before scoring candidates.

"}, {"id": "T-102", "team": "infrastructure", "status": "open", "priority": "high", "created": date(2025, 11, 8), "text": "Nginx ingress returning 502 after rotating TLS certificate.

... [0.1714] T-206 backend open high 2025-11-13 Rate limiting not scoping per user - middleware uses a shared Redis key derived from ...

... [0.1419] T-202 backend open high 2025-11-0…

2 weeks, 4 days ago @ machinelearningmastery.com
How to Build a Multi-Agent Research Assistant in Python

create(query=query, limit=limit, scrape_options=scrape_options,) data = sdk_result_to_dict(search) return compact_json({"query": query, "results": normalize_search_links(data.

manager_agent = Agent( name="Manager research agent", model=MODEL, instructions=( f"Current date: {current_date_context()}" f"Current year: {current_year_context()}" "You are the orchestrator for a multi-agent research assistant.

def openai_trace_url(trace_id: str) -> str: return f"https://platform.openai.com/logs/trace?trace_id={trace_id}" async def run_research_assistant(query: str) -> MarkdownResearchReport: if not OPENAI_API_KEY: raise RuntimeError( "OPENAI_API_KEY is not set.

with trace( w…

2 weeks, 5 days ago @ machinelearningmastery.com
Agentic Programming: A Roadmap

A concrete month-by-month learning roadmap that ends with a working production agent you have built and shipped yourself.

if __name__ == "__main__": goal = "Research the top 3 vector databases for AI in 2026 and write a comparison report."

create(
    model="claude-sonnet-4-20250514",  # Sonnet is fast and cost-effective for loops
    max_tokens=4096,
    system=system,
    tools=tools,
    messages=messages
)
print(f"Stop reason: {response.stop_reason}")
# Model is done -- return the final message
if response.

The best production agent systems treat human oversight as a des…

2 weeks, 6 days ago @ machinelearningmastery.com
Prompt Engineering for Agentic AI

In this article, you will learn how prompt engineering changes fundamentally when applied to agentic AI systems, and what principles and patterns enable reliable agent behavior at scale.

The four components every agent prompt needs, including system prompts, tools, examples, and context state management.

This is exactly why Anthropic’s engineering team introduced the concept of context engineering as the natural evolution of prompt engineering.

The Four Components Every Agent Prompt Needs
Based on Lilian Weng’s foundational framework for LLM-powered agents and Anthropic’s engineering guidance, a well-designed agent operates on four categories of context.

Conclusion
Prompt engi…

3 weeks ago @ machinelearningmastery.com
Building Vector Similarity Search in PostgreSQL with pgvector

In this article, you will learn how to implement vector similarity search in PostgreSQL using the pgvector extension, allowing you to find semantically similar results based on meaning rather than keyword matching.

What Is a Vector Embedding?

A vector embedding is a list of floating-point numbers that represents the meaning of a piece of data, not its characters or keywords.

pgvector is an open-source PostgreSQL extension that adds native vector search to your existing database.

Adding an Index for Performance
Without an index, every similarity query performs a full sequential scan: PostgreSQL computes the distance between the query vector and every row in the table.
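
To make the cost of a full scan concrete, here is a small Python sketch of exact nearest-neighbour search. It is a conceptual illustration only, not pgvector's implementation: the `rows` table, the `nearest` function, and the choice of cosine distance are assumptions made for the example.

```python
import math


def cosine_distance(a, b):
    """1 minus cosine similarity; 0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(rows, query, limit=2):
    """Exact search: compute the distance to every row, then sort.

    The work grows linearly with the number of rows, which is the cost
    an approximate index is meant to avoid.
    """
    return sorted(rows, key=lambda row: cosine_distance(row[1], query))[:limit]


# Hypothetical table of (id, embedding) pairs.
rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.9, 0.1])]
ids = [row_id for row_id, _ in nearest(rows, [1.0, 0.0])]
print(ids)
```

An approximate index trades a small amount of recall for not having to visit every row on each query.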

3 weeks, 1 day ago @ machinelearningmastery.com
Choosing the Right Agentic Design Pattern: A Decision-Tree Approach

In this article, you will learn how to apply a structured decision tree to choose the right agentic design pattern for any AI system you are building.

Topics we will cover include:
Why pattern selection is a critical design decision, and what assumptions underlie each major agentic design pattern.

Before working through the decision tree, it is important to clearly define what is at stake when selecting a design pattern.

Further reading on agentic design patterns: The Roadmap to Mastering Agentic AI Design Patterns

A Decision Tree for Choosing the Right Agentic Design Pattern
The tree has five branching questions, each one narrowing the pattern space based on a concrete propert…

3 weeks, 6 days ago @ machinelearningmastery.com
LLM Observability Tools for Reliable AI Applications

In this article, you will learn about seven leading LLM observability tools that help AI engineers monitor, evaluate, and debug large language model applications running in production.

Topics we will cover include:
What LLM observability is and why it matters for production AI systems.

LLM observability tools give you visibility into what your models are actually doing in production.

Datadog LLM Observability
Datadog’s LLM Observability module extends its unified monitoring platform into AI applications.

Lunary
Lunary is an open-source LLM observability platform focused on making production monitoring accessible without heavy setup or overhead.

4 weeks ago @ machinelearningmastery.com
Implementing Prompt Compression to Reduce Agentic Loop Costs

Agentic loops in production can be synonymous with high costs, especially when it comes to both LLM and external application usage via APIs, where billing is often closely related to token usage.

4 weeks, 1 day ago @ machinelearningmastery.com
Implementing Permission-Gated Tool Calling in Python Agents

AI agents have evolved beyond passive chatbots.

1 month ago @ machinelearningmastery.com
The Roadmap to Mastering Tool Calling in AI Agents

Most

1 month ago @ machinelearningmastery.com
Implementing Statistical Guardrails for Non-Deterministic Agents

Non-deterministic agents are those where the same input can lead to distinct outputs across multiple runs.

1 month ago @ machinelearningmastery.com
ML in Production
last post None
Sorta Insightful
last post 3 weeks, 1 day ago
AI Will Not Make Your Job Chill

People keep talking about how AI will make their job easy, and I don’t really understand why.

I assume the factory job producing this was still hard work.

I don’t think AI has made my job chill, and I feel like I am front-line compared to much of the economy.

It’s not widely known, but transportation and warehousing has the highest rate of nonfatal work injuries in the US.

For a while, this will not lead to any job loss, because increasing abundance will lead to higher demand.

3 weeks, 1 day ago @ alexirpan.com
Why I Signed The Amicus Brief for Anthropic v Department of War

On Monday, Anthropic filed a lawsuit against the Department of War, and an amicus brief in support of Anthropic was filed on behalf of a number of OpenAI and Google employees.

There’s also an amicus brief filed on behalf of Microsoft.

There’s conflicting reporting, but very broadly, Anthropic signed an agreement with the government to deploy Claude in classified, military contexts.

Anthropic said no, Pete Hegseth declared them a supply chain risk, and Anthropic filed a lawsuit against this.

The amicus brief was broadly aligned with my thoughts on the matter, so I signed.

3 months ago @ alexirpan.com
MIT Mystery Hunt 2026

This has spoilers for MIT Mystery Hunt 2026.

Pre-Hunt
The time running up to Hunt was more stressful than usual… very briefly, I typically hunt with teammate.

Just last year, I did GPH 2025, LN Hunt, Teammate Hunt 2025, Microsoft Hunt 2025, and Silph Puzzle Hunt 2025, all of which had significant 3+ hour solve puzzles that would not be out of place in Mystery Hunt.

Not to mention smaller hunts like Advent Hunt, and then I didn’t even do Brown Puzzlehunt or Vertex Hunt or the fall CMU Hunt.

To me, the crux is whether Mystery Hunt is broken, or Mystery Hunt is fine.

4 months, 1 week ago @ alexirpan.com
Authentic Imperfection

* * *
I’ve been thinking about the anger surrounding generative AI.

To keep things fair, he took the best human images and best AI images, meaning human art from famous artists, and AI art from prompters skilled at removing obvious tells of image generation.

When people complain about AI slop, I see it as a complaint against the deluge of default style AI images.

We’ve seen this happen in all forms: AI text, AI music, older forms of computer generated content like CGI.

As much as we celebrate imperfection, digital imperfection is a step too far.

6 months, 3 weeks ago @ alexirpan.com
Ten Years Later

Every now and then, someone asks me why I blog, and I don’t really know what to tell them.

That’s another reason I’m not celebrating 10 years with more gusto: I know I’ve been writing less.

Indiana Jones and the Great Circle: I don’t know how they did it, but Indiana Jones and the Great Circle was just fun all the way through.

My one complaint is that the hand-to-hand combat feels like the worst part of the game, so of course they put a bunch of upgrades behind learning parry timings you’ll never use later.

I have not tried Peak, but Another Crab’s Treasure was really good and is worth playing if you’re interested in a Souls-like.

9 months, 3 weeks ago @ alexirpan.com
Lil'Log
last post None
inFERENCe
last post 3 months, 2 weeks ago
The Future of Software

February 25, 2026
The world of software is undergoing a shift not seen since the advent of compilers in the 1970s.

How will humans tell AI agents what software artefacts we would like to create?

This future of software creation, in which our programming languages are abstracted away, raises two very important questions: What will the instruction/specification language look like?

This should be a clear layer of separation between the developer and the pool of AI agents working to maintain software.

3 months, 2 weeks ago @ inference.vc
Deep Learning is Powerful Because It Makes Hard Things Easy - Reflections 10 Years On

Ten years ago this week, I wrote a provocative and bold post that blew up and made it to the top spot on HackerNews.

In hindsight: There is a lot of stuff in deep learning that we don't understand nearly enough.

Sometimes things work for reasons completely unrelated to why we thought they would work.

(Pop some 🍿 in the microwave and read till the end for more)

🎯 "Deep learning is powerful exactly because it makes hard things easy"
Okay, this was a great insight.

🎯 Generative Modeling
In the post I suggested people learn "something harder" instead of - or in addition to - deep learning.

4 months, 1 week ago @ inference.vc
The Spectator
last post None
The Unofficial Google Data Science Blog
last post None
Off the Convex Path
last post None
Jay Alammar
last post None
Piekniewski's blog
last post None
fast.ai NLP
last post None
Sebastian Ruder
last post None
大トロ
last post None
🔬 Science
Papers With Code
last post None
Papers With Code
last post None
Papers With Code
last post None
💼 University and corporation labs
DeepMind
last post 6 hours ago
Fluid, natural voice translation with Gemini 3.5 Live Translate

Today, we’re taking our next step with the release of Gemini 3.5 Live Translate, our latest audio model for live speech-to-speech translation.

The model automatically detects 70+ languages and generates smooth, natural-sounding translated speech that preserves the speakers' intonation, pacing and pitch.

Unlike turn-by-turn systems that wait for the speaker to finish before responding, 3.5 Live Translate generates speech continuously, balancing the trade-off between waiting for context to improve quality and translating immediately to stay in sync with the speaker.

Gemini 3.5 Live Translate is rolling out starting today across Google products:
For developers in public preview via the…

6 hours ago @ blog.google
Introducing Gemma 4 12B: a unified, encoder-free multimodal model

Today, we are introducing Gemma 4 12B, our latest model designed to bring agentic multimodal intelligence directly to laptops.

Here’s an overview of what makes Gemma 4 12B unique:
Novel unified architecture: No multimodal encoders.

Advanced reasoning: Benchmark performance nearing our 26B model, unlocking powerful multi-step reasoning and agentic workflows.

Small enough to run locally on consumer laptops with 16GB of RAM, it unlocks powerful multimodal and agentic experiences right on your machine.

7 hours ago @ blog.google
Powering the future of robotics in Europe

That’s why we’re launching the Google DeepMind Accelerator: Robotics, a three-month program for early-stage robotics startups across Europe.

They’ll have access to our AI stack, technical expertise and Gemini robotics models.

Extend Robotics (United Kingdom): Provides teleoperation software and data pipelines that help train and fine-tune foundation models for real-world robotics applications.

Generative Bionics (Italy): Amplifies human potential by developing humanoid robots based on physical AI, developed in Europe but built to scale globally.

7 hours ago @ blog.google
Measuring the impact of learning with AI in Sierra Leone and beyond

The results from this pre-registered trial suggest that AI can be a powerful pedagogical partner — not by replacing teachers, but by augmenting their reach.

Students using Guided Learning saw a gain of +0.258 standard deviations in their math scores compared to the control group.

In practical terms, this represents roughly 1.2 to 1.7 years of typical learning progress achieved within the eight-week trial.

To further understand the impact of Guided Learning on student learning, we are conducting a series of additional pre-registered RCTs globally.

Additionally, our support of the Global AI for Learning Alliance (GAILA) will accelerate these commitments and others through collective action.

1 day, 8 hours ago @ deepmind.google
We’re launching the Google DeepMind Accelerator program in Asia Pacific to tackle environmental risks

The Asia-Pacific region is a global engine for economic growth, but it's also highly vulnerable to climate change.

While green technologies are gaining momentum, a recent report shows they aren’t scaling fast enough to keep up with the region’s rising environmental risks.

Selected organizations will receive expert mentorship, tailored support and help integrating frontier AI and science AI models from Google AI experts into their projects or products.

If you're working on climate solutions, we want to help you scale your work.

The program kicks off with an in-person bootcamp in Singapore, and you can learn more and register your interest today.

2 weeks, 5 days ago @ blog.google
Fast-tracking genetic leads to reverse cellular aging

Biologists Omar Abudayyeh and Jonathan Gootenberg are using Co-Scientist to help them blast through both.

Their lab runs huge genetic screens that flip thousands of genes on or off, then reads how cells respond to these changes.

Co-Scientist is helping on two fronts.

Second, Co-Scientist speeds up the follow-through.

With Co-Scientist analysing their screening data alongside the literature, that work is slashed to just a few days.

3 weeks, 1 day ago @ deepmind.google
Simulate real-world places with Project Genie and Street View

Street View: ground your worlds in real places

When creating imaginative worlds in Project Genie, you can now also base them on real places.

This capability is powered by Maps Imagery Grounding, the same technology developers use to create stunning AI visuals with Street View.

Street View imagery in Project Genie is available now for places in the U.S. with plans to expand to more places over time.

Project Genie: now available with Google AI Ultra

Starting today, Project Genie, including the new Street View capability, is gradually rolling out to all eligible Google AI Ultra $200 subscribers globally (18+).

Try creating today with Project Genie.

3 weeks, 2 days ago @ blog.google
Introducing Gemini Omni

We’re introducing Gemini Omni, where Gemini’s ability to reason meets the ability to create.

Omni is our new model that can create anything from any input — starting with video.

With Omni, you can combine images, audio, video and text as input and generate high-quality videos grounded in Gemini's real-world knowledge.

Today, we’re rolling out the first model in the Omni family: Gemini Omni Flash, to the Gemini app, Google Flow and YouTube Shorts.

Here’s some of what makes Omni special:

Edit your videos through conversation

Gemini Omni gives you an easier way to edit video: with natural language.

3 weeks, 2 days ago @ blog.google
Introducing Google Antigravity 2.0
3 weeks, 2 days ago @ antigravity.google
Gemini for Science: AI experiments and tools for a new era of discovery

That’s why we are introducing Gemini for Science, a collection of science tools and experiments designed to expand the scale and precision of scientific exploration.

Gemini for Science experimental tools on Google Labs include three primary prototypes designed to handle such tasks.

Literature Insights searches scientific literature and structures results into tables with custom, searchable attributes for side-by-side analysis.

Our research teams using Science Skills have already seen this speedup in practice.

In early testing, our team used Science Skills to complete in minutes a complex analysis that normally takes hours.

3 weeks, 2 days ago @ blog.google
Making it easier to understand how content was created and edited

As generative media becomes more advanced and accessible, it’s helpful to know where content comes from, and whether it’s been altered.

Today, we’re expanding our content transparency and verification tools in Search, Gemini, Chrome, Pixel and Cloud, and deepening our partnership with the broader industry.

Since then, we've integrated SynthID into our generative media models and products, watermarking over 100 billion images and videos and 60,000 years of audio.

Across a growing number of our generative media tools, we use C2PA Content Credentials, the industry standard that shows how media was created and modified, with or without AI.

In an era of generative media, we believe that identify…

3 weeks, 2 days ago @ blog.google
Strengthening Singapore’s AI Future: A New National Partnership

At Google DeepMind, we believe frontier AI can be a powerful force for good.

As part of our National Partnerships for AI initiative, we are launching new programmes in the country, while driving a key pillar of Google’s new National AI partnership with the Singapore Government.

Together, we are strengthening Singapore’s National AI Strategy to responsibly deploy AI at scale for economic growth and public benefit.

Using frontier AI like AlphaFold and Google Earth and other latest AI for Science tools, we are working to accelerate our understanding of outbreaks across Southeast Asia.

3 weeks, 3 days ago @ deepmind.google
Finding the molecular switches behind new infectious diseases

Professor Clare Bryant at the University of Cambridge is using Co-Scientist to hunt for the molecular switches that cause severe diseases, like sepsis, in humans when pathogens leap between species, and find new approaches to prevent this happening.

Testing Co-Scientist, Bryant fed it a summary of one of her grant proposals studying flu in birds and humans outlining her lab’s research questions.

moment: Co-Scientist had prioritised a protein that hadn’t been on her radar, connected to several signalling pathways she was already interested in.

Back at her lab, she added unpublished material, kept confidential within Co-Scientist.

But her lab is now on track to complete it in six months, she …

3 weeks, 3 days ago @ deepmind.google
Opening new paths in aging research

At Calico Life Sciences, head of AI/ML Matt Onsum and principal scientist Katherine Labbé are using Co‑Scientist to help connect scattered findings across the biology of aging and turn them into hypotheses worth testing.

That is no small task, because the biology literature is full of mixed-quality findings, dead ends, and results that do not replicate.

Onsum says Co-Scientist impressed Calico’s experts with its scientific discernment, helping cut through that noise to help them identify new ideas genuinely worth exploring.

The Calico team used Co-Scientist to generate a novel yet plausible hypothesis about how the ISR is regulated by metabolism, which is known to change with age and in var…

3 weeks, 3 days ago @ deepmind.google
Accelerating discovery of liver disease mechanisms

At the University of Edinburgh, bioengineer Filippo Menolascina is using Co-Scientist to comb the literature for overlooked links and generate new hypotheses.

Developing treatments is challenging because MASH involves intertwined biological processes, including liver inflammation and metabolism, meaning single‑target drugs fall short.

Faced with that combinatorial explosion, Menolascina used Co‑Scientist to narrow the search.

In his hands, Co‑Scientist synthesised evidence across liver biology and pharmacology, highlighted mechanisms worth focusing on, and flagged candidate combination therapies that his team could test.

In one emblematic case, Co‑Scientist tackled a live, practical questio…

3 weeks, 3 days ago @ deepmind.google
Google
last post 3 hours ago
Claude Fable 5: Available on Google Cloud

Claude Fable 5, Anthropic’s latest frontier model, is now generally available on Google Cloud.

This launch is the latest proof point of our ongoing commitment to bring the industry's latest models straight to our Agent Platform.

Claude Fable 5 brings the best of Anthropic model capabilities to all customers, with strong safeguards designed to make it safe for general use.

Designed for complex, multi-step reasoning, Claude Fable 5 is good for demanding tasks like advanced software development, long-horizon agents, and deep multimodal document analysis.

Build with Claude Fable 5 and other models from Anthropic — including Claude Opus 4.8 and Claude Sonnet 4.6 — today on Agent Platform.

3 hours ago @ cloud.google.com
Detecting and containing AI-powered threats with Google Security Operations agents

Autonomous investigation, containment, and response

If a threat is detected, you need to immediately and autonomously assess and respond to protect your environment.

The Triage and Investigation agent in Google Security Operations, generally available, helps analysts drastically reduce time to respond by autonomously investigating alerts, gathering evidence for analysis, and providing verdicts with comprehensive explanations.

It can help security analysts automate decision-making, alert closure, and remediation flows, allowing them to spend more time on high-priority threats instead of false positives.

The agent has already investigated over 5 million alerts, reducing a typical 30-…

5 hours ago @ cloud.google.com
How to unlock true ROI in software development – a deep dive into the latest DORA research

How do you prove the business value of generative AI to your teams?

Technology and finance leaders need to show the clear business value of AI projects to secure ongoing funding.

To help you evaluate the costs and business benefits of AI, we recently shared the DORA: ROI of AI-assisted software development report.

Insight #1: Navigating the J-curve of AI value realization

It is important to be realistic about how quickly you will see a return on your AI investments.

While AI can act as a powerful amplifier for software engineering, the path to financial value is rarely a straight line.

5 hours ago @ cloud.google.com
Report: GKE Inference Gateway delivers up to 92% faster AI responses

In fact, according to an independent benchmark report, GKE Inference Gateway outperforms the next leading managed Kubernetes service with 15.7% higher throughput, 92.8% shorter wait times, and 62.6% lower inter-token latency.

That performance tracks with Snap’s experience using GKE Inference Gateway.

In this blog, we take a closer look at GKE Inference Gateway’s prefix caching, complete with examples.

The secret to low-latency AI: Prefix caching

Prefix caching optimizes LLM performance by storing the KV cache (activation states) of long, repetitive prompt prefixes.

GKE Inference Gateway reads incoming request prefixes and matches them to the specific pods that already hold that data in memor…
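
The routing idea described above can be sketched in a few lines. This is a toy model under stated assumptions, not GKE Inference Gateway's implementation or API: `PrefixRouter`, `block_hashes`, the block size, and the tie-breaking by load are all hypothetical. Each request's tokens are hashed in chained blocks, and the request goes to the pod that already holds the longest run of matching prefix blocks.

```python
import hashlib

BLOCK_SIZE = 4  # tokens per cached block; an assumed value for illustration


def block_hashes(tokens, block_size=BLOCK_SIZE):
    """Chained hashes: the i-th hash covers every token up to the end of block i."""
    hashes = []
    running = hashlib.sha256()
    full = len(tokens) - len(tokens) % block_size
    for start in range(0, full, block_size):
        block = tokens[start:start + block_size]
        running.update("\x1f".join(block).encode("utf-8") + b"\x1e")
        hashes.append(running.copy().hexdigest())
    return hashes


class PrefixRouter:
    """Hypothetical router that prefers the pod with the longest cached prefix."""

    def __init__(self, pods):
        self.pods = list(pods)
        self.cached = {pod: set() for pod in self.pods}
        self.load = {pod: 0 for pod in self.pods}

    def _matched_blocks(self, pod, hashes):
        count = 0
        for h in hashes:
            if h not in self.cached[pod]:
                break
            count += 1
        return count

    def route(self, tokens):
        hashes = block_hashes(tokens)
        # Longest cached prefix wins; ties go to the least-loaded pod.
        best = max(
            self.pods,
            key=lambda p: (self._matched_blocks(p, hashes), -self.load[p]),
        )
        hit_blocks = self._matched_blocks(best, hashes)
        self.cached[best].update(hashes)  # the pod now holds these blocks
        self.load[best] += 1
        return best, hit_blocks


router = PrefixRouter(["pod-a", "pod-b"])
system = "you are a helpful support agent for acme".split()
first = router.route(system + ["order", "status", "please", "now"])
second = router.route(system + ["refund", "policy", "question", "here"])
third = router.route(["translate", "this", "to", "french"])
```

Requests that share the system prompt land on the same pod and reuse its cached blocks, while an unrelated prompt is sent to the less-loaded pod.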

5 hours ago @ cloud.google.com
Modernizing Healthcare: How Alcidion achieved greater stability and performance with AlloyDB

The team had to manually balance database loads between elastic pools to maintain performance while trying to optimize costs.

Performance latency: Complex JSON data processing, critical for modern health informatics, was taking up to 30 minutes for certain jobs.

Stability concerns: The team sought a more stable Kubernetes environment and a persistent backend that could scale without constant administrative intervention.

Alcidion achieved this by spinning up a new Google Cloud instance synchronized to the active one, with both accessible via unique fully qualified domain names.

By moving to AlloyDB, Alcidion has improved its stability and performance and built a strong foundation to keep del…

1 day, 5 hours ago @ cloud.google.com
What's new for Managed Service for Apache Spark clusters

We recently announced that the Dataproc service is now Managed Service for Apache Spark, reflecting our deep integration with the Agentic Data Cloud.

To support the diverse architectural needs of today’s modern data teams, we offer the service in two distinct deployment modes: serverless and managed clusters.

This blog post focuses specifically on what we announced at Google Cloud Next ’26 for the Managed Spark clusters deployment mode: enhanced flexibility to fine-tune performance and cost through a native execution engine, smarter scaling policies, and Gemini-powered extensions.

Faster, with the Lightning Engine native execution engine

Arguably the biggest update for Managed Spark …

5 days, 5 hours ago @ cloud.google.com
How Trustpilot built a real-time architecture for data enrichment using Gemma

Trustpilot has been doing exactly that with custom machine learning since long before large language models (LLMs) were cool.

Instead, by fine-tuning open-weight models like Gemma, Trustpilot takes full ownership of their AI strategy.

Here’s how:

Total model independence: By owning its models, Trustpilot ensures it controls the retraining lifecycle, completely freeing it from a third-party vendor's update schedule or sudden API changes.

Expanding MLOps capabilities: Building these models in-house enables Trustpilot to bake in the "secret sauce" of its review intelligence while building competencies on open-weight models.

Rather than deploying one massive model, Trustpilot built a suite of hi…

1 week, 1 day ago @ cloud.google.com
The fully-managed Remote MCP Server for AlloyDB is now Generally Available

To bridge this gap, we are excited to announce that the Remote Model Context Protocol (MCP) Server for AlloyDB is now generally available.

The Remote MCP Server for AlloyDB runs on fully-managed Google Cloud infrastructure and exposes an HTTP endpoint that connects your AI applications to your data.

This solves key challenges for teams building agents on PostgreSQL:

Centralized discovery: Find, secure, and manage your database's MCP server using Agent Registry.

Let's see it in action: A quick demo

Getting started with the AlloyDB Remote MCP server is a straightforward process.

Enable data access API: Permit the Data Access API on your AlloyDB instance.

1 week, 1 day ago @ cloud.google.com
Developer's guide to Gemini Enterprise and A2UI integration

What A2UI is

A2UI is an open protocol, introduced by Google and co-developed with the Flutter team and product teams behind Gemini Enterprise.

Gemini Enterprise — GE is the shell, the renderer, and the transport client.

Two patterns are common today:

Inline pattern: the agent sends a component tree with the data baked into each component (the pattern Gemini Enterprise renders today).

How A2UI works inside Gemini Enterprise

Gemini Enterprise ships with a built-in A2UI renderer.

High-level architecture example

The reference implementation is an ADK backend on Cloud Run designed to plug seamlessly into Gemini Enterprise.

1 week, 4 days ago @ cloud.google.com
Cloud CISO Perspectives: How to build an AI-ready security program for the public sector

Your tactical execution plan: Months zero to six

Building an AI-ready security program is a journey.

Vendor and spend optimization (Immediate): Upload vendor capability matrices and contracts to an isolated AI agent (like NotebookLM).

AI-driven security training (within six months): As manual processes are increasingly automated, use that reclaimed time to run capture the flag (CTF) exercises and community contests for your security team.

Proactive threat hunting: Use AI as a hunting advisor.

Train an internal agent on your organization’s historical IR tickets and SOPs.

1 week, 4 days ago @ cloud.google.com
Cool stuff Google Cloud customers built, May edition: Agentic algorithms for supply chains; virtual try-on APIs; robotic camera operators & more

Be sure to check back next month to see how more industry leaders and exciting startups are putting Google Cloud technologies to use.

Google Cloud and IBM teams also assisted URBN in a rigorous, iterative switchover testing strategy.

What they did: To understand how local decisions ripple across their entire global network, BASF turned to AlphaEvolve on Google Cloud to build a digital twin of their supply chain.

What they did: Working with Google Cloud, they built a virtual try-on experience that lets shoppers see high-end fashion on their own bodies using a simple selfie.

What they did: Movix developed custom models for deep learning, computer vision, and 3D mesh analysis over a five-month…

1 week, 4 days ago @ cloud.google.com
AI in SRE: Where and how Google is deploying agentic AI to improve operations

The above diagram shows each of the phases where SRE is involved, and that could be improved with SRE AI.

Google SRE has developed AI agents to continuously monitor and improve playbooks and production documentation based on their usage during incidents.

With AI, Google SRE is augmenting its more traditional approaches with anomaly detection: alerts fire when behavior deviates from its regular pattern rather than when it crosses a statically predefined threshold.
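
To make the contrast with static thresholds concrete, here is a minimal sketch of one common anomaly-detection approach, a rolling z-score. It is an assumed stand-in for illustration, not Google SRE's method; the class name, window size, and thresholds are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class RollingAnomalyDetector:
    """Flags values that deviate strongly from the recent baseline."""

    def __init__(self, window=30, z_threshold=3.0, min_samples=10):
        self.history = deque(maxlen=window)
        self.z_threshold = z_threshold
        self.min_samples = min_samples

    def observe(self, value):
        is_anomaly = False
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma > 0:
                is_anomaly = abs(value - mu) / sigma > self.z_threshold
            else:
                is_anomaly = value != mu
        if not is_anomaly:
            # Only normal points update the baseline, so a spike
            # does not immediately become the new "regular behavior".
            self.history.append(value)
        return is_anomaly


detector = RollingAnomalyDetector()
baseline_flags = [detector.observe(100 + 2 * (i % 2)) for i in range(20)]
spike_flag = detector.observe(150)       # far outside the learned baseline
static_flag = 150 > 200                  # a static threshold of 200 stays silent
```

The spike to 150 is flagged because it is unusual for this series, even though it never crosses the fixed limit.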

Incident management

Within Google SRE, incident management, or IMAG, is a well-established process with clear roles and responsibilities, as well as tooling.

SRE AI agents need to provide a high level of reliability SLOs and have well-defined backu…

1 week, 5 days ago @ cloud.google.com
Evolving Dataflow to process massive datasets for machine learning

And many of those innovations are available as part of Dataflow, our fully managed batch and streaming platform built on the same core technology Google uses to power its most demanding internal workloads.

Addressing massive scalability

The scale of data processing at Google has exploded over the last 20 years and continues to drive innovation.

Unified batch and streaming enables users to use the same code for both historical batch and live streaming data.

This simplifies the architecture, which traditionally would have required separate pipelines for batch and streaming data processing.
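
The "same code for both" idea can be illustrated without any Dataflow-specific API. The sketch below is plain Python with hypothetical names (`running_totals`, `final_totals`, `live_feed`): one transform is written against a generic iterable and then applied to a finished list (batch) and to a generator that yields events one at a time (stream).

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

Event = Tuple[str, int]


def running_totals(events: Iterable[Event]) -> Iterator[Event]:
    """Emit the updated total for a key after each event arrives."""
    totals: Counter = Counter()
    for key, value in events:
        totals[key] += value
        yield key, totals[key]


def final_totals(events: Iterable[Event]) -> Dict[str, int]:
    """Drain the transform and keep the last total seen per key."""
    result: Dict[str, int] = {}
    for key, total in running_totals(events):
        result[key] = total
    return result


historical = [("clicks", 3), ("views", 10), ("clicks", 2)]


def live_feed() -> Iterator[Event]:
    # Stand-in for an unbounded source; here it simply replays the same events.
    for event in historical:
        yield event


batch_result = final_totals(historical)
stream_result = final_totals(live_feed())
```

Because the transform never assumes the input is finite or fully in memory, the batch and streaming paths share one code path and produce the same totals.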

This is one of the reasons many Google Cloud customers rely on Dataflow for critical ML applications: Sp…

1 week, 5 days ago @ cloud.google.com
Nano Banana 2 and Nano Banana Pro are generally available, and already powering creative workflows

Organizations are unlocking entirely new ways to use image generation and editing across their industries.

What’s new: To help customers continue their creative journey securely, we are announcing that Nano Banana 2 (Gemini 3.1 Flash Image) and Nano Banana Pro (Gemini 3 Pro Image) are generally available (GA) today via Gemini Enterprise Agent Platform.

Alongside this milestone, we are introducing a powerful new capability in preview that significantly expands how models process multimodal inputs: Nano Banana 2 now supports video files as an input prompt.

How our customers are innovating with Nano Banana models

We are continually inspired by how our partners and customers are pushing the boundarie…

1 week, 5 days ago @ cloud.google.com
Announcing the newest cohort of the Google for Startups Accelerator: Middle East, North Africa & Turkey

The newest cohort of 15 companies in the Google for Startups Accelerator: MENA-T program starts on June 1.

Jusoor Labs uses AI to analyze science experiment interactions and improve learning outcomes.

Repzo uses AI to turn complex field data into natural language reports for field teams.

A curriculum designed for impact

Starting June 1st, founders will participate in a three-month program specifically tailored to help startups navigate their unique challenges.

This holistic approach is designed to help startups maintain momentum and drive the region’s sustained digital growth and long-term resilience.

1 week, 5 days ago @ cloud.google.com
OpenAI
last post None
Microsoft
last post 1 week, 5 days ago
Data Formulator 0.7: AI-powered data analytics for enterprise data

At a glance: Data Formulator 0.7 is an open-source AI-powered system for enterprise data analytics that combines data connectivity, agent-guided exploration, and visualization refinement in a shared workspace.

Enterprise teams increasingly rely on AI systems for analytics, but enterprise data workflows are often fragmented across storage systems and tools.

Connecting enterprise data with Data Connectors

Data Formulator helps teams bring enterprise data into an AI-ready workspace without needing to rebuild the same connections for every source of data.

Data Connectors provide persistent connections between enterprise data sources and Data Formulator, allowing analy…

1 week, 5 days ago @ microsoft.com
Extending Human Intelligence Through AI

At a glance: Modern AI systems are powerful not because they replicate human intelligence, but because they presuppose it, by extending structures already present in human cognition and language.

Understanding AI as an extension of human intelligence—not a replacement for it—offers a more grounded path for building trustworthy AI systems.

Rather than asking whether AI systems are becoming intelligent in the human sense, these approaches ask a more basic question: What if AI systems work because they rely on structures that are rooted in human cognition?

In our recent paper, The Origins of Artificial Intelligence in Natural Intelligence, we argue that modern AI systems are best understood nei…

1 week, 6 days ago @ microsoft.com
MagenticLite, MagenticBrain, Fara1.5: An agentic experience optimized for small models

Built as the next generation of Magentic-UI, it combines a redesigned app with a harness optimized for small models.

MagenticBrain and Fara1.5 are small models designed for orchestration and computer-use tasks, respectively.

Together, these releases explore how far agentic performance can be pushed with smaller models, codesigned tools, and an optimized execution harness.

Today, Microsoft Research AI Frontiers releases MagenticLite, an experimental agentic application designed for small models.

The result is an agent that runs efficiently, keeps data on the user’s machine, and supports a broad range of agentic tasks.

2 weeks, 5 days ago @ microsoft.com
Vega: Zero-knowledge proofs for digital identity in the age of AI

Vega puts these building blocks together into a single proof system.

The hashing problem, and how folding solves it

A credential proof must do two expensive things: hash the credential bytes with SHA-256 and verify the issuer’s digital signature.

Making it zero-knowledge, cheaply

A proof system needs to be zero-knowledge: the verifier should learn nothing beyond the claim being proved.

Device binding

A zero-knowledge credential proof is only useful if it is tied to the person holding the credential.

The proof system powering Vega is already available as the open-source spartan2 project on GitHub.

2 weeks, 5 days ago @ microsoft.com
Further Notes on Our Recent Research on AI Delegation and Long-Horizon Reliability

Our recent paper, “LLMs Corrupt Your Documents When You Delegate”, has generated discussion about the reliability of AI systems in delegated workflows.

Using a controlled evaluation methodology, we examine how well information is preserved across these extended workflows.

We use chained transformation-and-inversion tasks that evaluate whether semantic content is preserved accurately across extended delegated workflows.
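
A toy version of such a round-trip check might look like the following. It is only a sketch of the general idea: the paper's actual tasks, models, and metrics are not given here, so the `round_trip_fidelity` helper, the character-level similarity measure, and both transformation pairs are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def round_trip_fidelity(document, transform, invert, rounds):
    """Apply transform then invert repeatedly; score similarity to the original."""
    scores = []
    current = document
    for _ in range(rounds):
        current = invert(transform(current))
        scores.append(SequenceMatcher(None, document, current).ratio())
    return scores


doc = "total due: 1200 eur by friday"

# A lossless pair: reversing twice restores the text exactly.
lossless = round_trip_fidelity(doc, lambda s: s[::-1], lambda s: s[::-1], rounds=3)

# A lossy pair standing in for an imperfect delegate: each inversion
# silently drops the final character.
lossy = round_trip_fidelity(doc, str.upper, lambda s: s.lower()[:-1], rounds=3)
```

With the lossless pair the score stays at 1.0 in every round, while the lossy pair drifts further from the original each time the chain is extended.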

At the same time, the findings should not be interpreted as evidence that AI systems lack practical value in real-world work today.

3 weeks, 4 days ago @ microsoft.com
mimalloc: A new, high-performance, scalable memory allocator for the modern era

mimalloc is an open-source, modern, scalable memory allocator that is a drop-in replacement for malloc and free.

The mimalloc memory allocator was initially designed in 2020 as a fast allocator for the state-of-the-art Lean and Koka programming languages developed at RiSE, both of which use novel compiler-guided reference counting (see Perceus).

ja .LBB0_generic
leaq 7(%rsi), %rax ; round to sizeof(void*)
andq $-8, %rax
movq 232(%rdi,%rax), %rcx ; rcx = heap->small_pages[index]
movq 8(%rcx), %rax ; block = rax = page->free
testq %rax, %rax ; block == NULL?
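
The fast path in the listing above can be modeled in a few lines of Python: round the requested size up to a multiple of 8, use it to index a per-size page table, and pop the head of that page's free list, falling back to a generic path otherwise. This is a toy model only. mimalloc itself is written in C, and the `Heap` and `Page` classes, the small-size limit, and the reading of the initial `ja` as a size check are assumptions.

```python
WORD = 8  # sizeof(void*) on a 64-bit target


class Page:
    def __init__(self, block_size, capacity):
        self.block_size = block_size
        # Free list as a stack of block ids; the next block to hand out is last.
        self.free = list(range(capacity - 1, -1, -1))


class Heap:
    def __init__(self, max_small=64, blocks_per_page=4):
        self.max_small = max_small
        # One page per 8-byte size class, indexed by rounded_size // WORD.
        self.small_pages = [
            Page(i * WORD, blocks_per_page) for i in range(max_small // WORD + 1)
        ]
        self.generic_calls = 0

    def _malloc_generic(self, size):
        # Stand-in for the slow path (fresh pages, large objects, and so on).
        self.generic_calls += 1
        return None

    def malloc(self, size):
        if size > self.max_small:                   # assumed meaning of: ja .LBB0_generic
            return self._malloc_generic(size)
        rounded = (size + WORD - 1) & ~(WORD - 1)   # leaq 7(%rsi) / andq $-8
        page = self.small_pages[rounded // WORD]    # heap->small_pages[index]
        if not page.free:                           # block == NULL?
            return self._malloc_generic(size)
        return rounded, page.free.pop()             # pop the head of page->free


heap = Heap(blocks_per_page=2)
a = heap.malloc(13)     # rounds up to the 16-byte size class
b = heap.malloc(16)
c = heap.malloc(9)      # 16-byte page is now empty, so the slow path runs
d = heap.malloc(1000)   # too large for the small-object table
```

The point of the design is that the common case is a handful of instructions with a single conditional on the free-list head; everything unusual is pushed into the generic routine.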

Thus, mimalloc has three free lists per (64 KiB) mimalloc page, and effectively that …

3 weeks, 6 days ago @ microsoft.com
GridSFM: A new, small foundation model for the electric grid

Microsoft releases a lightweight foundation model that can predict AC optimal power flow in milliseconds, boosting efficiency and unlocking cost savings in grid analysis.

It provides a foundation for the community to build advanced power grid simulators and planning tools without recreating data or models from scratch.

Microsoft introduces GridSFM, a small foundation model for solving AC optimal power flow (AC-OPF) problems in transmission power grids.

Power grids face increasing strain from surging demand, the need to integrate renewable energy sources, transportation electrification, and extreme weather events.

This release adds the first open AC-OPF model that supports multiple grid topo…

3 weeks, 6 days ago @ microsoft.com
Advancing AI for materials with MatterSim: experimental synthesis, faster simulation, and multi-task models

Now we have experimentally synthesized it and measured its thermal conductivity (152 W/m/K) to be close to the thermal conductivity of silicon.

Faster simulation: We have accelerated MatterSim-v1 model inference by 3-5x and integrated it with the LAMMPS software package, enabling large-scale simulations across multiple GPUs.

These include experimental validation of MatterSim predictions for thermal conductors, performance improvements for faster simulation, and the introduction of a new multi-task foundation model for materials characterization.

Le…

4 weeks ago @ microsoft.com
SocialReasoning-Bench: Measuring whether AI agents act in users’ best interests

In our simulated multi-agent marketplace, agents accepted the first proposal they received up to 93% of the time without exploring alternatives.

Introducing SocialReasoning-Bench

Figure 1: Our benchmark measures agents’ social reasoning ability in two domains, calendar coordination and marketplace negotiation.

Finding 3: Outcome optimality shows how much value agents leave on the table.

In calendar coordination, agents perform better but still settle below the midpoint on average.

Finally, Outcome Optimality works well in settings with clear boundaries, where a “good” outcome can be defined and measured.

4 weeks, 1 day ago @ microsoft.com
Building realistic electric transmission grid dataset at scale: a pipeline from open dataset

The ability to study transmission-level power grid behavior is essential for modern power systems research.

In most of the world, including the United States, realistic transmission-level grid data is classified as critical infrastructure information and subject to strict access controls.

These restrictions exist for good reasons, but the resulting lack of realistic grid models is increasingly exacerbating the challenges power systems face.

In this work, we introduce an open-data-derived pipeline for constructing large-scale, transmission-level power grid models that realistically approximate existing networks without relying on proprietary or restricted datasets.

Using only publicly access…

1 month ago @ microsoft.com
Microsoft at NSDI 2026: Advances in large-scale networked systems

The USENIX Symposium on Networked Systems Design and Implementation 2026 (NSDI ’26) is a leading forum where researchers and practitioners share new research, insights, and advances in the design and operation of these systems.

Microsoft is proud to support NSDI ’26 as a returning sponsor, reflecting our ongoing commitment to advancing systems and networking research and engaging with the broader community.

Together, they highlight advances in building and operating large-scale networked systems.

Wednesday, May 6, 9:00–10:20 AM
Yuxuan Yan, Zhejiang …

1 month ago @ microsoft.com
Red-teaming a network of agents: Understanding what breaks when AI agents interact at scale

Actions that seem harmless can cascade, causing a chain reaction across an agent network.

Invisibility: Information can pass through chains of unaware agents, making the source of an attack hard to trace from any single agent’s perspective.

Each independently contacted a victim agent (Bob) about the same fabricated audit, using varied language and staggered timing to appear unrelated.

Experimental setup: A principal entrusts their agent, Bob, with sensitive personal data: disability accommodation, medical schedule, preferred pharmacy, emergency contact.

Agents relayed summaries of other agents’ private messages to the attacker (one forwarded another agent’s message within seconds), and agent…

1 month, 1 week ago @ microsoft.com
AutoAdapt: Automated domain adaptation for large language models

At a glance

Problem: Adapting large language models to specialized, high-stakes domains is slow, expensive, and hard to reproduce.

Why it matters: The result is faster, automated, more reliable domain adaptation that turns weeks of manual iteration into repeatable pipelines.

Deploying large language models (LLMs) in real-world, high-stakes settings is harder than it should be.

In our paper, “AutoAdapt: An Automated Domain Adaptation Framework for Large Language Models,” we describe an end-to-end, constraint-aware framework for domain adaptation.

1 month, 2 weeks ago @ microsoft.com
Can we AI our way to a more sustainable world?

Because I do think there’s a role for AI, a huge role for AI.

BURGER: Right, right.

So I think that’s also something quite important here that, you know, AI can help facilitate.

And I think that’s not just applying AI to solve solutions through optimization but also thinking about this in an integrated way.

1 month, 2 weeks ago @ microsoft.com
New Future of Work: AI is driving rapid change, uneven benefits

Publication: New Future of Work Report 2025

The New Future of Work report brings together research from inside and outside of Microsoft to understand what is happening as AI enters workplaces.

But usage and confidence vary widely across sectors, and men report using AI at work more often than women.

AI systems are increasingly playing a role in decision-making, creativity, and communication, with AI systems being positioned as a “collaborator.” This raises questions about how to support “collaboration” between people and AI, what we can learn from how people interact with each other, and where the capabilities of AI systems raise different opportunities and create different requirements.

Usin…

2 months ago @ microsoft.com
MIT AI
latest post 1 hour ago
The consequences of relying on AI for accurate news

It’s no secret that the last few years have seen a massive explosion in the use of artificial intelligence for general information-gathering.

(Roughly a quarter of all participants actually reported feeling that they were getting better at detection, even as their performance declined.)

The research team said that these AI models are particularly vulnerable to mistakes in the midst of emotionally charged breaking news, as exhibited by the widespread misinformation that accompanied President Trump’s recent assassination attempt and major events during the Iranian war.

(The authors also point out that the original human-created news content that’s used to train the AI models is increasingly u…

1 hour ago @ news.mit.edu
The crucial human component in computing and AI

“There is so much amazing research being done at MIT on how AI and computing can be forces for good that benefit humanity.

Iason Gabriel, a philosopher and research scientist at Google DeepMind, used the example of a judge to illustrate his point.

Offloading versus uplifting

As students across all levels of education begin to use AI, questions arise on whether there’s a way to ethically incorporate AI tools while maintaining academic accuracy and rigor.

When the chess engine hands a turn over to its human partner, the human struggles to pick up on the predictive move pattern that the engine has been following up until this point.

“The danger of human-algorithm teams is that when the human ta…

4 days, 1 hour ago @ news.mit.edu
PATH to boost AI training and career opportunities for industry-aligned jobs

“Through PATH, MIT RAISE is using our convening power to bring community colleges, industry, research universities, and government together to build human-centered AI pathways that lead to shared prosperity.

Beyond individual courses, PATH is building clearer pathways for students to turn AI learning into real job opportunities.

The goal is to help students build skills that are relevant, recognized, and directly connected to growing career paths.

The initiative is supported by a grant to MIT from Google.org, which is helping MIT and its collaborators build a multi-state network for AI workforce development.

“MIT’s PATH initiative offers a blueprint for expanding opportunity in the age of A…

5 days, 1 hour ago @ news.mit.edu
NSF renews support for MIT-led AI and physics institute, expanding a new model for discovery

Its work has shown that machine learning can accelerate discovery in physics, while insights from physics can make AI systems more principled and interpretable.

“From the beginning, IAIFI has been built around a two-way street: AI enabling better physics, and physics enabling better AI,” says Jesse Thaler, IAIFI’s director and a professor of physics at MIT.

“We have seen this virtuous cycle play out across multiple areas of physics and AI over the past five years.

In particle physics, IAIFI researchers have developed AI techniques to handle the immense data rates from the Large Hadron Collider in real-time, helping turn a firehose of collision data into actionable physics.

In nuclear physic…

5 days, 5 hours ago @ news.mit.edu
Teaching AI agents to ask better questions by playing “Battleship”

These semi-autonomous programs can “think” and execute well-defined tasks in areas like customer service and software development, typically using language models (LMs).

In their “Collaborative Battleship” game, one participant is a “captain” who inquires about where hidden ships are, while their teammate plays the “spotter” by responding to those questions in real-time.

The result: AI models that can beat regular players at “Battleship,” regardless of scale.

We find that when we give agents access to a ‘world model,’ they ask better questions and make discoveries more efficiently.”

A sea change for LMs

The team’s first focus was getting LMs to ask better questions.

Grand also plans to have h…

6 days ago @ news.mit.edu
Tod Machover receives George Peabody Medal for contributions to music and technology

Tod Machover, the Muriel R. Cooper Professor of Music and Media, faculty director of the MIT Media Lab, and director of the Opera of the Future research group, will receive the George Peabody Medal for Outstanding Contributions to Music and Dance in America — the highest honor bestowed by the Peabody Institute of the Johns Hopkins University.

As a composer and music tech pioneer, Machover has helped expand music’s possibilities for artists and audiences alike through his work in participatory opera, artificial intelligence, and creative technologies.

He joins a roster of previous George Peabody Medal recipients that includes Stevie Wonder, Misty Copeland, Herbie Hancock, Renée Fleming, Yo-Y…

6 days ago @ news.mit.edu
MIT researchers teach AI models to interpret charts

To fill this performance gap, researchers from MIT and the MIT-IBM Computing Research Lab developed a multifaceted resource for AI users that is specifically designed to teach vision-language models (VLMs) how to effectively interpret charts.

Many of these smaller models significantly outperformed commercial models that are orders of magnitude larger on tasks like data extraction and chart summarization.

By enabling open-source models to outperform their commercial counterparts, ChartNet could allow small firms with limited budgets to more readily utilize AI.

But less work has focused on interpreting complex multimodal data contained within charts, Kondic says.

The dataset improved the accuracy of …

6 days, 17 hours ago @ news.mit.edu
Media Advisory: MIT to establish regional quantum hub

MIT and the Commonwealth of Massachusetts announced plans to establish the Quantum Systems Laboratory (QSL) at MIT, which will be open to researchers across the region.

Quantum technologies promise transformative changes in fields from computing, security, and navigation to health sciences, defense technologies, and space exploration.

MIT and the Commonwealth of Massachusetts announced plans to establish the Quantum Systems Laboratory (QSL) at MIT, a new shared-use facility that will serve as a quantum toolbox for the region, aimed at accelerating quantum research, innovation, and growth in this critical field.

Through the new Quantum Systems Laboratory, we will help position Massachusetts …

1 week, 5 days ago @ news.mit.edu
Technology usually creates jobs for young, skilled workers. Will AI do the same?

At any given time, technology does two things to employment: It replaces traditional jobs, and it creates new lines of work.

“We had never before seen exactly who is doing new work,” Autor says.

“Eroding tasks is not the same thing as eroding jobs, since many jobs involve a lot of tasks.

New work, Autor observes, is always tied to new forms of expertise.

Back to AI for a minute

Studying who gets new jobs led the scholars to striking conclusions about how new work is created.

2 weeks, 5 days ago @ news.mit.edu
Building AI models that understand chemical principles

Among all of the possible chemical compounds, it’s estimated that between $10^{20}$ and $10^{60}$ may hold potential as small-molecule drugs.

His research straddles the line between chemical engineering and computer science, as he develops and deploys computational models to analyze vast numbers of possible chemical compounds, design new compounds, and predict reaction pathways that could generate those compounds.

After graduating from Caltech, he decided to keep going in chemical engineering and came to MIT in 2014 to start a PhD.

His work focused on combining machine learning and cheminformatics — the application of computation methods to analyze chemical data — to plan reaction pathways that could…

2 weeks, 6 days ago @ news.mit.edu
Justin Solomon appointed associate dean of engineering education

Justin Solomon, associate professor in the MIT Department of Electrical Engineering and Computer Science (EECS), has been appointed associate dean of engineering education in the MIT School of Engineering, effective July 1.

In this new role, Solomon will focus on advancing innovation in engineering education across the school.

He is the author of “Numerical Algorithms,” a textbook that presents a modern approach to numerical analysis for computer science students.

Solomon is a principal investigator at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), where he leads the Geometric Data Processing Group.

Solomon joined the MIT faculty in 2016.

3 weeks ago @ news.mit.edu
Two from MIT named 2026 Knight-Hennessy Scholars

MIT master’s student Sunshine Jiang ’25 and Rupert Li ’24 are recipients of this year’s Knight-Hennessy Scholarship.

Rupert Li ’24

Rupert Li, from Portland, Oregon, is currently pursuing a PhD in mathematics at Stanford School of Humanities and Sciences.

He graduated from MIT in 2024 with a bachelor’s degree, double majoring in mathematics and computer science, economics, and data science.

Along with his bachelor’s degree, he also received a master’s degree in data science.

In addition to the Knight-Hennessy Scholarship and the Marshall Scholarship, he has been awarded the Hertz Fellowship, P.D.

3 weeks, 5 days ago @ news.mit.edu
Universal AI is “a pathway to AI fluency that’s accessible and approachable to anyone, anywhere”

“Universal AI was built to thread that needle.

The result is a pathway to AI fluency that’s approachable to anyone, anywhere.”

The need for accessible, practical AI education has never been greater.

Universal AI also includes industry-specific courses that dive into the intersection of AI and health care, sustainability, entrepreneurship, transportation, and more.

Six industry-specific courses are available today, including Holistic AI in Medicine, AI and Entrepreneurship, and AI and Sustainability: Energy.

“MIT’s long history of making knowledge available through MIT Open Learning means it’s only natural we’d feel compelled to bring Universal AI to the world,” adds Kornbluth.

4 weeks ago @ news.mit.edu
Q&A: Expanding MIT’s global reach through Universal Learning

MIT's Universal Learning is a new initiative from MIT Open Learning designed to prepare learners everywhere to tackle complex global challenges through boundary-crossing thinking.

Universal AI, the first offering from Universal Learning, launched to the public today.

Dimitris Bertsimas, vice provost for open learning, and Megan Mitchell, senior director of Universal Learning, share how Universal Learning supports MIT’s educational mission, and what makes it distinctive.

My colleagues contributing to current and forthcoming Universal Learning offerings share this same passion.

That’s why with Universal Learning we are prioritizing asynchronous delivery, mobile delivery, translations, and are…

4 weeks ago @ news.mit.edu
Study: Firms often use automation to control certain workers’ wages

Rather than implement automation in pursuit of maximal productivity, firms have often used automation to replace employees who specifically receive a “wage premium,” earning higher salaries than other comparable workers.

For one thing, automation has affected the growth in U.S. income inequality even more than many observers realize.

This inefficient targeting of certain employees has offset 60-90 percent of the productivity gains from automation during the time period.

Inequality implications

Dating back to the 2010s, Acemoglu and Restrepo have combined to conduct many studies about automation and its effects on employment, wages, productivity, and firm growth.

Certain types of automation c…

1 month ago @ news.mit.edu
Berkeley AI
latest post 1 month ago
Adaptive Parallel Reasoning: The Next Paradigm in Efficient Inference Scaling

Overview of adaptive parallel reasoning.

We provide a detailed analysis of recent progress in the field of parallel reasoning, especially Adaptive Parallel Reasoning.

Figure 4: Special Tokens Variants across Adaptive Parallel Reasoning Papers

Inference Systems for Adaptive Parallelism

How do we actually execute parallel branches?

Figure 14: Difference in Model Choice Across Adaptive Parallel Reasoning Papers

Each paper also offers a slightly different interpretation about how adaptive parallel reasoning contributes to the research field.

(Yang et al., 2025; Lian et al., 2025) aim to deliver sequential-AR-model-level a…

1 month ago @ bair.berkeley.edu
Gradient-based Planning for World Models at Longer Horizons

Large, learned world models are becoming increasingly capable.

Why is adversarial robustness an issue for world model planning?

We thus exploit the differentiability of learned world models $F_{\theta}$, while not falling victim to the inherent sensitivity of the state Jacobians $D_s F_{\theta}$.

It’s a funny sweet spot where the background literature (planning and control overall) is incredibly mature and well-developed, but the current setting (pure planning optimization over modern, large-scale world models) is still heavily underexplored.

But, once we figure out all the right ideas, world model planners will likely become as commonplace as RL.

1 month, 2 weeks ago @ bair.berkeley.edu
Identifying Interactions at Scale for LLMs

Understanding the behavior of complex machine learning systems, particularly Large Language Models (LLMs), is a critical challenge in modern artificial intelligence.

Therefore, grounded or reality-checked interpretability methods must also be able to capture these influential interactions.

In this blog post, we describe the fundamental ideas behind SPEX and ProxySPEX, algorithms capable of identifying these critical interactions at scale.

SPEX and ProxySPEX Framework

To discover influential interactions with a tractable number of ablations, we have developed SPEX (Spectral Explainer).

We formalize this through two observations: sparsity (relatively f…

2 months, 4 weeks ago @ bair.berkeley.edu
Information-Driven Design of Imaging Systems

We developed a framework that enables direct evaluation and optimization of imaging systems based on their information content.

The first approach treated imaging systems as unconstrained communication channels, ignoring the physical limitations of lenses and sensors.

Our Information-Driven Encoder Analysis Learning (IDEAL) method uses gradient ascent on information estimates to optimize imaging system parameters.

The standard approach to computational imaging design, end-to-end optimization, jointly trains the imaging hardware and a neural network decoder.

The computational efficiency of IDEAL suggests possibilities for designing imaging systems that were previously intractable.

5 months ago @ bair.berkeley.edu
RL without TD learning

In this post, I’ll introduce a reinforcement learning (RL) algorithm based on an “alternative” paradigm: divide and conquer.

We can do Reinforcement Learning (RL) based on divide and conquer, instead of temporal difference (TD) learning.

There are two classes of algorithms in RL: on-policy RL and off-policy RL.

We compared TRL with $n$-step TD learning with different values of $n$, from $1$ (pure TD) to $\infty$ (pure MC).
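As a rough illustration of the baseline named in that comparison, the following minimal sketch computes an $n$-step TD target for one trajectory. It does not implement TRL. The function name, the list-based trajectory layout, and the convention that the state after the last reward is terminal with value zero are all assumptions made for this example, not details from the post.

```python
def n_step_target(rewards, values, t, n, gamma=0.99):
    """Return the n-step TD target for time step t.

    rewards[k] is the reward received after acting in state k, and
    values[k] is the current value estimate of state k (hypothetical layout).
    The state reached after the last reward is treated as terminal (value 0).
    """
    horizon = len(rewards)
    end = min(t + n, horizon)  # n may be float("inf") for a pure Monte Carlo return
    target = 0.0
    for k in range(t, end):
        target += gamma ** (k - t) * rewards[k]
    if end < horizon:
        # Bootstrap from the value estimate once the n real rewards are used up.
        target += gamma ** (end - t) * values[end]
    return target


rewards = [1.0, 1.0, 1.0]
values = [0.5, 0.5, 0.5]
one_step = n_step_target(rewards, values, t=0, n=1, gamma=0.5)
monte_carlo = n_step_target(rewards, values, t=0, n=float("inf"), gamma=0.5)
print(one_step, monte_carlo)
```

With $n = 1$ the target uses one real reward and then bootstraps from the value estimate; with $n = \infty$ it never bootstraps and reduces to the discounted sum of observed rewards.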

I still think one of the most important problems in RL (and even in machine learning) is to find a scalable off-policy RL algorithm.

7 months, 1 week ago @ bair.berkeley.edu
What exactly does word2vec learn?

What exactly does word2vec learn, and how?

In this framing, it’s clear that word2vec is a minimal neural language model.

As a result, the theory predicts exactly what features are learned in terms of the corpus statistics and the algorithmic hyperparameters.

We find that over the course of learning, word2vec builds these linear representations in a sequence of noisy learning steps, and their geometry is well-described by a spiked random matrix model.

9 months, 1 week ago @ bair.berkeley.edu
AWS Machine Learning
latest post 1 hour ago
Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI

In this post, we show how to train robot policies for the Unitree H1 humanoid with NVIDIA Isaac Lab on Amazon SageMaker AI across two compute options: Amazon SageMaker HyperPod and Amazon SageMaker Training Jobs.

Why Amazon SageMaker AI for Physical AI training

Amazon SageMaker AI removes the undifferentiated heavy lifting of managing compute infrastructure for machine learning (ML) training.

NVIDIA Isaac Lab and the training task

NVIDIA Isaac Lab is an open-source robot learning framework built on NVIDIA Isaac Sim.

On SageMaker Training Jobs, SageMaker writes the host list to /opt/ml/input/config/resourceconfig.json, and the container’s entrypoint parses it at startup.
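A minimal sketch of what such startup parsing might look like is shown here. It is not the entrypoint from the post. The file path comes from the snippet above; the `hosts` and `current_host` keys are an assumption about the file's layout, and the function name `load_hosts` and the rank-by-sorted-order convention are hypothetical.

```python
import json
from pathlib import Path

# Path quoted in the post; SageMaker Training Jobs writes the host list here.
RESOURCE_CONFIG = Path("/opt/ml/input/config/resourceconfig.json")


def load_hosts(path=RESOURCE_CONFIG):
    """Return (hosts, current_host, rank) read from a resource config file.

    Assumption: the JSON contains a "hosts" list and a "current_host" string.
    """
    config = json.loads(Path(path).read_text())
    hosts = sorted(config["hosts"])  # sort so every node derives the same ordering
    current_host = config["current_host"]
    return hosts, current_host, hosts.index(current_host)


if __name__ == "__main__" and RESOURCE_CONFIG.exists():
    hosts, current_host, rank = load_hosts()
    print(f"{current_host} is node {rank} of {len(hosts)}")
```

Taking the path as a parameter keeps the function testable outside a training container, where the real file does not exist.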

To learn more, see th…

1 hour ago @ aws.amazon.com
Hands-free first notice of loss: Using Strands Agents and Amazon Bedrock AgentCore Browser Tool for intelligent claims intake

In this post, we demonstrate how a hands-free FNOL intake system combines agents built with the Strands Agents SDK for domain reasoning with Amazon Bedrock AgentCore Browser Tool for live portal interaction.

The workflow is illustrated using real browser automation recordings captured directly from the system in action.

The opportunity: Optimizing claims intake to amplify human expertise

Across insurance lines (auto, property and casualty, life, health, and specialty), claim intake marks the moment when unstructured information first enters the system.

Browser automation in practice

Agent-driven browser automation is executed from a separate control environment, such as a workstation or autom…

4 hours ago @ aws.amazon.com
Build an agentic incident triage assistant with Amazon Quick and New Relic

New Relic MCP Server integration for Amazon Quick overview

With Amazon Quick chat agents, you can explore data and take actions through open-ended conversations backed by connected action connectors, pre-built integrations that link Amazon Quick to external services.

Figure 1: Incident triage workflow using Amazon Quick, New Relic MCP Server, and Asana.

Step 1: Set up the New Relic integration

New Relic is available as a built-in connector in the Amazon Quick Integrations console.

Step 4: Test the workflow

Open the Incident triage assistant from Amazon Quick and send the following prompt.

To get started, follow the New Relic integration for Amazon Quick article, and the Amazon Quick User Guide.

5 hours ago @ aws.amazon.com
Unlocking AI flexibility in Europe: A guide to cross-region inference for EU data processing and model access

Cross-Region inference profiles

Cross-Region Inference (CRIS) is a managed capability in Amazon Bedrock that routes model inference requests within supported AWS Regions.

EU geography-based inference

Geographic CRIS (Geo CRIS) are system-defined inference profiles that differ from global inference profiles.

For requests originating from outside of the EU, using EU CRIS: the optimizations only consider the source Region and the EU Regions.

This also applies to both global inference profiles and geographic inference profiles.

Conclusion

Cross-Region inference (CRIS) allows generative AI applications to access models that might not be available in their primary AWS Region.

1 day, 5 hours ago @ aws.amazon.com
It’s safe to close your laptop now: Hosting coding agents on Amazon Bedrock AgentCore

Amazon Bedrock AgentCore Runtime gives every session a dedicated environment: an isolated Linux microVM with a persistent workspace, a real shell, and deterministic command execution.

Run Claude Code calling Opus, or Codex calling GPT-class models on Amazon Bedrock inside your VPC.

Agents hosted on AgentCore Runtime can live inside your VPC, which means you decide what “the internet” looks like from inside the microVM: Package installation.

The CoCounsel AI Assistant Agent is built on Claude Agent SDK that runs the same execution loop that powers Claude Code.

While it works, you open a terminal locally and run agentcore exec --it against the same session.

1 day, 5 hours ago @ aws.amazon.com
Better decisions at scale: How mathematical optimization delivers where intuition fails

If machine learning is inductive AI — learning patterns from many examples to make probabilistic predictions — mathematical optimization is deductive AI.

It applies mathematical principles to specific business problems and delivers definitive, provably optimal decisions.

Rather than competing, mathematical optimization and ML form powerful predict-then-optimize pipelines: machine learning models forecast demand or predict failures, and optimization uses those predictions to make the best possible decisions.

Both mathematical optimization and ML run on data, benefit from advances in cloud computing and hardware, and are rooted in deep mathematics.

Together, they represent how science, data, …

1 day, 5 hours ago @ aws.amazon.com
End-to-end encrypted ML inference with Amazon SageMaker AI and FHE

This post will show you how to use Amazon SageMaker AI with fully homomorphic encryption (FHE) to perform ML inference.

FHE ciphertexts can exceed SageMaker AI query size limits, so the client and service need to communicate them outside of SageMaker AI API calls.

FHE evaluation might take longer than SageMaker AI timeouts, and so inference will use the SageMaker AI mechanisms for asynchronous inference.

Delete the SageMaker AI model through the SageMaker AI console or SDK.

Delete the model artifacts, encrypted queries, encrypted responses, and evaluation keys from Amazon S3 through the Amazon S3 console or AWS CLI.

1 day, 5 hours ago @ aws.amazon.com
Amazon Quick ARNs: Cross-account migration and namespace permissions

In this post, we cover the structure of Amazon Quick ARNs and provide a practical mental model for working with them.

The “quicksight” portion refers to the Quick Sight capability within Amazon Quick.

Amazon Quick stores permissions as relationships between resource ARNs and principal ARNs.

Different ARNs, different identities.

Understanding Amazon Quick ARNs comes down to four things: ARNs are account-bound.

1 day, 5 hours ago @ aws.amazon.com
Evaluate your Amazon Nova Sonic voice agent at scale, no microphone required

In this post, we walk you through the Nova Sonic Test Harness, an open source framework that we built to solve both problems.

Here’s what makes voice agent testing fundamentally harder: Bidirectional streaming.

Because Nova Sonic responds differently every time, we evaluate against rubrics rather than checking for exact strings.

Test Nova Sonic as easily as testing any API.

1 day, 5 hours ago @ aws.amazon.com
NVIDIA Nemotron 3 Ultra now available on Amazon SageMaker JumpStart

Today, we are excited to announce the day-zero availability of NVIDIA Nemotron 3 Ultra on Amazon SageMaker JumpStart.

With this launch, you can now deploy the Nemotron 3 Ultra model using a one-click deployment experience.

Overview of NVIDIA Nemotron 3 Ultra

NVIDIA Nemotron 3 Ultra is an open large language model with 550 billion total parameters and 55 billion active parameters.

Get started now by searching for Nemotron 3 Ultra in Amazon SageMaker JumpStart.

Malav Shastri is a Software Development Engineer at AWS, where he works on the Amazon SageMaker JumpStart and Amazon Bedrock teams.

5 days, 4 hours ago @ aws.amazon.com
How to build self-driving AI operations on Amazon Bedrock at scale

If a support case was created, the email includes the case ID and a direct link to the AWS Support console.

The TPM alarm uses the EstimatedTPMQuotaUsage metric that tracks estimated TPM quota consumption, including cache write tokens and output burndown multipliers.

Threshold | Formula | Example
RPM threshold | RPM quota × (RequestsPerMinuteThresholdPercent / 100) | 10,000 RPM quota × 80% = 8,000
TPM threshold | TPM quota × (TokensPerMinuteThresholdPercent / 100) | 6,250,000 TPM quota × 80% = 5,000,000

The TPM threshold percentage is applied directly to the TPM quota.
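
The threshold arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `alarm_threshold` is hypothetical and not part of the solution, and only the formula (quota multiplied by the percentage over 100) and the two example values come from the post.

```python
def alarm_threshold(quota: float, threshold_percent: float) -> float:
    """Return quota × (threshold_percent / 100), the formula quoted in the post."""
    if not 0 < threshold_percent <= 100:
        raise ValueError("threshold_percent must be in the range (0, 100]")
    return quota * threshold_percent / 100


# Example values from the post: both quotas use an 80% threshold.
rpm_threshold = alarm_threshold(10_000, 80)      # RequestsPerMinuteThresholdPercent = 80
tpm_threshold = alarm_threshold(6_250_000, 80)   # TokensPerMinuteThresholdPercent = 80

print(rpm_threshold)  # 8000.0
print(tpm_threshold)  # 5000000.0
```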

Automated support case creation

The solution optionally automates AWS Support case creation when operational issues are detected.

Email c…

6 days, 1 hour ago @ aws.amazon.com
Fundamental’s Large Tabular Model NEXUS is now available on Amazon SageMaker JumpStart

Today, we’re announcing support for Fundamental’s NEXUS model on Amazon SageMaker AI.

As a Large Tabular Model, NEXUS is built for structured data analysis and offers these key innovations: Deterministic architecture – Probabilistic LLMs might provide different answers to identical queries.

How NEXUS works on Amazon SageMaker AI

The following figure illustrates the end-to-end flow for deploying and running predictions with NEXUS on SageMaker AI.

Get started with NEXUS on Amazon SageMaker AI

To get started, navigate to Amazon SageMaker JumpStart, search for Fundamental NEXUS, and choose from the following: Base model (pre-trained on over 10B tabular rows).

Conclusion

In this post, we showed how N…

6 days, 3 hours ago @ aws.amazon.com
Reducing container cold start times using SOCI index on DLAMI and DLC

Deep Learning AMI and AWS Deep Learning Containers are now enabled with support for SOCI snapshotter and index.

Container pulling mechanisms

When pulling a container for your workloads, AWS Deep Learning AMIs (DLAMI) and Deep Learning Containers offer three options: the standard Docker pull, SOCI parallel pull, and SOCI lazy loading through SOCI index.

SOCI lazy loading provides near-instant container loading but requires files to be fetched on demand.

Prerequisites

SOCI index required

Important: Lazy loading mode requires the container image to have a SOCI index stored in the registry.

For detailed configuration guidance and best practices, refer to the SOCI documentation and the Deep Learnin…

6 days, 5 hours ago @ aws.amazon.com
Improve your agent’s tool-calling accuracy with SFT and DPO on Amazon SageMaker AI

The example uses Amazon SageMaker AI training jobs, so you can focus on training code instead of managing your own training infrastructure.

To understand the costs associated with Amazon SageMaker Studio notebooks and Amazon SageMaker AI training jobs, refer to the SageMaker AI pricing page.

In this example, there is SFT training data and DPO training data that are run separately.

Conclusion

In this post, we showed how to improve an agent’s tool-calling accuracy by combining supervised fine-tuning (SFT) with Direct Preference Optimization (DPO) on Amazon SageMaker AI.

For more information about training models in SageMaker AI, see the SageMaker AI documentation.

6 days, 5 hours ago @ aws.amazon.com
The art and science of hyperparameter optimization on Amazon Nova Forge

The Nova Forge customization pipeline

Understanding these challenges frames how the Amazon Nova Forge customization pipeline is designed to address them.

Nova Forge supports SFT with data mixing using Amazon Nova-curated datasets, including reasoning-instruction-following categories that preserve general capabilities.

Deviating from default LR with data mixing | Training instability, loss spikes, capability collapse | Stick to service defaults when using data mixing.

To get started with Amazon Nova Forge, explore the Amazon Nova documentation and the SageMaker HyperPod recipes repository on GitHub.

For hands-on examples of data mixing in action, see the Nova Forge data mixing blog post.

1 week ago @ aws.amazon.com
NVIDIA
latest post 2 hours ago
Delivering Lifecycle Control for AI Infrastructure at Scale with NVIDIA DGX Spark Enterprise Manageability

NVIDIA DGX Spark and NVIDIA GB10 systems are delivering this foundation with new Enterprise Manageability.

How does DGX Spark Enterprise Manageability integrate into existing IT workflows?

NVIDIA partners that currently support DGX Spark from an enterprise manageability perspective include Progress Chef, Perforce Puppet, and Canonical Landscape.

Get started with NVIDIA DGX Spark Enterprise Manageability

Enterprise AI infrastructure carries enterprise expectations.

For additional documentation, visit DGX Spark Enterprise Manageability.

2 hours ago @ developer.nvidia.com
Model Quantization: Turn FP8 Checkpoints into High-Performance Inference Engines with NVIDIA TensorRT

In a previous post, we produced a high-quality FP8-quantized Contrastive Language-Image Pretraining (CLIP) checkpoint with NVIDIA TensorRT Model Optimizer.

We also profile the resulting FP8 TensorRT engine against the FP16 baseline to measure the real-world speedup the quantized model delivers.

Profile ONNX model with TensorRT

With the FP8 ONNX model exported, the next step is to pass it to TensorRT and measure how fast it runs.

--saveEngine writes the built TensorRT engine to disk for later reuse, either for standalone TensorRT inference runtime or for serving through NVIDIA Triton Inference Server (see this example).

CLIP FP8 vs FP16: TensorRT engine size and inference latency

Figure 3 show…

3 hours ago @ developer.nvidia.com
Accelerating Federated Learning Research with AI Agents and NVIDIA FLARE Auto-FL

Federated learning (FL) research often begins with a deceptively simple question: What should we try next?

NVIDIA FLARE Auto-FL is an automated, AI-driven research loop designed to test and optimize federated learning strategies.

A completed Auto-FL campaign on a heterogeneous CIFAR-10 data split with eight simulated FL clients in FLARE

How does Auto-FL make the research loop explicit?

Figure 2 shows the Auto-FL research loop with literature-grounded stall recovery.

Auto-FL research loop with literature-grounded stall recovery

What is the function of literature-grounded recovery?

5 hours ago @ developer.nvidia.com
How the UK Is Turning Sovereign AI Ambition Into Action With NVIDIA Technologies

U.K. technology leaders are innovating across healthcare and life sciences, coding, agentic AI, inference and more — all running on sovereign AI deployments.

Nebius has announced plans to expand customers and cloud capabilities with three new deployments of advanced NVIDIA AI infrastructure, as the NVIDIA AI Cloud ecosystem partner continues to build out its commercial and AI R&D hub in London.

CoreWeave is building in the U.K. Government’s AI Growth Zones, and seven more NVIDIA AI Cloud ecosystem partners have plans in the pipeline.

BT and Nscale announced plans to build sovereign AI data centers across three existing BT sites in the U.K., combining NVIDIA AI infrastructure, Nscale’s full …

1 day, 15 hours ago @ blogs.nvidia.com
NVIDIA and LG Group Build an AI Factory to Advance Physical AI, Mobility and AI Infrastructure

The AI factory will provide LG Group with accelerated computing infrastructure to train, simulate, validate and deploy AI-based applications across its key businesses.

Together, the companies are connecting AI model development, physical AI data generation, robot simulation and training, edge deployment and factory-scale digital twins into a unified workflow for building physical AI systems.

To help overcome the training data challenge for robotics, LG Electronics is developing a physical AI data factory poised to help Korean and global companies accelerate physical AI projects.

This initiative aligns with the NVIDIA DSX AI factory platform, enabling the rapid deployment of scalable, high-p…

1 day, 18 hours ago @ blogs.nvidia.com
NVIDIA and Doosan Group Collaborate to Advance Physical AI and AI Factory Infrastructure

NVIDIA and Doosan Group are expanding their collaboration to advance new opportunities across physical AI, robotics and AI factory infrastructure, spanning Doosan Robotics, Doosan Bobcat, Doosan Enerbility and Doosan Corporation Electro-Materials BG.

NVIDIA and Doosan will explore how NVIDIA’s physical AI stack, NVIDIA DSX AI factory platform, NVIDIA MGX and accelerated computing platforms can support these areas.

Doosan Bobcat also plans to explore integrating NVIDIA physical AI technologies into equipment used across construction, landscaping, agriculture and material handling applications.

Exploring AI Factory Power Solutions

Doosan Enerbility is exploring opportunities to support NVIDIA …

1 day, 22 hours ago @ blogs.nvidia.com
NVIDIA, KRAFTON, NC and Reigning ‘League of Legends’ Champions T1 Celebrate RTX Spark at Korea’s PC Bangs

At GTC Taipei at COMPUTEX last week, NVIDIA unveiled RTX Spark, the superchip that reinvents Windows PCs for the era of personal AI agents.

In addition, RTX Spark supports all NVIDIA RTX technologies, including the recently announced DLSS 4.5 Ray Reconstruction, which features a second-generation transformer model for realistic image quality.

RTX Spark Ignites Korea’s Gaming Community

Korea has played a major role in spearheading esports and driving the boom in PC bangs, or internet and gaming cafes.

There, he met with T1’s reigning League of Legends World Champion team, including six-time World Champion Lee “Faker” Sang-hyeok to unveil RTX Spark.

In addition to KRAFTON, NC, and Riot Games, …

2 days, 14 hours ago @ blogs.nvidia.com
Seoul Purpose: How NVIDIA and South Korea Are Building the Future of AI

Home to cutting-edge sovereign AI infrastructure and robotics innovators, as well as one of the world’s most passionate gaming communities, South Korea is one of the world’s centers of AI.

NVIDIA founder and CEO Jensen Huang is in Seoul this week to meet the partners and builders behind that work.

A key focus of the trip, Huang said: to align the AI supply chain ahead of a busy second half of the year.

“We have a very significant, very large AI infrastructure buildout — already a very successful first half,” Huang told media.

“Robotics is going to be the next major sector here in Korea — this is a great opportunity for Korea to invest in AI,” he said.

4 days, 16 hours ago @ blogs.nvidia.com
Forecast: Fun Ahead — 18 Games Join in June to Stream on GeForce NOW

June’s forecast with GeForce NOW: 100% chance of gaming.

GeForce NOW is lining up new adventures for the month, from big-name blockbusters to quirky indies ready for the spotlight.

Eighteen games are coming this month, starting with the 10 games arriving this week with the highly requested Neverness to Everness.

A World Beyond

Step into a surreal, supernatural open world in NTE: Neverness to Everness from Hota Studio.

Streaming with GeForce NOW lets the game’s distinctive art direction and atmospheric lighting shine, keeping the city’s moody glow, deep shadows and supernatural effects crisp and sharp.

5 days, 8 hours ago @ blogs.nvidia.com
NVIDIA Research Unlocks Advanced Grasping, Smarter Autonomous Driving and Agent Training at Scale

NitroGen is a generalized gameplay AI foundation model that harnesses the NVIDIA Isaac GR00T robot foundation model architecture to help train embodied agents in virtual environments across tens of thousands of hours of interaction.

The First Foundation Model for Grasping

Most AI systems for robotic grasping are specialists.

GraspGen-X is the first foundation model for grasping built to eliminate this bottleneck.

NitroGen extends that principle to virtual environments, using the GR00T architecture to train a foundation model for embodied agents across a breadth of virtual worlds.

Learn more about NVIDIA at CVPR and explore NVIDIA Research’s work in physical AI, computer vision and autonomous…

6 days, 6 hours ago @ blogs.nvidia.com
NVIDIA Enables the Next Era Of Physical AI Research With Agent Skills For Autonomous Vehicles, Robotics And Vision AI

At CVPR, NVIDIA is unveiling new physical AI agent skills that help researchers and developers speed the development of autonomous vehicles, robots and vision AI systems.

Leading across the open model public leaderboards central to physical AI, the world foundation model provides core capabilities for physical AI development.

Advancing Vision AI Systems for the Real World

For vision AI research, the bottleneck is creating enough controlled examples to study how models behave when visual conditions, object states or temporal events change.

NVIDIA researchers are presenting work across computer vision, physical AI, autonomous systems, neural rendering, generative AI and robotics at CVPR, runni…

6 days, 6 hours ago @ blogs.nvidia.com
Industrial Software Leaders Build Secure, Autonomous AI Engineers With NVIDIA NemoClaw

At GTC Taipei at COMPUTEX, NVIDIA and more than a dozen engineering software providers are showcasing how autonomous AI agents automate this entire workflow.

These AI engineers are based on NVIDIA NemoClaw, an open blueprint for building specialized, long-running agents with a secure runtime and frontier models.

Industrial Engineering Leaders Build AI Agents Across Design, Engineering, Simulation

Industrial software leaders are building AI engineers for computer-aided engineering (CAE) and electronic design automation (EDA) use cases across automotive, aerospace, semiconductors and manufacturing.

Startups Extend the Reach of Agentic AI

In addition, cutting-edge startups are building AI engine…

6 days, 23 hours ago @ blogs.nvidia.com
NVIDIA Partners With Microsoft on Unified Stack for Agentic AI Deployment, From Windows Devices to Cloud to Local

NVIDIA and Microsoft are bringing that full stack to developers across Windows devices, Azure cloud and local deployments.

Reinventing Windows for Agents: From RTX Spark to DGX Station for Windows

NVIDIA and Microsoft are reimagining Windows PCs for the age of AI agents.

Enhancing Azure Local and Foundry Local With NVIDIA RTX PRO 6000 Blackwell Server Edition and Nemotron Models

Agentic AI is moving beyond the cloud.

Microsoft is bringing Foundry Local on Azure Local to the NVIDIA RTX PRO 6000 Blackwell Server Edition platform.

In addition, Microsoft has already validated the NVIDIA Vera Rubin platform, now in full production, for deployment across Azure data centers.

1 week ago @ blogs.nvidia.com
Why Financial Institutions Are Converging on Transaction Foundation Models to Build Their Own Intelligence

Financial institutions have spent years building AI: fraud models, credit models, recommendation engines and risk systems.

EXL is integrating transaction foundation models into its EXLerate.ai platform to unify siloed financial data into a scalable, enterprise intelligence layer powered by proprietary transaction data.

Thoughtworks is helping financial institutions operationalize transaction foundation models within complex banking environments, integrating them into payment, servicing and risk while establishing the necessary governance and AI operating models.

The company will be showcasing a demo and presentation on transaction foundation models at the upcoming AWS Summit in New York Cit…

1 week ago @ blogs.nvidia.com
NVIDIA Jetson Brings Agentic AI to the Physical World

At COMPUTEX on Tuesday, NVIDIA announced NVIDIA JetPack 7.2 and NVIDIA NemoClaw support on Jetson.

JetPack 7.2 brings agentic AI skills, Yocto project support, NVIDIA CUDA 13 on NVIDIA Jetson Orin, a substantial performance gain on Jetson AGX Orin 32GB module and Multi-Instance GPU (MIG) support on NVIDIA Jetson Thor.

The release lands NemoClaw, NVIDIA’s agentic AI framework, on the production-grade Jetson stack — taking agentic AI from servers and workstations into the physical world, across robotics, inspection and industrial automation.

The pairing lands agentic AI on a production-grade robotics and vision AI stack, accelerating task automation for industrial systems.

Advantech is buildi…

1 week ago @ blogs.nvidia.com
Facebook
latest post 2 weeks ago
SilverTorch: Index as Model — A New Retrieval Paradigm for Recommendation Systems

Retrieval within industry recommendation systems has consisted of microservices stitched together, with neural networks inconsistently integrated.

Under Index as Model, the microservice-based item indices previously used for retrieval become a tensor inside the model.

Moving From Microservice Mesh to One Integrated Neural Network

The Microservice Paradigm We Replaced

Traditional recommendation retrieval is built as a mesh of microservices.

We call this Index as Model: Every retrieval component — the item index, eligibility filter, scoring layer and user tower — becomes a tensor or operator inside a single PyTorch model.

Index Freshness

With index as a model module, maintaining index fresh…

2 weeks ago @ engineering.fb.com
Reel Friends: Building Social Discovery that Scales to Billions

On its face, the new Friend Bubbles feature looks simple enough.

It highlights Reels your friends have watched and reacted to.

On this episode of the Meta Tech Podcast, Pascal Hartig chats with Subasree and Joseph, two software engineers from the Facebook Reels team, about what it took to bring Friend Bubbles to life.

If you’ve ever underestimated a “simple” feature, this one’s for you.

And if you’re interested in learning more about career opportunities at Meta, visit the Meta Careers page.

3 weeks, 6 days ago @ engineering.fb.com
Modernizing the Facebook Groups Search to Unlock the Power of Community Knowledge

We’ve fundamentally transformed Facebook Groups Search to help people more reliably discover, sort through, and validate community content that’s most relevant to them.

We’ve adopted a new hybrid retrieval architecture and implemented automated model-based evaluation to address the major friction points people experience when searching community content.

Addressing the Friction Points in Community Knowledge

People struggle with three friction points when searching for answers in community content – discovery, consumption, and validation.

The Solution: A Modernized Hybrid Retrieval Architecture

We engineered a hybrid retrieval architecture that powers a discussions module on Facebook Search.

R…

1 month, 2 weeks ago @ engineering.fb.com
Capacity Efficiency at Meta: How Unified AI Agents Optimize Performance at Hyperscale

We’ve built a unified AI agent platform that encodes the domain expertise of senior efficiency engineers into reusable, composable skills.

Introducing the Capacity Efficiency Program

When the code you ship serves more than 3 billion people, even a 0.1% performance regression can translate to significant additional power consumption.

Many engineers at Meta use our efficiency tools to work on these problems every day.

Skills: These encode domain expertise about performance efficiency.

The pipeline mirrors the defensive AI Regression Solver: Gather context with tools: The AI agent looks up: Opportunity metadata.

1 month, 3 weeks ago @ engineering.fb.com
How Meta Used AI to Map Tribal Knowledge in Large-Scale Data Pipelines

Challenging the Conventional Wisdom on AI Context Files

Recent academic research found that AI-generated context files actually decreased agent success rates on well-known open-source Python repositories.

Our codebase is the opposite: proprietary config-as-code with tribal knowledge that exists nowhere in any model’s training data.

Any team with a large, proprietary codebase can benefit: Identify your tribal knowledge gaps.

What’s Next

We are expanding context coverage to additional pipelines across Meta’s data infrastructure and exploring tighter integration between context files and code generation workflows.

This approach turned undocumented tribal knowledge into structured, AI-readable con…

2 months ago @ engineering.fb.com
KernelEvolve: How Meta’s Ranking Engineer Agent Optimizes AI Infrastructure

This is the second post in the Ranking Engineer Agent blog series exploring the autonomous AI capabilities accelerating Meta’s Ads Ranking innovation.

We introduce KernelEvolve, an agentic kernel authoring system used by Ranking Engineer Agent and generally applicable to a range of AI models beyond Ads Ranking.

Unlike typical large language model (LLM)-based agents that perform one-shot code generation, KernelEvolve treats kernel optimization as a search problem.

A standard coding assistant lacks the context to write optimized MTIA kernels because it has never seen MTIA documentation, instruction set details, or programming idioms.

KernelEvolve represents an early step toward the vision of …

2 months, 1 week ago @ engineering.fb.com
Meta Adaptive Ranking Model: Bending the Inference Scaling Curve to Serve LLM-Scale Models for Ads

To overcome this, we have developed the Meta Adaptive Ranking Model, which effectively bends the inference scaling curve with high ROI and industry-leading efficiency.

Introducing Meta Adaptive Ranking Model

Serving models of LLM scale and complexity in a real-time ads recommendation environment requires resolving a fundamental tension between model complexity and system efficiency.

Adaptive Ranking Model addresses these challenges through a paradigm shift powered by three core innovations across the serving stack: Inference-efficient model scaling: Adaptive Ranking Model achieves a model complexity equivalent to the O(10 GFLOPs) per token used by top-tier LLMs.

To minimize compute overhead, Adapt…

2 months, 1 week ago @ engineering.fb.com
AI for American-Produced Cement and Concrete

Concurrent with the 2026 American Concrete Institute (ACI) Spring Convention, Meta is releasing a new AI model for designing concrete mixes – Bayesian Optimization for Concrete (BOxCrete), as well as the foundational data used to develop award-winning concrete mixes.

Amrize operates 18 cement plants, 141 cement terminals and 269 ready-mix concrete sites across North America.

Alongside the event, Meta is releasing a new AI model for designing concrete mixes, Bayesian Optimization for Concrete (BOxCrete).

How Meta Leverages AI for Concrete Mixtures

Meta’s AI for concrete model can help suppliers more quickly incorporate U.S. materials into their mixes through an approach called adaptive experi…

2 months, 1 week ago @ engineering.fb.com
Friend Bubbles: Enhancing Social Discovery on Facebook Reels

Friend bubbles in Facebook Reels highlight Reels your friends have liked or reacted to, helping you discover new content and making it easier to connect over shared interests.

Friend bubbles enhance the social experience on Facebook Reels by helping you discover content your friends enjoy, creating a shared viewing experience and sparking new conversations.

Along with additional optimizations in the underlying method, this approach enabled us to ship friend bubbles while preserving core Reels performance.

Friend bubbles work because the signal is high value: It adds meaningful social context that helps people decide what’s worth watching.

Engagement also scales consistently with the number …

2 months, 3 weeks ago @ engineering.fb.com
Ranking Engineer Agent (REA): The Autonomous AI Agent Accelerating Meta’s Ads Ranking Innovation

Meta’s Ranking Engineer Agent (REA) autonomously executes key steps across the end-to-end machine learning (ML) lifecycle for ads ranking models.

Powering these interactions are highly sophisticated, complex and massively distributed machine learning (ML) models that continuously evolve to serve both advertisers and people who use the platforms.

Optimizing these ML models has traditionally been time-consuming.

To address this, Meta built the Ranking Engineer Agent, an autonomous AI agent designed to drive the end-to-end ML lifecycle and iteratively evolve Meta’s ads ranking models at scale.

ML training jobs run for hours or days, far beyond what any session-bound assistant can manage.

2 months, 3 weeks ago @ engineering.fb.com
Patch Me If You Can: AI Codemods for Secure-by-Default Android Apps

Nowhere is this more apparent than in mobile security, where a single class of vulnerability can be replicated across hundreds of call sites scattered throughout a sprawling, multi-app codebase serving billions of users.

Meta’s Product Security team has developed a two-pronged strategy to address this: designing secure-by-default frameworks that wrap potentially unsafe Android OS APIs and make the secure path the easiest path for developers, and leveraging generative AI to automate the migration of existing code to those frameworks at scale.

The result is a system that can propose, validate, and submit security patches across millions of lines of code with minimal friction for the engineers w…

2 months, 4 weeks ago @ engineering.fb.com
RCCLX: Innovating GPU communications on AMD platforms

RCCLX is fully integrated with Torchcomms and aims to empower researchers and developers to accelerate innovation, regardless of their chosen backend.

We want to iterate on collectives, transports, and novel features quickly on AMD platforms.

With RCCLX, we have integrated CTran to AMD platforms, enabling the AllToAllvDynamic – a GPU-resident collective.

These features provide significant performance improvements on AMD platforms and we are excited to share this with the community.

RCCLX Quick Start Guide

Install Torchcomms with RCCLX backend by following the installation instructions in the Torchcomms repo.

3 months, 2 weeks ago @ engineering.fb.com
The Death of Traditional Testing: Agentic Development Broke a 50-Year-Old Field, JiTTesting Can Revive It

A Catching JiTTest focuses specifically on finding regressions introduced by a code change.

Agentic development dramatically increases the pace of code change, straining test development burden and scaling the cost of false positives and test maintenance to breaking point.

And since the JiTTest itself is LLM-generated, it can often infer the plausible intention of a code change and simulate possible faults that may result from it.

With them, engineers no longer have to spend time writing, reviewing, and testing complex test code.

Read the paper: Just-in-Time Catching Test Generation at Meta

3 months, 4 weeks ago @ engineering.fb.com
Adapting the Facebook Reels RecSys AI Model Based on User Feedback

Our new User True Interest Survey (UTIS) model now helps surface more niche, high-quality content and boosts engagement, retention, and satisfaction.

Our paper, “Improve the Personalization of Large-Scale Ranking Systems by Integrating User Survey Feedback,” shares full details on this work.

The main candidate ranking model used by the platform is a large multi-task, multi-label model.

We trained a lightweight UTIS alignment model layer on the collected user survey responses using existing predictions of the main model as input features.

The UTIS model consistently outperformed the baseline, driving higher user engagement and retention.
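The alignment-layer idea described above can be illustrated with a minimal sketch. It assumes the lightweight layer is a plain logistic regression whose input features are the main model's existing predictions and whose labels are binary survey responses; the post does not specify the actual architecture, and the function names, feature names, and toy data here are all hypothetical.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train_alignment_layer(features, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression with batch gradient descent.

    features: rows of main-model predictions (hypothetical inputs).
    labels:   1 if the survey response says "truly interested", else 0.
    """
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_interest(weights, bias, x):
    """Probability that a user would report true interest in the item."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Toy data (made up): each row is (p_like, normalized_watch_time) from the main model.
main_model_preds = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1],
                    [0.1, 0.3], [0.7, 0.6], [0.3, 0.2]]
survey_labels = [1, 1, 0, 0, 1, 0]

weights, bias = train_alignment_layer(main_model_preds, survey_labels)
```

Because the layer only consumes predictions the main model already produces, it can be trained on a small survey dataset without retraining the large ranking model.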

4 months, 3 weeks ago @ engineering.fb.com
DrP: Meta’s Root Cause Analysis Platform at Scale

DrP’s key components include:

Expressive SDK: The DrP SDK allows engineers to codify investigation workflows into analyzers.

Post-processing system: After an investigation, the post-processing system can take automated actions based on the analysis results.

Bootstrap code: The DrP SDK provides bootstrap code to create a template analyzer with pre-populated boilerplate code.

Data access and analysis: The SDK includes libraries for data access and analysis, such as dimension analysis and time series correlation.

This provides immediate analysis results to on-call engineers.

5 months, 3 weeks ago @ engineering.fb.com
Uber Engineering
last post None
neptune.ai
last post 6 months, 1 week ago
We are joining OpenAI

Piotr Niedźwiedź, CEO/CTO and founder of neptune.ai: I’m excited to share that we’ve entered into a definitive agreement to be acquired by OpenAI, subject to closing conditions.

We are thrilled to join the OpenAI team and help their AI researchers build better models faster.

Neptune is a metrics dashboard company.” We’ve worked closely with OpenAI to create the metrics dashboard that helps teams building foundation models.

Our future with OpenAI: Neptune will join OpenAI and continue to support AI researchers with tools to monitor, debug, and evaluate frontier models.

We are looking forward to working with top AI researchers and supporting OpenAI’s mission of ensuring that AGI benefits all of hu…

6 months, 1 week ago @ neptune.ai
Synthetic Data for LLM Training

For instance, financial data is highly sensitive and protected by very strict regulations, and synthetic data mimics the real data distribution without revealing customer information.

Read more about how leading foundation model teams curate their training data and other topics in the State of Foundation Model Training Report 2025.

Choosing the right synthetic data generation technique depends on the type of data and its complexity.

Synthetic tabular data generation is a promising direction to overcome these challenges by learning the distribution of the tabular data.

Post-processing: As the distribution of tabular data is highly complex, it makes the synthetic tabular data generation very ch…

6 months, 4 weeks ago @ neptune.ai
What are LLM Embeddings: All you Need to Know

TL;DR LLM embeddings are the numerical, vector representations of text that Large Language Models (LLMs) use to process information.

Unlike their predecessor word embeddings, LLM embeddings are context-aware and dynamically change to capture semantic and syntactic relationships based on the surrounding text.

What are the applications of LLM embeddings?

Word embeddings timeline. Sparse word embeddings: One-Hot Vectors, 1970s TF-IDF, 1980s Co-Occurrence Matrix. Static word embeddings: Word2Vec (2013), GloVe (2014). Contextualized word embeddings: ELMo (2018), GPT-1 (2018), BERT (2018), LLAMA (2023), DeepSeek-V1 (2023), GPT-4 (2023).

Static word embeddings: Static word embeddings, such as word2vec in 2013, marked a significant development.…

7 months ago @ neptune.ai
Detecting and Fixing ‘Dead Neurons’ in Foundation Models

TL;DR Dead neurons silently waste compute and reduce effective model capacity in foundation models.

Dead neurons’ impact: Recent studies into dead neurons in the context of foundation models show interesting, albeit worrying, results.

These large reported fractions of dead neurons in foundation models are a concern from a computational perspective.

Before we move on to discuss how to detect and fix dead neurons, let’s touch upon an important distinction between dead neurons and vanishing gradients.

Visualizing activation distributions: Is your foundation model suffering from dead neurons?
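As a rough illustration of what detecting dead neurons can look like, the sketch below assumes a single ReLU layer and treats a neuron as dead when its activation is zero for every sample in a batch. The article may use a different criterion (for example, a near-zero threshold over many batches), and the helper names and toy weights are hypothetical.

```python
def relu(x):
    return x if x > 0.0 else 0.0

def layer_activations(inputs, weights, biases):
    """ReLU activations: one row per input sample, one column per neuron."""
    return [
        [relu(sum(w * xi for w, xi in zip(w_row, x)) + b)
         for w_row, b in zip(weights, biases)]
        for x in inputs
    ]

def dead_neuron_indices(activations, eps=0.0):
    """Indices of neurons whose activation never exceeds eps on any sample."""
    n_neurons = len(activations[0])
    return [j for j in range(n_neurons)
            if all(row[j] <= eps for row in activations)]

# Toy layer with three neurons; the large negative bias keeps neuron 1 silent.
inputs = [[1.0, 2.0], [0.5, -1.0], [-1.0, 0.5]]
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
biases = [0.0, -10.0, 0.0]

activations = layer_activations(inputs, weights, biases)
dead = dead_neuron_indices(activations)
dead_fraction = len(dead) / len(weights)
```

In a real model the activations would be collected from the network itself over a representative dataset, and the dead fraction would be tracked per layer during training.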

7 months, 2 weeks ago @ neptune.ai
Part 2: Instruction Fine-Tuning: Evaluation and Advanced Techniques for Efficient Training

In the first part of this series, we covered the fundamentals of instruction fine-tuning (IFT).

def calculate_irs(instruction, output, reference_model): evaluation_prompt = f""" Instruction: {instruction} Model Output: {output} Rate how well the output follows the instruction on these criteria: 1.

HINT addresses a computational inefficiency in standard instruction fine-tuning: repeatedly reprocessing the same task instruction with every input example.

Read more about foundation model training infrastructure and other topics in Neptune’s 2025 State of Foundation Model Training Report.

First, during initial instruction fine-tuning across multiple diverse tasks, the model learns genera…

7 months, 2 weeks ago @ neptune.ai
How to Optimize LLM Inference

Large Language Model (LLM) inference at scale is challenging as it involves transferring massive amounts of model parameters and data and performing computations on large tensors.

In the following, we’ll use the Llama model family architecture as a specific example to understand the LLM workload at inference.

For a far more detailed analysis of the LLM workload at inference, see the chapter All About Transformer Inference in the book How to Scale Your Model, published by Google DeepMind.

A quick primer on hardware for LLM inference: A typical LLM inference cluster consists of several nodes, each with a multi-core CPU and multiple accelerator devices, …

7 months, 4 weeks ago @ neptune.ai
A Researcher’s Guide to LLM Grounding

In this article, we’ll explore the fundamental concepts of LLM grounding as well as strategies for optimally grounding models.

What is LLM grounding?

LLM grounding is analogous.

If relevant knowledge cannot be inferred from the data, then LLM grounding cannot yield more relevant responses.

When grounding LLMs using RAG, consider retaining only a few of the top hits (i.e., top-k) for your retrieval queries.
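The top-k advice above can be sketched as follows. This is a minimal illustration that assumes documents and queries are already embedded as vectors and that relevance is measured by cosine similarity; the document IDs, vectors, and function names are made up for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_hits(query_vec, doc_vecs, k=2):
    """Return the IDs of the k documents most similar to the query."""
    scored = sorted(
        ((cosine_similarity(query_vec, vec), doc_id)
         for doc_id, vec in doc_vecs.items()),
        reverse=True,
    )
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical pre-computed embeddings for three documents.
doc_vecs = {
    "pricing": [1.0, 0.0, 0.0],
    "refunds": [0.6, 0.8, 0.0],
    "holidays": [0.0, 0.0, 1.0],
}
query_vec = [1.0, 0.05, 0.0]

context_ids = top_k_hits(query_vec, doc_vecs, k=2)
```

Only the documents in `context_ids` would be placed in the prompt, which keeps weakly related hits such as "holidays" out of the context.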

8 months, 2 weeks ago @ neptune.ai
Instruction Fine-Tuning: Fundamentals, Architecture Modifications, and Loss Functions

TL;DR Instruction fine-tuning (IFT) refines pre-trained large language models (LLMs) to follow specific task instructions by training on prompt-response pairs.

Instruction fine-tuning in a nutshell: IFT tailors LLMs to follow user instructions by bridging their inherent next-word prediction with human-defined objectives.

Parameter-efficient instruction fine-tuning: While major foundation models like GPT-4 or Llama-2 undergo full parameter instruction fine-tuning during development, parameter-efficient fine-tuning (PEFT) methods have become widely adopted for instruction fine-tuning since the LoRA paper was publi…

8 months, 3 weeks ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
last post 3 months ago
I BUILT A FULLY AUTOMATIC MANSPLAINER

All information about GTC and the DGX Spark Raffle is here: https://www.ykilcher.com/gtc Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

Ethereu…

3 months ago @ youtube.com
Traditional X-Mas Stream

Letsgooo

5 months, 1 week ago @ youtube.com
Traditional Holiday Live Stream

https://ykilcher.com/discord Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https:/…

5 months, 1 week ago @ youtube.com
TiDAR: Think in Diffusion, Talk in Autoregression (Paper Analysis)

Paper: https://arxiv.org/abs/2511.08923 Abstract:

Diffusion language models hold the promise of fast parallel generation, while autoregressive (AR) models typically excel in quality due to their causal structure aligning naturally with language modeling. This raises a fundamental question: can we achieve a synergy with high throughput, higher GPU utilization, and AR level quality? Existing methods fail to effectively balance these two aspects, either prioritizing AR using a weaker model for sequential drafting (speculative decoding), leading to lower drafting efficiency, or using some form of left-to-right (AR-like) decoding logic for diffusion, which still suffers from quality degradation …

5 months, 2 weeks ago @ youtube.com
Titans: Learning to Memorize at Test Time (Paper Analysis)

Paper: https://arxiv.org/abs/2501.00663 Abstract:

Over more than a decade there has been an extensive research effort on how to effectively utilize recurrent models and attention. While recurrent models aim to compress the data into a fixed-size memory (called hidden state), attention allows attending to the entire context window, capturing the direct dependencies of all tokens. This more accurate modeling of dependencies, however, comes with a quadratic cost, limiting the model to a fixed-length context. We present a new neural long-term memory module that learns to memorize historical context and helps attention to attend to the current context while utilizing long past information. We sh…

5 months, 3 weeks ago @ youtube.com
[Paper Analysis] The Free Transformer (and some Variational Autoencoder stuff)

https://arxiv.org/abs/2510.17558 Abstract:

We propose an extension of the decoder Transformer that conditions its generative process on random latent variables which are learned without supervision thanks to a variational procedure. Experimental evaluations show that allowing such a conditioning translates into substantial improvements on downstream tasks. Author: François Fleuret

7 months, 1 week ago @ youtube.com
[Video Response] What Cloudflare's code mode misses about MCP and tool calling

Theo's Video: https://www.youtube.com/watch?v=bAYZjVAodoo

Cloudflare article: https://blog.cloudflare.com/code-mode/

7 months, 3 weeks ago @ youtube.com
[Paper Analysis] On the Theoretical Limitations of Embedding-Based Retrieval (Warning: Rant)

Paper: https://arxiv.org/abs/2508.21038 Abstract:

Vector embeddings have been tasked with an ever-increasing set of retrieval tasks over the years, with a nascent rise in using them for reasoning, instruction-following, coding, and more. These new benchmarks push embeddings to work for any query and any notion of relevance that could be given. While prior works have pointed out theoretical limitations of vector embeddings, there is a common assumption that these difficulties are exclusively due to unrealistic queries, and those that are not can be overcome with better training data and larger models. In this work, we demonstrate that we may encounter these theoretical limitations in realist…

8 months ago @ youtube.com
Henry AI Labs
last post None
3blue1brown
last post 2 days, 9 hours ago
Reinventing Entropy | Compression & Intelligence Part 1

What is the fundamental compressibility of language?

Check out our virtual career fair: https://3b1b.co/talent

See new projects before they go live: https://3b1b.co/support Animation credit:

Manim scenes by Aaron Gostein and Grant Sanderson

Shannon’s story, as well as those for various pi creatures, by Mitchell Zemil.

Lunar robot and prediction/compression coin by Paul Dancstep

NanoGPT animations by Clayton Rabideau Shannon’s “A Mathematical Theory of Communication”

https://people.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf Shannon’s “Prediction and Entropy of Printed English”

https://www.princeton.edu/~wbialek/rome/refs/shannon_51.pdf Scientific American article that…

2 days, 9 hours ago @ youtube.com
Tie random ends: How many loops?

Recent puzzle solutions on Patreon:

https://members.3blue1brown.com/posts/158885046?pr=true

2 weeks, 4 days ago @ youtube.com
Covering 10 points, a surprisingly tricky puzzle.

Made as part of a monthly series of puzzles for the 2026 Year of Math.

1 month, 3 weeks ago @ youtube.com
Escher's most mind-bending piece

On "The Print Gallery", by M.C. Escher

Full video: https://youtu.be/ldxFjLJ3rVY

2 months, 2 weeks ago @ youtube.com
The subset sum puzzle

Part of a series of monthly puzzlers. Stay subscribed to see the solution

2 months, 2 weeks ago @ youtube.com
Escher's most mathematically interesting piece

Escher's Print Gallery, and the tour of complex analysis it invites.

Check out our virtual career fair: 3b1b.co/talent

Join channel supporters to see videos early: 3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Original paper by de Smit and Lenstra:

https://pub.math.leidenuniv.nl/~smitbde/papers/2003-de_smit-lenstra-escher.pdf Timestamps: 0:00 - The print gallery

13:04 - Conformal maps from complex analysis

21:41 - The complex exponential

25:56 - The complex logarithm

32:32 - 3b1b Talent

33:14 - Constructing the key function

40:16 - The deeper math behind Escher ------------------ These animations are largely made us…

2 months, 2 weeks ago @ youtube.com
Bacteria Grid Puzzle Solution

Part of a monthly series of puzzlers, in collaboration with MoMath and Peter Winkler

2 months, 2 weeks ago @ youtube.com
The most underappreciated formula | Exploring high-dimensional spheres

On the volumes of higher-dimensional spheres

Explore the 3b1b virtual career fair: See https://3b1b.co/talent

Become a supporter for early views of new videos: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Thanks to UC Santa Cruz for letting me film there, and special thanks to Pedro Morales-Almazan for arranging everything. My video on Numberphile with a fun application of this problem: https://youtu.be/6_yU9eJ0NxA Timestamps:

0:00 - Introduction

1:01 - Random puzzle

6:16 - Outside the box

14:35 - Setting up the volume grid

21:14 - Why 4πr^2

25:21 - Archimedes in higher dimensions

36:17 - The general formul…

3 months, 1 week ago @ youtube.com
The lattice bacteria puzzle

Part of a series of monthly puzzles, done in collaboration with MoMath.

https://momath.org/mindbenders

3 months, 3 weeks ago @ youtube.com
Solution to the ladybug clock puzzle

Solution to last month's probability puzzle.

3 months, 3 weeks ago @ youtube.com
The Hairy Ball Theorem

Unexpected applications and a beautiful proof.

Looking for a new career? Check out https://3b1b.co/talent

Supporters get early access to new videos: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Credits:

Senia Sheydvasser: Co-writing and sphere deformation animations

Paul Dancstep: Those lovely fluffy sphere animations Vince Rubinetti: Music Timestamps:

0:00 - To comb a hairy ball

1:24 - Applications

8:46 - The puzzle of one null point

12:12 - The proof outline

16:41 - Defining orientation

21:44 - Why inside-out is impossible

25:59 - 3b1b Talent

27:44 - Final food for thought ------------------ These animati…

4 months, 1 week ago @ youtube.com
The ladybug clock puzzle

This is the first in a set of monthly puzzles, curated by Peter Winkler. This one was originally suggested by Richard Stanley. You can sign up to hear his description of the answer at http://momath.org/mindbenders

4 months, 3 weeks ago @ youtube.com
The most absurd product I've made

Because why not make a pi creature neck pillow?

Available at 3b1b.co/store

6 months, 2 weeks ago @ youtube.com
How Laplace transforms solve differential equations

Studying the forced harmonic oscillator by taking a Laplace transform and studying its poles.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Chapter on the Laplace Transform:

https://youtu.be/j0wJBEZdwLs Chapter on the S-plane and Simple Harmonic Motion:

https://youtu.be/-j8PzkZ70Lg Timestamps:

0:00 - Opening puzzle

1:06 - Key properties of a Laplace Transform

3:29 - Qualitative analysis with Laplace Transforms

4:29 - The Laplace Transforms of a Derivative

6:06 - The forced oscillator

11:59 - Intuition from the transformed solution

1…

7 months ago @ youtube.com
The dynamics of e^(πi)

A fuller version of this explanation, also including the reason we care about complex exponents in the first place: https://youtu.be/-j8PzkZ70Lg

8 months ago @ youtube.com
Two Minute Papers
last post 3 days, 15 hours ago
AI Agents as "Games Masters"? 🎮🔥

Check the pinned comment for the link to the full interview. Could AI agents eventually become the "Games Master" driving your gaming storylines? We explore the concept of AI assisting players or creating dynamic, non-scripted narratives. Discover how AI is currently being tested inside immersive game environments to change how we play. 🧠 Hashtags: #aiingames #gaming #ai #gamedev #futuretech

3 days, 15 hours ago @ youtube.com
DeepMind’s New AI Found A Strange New Way To Think

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper is available here:

https://github.com/google-deepmind/alphaproof-nexus-results

https://arxiv.org/html/2605.22763v1 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Charles Ian Norman Venn, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Shawn Becker, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

Thumbnail design: https://felicia.hu

4 days, 5 hours ago @ youtube.com
Meet the AI "Co-Scientist" Changing Everything 🤖🧪 #ai
6 days, 4 hours ago @ youtube.com
Claude Opus 4.8: Lying Machine No More

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Anthropic's Opus 4.8: https://www.anthropic.com/news/claude-opus-4-8

6 days, 7 hours ago @ youtube.com
A Second Nobel Prize for AlphaFold? 🧬🏆 #alphafold #deepmind #nobelprize #science #ai

Check the pinned comment for the link to the full interview. We're discussing whether a "second order Nobel" prize is on the horizon for AI-driven science. With over 3 million researchers already using AlphaFold, the real-world impact is already historic. Hear what the experts think about what comes next for scientific discovery! 🔬

1 week ago @ youtube.com
Google's Jeff Dean On Data Center Fires, And The Future Of AI

Thank you to Google for the invite! 🙏 I use Lambda GPU Cloud myself to rent NVIDIA GPUs for my projects - I’d really appreciate it if you checked them out! https://lambdalabs.com/papers Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers

1 week, 1 day ago @ youtube.com
Feynman vs. Einstein vs. Newton: Who Wins? 🧠🤔 #physics #ai #science #feynman #research

Check the pinned comment for the link to the full interview. In this quick clip, we explore which legendary scientist ranks higher among the experts. It's a fun debate that leads into an even bigger discussion about AI's role in future scientific breakthroughs. You won't want to miss the full deep dive with Demis Hassabis! ⚡️

1 week, 1 day ago @ youtube.com
Google DeepMind CEO Likes Hard Questions

Full video: https://youtu.be/huAwz_BR8WM

#shorts

2 weeks ago @ youtube.com
Insane AI Breakthroughs With Demis Hassabis

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers

2 weeks, 1 day ago @ youtube.com
DeepSeek Just Changed How AI Sees Images Forever

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://github.com/ailuntx/Thinking-with-Visual-Primitives

https://huggingface.co/datasets/NodeLinker/deepseek-ai-Thinking-with-Visual-Primitives-deleted-repo/blob/main/Thinking_with_Visual_Primitives.pdf Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers

2 weeks, 4 days ago @ youtube.com
NVIDIA’s New AI Is Fast For A Strange Reason

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://arxiv.org/abs/2604.24954

https://developer.nvidia.com/blog/nvidia-nemotron-3-nano-omni-powers-multimodal-agent-reasoning-in-a-single-efficient-open-model/

https://huggingface.co/blog/nvidia/nemotron-3-nano-omni-multimodal-intelligence Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers

3 weeks, 6 days ago @ youtube.com
OpenAI's GPT 5.5 Instant: The Good, The Bad And The Insane

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 GPT 5.5 Instant:

https://deploymentsafety.openai.com/gpt-5-5-instant/introduction

https://openai.com/index/gpt-5-5-instant/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers

1 month ago @ youtube.com
DeepSeek V4 AI: Crushing The Competition

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 Check out DeepSeek here:

https://www.deepseek.com/en/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers

1 month ago @ youtube.com
NVIDIA's New AI Turns One Photo Into A World That Never Breaks

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://research.nvidia.com/labs/sil/projects/lyra2/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers

1 month, 1 week ago @ youtube.com
Sakana AI’s God Simulator Is Brilliant

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 Try it out! The paper is available here:

https://pub.sakana.ai/digital-ecosystem/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Charles Ian Norman Venn, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Shawn Becker, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

Thumbnail design: https:…

1 month, 1 week назад @ youtube.com
DataFest Video
last post None
JetBrains Research Seminars
last post None
Yandex. Computer Science
last post 7 hours ago
ARGUS: a large recommender transformer in a system serving hundreds of thousands of RPS

At Data Fest 2026 in Moscow, Georgy Smirnov of Search Services and AI described how the team deployed ARGUS, a large recommender transformer, in Yandex advertising. He shared the engineering findings and trade-offs that made it possible to run a heavy model in production under a load of hundreds of thousands of RPS. More materials for developers: https://t.me/+owyCvdge8WIyNTUy #DataFest #DataFest2026 #AI #ML #LLM #GenAI #MachineLearning #DataScience #MLOps #AIAgents #RAG #ComputerVision #AutonomousDriving #Yandex #Яндекс #TechTalk #Developers #ArtificialIntelligence #ReinforcementLearning #MultimodalAI

7 hours ago @ youtube.com
Closed-loop RL for autonomous driving

At Data Fest 2026 in Moscow, Pavel Lukyanov of B2B Tech presented a practical closed-loop RL setup for an autonomous vehicle planner. In the talk he covered the pretraining architecture, the minimum requirements for the environment and simulation, and the choice of algorithms and rewards for stable training. More materials for developers: https://t.me/+owyCvdge8WIyNTUy #DataFest #DataFest2026 #AI #ML #LLM #GenAI #MachineLearning #DataScience #MLOps #AIAgents #RAG #ComputerVision #AutonomousDriving #Yandex #Яндекс #TechTalk #Developers #ArtificialIntelligence #ReinforcementLearning #MultimodalAI

7 hours ago @ youtube.com
How to speed up multimodal labeling fourfold without losing quality

At Data Fest 2026 in Moscow, Alexander Mandrov of Search Services and AI shared an approach to a fourfold speedup of multimodal labeling. Instead of training on the whole dataset, the team selected the hardest examples and enriched them with reasoning from a large model, bringing the compact Alice AI VLM Light up to the level of Pro in certain domains. Alexander explained when this strategy pays off, which filters work on real crowdsourced data, and where the small model can already replace Pro. More materials for developers: https://t.me/+owyCvdge8WIyNTUy #DataFest #DataFest2026 #AI #ML #LLM #GenAI #MachineLearning #DataScience #MLOps #AIAgents #RAG #ComputerVision #AutonomousDriving #Yand…

7 hours ago @ youtube.com
AgentOps in production: LLM agents and support automation

At Data Fest 2026 in Moscow, Ivan Nasonov of Personal Services described how support automation with LLM agents achieved high quality and a speed of about 5 ms per token thanks to in-house infrastructure built on SGLang. He also presented Conclave AI, an internal product that simplifies the development of agentic scenarios, shortens the path to production, and makes it possible to scale adoption across real services. More materials for developers: https://t.me/+owyCvdge8WIyNTUy #DataFest #DataFest2026 #AI #ML #LLM #GenAI #MachineLearning #DataScience #MLOps #AIAgents #RAG #ComputerVision #AutonomousDriving #Yandex #Яндекс #TechTalk #Developers #ArtificialIntelligence #ReinforcementLe…

7 hours ago @ youtube.com
A hybrid generative-ranking model in Yandex Music recommendations

Generative recommendation technology based on Semantic IDs is now taking over all of the world's largest personalization systems. At Data Fest 2026 in Moscow, Daria Tikhonovich of the Search Services and AI team talked about her experience developing in this area: experiments with a semantic index, methods for obtaining embeddings, quantization algorithms, and generative model architectures. More materials for developers: https://t.me/+owyCvdge8WIyNTUy #DataFest #DataFest2026 #AI #ML #LLM #GenAI #MachineLearning #DataScience #MLOps #AIAgents #RAG #ComputerVision #AutonomousDriving #Yandex #Яндекс #TechTalk #Developers #ArtificialIntelligence #ReinforcementLearning #MultimodalAI

7 hours ago @ youtube.com
Virtual testing of autonomous vehicles

At Data Fest 2026 in Moscow, Anastasia Demidova of B2B Tech presented an approach to generating photorealistic multi-camera video that imitates autonomous vehicle sensor data and uses modern diffusion models. The method produces realistic video from a structured scene description: agent trajectories, road infrastructure configuration, weather conditions, and time of day. This makes it possible to quickly test and improve perception and planning algorithms directly in the simulator's virtual environment. More materials for developers: https://t.me/+owyCvdge8WIyNTUy #DataFest #DataFest2026 #AI #ML #LLM #GenAI #MachineLearning #DataScience #MLOps #AIAgents #RAG #ComputerVision #Aut…

7 hours ago @ youtube.com
AI agents for optimizing the Yandex Lavka business

At Data Fest 2026 in Moscow, Alyona Zaitseva of the City Services team explained how the platform for AI agents is built. In the talk she covered AI adoption from the standpoint of communication and goal setting, measuring and signing off on effects, and, of course, technology and speed. More materials for developers: https://t.me/+owyCvdge8WIyNTUy #DataFest #DataFest2026 #AI #ML #LLM #GenAI #MachineLearning #DataScience #MLOps #AIAgents #RAG #ComputerVision #AutonomousDriving #Yandex #Яндекс #TechTalk #Developers #ArtificialIntelligence #ReinforcementLearning #MultimodalAI

7 hours ago @ youtube.com
X-split ARGUS: an autoregressive encoder-decoder for ad ranking

At Data Fest 2026 in Moscow, Alexander Ploshkin of the Search Services and AI team talked about the evolution of the ARGUS architecture: from a context-aware baseline bottlenecked by a single user vector (cos-sim), to target-aware, and then to X-split. Alexander showed how moving to late binding and cross-attention, with the history split into offline and runtime parts, improves quality. He also went through X-split training: windowed attention to emulate lag, online batching, and distillation from Target-Aware. More materials for developers: https://t.me/+owyCvdge8WIyNTUy #DataFest #DataFest2026 #AI #ML #LLM #GenAI #MachineLearning #DataScience #MLOps #AIAgents #RAG #ComputerVision #AutonomousDriving #Yandex #Янде…

7 hours ago @ youtube.com
Data Fest. Practical ML track

We'll discuss how ML is applied in the day-to-day work of Yandex teams. Broadcast schedule:

13:00 - opening

13:10 - "X-split ARGUS: an autoregressive encoder-decoder for ad ranking", Alexander Ploshkin

13:40 - "AI agents for optimizing the Yandex Lavka business", Alyona Zaitseva

14:10 - "Controllable generation of multi-camera video for simulating autonomous vehicle sensor data", Anastasia Demidova

Break, 40 minutes

15:20 - "A hybrid generative-ranking model based on Semantic IDs in Yandex Music recommendations", Daria Tikhonovich

15:50 - "AgentOps in production: infrastructure for LLM agents and support automation for Yandex Personal Services", Ivan Nasonov

Break, 25 minutes

16:45 - "How…

2 weeks ago @ youtube.com
RAG in a minute

If you want to work with this and learn from the inside how it all works, come to Weekend Offer ML on May 30-31: https://clck.ru/3TSCym

Andrey Sokolov, head of the team that trains models with external context at Yandex R&D, explained how the RAG systems of three Yandex products work. In the talk, Andrey shared product case studies and used them to show how quality is evaluated, where prompting falls short, and why touching a single component of the system is always a risk. The full video is already on the channel.

2 weeks, 4 days ago @ youtube.com
How Context Engineering works in the agent loop of Yandex Neurolawyer

If you want to work with this and learn from the inside how it all works, come to Weekend Offer ML on May 30-31: https://clck.ru/3TSCym

Andrey Sokolov, head of the team that trains models with external context at Yandex R&D, explained how the RAG systems of three Yandex products work. In the talk, Andrey shared product case studies and used them to show how quality is evaluated, where prompting falls short, and why touching a single component of the system is always a risk. The full video is already on the channel.

3 weeks, 3 days ago @ youtube.com
Assessing ML system stability: robustness in RAG, using Alice as an example

If you want to work with this and learn from the inside how it all works, come to Weekend Offer ML on May 30-31: https://clck.ru/3TSCym

Andrey Sokolov, head of the team that trains models with external context at Yandex R&D, explained how the RAG systems of three Yandex products work. In the talk, Andrey shared product case studies and used them to show how quality is evaluated, where prompting falls short, and why touching a single component of the system is always a risk. The full video is already on the channel.

3 weeks, 6 days ago @ youtube.com
Which benchmarks we use to evaluate Alice's quality

If you want to work with this and learn from the inside how it all works, come to Weekend Offer ML on May 30-31: https://clck.ru/3TSCym

Andrey Sokolov, head of the team that trains models with external context at Yandex R&D, explained how the RAG systems of three Yandex products work. In the talk, Andrey shared product case studies and used them to show how quality is evaluated, where prompting falls short, and why touching a single component of the system is always a risk. The full video is already on the channel.

1 month ago @ youtube.com
How Brickit identifies LEGO part classes

Andrey Tatarinov, CEO and CTO at Epoch8, talked about the computer vision system behind Brickit, an app that scans a pile of LEGO pieces and suggests what can be built from them. In the talk, Andrey explained how his team dealt with rare classes and optimized the pipeline for mobile devices. He also shared MLOps solutions for scalable fine-tuning and model maintenance. The full video is already on the channel! #Brickit, #Lego, #AI, #Нейросети, #ComputerVision, #DeepLearning, #MobileAI, #Tech, #Стартап, #Будущее, #MachineLearning, #Гаджеты, #Технологии, #WOW, #AIприложение

1 month, 1 week ago @ youtube.com
What data Brickit works with

Andrey Tatarinov, CEO and CTO at Epoch8, talked about the computer vision system behind Brickit, an app that scans a pile of LEGO pieces and suggests what can be built from them. In the talk, Andrey explained how his team dealt with rare classes and optimized the pipeline for mobile devices. He also shared MLOps solutions for scalable fine-tuning and model maintenance. The full video is already on the channel! #Brickit, #Lego, #AI, #Нейросети, #ComputerVision, #DeepLearning, #MobileAI, #Tech, #Стартап, #Будущее, #MachineLearning, #Гаджеты, #Технологии, #WOW, #AIприложение

1 month, 1 week ago @ youtube.com
ML Trainings
last post 1 day, 10 hours ago
The value of code in the customer's eyes
1 day, 10 hours ago @ youtube.com
Dmitry and Valentin discuss AI and the shield-versus-sword conflict
1 day, 10 hours ago @ youtube.com
How to tell nonsense from non-nonsense in psychology
1 day, 10 hours ago @ youtube.com
How we build a product: expertise that is valued
1 day, 10 hours ago @ youtube.com
ChatGPT and ways to verify its answers
1 day, 10 hours ago @ youtube.com
ChatGPT and psychological counseling
1 day, 10 hours ago @ youtube.com
Captain's Bridge No. 22: BYD will pay for accidents | The Holy See and AI | Anthropic and the brakes

Description:

0:00:00 Start

0:01:04 24 robots per day

0:05:25 The AI community and crimes

0:08:55 BYD will pay for accidents

0:11:50 NeurIPS versus AI

0:19:33 The Holy See and AI

0:25:24 Writing code and making products

0:32:05 European AI

0:35:05 Russians trust AI

0:37:39 More code from Cursor

0:45:11 X5 and profit from AI

0:48:30 Teenagers and the AI psychologist

0:57:39 Mythos is spreading

1:04:30 New model from Microsoft

1:10:28 Anthropic and the brakes

AI summary:

A discussion of the latest news in technology, robotics, and artificial intelligence and their impact on society and business. The hosts share their opinions and analyze current trends. A discussion of the adoption of AI in programming, the impact of technolog…

2 days, 14 hours ago @ youtube.com
Uber: AI costs have caught up with employee salaries
1 week, 1 day ago @ youtube.com
Technology changes faster than piloting
1 week, 1 day ago @ youtube.com
How Uber adopts technology
1 week, 1 day ago @ youtube.com
Elon Musk and Grok 5: the myth of strong AI
1 week, 1 day ago @ youtube.com
Dmitry and Valentin discuss work capacity
1 week, 1 day ago @ youtube.com
Captain's Bridge No. 21: travel-banned Chinese AI people | car factories make robots | women not needed

0:00:00 Introduction

0:00:40 Travel-banned Chinese AI people

0:04:22 Robots for the elderly

0:07:39 Car factories make robots

0:11:23 Robot exhibition in Tokyo

0:19:50 Law on robots in Russia

0:28:30 500 million for tokens

0:31:27 Uber overpays for AI

0:36:43 Nvidia spends on AI

0:42:44 arXiv versus neural slop

0:53:21 Benchmark on 1C

0:57:43 Demis Hassabis on AGI

1:01:19 Elon Musk on AGI

1:08:25 Claude 4.8 released

1:11:08 AI doesn't need women

AI summary:

A discussion of the latest news in technology, politics, and robotics, including travel restrictions, the development of robots in Japan and China, and new laws on unmanned systems in Russia. A discussion of the latest advances in AI, regulation of…

1 week, 2 days ago @ youtube.com
Captain's Bridge No. 20: Carpathians at Anthropic | People not needed | A fine for hallucinations

This time the captains' guest was Valera Babushkin; we talked about current hiring issues, AI in industry, and pretty much everything under the sun.

0:00:00 Start

0:01:00 Carpathians at Anthropic

0:09:29 Microsoft and Claude

0:24:07 Amazon and tokens

0:25:16 People not needed

0:38:18 Qwen 3.7 released

0:44:23 OpenAI and access

0:47:46 Sber and Chinese chips

0:53:20 AI writer

1:08:10 AI hairdresser

1:12:43 A fine for hallucinations

Mattermost channel: https://mm.ods.ai/ods/channels/data_captain (log in via ODS.ai)

Other platforms: https://taplink.cc/datacaptains

#Anthropic #Microsoft #OpenAI #Сбер #Qwen

2 weeks, 1 day ago @ youtube.com
Overloaded routines and agents criticize capitalism
2 weeks, 6 days ago @ youtube.com
Primer
last post 5 months, 1 week ago
Taking AI Doom Seriously For 62 Minutes

Patreon: https://www.patreon.com/primerlearning

80,000 Hours: 80000hours.org/primer

https://www.desmos.com/calculator/a5pfjtr4tr

Other connections:

Discord: https://discord.gg/NbruaNW

Twitch: https://www.twitch.tv/justin_helps

Store: https://store.dftba.com/collections/primer

Reddit: https://www.reddit.com/r/primerlearning/

Bsky: https://bsky.app/profile/justinhelps.bsky.social

Twitter: https://twitter.com/primerlearning

Links to other resources:

https://yoshuabengio.org/2024/07/09/reasoning-through-arguments-against-taking-ai-safety-seriously/

https://www.youtube.com/c/robertmilesai

https://www.youtube.com/@Siliconversations

https://www.youtube.com/@Go-Meta

https://www.youtube.com/@Dwarkes…

5 months, 1 week ago @ youtube.com
Simulating a single brain cell

Patreon: https://www.patreon.com/primerlearning

Helpful resources if you want to learn more about neural networks:

https://www.youtube.com/@AndrejKarpathy

https://course.fast.ai/

https://www.youtube.com/@WelchLabsVideo

https://www.youtube.com/@3blue1brown

Early papers. These probably aren't helpful for understanding the concepts in this video, but they are listed in case you're interested in the history.

The Perceptron – A perceiving and recognizing automaton: https://bpb-us-e2.wpmucdn.com/websites.umass.edu/dist/a/27637/files/2016/03/rosenblatt-1957.pdf

The Perceptron: A probabilistic model for information storage and organization in the brain: https://www.ling.upenn.edu/courses/cogs501/Rosenblatt1958.pdf

A Logical…

8 months, 2 weeks ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
last post 1 week, 4 days ago
#497 – Biggest Mysteries in Physics: Antimatter, Dark Energy & ToE – Don Lincoln

Don Lincoln is a particle physicist at Fermilab who has spent decades working at the frontiers of high energy physics.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep497-sc

See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://upwork.com/lex

Larridin: Measure AI adoption in your business.

Go to https://larridin.com

Fin: AI agent for customer service.

Go to https://fin.ai/lex

LMNT: Zero-sugar electrolyte drink mix.

1 week, 4 days ago @ lexfridman.com
#496 – FFmpeg: The Incredible Technology Behind Video on the Internet

Jean-Baptiste Kempf is lead developer of VLC and president of VideoLAN.

Kieran Kunhya is a longtime FFmpeg contributor, codec engineer, and the person behind the now-infamous FFmpeg account on X.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep496-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://larridin.com

Blitzy: AI agent for large enterprise codebases.

Go to https://perplexity.ai/

OUTLINE:

(00:00) – Introduction

(03:00) – Sponsors, Comments, and Reflections

(10:48) – Weirdest things VLC opens

(15:12) – How video playback works

(24:33) – Video codecs and containers

(35:20) – FFmpeg explained

(56:20)…

1 month ago @ lexfridman.com
#495 – Vikings, Ragnar, Berserkers, Valhalla & the Warriors of the Viking Age

Lars Brownworth is a historian, teacher, podcaster, and author specializing in Viking history, medieval Europe, and the Byzantine Empire.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep495-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://larridin.com

BetterHelp: Online therapy and counseling.

Go to https://drinkLMNT.com/lex

Fin: AI agent for customer service.

Go to https://perplexity.ai/

OUTLINE:

(00:00) – Introduction

(01:03) – Sponsors, Comments, and Reflections

(08:57) – The start of the Viking Age

(18:50) – Viking military strategy, tactics & technology

(32:33) – Ragnar Lothbrok

(42:00) – The Grea…

2 months ago @ lexfridman.com
#494 – Jensen Huang: NVIDIA – The $4 Trillion Company & the AI Revolution

Jensen Huang is the co-founder and CEO of NVIDIA, the world’s most valuable company and the engine powering the AI computing revolution.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep494-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://drinkLMNT.com/lex

Fin: AI agent for customer service.

Go to https://quo.com/lex

OUTLINE:

(00:00) – Introduction

(00:26) – Sponsors, Comments, and Reflections

(06:34) – Extreme co-design and rack-scale engineering

(09:20) – How Jensen runs NVIDIA

(28:41) – AI scaling laws

(43:41) – Biggest blockers to AI scaling laws

(45:25) – Supply chain

(47:20) – Memory

(53:25) – Power…

2 months, 2 weeks ago @ lexfridman.com
#493 – Jeff Kaplan: World of Warcraft, Overwatch, Blizzard, and Future of Gaming

Jeff Kaplan is a legendary Blizzard game designer of World of Warcraft and Overwatch, now preparing to launch a new game, The Legend of California, from his new studio Kintsugiyama – available to wishlist on Steam today, with alpha later in March.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep493-sc

See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://fin.ai/lex

Blitzy: AI agent for large enterprise codebases.

Go to https://blitzy.com/lex

BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex

Shopify: Sell stuff online.

3 months ago @ lexfridman.com
#492 – Rick Beato: Greatest Guitarists of All Time, History & Future of Music

Rick Beato is a music educator, interviewer, producer, songwriter, and a true multi-instrument musician, playing guitar, bass, cello & piano.

His incredible YouTube channel celebrates great musicians & musical ideas, and helps millions of people fall in love with great music all over again.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep492-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lex

BetterHelp: Online therapy and counseling.

Go to https://drinkLMNT.com/lex

Fin: AI agent for customer service.

3 months, 1 week ago @ lexfridman.com
#491 – OpenClaw: The Viral AI Agent that Broke the Internet – Peter Steinberger

Peter Steinberger is the creator of OpenClaw, an open-source AI agent framework that’s the fastest-growing project in GitHub history.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep491-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://coderabbit.ai/lex

Fin: AI agent for customer service.

Go to https://fin.ai/lex

Blitzy: AI agent for large enterprise codebases.

Go to https://drinkLMNT.com/lex

OUTLINE:

(00:00) – Introduction

(03:51) – Sponsors, Comments, and Reflections

(15:29) – OpenClaw origin story

(18:48) – Mind-blowing moment

(28:15) – Why OpenClaw went viral

(32:12) – Self-modifying AI agent

(36:57)…

3 months, 3 weeks ago @ lexfridman.com
#490 – State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI

Nathan Lambert and Sebastian Raschka are machine learning researchers, engineers, and educators.

Sebastian Raschka is the author of Build a Large Language Model (From Scratch) and Build a Reasoning Model (From Scratch).

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep490-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

(25:11) – ChatGPT vs Claude vs Gemini vs Grok: Who is winning?

(36:11) – Best AI for coding

(43:02) – Open Source vs Closed Source LLMs

(54:41) – Transformers: Evolution of LLMs since 2019

(1:02:38) – AI Scaling Laws: Are they dead or still holding?

4 months, 1 week ago @ lexfridman.com
#489 – Paul Rosolie: Uncontacted Tribes in the Amazon Jungle

Paul Rosolie is a naturalist, explorer, author of a new book titled Junglekeeper, and is someone who has dedicated his life to protecting the Amazon rainforest.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep489-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://perplexity.ai/

BetterHelp: Online therapy and counseling.

Go to https://fin.ai/lex

Miro: Online collaborative whiteboard platform.

Go to https://miro.com/

MasterClass: Online classes from world-class experts.

4 months, 3 weeks ago @ lexfridman.com
#488 – Infinity, Paradoxes that Broke Mathematics, Gödel Incompleteness & the Multiverse – Joel David Hamkins

Joel David Hamkins is a mathematician and philosopher specializing in set theory, the foundations of mathematics, and the nature of infinity, and he’s the #1 highest-rated user on MathOverflow.

He is also the author of several books, including Proof and the Art of Mathematics and Lectures on the Philosophy of Mathematics.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep488-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://masterclass.com/lexpod

OUTLINE:

(00:00) – Introduction

(01:58) – Sponsors, Comments, and Reflections

(15:40) – Infinity & paradoxes

(1:02:50) – Russell’s paradox

(1:15:57) – Gödel’s…

5 months, 1 week ago @ lexfridman.com
#487 – Irving Finkel: Deciphering Secrets of Ancient Civilizations & Flood Myths

Irving Finkel is a scholar of ancient languages and a longtime curator at the British Museum, renowned for his expertise in Mesopotamian history and cuneiform writing.

He specializes in reading and interpreting cuneiform inscriptions, including tablets from Sumerian, Akkadian, Babylonian, and Assyrian contexts.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep487-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lex

Miro: Online collaborative whiteboard platform.

Go to https://miro.com/

Chevron: Reliable energy for data centers.

5 months, 4 weeks ago @ lexfridman.com
#486 – Michael Levin: Hidden Reality of Alien Intelligence & Biological Life

Michael Levin is a biologist at Tufts University working on novel ways to understand and control complex pattern formation in biological systems.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep486-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lex

Miro: Online collaborative whiteboard platform.

Go to https://miro.com/

MasterClass: Online classes from world-class experts.

(2:42:41) – Mind uploading

(3:01:22) – Alien intelligence

(3:16:17) – Advice for young people

(3:22:46) – Questions for AGI

6 months, 1 week ago @ lexfridman.com
#485 – David Kirtley: Nuclear Fusion, Plasma Physics, and the Future of Energy

David Kirtley is a nuclear fusion engineer and CEO of Helion Energy, a company working on building the world's first commercial fusion power plant by 2028.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep485-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Transcript: https://lexfridman.com/david-kirtley-transcript

CONTACT LEX:

Feedback - give feedback to Lex: https://lexfridman.com/survey

AMA - submit questions, videos or call-in: https://lexfridman.com/ama

Hiring - join our team: https://lexfridman.com/hiring

Other - other ways to get in touch: https://lexfridman.com/contact

EPISODE LINKS:

David's X: htt…

6 months, 3 weeks ago @ lexfridman.com
#484 – Dan Houser: GTA, Red Dead Redemption, Rockstar, Absurd & Future of Gaming
#484 – Dan Houser: GTA, Red Dead Redemption, Rockstar, Absurd & Future of Gaming #484 – Dan Houser: GTA, Red Dead Redemption, Rockstar, Absurd & Future of Gaming

Dan Houser is co-founder of Rockstar Games and is a legendary creative mind behind Grand Theft Auto (GTA) and Red Dead Redemption series of video games.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep484-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://box.com/ai

UPLIFT Desk: Standing desks and office ergonomics.

Go to https://drinkLMNT.com/lex

OUTLINE:
(00:00) – Introduction
(01:29) – Sponsors, Comments, and Reflections
(11:32) – Greatest films of all time
(23:45) – Making video games
(26:36) – GTA 3
(29:55) – Open world video games
(32:42) – Character creation
(36:09) – Superintelligent AI in A Bette…

7 months, 1 week ago @ lexfridman.com
#483 – Julia Shaw: Criminal Psychology of Murder, Serial Killers, Memory & Sex

Julia Shaw is a criminal psychologist and author who in her books explores human nature, including psychopathy, violent crime, the psychology of evil, police interrogation, false memory manipulation, deception detection, and human sexuality.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep483-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lex

BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex

LMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lex

AG1: All-in-one daily nutrition drink.

7 months, 4 weeks ago @ lexfridman.com
Microsoft Research Podcast
last post 1 month, 2 weeks ago
Can we AI our way to a more sustainable world?

Because I do think there’s a role for AI, a huge role for AI.

BURGER: Right, right.

So I think that’s also something quite important here that, you know, AI can help facilitate.

And I think that’s not just applying AI to solve solutions through optimization but also thinking about this in an integrated way.

1 month, 2 weeks ago @ microsoft.com
Ideas: Steering AI toward the work future we want

JANSSEN: Yeah, yeah, exactly.

TEEVAN: Yeah, yeah, yeah.

I’m curious what you have found particularly surprising about how people and organizations are leveraging AI right now.

And so I do like to picture a future of work where humans are flourishing with AI and where humans still get to do meaningful work.

And I’m very curious about how we can take advantage of AI and do more without running ourselves into the ground because we’re not AI, right?

2 months ago @ microsoft.com
Will machines ever be intelligent?

And the question we’re going to discuss is, are machines intelligent?

No, no, that’s right, that’s right.

I mean, in some sense, you could potentially have a super intelligent system, right, that’s far more intelligent than anything else on the planet.

BURGER: Right, right.

At the same time, I think, you know, transformers are not intelligent in the way that a three-year-old is, right?

2 months, 2 weeks ago @ microsoft.com
Trailer: The Shape of Things to Come

Join Microsoft’s Doug Burger and guests as they dig into the fundamental truths about AI and how it will reshape the future.

Technical advances are moving at such a rapid pace that it can be challenging to define the tomorrow we’re working toward.

In The Shape of Things to Come, Microsoft research leader Doug Burger and experts from across disciplines tease out the thorniest AI issues facing technologists, policymakers, business decision-makers, and other stakeholders today.

It’s important to understand what the emerging shapes are and how we should respond.” – Doug Burger, Technical Fellow and Corporate Vice President, Microsoft Research

About Doug Burger

Doug Burger is a research leader in …

3 months, 1 week ago @ microsoft.com
Ideas: Community building, machine learning, and the future of AI

This week, machine learning researchers around the world will be attending the annual Conference on Neural Information Processing Systems, or NeurIPS.

In this series, we’ll explore the technologies that are shaping our future and the big ideas that propel them forward.

So around that time when I started my PhD at Penn, I was working in machine learning theory and algorithmic economics.

How had you experienced a lack of community or network of women in machine learning before the founding of WiML?

So particularly when working on topics related to fairness, I’ve ended up focusing a bunch on stuff to do with marginalized groups as part of my responsible AI work.

6 months, 1 week ago @ microsoft.com
Ideas: More AI-resilient biosecurity with the Paraphrase Project

Today, I’m excited to talk about the Paraphrase Project, an effort I co-led exploring how advances in AI tools for protein design might impact biosecurity.

These “patches,” akin to those in cybersecurity, have now been shared with organizations globally to strengthen biosecurity screening.

The project highlights that the same AI tools capable of incredible good can also be misused, requiring us to be vigilant, thoughtful, and creative so we continue to get the most benefit out of AI tools while working to ensure that we avoid costly misuses.

So things like, how similar is this to that template, wild-type protein structure that we used as our conditioning information?

But I feel like broadly…

8 months ago @ microsoft.com
Coauthor roundtable: Reflecting on healthcare economics, biomedical research, and medical education

KOHANE: So I think you’ve “nerd sniped” me because you [LAUGHTER]—which is all too easy—but I think there’s a central issue here.

But I actually think this is dark matter of human organizational technology that is not well understood.

AZEEM AZHAR: We didn’t talk about, you know, AI in its ability to potentially do this, which is to extend the clinician’s presence throughout the week.

And so I think there’s always going to be an opening for either differences of opinion or agreeing with you too much.

And this gets into whether AI is really going to get almost to the ab initio understanding of human biology.

9 months, 3 weeks ago @ microsoft.com
NLP Highlights
last post None
Data Skeptic
last post 1 month, 1 week ago
Student Spotlight: Aaron Payne, Data Analyst

Aaron Payne, an MBA student at Georgia Tech studying business analytics and a Senior Insights Analyst at Chick-fil-A, joins Kyle Polich to talk about turning analytics into decisions that matter. They unpack a real-world forecasting project with Comfama in Colombia, including messy data realities, interpretability tradeoffs, and why "data science for good" starts with the people impacted.

1 month, 1 week ago @ dataskeptic.com
The Future is Agentic in Recommender Systems

Kyle Polich sits down with Yashar Deldjoo, research scientist and Associate Professor at the Polytechnic University of Bari, to explore how recommender systems have evolved and why trustworthiness matters. They unpack key dimensions of responsible AI, including robustness to adversarial attacks, privacy, explainability, and fairness, and discuss how LLMs introduce new risks like hallucinations. The episode closes with a look at "agentic" recommender systems, where tools and memory shift recommendations from ranked lists to end-to-end task completion.

1 month, 2 weeks ago @ dataskeptic.com
Book Ratings and Recommendations

Goodreads star ratings can be misleading as measures of "book quality," and research from Hannes Rosenbusch suggests that for many professionally published books, differences between readers often matter more than differences between books. The episode also explores how to model reader preferences, why reviews often reveal more about the reviewer than the text, and how LLMs can aid computational literary research while still falling short of human editors in creative writing.

2 months, 2 weeks ago @ dataskeptic.com
Disentanglement and Interpretability in Recommender Systems
3 months ago @ dataskeptic.com
Collective Altruism in Recommender Systems

Ekaterina (Kat) Filadova from MIT EECS joins us to discuss strategic learning in recommender systems—what happens when users collectively coordinate to game recommendation algorithms. Kat's research reveals surprising findings: algorithmic "protest movements" can paradoxically help platforms by providing clearer preference signals, and the challenge of distinguishing coordinated behavior from bot activity is more complex than it appears. This episode explores the intersection of machine learning and game theory, examining what happens when your training data actively responds to your algorithm.

3 months, 1 week ago @ dataskeptic.com
Niche vs Mainstream

Anas Buhayh discusses multi-stakeholder fairness in recommender systems and the S'mores framework—a simulation allowing users to choose between mainstream and niche algorithms. His research shows specialized recommenders improve utility for niche users while raising questions about filter bubbles and data privacy.

3 months, 3 weeks ago @ dataskeptic.com
Healthy Friction in Job Recommender Systems

In this episode, host Kyle Polich speaks with Roan Schellingerhout, a fourth-year PhD student at Maastricht University, about explainable multi-stakeholder recommender systems for job recruitment. Roan discusses his research on creating AI-powered job matching systems that balance the needs of multiple stakeholders—job seekers, recruiters, HR professionals, and companies. The conversation explores different types of explanations for job recommendations, including textual, bar chart, and graph-based formats, with findings showing that lay users strongly prefer simple textual explanations over more technical visualizations. Roan shares insights from his "healthy friction" study, which tested …

4 months, 1 week ago @ dataskeptic.com
Fairness in PCA-Based Recommenders

In this episode, we explore the fascinating world of recommender systems and algorithmic fairness with David Liu, Assistant Research Professor at Cornell University's Center for Data Science for Enterprise and Society. David shares insights from his research on how machine learning models can inadvertently create unfairness, particularly for minority and niche user groups, even without any malicious intent. We dive deep into his groundbreaking work on Principal Component Analysis (PCA) and collaborative filtering, examining why these fundamental techniques sometimes fail to serve all users equally. David introduces the concept of "power niche users" - highly active users with specialized in…

4 months, 2 weeks ago @ dataskeptic.com
Video Recommendations in Industry

In this episode, Kyle Polich sits down with Cory Zechmann, a content curator working in streaming television with 16 years of experience running the music blog "Silence Nogood." They explore the intersection of human curation and machine learning in content discovery, discussing the concept of "algatorial" curation—where algorithms and editorial expertise work together. Key topics include the cold start problem, why every metric is just a "proxy metric" for what users actually want, the challenge of filter bubbles, and the importance of balancing familiarity with discovery. Cory shares insights on why TikTok's algorithm works so well (clean data and massive interaction volume), the crucial …

5 months, 2 weeks ago @ dataskeptic.com
Eye Tracking in Recommender Systems

In this episode, Santiago de Leon takes us deep into the world of eye tracking and its revolutionary applications in recommender systems. As a researcher at the Kempelen Institute and Brno University, Santiago explains the mechanics of eye tracking technology—how it captures gaze data and processes it into fixations and saccades to reveal user browsing patterns. He introduces the groundbreaking RecGaze dataset, the first eye tracking dataset specifically designed for recommender systems research, which opens new possibilities for understanding how users interact with carousel interfaces like Netflix. Through collaboration between psychologists and AI researchers, Santiago's work demonstrate…

5 months, 3 weeks ago @ dataskeptic.com
Cracking the Cold Start Problem

In this episode of Data Skeptic, we dive deep into the technical foundations of building modern recommender systems. Unlike traditional machine learning classification problems where you can simply apply XGBoost to tabular data, recommender systems require sophisticated hybrid approaches that combine multiple techniques. Our guest, Boya Xu, an assistant professor of marketing at Virginia Tech, walks us through a cutting-edge method that integrates three key components: collaborative filtering for dimensionality reduction, embeddings to represent users and items in latent space, and bandit learning to balance exploration and exploitation when deploying new recommendations. Boya shares insigh…

6 months ago @ dataskeptic.com
Designing Recommender Systems for Digital Humanities

In this episode of Data Skeptic, we explore the fascinating intersection of recommender systems and digital humanities with guest Florian Atzenhofer-Baumgartner, a PhD student at Graz University of Technology. Florian is working on Monasterium.net, Europe's largest online collection of historical charters, containing millions of medieval and early modern documents from across the continent. The conversation delves into why traditional recommender systems fall short in the digital humanities space, where users range from expert historians and genealogists to art historians and linguists, each with unique research needs and information-seeking behaviors. Florian explains the technical challen…

6 months, 2 weeks ago @ dataskeptic.com
DataRec Library for Reproducibility in Recommender Systems

In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich explores DataRec, a new Python library designed to bring reproducibility and standardization to recommender systems research. Guest Alberto Carlo Mario Mancino, a postdoc researcher from Politecnico di Bari, Italy, discusses the challenges of dataset management in recommendation research—from version control issues to preprocessing inconsistencies—and how DataRec provides automated downloads, checksum verification, and standardized filtering strategies for popular datasets like MovieLens, Last.fm, and Amazon reviews. The conversation covers Alberto's research journey through knowledge graphs, graph-based recommen…

6 months, 4 weeks ago @ dataskeptic.com
Shilling Attacks on Recommender Systems

In this episode of Data Skeptic's Recommender Systems series, Kyle sits down with Aditya Chichani, a senior machine learning engineer at Walmart, to explore the darker side of recommendation algorithms. The conversation centers on shilling attacks—a form of manipulation where malicious actors create multiple fake profiles to game recommender systems, either to promote specific items or sabotage competitors. Aditya, who researched these attacks during his undergraduate studies at SPIT before completing his master's in computer science with a data science specialization at UC Berkeley, explains how these vulnerabilities emerge particularly in collaborative filtering systems. From promoting a …

7 months ago @ dataskeptic.com
Music Playlist Recommendations

In this episode, Rebecca Salganik, a PhD student at the University of Rochester with a background in vocal performance and composition, discusses her research on fairness in music recommendation systems. She explores three key types of fairness—group, individual, and counterfactual—and examines how algorithms create challenges like popularity bias (favoring mainstream content) and multi-interest bias (underserving users with diverse tastes). Rebecca introduces LARP, her multi-stage multimodal framework for playlist continuation that uses contrastive learning to align text and audio representations, learn song relationships, and create playlist-level embeddings to address the cold start prob…

7 months, 1 week ago @ dataskeptic.com
SuperDataScience
last post 10 hours ago
999: What's Left to Build When Software Is Free, with Chip Huyen

Chip Huyen joins host Jon Krohn for this milestone episode 999 to talk about her record-breaking book "AI Engineering", the most-read title on the O'Reilly platform last year, and how the AI landscape has shifted since her last appearance. Chip breaks down what separates AI engineering from machine learning engineering, makes the case for a "start simple" workflow, gets candid about the real costs of running LLMs in production, and shares why she's now fascinated by physical AI, robotics, and world models, and why the durable problems worth solving are increasingly human ones. Jon Krohn guides the conversation from the practical content of the book through to where the field is heading next. A…

10 hours ago @ podtrac.com
998: In Case You Missed It in May 2026

In this month’s episode of ICYMI, Jon Krohn explores how AI agents are simultaneously creating new risks and unlocking powerful new ways of working with data. Hear from Anneka Gupta, Cal Al-Dhubaib, Trevor Manz, Jazmia Henry, Jeremy Mumford, and Jacob Miller, discussing why the old cybersecurity playbook breaks down in the age of Claude Mythos, how the notebook became an AI agent’s working memory, what it really takes to build a foundation model from scratch, and why failing slowly is the most expensive mistake an AI team can make. Additional materials: www.superdatascience.com/998 Interested in sponsoring a SuperDataScience Podcast episode? Email natali…

4 days, 10 hours ago @ podtrac.com
997: How This AI Startup Hit 20M Users (No Moat)

Dr. Andrey Kurenkov returns to the show to talk about Astrocade's astronomical growth from pre-alpha to over 20 million engaged users, what it actually takes to build a vibe-coding platform that scales, and how the broader AI landscape has shifted since his last appearance. Andrey shares behind-the-scenes lessons from building B2C user-generated content products, why the real moat is community rather than tech, and his current thinking on humanoid robotics, AGI, and the AI risks people actually overlook. Additional materials: https://www.superdatascience.com/997 Interested in sponsoring a SuperDataScience Podcast episode…

1 week ago @ podtrac.com
996: TrueFoundry’s Nikunj Bajaj on How to Get $100M Returns on AI Agent Deployments

TrueFoundry co-founder and CEO Nikunj Bajaj speaks to Jon Krohn about how enterprises like Nvidia and Siemens are realizing returns of over $100 million from single agent deployments, the AI gateway architecture that makes it possible to connect, observe, and govern agents at scale, and why the familiar advice to “start small” is the wrong way to roll out AI agents inside a large organization. Additional materials: www.superdatascience.com/996 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (01:21) What TrueFoundry does and why agents in production nee…

1 week, 4 days ago @ podtrac.com
995: End-to-End Foundation Models for the Energy Industry, with Jazmia Henry

Jazmia Henry joins Jon Krohn to break down what it actually takes to build end-to-end foundation models for the energy industry. From wrangling decades of handwritten oil-and-gas documents into usable training data, to bespoke tokenizers, reinforcement learning, and inference at scale, Jazmia walks through every stage of the stack. Along the way she explains why reinforcement learning models are "bursty," what reward hacking is and how her Grounded Continuous Evaluation framework fixes it, and revisits the 2023 NeurIPS paper that argued, to widespread skepticism at the time, that scaling bad data degrades model performance. Additional materials: h…

2 weeks ago @ podtrac.com
994: AI’s Putting Recent Grads Out of Work; Here’s How to Get Hired Anyway!

Unemployment for recent computer-science graduates now rivals rates for fine-arts and anthropology majors, and undergraduate CS enrollment fell 11% in 2025. In this Five-Minute Friday, Jon Krohn walks through the data on both sides of the debate, from Stanford research showing a 13% employment drop for young workers in AI-exposed jobs, to Federal Reserve studies finding no statistically detectable link between AI adoption and reduced hiring. Jon shares his own view on where the truth lies and offers five concrete pieces of advice for graduates and senior professionals alike on how to get hired in 2026. Additional materials: www.superdatascience.com/993…

2 weeks, 4 days ago @ podtrac.com
993: How to Build AI-First Organizations, with Jacob Miller and Jeremy Mumford

For years, AI content has come in the form of “use this library, use this tool” tutorials that age out within months. Jacob Miller and Jeremy Mumford, co-authors of the brand new Wiley book Architected Intelligence, wanted to write something different, a guide to the higher-level principles of building AI products and AI-first organizations that will still be relevant in five or ten years. In this episode, the two Pattern engineers walk Jon Krohn through the core ideas of their book: why you should design products and processes so they can be executed by a human, an AI agent, or any hybrid combination; why most companies are still treating hallucinations as a model problem when they’re actu…

3 weeks ago @ podtrac.com
992: Tokenmaxxing vs AI Hardware Bottlenecks

While “tokenmaxxing”, the social media trend of maximizing AI token consumption as a vanity metric, takes off online, the physical infrastructure behind AI is slamming into serious bottlenecks. In this Five-Minute Friday, Jon Krohn maps out the four overlapping supply-chain constraints choking AI compute: GPUs (with NVIDIA Blackwell sold out through mid-2026), high-bandwidth memory (quintupled demand since 2023, only three manufacturers worldwide), CPUs (agentic AI requires 12x more CPUs per GPU than chatbots), and electricity (Gartner projects power shortages will restrict 40% of AI data centres by 2027). Find out why the five biggest hyperscalers are on track to spend $725 billion on AI i…

3 weeks, 4 days ago @ podtrac.com
991: Pair Programming with AI in Your Python Notebook, with Dr. Trevor Manz

Dr. Trevor Manz of Marimo talks to Jon Krohn about Marimo Pair, an open-source agent skill that teaches coding agents like Claude Code how to drive a reactive Python notebook, reading cell state, running Python in the kernel, taking screenshots of cells, and iterating on data tasks the way agents iterate on traditional software. Trevor also unpacks recursive language models, his AnyWidget project that bridges Python and the web, and his journey from a Wisconsin small town and Harvard bioinformatics research to founding-engineer life at Marimo. Listen to the episode to hear why no matter where AI takes us, curiosity and going deep on a topic will always be valuable. Additional materials: …

4 weeks ago @ podtrac.com
990: Inside Mythos: Anthropic's Locked-Down Frontier Model

Anthropic has built a frontier AI model so capable at finding software vulnerabilities that it has decided not to release it to the general public. In this Five-Minute Friday, Jon Krohn breaks down Claude Mythos Preview, a general-purpose model whose hacking abilities emerged as a side effect of broad improvements in code understanding and reasoning. Find out how Mythos achieved a nearly 100x improvement over Opus 4.6 on Firefox exploit generation, why Mozilla patched 271 vulnerabilities in a single release using an early version of the model, and what Project Glasswing, Anthropic’s gated industry consortium, means for the future of cybersecurity. Jon also shares practical tips for securing t…

1 month ago @ podtrac.com
989: Security for Mythos-Era Agentic Risks, with Rubrik’s Anneka Gupta and Cal Al-Dhubaib

Rubrik’s Anneka Gupta and Cal Al-Dhubaib speak to Jon Krohn about cybersecurity measures, the risks AI in business might pose for malicious attacks, and why AI should be kept “boring.” Find out how Rubrik safeguards client data, what zero trust is in the context of cybersecurity, and why cyber-resilience needs to be a top priority for companies looking to adopt AI. Additional materials: www.superdatascience.com/989 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (02:25) All about Rubrik (08:51) The announcement of Claude …

1 month ago @ podtrac.com
988: In Case You Missed It in April 2026

In this month’s episode of In Case You Missed It, Jon Krohn talks to guests about memory and education, and how artificial intelligence is continuing to help lower the barriers to access. Hear from Matt Glickman, Traci Walker-Griffith, Richmond Alake, and Linda Haviv, discussing the foundations of AI agent memory, how engineers can develop at scale, and why they believe AI could be your child’s perfect tutor in the classroom. Additional materials: www.superdatascience.com/988 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month, 1 week ago @ podtrac.com
987: AI Infrastructure, Ray, and Why Nonlinear Careers Win, with Linda Haviv

Linda Haviv talks to Jon Krohn about staying current on AI matters, why open-source technology is narrowing the gap in its race with proprietary models, and how being a content creator in tech is key to career growth and longevity. She emphasizes that non-linear pathways to a career in tech can give applicants an edge, and stresses the importance of continuous upskilling to “stay relevant.” In her view, systems thinking is becoming more important than coding skills. Hear why in this episode. Additional materials: www.superdatascience.com/987 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascienc…

1 month, 1 week ago @ podtrac.com
986: Building Hardware is Hard but AI Agents Help, with Kishore Subramanian

CTO of Propel Software Kishore Subramanian talks to Jon Krohn about how product lifecycle management (PLM) software and quality management systems help ensure compliance, record management, and quality assurance. Listen to the episode to hear Kishore Subramanian talk about best practices for getting started with Agentforce 360, his top tips for deploying AI projects, and why yoga and meditation could make you better at building AI products! Additional materials: www.superdatascience.com/984 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (05:21) …

1 month, 2 weeks ago @ podtrac.com
985: The Four Types of Memory Every AI Agent Needs, with Richmond Alake

Oracle’s Director of AI Developer Experience Richmond Alake returns to the show to talk to Jon Krohn about agent memory: the network of systems, models, databases and LLMs that enable AI agents to learn and adapt over time. Listen to the episode to hear about Richmond’s “100 Days of Agent Memory” initiative, retrieval-augmented generation’s (RAG) limitations with AI agents, the layers of the AI agent stack, and what makes the Oracle AI database so useful to developers. Additional materials: www.superdatascience.com/985 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship inf…

1 month, 2 weeks ago @ podtrac.com
Data Science at Home
last post 3 weeks ago
Recommend and manipulate: the dangers of the attention economy

This sort of operation is directly exploiting a core feature of internet social media platforms.

The main purpose of recommender systems is to recommend to people the items that similar people show an interest in.

Some of the most common methods for implementing recommender systems use concepts such as cosine/correlation similarity, matrix factorization, neural autoencoders and sequence predictors.
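
As an illustration of the cosine-similarity approach mentioned above, here is a minimal sketch of user-based recommendation. The user names, the toy rating matrix and the `recommend` helper are hypothetical and are not taken from the episode; a rating of 0 is assumed to mean "not rated".

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector has no defined direction
    return dot / (norm_u * norm_v)

def recommend(target, ratings, top_n=2):
    """Rank items the target has not rated by similarity-weighted ratings of other users."""
    scores = {}
    for user, vec in ratings.items():
        if user == target:
            continue
        sim = cosine_similarity(ratings[target], vec)
        for item, rating in enumerate(vec):
            # Only score items the target has not rated and the other user has.
            if ratings[target][item] == 0 and rating > 0:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical toy data: rows are users, columns are items, 0 means "not rated".
ratings = {
    "ana":  [5, 4, 0, 0],
    "ben":  [5, 5, 4, 0],
    "cleo": [0, 1, 0, 5],
}
print(recommend("ana", ratings))
```

Because "ben" rates the first two items much like "ana" does, his opinion of the third item carries more weight than "cleo"'s opinion of the fourth, which is the "similar people" idea in its simplest form.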

As you say, recommender systems exist because the business model of social media platforms is to monetise attention.

F: So you are saying that this is not an accident: is this the basis of the optimisation of the recommender system?

3 weeks ago @ datascienceathome.com
Social media is an ant mill (Internet is a disaster) (Ep. 303)

Personal newsletter: https://defragzone.substack.com
📩 Newsletter: https://datascienceathome.substack.com
🎙 Podcast: Available on Spotify, Apple Podcasts, and more.

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
Discord Channel: https://discord.gg/4UNKGf3

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews…

3 weeks ago @ datascienceathome.com
AI and videogames (Ep. 305)

What is the state of AI and videogames?

This and much more is covered in this 1st episode of AI and videogames.

Check outshift.com

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Send us mail at: [email protected]

Don’t forget to like, subscribe, and hit the 🔔 for updates on the latest in AI and data science!

3 weeks ago @ datascienceathome.com
AI and videogames: Conversational NPCs (Ep. 306)

Can NPCs in videogames leverage new LLM-based tech?

Check outshift.com

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews, and discussions.

Send us mail at: [email protected]

Don’t forget to like, subscribe, and hit the 🔔 for updates on the latest in AI and data science!

3 weeks ago @ datascienceathome.com
AI tips & tricks (Ep. 307)

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast

SPONSORS
This episode is brought to you by Outshift, Cisco’s incubation engine.

Check outshift.com

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews, and discussions.

Send us mail at: [email protected]

Don’t forget to like, subscribe, and hit the …

3 weeks ago @ datascienceathome.com
Europe, wake up! You Can’t Be a Superpower on Someone Else’s Servers (Ep. 304)

Tech sovereignty takes 3 years and political will.

Check outshift.com

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews, and discussions.

Send us mail at: [email protected]

Don’t forget to like, subscribe, and hit the 🔔 for updates on the latest in AI and data science!

1 month, 2 weeks ago @ datascienceathome.com
About Apple’s Privacy (Ep. 302)

Apple just spent $2B on tech that reads your silent speech.

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
Discord Channel: https://discord.gg/4UNKGf3

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews, and discussions.

Send us mail at: [email protected]

Don’t forget to like, subscribe, and hi…

1 month, 2 weeks ago @ datascienceathome.com
Productivity is the new data breach (Ep. 301)

Personal newsletter: https://defragzone.substack.com
📩 Newsletter: https://datascienceathome.substack.com
🎙 Podcast: Available on Spotify, Apple Podcasts, and more.

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
Discord Channel: https://discord.gg/4UNKGf3

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews…

1 month, 2 weeks ago @ datascienceathome.com
Programmable Money: The Cage They’ll Call Convenience (Ep. 300)

This episode breaks down programmable money, the technology that turns your wallet into a permission system.

Personal newsletter: https://defragzone.substack.com
📩 Newsletter: https://datascienceathome.substack.com
🎙 Podcast: Available on Spotify, Apple Podcasts, and more.

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
Discord Channel: https://discord.gg/4UNKGf3

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Send us mail at: …

1 month, 2 weeks ago @ datascienceathome.com
There Is No AI. There’s a Stateless Function on 10,000 GPUs Pretending to Know You (Ep. 299)

Personal newsletter: https://defragzone.substack.com
📩 Newsletter: https://datascienceathome.substack.com
🎙 Podcast: Available on Spotify, Apple Podcasts, and more.

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
Discord Channel: https://discord.gg/4UNKGf3

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, intervi…

3 months ago @ datascienceathome.com
Bias in the machine (edited)

The title of today’s episode is Bias in the machine

C: Francesco, today we are starting with an infuriating discussion.

The failure of the medical community as a whole to recognise this obvious bias up to the 21st century is an example of how insidious the problem of bias is.

Three: The bias in your training sample: people put training samples together, and people have culture, experience, and prejudice.

These assumptions inform the way AI systems work—and fail—to this day.

When an algorithm is a black box and you can’t look inside, you have no way of analysing its bias.

3 months ago @ datascienceathome.com
What is wrong with reinforcement learning? (Ep. 82)

Join the discussion on our Discord server

After reinforcement learning agents have done so well at playing Atari video games, AlphaGo, financial trading, and language modeling, let me tell you the real story here. In this episode I want to shine some light on reinforcement learning (RL) and the limitations that every practitioner should consider before taking certain directions.

RL seems to work so well!

What is wrong with it?

Are you a listener of Data Science at Home podcast?

Or did you subscribe to the Artificial Intelligence at your fingertips newsletter?

4 months ago @ datascienceathome.com
How to generate very large images with GANs (Ep. 76)

Join the discussion on our Discord server

In this episode I explain how a research group from the University of Lubeck overcame the curse of dimensionality to generate large medical images with GANs.

The problem is not as trivial as it seems.

Many researchers have failed in generating large images with GANs before.

One interesting application of such an approach is in medicine, for the generation of CT and X-ray images.

Enjoy the show!

References
Multi-scale GANs for Memory-efficient Generation of High Resolution Medical Images https://arxiv.org/abs/1907.01376

4 months ago @ datascienceathome.com
Training neural networks faster without GPU [RB] (Ep. 77)

Join the discussion on our Discord server

Training neural networks faster usually involves the use of powerful GPUs.

In this episode I explain an interesting method from a group of researchers from Google Brain, who can train neural networks faster by squeezing the hardware to their needs and making the training pipeline more dense.
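
The excerpt does not spell out the method, so the following is only a rough sketch of the data echoing idea named in the episode's reference: when the input stage is slower than the training step, each batch it produces is reused several times instead of leaving the accelerator idle. The `slow_upstream` and `echo` functions and the `echo_factor` parameter are hypothetical names chosen for illustration.

```python
def slow_upstream(num_batches):
    """Stand-in for an expensive input stage (disk reads, decoding, augmentation)."""
    for i in range(num_batches):
        yield [i, i + 1]  # a toy "batch" of two examples

def echo(batches, echo_factor):
    """Yield each upstream batch `echo_factor` times before pulling the next one."""
    for batch in batches:
        for _ in range(echo_factor):
            yield batch

# Three upstream batches feed six training steps when each batch is echoed twice.
steps = list(echo(slow_upstream(3), echo_factor=2))
print(len(steps), steps)
```

In a real pipeline the echoed batches would usually be shuffled or re-augmented before reuse; this sketch leaves that out to keep the core idea visible.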

Enjoy the show!

References
Faster Neural Network Training with Data Echoing https://arxiv.org/abs/1907.05550

4 months ago @ datascienceathome.com
More powerful deep learning with transformers (Ep. 84) (Rebroadcast)

Some of the most powerful NLP models like BERT and GPT-2 have one thing in common: they all use the transformer architecture.

Such an architecture is built on top of another important concept already known to the community: self-attention. In this episode I explain what these mechanisms are, how they work and why they are so powerful.
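
To make the self-attention idea concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It assumes identity projections (queries, keys and values are all the raw input vectors), which is a simplification: real transformer layers apply learned projection matrices first. The `tokens` values are hypothetical toy data.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no learned weights)."""
    d = len(x[0])
    out = []
    for q in x:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        # Each output is a weighted average of all value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token embeddings
attended = self_attention(tokens)
print(attended)
```

Every output row is a mixture of all input rows, so each token's new representation depends on the whole sequence rather than only on its neighbours.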

Don’t forget to subscribe to our Newsletter or join the discussion on our Discord server

References

4 months ago @ datascienceathome.com