Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
last post 4 hours ago
noisekit - CLI for generating realistic degraded speech datasets for ASR benchmarking [P]
4 hours ago @ reddit.com
EMA-Gated Temporal Sequence Compression in Vision Transformers [P]
4 hours ago @ reddit.com
Cross-species RSA: same learning rules (BP, PC, STDP, FA) tested against both human fMRI and macaque electrophysiology [P]
5 hours ago @ reddit.com
Profiling PyTorch training without accidentally stalling the GPU [D]
5 hours ago @ reddit.com
A Tiny Open-Source Self-Driving AI That Runs on a Phone [P]
11 hours ago @ reddit.com
What to use for Sign Language Recognition [R]
11 hours ago @ reddit.com
[R]GNN Model For Fraud Detection Isn't Performing Well[R]
12 hours ago @ reddit.com
[D] Is IEEE Workshop on Machine Learning for Signal Processing Reputable? [D]
12 hours ago @ reddit.com
Trouble exploring in ai/ml,idk where to being with [D]
18 hours ago @ reddit.com
Augmented Equivariant Mesh Networks for Anatomical Mesh Segmentation (ICML 2026 Workshops) [R]
1 day ago @ reddit.com
Tomesphere, 3M paper pages with TLDRs, peer reviews, code, and a SPECTER2 similarity graph [P]
1 day, 1 hour ago @ reddit.com
Verbosity is not faithfulness: an architectural argument that reasoning models cannot perform faithful inference [D]
1 day, 1 hour ago @ reddit.com
[P] have a couple technical questions for my LLM router. [P]
1 day, 2 hours ago @ reddit.com
Added a Chrome Dino-style game to my research tool's pipeline wait screen driven by real SSE events [P]
1 day, 2 hours ago @ reddit.com
[D] Dlib or pytorch to CNN? [D]
1 day, 2 hours ago @ reddit.com
Towards Data Science
last post 42 minutes ago
How to Effectively Run Many Claude Code Sessions in Parallel

Why it’s hard to run coding agents in parallel

First, I want to cover why it is challenging to run coding agents in parallel.

However, if you’re a manager of coding agents, you naturally have to handle coding agents performing different tasks.

How to effectively run many parallel coding agents

In this section, I’ll cover some specific techniques that I use and apply daily to effectively run a lot of parallel coding agents.

You can, for example, implement this by utilizing hooks in Claude Code, which are processes that run at certain points in time.

Recaps are another incredibly powerful feature that you can use to effectively run a lot of parallel coding agents.

42 minutes ago @ towardsdatascience.com
Learning From Pairwise Preferences: An Introduction to the Bradley Terry Model

This is the setting in which the Bradley-Terry model becomes especially useful by offering a mathematically clean way to learn from pairwise preferences.

Figure 1: The Bradley-Terry model learns latent item strengths from pairwise comparisons and uses them to estimate win probabilities between items.

The Bradley-Terry model formalizes this intuition by finding latent strengths that make these observed outcomes plausible under the model.
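
To make the idea of latent strengths concrete, here is a minimal sketch rather than the article's own code. It assumes the common log-strength parameterization, where the probability that item i beats item j is a logistic function of the difference in strengths, and fits the strengths by plain gradient ascent on the log-likelihood. The function names, learning rate, and step count are illustrative assumptions.

```python
import math


def bt_win_probability(strength_i, strength_j):
    """Probability that item i beats item j, given log-strengths."""
    return 1.0 / (1.0 + math.exp(strength_j - strength_i))


def fit_bradley_terry(n_items, comparisons, lr=0.05, steps=2000):
    """Fit log-strengths by gradient ascent on the log-likelihood.

    comparisons is a list of (winner_index, loser_index) pairs.
    lr and steps are illustrative defaults, not tuned values.
    """
    strengths = [0.0] * n_items
    for _ in range(steps):
        grad = [0.0] * n_items
        for winner, loser in comparisons:
            p = bt_win_probability(strengths[winner], strengths[loser])
            # Gradient of log p with respect to the winner's strength is (1 - p).
            grad[winner] += 1.0 - p
            grad[loser] -= 1.0 - p
        strengths = [s + lr * g for s, g in zip(strengths, grad)]
        # Strengths are only identified up to a constant, so center them.
        mean = sum(strengths) / n_items
        strengths = [s - mean for s in strengths]
    return strengths
```

Calling `fit_bradley_terry(3, [(0, 1), (0, 2), (1, 2), (2, 1)])` returns one centered log-strength per item, and `bt_win_probability` turns any two of them into an estimated win probability.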

TrueSkill does not incorporate covariates in the same way as the contextual Bradley-Terry model, but the two ideas are complementary.

Summary

The standard Bradley-Terry model provides a clean framework for learning from pairwise comparisons, but it assumes th…

2 hours ago @ towardsdatascience.com
Most AI Agents Fail in Production Because They’re Built Backwards

agent system seriously fail in production, it wasn’t dramatic.

A production AI agent isn’t a single intelligent thing.

So What Really Goes Into a Production System?

Every production AI system I have seen that works cleanly has something like a decision layer, whether the team named it like this or not.

They fail because the system around the model was designed backwards, starting from what the agent should do and assuming the architecture would sort itself out.

3 hours ago @ towardsdatascience.com
They Requested It. I Built It. Nobody Ever Used It.

There is a fine line between a strong model, and a black box that can’t even be explained by the ones who built it.

These doctors are used to making clinical judgements using their years of expertise and in-depth knowledge of medicine.

While a predictive model may be good at predicting a given outcome, if it cannot be explained well, clinicians will question its trustworthiness.

If the model build takes too long, they could abandon the idea altogether, or find creative solutions that don’t involve predictive models.

Your Model Isn’t Easy to Consume

Building a good predictive model is only half the battle.

5 hours ago @ towardsdatascience.com
What Is a Data Agent?

That’s why I want to share what I’ve learned, explain what a data agent is, and highlight the difference between it and a “standard” AI agent.

So, without further ado, here is my definition of a data agent: A data agent is a report you can talk to.

Meaning, the whole magic is in the AI agent setup, where you can use a Fabric data agent as a specialised tool or knowledge source.

It’s that an AI agent acts, while the data agent grounds.

If that’s the case, check out the following resources:

Fabric data agent creation – Microsoft Fabric: Learn how to create a Fabric data agent that can answer questions about data. (learn.microsoft.com)

Implement Microsoft Fabric Data Agents – Training: Implement Microso…

1 day ago @ towardsdatascience.com
The AI Model Confidence Trap

For AI systems, however, confidence can be a surprisingly unreliable narrator.

AI systems, however, often behave like that one person in a group project who confidently explains something they learned three minutes ago (I am sure we all had that classmate…).

If the model reports “90% confidence”, we would hope it is correct roughly 90% of the time.

Instead, many systems look more like “90% confidence, 65% accuracy.” This gap between confidence and accuracy is why the way we choose to train these LLMs matters a lot.
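
One common way to quantify this gap is expected calibration error (ECE). The sketch below is an illustration, not the article's method: it assumes equal-width confidence bins and takes the weighted average of the absolute difference between mean confidence and accuracy in each bin. The function name and the bin count are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: floats in [0, 1]; correct: booleans of the same length.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece
```

For a model that always reports 0.9 confidence but is right 65% of the time, this measure comes out at about 0.25, the size of the gap described above.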

When AI is confidently wrong, the mistake can scale to millions, and confidence at scale is a very different problem.

1 day, 2 hours ago @ towardsdatascience.com
Stop Using LLMs Like Giant Problem Solvers

The first change was to prepare the source data upfront.

In my case, that meant temporarily storing the relevant raw data locally.

If the pipeline stopped halfway, the cached progress meant it could resume from the last successful checkpoint.
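
The resume-from-checkpoint behaviour can be sketched as follows. This is a hypothetical illustration, not the author's pipeline: it assumes results are keyed by string, are JSON-serializable, and are cached in a local JSON file. `run_with_checkpoints`, `process`, and `cache_path` are made-up names.

```python
import json
import os


def run_with_checkpoints(keys, process, cache_path):
    """Process each key once, persisting results so a rerun can resume."""
    done = {}
    if os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as handle:
            done = json.load(handle)

    for key in keys:
        if key in done:
            continue  # Already finished in an earlier run.
        done[key] = process(key)
        # Save after every step so progress survives a crash.
        with open(cache_path, "w", encoding="utf-8") as handle:
            json.dump(done, handle)
    return done
```

Rewriting the whole file after each step is simple but slow for large caches. An append-only log or a small database would be the usual next step.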

This made the output easier to audit.

On the last day, JJ Geewax, Director of Applied AI at Google DeepMind, shared a framing that captured what I had been learning the hard way: we need to stop using LLMs like giant problem solvers.

1 day, 3 hours ago @ towardsdatascience.com
The Domain Shift: Moving Data Governance from Product Triage to Infrastructure Investment

Consider a governance team managing 60 data products across five business domains.

The governance council agenda becomes a running ledger of individual product tickets.

Figure: Product Vs Domain Lens, generated by GPT Image 2

For resource allocation decisions, the domain is the correct unit of analysis and not the individual product.

Redefining the Governance Resource Allocation Question

Shifting the analytical focus from individual products to domain pillars changes how leadership evaluates where to deploy resources.

The required intervention is a dedicated domain uplift program to establish basic data ownership and standards.

1 day, 5 hours ago @ towardsdatascience.com
I Built My First ETL Pipeline as a Complete Beginner. Here’s How.

Something real, with real data, that actually worked at the end.

I found the GitHub API documentation and decided I was going to build my first ETL pipeline from scratch.

Step 1: Extract

The first thing I had to do was figure out how to talk to the GitHub API.

Step 2: Transform

Now I had 30 repos sitting in a Python list, each one a nested dictionary with dozens of fields.

Real GitHub data, shaped and saved by a pipeline I built from scratch.

1 day, 23 hours ago @ towardsdatascience.com
Can AI write your code?

is no longer whether AI can write code, but whether we can trust the code it writes?

That is why the article “Can AI write your code?

Running the Code and Comparing Outputs

In the third step, the generated code is executed in the corresponding programming environment: Python, R, or Stata.

In practice, AI has influenced not only the way we write code but also the infrastructure we use.

Can AI write your code?

1 day, 23 hours ago @ towardsdatascience.com
From TF-IDF to Transformers: Implementing Four Generations of Semantic Search

After fitting the TF–IDF vectorizer on the expert critiques, the system produces a sparse document-term matrix stored in self.matrix.
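
As a rough illustration of this step, the sketch below builds a sparse TF-IDF matrix in pure Python and ranks documents by cosine similarity. It is a stand-in under stated assumptions, not the article's implementation: the class name `TfidfSearch`, the whitespace tokenizer, and the smoothed IDF formula are choices made for this example. Only the attribute name `self.matrix` comes from the text.

```python
import math
from collections import Counter


class TfidfSearch:
    """Tiny TF-IDF index; each row of self.matrix is a sparse dict."""

    def __init__(self, documents):
        tokenized = [doc.lower().split() for doc in documents]
        n_docs = len(tokenized)
        # Document frequency: number of documents containing each term.
        doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
        # Smoothed IDF (an assumption made for this sketch).
        self.idf = {
            term: math.log((1 + n_docs) / (1 + count)) + 1.0
            for term, count in doc_freq.items()
        }
        self.matrix = [self._vectorize(tokens) for tokens in tokenized]

    def _vectorize(self, tokens):
        counts = Counter(t for t in tokens if t in self.idf)
        weights = {t: c * self.idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        # L2-normalize so a dot product equals cosine similarity.
        return {t: w / norm for t, w in weights.items()} if norm else {}

    def search(self, query, top_k=3):
        """Return (document_index, score) pairs, best match first."""
        q = self._vectorize(query.lower().split())
        scored = [
            (index, sum(w * row.get(term, 0.0) for term, w in q.items()))
            for index, row in enumerate(self.matrix)
        ]
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        return scored[:top_k]
```

Because every row is L2-normalized, the score is the cosine similarity between query and document. Terms that never appeared at fit time are ignored.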

C.3 Method 3: Embedding-Based Semantic Search

The next major step in semantic search goes beyond TF–IDF and simple word counting.

This represents an important conceptual shift from semantic similarity matching to supervised task-specific learning.

Specifically, TF-IDF focuses mainly on important words, embedding models focus on semantic similarity, and transformers try to understand language through context and relationships between words.

E. Conclusion

In this article, we explored four practical approaches to semantic search, moving from classical TF-IDF retrie…

2 days, 3 hours ago @ towardsdatascience.com
Introducing the Agent Toolkit for Amazon Web Services

Installing Agent Toolkit in your coding agent

The agent toolkit is available for most modern coding agents, such as Claude Code, Cursor, Kiro, and VS Code.

AI Agents on AWS, AWS Core, AWS Data Analytics, Browser, Documents, Presentations, Spreadsheets

That means all the required AWS-related plugins are available for your agent to use.

2/ Observability

Monitoring the AWS Agent Toolkit is primarily done through the AWS MCP Server, as it’s the managed component that receives tool calls and performs AWS actions.

In short, when used with a coding agent, the AWS toolkit can:

Create AWS resources, write code, and deploy apps.

For more information on the Agent Toolkit for AWS, check out the official GitHub r…

2 days, 5 hours ago @ towardsdatascience.com
The Ultimate Beginners’ Guide to Building an AI Agent in Python

As overwhelming as it may sound, building an AI Agent is not that difficult.

In this article, we will go through the step-by-step process of building an AI Agent.

This will serve as a dedicated AI agent tutorial for the very beginners in the field of programming, coding, and AI.

Figure: AI Agent Workflow (Image by Author)

Now that we know what AI Agents are and how they work, let us start coding our own personalized AI Agent.

Building an AI Educational Agent in Python

In this article, we will build an AI Educational Agent that will act as your personal education assistant.

3 days ago @ towardsdatascience.com
Beyond the Model: Why Data Scientists Must Embrace APIs and API Documentation

Sum up what good API documentation should look like, with information on endpoints, parameters, and responses.

A special case of API is a REST API which follows the concepts of the REST (REpresentational State Transfer) architecture.

Tips on Creating Good API Documentation

Creating effective API documentation is crucial for ensuring that users can easily understand and utilize your API.

It’s important to note that in this example, we didn’t require an API key to access our REST API resource.

By prioritizing clear and detailed documentation, data scientists can ensure they will be comfortable working with modern tools.

3 days, 4 hours ago @ towardsdatascience.com
How to Mathematically Choose the Optimal Bins for Your Histogram

Have you ever wondered how to choose your bins in a histogram?

Although numerically expensive to compute, the expression for computing the standard deviation in the density is remarkably straightforward (F. Pijlman, 2023):

\[ \sigma_{P(x|X)}^2 = P(x|X) \left( P(x|{x,X}) - P(x|X) \right) \]

This yields the densities below:

Conclusions

We began with a simple question: Is there a mathematical foundation for choosing the bins in a histogram?

As the concept of bins inherently connects data points with densities, we studied how to choose bins for densities.

Dirichlet Priors give us a rigorous way to express our initial assumptions about data dis…

4 days ago @ towardsdatascience.com
Distill.pub
last post: none
TheSequence
last post 6 hours ago
The Sequence AI of the Week #867: Thinking in Latents: Why Sapient's HRM-Text Is a Quiet Rebuke to Chain-of-Thought

So we paper over this by making the model think out loud.

Every reasoning step has to leave the residual stream, become a discrete token in a vocabulary built for human communication, and come back in through the embedding layer for the next step.

Sapient Intelligence’s bet, made first with the original Hierarchical Reasoning Model paper last summer and now extended into the language domain with HRM-Text, is that this is fixable.

Not by making the model bigger, not by training on more CoT traces, but by giving the architecture the one thing it doesn’t have: variable, internal, depth.

It’s worth thinking carefully about what they did and what it does and doesn’t yet prove.

6 hours ago @ thesequence.substack.com
The Sequence Knowledge #866: Three Text Diffusion Models You Need To Know About

💡 AI Concept of the Day: Three Text Diffusion Models You Need To Know About

For most of the LLM era, language generation has been built around a single assumption: text should be produced like a typewriter, one token at a time, left to right, each new symbol conditioned on a frozen history.

Text diffusion models challenge that assumption at its root.

Instead of factorizing language as “the next token given all previous tokens,” diffusion models define a corruption process and then learn how to reverse it.

Together, they outline the three phases of a new architecture class: scientific proof, industrial deployment, and frontier validation.

LLaDA: The Scientific Proof That Diffusion Can Scale

1 day, 6 hours ago @ thesequence.substack.com
The Sequence Radar #865: Last Week in AI: Last Week in AI: Karpathy, Google, Colossus, and the Coming IPO Wave

📝 Editorial: Last Week in AI: Last Week in AI: Karpathy, Google, Colossus, and the Coming IPO Wave

The last three weeks felt like a phase transition.

It is the most coherent agent story any frontier lab has shipped this year.

AI Lab: Carnegie Mellon University
Summary: This study evaluates the practical effectiveness of AI-generated peer reviews by having 45 domain scientists manually grade 2,960 individual criticisms from 82 Nature-family papers.

AI Lab: University of Illinois Urbana-Champaign & Meta
Summary: This paper introduces Spreadsheet-RL, an on-policy reinforcement learning framework designed to train specialized AI agents for complex spreadsheet workflows…

3 days, 6 hours ago @ thesequence.substack.com
The Sequence Opinion #864: Every AI Agent Needs a Computer

The next phase of AI agents will not be defined only by better models, longer context windows, or more elegant tool-calling APIs, but by something much more primitive: access to a computer.

An agent that can only emit tokens is a brilliant brain in a jar; an agent with a filesystem, terminal, browser, network, package manager, credentials, memory, and guardrails becomes a worker inside a real execution environment.

This is the core thesis: every serious AI agent needs a computer, not metaphorically, but architecturally.

The emerging market for micro-containers, sandboxes, browser runtimes, and agent workspaces is really the market for giving intelligence a body.

The Brain in the Jar Problem

6 days, 6 hours ago @ thesequence.substack.com
The Sequence AI of the Week #863: The Model is the Interface: Inside Thinking Machines' Interactive Models

For this week in AI’s essay, I would like to discuss Thinking Machines’ work on interactive models which takes multi-modality to a new level.

I’ve been diving as deep as possible into their ideas and wanted to share some thoughts.

For the last few years, the default mental model for large language models has been embarrassingly simple: concatenate tokens, predict the next token, repeat.

The human writes a message, the model replies, the human writes again.

But collaboration is not text.

1 week ago @ thesequence.substack.com
The Sequence Knowledge #862: Learning About Text Diffusion Models

💡 AI Concept of the Day: What are Text Diffusion Models

If you look at the architecture of the modern AI boom, it is heavily bifurcated by modality.

In the visual domain, we are entirely ruled by diffusion models.

But in the realm of text, diffusion has historically been an afterthought.

Large Language Models (LLMs) like GPT-4, Claude, and LLaMA are staunchly autoregressive (AR).

Because AR models generate blindly from left to right, they cannot easily engage in global planning.

1 week, 1 day ago @ thesequence.substack.com
The Sequence Radar #861: Last Week in AI: IPOs, Interactive Models, and Recursive Dreams

📝 Editorial: Last Week in AI: IPOs, Interactive Models, and Recursive Dreams

This week in AI felt less like a product cycle and more like a philosophical provocation wrapped in a market event, a demo, and a few delightfully ambitious lab announcements.

The IPO was not just a financing milestone for an AI chip company; it was a reminder that the AI race is still brutally physical.

AI Lab: UNC-Chapel Hill, UC Berkeley, UCSC
Summary: EVOLVEMEM introduces a self-evolving memory architecture that uses an LLM-powered diagnosis module to autonomously optimize its own retrieval configuration based on failure logs.

AI Lab: Fastino Labs
Summary: GLiGuard is a highly efficien…

1 week, 3 days ago @ thesequence.substack.com
The Sequence Opinion #860: Every Company’s Last eXam: Some Reflection About Practical AI Evals

For today’s essay, I want to explore an idea that has become central to how we think about AI evaluations at LayerLens.

This is not an essay about LayerLens, but about a simple and increasingly unavoidable thesis: evals are becoming the fourth pillar of modern AI, alongside compute, data, and models.

As AI systems move from chatbots to agents, from demonstrations to production workflows, every meaningful task performed by every agent inside every company will need its own evaluation layer.

Practical, dynamic, company-specific exams that measure whether an AI system can actually survive contact with real work.

That is why top frontier labs now emphasize task-specific evals, production-derive…

1 week, 6 days ago @ thesequence.substack.com
The Sequence AI of the Week #859: Reading Claude’s Mind in English: A Note on Natural Language Autoencoders

There is a recurring fantasy in interpretability work, somewhere between a wish and an embarrassment.

You stare at a residual stream activation — twelve thousand floats — and you want to ask it, in plain English, what are you thinking about?

Sparse autoencoders give you a thousand sparse latents you then label by inspecting top-activating examples.

Anthropic’s new paper, Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations, is the first interpretability artifact in a while where the activation talks back.

You point an NLA at a token in a Claude Opus 4.6 transcript and it produces a few bullet points of English describing what the model is thinking.

2 weeks ago @ thesequence.substack.com
The Sequence Knowledge #858: How State Space Models Went from Curiosity to Serious Transformer Competitor

💡 AI Concept of the Day: How State Space Models Went from Curiosity to Serious Transformer Competitor

There is this thing that happens in ML research where a line of work gets quietly good for years, and then one day you wake up and it’s suddenly competing with the dominant paradigm.

State space models are having that moment right now.

For the past eight years, the transformer has been the only architecture that matters.

But transformers have a problem that everyone in the field knows about and nobody has fully solved.

State space models offer a fundamentally different contract: linear time complexity, constant memory at inference, and no KV-cache at all.
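
The contract described above (linear time, constant memory, no KV-cache) can be illustrated with a toy sketch. This is a hypothetical diagonal linear recurrence, not the implementation of any particular state space model; the function name `ssm_scan` and all parameter values are assumptions made purely for illustration.

```python
# Toy sketch (assumed, illustrative): a diagonal linear state space recurrence.
#   h_t = a * h_{t-1} + b * x_t      (elementwise, per state channel)
#   y_t = sum(c * h_t)
# The names ssm_scan, a, b, c are hypothetical, not taken from any library.

def ssm_scan(xs, a, b, c):
    """Process a scalar input sequence one step at a time."""
    h = [0.0] * len(a)  # fixed-size state: the only memory carried between steps
    ys = []
    for x in xs:  # a single pass over the inputs: time grows linearly with len(xs)
        h = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, h, b)]
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return ys, h


a = [0.5, 0.9]  # per-channel decay (assumed values)
b = [1.0, 1.0]  # input weights (assumed values)
c = [1.0, 0.5]  # output weights (assumed values)

ys, final_state = ssm_scan([1.0, 0.0, 0.0], a, b, c)
print(ys)
print(final_state)
```

However long the input is, the state `h` keeps the same size, whereas a transformer's KV-cache grows with every generated token.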

2 weeks, 1 day ago @ thesequence.substack.com
The Sequence Radar #857: Last Week in AI: Inside the Machine, Outside the Text Box

In the AI of the week, we dive into Anthropic’s groundbreaking paper about natural language autoencoders.

📝 Editorial: Inside the Machine, Outside the Text Box

This week in AI had the strange texture of a market that is simultaneously becoming more scientific, more productized, and more speculative.

But underneath all of it is the same story: AI is moving from a model race into an infrastructure race.

Anthropic’s Natural Language Autoencoders paper was the most intellectually interesting development of the week.

On the opposite side of the stack, OpenAI’s new voice model release pushed AI further toward becoming a native interface rather than a text box with bett…

2 weeks, 3 days ago @ thesequence.substack.com
The Sequence Opinion #856: The Salesforce of agents won't be Salesforce, The Google of agents won't be Google

For the first few decades of the internet, software had a reasonably stable assumption baked into it: the user was a human.

The entire SaaS and consumer internet stack grew around this shape of user.

CRMs tracked human sales reps selling to human buyers.

Identity systems authenticated humans.

Analytics systems measured human clicks, human sessions, human conversions.

2 weeks, 6 days ago @ thesequence.substack.com
The Sequence AI of the Week #855: Inside Nemotron Omni: NVIDIA’s New Multimodal Brain for Agents

The interesting thing about NVIDIA’s new Nemotron 3 Nano Omni is not that it “does multimodality.” We already have a zoo of models that can caption images, transcribe speech, parse PDFs, answer questions about videos, and click around GUIs.

The interesting thing is that Nemotron Omni is designed to make that zoo feel like a single animal.

The speech model may hear what was said but not what was on screen when it was said.

Nemotron 3 Nano Omni is NVIDIA’s attempt to move the “eyes and ears” of an agent into a single efficient perception-and-reasoning model: video, audio, image, and text in; text out.

NVIDIA announced it on April 28, 2026, positioning it as an open omni-modal reasoning model …

3 weeks ago @ thesequence.substack.com
The Sequence Knowledge #854: Return of the King: Unrolling the xLSTM Architecture

💡 AI Concept of the Day: Return of the King: Unrolling the xLSTM Architecture

If you were training sequence models circa 2015, your entire mental model of the world was shaped by the Long Short-Term Memory (LSTM) network.

Invented in the 1990s by Sepp Hochreiter and Jürgen Schmidhuber, the LSTM was the undisputed workhorse of deep learning.

“Attention Is All You Need” dropped, and the entire AI ecosystem pivoted.

We traded the deep, architectural elegance of the LSTM for the brute-force, highly parallelizable matrix multiplications of the Transformer.

The Transformer won the hardware lottery because it allowed us to map the entire sequence onto a GPU grid and train it all at once.

3 weeks, 1 day ago @ thesequence.substack.com
The Sequence Radar #853: Last Week in AI: The Great AI Fundraising Wars and a New Frontier Lab

📝 Editorial: Last Week in AI: The Great AI Fundraising Wars and a New Frontier Lab

This week in AI felt less like a product cycle and more like a sovereign debt auction for the future of cognition.

But the real story is that frontier AI is becoming an industrial-scale capital formation game.

It is the market trying to price a new kind of company: part model lab, part cloud tenant, part developer platform, part enterprise operating system.

AI Lab: Microsoft. Summary: This paper introduces a scalable methodology to create realistic, user-specific synthetic computer environments populated with diverse directory structures and content-rich artifacts.

🤖 AI Tech Releases…

3 weeks, 3 days ago @ thesequence.substack.com
Synced Review
last post None
📓 Cool Blogs
ODS.ai Habr
last post 1 month, 3 weeks ago
Vibe Coding, Chess’nok Style. 1. e4

But this is not vibe coding; it is hard, professional AI development.

Over that time, 112 chats were created in ChatGPT for this project, which is roughly 560 prompts.

During especially intense periods, I had to get up at night to make optimal use of the limits, which are split into 5-hour and weekly sessions.

But this is not magic, and it is not a "make it good" button.

That is exactly why the future belongs not to vibe coding but to those who learn to manage this speed.

1 month, 3 weeks ago @ habr.com
Why I Became an IT Volunteer & a Dataset of News on the Contradictions of Modern Society

A simple example with fuel prices: gasoline gets more expensive both when the oil price rises and when it falls.

Realizing that your work increases someone's market capitalization but does not solve society's real problems, visible in everyday life and in the news, pushed me to look for some other activity as well.

In addition, thanks to АМБ, a unique dataset of news on the contradictions of modern society has appeared on Kaggle and GitHub; more on it below.

A dataset of news on the contradictions of modern society

АМБ activists and volunteers from allied groups collected and labeled a dataset of news items that highlight the very systemic contradictions I had been thinking about earlier.

Example B: In 2023, one in every 11 people in the world went hungry, while in …

3 months, 1 week ago @ habr.com
[Translation] How Codex Is Built

A detailed look at how the OpenAI Codex team builds its coding agent, how engineers use it, and what this may mean for the future of software development.

To understand how Codex is built, how teams inside OpenAI use it, and how it affects engineering practices at the makers of ChatGPT, I spoke with three OpenAI employees: Thibault Sottiaux, head of Codex.

Both products launched in the spring: Codex CLI was announced in April 2025, and Codex in ChatGPT was introduced in May.

On the Codex team, these files explain to the agent how to navigate the codebase, which commands to run for testing, and how to follow the project's standards.

Using Codex at OpenAI: Besides…

3 months, 1 week ago @ habr.com
Natural Language Processing & LLMs Course: New Season

On February 10, we are once again launching a free online course on natural language processing (NLP).

What we will cover: the classical beginnings (Zipf's law, TF-IDF, RNN, CNN, Transformer); the core NLP tasks (text classification, tagging, and generation); specialized areas (agents and vibe coding); LLMs and their applications.

If you are a student at ITMO, MIPT, or HSE, the course can be counted for academic credit.

I have worked in NLP for more than 12 years, have worked at Yandex and VKontakte, and have defended a Candidate of Sciences dissertation.

If you have questions, bring them to ODS Mattermost, where you will find all the answers, seminar times, and links.

3 months, 3 weeks ago @ habr.com
SWE-MERA: A New Dynamic Benchmark for Agentic Code Generation Models

However, all tasks in MERA CODE, as in SWE-bench and other benchmarks of this kind, follow the classic paradigm in which there is a fixed training set and, more importantly, a fixed evaluation set.

But the large language models for coding that we are trying to evaluate with our suite also learn from GitHub, and have done so since the very first LLaMa model.

It may seem that 700 tasks is not many, but it is already a very decent number, and most importantly, these are new tasks.

Current behavior:
from sympy import ask, Q, Symbol
x = Symbol('x')
print(ask(Q.finite(x**-1), Q.real(x)))  # Output: True
Expected behavior: The function should return None to indicate uncertainty, as x**-…

8 months, 1 week ago @ habr.com
Machine Learning Mastery
last post 2 weeks, 2 days ago
Implementing Prompt Compression to Reduce Agentic Loop Costs

Agentic loops in production can be synonymous with high costs, especially when it comes to both LLM and external application usage via APIs, where billing is often closely related to token usage.

2 weeks, 2 days ago @ machinelearningmastery.com
Implementing Permission-Gated Tool Calling in Python Agents

AI agents have evolved beyond passive chatbots.

2 weeks, 5 days ago @ machinelearningmastery.com
The Roadmap to Mastering Tool Calling in AI Agents

Most

2 weeks, 6 days ago @ machinelearningmastery.com
Implementing Statistical Guardrails for Non-Deterministic Agents

Non-deterministic agents are those where the same input can lead to distinct outputs across multiple runs.

3 weeks, 1 day ago @ machinelearningmastery.com
Agentic RAG Explained in 3 Levels of Difficulty

Traditional

3 weeks, 2 days ago @ machinelearningmastery.com
Effective KV Compression with TurboQuant

TurboQuant has recently been launched by Google as a novel algorithmic suite and library for applying advanced quantization and compression to large language models (LLMs) and vector search engines — an indispensable element of RAG systems.

3 weeks, 6 days ago @ machinelearningmastery.com
Building AI Agents in Python with Pydantic AI

4 weeks ago @ machinelearningmastery.com
Effective Context Engineering for AI Agents: A Developer’s Guide

When

4 weeks, 1 day ago @ machinelearningmastery.com
Text Summarization with Scikit-LLM

In a

1 month ago @ machinelearningmastery.com
Building AI Agents with Local Small Language Models

The idea of building your own AI agent used to feel like something only big tech companies could pull off.

1 month ago @ machinelearningmastery.com
Train, Serve, and Deploy a Scikit-learn Model with FastAPI

FastAPI has become one of the most popular ways to serve machine learning models because it is lightweight, fast, and easy to use.

1 month ago @ machinelearningmastery.com
AI Agent Memory Explained in 3 Levels of Difficulty

A stateless AI agent has no memory of previous calls.

1 month ago @ machinelearningmastery.com
Getting Started with Zero-Shot Text Classification

Zero-shot text classification is a way to label text without first training a classifier on your own task-specific dataset.

1 month, 1 week ago @ machinelearningmastery.com
The Complete Guide to Inference Caching in LLMs

Calling a large language model API at scale is expensive and slow.

1 month, 1 week ago @ machinelearningmastery.com
Python Decorators for Production Machine Learning Engineering

You've probably written a decorator or two in your Python career.

1 month, 1 week ago @ machinelearningmastery.com
ML in Production
last post None
Sorta Insightful
last post 1 week, 2 days ago
AI Will Not Make Your Job Chill

People keep talking about how AI will make their job easy, and I don’t really understand why.

I assume the factory job producing this was still hard work.

I don’t think AI has made my job chill, and I feel like I am front-line compared to much of the economy.

It’s not widely known, but transportation and warehousing has the highest rate of nonfatal work injuries in the US.

For a while, this will not lead to any job loss, because increasing abundance will lead to higher demand.

1 week, 2 days ago @ alexirpan.com
Why I Signed The Amicus Brief for Anthropic v Department of War

On Monday, Anthropic filed a lawsuit against the Department of War, and an amicus brief in support of Anthropic was filed on behalf of a number of OpenAI and Google employees.

There’s also an amicus brief filed on behalf of Microsoft.

There’s conflicting reporting, but very broadly, Anthropic signed an agreement with the government to deploy Claude in classified, military contexts.

Anthropic said no, Pete Hegseth declared them a supply chain risk, and Anthropic filed a lawsuit against this.

The amicus brief was broadly aligned with my thoughts on the matter, so I signed.

2 months, 2 weeks ago @ alexirpan.com
MIT Mystery Hunt 2026

This has spoilers for MIT Mystery Hunt 2026.

Pre-Hunt

The time running up to Hunt was more stressful than usual…very briefly, I typically hunt with teammate.

Just last year, I did GPH 2025, LN Hunt, Teammate Hunt 2025, Microsoft Hunt 2025, and Silph Puzzle Hunt 2025, all of which had significant 3+ hour solve puzzles that would not be out of place in Mystery Hunt.

Not to mention smaller hunts like Advent Hunt, and then I didn’t even do Brown Puzzlehunt or Vertex Hunt or the fall CMU Hunt.

To me, the crux is whether Mystery Hunt is broken, or Mystery Hunt is fine.

3 months, 4 weeks ago @ alexirpan.com
Authentic Imperfection

I’ve been thinking about the anger surrounding generative AI.

To keep things fair, he took the best human images and best AI images, meaning human art from famous artists, and AI art from prompters skilled at removing obvious tells of image generation.

When people complain about AI slop, I see it as a complaint against the deluge of default style AI images.

We’ve seen this happen in all forms: AI text, AI music, older forms of computer generated content like CGI.

As much as we celebrate imperfection, digital imperfection is a step too far.

6 months, 1 week ago @ alexirpan.com
Ten Years Later

Every now and then, someone asks me why I blog, and I don’t really know what to tell them.

That’s another reason I’m not celebrating 10 years with more gusto, I know I’ve been writing less.

Indiana Jones and the Great Circle: I don’t know how they did it, but Indiana Jones and the Great Circle was just fun all the way through.

My one complaint is that the hand-to-hand combat feels like the worst part of the game, so of course they put a bunch of upgrades behind learning parry timings you’ll never use later.

I have not tried Peak, but Another Crab’s Treasure was really good and is worth playing if you’re interested in a Souls-like.

9 months, 1 week ago @ alexirpan.com
Lil'Log
last post None
inFERENCe
last post 3 months ago
The Future of Software

February 25, 2026

The world of software is undergoing a shift not seen since the advent of compilers in the 1970s.

How will humans tell AI agents what software artefacts we would like to create?

This future of software creation, in which our programming languages are abstracted away, raises two very important questions: What will the instruction/specification language look like?

This should be a clear layer of separation between the developer and the pool of AI agents working to maintain software.

3 months ago @ inference.vc
Deep Learning is Powerful Because It Makes Hard Things Easy - Reflections 10 Years On

Ten years ago this week, I wrote a provocative and bold post that blew up and made it to the top spot on HackerNews.

In hindsight: There is a lot of stuff in deep learning that we don't understand nearly enough.

Sometimes things work for reasons completely unrelated to why we thought they would work.

(Pop some 🍿 in the microwave and read till the end for more)

🎯 "Deep learning is powerful exactly because it makes hard things easy"

Okay, this was a great insight.

🎯 Generative Modeling

In the post I suggested people learn "something harder" instead of - or in addition to - deep learning.

3 months, 3 weeks ago @ inference.vc
The Spectator
last post None
The Unofficial Google Data Science Blog
last post None
Off the Convex Path
last post None
Jay Alammar
last post None
Piekniewski's blog
last post None
fast.ai NLP
last post None
Sebastian Ruder
last post None
大トロ
last post None
🔬 Science
Papers With Code
last post None
Papers With Code
last post None
Papers With Code
last post None
💼 University and corporation labs
DeepMind
last post 5 days, 21 hours ago
We’re launching the Google DeepMind Accelerator program in Asia Pacific to tackle environmental risks

The Asia-Pacific region is a global engine for economic growth, but it's also highly vulnerable to climate change.

While green technologies are gaining momentum, a recent report shows they aren’t scaling fast enough to keep up with the region’s rising environmental risks.

Selected organizations will receive expert mentorship, tailored support and help integrating frontier AI and science AI models from Google AI experts into their projects or products.

If you're working on climate solutions, we want to help you scale your work.

The program kicks off with an in-person bootcamp in Singapore, and you can learn more and register your interest today.

5 days, 21 hours ago @ blog.google
Fast-tracking genetic leads to reverse cellular aging

Biologists Omar Abudayyeh and Jonathan Gootenberg are using Co-Scientist to help them blast through both.

Their lab runs huge genetic screens that flip thousands of genes on or off then reads how cells respond to these changes.

Co-Scientist is helping on two fronts.

Second, Co-Scientist speeds up the follow-through.

With Co-Scientist analysing their screening data alongside the literature, that work is slashed to just a few days.

1 week, 1 day ago @ deepmind.google
Simulate real-world places with Project Genie and Street View

Street View: ground your worlds in real places

When creating imaginative worlds in Project Genie, you can now also base them on real places.

This capability is powered by Maps Imagery Grounding, the same technology developers use to create stunning AI visuals with Street View.

Street View imagery in Project Genie is available now for places in the U.S. with plans to expand to more places over time.

Project Genie: now available with Google AI Ultra

Starting today, Project Genie, including the new Street View capability, is gradually rolling out to all eligible Google AI Ultra $200 subscribers globally (18+).

Try creating today with Project Genie.

1 week, 2 days ago @ blog.google
Introducing Gemini Omni

We’re introducing Gemini Omni, where Gemini’s ability to reason meets the ability to create.

Omni is our new model that can create anything from any input — starting with video.

With Omni, you can combine images, audio, video and text as input and generate high-quality videos grounded in Gemini's real-world knowledge.

Today, we’re rolling out the first model in the Omni family: Gemini Omni Flash, to the Gemini app, Google Flow and YouTube Shorts.

Here’s some of what makes Omni special:

Edit your videos through conversation

Gemini Omni gives you an easier way to edit video: with natural language.

1 week, 2 days ago @ blog.google
Introducing Google Antigravity 2.0
1 week, 2 days ago @ antigravity.google
Gemini for Science: AI experiments and tools for a new era of discovery

That’s why we are introducing Gemini for Science, a collection of science tools and experiments designed to expand the scale and precision of scientific exploration.

Gemini for Science experimental tools on Google Labs include three primary prototypes designed to handle such tasks.

Literature Insights searches scientific literature and structures results into tables with custom, searchable attributes for side-by-side analysis.

Our research teams using Science Skills have already seen this speedup in practice.

In early testing, our team used Science Skills to perform, in minutes, a complex analysis that normally takes hours.

1 week, 3 days ago @ blog.google
Making it easier to understand how content was created and edited

As generative media becomes more advanced and accessible, it’s helpful to know where content comes from, and whether it’s been altered.

Today, we’re expanding our content transparency and verification tools in Search, Gemini, Chrome, Pixel and Cloud, and deepening our partnership with the broader industry.

Since then, we've integrated SynthID into our generative media models and products, watermarking over 100 billion images and videos and 60,000 years of audio.

Across a growing number of our generative media tools, we use C2PA Content Credentials, the industry standard that shows how media was created and modified, with or without AI.

In an era of generative media, we believe that identify…

1 week, 3 days ago @ blog.google
Strengthening Singapore’s AI Future: A New National Partnership

At Google DeepMind, we believe frontier AI can be a powerful force for good.

As part of our National Partnerships for AI initiative, we are launching new programmes in the country, while driving a key pillar of Google’s new National AI partnership with the Singapore Government.

Together, we are strengthening Singapore’s National AI Strategy to responsibly deploy AI at scale for economic growth and public benefit.

Using frontier AI like AlphaFold and Google Earth and other latest AI for Science tools, we are working to accelerate our understanding of outbreaks across Southeast Asia.

1 week, 4 days ago @ deepmind.google
Finding the molecular switches behind new infectious diseases

Professor Clare Bryant at the University of Cambridge is using Co-Scientist to hunt for the molecular switches that cause severe diseases, like sepsis, in humans when pathogens leap between species, and find new approaches to prevent this happening.

Testing Co-Scientist, Bryant fed it a summary of one of her grant proposals studying flu in birds and humans outlining her lab’s research questions.

Co-Scientist had prioritised a protein that hadn’t been on her radar, connected to several signalling pathways she was already interested in.

Back at her lab, she added unpublished material, kept confidential within Co-Scientist.

But her lab is now on track to complete it in six months, she …

1 week, 4 days ago @ deepmind.google
Opening new paths in aging research

At Calico Life Sciences, head of AI/ML Matt Onsum and principal scientist Katherine Labbé are using Co‑Scientist to help connect scattered findings across the biology of aging and turn them into hypotheses worth testing.

That is no small task, because the biology literature is full of mixed-quality findings, dead ends, and results that do not replicate.

Onsum says Co-Scientist impressed Calico’s experts with its scientific discernment, helping cut through that noise to help them identify new ideas genuinely worth exploring.

The Calico team used Co-Scientist to generate a novel yet plausible hypothesis about how the ISR is regulated by metabolism, which is known to change with age and in var…

1 week, 4 days ago @ deepmind.google
Accelerating discovery of liver disease mechanisms

At the University of Edinburgh, bioengineer Filippo Menolascina is using Co-Scientist to comb the literature for overlooked links and generate new hypotheses.

Developing treatments is challenging because MASH involves intertwined biological processes, including liver inflammation and metabolism, meaning single‑target drugs fall short.

Faced with that combinatorial explosion, Menolascina used Co‑Scientist to narrow the search.

In his hands, Co‑Scientist synthesised evidence across liver biology and pharmacology, highlighted mechanisms worth focusing on, and flagged candidate combination therapies that his team could test.

In one emblematic case, Co‑Scientist tackled a live, practical questio…

1 week, 4 days ago @ deepmind.google
Uniting biological toolkits for a new approach to ALS

Ritu Raman at MIT and Ryan Flynn at Boston Children's Hospital approach human biology with very different toolkits, but Co-Scientist is bridging their labs.

Raman, a mechanical engineer, builds living nerve and muscle tissues to model diseases that affect voluntary movement.

Co-Scientist compressed that work, quickly helping Raman interrogate the evidence in relation to her tissue model, turn ideas into testable hypotheses, and rank potential directions in accordance with the trade-offs labs actually face, such as feasibility and potential risk–reward.

Raman brought her new research directions to Flynn, and the pair used Co-Scientist iteratively, combining its best ideas into creative resea…

1 week, 4 days ago @ deepmind.google
Uncovering repurposed medicines to fight liver fibrosis

Liver fibrosis is a scarring process that can lead to cirrhosis, which causes more than 1.4 million deaths each year.

Geneticist Gary Peltz at Stanford University School of Medicine is using Co-Scientist to accelerate the hunt for medicines that can slow, stop, or reverse it.

In research published in Advanced Science, Peltz’s team explored whether Co-Scientist could support efforts to identify drugs from the vast literature of existing medicines that could be repurposed to treat fibrosis.

Peltz asked Co-Scientist to propose three candidates and explain its reasoning.

By contrast, of the three drug candidates selected by Co-Scientist, two blocked fibrosis and promoted the regeneration of liv…

1 week, 4 days ago @ deepmind.google
How WeatherNext helped the National Hurricane Center better predict Hurricane Melissa’s historic landfall in Jamaica

The National Hurricane Center was able to issue advanced weather warnings to give communities in Jamaica additional lead time to prepare, evacuate, and help protect livelihoods.

In October 2025, Hurricane Melissa made history.

It was the strongest hurricane on record to land in Jamaica and tied for the strongest hurricane in the Atlantic.

The National Hurricane Center’s (NHC) forecast also marked a historic milestone.

For the first time, they predicted a storm would reach Category 5 intensity starting from Category 1 wind speed.

Predicting dangerous storms earlier and more accurately helps teams on the ground to better mobilize resources and coordinate evacuations effectively.

1 week, 4 days ago @ deepmind.google
Gemini 3.5: frontier intelligence with action

Today, we’re introducing Gemini 3.5, our latest family of models combining frontier intelligence with action.

We’re kicking off the series by releasing 3.5 Flash.

It delivers frontier performance for agents and coding, excelling at complex long-horizon tasks that deliver real-world utility.

3.5 Flash is available today to billions of people globally:

For everyone via the Gemini app and AI Mode in Google Search

For developers in our agent-first development platform Google Antigravity and Gemini API in Google AI Studio and Android Studio

For enterprises in Gemini Enterprise Agent Platform and Gemini Enterprise

It's already being used internally, and we look forward to rolling it out next m…

1 week, 4 days ago @ blog.google
Google
last post 5 hours ago
Introducing Google AI Threat Defense to help you outpace the adversary

Organizations need to be able to keep pace and protect themselves against AI agent-driven, high-speed attacks — but they can no longer rely on legacy, manual methods.

To defend against this range of threats, organizations need more than one model or agent.

That’s why we’re launching Google AI Threat Defense — an automated security system designed to help you continuously monitor for and stop AI-powered threats before they can impact your business.

Introducing Google AI Threat Defense

Google AI Threat Defense fuses the reasoning power of Gemini and other frontier models, the contextual risk prioritization of Wiz, the code remediation capabilities of Gemini and CodeMender, and the frontline ex…

5 hours ago @ cloud.google.com
The Blueprint: How Movix fills a gap in dental skills with specialized agentic AI

Welcome to The Blueprint, a regular feature where we highlight how Google Cloud customers are tackling unique and common challenges across industries using the latest AI and cloud technologies.

At Movix, we’re building one of the first agentic AI solutions for dental appliance manufacturers and dental labs to help companies in the sector acquire digital technical expertise so they can scale clinical workflows cost-effectively and consistently.

The challenge: Movix started in 2025 with a mission to solve a serious shortage of skilled dental technicians in aligner manufacturing through AI and agentic workflows.

Before founding Movix, we had previously started a vertically integrated dental ali…

5 days, 1 hour ago @ cloud.google.com
AI Studio unlocks full-stack vibe coding with Cloud Run, Firebase, and Cloud SQL, no credit card required

At Google I/O 2026, we announced updates to the integration between Google AI Studio and Google Cloud:

New users can deploy up to two full-stack applications to the Google Cloud Starter Tier, no billing account required

An expanded choice of databases: Firestore for non-relational data, and Cloud SQL as a new relational database option

Tight integration with Google Workspace tools like Sheets, Calendar, and Gmail using Firebase Auth as the single user login flow

This is an update to the integration we announced in March, which included support for vibe-coded full-stack app deployments from AI Studio powered by Cloud Run, Firestore, and Firebase Auth.

With this expanded integration, you can use …

6 days, 1 hour ago @ cloud.google.com
The top announcements for startups from Google I/O ‘26

Many of the world’s fastest-growing AI startups are choosing to build their future — and the world’s — on Google Cloud because of our complete and open AI stack.

At Google Cloud Next ‘26, we focused on providing updates to each layer of the stack to help lean teams move faster and operate more efficiently.

We are bridging the gap between local prototyping and cloud scale, giving your team a straightforward path to build, test, and distribute your applications.

Let’s look at the key announcements for startups from Google I/O and how you can apply them to your business.

What it means for startups: You now have direct access to build with new state-of-the-art models from DeepMind.

6 days, 1 hour ago @ cloud.google.com
Benchmark and optimize LLMs on-device with AI Edge Portal

LLMs have become more powerful at smaller sizes, but deploying them to edge devices like smartphones remains a massive challenge.

Google AI Edge Portal helps solve these challenges.

Today, we are excited to announce two new capabilities that expand Google AI Edge Portal’s capabilities for the generative AI era: benchmarking and debugging on-device LLMs.

These new services give developers what they need to optimize generative AI performance accurately and efficiently across the entire Android ecosystem.

With the latest release of Google AI Edge Portal, you can now run automated gen AI benchmarks directly on a physical lab of over 120 diverse Android devices and test for these scenarios speci…

1 week ago @ cloud.google.com
Introducing Agent Executor, Google’s distributed Agent Runtime

Bring your own harness and agents: Agent Executor is designed to be harness-agnostic, allowing you to bring your own or use those made available by other vendors.

To solve this problem, we’ve partnered with the Google Kubernetes Engine team on Agent Substrate, a new open-source project also announced today.

Together with Agent Executor, Agent Substrate can provide a foundation for today’s largest agent deployments.

Stay within the Kubernetes ecosystem: Agent Substrate is built on top of Kubernetes and allows scheduling and horizontal scaling of compute with declarative configuration.

In the demo below, we showcase using Agent Executor together with Agent Substrate with a sample workload.

1 week ago @ cloud.google.com
Agent Sandbox on GKE is now available for everyone, and a first look at Agent Substrate

Instead of wasting valuable compute to keep the agent running, GKE Agent Sandbox integrates with Pod Snapshots to suspend your idle agent workloads and resume them in seconds upon request.

GKE Agent Sandbox introduces a Sandbox API with an integrated warm pool.

Cost-effective warm pool: GKE Agent Sandbox warm pools keep pre-provisioned replicas ready to minimize sandbox startup latency.

GKE Agent Sandbox delivers up to 30% better price-performance when running on Axion processors than comparable hyperscaler cloud providers.

It provides the perfect foundation for Agents, Agent Harnesses and Agent Runtimes, including the new Agent Executor project.

1 week ago @ cloud.google.com
Everything Google Cloud customers need to know coming out of Google I/O

We’re using Antigravity every day to build at Google, and we’ve already seen Cloud customers and partners using it, too:

“In modern enterprise architecture, the real bottleneck is the cognitive load on the engineer.

Today, more than half of our production-ready code is generated through these agentic workflows, proving that the future of software isn't just AI-assisted—it’s AI-driven."

- Faruk Muratovic, US AI & Engineering Strategy and Services Leader at Deloitte

"Antigravity has fundamentally shifted our team from manual coding to high-level orchestration.

By using Google Antigravity to run engineering pipelines in the background, our teams eliminate traditional development friction, allowi…

1 week ago @ cloud.google.com
The future of agentic development: Redefining the data practitioner lifecycle with Data Agent Kit

By seamlessly bringing together these core tools and skills with your enterprise data, the Data Agent Kit effectively serves as a comprehensive harness for agentic context, memory, and personalization.

It provides:

Agentic skills: Pre-codified pathways for interacting with your data estate, covering query optimization, ML best practices, data validation, data drift checks, governance, and troubleshooting.

Meanwhile, data practitioners often lack guidance about how to efficiently query, optimize, and troubleshoot cloud data, while specialized, fragmented development environments cannot see across your entire data estate.

Data Agent Kit helps with these challenges and others, providing the fou…

1 week ago @ cloud.google.com
What Google I/O '26 means for developing agents on Google Cloud

At Google I/O, we introduced a unified development toolkit featuring Antigravity 2.0 and the Managed Agents API, giving developers better ways to build locally and deploy securely to the cloud on a shared protocol layer.

In this blog, we’re going to show you how Gemini Enterprise Agent Platform and the new developer tools shared at I/O fit together, unpack the spectrum of choice for building, and share what we’d actually try first.

Following the evolution of Vertex AI into the Gemini Enterprise Agent Platform – a comprehensive platform to build, scale, govern, and optimize agents with new features like session memory and centralized governance – we are now extending these capabilities direc…

1 week ago @ cloud.google.com
Gemini Live Agent Challenge: Announcing the winners and highlights

The Gemini Live Agent Challenge is officially in the books!

We challenged developers worldwide to break out of the traditional 'text box' paradigm by building next-generation AI agents.

Participants pushed the boundaries of interactive AI across three distinct categories: The Live Agent, The Creative Storyteller, and The UI Navigator.

Two of these standout developers were even recognized in person at Google Cloud Next 2026.

Celebrating our category winners at Google Cloud Next ‘26

Category winners Jeremiah Somoine and Bryen Param were invited to attend Google Cloud Next 2026 in Las Vegas, where they shared their experiences and insights with the broader developer community.

1 week, 5 days ago @ cloud.google.com
Cloud CISO Perspectives: How Google + Wiz changes multicloud strategy for CISOs

Through innovations like Wiz Code, developers get granular data linking production issues directly back to their repositories, empowering them to fix vulnerabilities right where the code is written.

Supercharging the agentic SOC future with data and automation

Data is the lifeblood of AI and cloud security.

Wiz currently sits on a trove of sanitized data that captures the characteristics of highly secure, resilient, and compliant multicloud environments.

Wiz’s Red, Blue, and Green agents, and Google Security Operations’ Threat Hunting, Detection Engineering, and Third-Party Context agents, can help you develop the human-above-the-loop approach that empowers security teams to rapidly scale up…

1 week, 6 days ago @ cloud.google.com
The power of LLMs on your data, more than two orders of magnitude faster and cheaper

Proxy models are cost-optimized ultra-lightweight models tailored to a specific query (aka prompt) and tuned for your data.

The fundamental ideas behind proxy models were proposed in Universal Query Engine (UQE) at NeurIPS 2024 by Google DeepMind.

Furthermore, the proxy models run fast in the CPU — no need for dedicated hardware.

We hope that we gave you good intuitions for why proxy models work.

How Proxy Models Work?

2 weeks ago @ cloud.google.com
How Glance turns hours of video into mobile-ready clips with AI

Every day, thousands of hours of new video content sits waiting to be discovered.

Most of it lives in long-form, horizontal formats, while audiences are scrolling through vertical feeds on their phones.

Here’s how Glance’s video generation solution works.

Building for the lock screen era

The goal was to create a complete pipeline that takes a long-form landscape video (16:9) and outputs multiple ready-to-publish short-form portrait videos (9:16).

This includes distinguishing between a static image and a live person to ensure the crop focuses on the actual speaker.
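
As a rough illustration of the landscape-to-portrait arithmetic, the sketch below computes the tallest 9:16 window that fits inside a frame and centers it horizontally on a speaker. The function name `portrait_crop` and the `subject_x` parameter are assumptions made for this example; the excerpt does not describe how Glance's pipeline actually computes its crops.

```python
def portrait_crop(width, height, subject_x, aspect_w=9, aspect_h=16):
    """Return (left, top, crop_w, crop_h) for the largest aspect_w:aspect_h
    window inside a width x height frame, centered on subject_x if possible."""
    # Start from the full frame height and derive the matching width.
    crop_h = height
    crop_w = (height * aspect_w) // aspect_h
    if crop_w > width:
        # The frame is narrower than the target ratio: limit by width instead.
        crop_w = width
        crop_h = (width * aspect_h) // aspect_w
    # Center on the subject, then clamp so the window stays inside the frame.
    left = subject_x - crop_w // 2
    left = max(0, min(left, width - crop_w))
    top = (height - crop_h) // 2
    return left, top, crop_w, crop_h


# Hypothetical 1920x1080 frame with the speaker detected at x = 960.
print(portrait_crop(1920, 1080, 960))
```

In this sketch a speaker near the edge of the frame still gets a valid window, because the left offset is clamped rather than allowed to run past the frame boundary.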

2 weeks ago @ cloud.google.com
Beyond source code: The files AI coding agents trust — and attackers exploit

Attack surface: What executes

Just as developers rely on project configuration to automate setup, debugging, and routine tasks, AI coding agents and modern developer tools also inherit execution paths from repository files.

Attack surface: What instructs

AI coding agents also consume persistent instruction files that shape how they behave inside a project.

These files can influence what the agent prioritizes, what it ignores, which tools it uses, which files it trusts, and which actions it takes automatically.

Reusing them across repositories introduces a supply-chain risk, because malicious instructions can be presented as harmless guidance while steering otherwise legitimate agent workflows…

2 weeks, 1 day ago @ cloud.google.com
OpenAI
last post None
Microsoft
last post 1 hour ago
Extending Human Intelligence Through AI

At a glance

Modern AI systems are powerful not because they replicate human intelligence, but because they presuppose it, by extending structures already present in human cognition and language.

Understanding AI as an extension of human intelligence—not a replacement for it—offers a more grounded path for building trustworthy AI systems.

Rather than asking whether AI systems are becoming intelligent in the human sense, these approaches ask a more basic question: What if AI systems work because they rely on structures that are rooted in human cognition?

In our recent paper, The Origins of Artificial Intelligence in Natural Intelligence, we argue that modern AI systems are best understood nei…

1 hour ago @ microsoft.com
MagenticLite, MagenticBrain, Fara1.5: An agentic experience optimized for small models

Built as the next generation of Magentic-UI, it combines a redesigned app with a harness optimized for small models.

MagenticBrain and Fara1.5 are small models designed for orchestration and computer-use tasks, respectively.

Together, these releases explore how far agentic performance can be pushed with smaller models, codesigned tools, and an optimized execution harness.

Today, Microsoft Research AI Frontiers releases MagenticLite, an experimental agentic application designed for small models.

The result is an agent that runs efficiently, keeps data on the user’s machine, and supports a broad range of agentic tasks.

6 days ago @ microsoft.com
Vega: Zero-knowledge proofs for digital identity in the age of AI

Vega puts these building blocks together into a single proof system.

The hashing problem, and how folding solves it

A credential proof must do two expensive things: hash the credential bytes with SHA-256 and verify the issuer’s digital signature.

Making it zero-knowledge, cheaply

A proof system needs to be zero-knowledge: the verifier should learn nothing beyond the claim being proved.

Device binding

A zero-knowledge credential proof is only useful if it is tied to the person holding the credential.

The proof system powering Vega is already available as the open-source spartan2 project on GitHub.

6 days, 3 hours ago @ microsoft.com
Further Notes on Our Recent Research on AI Delegation and Long-Horizon Reliability

Our recent paper, “LLMs Corrupt Your Documents When You Delegate”, has generated discussion about the reliability of AI systems in delegated workflows.

Using a controlled evaluation methodology, we examine how well information is preserved across these extended workflows.

We use chained transformation-and-inversion tasks that evaluate whether semantic content is preserved accurately across extended delegated workflows.
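
A chained transformation-and-inversion check can be sketched roughly as follows: apply a transformation, apply its inverse, repeat, and track how far the result drifts from the original. Everything here is an assumption for illustration. The function name `round_trip_fidelity` is hypothetical, the string functions stand in for delegated model calls, and character-level similarity from `difflib` is only a crude proxy for the semantic preservation the paper evaluates.

```python
from difflib import SequenceMatcher


def round_trip_fidelity(document, transform, invert, rounds):
    """Apply transform then invert repeatedly; return the similarity to the
    original document after each round (1.0 means fully preserved)."""
    current = document
    scores = []
    for _ in range(rounds):
        current = invert(transform(current))
        scores.append(SequenceMatcher(None, document, current).ratio())
    return scores


doc = "the quick brown fox"

# A lossless pair: upper-casing and lower-casing an all-lowercase string.
print(round_trip_fidelity(doc, str.upper, str.lower, 3))

# A lossy stand-in: each round silently drops the final character.
print(round_trip_fidelity(doc, lambda s: s[:-1], lambda s: s, 3))
```

The lossy pair shows why chaining matters: a small loss per round compounds, so the similarity falls with every additional round.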

At the same time, the findings should not be interpreted as evidence that AI systems lack practical value in real-world work today.

1 week, 4 days ago @ microsoft.com
mimalloc: A new, high-performance, scalable memory allocator for the modern era

mimalloc is an open-source, modern, scalable memory allocator that is a drop-in replacement for malloc and free.

The mimalloc memory allocator was initially designed in 2020 as a fast allocator for the state-of-the-art Lean and Koka programming languages developed at RiSE, both of which use novel compiler-guided reference counting (see Perceus).

    ja    .LBB0_generic
    leaq  7(%rsi), %rax            ; round to sizeof(void*)
    andq  $-8, %rax
    movq  232(%rdi,%rax), %rcx     ; rcx = heap->small_pages[index]
    movq  8(%rcx), %rax            ; block = rax = page->free
    testq %rax, %rax               ; block == NULL?

Thus, mimalloc has three free lists per (64 KiB) mimalloc page, and effectively that …

1 week, 6 days ago @ microsoft.com
GridSFM: A new, small foundation model for the electric grid

Microsoft releases a lightweight foundation model that can predict AC optimal power flow in milliseconds, boosting efficiency and unlocking cost savings in grid analysis.

It provides a foundation for the community to build advanced power grid simulators and planning tools without recreating data or models from scratch.

Microsoft introduces GridSFM, a small foundation model for solving AC optimal power flow (AC-OPF) problems in transmission power grids.

Power grids face increasing strain from surging demand, the need to integrate renewable energy sources, transportation electrification, and extreme weather events.

This release adds the first open AC-OPF model that supports multiple grid topo…

2 weeks ago @ microsoft.com
Advancing AI for materials with MatterSim: experimental synthesis, faster simulation, and multi-task models

Now we have experimentally synthesized it and measured its thermal conductivity (152 W/m/K) to be close to the thermal conductivity of silicon.

Faster simulation: We have accelerated MatterSim-v1 model inference by 3-5x and integrated it with the LAMMPS software package, enabling large-scale simulations across multiple GPUs.

These include experimental validation of MatterSim predictions for thermal conductors, performance improvements for faster simulation, and the introduction of a new multi-task foundation model for materials characterization.

Le…

2 weeks, 1 day ago @ microsoft.com
SocialReasoning-Bench: Measuring whether AI agents act in users’ best interests

In our simulated multi-agent marketplace, agents accepted the first proposal they received up to 93% of the time without exploring alternatives.

Introducing SocialReasoning-Bench

Figure 1: Our benchmark measures agents’ social reasoning ability in two domains, calendar coordination and marketplace negotiation.

Finding 3: Outcome optimality shows how much value agents leave on the table.

In calendar, agents perform better but still settle below the midpoint on average.

Finally, Outcome Optimality works well in settings with clear boundaries, where a “good” outcome can be defined and measured.

2 weeks, 1 day ago @ microsoft.com
Building realistic electric transmission grid dataset at scale: a pipeline from open dataset

The ability to study transmission-level power grid behavior is essential for modern power systems research.

In most of the world, including the United States, realistic transmission-level grid data is classified as critical infrastructure information and subject to strict access controls.

These restrictions exist for good reasons, but the resulting lack of realistic grid models is increasingly exacerbating the challenges power systems face.

In this work, we introduce an open-data-derived pipeline for constructing large-scale, transmission-level power grid models that realistically approximate existing networks without relying on proprietary or restricted datasets.

Using only publicly access…

2 weeks, 4 days ago @ microsoft.com
Microsoft at NSDI 2026: Advances in large-scale networked systems

The USENIX Symposium on Networked Systems Design and Implementation 2026 (NSDI ’26) is a leading forum where researchers and practitioners share new research, insights, and advances in the design and operation of these systems.

Microsoft is proud to support NSDI ’26 as a returning sponsor, reflecting our ongoing commitment to advancing systems and networking research and engaging with the broader community.

Together, they highlight advances in building and operating large-scale networked systems.

Wednesday, May 6, 9:00–10:20 AM

Yuxuan Yan, Zhejiang …

3 weeks, 1 day ago @ microsoft.com
Red-teaming a network of agents: Understanding what breaks when AI agents interact at scale

Actions that seem harmless can cascade, causing a chain reaction across an agent network.

Invisibility: Information can pass through chains of unaware agents, making the source of an attack hard to trace from any single agent’s perspective.

Each independently contacted a victim agent (Bob) about the same fabricated audit, using varied language and staggered timing to appear unrelated.

Experimental setup: A principal entrusts their agent, Bob, with sensitive personal data: disability accommodation, medical schedule, preferred pharmacy, emergency contact.

Agents relayed summaries of other agents’ private messages to the attacker (one forwarded another agent’s message within seconds), and agent…

3 weeks, 5 days ago @ microsoft.com
AutoAdapt: Automated domain adaptation for large language models

At a glance

Problem: Adapting large language models to specialized, high-stakes domains is slow, expensive, and hard to reproduce.

Why it matters: The result is faster, automated, more reliable domain adaptation that turns weeks of manual iteration into repeatable pipelines.

Deploying large language models (LLMs) in real-world, high-stakes settings is harder than it should be.

In our paper, “AutoAdapt: An Automated Domain Adaptation Framework for Large Language Models,” we describe an end-to-end, constraint-aware framework for domain adaptation.

1 month ago @ microsoft.com
Can we AI our way to a more sustainable world?

Because I do think there’s a role for AI, a huge role for AI.

BURGER: Right, right.

So I think that’s also something quite important here that, you know, AI can help facilitate.

And I think that’s not just applying AI to solve solutions through optimization but also thinking about this in an integrated way.

1 month, 1 week ago @ microsoft.com
New Future of Work: AI is driving rapid change, uneven benefits

Publication: New Future of Work Report 2025

The New Future of Work report brings together research from inside and outside of Microsoft to understand what is happening as AI enters workplaces.

But usage and confidence vary widely across sectors, and men report using AI at work more often than women.

AI systems are increasingly playing a role in decision-making, creativity, and communication, with AI systems being positioned as a “collaborator.” This raises questions about how to support “collaboration” between people and AI, what we can learn from how people interact with each other, and where the capabilities of AI systems raise different opportunities and create different requirements.

Usin…

1 month, 2 weeks ago @ microsoft.com
Ideas: Steering AI toward the work future we want

JANSSEN: Yeah, yeah, exactly.

TEEVAN: Yeah, yeah, yeah.

I’m curious what you have found particularly surprising about how people and organizations are leveraging AI right now.

And so I do like to picture a future of work where humans are flourishing with AI and where humans still get to do meaningful work.

And I’m very curious about how we can take advantage of AI and do more without running ourselves into the ground because we’re not AI, right?

1 month, 2 weeks ago @ microsoft.com
MIT AI
last post 6 days, 13 hours ago
Technology usually creates jobs for young, skilled workers. Will AI do the same?

At any given time, technology does two things to employment: It replaces traditional jobs, and it creates new lines of work.

“We had never before seen exactly who is doing new work,” Autor says.

“Eroding tasks is not the same thing as eroding jobs, since many jobs involve a lot of tasks.

New work, Autor observes, is always tied to new forms of expertise.

Back to AI for a minute

Studying who gets new jobs led the scholars to striking conclusions about how new work is created.

6 days, 13 hours ago @ news.mit.edu
Building AI models that understand chemical principles

Among all of the possible chemical compounds, it’s estimated that between 10^20 and 10^60 may hold potential as small-molecule drugs.

His research straddles the line between chemical engineering and computer science, as he develops and deploys computational models to analyze vast numbers of possible chemical compounds, design new compounds, and predict reaction pathways that could generate those compounds.

After graduating from Caltech, he decided to keep going in chemical engineering and came to MIT in 2014 to start a PhD.

His work focused on combining machine learning and cheminformatics — the application of computation methods to analyze chemical data — to plan reaction pathways that could…

1 week ago @ news.mit.edu
Justin Solomon appointed associate dean of engineering education

Justin Solomon, associate professor in the MIT Department of Electrical Engineering and Computer Science (EECS), has been appointed associate dean of engineering education in the MIT School of Engineering, effective July 1.

In this new role, Solomon will focus on advancing innovation in engineering education across the school.

He is the author of “Numerical Algorithms,” a textbook that presents a modern approach to numerical analysis for computer science students.

Solomon is a principal investigator at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), where he leads the Geometric Data Processing Group.

Solomon joined the MIT faculty in 2016.

1 week ago @ news.mit.edu
Two from MIT named 2026 Knight-Hennessy Scholars

MIT master’s student Sunshine Jiang ’25 and Rupert Li ’24 are recipients of this year’s Knight-Hennessy Scholarship.

Rupert Li ’24

Rupert Li, from Portland, Oregon, is currently pursuing a PhD in mathematics at Stanford School of Humanities and Sciences.

He graduated from MIT in 2024 with a bachelor’s degree, double majoring in mathematics and computer science, economics, and data science.

Along with his bachelor’s degree, he also received a master’s degree in data science.

In addition to the Knight-Hennessy Scholarship and the Marshall Scholarship, he has been awarded the Hertz Fellowship, P.D.

1 week, 5 days ago @ news.mit.edu
Q&A: Expanding MIT’s global reach through Universal Learning

MIT's Universal Learning is a new initiative from MIT Open Learning designed to prepare learners everywhere to tackle complex global challenges through boundary-crossing thinking.

Universal AI, the first offering from Universal Learning, launched to the public today.

Dimitris Bertsimas, vice provost for open learning, and Megan Mitchell, senior director of Universal Learning, share how Universal Learning supports MIT’s educational mission, and what makes it distinctive.

My colleagues contributing to current and forthcoming Universal Learning offerings share this same passion.

That’s why with Universal Learning we are prioritizing asynchronous delivery, mobile delivery, translations, and are…

2 weeks, 1 day ago @ news.mit.edu
Universal AI is “a pathway to AI fluency that’s accessible and approachable to anyone, anywhere”

“Universal AI was built to thread that needle.

The result is a pathway to AI fluency that’s approachable to anyone, anywhere.”

The need for accessible, practical AI education has never been greater.

Universal AI also includes industry-specific courses that dive into the intersection of AI and health care, sustainability, entrepreneurship, transportation, and more.

Six industry-specific courses are available today, including Holistic AI in Medicine, AI and Entrepreneurship, and AI and Sustainability: Energy.

“MIT’s long history of making knowledge available through MIT Open Learning means it’s only natural we’d feel compelled to bring Universal AI to the world,” adds Kornbluth.

2 weeks, 1 day ago @ news.mit.edu
Study: Firms often use automation to control certain workers’ wages

Rather than implement automation in pursuit of maximal productivity, firms have often used automation to replace employees who specifically receive a “wage premium,” earning higher salaries than other comparable workers.

For one thing, automation has affected the growth in U.S. income inequality even more than many observers realize.

This inefficient targeting of certain employees has offset 60-90 percent of the productivity gains from automation during the time period.

Inequality implications

Dating back to the 2010s, Acemoglu and Restrepo have combined to conduct many studies about automation and its effects on employment, wages, productivity, and firm growth.

Certain types of automation c…

2 weeks, 6 days ago @ news.mit.edu
Games people — and machines — play: Untangling strategic reasoning to advance AI

Nevertheless, one month after graduating with his undergraduate degree, Farina began a doctoral degree in computer science at Carnegie Mellon University.

As he was finishing his doctorate, Farina worked for a year as a research scientist in Meta’s Fundamental AI Research Labs.

An everyday example occurs in the game of poker, where players bluff in order to conceal information about their cards.

Stratego is a military strategy game that has inspired research efforts costing millions of dollars to produce systems capable of beating human players.

I am excited about seeing these algorithms incorporated into the broader AI revolution that’s happening around us.”

3 weeks ago @ news.mit.edu
Improving understanding with language

She learned French from her relationships with Haitian family friends, and American Sign Language because of another friend’s deaf sibling.

“There are so many things that are different about sign language and spoken language,” she says.

“It’s the only reason I’m on the path I’ve chosen,” she continues, one that features a focus on language acquisition, education policy, LLMs’ computational possibilities and limitations, and education reform.

Language is a medium for thought and provides guardrails to improve understanding.

“Support research,” Honeycutt says.

3 weeks, 5 days ago @ news.mit.edu
Beacon Biosignals is mapping the brain during sleep

Beacon Biosignals is working to make sense of the brain by monitoring its activity while people sleep.

With each deployment, Beacon learns more about how the brain works — insights it is using to create a “foundation model” of the brain.

“It was clear sleep was the right window to understand the brain,” Donoghue says.

“What’s powerful is that we’re building a longitudinal record of brain function over time,” Donoghue says.

That turns routine testing into a foundation for entirely new prognostic biomarkers — and a path to detecting and intervening in brain disease earlier, potentially before symptoms ever begin.”

3 weeks, 5 days ago @ news.mit.edu
Making the case for curiosity-driven science

Kornbluth spoke about everything from the importance of curiosity-driven science and why basic science is critical to our nation’s future, to AI and education, and even bravely joined O’Leary in a rendition of the Williams College song, “The Mountains,” in honor of their shared alma mater.

“We are in this time of incredible uncertainty,” said Kornbluth of the current state of higher education and funding for scientific research.

Behind the scenes, I am – along with many other [university] presidents – I am in D.C. all the time now.

Universities are where most of the science with a long pathway to impact, requiring patience, starts.

With that pipeline being drained, what does the future hold…

3 weeks, 6 days ago @ news.mit.edu
Solving the “Whac-a-mole dilemma”: A smarter way to debias AI vision models

Perhaps one of the best known and most persistent challenges that AI research continues to reckon with is bias.

Bias is often discussed in relation to training data, but model architecture can also contain and amplify bias, negatively influencing model performance in real-world settings.

VLMs are multi-modal models that can understand and interpret different data modalities like video, image, and text simultaneously.

And like projection debiasing, WRING is a post-processing approach, which means it can be applied “on the fly” to a pre-trained VLM.

“Extending this for ChatGPT-style, generative language models, is the reasonable next step for us,” says Gerych.

3 weeks, 6 days ago @ news.mit.edu
The MIT-IBM Computing Research Lab launches to shape the future of AI and quantum computing

IBM and MIT today announced the launch of the MIT-IBM Computing Research Lab, advancing their long-standing collaboration to shape the next era of computing.

The MIT-IBM Computing Research Lab builds on a distinguished history of scientific excellence at the intersection of research and academia.

“We expect the MIT-IBM Computing Research Lab to emerge as one of the world’s premier academic and industrial hubs accelerating the future of computing,” says Jay Gambetta, director of IBM Research and IBM Fellow, and IBM chair of the MIT-IBM Computing Research Lab.

The MIT-IBM Computing Research Lab will also leverage IBM’s longtime leadership and expertise in quantum computing.

Deep integration w…

4 weeks ago @ news.mit.edu
Enabling privacy-preserving AI training on everyday devices

A new method developed by MIT researchers can accelerate a privacy-preserving artificial intelligence training method by about 81 percent.

This advance could enable a wider array of resource-constrained edge devices, like sensors and smartwatches, to deploy more accurate AI models while keeping user data secure.

Each device trains the model using its local data and then transfers model updates back to the server.

This new approach could make it more feasible for AI models to be used in high-stakes applications with strict security and privacy standards, like health care and finance.

The central server usually waits to receive model updates from all devices, then averages them to complete th…
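The round described above (each device trains locally, sends its update, and the server averages the updates) can be sketched roughly as follows. This is only an illustration of plain synchronous averaging, not the accelerated method from the MIT work; the function name `federated_average`, the flat-list update format, and the weighting by local sample count are all assumptions made for the example.

```python
from typing import List


def federated_average(updates: List[List[float]], weights: List[float]) -> List[float]:
    """Weighted average of per-device model updates (one flat list per device)."""
    if not updates or len(updates) != len(weights):
        raise ValueError("need exactly one weight per update")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(updates[0])
    if any(len(u) != dim for u in updates):
        raise ValueError("all updates must have the same length")
    # Average each parameter position across devices.
    return [
        sum(w * u[i] for u, w in zip(updates, weights)) / total
        for i in range(dim)
    ]


# Three hypothetical devices, weighted by an assumed local sample count.
device_updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sample_counts = [1.0, 1.0, 2.0]
averaged = federated_average(device_updates, sample_counts)
```

Because the server in this sketch needs every update before it can average, the slowest device sets the pace of each round, which is the kind of waiting the excerpt describes.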

4 weeks ago @ news.mit.edu
A faster way to estimate AI power consumption

Improving data center energy efficiency is one way scientists are striving to make AI more sustainable.

In addition, this tool could allow algorithm developers and model providers to assess potential energy consumption of a new model before they deploy it.

The power consumption of a particular GPU will vary based on its configuration and the workload it is handling.

They could use these patterns to generate the information needed for reliable but quick power estimation.

The user can also change the GPU configuration or adjust the operating speed to see how such design choices impact the overall power consumption.

1 month ago @ news.mit.edu
Berkeley AI
latest post 2 weeks, 5 days ago
Adaptive Parallel Reasoning: The Next Paradigm in Efficient Inference Scaling

Overview of adaptive parallel reasoning.

We provide a detailed analysis of recent progress in the field of parallel reasoning, especially Adaptive Parallel Reasoning.

Figure 4: Special Tokens Variants across Adaptive Parallel Reasoning Papers

Inference Systems for Adaptive Parallelism

How do we actually execute parallel branches?

Figure 14: Difference in Model Choice Across Adaptive Parallel Reasoning Papers

Each paper also offers a slightly different interpretation about how adaptive parallel reasoning contributes to the research field.

(Yang et al., 2025; Lian et al., 2025) aim to deliver sequential-AR-model-level a…

2 weeks, 5 days ago @ bair.berkeley.edu
Gradient-based Planning for World Models at Longer Horizons

Large, learned world models are becoming increasingly capable.

Why is adversarial robustness an issue for world model planning?

We thus exploit the differentiability of learned world models $F_{\theta}$, while not falling victim to the inherent sensitivity of the state Jacobians $D_s F_{\theta}$.

It’s a funny sweet spot where the background literature (planning and control overall) is incredibly mature and well-developed, but the current setting (pure planning optimization over modern, large-scale world models) is still heavily underexplored.

But, once we figure out all the right ideas, world model planners will likely become as commonplace as RL.

1 month, 1 week ago @ bair.berkeley.edu
Identifying Interactions at Scale for LLMs

Understanding the behavior of complex machine learning systems, particularly Large Language Models (LLMs), is a critical challenge in modern artificial intelligence.

Therefore, grounded or reality-checked interpretability methods must also be able to capture these influential interactions.

In this blog post, we describe the fundamental ideas behind SPEX and ProxySPEX, algorithms capable of identifying these critical interactions at scale.

SPEX and ProxySPEX Framework

To discover influential interactions with a tractable number of ablations, we have developed SPEX (Spectral Explainer).

We formalize this through two observations: sparsity (relatively f…

2 months, 2 weeks ago @ bair.berkeley.edu
Information-Driven Design of Imaging Systems

We developed a framework that enables direct evaluation and optimization of imaging systems based on their information content.

The first approach treated imaging systems as unconstrained communication channels, ignoring the physical limitations of lenses and sensors.

Our Information-Driven Encoder Analysis Learning (IDEAL) method uses gradient ascent on information estimates to optimize imaging system parameters.

The standard approach to computational imaging design, end-to-end optimization, jointly trains the imaging hardware and a neural network decoder.

The computational efficiency of IDEAL suggests possibilities for designing imaging systems that were previously intractable.

4 months, 2 weeks ago @ bair.berkeley.edu
RL without TD learning

In this post, I’ll introduce a reinforcement learning (RL) algorithm based on an “alternative” paradigm: divide and conquer.

We can do Reinforcement Learning (RL) based on divide and conquer, instead of temporal difference (TD) learning.

There are two classes of algorithms in RL: on-policy RL and off-policy RL.

We compared TRL with $n$-step TD learning with different values of $n$, from $1$ (pure TD) to $\infty$ (pure MC).

I still think one of the most important problems in RL (and even in machine learning) is to find a scalable off-policy RL algorithm.
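The $n$-step TD baseline mentioned above interpolates between one-step bootstrapping ($n = 1$) and the full Monte Carlo return ($n = \infty$). The sketch below shows only that standard target computation, not the post's divide-and-conquer algorithm; the function name `n_step_target`, the list-based trajectory layout, and the use of `None` for $n = \infty$ are assumptions made for the example.

```python
from typing import List, Optional


def n_step_target(rewards: List[float], values: List[float],
                  t: int, n: Optional[int], gamma: float) -> float:
    """n-step TD target for step t of one finished episode.

    rewards[k] is the reward received after step k.
    values[k] is the current value estimate of the state at step k.
    n=None means "never bootstrap", i.e. the Monte Carlo return.
    """
    T = len(rewards)
    end = T if n is None else min(t + n, T)
    # Discounted sum of the real rewards inside the n-step window.
    target = 0.0
    for k in range(t, end):
        target += (gamma ** (k - t)) * rewards[k]
    # Bootstrap from the value estimate unless the episode ended first.
    if end < T:
        target += (gamma ** (end - t)) * values[end]
    return target


rewards = [1.0, 1.0, 1.0]
values = [0.5, 0.5, 0.5]
one_step = n_step_target(rewards, values, t=0, n=1, gamma=0.5)
two_step = n_step_target(rewards, values, t=0, n=2, gamma=0.5)
monte_carlo = n_step_target(rewards, values, t=0, n=None, gamma=0.5)
```

Larger `n` leans more on observed rewards and less on the bootstrapped value estimate.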

6 months, 3 weeks ago @ bair.berkeley.edu
What exactly does word2vec learn?

What exactly does word2vec learn, and how?

In this framing, it’s clear that word2vec is a minimal neural language model.

As a result, the theory predicts exactly what features are learned in terms of the corpus statistics and the algorithmic hyperparameters.

We find that over the course of learning, word2vec builds these linear representations in a sequence of noisy learning steps, and their geometry is well-described by a spiked random matrix model.

8 months, 4 weeks ago @ bair.berkeley.edu
AWS Machine Learning
latest post 23 hours ago
Technical deep dive: AgentCore payments and innovation in agentic commerce

In this post, we walk you through a technical deep dive of AgentCore payments.

The following diagram illustrates the preview capabilities of AgentCore payments and how it interacts with related AgentCore services.

The next section takes a technical deep dive into why agentic payments are uniquely challenging, and how AgentCore payments addresses each of these challenges.

Figure 2: Secure credential storage for AgentCore payments and AgentCore Identity.

To address this, AgentCore payments supports payment orchestration, a core engine purpose-built to power the complexities of agentic payments.

23 hours ago @ aws.amazon.com
Build highly scalable serverless LangGraph multi-agent systems in AWS with Amazon Bedrock AgentCore

In this post, we provide a solution to build highly scalable, serverless multi-agent generative AI systems on AWS using LangGraph Agents as orchestrators integrated with Amazon Bedrock AgentCore Memory and Amazon Bedrock AgentCore Observability.

Our approach for building highly scalable serverless multi-agent orchestrations combines serverless technologies such as AWS Lambda and AWS Step Functions.

These services can be used by developers to build LangGraph agents that scale automatically, respond to events in real time, and remove infrastructure management.

Integrated memory services from AgentCore Memory enable agents to maintain short-term conversational context and long-term knowledge a…

23 hours ago @ aws.amazon.com
Build high-performance generative AI systems with Strands Agents, NVIDIA NIM, and Amazon Bedrock AgentCore

Building high-performance generative AI agents requires architecture that can deliver fast inference, coordinate multiple agents, and operate reliably under production workloads.

Amazon Bedrock AgentCore provides a managed runtime, shared memory, and built-in observability, and Strands Agents provides serverless multi-agent orchestration.

You package the Strands orchestrator and specialized agents together as a Docker container and deploy them into Amazon Bedrock AgentCore Runtime.

You also use Amazon Bedrock AgentCore Memory for shared context across agent invocations and to provide support for multi-turn conversations.

The following architecture diagram shows how NVIDIA NIM, Strands Agents, an…

23 hours ago @ aws.amazon.com
AgentWatch: Proactive AWS monitoring with ambient agents

AgentWatch delivers ambient AWS resource monitoring for your DevOps team, moving beyond the reactive cycle of managing Amazon CloudWatch alarms across multiple accounts.

This is where AgentWatch, an ambient AWS resource monitoring agent, offers a different approach to infrastructure oversight.

Now that you understand the ambient agent concept, let’s explore how AgentWatch implements these principles for AWS infrastructure monitoring.

The AgentWatch monitoring cycle begins with Amazon EventBridge triggering an AWS Lambda function every 15 minutes through a cron-based rule.

The architecture follows a hybrid ambient model with both scheduled monitoring and on-demand interaction capabilities. Us…

23 hours ago @ aws.amazon.com
From idea to AI app: Creating intelligent research assistants with Strands

For more information about Strands Agents, see Introducing Strands Agents, an Open Source AI Agents Software Development Kit (SDK).

Getting started with Strands Agents: You will start by building a straightforward Q&A style research assistant using Strands Agents.

    from strands import Agent

    # Create an agent with default settings
    agent = Agent()

    # Ask the agent a question
    agent("Tell me about agentic AI")

That’s it.

For the topic '{topic}': 1.

For the topic '{topic}': " f"1.

1 day ago @ aws.amazon.com
Build an enterprise observability solution for Amazon Quick

Amazon Quick is a generative AI-powered platform that brings together Spaces, Chat agents, Flows, Automate, Research, and Amazon Quick Sight business intelligence capabilities in one place.

Amazon Quick is integrated with AWS CloudTrail, which provides a record of actions taken by a user, a role, or an AWS service in Amazon Quick.

Figure 1: Amazon Quick enterprise observability solution architecture

The workflow consists of the following steps:

Business users interact with Amazon Quick.

"cloudtrail_events" ;

Deploy Quick Sight dashboard

Deploy the Quick Sight dashboard:

    python3 deploy.py --dashboard

This deploys Quick Sight resources: a custom theme, a data source, datasets with daily refresh sch…

1 day, 1 hour ago @ aws.amazon.com
Transforming professional work: How Amazon Quick turns document creation from hours into minutes

In this post, we explore how the Amazon Quick document and visualization creation capabilities work, what you can build with them, and how professionals across roles are using them to reclaim hours of their workweek.

It pulls live data from Amazon Quick Sight dashboards, Amazon Simple Storage Service (Amazon S3) data lakes, Amazon Redshift warehouses, and Amazon Relational Database Service (Amazon RDS) databases.

Rather than coming to Quick, Quick comes to you.

Document and visual creation is available in all AWS Regions where Amazon Quick is generally available.

For more information, visit the Amazon Quick detail page and the Amazon Quick documentation.

1 day, 1 hour ago @ aws.amazon.com
Amazon Nova Act is now HIPAA eligible

Amazon Nova Act helps you automate real-world browser tasks that previously required manual effort.

Things to know

HIPAA eligibility – Amazon Nova Act is included in the HIPAA Eligible Services Reference list.

Integration – Nova Act works with the Strands Agents framework and integrates with Amazon Bedrock AgentCore, Amazon CloudWatch, and IAM.

5 days, 18 hours ago @ aws.amazon.com
Intelligent radiology workflow optimization with AI agents

In this post, we’ll show how to build a radiology workflow optimization solution with AI agents on Amazon Bedrock AgentCore and the Strands Agents SDK.

Radiology Partners recognizes this as a mission-critical workflow capability and is partnering with AWS to adopt Agentic AI for intelligent workflow optimization.

In your radiology workflow optimization, a network of specialized AI agents collaborates to orchestrate complex clinical workflows from start to finish.

Similarly, the clinical data MCP tool can find the patient’s historical clinical data for the patient history synthesizer agent.

Cleanup

The code and instructions to set up and clean up this solution are available in the Intelligent radiology …

5 days, 22 hours ago @ aws.amazon.com
Integrating AWS API MCP Server with Amazon Quick using Amazon Bedrock AgentCore Runtime

The following diagram shows how Amazon Bedrock AgentCore Runtime connects Amazon Quick to AWS services through the AWS API MCP Server.

AgentCore Runtime authorizes and routes the request: After validating your Cognito token, AgentCore Runtime securely invokes the AWS API MCP Server running in the containerized environment.

Add the MCP protocol integration to connect Amazon Quick with your MCP server hosted on Amazon Bedrock AgentCore Runtime.

Conclusion

In this post, you learned how to connect Amazon Quick with AWS services using Amazon Bedrock AgentCore Runtime and the AWS API MCP Server.

You can also build domain-specific agents for security, cost optimization, or capacity planning, and in…

6 days ago @ aws.amazon.com
Building multi-tenant agents with Amazon Bedrock AgentCore

These include tenant isolation, tenant identity, tenant observability, data isolation, cost attribution, and noisy neighbor mitigation.

Agent Runtime Deployment: Dedicated compared to Shared

A key decision in a multi-tenant agentic architecture is how the agent runtime is provisioned relative to tenants.

Agent identity, trust, and discovery

As agentic applications interact with external agents across organizational boundaries, three foundational concerns emerge: agent identity, agent trust, and agent discovery.

Agent identity with AgentCore Identity

Amazon Bedrock AgentCore Identity implements agent identities as workload identities, a pattern well-established in cloud-native security.

We enco…

6 days ago @ aws.amazon.com
Break the context window barrier with Amazon Bedrock AgentCore

In this post, you will learn how to implement Recursive Language Models (RLM) using Amazon Bedrock AgentCore Code Interpreter and the Strands Agents SDK.

Base and Long Context approaches fail to process some inputs due to context limitations.

For real-time applications, consider whether the task truly requires processing documents beyond the model’s context window.

6 days, 1 hour ago @ aws.amazon.com
Build AI agents for business intelligence with Amazon Bedrock AgentCore

To address this challenge, OPLOG built a production-ready business intelligence (BI) system using AI agents deployed on Amazon Bedrock AgentCore.

In this post, we show you how OPLOG developed three AI agents using the Strands Agents SDK, deployed them to Amazon Bedrock AgentCore, and integrated Amazon Bedrock with Anthropic’s Claude Sonnet and Amazon Bedrock Knowledge Bases for Retrieval Augmented Generation (RAG).

Although each agent serves a distinct purpose, they share a common technical foundation built on Amazon Bedrock, Amazon Bedrock Knowledge Bases, and the Strands Agents SDK, all deployed to Amazon Bedrock AgentCore.

To learn more about Amazon Bedrock AgentCore, explore the Amazon Bedrock AgentCore Deve…

6 days, 1 hour ago @ aws.amazon.com
Build an AI-powered recruitment assistant using Amazon Bedrock

The solution uses the Amazon Bedrock Converse API with Amazon Nova Pro, AWS Lambda for processing, Amazon API Gateway for routing, Amazon DynamoDB and Amazon Simple Storage Service (Amazon S3) for data storage, and Amazon Bedrock Guardrails for responsible AI evaluation.

Prerequisites

Before you begin, verify that you have:

An AWS account with appropriate permissions for Amazon Bedrock, AWS Identity and Access Management (IAM), AWS CloudFormation, Amazon API Gateway, Amazon Cognito, Amazon DynamoDB, Amazon S3, AWS Lambda, and AWS Amplify.

This removes the Lambda functions, Amazon API Gateway, Amazon DynamoDB tables, Amazon Cognito resources, and IAM roles.

6 days, 1 hour ago @ aws.amazon.com
Build AI-powered dashboard automation agents with NLP on Amazon Bedrock AgentCore

Solution overview

In this solution, we use a multi-agent architecture built with Amazon Bedrock AgentCore and the Strands framework.

Amazon Bedrock AgentCore is an agentic platform for building, deploying, and operating effective agents securely at scale, no infrastructure management needed.

For creating a new dashboard, refer to Create an Amazon Quick dashboard for more information.

2.2 Add dependencies for the project

Install the required Amazon Bedrock AgentCore libraries and development tools for your project.

Step 3: Deploy to Amazon Bedrock AgentCore Runtime

Amazon Bedrock AgentCore provides a managed environment for deploying Strands Agents with two deployment options: container-based d…

6 days, 1 hour ago @ aws.amazon.com
NVIDIA
latest post 1 hour ago
AI Factories: The New Infrastructure of Intelligence

NVIDIA also relies on a curated ecosystem of AI software partners to build AI solutions for each enterprise’s use cases.

These AI factories can be deployed for a wide range of use cases, from agentic AI workloads to physical AI and robotics.

It’s a practical proof point: AI factories can transform how companies build, design and operate.

AI factories can start small to support one business unit or workload, or they may be built from the ground up to support high-performance AI inference and training at massive scale.

Building these gigawatt-scale AI factories requires a lot more than optimized compute.

1 hour ago @ blogs.nvidia.com
Develop High-Performance GPU Kernels in C++ with NVIDIA CUDA Tile

Developers can now use NVIDIA CUDA Tile programming within large existing C++ GPU codebases to develop highly optimized GPU kernels using tile-based abstractions.

CUDA Tile C++ is an expression of the CUDA Tile programming model in C++, built on top of the CUDA Tile IR specification.

CUDA Tile C++ is portable across different NVIDIA GPU architectures, enabling developers to use the latest hardware features without having to rewrite code.

CUDA Tile C++ vector add example

Developers familiar with CUDA C++ for SIMT have likely encountered the canonical vector addition kernel.

kernel<<>>(d_a, d_b, K, d_c);

Get started today with CUDA Tile C++

The following are required to run CUDA Tile C++ program…

19 hours ago @ developer.nvidia.com
Extract More Kernel Performance with NVIDIA CompileIQ Auto-Tuning

NVIDIA CompileIQ tackles one of the hardest problems in performance engineering: finding the compiler options that unlock the best performance for a specific workload.

The release of NVIDIA CUDA 13.3 includes CompileIQ, an AI-powered compiler auto-tuning framework that uses evolutionary and genetic algorithms to optimize NVIDIA general purpose GPU compilers for individual workloads.

You define what “better” means for your workload in the objective function, and CompileIQ finds it.

In this case we’ll run 5 generations, we want to minimize the objective function, and there is only one objective being analyzed.
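To make the idea of "5 generations, minimize, single objective" concrete, here is a generic, self-contained genetic search over on/off flag settings. It is a sketch of the general technique only and does not use or imitate the CompileIQ API; `evolve`, `toy_runtime`, the six-flag genome, and the target pattern are hypothetical names and values chosen for illustration.

```python
import random
from typing import Callable, List, Tuple

Genome = Tuple[int, ...]  # one 0/1 entry per hypothetical compiler flag


def evolve(objective: Callable[[Genome], float], n_flags: int,
           generations: int = 5, population: int = 8, seed: int = 0) -> Genome:
    """Minimize a single objective with a tiny elitist genetic search."""
    rng = random.Random(seed)
    pop: List[Genome] = [
        tuple(rng.randint(0, 1) for _ in range(n_flags))
        for _ in range(population)
    ]
    for _ in range(generations):
        pop.sort(key=objective)
        parents = pop[: population // 2]       # keep the better half
        children: List[Genome] = []
        while len(parents) + len(children) < population:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_flags)    # single-point crossover
            child = list(a[:cut] + b[cut:])
            flip = rng.randrange(n_flags)      # mutate exactly one flag
            child[flip] ^= 1
            children.append(tuple(child))
        pop = parents + children
    return min(pop, key=objective)


def toy_runtime(genome: Genome) -> float:
    """Hypothetical stand-in for a measured kernel runtime (lower is better)."""
    target = (1, 0, 1, 1, 0, 0)
    return float(sum(g != t for g, t in zip(genome, target)))


best = evolve(toy_runtime, n_flags=6)
```

In a real tuning run the objective would compile and time the workload rather than compare against a fixed pattern; the fixed `seed` here only makes the sketch reproducible.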

User workloads never leave their own environment; the objective function runs local…

19 hours ago @ developer.nvidia.com
NVIDIA Vera CPU Is ‘Packing a Heavy-Hitting Punch’ Against Competition

Initial benchmark results published by Phoronix today show that the NVIDIA Vera CPU meets this need.

For this first public look, the benchmark scope was centered on the agentic workloads Vera was designed for in the modern data center.

NVIDIA Olympus Delivers Aggressive Performance

At the heart of Vera are custom NVIDIA Olympus CPU cores.

They need high core utilization and sustained memory bandwidth, making memory performance per watt a critical part of overall CPU efficiency.

“NVIDIA Vera with its LPDDR5X memory was showing its incredible advantage in memory performance over current Intel Xeon and AMD EPYC processors,” Larabel wrote.

19 hours ago @ blogs.nvidia.com
NVIDIA GTC Taipei at COMPUTEX: Live Updates on What’s Next in AI

Vera Rubin NVL72 delivers up to 10x higher inference performance per watt and 10x lower cost per token.

When paired with NVIDIA Groq 3 LPX, Vera Rubin NVL72 delivers up to 35x higher throughput per watt for trillion-parameter models.

The Vera Rubin NVL72 sets the bar for scalability, resiliency and sustainable AI infrastructure.

Plus, NVIDIA Alpamayo won the Vehicle Technology and Smart Cockpit Category Award for pioneering open, reasoning-based autonomous vehicle development.

Learn more about NVIDIA’s latest innovations at NVIDIA GTC Taipei, running June 1-4 at COMPUTEX.

6 days, 1 hour ago @ blogs.nvidia.com
License to Stream: ‘007 First Light’ Coming to GeForce NOW With an Ultimate Bundle

Leading the drop: the 007 First Light Ultimate Membership Bundle, which brings a brand-new way to jump into one of the year’s biggest releases and discover James Bond’s reimagined origin story.

Forza Horizon 6 is now available on GeForce NOW — bringing the series’ signature open-world racing and festival energy directly to members, wherever they play.

Starting today through Wednesday, June 10, 007 First Light is included with the purchase of a 12-month GeForce NOW Ultimate membership.

Stream it all with GeForce RTX 50 Series GPU power in the cloud, with up to 5K high dynamic range and cinematic-quality streaming for Ultimate members.

Powered by the cloud, GeForce NOW delivers the festival i…

6 days, 4 hours ago @ blogs.nvidia.com
NVIDIA and Google Cloud Empower the Next Wave of AI Builders

For example, developers can accelerate data science and analytics with the NVIDIA cuDF library in Google Colab Enterprise or Dataproc, or deploy multi-agent applications by combining Google DeepMind’s Gemma 4 models, NVIDIA Nemotron open models and Google Agent Development Kit with Google Cloud G4 VMs powered by NVIDIA RTX PRO 6000 Blackwell GPUs in Google Cloud Run or with spot instances.

NVIDIA and Google Cloud work closely across open frameworks like JAX so developers can build, scale and productize JAX workloads on NVIDIA AI infrastructure on Google Cloud — from single‑GPU experiments to multi‑rack deployments — while getting strong performance and a consistent experience.

This work ext…

1 week ago @ blogs.nvidia.com
NVIDIA CEO Jensen Huang at Dell Technologies World: “Demand Is Going Parabolic, Utterly Parabolic”

And 5,000 enterprises like Lilly, Samsung, and Honeywell are running AI workloads on Dell AI Factories with NVIDIA, turning ambition into production at scale.

On the CPU side, Dell PowerEdge M9822 and R9822 servers bring NVIDIA Vera CPUs to the enterprise AI factory.

Multiple customers for Dell AI Factory with NVIDIA were featured in the keynote.

Dell and OpenAI will also explore how Codex can connect with the Dell AI Factory.

NVIDIA OpenShell is now supported across the entire Dell AI Factory with NVIDIA, giving developers a secure runtime to build, deploy and govern AI agents from workstations to servers.

1 week, 1 day ago @ blogs.nvidia.com
Vera Arrives: NVIDIA’s First CPU Built for Agents Lands at Top AI Labs

Agentic AI has always called for a different kind of CPU.

NVIDIA CEO and founder Jensen Huang introduced the answer — the standalone Vera CPU — at GTC San Jose in March as NVIDIA’s next multi-billion dollar business.

On Friday, that CPU went from NVIDIA’s labs into customer hands.

“Agentic AI is creating a new CPU moment in the AI factory — as models move from answering to acting, Vera is purpose-built to keep that work moving at scale,” Buck said.

Every agentic sandbox, every tool call, every orchestration layer, every long-context retrieval operation — that’s CPU work.

1 week, 1 day ago @ blogs.nvidia.com
Sea You in the Cloud: ‘Subnautica 2’ Early Access Dives Onto GeForce NOW

A limited-time HITMAN World of Assassination reward event brings signature tools of the trade — equal parts precision and unpredictability.

At the same time, Le Chiffre from CASINO ROYALE, played by the legendary actor Mads Mikkelsen, returns to HITMAN World of Assassination.

GeForce NOW members can jump in once early access becomes available on the service — no preinstalls or downloads needed.

Here’s what’s waiting:

Free users: The Purple Streak Explosive Duck — a remote explosive disguised as an innocent rubber toy.

Performance members: The Purple Streak Explosive Duck and the Bomb Dynamite, a classic TNT bundle built for loud, messy exits.

1 week, 6 days ago @ blogs.nvidia.com
NVIDIA, Ineffable Intelligence Team Up to Build the Future of Reinforcement Learning Infrastructure

Reinforcement-learning agents — AI systems that learn by trial and error — can convert computation into new knowledge.

“We are thrilled to partner with Ineffable Intelligence to codesign the infrastructure for large-scale reinforcement learning as they push the frontier of AI and pioneer a new generation of intelligent systems.”

Silver is one of the pioneers of reinforcement learning, an approach that has transformed AI research.

Unlike pretraining, where a fixed dataset of human data flows through the system, reinforcement learning workloads generate their data on the fly.

That’s where NVIDIA and Ineffable are focusing their technical work: building a pipeline that can feed reinforcement le…

2 weeks ago @ blogs.nvidia.com
Hermes Unlocks Self-Improving AI Agents, Powered by NVIDIA RTX PCs and DGX Spark

It’s provider- and model-agnostic by design, and optimized for always-on local use, making NVIDIA RTX PCs, NVIDIA RTX PRO workstations and NVIDIA DGX Spark the ideal hardware to run it at full speed, around the clock.

Hermes: Local AI Agent Capabilities Accelerated

Like other popular agents, Hermes integrates with messaging apps, can access local files and applications, and runs 24/7.

Stay tuned for more updates from RTX AI Garage on the latest open models and agents optimized for NVIDIA RTX hardware.

#ICYMI: The Latest From RTX AI Garage

✨ NVIDIA RTX PRO GPUs deliver up to 3x faster token generation running Qwen 3.6 models with llama.cpp.

2 weeks ago @ blogs.nvidia.com
NVIDIA and SAP Bring Trust to Specialized Agents

Announced today at SAP Sapphire — where NVIDIA founder and CEO Jensen Huang joined SAP CEO Christian Klein’s keynote by video — SAP and NVIDIA’s expanded collaboration helps enterprises run specialized agents with security and governance controls.

SAP embeds NVIDIA OpenShell — an open source runtime for securely developing and deploying autonomous AI agents — into SAP Business AI Platform.

Within SAP Business AI Platform, OpenShell is the runtime security layer for all SAP AI agents, including custom agents built in Joule Studio — SAP’s environment for building and managing end-to-end enterprise agents.

Joule Studio runtime — the enterprise control layer within SAP Business AI Platform — as…

2 weeks, 1 day ago @ blogs.nvidia.com
‘Your Career Starts at the Beginning of the AI Revolution,’ NVIDIA CEO Tells Graduates

“You are entering the world at an extraordinary moment,” NVIDIA founder and CEO Jensen Huang told graduates as he delivered the keynote address at Carnegie Mellon University’s 128th commencement ceremony on Sunday.

Huang underscored that AI is making intelligence more broadly accessible — reaffirming the imperative for AI to reach everyone, not just a select few.

It is creating a new industrial era.”

Massive industrial and economic shifts always bring uncertainty with them, and the AI revolution is no different.

“Every major technological revolution in history created fear alongside opportunity,” Huang said.

To meet the moment of the AI revolution, Huang counseled doing four things at once: “Adv…

2 weeks, 2 days ago @ blogs.nvidia.com
Model Quantization: Post-Training Quantization Using NVIDIA Model Optimizer

This post walks through how to use NVIDIA Model Optimizer to quantize a CLIP model in FP8 format with the post-training quantization (PTQ) method.

For a general introduction to model quantization, see Model Quantization: Concepts, Methods, and Why It Matters.

Quantization recipe

The following quantization recipe is used in this post as a step-by-step guide for running CLIP model quantization with ModelOpt to understand how the process works.

CLIP model quality comparison of the FP16 baseline versus FP8-PTQ quantized models

Based on the evaluation results, the CLIP-FP8 quantized model demonstrates comparable quality to the CLIP-FP16 model.

Get started with NVIDIA Model Optimizer

This post intro…

2 weeks, 5 days ago @ developer.nvidia.com
Facebook
last post 2 weeks ago
Reel Friends: Building Social Discovery that Scales to Billions

On its face the new Friend Bubbles feature looks simple enough.

It highlights Reels your friends have watched and reacted to.

On this episode of the Meta Tech Podcast, Pascal Hartig chats with Subasree and Joseph, two software engineers from the Facebook Reels team, about what it took to bring Friend Bubbles to life.

If you’ve ever underestimated a “simple” feature, this one’s for you.

And if you’re interested in learning more about career opportunities at Meta, visit the Meta Careers page.

2 weeks ago @ engineering.fb.com
Modernizing the Facebook Groups Search to Unlock the Power of Community Knowledge

We’ve fundamentally transformed Facebook Groups Search to help people more reliably discover, sort through, and validate community content that’s most relevant to them.

We’ve adopted a new hybrid retrieval architecture and implemented automated model-based evaluation to address the major friction points people experience when searching community content.

Addressing the Friction Points in Community Knowledge

People struggle with three friction points when searching for answers in community content – discovery, consumption, and validation.

The Solution: A Modernized Hybrid Retrieval Architecture

We engineered a hybrid retrieval architecture that powers a discussions module on Facebook Search.

R…

1 month ago @ engineering.fb.com
Capacity Efficiency at Meta: How Unified AI Agents Optimize Performance at Hyperscale

We’ve built a unified AI agent platform that encodes the domain expertise of senior efficiency engineers into reusable, composable skills.

Introducing the Capacity Efficiency Program

When the code you ship serves more than 3 billion people, even a 0.1% performance regression can translate to significant additional power consumption.

Many engineers at Meta use our efficiency tools to work on these problems every day.

Skills: These encode domain expertise about performance efficiency.

The pipeline mirrors the defensive AI Regression Solver: Gather context with tools: The AI agent looks up: Opportunity metadata.

1 month, 1 week ago @ engineering.fb.com
How Meta Used AI to Map Tribal Knowledge in Large-Scale Data Pipelines

Challenging the Conventional Wisdom on AI Context Files

Recent academic research found that AI-generated context files actually decreased agent success rates on well-known open-source Python repositories.

Our codebase is the opposite: proprietary config-as-code with tribal knowledge that exists nowhere in any model’s training data.

Any team with a large, proprietary codebase can benefit: Identify your tribal knowledge gaps.

What’s Next

We are expanding context coverage to additional pipelines across Meta’s data infrastructure and exploring tighter integration between context files and code generation workflows.

This approach turned undocumented tribal knowledge into structured, AI-readable con…

1 month, 3 weeks ago @ engineering.fb.com
KernelEvolve: How Meta’s Ranking Engineer Agent Optimizes AI Infrastructure

This is the second post in the Ranking Engineer Agent blog series exploring the autonomous AI capabilities accelerating Meta’s Ads Ranking innovation.

We introduce KernelEvolve, an agentic kernel authoring system used by Ranking Engineer Agent and generally applicable to a range of AI models beyond Ads Ranking.

Unlike typical large language model (LLM)-based agents that perform one-shot code generation, KernelEvolve treats kernel optimization as a search problem.

A standard coding assistant lacks the context to write optimized MTIA kernels because it has never seen MTIA documentation, instruction set details, or programming idioms.

KernelEvolve represents an early step toward the vision of …

1 month, 3 weeks ago @ engineering.fb.com
Meta Adaptive Ranking Model: Bending the Inference Scaling Curve to Serve LLM-Scale Models for Ads

To overcome this, we have developed the Meta Adaptive Ranking Model, which effectively bends the inference scaling curve with high ROI and industry-leading efficiency.

Introducing Meta Adaptive Ranking Model

Serving models of LLM scale and complexity in a real-time ads recommendation environment requires resolving a fundamental tension between model complexity and system efficiency.

Adaptive Ranking Model addresses these challenges through a paradigm shift powered by three core innovations across the serving stack: Inference-efficient model scaling: Adaptive Ranking Model achieves a model complexity equivalent to the O(10 GFLOPs) per token used by top-tier LLMs.

To minimize compute overhead, Adapt…

1 month, 3 weeks ago @ engineering.fb.com
AI for American-Produced Cement and Concrete

Concurrent with the 2026 American Concrete Institute (ACI) Spring Convention, Meta is releasing a new AI model for designing concrete mixes – Bayesian Optimization for Concrete (BOxCrete), as well as the foundational data used to develop award-winning concrete mixes.

Amrize operates 18 cement plants, 141 cement terminals and 269 ready-mix concrete sites across North America.

Alongside the event, Meta is releasing a new AI model for designing concrete mixes, Bayesian Optimization for Concrete (BOxCrete).

How Meta Leverages AI for Concrete Mixtures

Meta’s AI for concrete model can help suppliers more quickly incorporate U.S. materials into their mixes through an approach called adaptive experi…

1 month, 4 weeks ago @ engineering.fb.com
Friend Bubbles: Enhancing Social Discovery on Facebook Reels

Friend bubbles in Facebook Reels highlight Reels your friends have liked or reacted to, helping you discover new content and making it easier to connect over shared interests.

Friend bubbles enhance the social experience on Facebook Reels by helping you discover content your friends enjoy, creating a shared viewing experience and sparking new conversations.

Along with additional optimizations in the underlying method, this approach enabled us to ship friend bubbles while preserving core Reels performance.

Friend bubbles work because the signal is high value: It adds meaningful social context that helps people decide what’s worth watching.

Engagement also scales consistently with the number …

2 months, 1 week ago @ engineering.fb.com
Ranking Engineer Agent (REA): The Autonomous AI Agent Accelerating Meta’s Ads Ranking Innovation

Meta’s Ranking Engineer Agent (REA) autonomously executes key steps across the end-to-end machine learning (ML) lifecycle for ads ranking models.

Powering these interactions are highly sophisticated, complex and massively distributed machine learning (ML) models that continuously evolve to serve both advertisers and people who use the platforms.

Optimizing these ML models has traditionally been time-consuming.

To address this, Meta built the Ranking Engineer Agent, an autonomous AI agent designed to drive the end-to-end ML lifecycle and iteratively evolve Meta’s ads ranking models at scale.

ML training jobs run for hours or days, far beyond what any session-bound assistant can manage.

2 months, 1 week ago @ engineering.fb.com
Patch Me If You Can: AI Codemods for Secure-by-Default Android Apps

Nowhere is this more apparent than in mobile security, where a single class of vulnerability can be replicated across hundreds of call sites scattered throughout a sprawling, multi-app codebase serving billions of users.

Meta’s Product Security team has developed a two-pronged strategy to address this: designing secure-by-default frameworks that wrap potentially unsafe Android OS APIs and make the secure path the easiest path for developers, and leveraging generative AI to automate the migration of existing code to those frameworks at scale.

The result is a system that can propose, validate, and submit security patches across millions of lines of code with minimal friction for the engineers w…

2 months, 2 weeks ago @ engineering.fb.com
RCCLX: Innovating GPU communications on AMD platforms

RCCLX is fully integrated with Torchcomms and aims to empower researchers and developers to accelerate innovation, regardless of their chosen backend.

We want to iterate on collectives, transports, and novel features quickly on AMD platforms.

With RCCLX, we have integrated CTran to AMD platforms, enabling the AllToAllvDynamic – a GPU-resident collective.

These features provide significant performance improvements on AMD platforms and we are excited to share this with the community.

RCCLX Quick Start Guide

Install Torchcomms with RCCLX backend by following the installation instructions in the Torchcomms repo.

3 months ago @ engineering.fb.com
The Death of Traditional Testing: Agentic Development Broke a 50-Year-Old Field, JiTTesting Can Revive It

A Catching JiTTest focuses specifically on finding regressions introduced by a code change.

Agentic development dramatically increases the pace of code change, straining test development burden and scaling the cost of false positives and test maintenance to breaking point.

And since the JiTTest itself is LLM-generated, it can often infer the plausible intention of a code change and simulate possible faults that may result from it.

With them, engineers no longer have to spend time writing, reviewing, and testing complex test code.

Read the paper: Just-in-Time Catching Test Generation at Meta

3 months, 2 weeks ago @ engineering.fb.com
Adapting the Facebook Reels RecSys AI Model Based on User Feedback

Our new User True Interest Survey (UTIS) model now helps surface more niche, high-quality content and boosts engagement, retention, and satisfaction.

Our paper, “Improve the Personalization of Large-Scale Ranking Systems by Integrating User Survey Feedback,” shares full details on this work.

The main candidate ranking model used by the platform is a large multi-task, multi-label model.

We trained a lightweight UTIS alignment model layer on the collected user survey responses using existing predictions of the main model as input features.

The UTIS model consistently outperformed the baseline, driving higher user engagement and retention.

4 months, 1 week ago @ engineering.fb.com
DrP: Meta’s Root Cause Analysis Platform at Scale

DrP’s key components include:

Expressive SDK: The DrP SDK allows engineers to codify investigation workflows into analyzers.

Post-processing system: After an investigation, the post-processing system can take automated actions based on the analysis results.

Bootstrap code: The DrP SDK provides bootstrap code to create a template analyzer with pre-populated boilerplate code.

Data access and analysis: The SDK includes libraries for data access and analysis, such as dimension analysis and time series correlation.

This provides immediate analysis results to on-call engineers.

5 months, 1 week ago @ engineering.fb.com
How AI Is Transforming the Adoption of Secure-by-Default Mobile Frameworks

Generative AI and automation accelerate the adoption of secure frameworks at scale, enabling consistent security enforcement and efficient migration across Meta’s vast codebase.

How We Design Secure-by-Default Frameworks at Meta

Designing secure-by-default frameworks for use by a large number of developers shipping vastly different features across multiple apps is an interesting challenge.

There shouldn’t be one security framework that covers all security issues, and not every security issue is general enough to deserve its own framework.

Now that we’ve looked at the design philosophy behind our frameworks, let’s look at one of our most widely used Android security frameworks, SecureLinkLaun…

5 months, 1 week ago @ engineering.fb.com
Uber Engineering
last post: none
neptune.ai
last post 5 months, 3 weeks ago
We are joining OpenAI

Piotr Niedźwiedź, CEO/CTO and founder of neptune.ai

I’m excited to share that we’ve entered into a definitive agreement to be acquired by OpenAI, subject to closing conditions.

We are thrilled to join the OpenAI team and help their AI researchers build better models faster.

Neptune is a metrics dashboard company.”

We’ve worked closely with OpenAI to create the metrics dashboard that helps teams building foundation models.

Our future with OpenAI

Neptune will join OpenAI and continue to support AI researchers with tools to monitor, debug, and evaluate frontier models.

We are looking forward to working with top AI researchers and supporting OpenAI’s mission of ensuring that AGI benefits all of hu…

5 months, 3 weeks ago @ neptune.ai
Synthetic Data for LLM Training

For instance, financial data is highly sensitive and protected by very strict regulations, and synthetic data mimics the real data distribution without revealing customer information.

Read more about how leading foundation model teams curate their training data and other topics in the State of Foundation Model Training Report 2025.

Choosing the right synthetic data generation technique depends on the type of data and its complexity.

Synthetic tabular data generation is a promising direction to overcome these challenges by learning the distribution of the tabular data.

Post-processing

As the distribution of tabular data is highly complex, it makes the synthetic tabular data generation very ch…

6 months, 2 weeks ago @ neptune.ai
What are LLM Embeddings: All you Need to Know

TL;DR LLM embeddings are the numerical, vector representations of text that Large Language Models (LLMs) use to process information.

Unlike their predecessor word embeddings, LLM embeddings are context-aware and dynamically change to capture semantic and syntactic relationships based on the surrounding text.

What are the applications of LLM embeddings?

[Figure: Word Embeddings timeline. Sparse word embeddings: One-Hot Vectors, 1970s, TF-IDF, 1980s, Co-Occurrence Matrix. Static word embeddings: Word2Vec (2013), GloVe (2014). Contextualized word embeddings: ELMo (2018), GPT-1 (2018), BERT (2018), LLAMA (2023), DeepSeek-V1 (2023), GPT-4 (2023).]

Static word embeddings

Static word embeddings, such as word2vec in 2013, marked a significant development.…

6 months, 3 weeks ago @ neptune.ai
Detecting and Fixing ‘Dead Neurons’ in Foundation Models

TL;DR Dead neurons silently waste compute and reduce effective model capacity in foundation models.

Dead neurons’ impact

Recent studies into dead neurons in the context of foundation models show interesting, albeit worrying, results.

These large reported fractions of dead neurons in foundation models are a concern from a computational perspective.

Before we move on to discuss how to detect and fix dead neurons, let’s touch upon an important distinction between dead neurons and vanishing gradients.

Further reading: How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models

Visualizing activation distributions

Is your foundation model suffering from dead neurons?
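One simple way to answer that question is to count units that never fire on a batch of inputs. The sketch below assumes a single ReLU layer and treats a neuron as dead when its activation is zero for every sample; the weights, biases, inputs, and the `dead_neuron_fraction` helper are toy illustrations, not code from the article.

```python
def relu(x):
    """Rectified linear unit: pass positive values, clamp the rest to zero."""
    return x if x > 0.0 else 0.0


def dead_neuron_fraction(weights, biases, inputs):
    """Fraction of ReLU units that output zero for every input in the batch."""
    dead = 0
    for w_row, b in zip(weights, biases):
        activations = [
            relu(sum(w * x for w, x in zip(w_row, sample)) + b)
            for sample in inputs
        ]
        if all(a == 0.0 for a in activations):
            dead += 1
    return dead / len(weights)


# Toy layer with 3 units and 2 input features.
# The third unit has a large negative bias, so it never activates here.
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
B = [0.0, 0.0, -10.0]
X = [[1.0, 2.0], [3.0, 0.5], [0.2, 0.1]]

print(dead_neuron_fraction(W, B, X))
```

In practice the estimate depends on the batch: a unit that looks dead on a small sample may still fire on other data, so larger and more varied batches give a more reliable count.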

7 months ago @ neptune.ai
Part 2: Instruction Fine-Tuning: Evaluation and Advanced Techniques for Efficient Training

In the first part of this series, we covered the fundamentals of instruction fine-tuning (IFT).

def calculate_irs(instruction, output, reference_model):
    evaluation_prompt = f"""
    Instruction: {instruction}
    Model Output: {output}
    Rate how well the output follows the instruction on these criteria:
    1. …

HINT addresses a computational inefficiency in standard instruction fine-tuning: repeatedly reprocessing the same task instruction with every input example.

Read more about foundation model training infrastructure and other topics in Neptune’s 2025 State of Foundation Model Training Report.

First, during initial instruction fine-tuning across multiple diverse tasks, the model learns genera…

7 months ago @ neptune.ai
How to Optimize LLM Inference

Large Language Model (LLM) inference at scale is challenging as it involves transferring massive amounts of model parameters and data and performing computations on large tensors.

In the following, we’ll use the Llama model family architecture as a specific example to understand the LLM workload at inference.

For a far more detailed analysis of the LLM workload at inference, see the chapter All About Transformer Inference in the book How to Scale Your Model, published by Google DeepMind.

See also: How to Run LLMs Locally

A quick primer on hardware for LLM inference

A typical LLM inference cluster consists of several nodes, each with a multi-core CPU and multiple accelerator devices, …

7 months, 2 weeks ago @ neptune.ai
A Researcher’s Guide to LLM Grounding

In this article, we’ll explore the fundamental concepts of LLM grounding as well as strategies for optimally grounding models.

What is LLM grounding?

LLM grounding is analogous.

If relevant knowledge cannot be inferred from the data, then LLM grounding cannot yield more relevant responses.

When grounding LLMs using RAG, consider retaining only a few of the top hits (i.e., top-k) for your retrieval queries.
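As a rough illustration of keeping only the top-k hits, the sketch below ranks a handful of documents by cosine similarity to a query and passes on the best two. The three-dimensional vectors, document names, and the `top_k` helper are hypothetical; a real RAG system would use a learned embedding model and a vector index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, docs, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" standing in for real document vectors.
DOCS = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.7, 0.6, 0.1],
    "careers-page": [0.0, 0.2, 0.9],
}

hits = top_k([1.0, 0.2, 0.0], DOCS, k=2)
context = [doc_id for doc_id, _ in hits]
print(context)
```

A small k keeps weakly related documents such as the careers page out of the prompt, at the cost of occasionally dropping a useful one, so k is worth tuning per application.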

8 months ago @ neptune.ai
Instruction Fine-Tuning: Fundamentals, Architecture Modifications, and Loss Functions

TL;DR Instruction fine-tuning (IFT) refines pre-trained large language models (LLMs) to follow specific task instructions by training on prompt-response pairs.

Instruction fine-tuning in a nutshell

IFT tailors LLMs to follow user instructions by bridging their inherent next-word prediction with human-defined objectives.

Related: LLM Fine-Tuning and Model Selection Using Neptune and Transformers

Parameter-efficient instruction fine-tuning

While major foundation models like GPT-4 or Llama-2 undergo full parameter instruction fine-tuning during development, parameter-efficient fine-tuning (PEFT) methods have become widely adopted for instruction fine-tuning since the LoRA paper was publi…

8 months, 1 week ago @ neptune.ai
Understanding Prompt Injection: Risks, Methods, and Defense Measures

Prompt injection 101: When prompts go rogue

The term ‘Prompt Injection’ comes from SQL injection attacks.

A separate claim of independent discovery credits Riley Goodside, who publicly demonstrated a prompt injection in a tweet in September 2022.

Indirect prompt injection attacks are classified into active, passive, user-driven, and virtual prompt attacks.

Virtual prompt injection attacks

This injection type is closely related to the passive injection attacks described previously.

Prompt injection: current challenges & lessons learned

The arms race between prompt injection attacks and defenses is a challenge for researchers, developers, and users.

9 months, 3 weeks ago @ neptune.ai
SabiYarn: Advancing Low-Resource Languages With Multitask NLP Pre-Training [Paper Reflections]

This simple idea avoids computing loss on input prompt tokens the model already knows.

Prompt tokens are (too) expensive in low-resource settings

During pre-training, LLMs are trained in causal language modeling through a next-token prediction task.

=> Mo fẹ́ràn ìrẹsì,” the model is trained to predict every token, from the prompt to the actual answer:

Step Prompt Next token
1 Translate English Static prompt
2 Translate English to Static prompt
3 Translate English to Yoruba: Static prompt
4 Translate English to Yoruba: I
5 Translate English to Yoruba: I love
6 Translate English to Yoruba: I love rice.

This is straightforward to implement in PyTorch by masking out the prompt tokens in the label …
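
One way such label masking could look is sketched below in plain Python, so it runs without PyTorch installed. It assumes the prompt occupies the first `prompt_len` positions of the sequence and uses -100, the default `ignore_index` of `torch.nn.CrossEntropyLoss`, as the mask value. The helper name and the token ids are hypothetical, and this is not necessarily the authors' exact implementation.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_prompt_labels(input_ids, prompt_len, ignore_index=IGNORE_INDEX):
    """Copy input_ids into labels, hiding the first prompt_len positions from the loss."""
    if not 0 <= prompt_len <= len(input_ids):
        raise ValueError("prompt_len must be between 0 and len(input_ids)")
    return [ignore_index] * prompt_len + list(input_ids[prompt_len:])


# Hypothetical token ids: four prompt tokens followed by three answer tokens.
input_ids = [11, 12, 13, 14, 21, 22, 23]
labels = mask_prompt_labels(input_ids, prompt_len=4)
```

With labels built this way, a loss that honors the ignore index only receives gradient from the answer tokens, so no training signal is spent on the static prompt.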

9 months, 4 weeks ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
last post 2 months, 3 weeks ago
I BUILT A FULLY AUTOMATIC MANSPLAINER

All information about GTC and the DGX Spark Raffle is here: https://www.ykilcher.com/gtc Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

Ethereu…

2 months, 3 weeks ago @ youtube.com
Traditional X-Mas Stream

Letsgooo

5 months ago @ youtube.com
Traditional Holiday Live Stream

https://ykilcher.com/discord Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https:/…

5 months ago @ youtube.com
TiDAR: Think in Diffusion, Talk in Autoregression (Paper Analysis)

Paper: https://arxiv.org/abs/2511.08923 Abstract:

Diffusion language models hold the promise of fast parallel generation, while autoregressive (AR) models typically excel in quality due to their causal structure aligning naturally with language modeling. This raises a fundamental question: can we achieve a synergy with high throughput, higher GPU utilization, and AR level quality? Existing methods fail to effectively balance these two aspects, either prioritizing AR using a weaker model for sequential drafting (speculative decoding), leading to lower drafting efficiency, or using some form of left-to-right (AR-like) decoding logic for diffusion, which still suffers from quality degradation …

5 months ago @ youtube.com
Titans: Learning to Memorize at Test Time (Paper Analysis)

Paper: https://arxiv.org/abs/2501.00663 Abstract:

Over more than a decade there has been an extensive research effort on how to effectively utilize recurrent models and attention. While recurrent models aim to compress the data into a fixed-size memory (called hidden state), attention allows attending to the entire context window, capturing the direct dependencies of all tokens. This more accurate modeling of dependencies, however, comes with a quadratic cost, limiting the model to a fixed-length context. We present a new neural long-term memory module that learns to memorize historical context and helps attention to attend to the current context while utilizing long past information. We sh…

5 months, 2 weeks ago @ youtube.com
[Paper Analysis] The Free Transformer (and some Variational Autoencoder stuff)

https://arxiv.org/abs/2510.17558 Abstract:

We propose an extension of the decoder Transformer that conditions its generative process on random latent variables which are learned without supervision thanks to a variational procedure. Experimental evaluations show that allowing such a conditioning translates into substantial improvements on downstream tasks. Author: François Fleuret

6 months, 3 weeks ago @ youtube.com
[Video Response] What Cloudflare's code mode misses about MCP and tool calling

Theo's Video: https://www.youtube.com/watch?v=bAYZjVAodoo

Cloudflare article: https://blog.cloudflare.com/code-mode/

7 months, 1 week ago @ youtube.com
[Paper Analysis] On the Theoretical Limitations of Embedding-Based Retrieval (Warning: Rant)

Paper: https://arxiv.org/abs/2508.21038 Abstract:

Vector embeddings have been tasked with an ever-increasing set of retrieval tasks over the years, with a nascent rise in using them for reasoning, instruction-following, coding, and more. These new benchmarks push embeddings to work for any query and any notion of relevance that could be given. While prior works have pointed out theoretical limitations of vector embeddings, there is a common assumption that these difficulties are exclusively due to unrealistic queries, and those that are not can be overcome with better training data and larger models. In this work, we demonstrate that we may encounter these theoretical limitations in realist…

7 months, 2 weeks ago @ youtube.com
AGI is not coming!

Jack Morris's investigation into GPT-OSS training data: https://x.com/jxmnop/status/1953899426075816164?t=3YRhVQDwQLk2gouTSACoqA&s=09

9 months, 3 weeks ago @ youtube.com
Henry AI Labs
last post None
3blue1brown
last post 5 days, 19 hours ago
Tie random ends: How many loops?

Recent puzzle solutions on Patreon:

https://members.3blue1brown.com/posts/158885046?pr=true

5 days, 19 hours ago @ youtube.com
Covering 10 points, a surprisingly tricky puzzle.

Made as part of a monthly series of puzzles for the 2026 Year of Math.

1 month, 1 week ago @ youtube.com
Escher's most mind-bending piece

On "The Print Gallery", by M.C. Escher

Full video: https://youtu.be/ldxFjLJ3rVY

2 months ago @ youtube.com
The subset sum puzzle

Part of a series of monthly puzzlers. Stay subscribed to see the solution

2 months ago @ youtube.com
Escher's most mathematically interesting piece

Escher's Print Gallery, and the tour of complex analysis it invites.

Check out our virtual career fair: 3b1b.co/talent

Join channel supporters to see videos early: 3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Original paper by de Smit and Lenstra:

https://pub.math.leidenuniv.nl/~smitbde/papers/2003-de_smit-lenstra-escher.pdf Timestamps: 0:00 - The print gallery

13:04 - Conformal maps from complex analysis

21:41 - The complex exponential

25:56 - The complex logarithm

32:32 - 3b1b Talent

33:14 - Constructing the key function

40:16 - The deeper math behind Escher ------------------ These animations are largely made us…

2 months ago @ youtube.com
Bacteria Grid Puzzle Solution

Part of a monthly series of puzzlers, in collaboration with MoMath and Peter Winkler

2 months, 1 week ago @ youtube.com
The most underappreciated formula | Exploring high-dimensional spheres

On the volumes of higher-dimensional spheres

Explore the 3b1b virtual career fair: See https://3b1b.co/talent

Become a supporter for early views of new videos: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Thanks to UC Santa Cruz for letting me film there, and special thanks to Pedro Morales-Almazan for arranging everything. My video on Numberphile with a fun application of this problem: https://youtu.be/6_yU9eJ0NxA Timestamps:

0:00 - Introduction

1:01 - Random puzzle

6:16 - Outside the box

14:35 - Setting up the volume grid

21:14 - Why 4πr^2

25:21 - Archimedes in higher dimensions

36:17 - The general formul…

2 months, 4 weeks ago @ youtube.com
The lattice bacteria puzzle

Part of a series of monthly puzzles, done in collaboration with MoMath.

https://momath.org/mindbenders

3 months, 1 week ago @ youtube.com
Solution to the ladybug clock puzzle

Solution to last month's probability puzzle.

3 months, 1 week ago @ youtube.com
The Hairy Ball Theorem

Unexpected applications and a beautiful proof.

Looking for a new career? Check out https://3b1b.co/talent

Supporters get early access to new videos: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Credits:

Senia Sheydvasser: Co-writing and sphere deformation animations

Paul Dancstep: Those lovely fluffy sphere animations Vince Rubinetti: Music Timestamps:

0:00 - To comb a hairy ball

1:24 - Applications

8:46 - The puzzle of one null point

12:12 - The proof outline

16:41 - Defining orientation

21:44 - Why inside-out is impossible

25:59 - 3b1b Talent

27:44 - Final food for thought ------------------ These animati…

3 months, 3 weeks ago @ youtube.com
The ladybug clock puzzle

This is the first in a set of monthly puzzles, curated by Peter Winkler. This one was originally suggested by Richard Stanley. You can sign up to hear his description of the answer at http://momath.org/mindbenders

4 months, 1 week ago @ youtube.com
The most absurd product I've made

Because why not make a pi creature neck pillow?

Available at 3b1b.co/store

6 months ago @ youtube.com
How Laplace transforms solve differential equations

Studying the forced harmonic oscillator by taking a Laplace transform and studying its poles.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Chapter on the Laplace Transform:

https://youtu.be/j0wJBEZdwLs Chapter on the S-plane and Simple Harmonic Motion:

https://youtu.be/-j8PzkZ70Lg Timestamps:

0:00 - Opening puzzle

1:06 - Key properties of a Laplace Transform

3:29 - Qualitative analysis with Laplace Transforms

4:29 - The Laplace Transforms of a Derivative

6:06 - The forced oscillator

11:59 - Intuition from the transformed solution

1…

6 months, 3 weeks ago @ youtube.com
The dynamics of e^(πi)

A fuller version of this explanation, also including the reason we care about complex exponents in the first place: https://youtu.be/-j8PzkZ70Lg

7 months, 2 weeks ago @ youtube.com
But what is a Laplace Transform?

Visualizing the most important tool for differential equations.

Previous chapter: https://youtu.be/-j8PzkZ70Lg

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Artwork by Kurt Bruns Engine animation borrowed with permission from this (excellent) blog: https://ciechanow.ski/internal-combustion-engine/ Timestamps:

0:00 - Understanding the engine

1:16 - Key background ideas

5:41 - Definition and intuition

10:43 - Complex integration

20:43 - Analytic continuation

23:52 - The transform of exponentials

26:15 - A deep look at cos(t)

32:59 - W…

7 months, 2 weeks ago @ youtube.com
Two Minute Papers
last post 23 hours ago
Google DeepMind CEO Likes Hard Questions

Full video: https://youtu.be/huAwz_BR8WM

#shorts

23 hours ago @ youtube.com
Insane AI Breakthroughs With Demis Hassabis

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Charles Ian Norman Venn, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Shawn Becker, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

Thumbnail design: https://felicia.hu

1 day, 23 hours ago @ youtube.com
DeepSeek Just Changed How AI Sees Images Forever

📝 The paper is available here:

https://github.com/ailuntx/Thinking-with-Visual-Primitives

https://huggingface.co/datasets/NodeLinker/deepseek-ai-Thinking-with-Visual-Primitives-deleted-repo/blob/main/Thinking_with_Visual_Primitives.pdf

5 days, 16 hours ago @ youtube.com
NVIDIA’s New AI Is Fast For A Strange Reason

📝 The paper is available here:

https://arxiv.org/abs/2604.24954

https://developer.nvidia.com/blog/nvidia-nemotron-3-nano-omni-powers-multimodal-agent-reasoning-in-a-single-efficient-open-model/

https://huggingface.co/blog/nvidia/nemotron-3-nano-omni-multimodal-intelligence

2 weeks ago @ youtube.com
OpenAI's GPT 5.5 Instant: The Good, The Bad And The Insane

📝 GPT 5.5 Instant:

https://deploymentsafety.openai.com/gpt-5-5-instant/introduction

https://openai.com/index/gpt-5-5-instant/

2 weeks, 5 days ago @ youtube.com
DeepSeek V4 AI: Crushing The Competition

📝 Check out DeepSeek here:

https://www.deepseek.com/en/

3 weeks ago @ youtube.com
NVIDIA's New AI Turns One Photo Into A World That Never Breaks

📝 The paper is available here:

https://research.nvidia.com/labs/sil/projects/lyra2/

3 weeks, 3 days ago @ youtube.com
Sakana AI’s God Simulator Is Brilliant

📝 Try it out! The paper is available here:

https://pub.sakana.ai/digital-ecosystem/

3 weeks, 5 days ago @ youtube.com
This Is Why AI Videos Feel Wrong

📝 The paper is available here:

https://research.nvidia.com/labs/sil/projects/MOTIVE/

4 weeks ago @ youtube.com
NVIDIA’s New AI Changed Robotics Forever

📝 The paper is available here:

https://nvlabs.github.io/GEAR-SONIC/

#nv…

1 month ago @ youtube.com
DeepMind’s New AI: A Gift To Humanity

Links:

https://ai.google.dev/gemma/docs/core/model_card_4 Fine tuning with Matt Mireles: https://x.com/mattmireles/status/2041606508220489786 Other sources:

https://x.com/googlegemma/status/2041256042882105666?s=46

https://x.com/nakazakifam/status/2041286410930446370

https://x.com/measure_plan/status/2039815699695104343

https://x.com/maddiedreese/status/2041677327604838685?s=46

https://x.com/steipete/status/2042615534567457102?s=46

https://x.com/maziyarpanahi/status/2042592050940449260?s=46

https://x.com/adrgrondin/status/2041962263507083340?s=46

https://x.com/evgeniymikholap/status/2041104232648950170

https:…

1 month, 1 week ago @ youtube.com
“Anthropic’s New AI Is Too Dangerous To Release”

📝 The paper is available here:

https://www.anthropic.com/claude-mythos-preview-system-card Links and sources:

https://debugml.github.io/cheating-agents/

https://x.com/bstnxbt/status/2042967285715865685

1 month, 1 week ago @ youtube.com
NVIDIA’s New AI: The Biggest Leap In Robot Learning Yet

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper is available here:

https://dreamdojo-world.github.io/

1 month, 2 weeks ago @ youtube.com
NVIDIA’s New AI: A Revolution...For Free!

📝 The #NVIDIA paper on Nemotron 3 Super is available here:

https://research.nvidia.com/labs/nemotron/files/NVIDIA-Nemotron-3-Super-Technical-Report.pdf

1 month, 2 weeks ago @ youtube.com
Google New TurboQuant AI: Hype vs. Reality

📝 The TurboQuant paper is available here:

https://arxiv.org/abs/2504.19874 Reproduction: https://x.com/AlicanKiraz0/status/2038245538865275274

KV-cache source: https://huggingface.co/blog/not-lain/kv-caching Reviews and criticisms of the paper:

https://openreview.net/forum?id=tO3ASKZlok

https://x.com/gaoj0017/status/2037532673812443214

1 month, 3 weeks ago @ youtube.com
DataFest Video
last post None
JetBrains Research Seminars
last post None
Yandex. Computer Science
last post 1 day, 5 hours ago
Data Fest. Practical ML Track

We'll discuss how Yandex teams apply ML in their day-to-day work. Broadcast schedule:

13:00 - Opening

13:10 - "X-split ARGUS: An Autoregressive Encoder-Decoder for Ad Ranking", Aleksandr Ploshkin

13:40 - "AI Agents for Optimizing the Yandex Lavka Business", Alena Zaitseva

14:10 - "Controllable Multi-Camera Video Generation for Simulating Autonomous Vehicle Sensor Data", Anastasia Demidova

Break: 40 minutes

15:20 - "A Hybrid Generative-Ranking Model Based on Semantic IDs in Yandex Music Recommendations", Daria Tikhonovich

15:50 - "AgentOps in Production: Infrastructure for LLM Agents and Support Automation for Yandex Personal Services", Ivan Nasonov

Break: 25 minutes

16:45 - "How…

1 day, 5 hours ago @ youtube.com
RAG in a Minute

If you'd like to work on this and see from the inside how everything is built, come to Weekend Offer ML on May 30-31: https://clck.ru/3TSCym Andrey Sokolov, head of the team that trains models with external context at Yandex R&D, explained how the RAG systems of three Yandex products are built. In the talk, Andrey shared product case studies and used them to show how quality is evaluated, where prompting falls short, and why touching a single component of the system is always a risk. The full video is already on the channel.

5 days, 6 hours ago @ youtube.com
How Context Engineering Works in the Agent Loop of Yandex's Neurolawyer

If you'd like to work on this and see from the inside how everything is built, come to Weekend Offer ML on May 30-31: https://clck.ru/3TSCym Andrey Sokolov, head of the team that trains models with external context at Yandex R&D, explained how the RAG systems of three Yandex products are built. In the talk, Andrey shared product case studies and used them to show how quality is evaluated, where prompting falls short, and why touching a single component of the system is always a risk. The full video is already on the channel.

1 week, 4 days ago @ youtube.com
Assessing ML System Stability: Robustness in RAG, Using Alice as an Example

If you'd like to work on this and see from the inside how everything is built, come to Weekend Offer ML on May 30-31: https://clck.ru/3TSCym Andrey Sokolov, head of the team that trains models with external context at Yandex R&D, explained how the RAG systems of three Yandex products are built. In the talk, Andrey shared product case studies and used them to show how quality is evaluated, where prompting falls short, and why touching a single component of the system is always a risk. The full video is already on the channel.

2 weeks ago @ youtube.com
Which Benchmarks We Use to Evaluate Alice's Quality

If you'd like to work on this and see from the inside how everything is built, come to Weekend Offer ML on May 30-31: https://clck.ru/3TSCym Andrey Sokolov, head of the team that trains models with external context at Yandex R&D, explained how the RAG systems of three Yandex products are built. In the talk, Andrey shared product case studies and used them to show how quality is evaluated, where prompting falls short, and why touching a single component of the system is always a risk. The full video is already on the channel.

2 weeks, 3 days ago @ youtube.com
How Brickit Identifies LEGO Part Classes

Andrey Tatarinov, CEO and CTO of Epoch8, talked about the computer vision system behind the Brickit app, which scans a pile of LEGO pieces and suggests what can be built from them. In the talk, Andrey explained how his team dealt with rare classes and optimized the pipeline for mobile devices. He also shared MLOps solutions for scalable fine-tuning and model maintenance. The full video is already on the channel! #Brickit, #Lego, #AI, #Нейросети, #ComputerVision, #DeepLearning, #MobileAI, #Tech, #Стартап, #Будущее, #MachineLearning, #Гаджеты, #Технологии, #WOW, #AIприложение

3 weeks, 4 days назад @ youtube.com
What data Brickit works with

Andrey Tatarinov, CEO and CTO at Epoch8, talked about the computer vision system behind the Brickit app, which scans a pile of LEGO pieces and suggests what can be built from them. In the talk, Andrey explained how his team dealt with rare classes and optimized the pipeline for mobile devices. He also shared MLOps solutions for scalable fine-tuning and model maintenance. The full video is already on the channel! #Brickit, #Lego, #AI, #Нейросети, #ComputerVision, #DeepLearning, #MobileAI, #Tech, #Стартап, #Будущее, #MachineLearning, #Гаджеты, #Технологии, #WOW, #AIприложение

4 weeks, 1 day ago @ youtube.com
How class labeling works in Brickit

Andrey Tatarinov, CEO and CTO at Epoch8, talked about the computer vision system behind the Brickit app, which scans a pile of LEGO pieces and suggests what can be built from them. In the talk, Andrey explained how his team dealt with rare classes and optimized the pipeline for mobile devices. He also shared MLOps solutions for scalable fine-tuning and model maintenance. The full video is already on the channel! #Brickit, #Lego, #AI, #Нейросети, #ComputerVision, #DeepLearning, #MobileAI, #Tech, #Стартап, #Будущее, #MachineLearning, #Гаджеты, #Технологии, #WOW, #AIприложение

1 month ago @ youtube.com
Is a single recommender for all services possible?

Technically, a single recommender for all services is possible. But there are two problems: data from Music, Kinopoisk, Market, and Afisha generalizes poorly, and there is simply nowhere to apply such a unified ranker, since we don't compare a track with an exhibition. So instead of one model there are blenders: each service ranks its own items, and then the blocks are mixed into a single feed. Would you like to see music, films, and products in one feed? 👇 #единыйрекомендатель #яндекс #яндексмузыка #кинопоиск #маркет #афиша #рекомендательныесистемы #машинноеобучение #ml #блендер #ранжирование

1 month, 1 week ago @ youtube.com
Where is the line between advice and manipulation?

Any advice is manipulation to some degree. It all depends on the goal: are you trying to sell something, or do you genuinely want to help the user? Danya Burlakov, head of the recommendation products group, answers. In this Short we talk about the ethics of recommender systems, where this fine line lies, and why honesty matters more. Can you tell when an algorithm is trying to manipulate you? Share in the comments 👇 Watch the full video on the channel! #совет #манипуляция #этика #рекомендательныесистемы #искусственныйинтеллект #яндекс #машинноеобучение #ml #этикаии

1 month, 2 weeks ago @ youtube.com
Do recommendations influence our behavior?

Recommender systems are more than just algorithms. They affect our mood and, through it, much else. In this Short we look at how music and other recommendations change users and why one of our goals is to make people happier. Presented by Danya Burlakov, head of the recommendation products group. Watch the full video on the channel! Do you think recommendations affect your mood? Write in the comments 👇 #влияниерекомендаций #поведениепользователей #настроение #рекомендательныесистемы #яндексмузыка #психология #машинноеобучение #ml #какэтоработает

1 month, 2 weeks ago @ youtube.com
Why do recommendations get stuck on familiar content?

What can be done so that users don't get tired of recommendations? How do you find music tracks and films that will broaden their interests? Presented by Danya Burlakov, head of the recommendation products group. Watch the full video on the channel! #эффектпузыря #фильтрпузыря #разнообразие #рекомендательныесистемы #яндексмузыка #машинноеобучение #трансформеры #ml #новоемузыка

1 month, 2 weeks ago @ youtube.com
"I want to throw everything out and start over": why it hurts so much

Danya Burlakov, head of the recommendation products group, explained what gets in the way of integrating ML models into the familiar flow of candidate generation and rankers. Watch the full video on the channel! #яндекс #генеративныемодели #генеративныйии #рекомендательныесистемы #машинноеобучение #ml #argus #catboost #внедрение #продакшен #mldevelopment

1 month, 3 weeks ago @ youtube.com
Are the changes in Yandex Music recommendations noticeable?

Danya Burlakov, head of the recommendation products group, explained how the team rolls out new features and what effect they have. Watch the full video on the channel! #яндекс #яндексмузыка #рекомендации #рекомендательныесистемы #машинноеобучение #ml #аргус #argus #трансформеры #ранжирование

1 month, 3 weeks ago @ youtube.com
Visual-text omni-model: the path to unifying LLM and VLM / Roman Isachenko

At Saturday ML Party, Roman Isachenko, head of the image analysis group at Yandex R&D, described the long road to merging the LLM and VLM from part of the Alice AI family into a single omni-model. It can work with text and images in one pipeline. He also shared the key stages, trade-offs, and plans for developing the model in the near future. ➡️ Subscribe to Yandex's Telegram channel for the ML community: https://t.me/+owyCvdge8WIyNTUy #AI, #MachineLearning, #LLM, #GenAI, #AIAgents, #RAG, #MLOps, #DataScience, #DeepLearning, #AIEngineering, #NeuralNetworks, #ComputerVision, #NLP, #TechTalk, #AIConference

1 month, 3 weeks ago @ youtube.com
ML Trainings
last post 2 days, 4 hours ago
Captain's Bridge #20: Carpathians at Anthropic | Humans not needed | A fine for hallucinations

This time the captains' guest was Valera Babushkin; we talked about pressing hiring questions, AI in industry, and everything else under the sun.

0:00:00 Start

0:01:00 Carpathians at Anthropic

0:09:29 Microsoft and Claude

0:24:07 Amazon and tokens

0:25:16 Humans not needed

0:38:18 Qwen 3.7 released

0:44:23 OpenAI and access

0:47:46 Sber and Chinese chips

0:53:20 AI writer

1:08:10 AI hairdresser

1:12:43 A fine for hallucinations

Mattermost channel: https://mm.ods.ai/ods/channels/data_captain (log in via ODS.ai) Other platforms: https://taplink.cc/datacaptains #Anthropic #Microsoft #OpenAI #Сбер #Qwen

2 days, 4 hours ago @ youtube.com
Overloaded routines, and agents criticize capitalism
1 week ago @ youtube.com
Neural networks and class consciousness: a discussion with Valentin Malykh
1 week ago @ youtube.com
Middle managers are dying out as a class
1 week ago @ youtube.com
Dmitry and Valentin discuss the AI race and China
1 week ago @ youtube.com
Captain's Bridge #19: Data centers ate people | Sakana and defense | Communist agents

0:00:00 Start

0:01:24 Anthropic and SpaceX reached a deal

0:07:32 Intel will make chips for Apple

0:10:40 The Chinese make chips too

0:17:03 Data centers ate people

0:24:53 A new startup from Qwen

0:32:50 Sparse transformers

0:47:46 Managers not needed

0:53:34 AI helplessness

0:58:44 Sakana and defense

1:05:05 Communist agents

AI summary:

A discussion of the latest news in artificial intelligence, chip manufacturing technology, and energy infrastructure, as well as business strategies in these areas. A discussion of the latest trends in artificial intelligence, including the development of embodied intelligence, regulation, and geopolitical aspects. The expert shares their views on the future…

1 week, 3 days ago @ youtube.com
Captain's Bridge #18: The truth about OpenAI | Big tech and the government | Spiritual bonds for AI

0:00:00 Start

0:00:55 European robotaxi

0:03:59 The truth about OpenAI

0:07:39 Big tech and the government

0:21:22 Anthropic and finance

0:24:49 Grok and finance

0:35:21 GPT on FPGA

0:46:31 Podcasts by AI

0:52:09 Data centers at sea

0:57:04 Income and AI

1:04:21 Malaysia and data centers

1:09:32 Spiritual bonds for AI

AI summary:

A discussion of the latest news in technology, robotaxis, AI regulation, and the influence of emotions on technology development. The guests share their opinions on the future of AI, legislation, and ethical questions. A discussion of the latest trends in fintech, AI, and security technology, as well as practical aspects of adopting new solutions. A discussion of modern technologies, their applications, and future trends…

2 weeks, 3 days ago @ youtube.com
Google's corporate schizophrenia
3 weeks, 2 days ago @ youtube.com
When faith in the pyramid begins to crack
3 weeks, 2 days ago @ youtube.com
Cyberpunk Impossible
3 weeks, 2 days ago @ youtube.com
Valentin Malykh suspects Elon Musk was right
3 weeks, 2 days ago @ youtube.com
Valentin Malykh on rising subscription prices
3 weeks, 2 days ago @ youtube.com
Computer vision: an application using a parabola as an example
3 weeks, 2 days ago @ youtube.com
Captain's Bridge #17: A data center the size of Utah | AI costs more than people | Pee for AI

0:00:00 Introduction

0:01:15 DeepSeek and Ascend

0:10:57 A data center the size of Utah

0:19:35 OpenAI and the stock drop

0:22:28 China against the Manus acquisition

0:30:48 ChatGPT with ads

0:38:38 Cohere bought Aleph Alpha

0:44:52 The AI law was softened

0:47:50 Google and Anthropic

0:53:07 Tencent, Alibaba, and DeepSeek

0:55:38 AI costs more than people

1:00:38 OpenAI and Microsoft

1:05:02 Gazprom Neft and unmanned vehicles

1:09:54 Groq is faster than Nvidia

1:12:45 AI designed a chip

1:17:46 Pee for AI

AI summary:

A discussion of the latest technology news, including the launch of new AI models, the development of the Chinese chip market, and geopolitical aspects of the technology business. A discussion of current trends in artificial intelligence…

3 weeks, 3 days ago @ youtube.com
A startup upends the paradigm of how people interact with AI agents
1 month ago @ youtube.com
Primer
last post 4 months, 3 weeks ago
Taking AI Doom Seriously For 62 Minutes

Patreon: https://www.patreon.com/primerlearning

80,000 Hours: 80000hours.org/primer

https://www.desmos.com/calculator/a5pfjtr4tr

Other connections:

Discord: https://discord.gg/NbruaNW

Twitch: https://www.twitch.tv/justin_helps

Store: https://store.dftba.com/collections/primer

Reddit: https://www.reddit.com/r/primerlearning/

Bsky: https://bsky.app/profile/justinhelps.bsky.social

Twitter: https://twitter.com/primerlearning

Links to other resources:

https://yoshuabengio.org/2024/07/09/reasoning-through-arguments-against-taking-ai-safety-seriously/

https://www.youtube.com/c/robertmilesai

https://www.youtube.com/@Siliconversations

https://www.youtube.com/@Go-Meta

https://www.youtube.com/@Dwarkes…

4 months, 3 weeks ago @ youtube.com
Simulating a single brain cell

Patreon: https://www.patreon.com/primerlearning

Helpful resources if you want to learn more about neural networks:

https://www.youtube.com/@AndrejKarpathy

https://course.fast.ai/

https://www.youtube.com/@WelchLabsVideo

https://www.youtube.com/@3blue1brown

Early papers. These probably aren't helpful for understanding the concepts in this video, but if you're interested in history.

The Perceptron – A perceiving and recognizing automaton: https://bpb-us-e2.wpmucdn.com/websites.umass.edu/dist/a/27637/files/2016/03/rosenblatt-1957.pdf

The Perceptron: A probabilistic model for information storage and organization in the brain: https://www.ling.upenn.edu/courses/cogs501/Rosenblatt1958.pdf

A Logical…

8 months ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
last post 2 weeks, 6 days ago
#496 – FFmpeg: The Incredible Technology Behind Video on the Internet

Jean-Baptiste Kempf is lead developer of VLC and president of VideoLAN.

Kieran Kunhya is a longtime FFmpeg contributor, codec engineer, and the person behind the now-infamous FFmpeg account on X.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep496-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://larridin.com Blitzy: AI agent for large enterprise codebases.

Go to https://perplexity.ai/ OUTLINE: (00:00) – Introduction (03:00) – Sponsors, Comments, and Reflections (10:48) – Weirdest things VLC opens (15:12) – How video playback works (24:33) – Video codecs and containers (35:20) – FFmpeg explained (56:20)…

2 weeks, 6 days ago @ lexfridman.com
#495 – Vikings, Ragnar, Berserkers, Valhalla & the Warriors of the Viking Age

Lars Brownworth is a historian, teacher, podcaster, and author specializing in Viking history, medieval Europe, and the Byzantine Empire.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep495-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://larridin.com BetterHelp: Online therapy and counseling.

Go to https://drinkLMNT.com/lex Fin: AI agent for customer service.

Go to https://perplexity.ai/ OUTLINE: (00:00) – Introduction (01:03) – Sponsors, Comments, and Reflections (08:57) – The start of the Viking Age (18:50) – Viking military strategy, tactics & technology (32:33) – Ragnar Lothbrok (42:00) – The Grea…

1 month, 2 weeks ago @ lexfridman.com
#494 – Jensen Huang: NVIDIA – The $4 Trillion Company & the AI Revolution

Jensen Huang is the co-founder and CEO of NVIDIA, the world’s most valuable company and the engine powering the AI computing revolution.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep494-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://drinkLMNT.com/lex Fin: AI agent for customer service.

Go to https://quo.com/lex OUTLINE: (00:00) – Introduction (00:26) – Sponsors, Comments, and Reflections (06:34) – Extreme co-design and rack-scale engineering (09:20) – How Jensen runs NVIDIA (28:41) – AI scaling laws (43:41) – Biggest blockers to AI scaling laws (45:25) – Supply chain (47:20) – Memory (53:25) – Power…

2 months ago @ lexfridman.com
#493 – Jeff Kaplan: World of Warcraft, Overwatch, Blizzard, and Future of Gaming

Jeff Kaplan is a legendary Blizzard game designer of World of Warcraft and Overwatch, now preparing to launch a new game, The Legend of California, from his new studio Kintsugiyama – available to wishlist on Steam today, with alpha later in March.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep493-sc See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://fin.ai/lex Blitzy: AI agent for large enterprise codebases.

Go to https://blitzy.com/lex BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex Shopify: Sell stuff online.

2 months, 2 weeks ago @ lexfridman.com
#492 – Rick Beato: Greatest Guitarists of All Time, History & Future of Music

Rick Beato is a music educator, interviewer, producer, songwriter, and a true multi-instrument musician, playing guitar, bass, cello & piano.

His incredible YouTube channel celebrates great musicians & musical ideas, and helps millions of people fall in love with great music all over again.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep492-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lex BetterHelp: Online therapy and counseling.

Go to https://drinkLMNT.com/lex Fin: AI agent for customer service.

2 months, 3 weeks ago @ lexfridman.com
#491 – OpenClaw: The Viral AI Agent that Broke the Internet – Peter Steinberger

Peter Steinberger is the creator of OpenClaw, an open-source AI agent framework that’s the fastest-growing project in GitHub history.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep491-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://coderabbit.ai/lex Fin: AI agent for customer service.

Go to https://fin.ai/lex Blitzy: AI agent for large enterprise codebases.

Go to https://drinkLMNT.com/lex OUTLINE: (00:00) – Introduction (03:51) – Sponsors, Comments, and Reflections (15:29) – OpenClaw origin story (18:48) – Mind-blowing moment (28:15) – Why OpenClaw went viral (32:12) – Self-modifying AI agent (36:57)…

3 months, 2 weeks ago @ lexfridman.com
#490 – State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI

Nathan Lambert and Sebastian Raschka are machine learning researchers, engineers, and educators.

Sebastian Raschka is the author of Build a Large Language Model (From Scratch) and Build a Reasoning Model (From Scratch).

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep490-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

(25:11) – ChatGPT vs Claude vs Gemini vs Grok: Who is winning?

(36:11) – Best AI for coding (43:02) – Open Source vs Closed Source LLMs (54:41) – Transformers: Evolution of LLMs since 2019 (1:02:38) – AI Scaling Laws: Are they dead or still holding?

3 months, 3 weeks ago @ lexfridman.com
#489 – Paul Rosolie: Uncontacted Tribes in the Amazon Jungle

Paul Rosolie is a naturalist, explorer, author of a new book titled Junglekeeper, and is someone who has dedicated his life to protecting the Amazon rainforest.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep489-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://perplexity.ai/ BetterHelp: Online therapy and counseling.

Go to https://fin.ai/lex Miro: Online collaborative whiteboard platform.

Go to https://miro.com/ MasterClass: Online classes from world-class experts.

4 months, 1 week ago @ lexfridman.com
#488 – Infinity, Paradoxes that Broke Mathematics, Gödel Incompleteness & the Multiverse – Joel David Hamkins

Joel David Hamkins is a mathematician and philosopher specializing in set theory, the foundations of mathematics, and the nature of infinity, and he’s the #1 highest-rated user on MathOverflow.

He is also the author of several books, including Proof and the Art of Mathematics and Lectures on the Philosophy of Mathematics.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep488-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://masterclass.com/lexpod OUTLINE: (00:00) – Introduction (01:58) – Sponsors, Comments, and Reflections (15:40) – Infinity & paradoxes (1:02:50) – Russell’s paradox (1:15:57) – Gödel’s…

4 months, 3 weeks ago @ lexfridman.com
#487 – Irving Finkel: Deciphering Secrets of Ancient Civilizations & Flood Myths

Irving Finkel is a scholar of ancient languages and a longtime curator at the British Museum, renowned for his expertise in Mesopotamian history and cuneiform writing.

He specializes in reading and interpreting cuneiform inscriptions, including tablets from Sumerian, Akkadian, Babylonian, and Assyrian contexts.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep487-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lex Miro: Online collaborative whiteboard platform.

Go to https://miro.com/ Chevron: Reliable energy for data centers.

5 months, 2 weeks ago @ lexfridman.com
#486 – Michael Levin: Hidden Reality of Alien Intelligence & Biological Life

Michael Levin is a biologist at Tufts University working on novel ways to understand and control complex pattern formation in biological systems.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep486-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lex Miro: Online collaborative whiteboard platform.

Go to https://miro.com/ MasterClass: Online classes from world-class experts.

(2:42:41) – Mind uploading (3:01:22) – Alien intelligence (3:16:17) – Advice for young people (3:22:46) – Questions for AGI

5 months, 3 weeks ago @ lexfridman.com
#485 – David Kirtley: Nuclear Fusion, Plasma Physics, and the Future of Energy

David Kirtley is a nuclear fusion engineer and CEO of Helion Energy, a company working on building the world's first commercial fusion power plant by 2028.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep485-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Transcript:

https://lexfridman.com/david-kirtley-transcript

CONTACT LEX:

Feedback - give feedback to Lex: https://lexfridman.com/survey

AMA - submit questions, videos or call-in: https://lexfridman.com/ama

Hiring - join our team: https://lexfridman.com/hiring

Other - other ways to get in touch: https://lexfridman.com/contact

EPISODE LINKS:

David's X: htt…

6 months, 1 week ago @ lexfridman.com
#484 – Dan Houser: GTA, Red Dead Redemption, Rockstar, Absurd & Future of Gaming

Dan Houser is co-founder of Rockstar Games and is a legendary creative mind behind Grand Theft Auto (GTA) and Red Dead Redemption series of video games.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep484-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://box.com/ai UPLIFT Desk: Standing desks and office ergonomics.

Go to https://drinkLMNT.com/lex OUTLINE: (00:00) – Introduction (01:29) – Sponsors, Comments, and Reflections (11:32) – Greatest films of all time (23:45) – Making video games (26:36) – GTA 3 (29:55) – Open world video games (32:42) – Character creation (36:09) – Superintelligent AI in A Bette…

6 months, 3 weeks ago @ lexfridman.com
#483 – Julia Shaw: Criminal Psychology of Murder, Serial Killers, Memory & Sex

Julia Shaw is a criminal psychologist and author who in her books explores human nature, including psychopathy, violent crime, the psychology of evil, police interrogation, false memory manipulation, deception detection, and human sexuality.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep483-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lex BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex LMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lex AG1: All-in-one daily nutrition drink.

7 months, 2 weeks ago @ lexfridman.com
#482 – Pavel Durov: Telegram, Freedom, Censorship, Money, Power & Human Nature

Pavel Durov is the founder and CEO of Telegram.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep482-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Transcript:

https://lexfridman.com/pavel-durov-transcript

CONTACT LEX:

Feedback – give feedback to Lex: https://lexfridman.com/survey

AMA – submit questions, videos or call-in: https://lexfridman.com/ama

Hiring – join our team: https://lexfridman.com/hiring

Other – other ways to get in touch: https://lexfridman.com/contact

EPISODE LINKS:

Pavel’s Telegram: https://t.me/durov

Pavel’s X: https://x.com/durov

Telegram: https://telegram.org/

Telegram Contests: https://contest.c…

7 months, 4 weeks ago @ lexfridman.com
Microsoft Research Podcast
last post 1 month, 1 week ago
Can we AI our way to a more sustainable world?

Because I do think there’s a role for AI, a huge role for AI.

BURGER: Right, right.

BURGER: Right, right.

So I think that’s also something quite important here that, you know, AI can help facilitate.

And I think that’s not just applying AI to solve solutions through optimization but also thinking about this in an integrated way.

1 month, 1 week ago @ microsoft.com
Ideas: Steering AI toward the work future we want

JANSSEN: Yeah, yeah, exactly.

TEEVAN: Yeah, yeah, yeah.

I’m curious what you have found particularly surprising about how people and organizations are leveraging AI right now.

And so I do like to picture a future of work where humans are flourishing with AI and where humans still get to do meaningful work.

And I’m very curious about how we can take advantage of AI and do more without running ourselves into the ground because we’re not AI, right?

1 month, 2 weeks ago @ microsoft.com
Will machines ever be intelligent?

And the question we’re going to discuss is, are machines intelligent?

No, no, that’s right, that’s right.

I mean, in some sense, you could potentially have a super intelligent system, right, that’s far more intelligent than anything else on the planet.

BURGER: Right, right.

At the same time, I think, you know, transformers are not intelligent in the way that a three-year-old is, right?

2 months ago @ microsoft.com
Trailer: The Shape of Things to Come

Join Microsoft’s Doug Burger and guests as they dig into the fundamental truths about AI and how it will reshape the future.

Technical advances are moving at such a rapid pace that it can be challenging to define the tomorrow we’re working toward.

In The Shape of Things to Come, Microsoft research leader Doug Burger and experts from across disciplines tease out the thorniest AI issues facing technologists, policymakers, business decision-makers, and other stakeholders today.

It’s important to understand what the emerging shapes are and how we should respond.” – Doug Burger, Technical Fellow and Corporate Vice President, Microsoft Research

About Doug Burger

Doug Burger is a research leader in …

2 months, 3 weeks ago @ microsoft.com
Ideas: Community building, machine learning, and the future of AI

This week, machine learning researchers around the world will be attending the annual Conference on Neural Information Processing Systems, or NeurIPS.

In this series, we’ll explore the technologies that are shaping our future and the big ideas that propel them forward.

So around that time when I started my PhD at Penn, I was working in machine learning theory and algorithmic economics.

How had you experienced a lack of community or network of women in machine learning before the founding of WiML?

So particularly when working on topics related to fairness, I’ve ended up focusing a bunch on stuff to do with marginalized groups as part of my responsible AI work.

5 months, 3 weeks ago @ microsoft.com
Ideas: More AI-resilient biosecurity with the Paraphrase Project

Today, I’m excited to talk about the Paraphrase Project, an effort I co-led exploring how advances in AI tools for protein design might impact biosecurity.

These “patches,” akin to those in cybersecurity, have now been shared with organizations globally to strengthen biosecurity screening.

The project highlights that the same AI tools capable of incredible good can also be misused, requiring us to be vigilant, thoughtful, and creative so we continue to get the most benefit out of AI tools while working to ensure that we avoid costly misuses.

So things like, how similar is this to that template, wild-type protein structure that we used as our conditioning information?

But I feel like broadly…

7 months, 3 weeks ago @ microsoft.com
Coauthor roundtable: Reflecting on healthcare economics, biomedical research, and medical education

KOHANE: So I think you’ve “nerd sniped” me because you [LAUGHTER]—which is all too easy—but I think there’s a central issue here.

But I actually think this is dark matter of human organizational technology that is not well understood.

AZEEM AZHAR: We didn’t talk about, you know, AI in its ability to potentially do this, which is to extend the clinician’s presence throughout the week.

And so I think there’s always going to be an opening for either differences of opinion or agreeing with you too much.

And this gets into whether AI is really going to get almost to the ab initio understanding of human biology.

9 months, 1 week ago @ microsoft.com
Reimagining healthcare delivery and public health with AI

9 months, 3 weeks ago @ microsoft.com
NLP Highlights
latest post None
Data Skeptic
latest post 3 weeks, 5 days ago
Student Spotlight: Aaron Payne, Data Analyst

Aaron Payne, an MBA student at Georgia Tech studying business analytics and a Senior Insights Analyst at Chick-fil-A, joins Kyle Polich to talk about turning analytics into decisions that matter. They unpack a real-world forecasting project with Comfama in Colombia, including messy data realities, interpretability tradeoffs, and why "data science for good" starts with the people impacted.

3 weeks, 5 days ago @ dataskeptic.com
The Future is Agentic in Recommender Systems

Kyle Polich sits down with Yashar Deldjoo, research scientist and Associate Professor at the Polytechnic University of Bari, to explore how recommender systems have evolved and why trustworthiness matters. They unpack key dimensions of responsible AI, including robustness to adversarial attacks, privacy, explainability, and fairness, and discuss how LLMs introduce new risks like hallucinations. The episode closes with a look at "agentic" recommender systems, where tools and memory shift recommendations from ranked lists to end-to-end task completion.

1 month ago @ dataskeptic.com
Book Ratings and Recommendations

Goodreads star ratings can be misleading as measures of "book quality," and research from Hannes Rosenbusch suggests that for many professionally published books, differences between readers often matter more than differences between books. The episode also explores how to model reader preferences, why reviews often reveal more about the reviewer than the text, and how LLMs can aid computational literary research while still falling short of human editors in creative writing.

2 months ago @ dataskeptic.com
Disentanglement and Interpretability in Recommender Systems
2 months, 2 weeks ago @ dataskeptic.com
Collective Altruism in Recommender Systems

Ekaterina (Kat) Filadova from MIT EECS joins us to discuss strategic learning in recommender systems—what happens when users collectively coordinate to game recommendation algorithms. Kat's research reveals surprising findings: algorithmic "protest movements" can paradoxically help platforms by providing clearer preference signals, and the challenge of distinguishing coordinated behavior from bot activity is more complex than it appears. This episode explores the intersection of machine learning and game theory, examining what happens when your training data actively responds to your algorithm.

2 months, 4 weeks ago @ dataskeptic.com
Niche vs Mainstream

Anas Buhayh discusses multi-stakeholder fairness in recommender systems and the S'mores framework—a simulation allowing users to choose between mainstream and niche algorithms. His research shows specialized recommenders improve utility for niche users while raising questions about filter bubbles and data privacy.

3 months, 1 week ago @ dataskeptic.com
Healthy Friction in Job Recommender Systems

In this episode, host Kyle Polich speaks with Roan Schellingerhout, a fourth-year PhD student at Maastricht University, about explainable multi-stakeholder recommender systems for job recruitment. Roan discusses his research on creating AI-powered job matching systems that balance the needs of multiple stakeholders—job seekers, recruiters, HR professionals, and companies. The conversation explores different types of explanations for job recommendations, including textual, bar chart, and graph-based formats, with findings showing that lay users strongly prefer simple textual explanations over more technical visualizations. Roan shares insights from his "healthy friction" study, which tested …

3 months, 3 weeks ago @ dataskeptic.com
Fairness in PCA-Based Recommenders

In this episode, we explore the fascinating world of recommender systems and algorithmic fairness with David Liu, Assistant Research Professor at Cornell University's Center for Data Science for Enterprise and Society. David shares insights from his research on how machine learning models can inadvertently create unfairness, particularly for minority and niche user groups, even without any malicious intent. We dive deep into his groundbreaking work on Principal Component Analysis (PCA) and collaborative filtering, examining why these fundamental techniques sometimes fail to serve all users equally. David introduces the concept of "power niche users" - highly active users with specialized in…

4 months ago @ dataskeptic.com
Video Recommendations in Industry

In this episode, Kyle Polich sits down with Cory Zechmann, a content curator working in streaming television with 16 years of experience running the music blog "Silence Nogood." They explore the intersection of human curation and machine learning in content discovery, discussing the concept of "algatorial" curation—where algorithms and editorial expertise work together. Key topics include the cold start problem, why every metric is just a "proxy metric" for what users actually want, the challenge of filter bubbles, and the importance of balancing familiarity with discovery. Cory shares insights on why TikTok's algorithm works so well (clean data and massive interaction volume), the crucial …

5 months ago @ dataskeptic.com
Eye Tracking in Recommender Systems

In this episode, Santiago de Leon takes us deep into the world of eye tracking and its revolutionary applications in recommender systems. As a researcher at the Kempelen Institute and Brno University, Santiago explains the mechanics of eye tracking technology—how it captures gaze data and processes it into fixations and saccades to reveal user browsing patterns. He introduces the groundbreaking RecGaze dataset, the first eye tracking dataset specifically designed for recommender systems research, which opens new possibilities for understanding how users interact with carousel interfaces like Netflix. Through collaboration between psychologists and AI researchers, Santiago's work demonstrate…

5 months, 1 week ago @ dataskeptic.com
Cracking the Cold Start Problem

In this episode of Data Skeptic, we dive deep into the technical foundations of building modern recommender systems. Unlike traditional machine learning classification problems where you can simply apply XGBoost to tabular data, recommender systems require sophisticated hybrid approaches that combine multiple techniques. Our guest, Boya Xu, an assistant professor of marketing at Virginia Tech, walks us through a cutting-edge method that integrates three key components: collaborative filtering for dimensionality reduction, embeddings to represent users and items in latent space, and bandit learning to balance exploration and exploitation when deploying new recommendations. Boya shares insigh…

5 months, 2 weeks ago @ dataskeptic.com
Designing Recommender Systems for Digital Humanities

In this episode of Data Skeptic, we explore the fascinating intersection of recommender systems and digital humanities with guest Florian Atzenhofer-Baumgartner, a PhD student at Graz University of Technology. Florian is working on Monasterium.net, Europe's largest online collection of historical charters, containing millions of medieval and early modern documents from across the continent. The conversation delves into why traditional recommender systems fall short in the digital humanities space, where users range from expert historians and genealogists to art historians and linguists, each with unique research needs and information-seeking behaviors. Florian explains the technical challen…

6 months ago @ dataskeptic.com
DataRec Library for Reproducible in Recommend Systems

In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich explores DataRec, a new Python library designed to bring reproducibility and standardization to recommender systems research. Guest Alberto Carlo Mario Mancino, a postdoc researcher from Politecnico di Bari, Italy, discusses the challenges of dataset management in recommendation research—from version control issues to preprocessing inconsistencies—and how DataRec provides automated downloads, checksum verification, and standardized filtering strategies for popular datasets like MovieLens, Last.fm, and Amazon reviews. The conversation covers Alberto's research journey through knowledge graphs, graph-based recommen…

6 months, 2 weeks ago @ dataskeptic.com
Shilling Attacks on Recommender Systems

In this episode of Data Skeptic's Recommender Systems series, Kyle sits down with Aditya Chichani, a senior machine learning engineer at Walmart, to explore the darker side of recommendation algorithms. The conversation centers on shilling attacks—a form of manipulation where malicious actors create multiple fake profiles to game recommender systems, either to promote specific items or sabotage competitors. Aditya, who researched these attacks during his undergraduate studies at SPIT before completing his master's in computer science with a data science specialization at UC Berkeley, explains how these vulnerabilities emerge particularly in collaborative filtering systems. From promoting a …

6 months, 3 weeks ago @ dataskeptic.com
Music Playlist Recommendations

In this episode, Rebecca Salganik, a PhD student at the University of Rochester with a background in vocal performance and composition, discusses her research on fairness in music recommendation systems. She explores three key types of fairness—group, individual, and counterfactual—and examines how algorithms create challenges like popularity bias (favoring mainstream content) and multi-interest bias (underserving users with diverse tastes). Rebecca introduces LARP, her multi-stage multimodal framework for playlist continuation that uses contrastive learning to align text and audio representations, learn song relationships, and create playlist-level embeddings to address the cold start prob…

7 months ago @ dataskeptic.com
SuperDataScience
latest post 1 day, 6 hours ago
995: End-to-End Foundation Models for the Energy Industry, with Jazmia Henry

Jazmia Henry joins Jon Krohn to break down what it actually takes to build end-to-end foundation models for the energy industry. From wrangling decades of handwritten oil-and-gas documents into usable training data, to bespoke tokenizers, reinforcement learning, and inference at scale, Jazmia walks through every stage of the stack. Along the way she explains why reinforcement learning models are "bursty," what reward hacking is and how her Grounded Continuous Evaluation framework fixes it, and revisits the 2023 NeurIPS paper that argued, to widespread skepticism at the time, that scaling bad data degrades model performance. Additional materials: h…

1 day, 6 hours ago @ podtrac.com
994: AI’s Putting Recent Grads Out of Work; Here’s How to Get Hired Anyway!

Unemployment for recent computer-science graduates now rivals rates for fine-arts and anthropology majors, and undergraduate CS enrollment fell 11% in 2025. In this Five-Minute Friday, Jon Krohn walks through the data on both sides of the debate, from Stanford research showing a 13% employment drop for young workers in AI-exposed jobs, to Federal Reserve studies finding no statistically detectable link between AI adoption and reduced hiring. Jon shares his own view on where the truth lies and offers five concrete pieces of advice for graduates and senior professionals alike on how to get hired in 2026. Additional materials: www.superdatascience.com/993…

5 days, 6 hours ago @ podtrac.com
993: How to Build AI-First Organizations, with Jacob Miller and Jeremy Mumford

For years, AI content has come in the form of “use this library, use this tool” tutorials that age out within months. Jacob Miller and Jeremy Mumford, co-authors of the brand new Wiley book Architected Intelligence, wanted to write something different, a guide to the higher-level principles of building AI products and AI-first organizations that will still be relevant in five or ten years. In this episode, the two Pattern engineers walk Jon Krohn through the core ideas of their book: why you should design products and processes so they can be executed by a human, an AI agent, or any hybrid combination; why most companies are still treating hallucinations as a model problem when they’re actu…

1 week, 1 day ago @ podtrac.com
992: Tokenmaxxing vs AI Hardware Bottlenecks

While “tokenmaxxing”, the social media trend of maximizing AI token consumption as a vanity metric, takes off online, the physical infrastructure behind AI is slamming into serious bottlenecks. In this Five-Minute Friday, Jon Krohn maps out the four overlapping supply-chain constraints choking AI compute: GPUs (with NVIDIA Blackwell sold out through mid-2026), high-bandwidth memory (quintupled demand since 2023, only three manufacturers worldwide), CPUs (agentic AI requires 12x more CPUs per GPU than chatbots), and electricity (Gartner projects power shortages will restrict 40% of AI data centres by 2027). Find out why the five biggest hyperscalers are on track to spend $725 billion on AI i…

1 week, 5 days ago @ podtrac.com
991: Pair Programming with AI in Your Python Notebook, with Dr. Trevor Manz

Dr. Trevor Manz of Marimo talks to Jon Krohn about Marimo Pair, an open-source agent skill that teaches coding agents like Claude Code how to drive a reactive Python notebook, reading cell state, running Python in the kernel, taking screenshots of cells, and iterating on data tasks the way agents iterate on traditional software. Trevor also unpacks recursive language models, his AnyWidget project that bridges Python and the web, and his journey from a Wisconsin small town and Harvard bioinformatics research to founding-engineer life at Marimo. Listen to the episode to hear why no matter where AI takes us, curiosity and going deep on a topic will always be valuable. Additional materials: …

2 weeks, 1 day ago @ podtrac.com
990: Inside Mythos: Anthropic's Locked-Down Frontier Model

Anthropic has built a frontier AI model so capable at finding software vulnerabilities that it has decided not to release it to the general public. In this Five-Minute Friday, Jon Krohn breaks down Claude Mythos Preview, a general-purpose model whose hacking abilities emerged as a side effect of broad improvements in code understanding and reasoning. Find out how Mythos achieved a nearly 100x improvement over Opus 4.6 on Firefox exploit generation, why Mozilla patched 271 vulnerabilities in a single release using an early version of the model, and what Project Glasswing, Anthropic’s gated industry consortium, means for the future of cybersecurity. Jon also shares practical tips for securing t…

2 weeks, 5 days ago @ podtrac.com
989: Security for Mythos-Era Agentic Risks, with Rubrik’s Anneka Gupta and Cal Al-Dhubaib

Rubrik’s Anneka Gupta and Cal Al-Dhubaib speak to Jon Krohn about cybersecurity measures, the risks AI in business might pose for malicious attacks, and why AI should be kept “boring.” Find out how Rubrik safeguards client data, what zero trust is in the context of cybersecurity, and why cyber-resilience needs to be a top priority for companies looking to adopt AI. Additional materials: www.superdatascience.com/989 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (02:25) All about Rubrik (08:51) The announcement of Claude …

3 weeks, 1 day ago @ podtrac.com
988: In Case You Missed It in April 2026

In this month’s episode of In Case You Missed It, Jon Krohn talks to guests about memory and education, and how artificial intelligence is continuing to help lower the barriers to access. Hear from Matt Glickman, Traci Walker Griffith, Richmond Alake, and Linda Haviv, discussing the foundations of AI agent memory, how engineers can develop at scale, and why they believe AI could be your child’s perfect tutor in the classroom. Additional materials: www.superdatascience.com/988 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

3 weeks, 5 days ago @ podtrac.com
987: AI Infrastructure, Ray, and Why Nonlinear Careers Win, with Linda Haviv

Linda Haviv talks to Jon Krohn about staying current on AI matters, why open-source technology is narrowing the gap in its race with proprietary models, and how being a content creator in tech is key to career growth and longevity. She emphasizes that non-linear pathways to a career in tech can give applicants an edge, and stresses the importance of continuous upskilling to “stay relevant.” In her view, systems thinking is becoming more important than coding skills. Hear why in this episode. Additional materials: www.superdatascience.com/987 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascienc…

4 weeks, 1 day ago @ podtrac.com
986: Building Hardware is Hard but AI Agents Help, with Kishore Subramanian

CTO of Propel Software Kishore Subramanian talks to Jon Krohn about how product lifecycle management (PLM) software and quality management systems help ensure compliance, record management, and quality assurance. Listen to the episode to hear Kishore Subramanian talk about best practices for getting started with Agentforce 360, his top tips for deploying AI projects, and why yoga and meditation could make you better at building AI products! Additional materials: www.superdatascience.com/984 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (05:21) …

1 month ago @ podtrac.com
985: The Four Types of Memory Every AI Agent Needs, with Richmond Alake

Oracle’s Director of AI Developer Experience Richmond Alake returns to the show to talk to Jon Krohn about agent memory: the network of systems, models, databases, and LLMs that enable AI agents to learn and adapt over time. Listen to the episode to hear about Richmond’s “100 Days of Agent Memory” initiative, the limitations of retrieval-augmented generation (RAG) with AI agents, the layers of the AI agent stack, and what makes the Oracle AI database so useful to developers. Additional materials: www.superdatascience.com/985 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship inf…

1 month ago @ podtrac.com
984: Building AI Agents Where 99.9% Accuracy Isn't Good Enough, with Raju Malhotra

Raju Malhotra, Chief Product and Technology Officer at Certinia, talks to Jon Krohn about the so-called SaaSpocalypse and how agentic AI is proving the doomsayers wrong. Listen to the episode to hear more about Certinia’s work with Salesforce and building with Agentforce 360, the three elements required for enterprise-grade agents, how AI agents have benefitted Certinia’s customers, and how to keep your work portfolio fresh and interesting to recruiters. Additional materials: www.superdatascience.com/984 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will lea…

1 month, 1 week ago @ podtrac.com
983: AI in the Classroom: How a Top Elementary School Is Doing It Right, with Principal Traci Walker Griffith

My guest today took a public school that was about to be shut down and turned it into the number one school in Boston, and AI is her latest secret weapon. In a long-overdue episode on AI for supporting children’s education, hear directly from Principal Traci Walker Griffith how her teachers have been experimenting with AI in classrooms, what works, what doesn’t work, and what’s next for kids as LLMs continue to improve. Additional materials: www.superdatascience.com/983 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (03:38) Th…

1 month, 1 week ago @ podtrac.com
982: In Case You Missed It in March 2026

Jon Krohn rounds up March’s interviews in this ICYMI episode. Hear from AI and data science experts across the fields of education and business in this wide-ranging series of clips that take listeners from the Renaissance to the near future. Guests include Lin Qiao (Episode 971), Chris Fregly (Episode 973), Zack Kass (Episode 975), Kyunghyun Cho (Episode 977), and Rohit Choudhary (Episode 979). Additional materials: www.superdatascience.com/982 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month, 2 weeks ago @ podtrac.com
981: How Data Engineers Are “10x’ing” Themselves With Agents, feat. Matt Glickman

Matt Glickman talks to Jon Krohn about co-founding the agentic-platform startup, Genesis Computing, how his experience at Goldman Sachs paved the way for developing AI agents, and where he thinks agentic AI has just as much value as a company’s human employees. This February, Genesis Computing revealed how its platform can offer the guardrails so crucial to businesses, alongside increased capabilities that help execute entire workflows from research to deployment. Additional materials: www.superdatascience.com/981 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. …

1 month, 2 weeks ago @ podtrac.com
Data Science at Home
latest post 1 week, 1 day ago
Recommend and manipulate: the dangers of the attention economy

This sort of operation is directly exploiting a core feature of internet social media platforms.

The main purpose of recommender systems is to recommend to people the items that similar people have shown an interest in.

Some of the most common methods for implementing recommender systems use concepts such as cosine/correlation similarity, matrix factorization, neural autoencoders, and sequence predictors.
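
As a rough illustration of the cosine-similarity idea mentioned in this excerpt, here is a minimal sketch of user-based recommendation. It is not taken from the episode: the names `cosine_similarity`, `recommend`, and the toy `ratings` table are hypothetical, and a rating of 0 is assumed to mean "not yet seen".

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a user with no ratings is similar to nobody
    return dot / (norm_a * norm_b)


def recommend(ratings, user, k=1):
    """Return indices of unseen items rated by the k most similar users."""
    target = ratings[user]
    # Rank the other users by similarity to the target user.
    neighbours = sorted(
        (u for u in ratings if u != user),
        key=lambda u: cosine_similarity(target, ratings[u]),
        reverse=True,
    )[:k]
    # Sum the neighbours' ratings for items the target has not rated (assumed 0).
    scores = {}
    for u in neighbours:
        for item, rating in enumerate(ratings[u]):
            if target[item] == 0 and rating > 0:
                scores[item] = scores.get(item, 0.0) + rating
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: one row per user, one column per item.
ratings = {
    "ann": [5, 4, 0, 0],
    "bob": [5, 5, 4, 0],
    "cat": [0, 0, 1, 5],
}

print(recommend(ratings, "ann"))
```

Here "ann" is matched with "bob", whose tastes overlap with hers, so she is offered the one item he rated that she has not seen. Matrix factorization and the other methods named above replace this explicit neighbour search with learned representations.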

As you say, recommender systems exist because the business model of social media platforms is to monetise attention.

F: So you are saying that this is not an accident: is this the basis of the optimisation of the recommender system?

1 week, 1 day ago @ datascienceathome.com
Social media is an ant mill (Internet is a disaster) (Ep. 303)

Personal newsletter: https://defragzone.substack.com 📩 Newsletter: https://datascienceathome.substack.com 🎙 Podcast: Available on Spotify, Apple Podcasts, and more.

🐦 Twitter: @DataScienceAtHome 📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/ Instagram: https://www.instagram.com/datascienceathome/ Facebook: https://www.facebook.com/datascienceAH LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast Discord Channel: https://discord.gg/4UNKGf3

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews…

1 week, 1 day ago @ datascienceathome.com
AI and videogames (Ep. 305)

What is the state of AI and videogames?

This and much more is covered in this 1st episode of AI and videogames.

Check outshift.com

Send us mail at: [email protected] Don’t forget to like, subscribe, and hit the 🔔 for updates on the latest in AI and data science!

1 week, 1 day ago @ datascienceathome.com
AI and videogames: Conversational NPCs (Ep. 306)

Can NPCs in videogames leverage new LLM-based tech?

1 week, 1 day ago @ datascienceathome.com
AI tips & tricks (Ep. 307)

SPONSORS: This episode is brought to you by Outshift, Cisco’s incubation engine. Check outshift.com

1 week, 1 day ago @ datascienceathome.com
Europe, wake up! You Can’t Be a Superpower on Someone Else’s Servers (Ep. 304)

Tech sovereignty takes 3 years and political will.

1 month ago @ datascienceathome.com
About Apple’s Privacy (Ep. 302)

Apple just spent $2B on tech that reads your silent speech.

🐦 Twitter: @DataScienceAtHome📘LinkedIn: https://www.linkedin.com/in/fragadaleta/Instagram: https://www.instagram.com/datascienceathome/Facebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews, and discussions.

Send us mail at: [email protected]

Don’t forget to like, subscribe, and hi…

1 month ago @ datascienceathome.com
Productivity is the new data breach (Ep. 301)

Personal newsletter: https://defragzone.substack.com
📩 Newsletter: https://datascienceathome.substack.com
🎙 Podcast: Available on Spotify, Apple Podcasts, and more.

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
Discord Channel: https://discord.gg/4UNKGf3

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews…

1 month ago @ datascienceathome.com
Programmable Money: The Cage They’ll Call Convenience (Ep. 300)

This episode breaks down programmable money, the technology that turns your wallet into a permission system.

Personal newsletter: https://defragzone.substack.com
📩 Newsletter: https://datascienceathome.substack.com
🎙 Podcast: Available on Spotify, Apple Podcasts, and more.

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
Discord Channel: https://discord.gg/4UNKGf3

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Send us mail at: …

1 month ago @ datascienceathome.com
There Is No AI. There’s a Stateless Function on 10,000 GPUs Pretending to Know You (Ep. 299)

Personal newsletter: https://defragzone.substack.com
📩 Newsletter: https://datascienceathome.substack.com
🎙 Podcast: Available on Spotify, Apple Podcasts, and more.

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
Discord Channel: https://discord.gg/4UNKGf3

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, intervi…

2 months, 3 weeks ago @ datascienceathome.com
Bias in the machine (edited)

The title of today’s episode is Bias in the machine.

C: Francesco, today we are starting with an infuriating discussion.

The failure of the medical community as a whole to recognise this obvious bias up to the 21st century is an example of how insidious the problem of bias is.

Three: The bias in your training sample: people put training samples together, and people have culture, experience, and prejudice.

These assumptions inform the way AI systems work—and fail—to this day.

When an algorithm is a black box and you can’t look inside, you have no way of analysing its bias.

2 months, 3 weeks ago @ datascienceathome.com
What is wrong with reinforcement learning? (Ep. 82)

Join the discussion on our Discord server

Now that reinforcement learning agents have done so well at playing Atari video games and Go (AlphaGo), at financial trading, and at language modeling, let me tell you the real story.

In this episode I want to shed some light on reinforcement learning (RL) and the limitations that every practitioner should consider before taking certain directions.

RL seems to work so well!

What is wrong with it?

Are you a listener of the Data Science at Home podcast?

Or did you subscribe to the Artificial Intelligence at your fingertips newsletter?

3 months, 3 weeks ago @ datascienceathome.com
How to generate very large images with GANs (Ep. 76)

Join the discussion on our Discord server

In this episode I explain how a research group from the University of Lübeck overcame the curse of dimensionality to generate large medical images with GANs.

The problem is not as trivial as it seems.

Many researchers have failed in generating large images with GANs before.

One interesting application of this approach is in medicine, for generating CT and X-ray images.

Enjoy the show!

References

Multi-scale GANs for Memory-efficient Generation of High Resolution Medical Images https://arxiv.org/abs/1907.01376

3 months, 3 weeks ago @ datascienceathome.com
Training neural networks faster without GPU [RB] (Ep. 77)

Join the discussion on our Discord server

Training neural networks faster usually involves powerful GPUs.

In this episode I explain an interesting method from a group of researchers from Google Brain, who can train neural networks faster by squeezing the hardware to their needs and making the training pipeline more dense.

Enjoy the show!

References

Faster Neural Network Training with Data Echoing https://arxiv.org/abs/1907.05550

3 months, 3 weeks ago @ datascienceathome.com
More powerful deep learning with transformers (Ep. 84) (Rebroadcast)

Some of the most powerful NLP models like BERT and GPT-2 have one thing in common: they all use the transformer architecture.

This architecture is built on top of another important concept already known to the community: self-attention.

In this episode I explain what these mechanisms are, how they work, and why they are so powerful.

Don’t forget to subscribe to our Newsletter or join the discussion on our Discord server

References

3 months, 3 weeks ago @ datascienceathome.com