Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
latest post 1 hour ago
We have been building and working on a local AI with memory and persistence [P]

1 hour ago @ reddit.com
[D] ICML assigned me a paper that I reviewed in ICLR

3 hours ago @ reddit.com
[D] Average Number of Interviews to Get a Job (US)

3 hours ago @ reddit.com
[P] I trained YOLOX from scratch to avoid Ultralytics' AGPL (aircraft detection on iOS)

8 hours ago @ reddit.com
[P] I built 16 single-file, zero-dependency Python implementations of every major AI algorithm — from tokenizers to quantization. No PyTorch. No TensorFlow. Just the math.

12 hours ago @ reddit.com
[D] Asymmetric consensus thresholds for multi-annotator NER — valid approach or methodological smell?

13 hours ago @ reddit.com
[D] ARR Jan ARR Discussion

14 hours ago @ reddit.com
[D] Interesting Gradient Norm Goes Down-Up-Down

18 hours ago @ reddit.com
[D] Minimax 2.5 is out, considering local deployment

21 hours ago @ reddit.com
[R] Lrec 26 acceptance emails

21 hours ago @ reddit.com
[D] Struggling on the NLP job market as a final-year PhD, looking for advice

22 hours ago @ reddit.com
[D] Mamba exhibits "Active Sensing" while LSTM suffers "Posterior Collapse" under Adversarial Noise

23 hours ago @ reddit.com
[D] Benchmarking Deep RL Stability Capable of Running on Edge Devices

1 day, 2 hours ago @ reddit.com
[R] Higher effort settings reduce deep research accuracy for GPT-5 and Gemini Flash 3

1 day, 5 hours ago @ reddit.com
[P] SoproTTS v1.5: A 135M zero-shot voice cloning TTS model trained for ~$100 on 1 GPU, running ~20× real-time on the CPU

1 day, 6 hours ago @ reddit.com
Towards Data Science
latest post 11 hours ago
Your First 90 Days as a Data Scientist

Therefore, in this article, I’ll share what mattered most in the beginning months and my checklist for any data science onboarding.

Build Data Knowledge: building data knowledge is as important as building domain knowledge for data scientists.

Make full use of AI-assisted data tools: Every tech company is integrating AI into its data workflows.

Understand key metrics and their relationships: Data knowledge not only means being able to access and query the data, but understand the business from a data lens.

Suggest process improvements: Every data team operates differently, with pros and cons.

11 hours ago @ towardsdatascience.com
The Evolving Role of the ML Engineer

Stephanie is a Staff Machine Learning Engineer, with almost 10 years of experience in data science and ML.

You have been working as an ML Engineer at DataGrail for more than two years.

I think the progress of code assistants using LLMs is really fascinating and is changing how a lot of people work in ML and in software engineering.

I think there’s still a lot for people in ML to do, though, especially applying our skills acquired from experience to unusual or unique problems.

It would also help if we had a broad campaign of public education about what LLMs and “AI” really are, demystifying the technology as much as we can.

1 day, 8 hours ago @ towardsdatascience.com
AI in Multiple GPUs: Point-to-Point and Collective Operations

For NVIDIA GPUs, it’s NCCL (NVIDIA Collective Communications Library), while for AMD it’s RCCL (ROCm Communication Collectives Library).

It doesn’t enqueue into the current active stream, but rather to a dedicated internal NCCL stream per device.

The kernel is enqueued into a dedicated internal NCCL stream per device, which allows for overlapping communication with computation.

Calling wait() on the returned work handle forces a synchronization between your current active stream and the NCCL stream.

rank = torch.distributed.get_rank()
world_size = torch.distributed.get_world_size()
gather_list = None if rank != 0 else [torch.zeros(3, dtype=torch.int64).to(device) for _ in range(world_size)]
t = torch.tensor([0+rank, 1+rank, 2+rank]…
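
A minimal sketch of how a gather like this might be completed, assuming the process group has already been initialized with the NCCL backend and that device points at this rank's GPU (an illustration, not the article's exact listing):

import torch
import torch.distributed as dist

# Assumes dist.init_process_group(backend="nccl") has already run (e.g. under torchrun)
rank = dist.get_rank()
world_size = dist.get_world_size()
device = torch.device(f"cuda:{rank}")

# Only the destination rank allocates the receive buffers
gather_list = None
if rank == 0:
    gather_list = [torch.zeros(3, dtype=torch.int64, device=device) for _ in range(world_size)]

# Every rank contributes one small tensor; rank 0 ends up with all of them
t = torch.tensor([0 + rank, 1 + rank, 2 + rank], dtype=torch.int64, device=device)
dist.gather(t, gather_list=gather_list, dst=0)

if rank == 0:
    print([x.tolist() for x in gather_list])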

1 day, 10 hours ago @ towardsdatascience.com
How to Leverage Explainable AI for Better Business Decisions

From conversational agents like ChatGPT to autonomous vehicles and content curation engines on social media platforms, the foundational AI system remains remarkably consistent.

It is the data scientist’s job to interpret the model results and align them with domain expertise to build the final narrative.

Once a business knows what drives purchase behavior, there are numerous ways to leverage this information to make smart business decisions.

It helps you estimate how much revenue to expect over a period of time using real data, not guesses.

By surfacing patterns, forecasting outcomes, and revealing which actions move the needle, AI systems help the business owner see more clearly than ever …

2 days, 8 hours ago @ towardsdatascience.com
AI in Multiple GPUs: Understanding the Host and Device Paradigm

The Big Picture: The Host and The Device. The most important concept to grasp is the relationship between the Host and the Device.

It runs the operating system and executes your Python script line by line.

The Host is the commander; it’s in charge of the overall logic and tells the Device what to do.

Each rank is a CPU process which gets assigned a single device (GPU) and a unique ID.
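
A minimal sketch of that rank-to-device binding under torchrun; the environment variable names are the standard torch.distributed ones, and the rest is illustrative:

import os
import torch
import torch.distributed as dist

# torchrun starts one CPU process per GPU and sets RANK, WORLD_SIZE and LOCAL_RANK for each
dist.init_process_group(backend="nccl")
rank = dist.get_rank()                      # unique ID across all processes
local_rank = int(os.environ["LOCAL_RANK"])  # which GPU on this machine belongs to this process

# The Host process pins itself to exactly one Device and issues all work to it
torch.cuda.set_device(local_rank)
device = torch.device(f"cuda:{local_rank}")
print(f"rank {rank} of {dist.get_world_size()} drives {device}")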

2 days, 10 hours ago @ towardsdatascience.com
Building an AI Agent to Detect and Handle Anomalies in Time-Series Data

Anomaly Detection and Handling with an AI Agent: the AI Agent anomaly detection follows the pipeline approach drawn below (image by author). Why does this work better in practice?

# ---------------------------------------
# BUILDING AI AGENT
# ---------------------------------------
agent = Agent(
    name="CovidAnomalyAgent",
    model=Groq(id="openai/gpt-oss-120b"),
    instructions="""
    You are an AI agent monitoring live COVID-19 time-series data.

Complete Code: kindly go through the complete code for the statistical anomaly detection and the AI Agent implementation for anomaly handling.

Scenario 1: A Native Implementation. The first attempt is a native implementation where we detect minor anomalies and the AI…

3 days, 8 hours ago @ towardsdatascience.com
Not All RecSys Problems Are Created Equal

But candidate generation isn’t always the uphill battle it’s made out to be, nor does it necessarily require machine learning.

When you can directly observe users voting with their wallets, you have a strong baseline that’s hard to beat.

Or consider Staples: once a user needs printer paper or AA batteries, brand and price dominate, making user preferences remarkably consistent.

The industry’s “state-of-the-art” is largely defined by the outliers — the tech giants dealing with massive, subjective inventories and dense user data.

If, instead, you face high churn (like Vinted) or weak signals (like Yelp), machine learning becomes a necessity just to keep up.

3 days, 10 hours ago @ towardsdatascience.com
How to Model The Expected Value of Marketing Campaigns

Expected value is a key analytical framework that allows decision-makers to consider tradeoffs when there are unequal costs and benefits.

Expected Value Modeling expands this horizon to account for more possible outcomes, and allows us to measure the cost or benefit of each.

Start with a Purchase Likelihood Model: a Purchase Likelihood Model is a machine learning model that predicts the probability that a customer will purchase a product.

Implementing Expected Value Modeling: to move forward, it is important to understand the concept of a confusion matrix.

In those situations, we can embed expected value calculation into our purchase likelihood model training process.
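
A toy sketch of the expected value of contacting one customer, combining a predicted purchase probability with made-up confusion-matrix-style payoffs (all numbers here are illustrative assumptions):

# Hypothetical campaign economics
contact_cost = 2.00     # cost of sending the offer, paid regardless of outcome
margin_if_buys = 40.00  # profit captured when a contacted customer purchases

def expected_value(p_purchase: float) -> float:
    """Expected profit of contacting a customer with this purchase probability."""
    return p_purchase * margin_if_buys - contact_cost

# A purchase likelihood model would supply p_purchase for each customer
for p in (0.02, 0.10, 0.30):
    print(f"p={p:.2f} -> EV = {expected_value(p):+.2f}")

Under these assumed numbers, only customers with a predicted purchase probability above 0.05 are worth contacting.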

4 days, 8 hours ago @ towardsdatascience.com
Implementing the Snake Game in Python

Snake Game (image by author)

Understanding the Game: the classical Snake game involves a small snake with a plain background.

The snake colliding with itself or the boundary wall: if the snake's body collides with itself or the boundary of the game screen, the game ends.

Setting up the Game Screen: first things first, we will create a new project in our IDE; let's call it "Snake Game".

Running the code above would output the following: The Game Screen (image by author). Creating the Snake Body: once we have created the game screen, the next task is to create the snake.

Game Over (image by author). Conclusion: in this tutorial, we have successfully implemented the snake game in Python.
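
The game-screen step, sketched with pygame as an assumed toolkit (the article may use a different library, so treat the specifics as illustrative):

import pygame

pygame.init()

# Create the game window and give it a title
screen = pygame.display.set_mode((600, 600))
pygame.display.set_caption("Snake Game")

# Keep the window open until the player closes it
running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
    screen.fill((0, 0, 0))  # plain background
    pygame.display.flip()

pygame.quit()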

4 days, 9 hours ago @ towardsdatascience.com
How to Personalize Claude Code

However, there are other focus areas you can have to get even more out of Claude Code.

Today, I’ll be talking about one of them: Making all information accessible to Claude Code.

I’ll discuss how you can get more out of Claude Code by making all information accessible to the coding agent.

I’m not sponsored by Claude Code in the writing of this article; I’m simply an avid user of the product.

How to make all information accessible to Claude Code: making all information accessible to Claude Code essentially consists of two steps.

4 days, 11 hours ago @ towardsdatascience.com
The Machine Learning Lessons I’ve Learned Last Month

In work life, it’s project milestones, feature releases, and — for researchers — paper deadlines.

In daily ML work, things often feel more continuous: experiments run on the cluster, pipelines are adjusted, bugs are fixed.

I found that’s where good work happens.

And when you’re in the flow, things get done.

Closing thoughts: January reminded me that a good work rhythm is not about always pushing hard.

5 days, 2 hours ago @ towardsdatascience.com
The Death of the “Everything Prompt”: Google’s Move Toward Structured AI

The Deep Research Problem: Google's Deep Research capability (powered by Gemini) is agentic.

Setting Up a Development Environment: let's see the Interactions API up close by looking at a few coding examples of its use.

--- Deep Research Competitive Intelligence Engine ---
Enter the name of the competitor to analyze (e.g., Nvidia, Coca-Cola): Nvidia
Deploying Deep Research Agent for: Nvidia...
Research started.

But please be aware that, as of the time of writing, the Interactions API is still in Beta, and Google’s deep research agent is in preview.

For more information, see the link below for Google's official documentation page for the Interactions API.

5 days, 10 hours ago @ towardsdatascience.com
What I Am Doing to Stay Relevant as a Senior Analytics Consultant in 2026

In my opinion, the human aspect of it all is exactly how we, as data professionals, can continue to matter.

To stay intellectually sharp and updated, I need to go outside my work, do some learning, work on external projects and build an intuition for where the field is going.

The motivation in staying relevant should be to build depth where intuition alone isn’t enough, especially around AI systems, governance, and emerging best practices.

The future belongs to data professionals who can think hand-in-hand with AI systems, communicate findings with clarity, and anchor advanced analytics in real-world context.

That is the kind of data professional I intend to become in 2026.

1 week ago @ towardsdatascience.com
Pydantic Performance: 4 Tips on How to Validate Large Amounts of Data Efficiently

The same is true for Pydantic, a high-performance data validation library for Python.

In Pydantic v2, the core validation engine is implemented in Rust, making it one of the fastest data validation solutions in the Python ecosystem.

1) Prefer Annotated constraints over field validators: a core feature of Pydantic is that data validation is defined declaratively in a model class.

The naïve approach, field validators: we use a @field_validator to validate data, for example checking whether an id column is actually an integer and greater than zero.

This means that there are no user-defined Python validation calls required during validation.
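
A small sketch of the two styles being compared; the model and field names here are illustrative, not the article's:

from typing import Annotated
from pydantic import BaseModel, Field, field_validator

# Naive approach: a user-defined Python validator runs for every record
class RecordV1(BaseModel):
    id: int

    @field_validator("id")
    @classmethod
    def id_must_be_positive(cls, v: int) -> int:
        if v <= 0:
            raise ValueError("id must be greater than zero")
        return v

# Preferred: Annotated constraints are enforced inside the Rust core,
# so no Python-level validator call is needed
class RecordV2(BaseModel):
    id: Annotated[int, Field(gt=0)]

print(RecordV2(id=3))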

1 week, 1 day ago @ towardsdatascience.com
Prompt Fidelity: Measuring How Much of Your Intent an AI Agent Actually Executes

At scale and applied to more impactful problems, falsely reporting a high prompt fidelity becomes a big problem.

Prompt fidelity is the same idea: how much of your original intent (signal) was faithfully fulfilled by the agentic system.

The Case Study: Reverse-Engineering Spotify’s AI Playlist AgentSpotify’s Prompted Playlists feature is what started this exploration into prompt fidelity.

This list of fields would change from run to run, even when the prompt itself was identical between runs.

Applying the Math: Two Playlists, Two Fidelity ScoresLet’s apply this prompt fidelity concept to example playlists.

1 week, 1 day ago @ towardsdatascience.com
Distill.pub
latest post None
TheSequence
latest post 2 days, 11 hours ago
The Sequence Opinion #806: The Emergence of the Agent-as-a-Judge: Why Evals Need a Reasoning Engine

In the early days of the current AI boom—roughly 2023—we had a very “vibe-based” relationship with our models.

The idea was elegant in its simplicity—if these models are so good at generating text, surely they’re good at grading it, too.

We started using a “stronger” model (like GPT-4) to grade the outputs of “weaker” models.

We are moving from the era of the Judge as a Critic to the era of the Judge as an Agent.

Recently, I've been doing a lot of work in the agent-as-a-judge space as part of the LayerLens platform, so I wanted to share some ideas about the space.

2 days, 11 hours ago @ thesequence.substack.com
The Sequence AI of the Week #805: Goodfire and the Era of AI Interpretability

If you look at how we build software today—what I’ve previously called “Software 2.0”—it’s effectively an optimization process.

We collect a massive dataset, we define a loss function, and then we surrender control to gradient descent.

We let the optimization algorithm smash the weights of a neural network until the loss goes down.

We are doing behavioral psychology on an alien intelligence because we lack the tools to do neuroscience.

The work Goodfire is doing on interpretability—specifically around feature steering and agents—is the first glimpse of what I’d call the “Software 3.0” IDE.

3 days, 11 hours ago @ thesequence.substack.com
The Sequence Knowledge #804: The Dreamer Trilogy: Inside Some of the Most Influential Papers in AI World Models

Today we will discuss: an overview of model-free RL inspired world models and the Dreamer efforts.

💡 AI Concept of the Day: Model-Based RL as the Inspiration for World Models. The transition from 2D arcade games to 3D environments was the crucible that forged the modern World Model.

In the mid-2010s, "Model-Free" Reinforcement Learning (RL) was enjoying a golden age.

Agents like DQN were crushing human scores in Atari games. These agents were reactive; they mapped pixels directly to actions.

However, when researchers tried to apply these same methods to 3D environments—like DeepMind Lab or Minecraft—progress stalled.

4 days, 11 hours ago @ thesequence.substack.com
The Sequence Radar #803: Last Week in AI: Anthropic and OpenAI’s Battle for the Long Horizon, Goodfire and LayerLens Push AI Accountability

Goodfire, an AI research lab dedicated to model interpretability, secured a $150 million Series B funding round this week at a $1.25 billion valuation.

AI Lab: Google Cloud AI ResearchSummary: MARS is a framework designed for autonomous AI research that uses budget-aware Monte Carlo Tree Search (MCTS) to balance model performance with the high computational costs of training.

AI Lab: Meta FAIRSummary: This study investigates using the log-probability of reference answers as a universal reward signal for fine-tuning LLMs on both verifiable math tasks and non-verifiable long-form answers.

AI Lab: ERNIE Team, BaiduSummary: ERNIE 5.0 is a trillion-parameter unified autoregressive model trained …

6 days, 11 hours ago @ thesequence.substack.com
The Sequence 802: The Thinking Machine: A Deep Dive into Test-Time Compute and the New Scaling Paradigm

For the better part of a decade, the dominant strategy for advancing machine intelligence has been a relentless pursuit of scale during the pre-training phase.

This approach, governed by scaling laws, operated on a fundamental assumption: that intelligence is primarily a function of pattern recognition capability acquired before a user ever types a prompt.

However, a new frontier has emerged, one that challenges the supremacy of pre-training as the sole driver of capability.

It is the domain of Test-Time Compute—often colloquially referred to as “system 2” thinking, inference-time scaling, or simply “letting the model think”.

By shifting compute from the training cluster to the inference se…

1 week, 2 days ago @ thesequence.substack.com
The Sequence AI of the Week #801: Deconstructing Kimi 2.5

The history of Large Language Models (LLMs) will likely be viewed in two distinct epochs: Pre-Kimi 2.5 and Post-Kimi 2.5.

If the original GPT-3 era was about the miracle of “emergence” through scale, Kimi 2.5 (K2.5) is about the miracle of orchestration.

Kimi 2.5 takes us into a world where the model doesn’t just write the code; it manages the entire execution environment, spawns its own sub-processes, and debugs its own visual artifacts in a closed-loop system.

The Infrastructure of Sparsity: 1T Parameters at the Speed of 32B

1 week, 3 days ago @ thesequence.substack.com
The Sequence Knowledge #800: Not All World Models are Created Equal

Today we will discuss: an overview of the different types of world models.

A discussion about the first major paper about world models.

💡 AI Concept of the Day: Not All World Models are Created Equal. For decades, the dominant paradigm in Reinforcement Learning (RL) was "Model-Free." In this approach, an AI agent is essentially a reactive machine.

It observes a state, takes an action, and receives a reward. If the action was good, it statistically reinforces that behavior.

It does not understand why the action was good, nor does it possess an internal map of the environment.

1 week, 4 days ago @ thesequence.substack.com
The Sequence Radar #799: The Week AI Leveled Up: From Chatbots to World Builders

Next Week in The Sequence:Our series about world models dives into the different types of world models.

Our opinion section dives into the test-time compute technique and its impact on the next generation of frontier AI models.

The release of Project Genie (previewed earlier as Genie 3) to US subscribers marks a pivotal moment for generative media.

Following the path blazed by OpenAI’s o1, this trillion-parameter model introduces a dedicated “thinking” mode that exposes its step-by-step reasoning process.

🤖 AI Tech ReleasesKimi 2.5Moonshot AI released Kimi 2.5, a new model for agentic visual intelligence.

1 week, 6 days ago @ thesequence.substack.com
The Sequence Opinion #798: Inside the Most Important Paper in Agentic Reasoning: Why the Loop is the Logic

TLDR: Researchers from DeepMind, Amazon, Meta, Yale and other AI labs collaborated on one of the most important publications about agentic reasoning, which demystifies some of the core assumptions about the current methods.

It’s fast, it’s intuitive, but it’s fundamentally “shallow.” It’s what Daniel Kahneman would call fast thinking—pattern matching at scale.

The recent, exhaustive research into agentic reasoning for LLMs provides a rigorous roadmap of where we are.

But reading between the lines of the technical frameworks, a more provocative thesis emerges: Reasoning is not a property of the model; it is a property of the loop.

We are currently transitioning from the “Model Age,” where we o…

2 weeks, 2 days ago @ thesequence.substack.com
The Sequence AI of the Week #797: The New Companies that can Change the Inference Landscape

The last week of January 2026 has arguably been the most consequential week for AI infrastructure since the release of the first H100.

Simultaneously, the team behind SGLang officially spun out as RadixArk, securing a $400M valuation in a round led by Accel.

In “LLM time,” we are moving from the “show-and-tell” phase of R&D into the “unit economics” phase of production.

The problem is no longer just training the largest model, but serving it efficiently.

Inference is an unpredictable, asynchronous sprint where the GPU’s VRAM is the most contested real estate on Earth.

2 weeks, 3 days ago @ thesequence.substack.com
The Sequence Knowledge #796: Welcome to the World of World Models

Today we will discuss: an introduction to a new series about World Models.

A review of the groundbreaking DayDreamer paper that set up the foundation for modern world models.

💡 AI Concept of the Day: Welcome to the World of World Models. Today, we are starting a new series about one of the hottest topics in AI.

World models are having a moment because they change the unit of progress in AI.

If language models are a universal interface, world models are shaping up to be a universal sandbox.

2 weeks, 4 days ago @ thesequence.substack.com
The Sequence Radar #795: The New Inference Kids

The opinion section dives into agentic reasoning in LLMs based on a groundbreaking paper that came out this week.

Subscribe and don't miss out. 📝 Editorial: The New Inference Kids. Last week was all about inference in AI and new players emerging as forces to be reckoned with in the space.

Their bet is that inference is the new “cloud computing”—a utility that needs to be boring, reliable, and infinitely scalable.

In a standard inference engine, when a user sends a prompt, you compute the Key-Value (KV) cache from scratch.

AI Lab: Google DeepMindSummary: This paper introduces the MultiMax probe architecture and utilizes automated architecture search to address the failure of existing activation…

2 weeks, 6 days ago @ thesequence.substack.com
The Sequence Opinion #794: The Uncanny Valley of Intent: Why We Need Systems, Not Just Models

It’s “vibe-based.” On the other side, we have Programming Languages (Python, C++)—the interface of classical silicon.

They are "logic-based." The thesis I want to explore today is that neither of these is the correct abstraction for the future of AI application development.

We have excellent Models (the raw weights), but we are terrible at building Systems (the cognitive architectures) that effectively direct those models.

We need a new kind of substrate.

We need to move away from the binary of “talking to the bot” vs “coding the bot” and toward a paradigm of Artificial Programmable Intelligence (API).

3 weeks, 2 days ago @ thesequence.substack.com
The Sequence AI of the Week #793: DeepSeek's New Paper: Storing 100B Parameters on CPU RAM

If you’ve been following the LLM space, you know the current meta is dominated by two things: scaling up and sparsifying everything.

We are obsessed with Mixture-of-Experts (MoE), tossing trillions of parameters at a problem but only activating a tiny fraction per token to keep the FLOPs manageable.

It’s a beautiful hack that has allowed us to break past dense model limitations.

But sometimes, in our rush to build ever-larger differentiable computers, we forget our roots.

We forget that not every problem requires a massive matrix multiplication.

3 weeks, 3 days ago @ thesequence.substack.com
The Sequence Knowledge #792: EVERYTHING you Need to Know About Synthetic Data Generation

💡 AI Concept of the Day: A Summary of our Series About Synthetic Data Generation. Today, we are finalizing our series about synthetic data generation.

The Sequence Knowledge #748: We introduced our series about synthetic data generation and reviewed Microsoft's famous paper "Textbooks Are All You Need."

The Sequence Knowledge #752: Analyzes the different types of synthetic data generation methods.

The installment also includes a review of Stanford University's research on the STaR method for synthetic data generation for reasoning.

The Sequence Knowledge #788: Reviews the top synthetic data generation frameworks.

3 weeks, 4 days ago @ thesequence.substack.com
Synced Review
latest post None
📓 Cool Blogs
ODS.ai Habr
latest post 2 weeks, 1 day ago
The Natural Language Processing & LLMs course: a new season

On February 10 we are once again launching our free online course on natural language processing (NLP).

What we will cover: the classical basics (Zipf's law, TF-IDF, RNNs, CNNs, the Transformer); the core NLP tasks (text classification, tagging, and generation); more specialized areas (agents and vibe coding); and LLMs and their applications.

If you are a student at ITMO, MIPT, or HSE, the course can be counted for academic credit.

I have been working in NLP for more than 12 years, including time at Yandex and VK, and I have defended a PhD thesis.

If you have questions, bring them to the ODS Mattermost: that is where you will find all the answers, the seminar schedule, and the links.

2 weeks, 1 day ago @ habr.com
SWE-MERA: a new dynamic benchmark for agentic code generation models

However, all of the tasks in MERA CODE, just like those in SWE-bench and other benchmarks of this kind, follow the classical paradigm: a fixed training set and, more importantly, a fixed evaluation set.

But the large coding language models that we are trying to evaluate with this benchmark are themselves trained on GitHub, going back to the very first LLaMA model.

700 tasks may not sound like much, but it is already a very respectable number, and most importantly, these are new tasks.

Current behavior:
from sympy import ask, Q, Symbol
x = Symbol('x')
print(ask(Q.finite(x**-1), Q.real(x)))  # Output: True
Expected behavior: The function should return None to indicate uncertainty, as x**-…

4 months, 4 weeks ago @ habr.com
DRAGON: a dynamic benchmark for evaluating RAG systems in Russian

Answer: Keisuke Chiba
Simple SPARQL query:
SELECT DISTINCT ?s ?r ?o WHERE { { SELECT ?s ?r ?o WHERE { ?s ?r ?o . }
GROUP BY ?s ?r HAVING(count(?o) = 1) } { SELECT ?s ?r ?o WHERE { ?s ?r ?o . }

Answer: National Payment Card System (NSPK), Center for Biometric Technologies (CBT), EBS
SELECT ?s ?r ?o ?len WHERE { { SELECT ?s ?r (COUNT(?o1) as ?len) (GROUP_CONCAT(DISTINCT(STR(?o1));separator="|") AS ?o) WHERE { ?s ?r ?o1 . }
FILTER(?o != ?o1) } GROUP BY ?o ?o1 ?r ?r1 HAVING(COUNT(?s) = 1) } UNION { SELECT ?s ?r ?o ?r1 ?s1 WHERE { ?s ?r ?o .

6 months, 3 weeks ago @ habr.com
RKNN Toolkit2: model conversion and Rockchip NPU simulation

In this article I want to share my experience converting a neural network to the rknn format using the rknn-toolkit2 library.

Here is what the PyTorch model's weights look like in Netron (image: PyTorch model weights in Netron). Important!

Converting the ONNX model to rknn: next, an RKNN object is created, which manages the model conversion and inference process on the Rockchip platform.

At this stage the model is prepared for conversion to the RKNN format and for subsequent execution on the Rockchip NPU.

Creating and exporting the rknn model: at this stage the ONNX model is converted to the internal RKNN format, the graph is optimized, and the model is prepared to run on the Rockchip NPU.

7 months ago @ habr.com
MERA Code: comprehensive evaluation of code generation in applied scenarios

🔗 MERA Code 🔗 GitHub with the code and data 🔗 Collection on Hugging Face 🔗 Paper on arXiv 🔗 Project repository on GitVerse. What is MERA Code?

Modern code language models and general-purpose models (ChatGPT, Claude, Qwen, YandexGPT, GigaChat, and others)

The list of current MERA Code tasks and their characteristics: the MERA Code task catalog and detailed task descriptions are available on the website.

In MERA Code, the prompts are strictly tailored to each task and to correct answer selection.

In conclusion: MERA Code is an attempt to close an important gap in LLM testing: how useful these models actually are in real, localized development.

7 months ago @ habr.com
Machine Learning Mastery
latest post 4 days, 12 hours ago
Document Clustering with LLM Embeddings in Scikit-learn

Imagine that you suddenly obtain a large collection of unclassified documents and are tasked with grouping them by topic.
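
One common recipe for exactly that task, sketched with an assumed embedding model (sentence-transformers) feeding scikit-learn's KMeans; the article's actual choices may differ:

from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

docs = [
    "The central bank raised interest rates again.",
    "A new transformer model tops the leaderboard.",
    "Inflation data surprised economists this week.",
    "Researchers release an open-source LLM checkpoint.",
]

# Embed each document into a dense vector
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = encoder.encode(docs)

# Group the vectors by topic
kmeans = KMeans(n_clusters=2, random_state=0, n_init="auto")
labels = kmeans.fit_predict(embeddings)
print(list(zip(labels, docs)))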

4 days, 12 hours ago @ machinelearningmastery.com
The 7 Biggest Misconceptions About AI Agents (and Why They Matter)

AI agents are everywhere.

5 days, 12 hours ago @ machinelearningmastery.com
Agent Evaluation: How to Test and Measure Agentic AI Performance

AI agents that use tools, make decisions, and complete multi-step tasks aren't prototypes anymore.

1 week, 2 days ago @ machinelearningmastery.com
Export Your ML Model in ONNX Format

When building machine learning models, training is only half the journey.

1 week, 3 days ago @ machinelearningmastery.com
7 Advanced Feature Engineering Tricks Using LLM Embeddings

You have mastered model.

1 week, 4 days ago @ machinelearningmastery.com
A Beginner’s Reading List for Large Language Models for 2026

The large language models (LLMs) hype wave shows no sign of fading anytime soon: after all, LLMs keep reinventing themselves at a rapid pace and transforming the industry as a whole.

1 week, 5 days ago @ machinelearningmastery.com
7 Important Considerations Before Deploying Agentic AI in Production

The promise of agentic AI is compelling: autonomous systems that reason, plan, and execute complex tasks with minimal human intervention.

2 weeks, 2 days ago @ machinelearningmastery.com
5 Ways to Use Cross-Validation to Improve Time Series Models

Time series modeling

2 weeks, 3 days ago @ machinelearningmastery.com
The 3 Invisible Risks Every LLM App Faces (And How to Guard Against Them)

Building a chatbot prototype takes hours.

2 weeks, 4 days ago @ machinelearningmastery.com
Leveling Up Your Machine Learning: What To Do After Andrew Ng’s Course

Finishing Andrew Ng's machine learning course

2 weeks, 5 days ago @ machinelearningmastery.com
The 2026 Time Series Toolkit: 5 Foundation Models for Autonomous Forecasting

This list covers the five essential foundation models you need to know for building production forecasting systems in 2026.

The shift from task-specific models to foundation model orchestration changes how teams approach forecasting.

Instead of spending weeks tuning parameters and wrangling domain expertise for each new dataset, pretrained models already understand universal temporal patterns.

Amazon Chronos-2 (The Production-Ready Foundation)Amazon Chronos-2 is the most mature option for teams moving to foundation model forecasting.

Lag-Llama (The Open-Source Backbone)Lag-Llama brings probabilistic forecasting capabilities to foundation models through a decoder-only transformer inspired by…

3 weeks, 2 days ago @ machinelearningmastery.com
Everything You Need to Know About How Python Manages Memory

If you follow the references, you go in a circle: Alice’s object references Bob’s object, which references Alice’s object, which references Bob’s object… forever.

How Python Manages Memory Using Reference Counting & Generational Garbage Collection: Python uses two main mechanisms for garbage collection. Reference counting: this is the primary method.

import sys

# Create an object - reference count is 1
my_list = [1, 2, 3]
print(f"Reference count: {sys.getrefcount(my_list)}")

# Create another reference - count increases
another_ref = my_list
print(f"Reference count: {sys.getrefcount(my_list)}")

# Delete one reference - count decreases
del another_ref
print(f"Reference count: {sys.getrefcount(my…
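
A small sketch of the Alice/Bob reference cycle described above and how the generational collector reclaims it (illustrative, not the article's listing):

import gc

class Person:
    def __init__(self, name):
        self.name = name
        self.friend = None

# Build a reference cycle: Alice -> Bob -> Alice
alice = Person("Alice")
bob = Person("Bob")
alice.friend = bob
bob.friend = alice

# Dropping our names leaves the cycle alive; reference counts never reach zero
del alice, bob

# The generational garbage collector detects the unreachable cycle and frees it
print(f"objects collected: {gc.collect()}")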

3 weeks, 3 days ago @ machinelearningmastery.com
The Machine Learning Practitioner’s Guide to Model Deployment with FastAPI

If you’ve trained a machine learning model, a common question comes up: “How do we actually use it?” This is where many machine learning practitioners get stuck.

In this article, you will learn how to deploy a machine learning model using FastAPI step by step.

from sklearn.pipeline import Pipeline
from sklearn. …

Create a file called main.py:

from fastapi import FastAPI
from pydantic import BaseModel
import joblib

app = FastAPI(title="House Price Prediction API")

# Load model once at startup
model = joblib.load("house_price_model.joblib")
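
A sketch of how the request schema and prediction route might continue from the main.py snippet above; the feature names and shapes are assumptions, not the article's actual model:

class HouseFeatures(BaseModel):
    # Hypothetical inputs; match these to whatever the model was trained on
    area_sqft: float
    bedrooms: int
    bathrooms: int

@app.post("/predict")
def predict(features: HouseFeatures):
    row = [[features.area_sqft, features.bedrooms, features.bathrooms]]
    price = model.predict(row)[0]
    return {"predicted_price": float(price)}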

3 weeks, 4 days ago @ machinelearningmastery.com
Top 5 Agentic AI Website Builders (That Actually Ship)

Instead of spending weeks learning UI frameworks, I started using tools like v0 and other agentic website builders to create professional dashboards, landing pages, and wallet interfaces.

I wanted to understand which AI website builders actually work end to end—not just tools that generate a pretty frontend, but ones that can also handle backend logic, connect data, and deploy a working product in minutes.

After testing and researching what is available, I put together this list of the top agentic AI website builders that go beyond design.

These tools help you build real applications, manage the backend, and ship fast without getting stuck in setup or configuration.

Hostinger Horizons. Hostin…

3 weeks, 5 days ago @ machinelearningmastery.com
The Complete Guide to Data Augmentation for Machine Learning

In this article, you’ll learn how data augmentation works in practice and when to use it.

Data Augmentation for Image Data: image data augmentation is the most intuitive place to start.

load_data()
# Normalize pixel values
X_train = X_train / 255.0
X_test = X_test / 255.0
# Reshape to (samples, height, width, channels)
X_train = X_train.

Data Augmentation for Audio Data: audio data also benefits heavily from augmentation.

Data Augmentation for Tabular Data: tabular data is the most sensitive data type to augment.
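
A tiny sketch of the image case using plain NumPy, just to show the idea (the article builds on a deep learning framework's own augmentation utilities):

import numpy as np

rng = np.random.default_rng(0)

def augment_image(img: np.ndarray) -> np.ndarray:
    """Return a randomly flipped, slightly noisy copy of one normalized image."""
    out = img.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1, :]                           # horizontal flip
    out = out + rng.normal(0.0, 0.02, size=out.shape)   # mild Gaussian noise
    return np.clip(out, 0.0, 1.0)

batch = rng.random((4, 32, 32, 3))  # fake batch of normalized images
augmented = np.stack([augment_image(im) for im in batch])
print(augmented.shape)  # (4, 32, 32, 3)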

4 weeks, 1 day ago @ machinelearningmastery.com
ML in Production
latest post None
Sorta Insightful
latest post 2 weeks, 2 days ago
MIT Mystery Hunt 2026

This has spoilers for MIT Mystery Hunt 2026.

Pre-Hunt: The time running up to Hunt was more stressful than usual… very briefly, I typically hunt with teammate.

Just last year, I did GPH 2025, LN Hunt, Teammate Hunt 2025, Microsoft Hunt 2025, and Silph Puzzle Hunt 2025, all of which had significant 3+ hour solve puzzles that would not be out of place in Mystery Hunt.

Not to mention smaller hunts like Advent Hunt, and then I didn’t even do Brown Puzzlehunt or Vertex Hunt or the fall CMU Hunt.

To me, the crux is whether Mystery Hunt is broken, or Mystery Hunt is fine.

2 weeks, 2 days ago @ alexirpan.com
Authentic Imperfection

* * *

I've been thinking about the anger surrounding generative AI.

To keep things fair, he took the best human images and best AI images, meaning human art from famous artists, and AI art from prompters skilled at removing obvious tells of image generation.

When people complain about AI slop, I see it as a complaint against the deluge of default style AI images.

We’ve seen this happen in all forms: AI text, AI music, older forms of computer generated content like CGI.

As much as we celebrate imperfection, digital imperfection is a step too far.

3 months ago @ alexirpan.com
Ten Years Later

Every now and then, someone asks me why I blog, and I don't really know what to tell them.

That's another reason I'm not celebrating 10 years with more gusto: I know I've been writing less.

Indiana Jones and the Great Circle: I don’t know how they did it, but Indiana Jones and the Great Circle was just fun all the way through.

My one complaint is that the hand-to-hand combat feels like the worst part of the game, so of course they put a bunch of upgrades behind learning parry timings you’ll never use later.

I have not tried Peak, but Another Crab’s Treasure was really good and is worth playing if you’re interested in a Souls-like.

6 months ago @ alexirpan.com
Brony Musicians Seize The Means of Production: My Eyewitness Account to BABSCon 2025

A music concert in the evenings, typically set up as a rave with EDM or rock music made by brony musicians.

She has been involved in organizing pony music concerts for over a decade, for both BABSCon and other pony conventions.

Thank you, BABSCon Chairs. The brony musicians immediately jump into an emergency Discord call with Pinkaboo, to get her side of the story.

Other conventions start tweeting in support of the brony musicians, with no one taking BABSCon’s side.

It’s hard for me to explain why I like MLP fan music, because brony music really isn’t accessible.

6 months, 4 weeks ago @ alexirpan.com
Lil'Log
latest post None
inFERENCe
latest post 2 weeks ago
Deep Learning is Powerful Because It Makes Hard Things Easy - Reflections 10 Years On

Ten years ago this week, I wrote a provocative and bold post that blew up and made it to the top spot on Hacker News.

In hindsight: there is a lot of stuff in deep learning that we don't understand nearly enough.

Sometimes things work for reasons completely unrelated to why we thought they would work.

(Pop some 🍿 in the microwave and read till the end for more.)

🎯 "Deep learning is powerful exactly because it makes hard things easy." Okay, this was a great insight.

🎯 Generative Modeling: In the post I suggested people learn "something harder" instead of, or in addition to, deep learning.

2 weeks ago @ inference.vc
Discrete Diffusion: Continuous-Time Markov Chains

We differentiate Markov chains on values the index $t$ can take: if $t$ is an integer, we call it a discrete-time Markov chain, and when $t$ is real, we call the process a continuous-time Markov chain.

Discrete Markov chains are fully described by a collection of state transition matrices $P_t = [P(X_{t+1} = i \vert X_t = j)]_{i,j}$.

The geometric distribution is the only discrete distribution with the memory-less property, stated as $\mathbb{P}[T=s+t\vert T>s] = \mathbb{P}[T = t]$.
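
For the continuous-time case, the analogous statement, satisfied by the exponential holding times of a continuous-time Markov chain, is the memoryless property of the exponential distribution:

$$\mathbb{P}[T > s + t \mid T > s] = \mathbb{P}[T > t], \qquad T \sim \mathrm{Exp}(\lambda), \quad \mathbb{P}[T > t] = e^{-\lambda t}.$$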

Point processes and Markov Chains: back to Markov chains.

We then discussed different representations of discrete point processes, and noted (although did not prove) their equivalence.

8 months, 4 weeks ago @ inference.vc
The Spectator
latest post None
The Unofficial Google Data Science Blog
latest post None
Off the Convex Path
latest post None
Jay Alammar
latest post None
Piekniewski's blog
latest post None
fast.ai NLP
latest post None
Sebastian Ruder
latest post None
大トロ
latest post None
🔬 Science
Papers With Code
latest post 6 months, 3 weeks ago
/henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy

We present SpatialTrackerV2, a feed-forward 3D point tracking method for monocular videos.

Going beyond modular pipelines built on off-the-shelf components for 3D tracking, our approach unifies the intrinsic connections between point tracking, monocular depth, and camera pose estimation into a high-performing and feedforward 3D point tracker.

It decomposes world-space 3D motion into scene geometry, camera ego-motion, and pixel-wise object motion, with a fully differentiable and end-to-end architecture, allowing scalable training across a wide range of datasets, including synthetic sequences, posed RGB-D videos, and unlabeled in-the-wild footage.

By learning geometry and motion jointly from …

6 months, 3 weeks ago @ paperswithcode.com
/antof27/ Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation
/antof27/ Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation /antof27/ Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation

Calisthenics skill classification is the computer vision task of inferring the skill performed by an athlete from images, enabling automatic performance assessment and personalized analytics.

Traditional methods for calisthenics skill recognition rely on pose estimation to extract skeletal keypoints from images, which are later fed to a classification algorithm to infer the performed skill.

This work proposes a direct approach to calisthenics skill recognition, which leverages depth estimation and athlete patch retrieval to avoid the computationally expensive human pose estimation module.

Using Depth Anything V2 for depth estimation and YOLOv10 for athlete localizat…

6 months, 3 weeks назад @ paperswithcode.com
/snowflakedb/ Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI
/snowflakedb/ Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI /snowflakedb/ Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI

Inference is now the dominant AI workload, yet existing systems force trade-offs between latency, throughput, and cost.

Arctic Inference, an open-source vLLM plugin from Snowflake AI Research, introduces Shift Parallelism, a dynamic parallelism strategy that adapts to real-world traffic while integrating speculative decoding, SwiftKV compute reduction, and optimized embedding inference.

It achieves up to 3.4 times faster request completion, 1.75 times faster generation, and 1.6M tokens/sec per GPU for embeddings, outperforming both latency- and throughput-optimized deployments.

Already powering Snowflake Cortex AI, Arctic Inference delivers state-of-the-art, cost-effective inference for ent…

6 months, 3 weeks назад @ paperswithcode.com
/NVIDIA/ FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale
/NVIDIA/ FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale /NVIDIA/ FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale

FourCastNet 3 advances global weather modeling by implementing a scalable, geometric machine learning (ML) approach to probabilistic ensemble forecasting.

The approach is designed to respect spherical geometry and to accurately model the spatially correlated probabilistic nature of the problem, resulting in stable spectra and realistic dynamics across multiple scales.

FourCastNet 3 delivers forecasting accuracy that surpasses leading conventional ensemble models and rivals the best diffusion-based methods, while producing forecasts 8 to 60 times faster than these approaches.

In contrast to other ML approaches, FourCastNet 3 demonstrates excellent probabilistic calibration and retains realis…

6 months, 3 weeks назад @ paperswithcode.com
/jingyanw/ Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work?
/jingyanw/ Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work? /jingyanw/ Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work?

We study A/B experiments that are designed to compare the performance of two recommendation algorithms.

The bias arising from this type of data sharing is known as "symbiosis bias".

In this paper, we highlight that, for decision-making purposes, the sign of the GTE often matters more than its precise magnitude when selecting the better algorithm.

We formalize this insight under a multi-armed bandit framework and theoretically characterize when the sign of the expected GTE estimate under data sharing aligns with or contradicts the sign of the true GTE.

Our analysis identifies the level of exploration versus exploitation as a key determinant of how symbiosis bias impacts algorithm selection.

6 months, 3 weeks назад @ paperswithcode.com
/qqq-yi/ DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression
/qqq-yi/ DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression /qqq-yi/ DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression

Task-agnostic prompt compression leverages the redundancy in natural language to reduce computational overhead and enhance information density within prompts, especially in long-context scenarios.

Existing methods predominantly rely on information entropy as the metric to compress lexical units, aiming to achieve minimal information loss.
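
For intuition about that entropy-style baseline (a toy sketch of the approach DAC builds on, not the DAC method itself; the function name and keep ratio are illustrative):

```python
import math
from collections import Counter

def compress_by_self_information(tokens, keep_ratio=0.6):
    # Score each token by -log p(token) under a unigram model fit on the prompt
    # itself; frequent (low-information) tokens are the first to be dropped.
    counts = Counter(tokens)
    total = len(tokens)
    info = {tok: -math.log(c / total) for tok, c in counts.items()}
    k = max(1, int(len(tokens) * keep_ratio))
    keep = set(sorted(range(len(tokens)),
                      key=lambda i: info[tokens[i]], reverse=True)[:k])
    return [tok for i, tok in enumerate(tokens) if i in keep]

print(compress_by_self_information(
    "the cat sat on the mat and the dog sat by the door".split()))
```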

However, these approaches overlook two critical aspects: (i) the importance of attention-critical tokens at the algorithmic level, and (ii) shifts in information entropy during the compression process.

Motivated by these challenges, we propose a dynamic attention-aware approach for task-agnostic prompt compression (DAC).

This approach effectively integrate…

6 months, 3 weeks назад @ paperswithcode.com
/lukasellinger/ Simplifications are Absolutists: How Simplified Language Reduces Word Sense Awareness in LLM-Generated Definitions
/lukasellinger/ Simplifications are Absolutists: How Simplified Language Reduces Word Sense Awareness in LLM-Generated Definitions /lukasellinger/ Simplifications are Absolutists: How Simplified Language Reduces Word Sense Awareness in LLM-Generated Definitions

Large Language Models (LLMs) can provide accurate word definitions and explanations for any context.

However, the scope of the definition changes for different target groups, like children or language learners.

We investigate how simplification impacts homonym definition quality across three target groups: Normal, Simple, and ELI5.

Our results show that simplification drastically degrades definition completeness by neglecting polysemy, increasing the risk of misunderstanding.

Fine-tuning Llama 3.1 8B with Direct Preference Optimization substantially improves homonym response quality across all prompt types.

6 months, 3 weeks назад @ paperswithcode.com
/pspdada/ Mitigating Object Hallucinations via Sentence-Level Early Intervention
/pspdada/ Mitigating Object Hallucinations via Sentence-Level Early Intervention /pspdada/ Mitigating Object Hallucinations via Sentence-Level Early Intervention

Multimodal large language models (MLLMs) have revolutionized cross-modal understanding but continue to struggle with hallucinations - fabricated content contradicting visual inputs.

Existing hallucination mitigation methods either incur prohibitive computational costs or introduce distribution mismatches between training data and model outputs.

We identify a critical insight: hallucinations predominantly emerge at the early stages of text generation and propagate through subsequent outputs.

To address this, we propose **SENTINEL** (**S**entence-level **E**arly i**N**tervention **T**hrough **IN**-domain pr**E**ference **L**earning), a framework that eliminates dependency on human annotations…

6 months, 3 weeks назад @ paperswithcode.com
/owos/ FLEXITOKENS: Flexible Tokenization for Evolving Language Models
/owos/ FLEXITOKENS: Flexible Tokenization for Evolving Language Models /owos/ FLEXITOKENS: Flexible Tokenization for Evolving Language Models

Language models (LMs) are difficult to adapt to new data distributions through simple finetuning.

This is due to the rigidity of their subword tokenizers, which typically remain unchanged during adaptation.

This inflexibility often leads to inefficient tokenization, causing overfragmentation of out-of-distribution domains, unseen languages, or scripts.

In this work, we develop byte-level LMs with learnable tokenizers to make tokenization adaptive.

Our models include a submodule that learns to predict segment boundaries within the input byte sequence, encoding it into variable-length segments.
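
As a rough illustration of that boundary-prediction idea (a generic sketch, not the FLEXITOKENS model; the predictor here is a hand-written stand-in for a learned one):

```python
import numpy as np

def segment_bytes(data: bytes, boundary_prob, threshold=0.5):
    # Cut the byte stream into variable-length segments wherever the (stand-in)
    # boundary predictor fires; a learned predictor would replace `boundary_prob`.
    probs = boundary_prob(data)
    segments, start = [], 0
    for i, p in enumerate(probs):
        if p >= threshold:
            segments.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        segments.append(data[start:])
    return segments

# Toy "predictor": place a boundary after every whitespace byte.
toy_predictor = lambda b: np.array([1.0 if byte in b" \n" else 0.0 for byte in b])
print(segment_bytes(b"adaptive tokenization for new domains", toy_predictor))
```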

6 months, 3 weeks назад @ paperswithcode.com
/wojiufukele/ Graph-Structured Data Analysis of Component Failure in Autonomous Cargo Ships Based on Feature Fusion
/wojiufukele/ Graph-Structured Data Analysis of Component Failure in Autonomous Cargo Ships Based on Feature Fusion /wojiufukele/ Graph-Structured Data Analysis of Component Failure in Autonomous Cargo Ships Based on Feature Fusion

To address the challenges posed by cascading reactions caused by component failures in autonomous cargo ships (ACS) and the uncertainties in emergency decision-making, this paper proposes a novel hybrid feature fusion framework for constructing a graph-structured dataset of failure modes.

A hierarchical feature fusion framework is constructed, using Word2Vec to encode subsystem/component features, BERT-KPCA to process failure modes/reasons, and Sentence-BERT to quantify the semantic association between failure impact and emergency decision-making.

The dataset covers 12 systems, 1,262 failure modes, and 6,150 propagation paths.

In the label prediction results, the Shore-based Meteor…

6 months, 3 weeks назад @ paperswithcode.com
/YF-W/ Tri-Learn Graph Fusion Network for Attributed Graph Clustering
/YF-W/ Tri-Learn Graph Fusion Network for Attributed Graph Clustering /YF-W/ Tri-Learn Graph Fusion Network for Attributed Graph Clustering

In recent years, models based on Graph Convolutional Networks (GCN) have made significant strides in the field of graph data analysis.

Although the Graph Transformer architecture has mitigated some of these issues, its performance is still limited when processing heterogeneous graph data.

To address these challenges, this study proposes a novel deep clustering framework comprising GCN, Autoencoder (AE), and Graph Transformer modules, termed the Tri-Learn Graph Fusion Network (Tri-GFN).

The tri-learning mechanism allows mutual learning among these modules, while the feature fusion strategy enables the model to capture complex relationships, yielding highly discriminative representations for gra…

6 months, 3 weeks назад @ paperswithcode.com
/mr-ravin/ APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation
/mr-ravin/ APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation /mr-ravin/ APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation

We propose the APTx Neuron, a novel, unified neural computation unit that integrates non-linear activation and linear transformation into a single trainable expression.

The APTx Neuron is derived from the APTx activation function, thereby eliminating the need for separate activation layers and making the architecture both computationally efficient and elegant.

The proposed neuron follows the functional form $y = \sum_{i=1}^{n} ((\alpha_i + \tanh(\beta_i x_i)) \cdot \gamma_i x_i) + \delta$, where all parameters $\alpha_i$, $\beta_i$, $\gamma_i$, and $\delta$ are trainable.
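
A minimal NumPy sketch of that functional form (parameter values here are illustrative, not the trained ones from the paper):

```python
import numpy as np

def aptx_neuron(x, alpha, beta, gamma, delta):
    # y = sum_i ((alpha_i + tanh(beta_i * x_i)) * gamma_i * x_i) + delta
    return np.sum((alpha + np.tanh(beta * x)) * gamma * x) + delta

rng = np.random.default_rng(0)
n = 8
x = rng.normal(size=n)
alpha, beta, gamma = (rng.normal(size=n) for _ in range(3))
delta = 0.1
print(aptx_neuron(x, alpha, beta, gamma, delta))
```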

We validate our APTx Neuron-based architecture on the MNIST dataset, achieving up to 96.69\% test accuracy in just 20 ep…

6 months, 3 weeks назад @ paperswithcode.com
/Rec4Fun/ A Reproducibility Study of Product-side Fairness in Bundle Recommendation
/Rec4Fun/ A Reproducibility Study of Product-side Fairness in Bundle Recommendation /Rec4Fun/ A Reproducibility Study of Product-side Fairness in Bundle Recommendation

While this problem has been widely studied in traditional recommendation settings, its implications for bundle recommendation (BR) remain largely unexplored.

Existing fairness frameworks and metrics designed for traditional recommender systems may not directly translate to this multi-layered setting.

In this paper, we conduct a comprehensive reproducibility study of product-side fairness in BR across three real-world datasets using four state-of-the-art BR methods.

We analyze exposure disparities at both the bundle and item levels using multiple fairness metrics, uncovering important patterns.

Overall, our findings offer actionable insights for building fairer bundle recommender systems and…

6 months, 3 weeks назад @ paperswithcode.com
/cbobed/ OntView: What you See is What you Meant
/cbobed/ OntView: What you See is What you Meant /cbobed/ OntView: What you See is What you Meant

However, the lack of tools that provide effective visualization is still a significant challenge.

In this paper, we present OntView, an ontology viewer that is designed to provide users with an intuitive visual representation of ontology concepts and their formal definitions through a user-friendly interface.

Building on the use of a DL reasoner, OntView follows a "What you see is what you meant" paradigm, showing the actual inferred knowledge.

One key aspect for this is its ability to visualize General Concept Inclusions (GCI), a feature absent in existing visualization tools.

OntView has been released with an open-source license for the whole community.

6 months, 3 weeks назад @ paperswithcode.com
/Rec4Fun/ RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction
/Rec4Fun/ RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction /Rec4Fun/ RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction

These approaches fail to capture elaborate relations hidden in real-world bundle structures, resulting in suboptimal bundle representations.

To overcome this limitation, we propose RaMen, a novel method that provides a holistic multi-strategy approach for bundle construction.

RaMen utilizes both intrinsic (characteristics) and extrinsic (collaborative signals) information to model bundle structures through Explicit Strategy-aware Learning (ESL) and Implicit Strategy-aware Learning (ISL).

Integrating diverse strategies enables RaMen to learn more comprehensive and robust bundle representations.

Meanwhile, Multi-strategy Alignment & Discrimination module is employed to facilitate knowledge tr…

6 months, 3 weeks назад @ paperswithcode.com
Papers With Code
последний пост 6 months, 3 weeks назад
/PrimisAI/ Adaptive Multi-Agent Reasoning via Automated Workflow Generation
/PrimisAI/ Adaptive Multi-Agent Reasoning via Automated Workflow Generation /PrimisAI/ Adaptive Multi-Agent Reasoning via Automated Workflow Generation

The rise of Large Reasoning Models (LRMs) promises a significant leap forward in language model capabilities, aiming to tackle increasingly sophisticated tasks with unprecedented efficiency and accuracy.

However, despite their impressive performance, recent studies have highlighted how current reasoning models frequently fail to generalize to novel, unseen problems, often resorting to memorized solutions rather than genuine inferential reasoning.

In this paper, we introduce Nexus Architect, an enhanced iteration of our multi-agent system framework, Nexus, equipped with a novel automated workflow synthesis mechanism.

Given a user's prompt and a small set of representative examples, the Archi…

6 months, 3 weeks назад @ paperswithcode.com
/sharanya02/ Real Time Captioning of Sign Language Gestures in Video Meetings
/sharanya02/ Real Time Captioning of Sign Language Gestures in Video Meetings /sharanya02/ Real Time Captioning of Sign Language Gestures in Video Meetings

One of the most well-established ways to enable such communication is through the use of sign-based languages.

However, not many people are aware of the smaller intricacies involved with sign language.

Sign language recognition using computer vision aims at eliminating the communication barrier between deaf-mute and ordinary people so that they can properly communicate with others.

In recent studies, it has been found that people with hearing disabilities prefer to sign over typing during these video calls.

In this paper, we are proposing a browser extension that will automatically translate sign language to subtitles for everyone else in the video call.

6 months, 3 weeks назад @ paperswithcode.com
/alessiopittiglio/ Leveraging Context for Multimodal Fallacy Classification in Political Debates
/alessiopittiglio/ Leveraging Context for Multimodal Fallacy Classification in Political Debates /alessiopittiglio/ Leveraging Context for Multimodal Fallacy Classification in Political Debates

In this paper, we present our submission to the MM-ArgFallacy2025 shared task, which aims to advance research in multimodal argument mining, focusing on logical fallacies in political debates.

Our approach uses pretrained Transformer-based models and proposes several ways to leverage context.

In the fallacy classification subtask, our models achieved macro F1-scores of 0.4444 (text), 0.3559 (audio), and 0.4403 (multimodal).

Our multimodal model showed performance comparable to the text-only model, suggesting potential for improvements.

6 months, 3 weeks назад @ paperswithcode.com
/RS2002/ One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms
/RS2002/ One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms /RS2002/ One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms

On-demand ride-sharing platforms face the fundamental challenge of dynamically bundling passengers with diverse origins and destinations and matching them with vehicles in real time, all under significant uncertainty.

However, conventional MARL-based ride-sharing approaches heavily rely on the accurate estimation of Q-values or V-values, which becomes problematic in large-scale, highly uncertain environments.

To address these challenges, we propose two novel alternative methods that bypass value function estimation.

First, we adapt GRPO to ride-sharing, replacing the PPO baseline with the group average reward to eliminate critic estimation errors and reduce training bias.

Second, inspired b…
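
To make the first idea concrete, here is a tiny sketch of a group-average (GRPO-style) advantage, which replaces a learned critic baseline; the reward values are made up for illustration:

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    # Score each rollout against the group mean (normalized by the group std),
    # so no learned critic/value baseline is required.
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# e.g. total rewards of four dispatch rollouts sampled from the same state
print(group_relative_advantages([12.0, 9.5, 14.2, 10.3]))
```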

6 months, 3 weeks назад @ paperswithcode.com
/LiXinran6/ Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation
/LiXinran6/ Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation /LiXinran6/ Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation

6 months, 3 weeks назад @ paperswithcode.com
/ShimSoonYong/ ZClassifier: Temperature Tuning and Manifold Approximation via KL Divergence on Logit Space
/ShimSoonYong/ ZClassifier: Temperature Tuning and Manifold Approximation via KL Divergence on Logit Space

We introduce a novel classification framework, ZClassifier, that replaces conventional deterministic logits with diagonal Gaussian-distributed logits. Code: https://github.com/ShimSoonYong/ZClassifier
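
A minimal sketch of the idea as described (sampling class logits from a diagonal Gaussian instead of using deterministic ones); shapes and names are illustrative, not the repository's API:

```python
import numpy as np

def sample_gaussian_logits(mu, log_var, rng):
    # Draw one sample of the class logits from N(mu, diag(exp(log_var))),
    # then softmax it into class probabilities.
    std = np.exp(0.5 * log_var)
    z = mu + std * rng.normal(size=mu.shape)
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
mu = np.array([2.0, 0.5, -1.0])         # per-class logit means
log_var = np.array([-1.0, -1.0, -1.0])  # per-class log-variances
print(sample_gaussian_logits(mu, log_var, rng))
```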

7 months назад @ paperswithcode.com
/briziorusso/ On Gradual Semantics for Assumption-Based Argumentation
/briziorusso/ On Gradual Semantics for Assumption-Based Argumentation

In this paper, we fill this gap and propose a family of novel gradual semantics for equipping assumptions, which are the core components in ABA frameworks, with dialectical strengths. Code: https://github.com/briziorusso/GradualABA

7 months назад @ paperswithcode.com
/wumingqi/ Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination
/wumingqi/ Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination

7 months назад @ paperswithcode.com
/IsaacYQH/ WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling
/IsaacYQH/ WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling

Despite rapid progress in end-to-end AI music generation, AI-driven modeling of professional Digital Signal Processing (DSP) workflows remains challenging. Code: https://github.com/IsaacYQH/WildFX

7 months назад @ paperswithcode.com
/summer1278/ Addressing Data Imbalance in Transformer-Based Multi-Label Emotion Detection with Weighted Loss
/summer1278/ Addressing Data Imbalance in Transformer-Based Multi-Label Emotion Detection with Weighted Loss

This paper explores the application of a simple weighted loss function to Transformer-based models for multi-label emotion detection in SemEval-2025 Shared Task 11. Code: https://github.com/summer1278/semeval2025-task11

7 months назад @ paperswithcode.com
/gabrielkmbo/ Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs
/gabrielkmbo/ Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs

We present Step-wise Policy for Rare-tool Knowledge (SPaRK), a novel reinforcement learning framework that teaches large language models to explore diverse tool usage patterns beyond conventional high-temperature sampling. Code: https://github.com/gabrielkmbo/explore-rl

7 months назад @ paperswithcode.com
/Cavendish518/ Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation
/Cavendish518/ Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation

Service robots are increasingly deployed in diverse and dynamic environments, where both physical layouts and social contexts change over time and across locations. Code: https://github.com/Cavendish518/LE-Nav

7 months назад @ paperswithcode.com
/MatteoFasulo/ AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles
/MatteoFasulo/ AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles

7 months назад @ paperswithcode.com
/VCA-EPFL/ SystolicAttention: Fusing FlashAttention within a Single Systolic Array
/VCA-EPFL/ SystolicAttention: Fusing FlashAttention within a Single Systolic Array

The frequent data swaps between the systolic array and external vector units result in low systolic array utilization. Code: https://github.com/VCA-EPFL/FSA

7 months назад @ paperswithcode.com
/Buddhi19/ Precision Spatio-Temporal Feature Fusion for Robust Remote Sensing Change Detection
/Buddhi19/ Precision Spatio-Temporal Feature Fusion for Robust Remote Sensing Change Detection

7 months назад @ paperswithcode.com
Papers With Code
последний пост 6 months, 3 weeks назад
/fudanvi/ Beyond Task-Specific Reasoning: A Unified Conditional Generative Framework for Abstract Visual Reasoning
/fudanvi/ Beyond Task-Specific Reasoning: A Unified Conditional Generative Framework for Abstract Visual Reasoning

7 months назад @ paperswithcode.com
/benedekrozemberczki/ PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training
/benedekrozemberczki/ PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training

Spatiotemporal graph neural networks (ST-GNNs) are powerful tools for modeling spatial and temporal data dependencies. Code: https://github.com/benedekrozemberczki/pytorch_geometric_temporal

7 months назад @ paperswithcode.com
/chengxuphd/ DCR: Quantifying Data Contamination in LLMs Evaluation
/chengxuphd/ DCR: Quantifying Data Contamination in LLMs Evaluation

7 months назад @ paperswithcode.com
/gitter-lab/ Assay2Mol: large language model-based drug design using BioAssay context
/gitter-lab/ Assay2Mol: large language model-based drug design using BioAssay context

Scientific databases aggregate vast amounts of quantitative data alongside descriptive text. Code: https://github.com/gitter-lab/Assay2Mol

7 months назад @ paperswithcode.com
/hayatkhan8660-maker/ DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition
/hayatkhan8660-maker/ DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition

We employ forward Kullback-Leibler (KL) divergence alongside spatio-temporal focal modulation to effectively transfer both local and global context from the Video-FocalNet Base (teacher) to the proposed VFL-Net (student). Code: https://github.com/hayatkhan8660-maker/DVFL-Net
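
A small sketch of a forward KL distillation term of that kind (temperature and logits are illustrative; the actual DVFL-Net loss also involves the focal-modulation features):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def forward_kl(teacher_logits, student_logits, tau=2.0):
    # KL(teacher || student) between temperature-softened class distributions.
    p_t = softmax(np.asarray(teacher_logits, dtype=float) / tau)
    p_s = softmax(np.asarray(student_logits, dtype=float) / tau)
    return float(np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))))

print(forward_kl([3.0, 1.0, 0.2], [2.4, 1.2, 0.1]))
```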

7 months назад @ paperswithcode.com
/JudyJuezhuLong/ Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows
/JudyJuezhuLong/ Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows

7 months назад @ paperswithcode.com
/joaojcorreia/ A Fuzzy Approach to Project Success: Measuring What Matters
/joaojcorreia/ A Fuzzy Approach to Project Success: Measuring What Matters

This paper introduces a novel approach to project success evaluation by integrating fuzzy logic into an existing construct. Code: https://github.com/joaojcorreia/FuzzyLogic_ProjectSuccess

7 months назад @ paperswithcode.com
/kunkunlin1221/ InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing
/kunkunlin1221/ InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing

Extensive experiments demonstrate the effectiveness of InstructFLIP by outperforming SOTA models in accuracy and substantially reducing training redundancy across diverse domains in FAS. Code: https://github.com/kunkunlin1221/InstructFLIP

7 months назад @ paperswithcode.com
/Linvyl/ Describe Anything Model for Visual Question Answering on Text-rich Images
/Linvyl/ Describe Anything Model for Visual Question Answering on Text-rich Images

Recent progress has been made in region-aware vision-language modeling, particularly with the emergence of the Describe Anything Model (DAM). Code: https://github.com/Linvyl/DAM-QA

7 months назад @ paperswithcode.com
/abhijeet3922/ Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker
/abhijeet3922/ Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker

We propose a multi-step custom implementation utilizing widely adopted hybrid search (metadata & embedding) and a state-of-the-art late-interaction re-ranker to retrieve the best matching pages. Code: https://github.com/abhijeet3922/vision-RAG

7 months назад @ paperswithcode.com
/ziangcao0312/ PhysX: Physical-Grounded 3D Asset Generation
/ziangcao0312/ PhysX: Physical-Grounded 3D Asset Generation

3D modeling is moving from virtual to physical. Code: https://github.com/ziangcao0312/PhysX

7 months назад @ paperswithcode.com
/henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy
/henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy

We present SpatialTrackerV2, a feed-forward 3D point tracking method for monocular videos. Code: https://github.com/henry123-boy/SpaTrackerV2

7 months назад @ paperswithcode.com
/cncs-fit/ Emergence of Functionally Differentiated Structures via Mutual Information Optimization in Recurrent Neural Networks
/cncs-fit/ Emergence of Functionally Differentiated Structures via Mutual Information Optimization in Recurrent Neural Networks

Analysis of network performance, correlation patterns, and weight matrices reveals that mutual information minimization yields high task performance alongside clear functional modularity and moderate structural modularity. Code: https://github.com/cncs-fit/mio_rnn

7 months назад @ paperswithcode.com
/coswindywang/ Making Language Model a Hierarchical Classifier and Generator
/coswindywang/ Making Language Model a Hierarchical Classifier and Generator

Language heads of the last layer are copied to different selected intermediate layers, and fine-tuned with different task inputs. Code: https://github.com/coswindywang/HdLM

7 months назад @ paperswithcode.com
/ahmedehabb/ From Roots to Rewards: Dynamic Tree Reasoning with RL
/ahmedehabb/ From Roots to Rewards: Dynamic Tree Reasoning with RL

Modern language models address complex questions through chain-of-thought (CoT) reasoning (Wei et al., 2023) and retrieval augmentation (Lewis et al., 2021), yet struggle with error propagation and knowledge integration. Code: https://github.com/ahmedehabb/From-Roots-to-Rewards-Dynamic-Tree-Reasoning-with-RL

7 months назад @ paperswithcode.com
💼 University and corporate labs
DeepMind
последний пост 2 days, 7 hours назад
Gemini 3 Deep Think: Advancing science, research and engineering
Gemini 3 Deep Think: Advancing science, research and engineering Gemini 3 Deep Think: Advancing science, research and engineering

Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.

2 days, 7 hours назад @ deepmind.google
Accelerating Mathematical and Scientific Discovery with Gemini Deep Think
Accelerating Mathematical and Scientific Discovery with Gemini Deep Think Accelerating Mathematical and Scientific Discovery with Gemini Deep Think

Under direction from expert mathematicians and scientists, Gemini Deep Think is solving professional research problems across mathematics, physics, and computer science. In the summer of 2025, an advanced version of Gemini Deep Think achieved Gold-medal standard at the International Mathematics Olympiad (IMO) and, later, an updated version obtained similar results at the International Collegiate Programming Contest.

Since then, Gemini Deep Think mode has moved into science, engineering and enterprise workflows to tackle more complex, open-ended challenges.

In the last week, our teams published two papers (1, 2) detailing a cross-disciplinary effort to solve professional research problems usin…

5 days, 7 hours назад @ deepmind.google
Project Genie: Experimenting with infinite, interactive worlds
Project Genie: Experimenting with infinite, interactive worlds Project Genie: Experimenting with infinite, interactive worlds

Starting today, we're rolling out access to Project Genie for Google AI Ultra subscribers in the U.S. (18+).

This experimental research prototype lets users create, explore and remix their own interactive worlds.

How we’re advancing world models: A world model simulates the dynamics of an environment, predicting how they evolve and how actions affect them.

Building on our model research with trusted testers from across industries and domains, we are taking the next step with an experimental research prototype: Project Genie.

How Project Genie works: Project Genie is a prototype web app powered by Genie 3, Nano Banana Pro and Gemini, which allows users to experiment with the immersive experiences…

2 weeks, 2 days назад @ blog.google
D4RT: Teaching AI to see the world in four dimensions
D4RT: Teaching AI to see the world in four dimensions D4RT: Teaching AI to see the world in four dimensions

Introducing D4RT, a unified AI model for 4D scene reconstruction and tracking across space and time.

Today, we are introducing D4RT (Dynamic 4D Reconstruction and Tracking), a new AI model that unifies dynamic scene reconstruction into a single, efficient framework, bringing us closer to the next frontier of artificial intelligence: total perception of our dynamic reality.

How D4RT Works: A Query-Based Approach. D4RT operates as a unified encoder-decoder Transformer architecture.

The encoder first processes the input video into a compressed representation of the scene’s geometry and motion.

This makes D4RT extremely fast and scalable, whether it’s tracking just a few points or reconstructing …

4 weeks, 1 day назад @ deepmind.google
Veo 3.1 Ingredients to Video: More consistency, creativity and control
Veo 3.1 Ingredients to Video: More consistency, creativity and control Veo 3.1 Ingredients to Video: More consistency, creativity and control

Today, Veo is getting more expressive, with improvements that help you create more fun, creative, high-quality videos based on ingredient images, built directly for the mobile format.

We’re excited to bring new creative possibilities for everyone from casual storytellers to professional filmmakers.

We’re releasing:

Improvements to Veo 3.1 Ingredients to Video, our capability that lets you create videos based on reference images. This update makes videos more expressive and creative, even with simple prompts.

Native vertical outputs for Ingredients to Video (portrait mode) to power mobile-first, short-form video creation.

State-of-the-art upscaling to 1080p and 4K resolution for high-fidelity p…

1 month назад @ blog.google
Google's year in review: 8 areas with research breakthroughs in 2025
Google's year in review: 8 areas with research breakthroughs in 2025 Google's year in review: 8 areas with research breakthroughs in 2025

Google 2025 recap: Research breakthroughs of the year

1 month, 3 weeks назад @ deepmind.google
Gemini 3 Flash: frontier intelligence built for speed
Gemini 3 Flash: frontier intelligence built for speed Gemini 3 Flash: frontier intelligence built for speed

Today, we're expanding the Gemini 3 model family with the release of Gemini 3 Flash, which offers frontier intelligence built for speed at a fraction of the cost.

Last month, we kicked off Gemini 3 with Gemini 3 Pro and Gemini 3 Deep Think mode, and the response has been incredible.

With Gemini 3, we introduced frontier performance across complex reasoning, multimodal and vision understanding and agentic and vibe coding tasks.

Gemini 3 Flash retains this foundation, combining Gemini 3's Pro-grade reasoning with Flash-level latency, efficiency and cost.

Starting today, Gemini 3 Flash is rolling out to millions of people globally:

1 month, 4 weeks назад @ blog.google
Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior
Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior

Today, we are releasing Gemma Scope 2: a comprehensive, open suite of interpretability tools for all Gemma 3 model sizes, from 270M to 27B parameters.

Producing Gemma Scope 2 involved storing approximately 110 Petabytes of data, as well as training over 1 trillion total parameters.

What’s new in Gemma Scope 2: Interpretability research aims to understand the internal workings and learned algorithms of AI models.

Like its predecessor, Gemma Scope 2 acts as a microscope for the Gemma family of language models.

While the original Gemma Scope enabled research in key areas of safety, such as model hallucination, identifying secrets known by a model, and training safer models, Gemma Scope 2 support…

2 months назад @ deepmind.google
Improved Gemini audio models for powerful voice experiences
Improved Gemini audio models for powerful voice experiences Improved Gemini audio models for powerful voice experiences

Earlier this week, we introduced greater control over audio generation with an upgrade to our Gemini 2.5 Pro and Flash Text-to-Speech models.

Today, we’re releasing an updated Gemini 2.5 Flash Native Audio for live voice agents.

Gemini 2.5 Flash Native Audio is now available across Google products including Google AI Studio, Vertex AI, and has also started rolling out in Gemini Live and Search Live, bringing the naturalness of native audio to Search Live for the first time.

Beyond powering helpful agents, native audio unlocks new possibilities for global communication.

Live Voice Agents

2 months назад @ blog.google
Deepening our partnership with the UK AI Security Institute
Deepening our partnership with the UK AI Security Institute Deepening our partnership with the UK AI Security Institute

Today, we're announcing an expanded partnership with the UK AI Security Institute (AISI) through a new Memorandum of Understanding focused on foundational security and safety research, to help ensure artificial intelligence is developed safely and benefits everyone.

The research partnership with AISI is an important part of our broader collaboration with the UK government on accelerating safe and beneficial AI progress.

This is why we have partnered with the UK AISI since its inception in November 2023 to test our most capable models.

We are actively working with AISI to build more robust evaluations for AI models, and our teams have collaborated on safety research to move the field forward…

2 months назад @ deepmind.google
Strengthening our partnership with the UK government to support prosperity and security in the AI era
Strengthening our partnership with the UK government to support prosperity and security in the AI era Strengthening our partnership with the UK government to support prosperity and security in the AI era

The UK has already laid a strong foundation to seize this moment and is uniquely positioned to translate AI innovation into public benefit.

That’s why we are excited to deepen our collaboration with the UK government to accelerate this work and offer a blueprint for other countries.

Accelerating access to frontier AI in key sectors: Science & Education. Our partnership will center on providing access to frontier AI in two areas foundational to the UK’s long-term success: scientific discovery and education.

The UK has a rich history of applying new technologies to drive scientific progress, from Hooke’s microscope to Faraday’s electrical experiments.

Establishing Google DeepMind’s first automa…

2 months назад @ deepmind.google
FACTS Benchmark Suite: Systematically evaluating the factuality of large language models
FACTS Benchmark Suite: Systematically evaluating the factuality of large language models FACTS Benchmark Suite: Systematically evaluating the factuality of large language models

The FACTS Benchmark Suite: Today, we’re teaming up with Kaggle to introduce the FACTS Benchmark Suite.

A Search Benchmark that tests a model’s ability to use Search as a tool to retrieve information and synthesize it correctly.

Similar to our previous release, we are following standard industry practice and keeping an evaluation set held-out as a private set.

The FACTS Benchmark Suite Score (or FACTS Score) is calculated as the average accuracy of both public and private sets across the four benchmarks.

Kaggle will oversee the management of the FACTS Benchmark Suite.

2 months, 1 week назад @ deepmind.google
Engineering more resilient crops for a warming climate
Engineering more resilient crops for a warming climate Engineering more resilient crops for a warming climate

Scientists are using AlphaFold in their research to strengthen an enzyme that’s vital to photosynthesis, paving the way for more heat-tolerant crops.

As global warming accompanies more droughts and heatwaves, harvests of some staple crops are shrinking.

But less visible is what is happening inside these plants, where high heat can break down the molecular machinery that keeps them alive.

Plants use photosynthesis to produce the glucose that fuels their growth via an intricate choreography of enzymes inside plant cells.

"Our job is to learn from those examples and build that same resilience into the crops we depend on."

2 months, 1 week назад @ deepmind.google
AlphaFold: Five years of impact
AlphaFold: Five years of impact AlphaFold: Five years of impact

They used AlphaFold alongside comparative genomics to better understand how plants perceive changes in their environment, paving the way for more resilient crops.

AlphaFold has been cited in more than 35,000 papers and more than 200,000 papers incorporated elements of AlphaFold 2 in their methodology.

An independent analysis of AlphaFold 2’s impact, carried out by the Innovation Growth Lab, suggests that researchers using AlphaFold 2 see an increase of over 40% in their submission of novel experimental protein structures.

Those protein structures are more likely to be dissimilar to known structures, encouraging the exploration of uncharted areas of science.

The AlphaFold Server is empowerin…

2 months, 3 weeks назад @ deepmind.google
Revealing a key protein behind heart disease
Revealing a key protein behind heart disease Revealing a key protein behind heart disease

Both have a family history of heart disease – a reminder of what’s at stake in their work to better understand and ultimately help treat this deadly condition.

That protein, apoB100, has defied mapping not only because it’s enormous (for a protein), but also because it connects to fats and other molecules in complicated ways.

ApoB100 forms the molecular scaffold of “bad cholesterol”, which is known to scientists as low-density lipoprotein (LDL).

Discovering the structure of its key protein promised to shed light on how bad cholesterol becomes harmful inside the body, giving scientists a better chance to develop ways to prevent and treat ASCVD.

The images weren’t sharp enough to map the stru…

2 months, 3 weeks назад @ deepmind.google
Google
последний пост 3 days, 6 hours назад
Build financial resilience with AI-powered tabletop exercises on Google Cloud
Build financial resilience with AI-powered tabletop exercises on Google Cloud Build financial resilience with AI-powered tabletop exercises on Google Cloud

The risk is amplified by major regulatory drivers like the Digital Operational Resilience Act (DORA), which mandates that financial institutions are ready for any disruption.

The recent designation of Google Cloud as a Critical Third-Party Service Provider (CTPP) under DORA further underscores this strong commitment to enabling secure and resilient financial operations for our customers.

For them, a critical incident means more than just downtime; it means regulatory fines and an erosion of client trust.

In this blog, we’ll share a solution using context-aware scenario modeling on Gemini Enterprise.

Context-aware scenario modeling powered by Google AI: Google Cloud’s Technical Account Managem…

3 days, 6 hours назад @ cloud.google.com
Gemini Enterprise Agent Ready (GEAR) program now available, a new path to building AI agents at scale
Gemini Enterprise Agent Ready (GEAR) program now available, a new path to building AI agents at scale Gemini Enterprise Agent Ready (GEAR) program now available, a new path to building AI agents at scale

To meet this moment, we are excited to open the Gemini Enterprise Agent Ready (GEAR) learning program to everyone.

As a new specialized pathway within the Google Developer Program, GEAR empowers developers and pros to build and deploy enterprise-grade agents with Google AI.

That is why GEAR membership includes 35 monthly learning credits on Google Skills.

Turn technical proficiency into industry-recognized credentials: As you progress through these learning paths, GEAR helps turn your progress into tangible credentials.

You can access GEAR benefits by creating or signing into your Google Developer Program profile and claiming the GEAR badge.

4 days, 6 hours назад @ cloud.google.com
How we cut Vertex AI latency by 35% with GKE Inference Gateway
How we cut Vertex AI latency by 35% with GKE Inference Gateway How we cut Vertex AI latency by 35% with GKE Inference Gateway

As generative AI moves from experimentation to production, platform engineers face a universal challenge for inference serving: you need low latency, high throughput, and manageable costs.

Our solution: To solve this, the Vertex AI engineering team adopted the GKE Inference Gateway.

Content-aware routing: It inspects request prefixes and routes to the pod that already has that context in its KV cache, avoiding expensive re-computation.

By migrating production workloads to this architecture, Vertex AI proves that this dual-layer intelligence is the key to unlocking performance at scale.

The results: Validated at production scale. By placing GKE Inference Gateway in front of the Vertex AI model…
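
As a rough illustration of the content-aware routing idea described above (a client-side toy, not the GKE Inference Gateway implementation; pod names and the prefix length are made up):

```python
import hashlib

PODS = ["pod-a", "pod-b", "pod-c"]
prefix_owner = {}   # prefix hash -> pod assumed to hold that prefix in its KV cache

def route(prompt, prefix_len=64):
    # Requests sharing a prompt prefix go to the same pod, so its KV cache is
    # reused; unseen prefixes are spread across pods by hash.
    key = hashlib.sha256(prompt[:prefix_len].encode()).hexdigest()
    if key not in prefix_owner:
        prefix_owner[key] = PODS[int(key, 16) % len(PODS)]
    return prefix_owner[key]

print(route("System: you are a helpful assistant.\nUser: summarize this document"))
```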

1 week, 1 day назад @ cloud.google.com
Announcing Claude Opus 4.6 on Vertex AI
Announcing Claude Opus 4.6 on Vertex AI Announcing Claude Opus 4.6 on Vertex AI

Feature availability for Claude Opus 4.6 on Vertex AI: Adaptive Thinking (GA), Fine-grained tool streaming toggle (GA), Effort parameter (GA), 128k Output Tokens (GA), Compaction API (Preview), 1M Context Window (Preview), Tool params quoting consistency (GA).

Build agentic systems with a complete AI stack: Vertex AI eliminates the trade-off between accessing frontier models and utilizing a world-class AI development platform.

Because building sophisticated agents requires more than just a model, Vertex AI provides a complete agentic stack designed to manage the complexities of production, governance, and global scale.

Rapidly build intelligent agents: Leverage the Vertex AI Agent Builder stack—incl…

1 week, 2 days назад @ cloud.google.com
High-performance inference meets serverless compute with NVIDIA RTX PRO 6000 on Cloud Run
High-performance inference meets serverless compute with NVIDIA RTX PRO 6000 on Cloud Run High-performance inference meets serverless compute with NVIDIA RTX PRO 6000 on Cloud Run

Cloud Run lets you attach a NVIDIA RTX PRO 6000 Blackwell GPU to your Cloud Run service, job, or worker pools, on demand, with no reservations required.

Here are some ways you can use the NVIDIA RTX PRO 6000 Blackwell GPU to accelerate your business:

Generative AI and inference: With its FP4 precision support, the NVIDIA RTX PRO 6000 Blackwell GPU’s high-efficiency compute accelerates LLM fine-tuning and inference, letting you create real-time generative AI applications such as multi-modal and text-to-image creation models.

Fine-tuning and offline inference: NVIDIA RTX PRO 6000 Blackwell GPUs can be used in conjunction with Cloud Run jobs to fine-tune your model.

Some highlights of Cloud Ru…

1 week, 5 days назад @ cloud.google.com
Build intelligent employee onboarding with Gemini Enterprise
Build intelligent employee onboarding with Gemini Enterprise Build intelligent employee onboarding with Gemini Enterprise

Employee onboarding is rarely a linear process.

In this article, we show you how developers use the Agent Development Kit (ADK), Agent Engine, and Application Integration to build custom agents that connect into your enterprise systems (such as ITSM, ERP, and CRM).

Once built, these agents are published to the Gemini Enterprise agent gallery, where employees can easily access and interact with them.

The result is a more personal employee onboarding experience.

Read on to learn how to connect conversational AI with your essential enterprise systems.

1 week, 5 days назад @ cloud.google.com
Introducing Conversational Analytics in BigQuery
Introducing Conversational Analytics in BigQuery Introducing Conversational Analytics in BigQuery

Businesses want to move quickly and make informed decisions, but the explosion of data in today’s organizations can often leave knowledge teams buried and business users waiting in lengthy queues for the data insights they need.

Today, we are unveiling Conversational Analytics in BigQuery in preview.

Following the general availability of Conversational Analytics in Looker, this integration brings a sophisticated AI-powered reasoning engine directly into BigQuery Studio.

Data insights for data professionals, made conversational: Conversational Analytics in BigQuery is more than a simple chatbot.

With Conversational Analytics in BigQuery, technical and business teams can build and deploy intell…

2 weeks, 2 days назад @ cloud.google.com
What's new with ML infrastructure for Dataflow
What's new with ML infrastructure for Dataflow What's new with ML infrastructure for Dataflow

At Google Cloud, we’re committed to providing best-in-class infrastructure to power your AI and ML workloads.

We’re excited to share a wave of recent features and capabilities that give you more choice, greater obtainability, and improved efficiency when it comes to running your batch and streaming ML workloads.

More choice: Performance-optimized hardware. We understand that all ML workloads are not created equal.

We recently announced support for TPU V5E, V5P and V6E, which enable state-of-the-art ML builders to efficiently run high-volume, low-latency machine learning inference workloads at scale, directly within their Dataflow jobs.

Greater accelerator obtainability: Getting access to the ha…

2 weeks, 4 days назад @ cloud.google.com
BigQuery AI supports Gemini 3.0, simplified embedding generation and new similarity function
BigQuery AI supports Gemini 3.0, simplified embedding generation and new similarity function BigQuery AI supports Gemini 3.0, simplified embedding generation and new similarity function

The digital landscape is flooded with unstructured data — images, videos, audio, and documents — that often remain untapped.

To help you unlock this data's potential with minimal friction, we have integrated Gemini and other Vertex AI models directly into BigQuery, simplifying how you work with generative AI and embedding models using BigQuery SQL. New launches in this area further simplify setup and expand what you can do with AI functions:

Simplified permission setup by using End User Credentials (EUC)

AI.generate() function for both text and structured data generation

AI.embed() function for embedding generation

AI.similarity() for computing semantic similarity scores between text and imag…

2 weeks, 5 days назад @ cloud.google.com
Monitoring Google ADK agentic applications with Datadog LLM Observability
Monitoring Google ADK agentic applications with Datadog LLM Observability Monitoring Google ADK agentic applications with Datadog LLM Observability

Google’s Agent Development Kit (ADK) gives you the building blocks to create powerful agentic systems.

To help you manage this complexity, Datadog LLM Observability now provides automatic instrumentation for systems built with ADK.

Safety and accuracy: LLM responses may contain hallucinations, sensitive data, or prompt injection attempts, risking security incidents and reduced customer trust.

Get started with Datadog LLM Observability: Datadog LLM Observability simplifies monitoring and debugging for Google ADK systems, helping users debug agent operations, evaluate responses, iterate quickly, and validate changes before deploying them to production.

For more information on how to debug agent…

3 weeks, 1 day назад @ cloud.google.com
How Fastweb + Vodafone reimagined data workflows with Spanner & BigQuery
How Fastweb + Vodafone reimagined data workflows with Spanner & BigQuery How Fastweb + Vodafone reimagined data workflows with Spanner & BigQuery

Editor’s note: Fastweb + Vodafone, a leading telecommunications provider in Italy, created its Customer 360 platform to support faster and more personalized customer experiences across channels.

Fastweb + Vodafone now delivers real-time insights across the organization and is preparing for expanded AI agent use cases in the future.

Our team’s job is to give everyone at our business access to the customer insights they need, when they need them.

Both companies had already begun modernizing these customer data workflows on Google Cloud, especially with BigQuery.

Choosing a connected platform, not a single tool: When we stepped back and looked at what Fastweb + Vodafone needed, it was clear we w…

3 weeks, 2 days назад @ cloud.google.com
Scaling WideEP Mixture-of-Experts inference with Google Cloud A4X (GB200) and NVIDIA Dynamo
Scaling WideEP Mixture-of-Experts inference with Google Cloud A4X (GB200) and NVIDIA Dynamo Scaling WideEP Mixture-of-Experts inference with Google Cloud A4X (GB200) and NVIDIA Dynamo

These recipes are part of a broader collaboration between our organizations that includes investments in important inference infrastructure like Dynamic Resource Allocation (DRA) and Inference Gateway.

Serving architecture: NVIDIA Dynamo functions as the distributed runtime, managing KV cache state and kernel scheduling across the rack-scale fabric.

We combine the A4X at the infrastructure layer with NVIDIA Dynamo at the model serving Layer, orchestrated by GKE.

Serving layer: NVIDIA Dynamo. Hardware of this scale requires a runtime capable of managing distributed state without introducing synchronization overhead.

With full support for the NVIDIA Dynamo suite — from intelligent routing and s…

3 weeks, 2 days назад @ cloud.google.com
Cloud CISO Perspectives: Practical guidance on building with SAIF
Cloud CISO Perspectives: Practical guidance on building with SAIF Cloud CISO Perspectives: Practical guidance on building with SAIF

The SAIF Data Controls address risks in data sourcing, management, and use for model training and user interaction, ensuring privacy, integrity, and authorized use throughout the AI lifecycle.

Key data security tools on Google Cloud include: Identity and Access Management, access controls for Cloud Storage and BigQuery, Dataplex for data governance, and Vertex AI managed datasets.

The SAIF Model Controls are designed to build resilience into the model and sanitize its inputs and outputs to protect against these emerging threats.

SAIF recommends application controls to secure the interface between the end user and the AI model.

Governance controls, assurance controls (including red teaming a…

4 weeks, 1 day назад @ cloud.google.com
Introducing BigQuery managed and SQL-native inference for open models
Introducing BigQuery managed and SQL-native inference for open models Introducing BigQuery managed and SQL-native inference for open models

A SQL-native workflow with automated management: With the launch of managed third-party generative AI inference in BigQuery (Preview), you can now run open models using just two SQL statements.

This new capability delivers four key benefits:

Simplified deployment: Deploy open models using a single CREATE MODEL SQL statement with the model id string (e.g., google/gemma-3-1b-it).

Unified SQL interface: The entire workflow — from model creation and inference to cost management and cleanup — is managed directly in BigQuery using SQL.

How it works: A practical example. Let’s take a look at the process of creating and utilizing an open model.

Step 1: Create a BigQuery managed open model. To use an op…

1 month назад @ cloud.google.com
Palo Alto Networks automates customer intelligence document creation with agentic design
Palo Alto Networks automates customer intelligence document creation with agentic design Palo Alto Networks automates customer intelligence document creation with agentic design

Vertex AI Agent Engine : The fully managed, serverless environment that hosts and executes the ADK-based agent, handling scaling, security, and operational overhead.

Cloud Storage: Served as storage for the unstructured customer documents needed to answer the DOR questions.

The solution involved shifting the responsibility for state management to a FastAPI server acting as an orchestrator.

This separation leverages Vertex AI Agent Engine's fully managed and scalable environment for the agent, while providing the flexibility of a custom backend orchestrator.

Vertex AI Agent Engine's horizontal scaling capabilities further supported this parallel execution.

1 month назад @ cloud.google.com
OpenAI
последний пост None
Microsoft
последний пост 1 week, 2 days назад
Rethinking imitation learning with Predictive Inverse Dynamics Models

Predictive Inverse Dynamics Models (PIDMs) predict plausible future states, clarifying the direction of behavior during imitation learning.

Predictive Inverse Dynamics Models (PIDMs) offer a different take on imitation learning by changing how agents interpret human behavior.

By grounding the selection process in a plausible future, PIDMs provide a clearer basis for choosing an action during inference.

They then use an inverse dynamics model to predict the action required to move from the current state towards that future state.
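The two-stage structure described above can be sketched generically: one network proposes a plausible future state, a second (the inverse dynamics model) infers the action that would move the agent toward it. The PyTorch sketch below is illustrative only; architectures and dimensions are not the paper's.

```python
import torch
import torch.nn as nn

state_dim, action_dim = 32, 8

# Stage 1: predict a plausible future state from the current state.
future_predictor = nn.Sequential(
    nn.Linear(state_dim, 128), nn.ReLU(), nn.Linear(128, state_dim)
)
# Stage 2: inverse dynamics -- infer the action that moves current -> predicted future.
inverse_dynamics = nn.Sequential(
    nn.Linear(2 * state_dim, 128), nn.ReLU(), nn.Linear(128, action_dim)
)

s_t = torch.randn(1, state_dim)
s_future = future_predictor(s_t)
action = inverse_dynamics(torch.cat([s_t, s_future], dim=-1))
print(action.shape)  # torch.Size([1, 8])
```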

A player (left) and a PIDM agent (right) side by side playing the game Bleeding Edge.

1 week, 2 days назад @ microsoft.com
Paza: Introducing automatic speech recognition benchmarks and models for low resource languages

At a glance: Microsoft Research releases PazaBench and Paza automatic speech recognition models, advancing speech technology for low-resource languages.

First-of-its-kind ASR leaderboard, starting with African languages: PazaBench is the first automatic speech recognition (ASR) leaderboard for low-resource languages.

It launches with initial coverage for 39 African languages and benchmarks 52 state‑of‑the‑art ASR and language models, including newly released Paza ASR models for six Kenyan languages.

Paza ASR Models: Built with and for Kenyan languages. The Paza ASR models consist of three fine-tuned ASR models built on top of state‑of‑the‑art model architectures.

Here is how Paza models compa…

1 week, 2 days назад @ microsoft.com
UniRG: Scaling medical imaging report generation with multimodal reinforcement learning

At a glance: AI-driven medical image report generation can help medical providers become more efficient and productive.

Universal Report Generation (UniRG) uses reinforcement learning to align model training with real-world radiology practice rather than proxy text-generation objectives.

Beyond the real-world benefits, report generation has also become a critical benchmark for evaluating multimodal reasoning in healthcare AI.

In this blog, we introduce Universal Report Generation (UniRG) (opens in new tab), a reinforcement learning–based framework for medical imaging report generation.

UniRG is a promising step toward scaling medical imaging report generation. UniRG introduces a reinforcement …

2 weeks, 4 days назад @ microsoft.com
Multimodal reinforcement learning with agentic verifier for AI agents

It’s an agentic verification framework designed to improve the reliability of reinforcement learning in multimodal models.

Reinforcement learning is a training method where AI models learn by receiving rewards for desired behaviors and penalties for undesired ones, gradually improving their performance through trial and error.

This design prevents unreliable feedback from dominating training and produces a stable reward signal for reinforcement learning.

Models trained with Argos also showed a substantial reduction of hallucinations compared with both standard chain-of-thought prompting and reinforcement learning baselines.

How Argos shapes reinforcement learning. To understand how Argos affe…

3 weeks, 4 days назад @ microsoft.com
OptiMind: A small language model with optimization expertise

OptiMind is a small language model designed to convert business problems described in natural language into the mathematical formulations needed by optimization software.

Enterprises across industries, from energy to finance, use optimization models to plan complex operations like supply chains and logistics.

To address this challenge, we’re introducing OptiMind, a small language model designed to convert problems described in plain language into the mathematical formulations that optimization software needs.

Built on a 20-billion parameter model, OptiMind is compact by today’s standards yet matches the performance of larger, more complex systems.

From problem description to solution. One of …

1 month назад @ microsoft.com
Agent Lightning: Adding reinforcement learning to AI agents without code rewrites

To address this, a research team from Microsoft Research Asia – Shanghai has introduced Agent Lightning.

Whether it involves multiple collaborating agents or dynamic tool use, Agent Lightning breaks it down into a sequence of transitions.

Agent Lightning as middleware: Agent Lightning serves as middleware between RL algorithms and agent environments, providing modular components that enable scalable RL through standardized protocols and well-defined interfaces.

In practice, developers can keep their existing agent frameworks and switch model calls to the Agent Lightning API without changing their agent code (Figure 5).

By bridging existing agentic systems with reinforcement learning, Age…

2 months назад @ microsoft.com
Promptions helps make AI prompting more precise with dynamic UI controls

To address this, we are excited to introduce Promptions (prompt + options), a UI framework that helps developers build AI interfaces with more precise user control.

We compared the static design from the first study, called the “Static Prompt Refinement Control” (Static PRC), against a “Dynamic Prompt Refinement Control” (Dynamic PRC) with features that responded to participants’ feedback.

Comparison of user preferences for Static PRC versus Dynamic PRC across key evaluation criteria.

(1) The Option Module reads the user’s prompt and conversation history and (2) generates prompt options.

Key usability challenges include clarifying how dynamic options affect AI output and managing the comple…

2 months назад @ microsoft.com
GigaTIME: Scaling tumor microenvironment modeling using virtual population generated by multimodal AI

C, Scatter plot comparing the subtype-level GigaTIME-translated virtual mIF activations between TCGA and Providence virtual populations.

To our knowledge, this is the first large-scale study exploring multimodal AI for scaling virtual mIF generation.

H, A case study showcasing the activation maps across different virtual mIF channels for a H&E slide in our virtual population, and virtual mIF of sample patches from this slide.

By applying GigaTIME to Providence real-world data, we generated a virtual population of 14,256 patients with virtual mIF and key clinical attributes.

G, Bar plot comparing pan-cancer patient stratification performance in terms of survival log rank p-values among virtu…

2 months, 1 week назад @ microsoft.com
Ideas: Community building, machine learning, and the future of AI

This week, machine learning researchers around the world will be attending the annual Conference on Neural Information Processing Systems, or NeurIPS.

In this series, we’ll explore the technologies that are shaping our future and the big ideas that propel them forward.

So around that time when I started my PhD at Penn, I was working in machine learning theory and algorithmic economics.

How had you experienced a lack of community or network of women in machine learning before the founding of WiML?

So particularly when working on topics related to fairness, I’ve ended up focusing a bunch on stuff to do with marginalized groups as part of my responsible AI work.

2 months, 2 weeks назад @ microsoft.com
Reducing Privacy leaks in AI: Two approaches to contextual integrity

The theory of contextual integrity frames privacy as the appropriateness of information flow within specific social contexts.

Each tackles contextual integrity from a different angle, but both aim to build directly into AI systems a greater sensitivity to information-sharing norms.

Contextual Integrity in LLMs via Reasoning and Reinforcement Learning, accepted at NeurIPS 2025, takes a different approach to applying contextual integrity.

Contextual integrity through reasoning and reinforcement learning: In our second paper, we explore whether contextual integrity can be built into the model itself rather than enforced through external checks at inference time.

To address this trade-off, we int…

2 months, 3 weeks назад @ microsoft.com
Fara-7B: An Efficient Agentic Model for Computer Use

Today, we are pleased to announce Fara-7B, our first agentic SLM designed specifically for computer use.

Unlike traditional chat models that generate text-based responses, Computer Use Agent (CUA) models like Fara-7B leverage computer interfaces, such as a mouse and keyboard, to complete tasks on behalf of users.

This results in reduced latency and improved privacy, as user data remains local.

Fara-7B breaks ground on a new Pareto frontier, showing that on-device computer use agents are approaching the capabilities of frontier models.

For guidance on how to use our model safely, and the security considerations to be mindful of when using our model, please refer to our Model card (opens in n…

2 months, 3 weeks назад @ microsoft.com
MMCTAgent: Enabling multimodal reasoning over large video and image collections

Real-world reasoning increasingly involves analyzing long-form video content, where context spans minutes or hours, far beyond the context limits of most models.

The Planner agent decomposes a user query, identifies the appropriate reasoning tools, performs multimodal operations, and drafts a preliminary answer.

MMCTAgent’s Planner–Critic architecture enables multimodal reasoning over long-form video through structured ingestion, retrieval, and iterative feedback.

The VideoAgent extends this architecture to long-form video reasoning.

Takeaways and next steps: MMCTAgent demonstrates a scalable agentic approach to multimodal reasoning with a Planner–Critic architecture.

3 months назад @ microsoft.com
BlueCodeAgent: A blue teaming agent enabled by automated red teaming for CodeGen AI

Many studies have explored red teaming code LLMs, testing whether the models can reject unsafe requests and whether their generated code exhibits insecure patterns.

Knowledge-enhanced blue teaming: Building on the foundation of red-teaming knowledge, BlueCodeAgent significantly improves blue-teaming performance by leveraging constitutions derived from knowledge and dynamic testing.

Generalization to seen and unseen risks: Empowered by comprehensive red-teaming knowledge, BlueCodeAgent generalizes effectively to unseen risks.

A blue teaming agent enabled by red teaming. Figure 2: Overview of BlueCodeAgent, an end-to-end blue teaming framework powered by automated red teaming for code security.…

3 months назад @ microsoft.com
When industry knowledge meets PIKE-RAG: The innovation behind Signify’s customer service boost


These differentiated advantages stem from PIKE-RAG’s unique approach to understanding and processing professional knowledge.

“It’s also worth noting that the researchers at Microsoft Research Asia demonstrated strong industry knowledge and rigorous scientific methodology.

Through this collaboration, we validated that PIKE-RAG’s general approach can greatly improve the accuracy of professional knowledge Q&A and accelerate scenario customization.

Our researchers also gained valuable experience in handling domain-specific data,” explained Jiang Bian, partner rese…

3 months, 1 week назад @ microsoft.com
Magentic Marketplace: an open-source simulation environment for studying agentic markets

To help navigate this uncertainty, we built Magentic Marketplace (opens in new tab)— an open-source simulation environment for exploring the numerous possibilities of agentic markets and their societal implications at scale.

To explore these dynamics in depth, the Magentic Marketplace platform enables controlled experimentation across diverse agentic marketplace scenarios.

With Magentic Marketplace, researchers can model how agents representing customers and businesses interact—shedding light on the dynamics that could shape future digital markets.

Magentic Marketplace includes two agent types: Assistant Agents (customers) and Service Agents (businesses).

Unlike traditional markets, which d…

3 months, 1 week назад @ microsoft.com
MIT AI
последний пост 1 day, 23 hours назад
New J-PAL research and policy initiative to test and scale AI innovations to fight poverty

The Abdul Latif Jameel Poverty Action Lab (J-PAL) at MIT has awarded funding to eight new research studies to understand how artificial intelligence innovations can be used in the fight against poverty through its new Project AI Evidence.

The new initiative is prioritizing questions policymakers are already asking: Do AI-assisted teaching tools help all children learn?

In the coming years, PAIE will run a series of funding competitions to invite proposals for evaluations of AI tools that address questions like these, and many more.

Alex Diaz, head of AI for social good at Google.org, says, “we’re thrilled to collaborate with MIT and J-PAL, already leaders in this space, on Project AI Eviden…

1 day, 23 hours назад @ news.mit.edu
Accelerating science with AI and simulations

Now, the newly tenured professor in materials science and engineering believes AI is poised to transform science in ways never before possible.

His latest company, Lila Sciences, is working to build a scientific superintelligence platform for the life sciences, chemical, and materials science industries.

All of that work is designed to ensure the future of scientific research is more seamless and productive than research today.

“AI for science is one of the most exciting and aspirational uses of AI,” Gómez-Bombarelli says.

“AI for simulations has gone from something that maybe could work to a consensus scientific view,” Gómez-Bombarelli says.

2 days, 18 hours назад @ news.mit.edu
Using synthetic biology and AI to address global antimicrobial resistance threat

James J. Collins, the Termeer Professor of Medical Engineering and Science at MIT and faculty co-lead of the Abdul Latif Jameel Clinic for Machine Learning in Health, is embarking on a multidisciplinary research project that applies synthetic biology and generative artificial intelligence to the growing global threat of antimicrobial resistance (AMR).

The research project is sponsored by Jameel Research, part of the Abdul Latif Jameel International network.

The initial three-year, $3 million research project in MIT’s Department of Biological Engineering and Institute of Medical Engineering and Science focuses on developing and validating programmable antibacterials against key pathogens.

Th…

3 days, 10 hours назад @ news.mit.edu
AI algorithm enables tracking of vital white matter pathways

Moreover, the study shows, BSBT retrospectively enabled tracking of bundle healing in a coma patient that reflected the patient’s seven-month road to recovery.

As part of his thesis work to better understand the neural mechanisms that underpin consciousness, Olchanyi wanted to develop an AI algorithm to overcome these obstacles.

To train the neural network to segment the bundles, Olchanyi “showed” it 30 live diffusion MRI scans from volunteers in the Human Connectome Project (HCP).

Meanwhile, TBI patients didn’t show significant volume loss in any bundles, but FA reductions were apparent in the majority of bundles.

The authors wrote that BSBT “has substantial prognostic potential by identif…

4 days, 1 hour назад @ news.mit.edu
3 Questions: Using AI to help Olympic skaters land a quint

Olympic figure skating looks effortless.

Hosoi and Lu recently chatted with MIT News about applying AI to sports, whether AI systems could ever be used to judge Olympic figure skating, and when we might see a skater land a quint.

But with figure skating, Jerry has found one of the few areas where depth challenges don’t really matter.

Q: Could you ever see a world in which AI is used to evaluate the artistic side of figure skating?

Hosoi: I have always watched Olympic figure skating competitions, ever since I could turn on the TV.

4 days, 18 hours назад @ news.mit.edu
Study: Platforms that rank the latest LLMs can be unreliable

To narrow down the choice, companies often rely on LLM ranking platforms, which gather user feedback on model interactions to rank the latest LLMs based on how they perform on certain tasks.

They developed a fast method to test ranking platforms and determine whether they are susceptible to this problem.

“We were surprised that these ranking platforms were so sensitive to this problem.

The researchers wanted to see if the same analysis could be applied to LLM ranking platforms.

Ranking platforms could also use human mediators to assess crowdsourced responses.

5 days, 18 hours назад @ news.mit.edu
“This is science!” – MIT president talks about the importance of America’s research enterprise on GBH’s Boston Public Radio

In a wide-ranging live conversation, MIT President Sally Kornbluth joined Jim Braude and Margery Eagan live in studio for GBH’s Boston Public Radio on Thursday, February 5.

They talked about MIT, the pressures facing America’s research enterprise, the importance of science, that Congressional hearing on antisemitism in 2023, and more – including Sally’s experience as a Type 1 diabetic.

I fell in love with this idea of being in this environment [where] everyone loves math, everyone wants to learn.

Coming up on Curiosity Desk later this month… Airing weekday afternoons from 1-2 p.m., The Curiosity Desk will welcome additional MIT guests in the coming weeks.

Then, on Thursday, Feb. 19, Professo…

1 week, 1 day назад @ news.mit.edu
Helping AI agents search to get the best results out of large language models

In particular, many professionals are tapping into the talents of semi-autonomous software systems called AI agents, which can call on AI at specific points to solve problems and complete tasks.

AI agents are particularly effective when they use large language models (LLMs) because those systems are powerful, efficient, and adaptable.

You can then separately specify the search strategy — you could either use one that EnCompass provides out of the box or, if desired, implement your own custom search strategy.

For now, EnCompass is a powerful building block that enables humans to tinker with AI agents more easily, improving their performance.

“By cleanly separating an agent’s programming logi…

1 week, 2 days назад @ news.mit.edu
Brian Hedden named co-associate dean of Social and Ethical Responsibilities of Computing

Brian Hedden PhD ’12 has been appointed co-associate dean of the Social and Ethical Responsibilities of Computing (SERC) at MIT, a cross-cutting initiative in the MIT Schwarzman College of Computing, effective Jan. 16.

Hedden is a professor in the Department of Linguistics and Philosophy, holding an MIT Schwarzman College of Computing shared position with the Department of Electrical Engineering and Computer Science (EECS).

He joined the MIT faculty last fall from the Australian National University and the University of Sydney, where he previously served as a faculty member.

His scholarship exemplifies the kind of interdisciplinary inquiry that SERC exists to advance,” says Dan Huttenlocher…

1 week, 3 days назад @ news.mit.edu
Antonio Torralba, three MIT alumni named 2025 ACM fellows

Antonio Torralba, Delta Electronics Professor of Electrical Engineering and Computer Science and faculty head of artificial intelligence and decision-making at MIT, has been named to the 2025 cohort of Association for Computing Machinery (ACM) Fellows.

He shares the honor of an ACM Fellowship with three MIT alumni: Eytan Adar ’97, MEng ’98; George Candea ’97, MEng ’98; and Gookwon Edward Suh SM ’01, PhD ’05.

At different points in his MIT career, he has been director of both the MIT Quest for Intelligence (now the MIT Siegel Family Quest for Intelligence) and the MIT-IBM Watson AI Lab.

Torralba’s research focuses on computer vision, machine learning, and human visual perception; as he puts …

1 week, 3 days назад @ news.mit.edu
3 Questions: Using AI to accelerate the discovery and design of therapeutic drugs

Your research has led to many advances in designing novel antibiotics, using generative AI and deep learning.

Looking ahead, we are using deep learning to design antibiotics with drug-like properties that make them stronger candidates for clinical development.

By integrating AI with high-throughput biological testing, we aim to accelerate the discovery and design of antibiotics that are novel, safe, and effective, ready for real-world therapeutic use.

Recently, we received a grant from ARPA-H to use generative AI to design 15 new antibiotics and develop them as pre-clinical candidates.

By integrating generative AI, biology, and translational partnerships, we hope to create a pipeline that c…

1 week, 3 days назад @ news.mit.edu
Katie Spivakovsky wins 2026 Churchill Scholarship

MIT senior Katie Spivakovsky has been selected as a 2026-27 Churchill Scholar and will undertake an MPhil in biological sciences at the Wellcome Sanger Institute at Cambridge University in the U.K. this fall.

“Katie is a brilliant researcher who has a keen intellectual curiosity that will make her a leader in biological engineering in the future.

We are proud that she will be representing MIT at Cambridge University,” says Kim Benard, associate dean of distinguished fellowships.

The Churchill Scholarship is a highly competitive fellowship that annually offers 16 American students the opportunity to pursue a funded graduate degree in science, mathematics, or engineering at Churchill College …

1 week, 4 days назад @ news.mit.edu
Counter intelligence

How can artificial intelligence step out of a screen and become something we can physically touch and interact with?

That question formed the foundation of class 4.043/4.044 (Interaction Intelligence), an MIT course focused on designing a new category of AI-driven interactive objects.

Troubleshooting included taste-testing recipes Kitchen Cosmo generated.

A thin, rectangular device about 18 inches in height, Kitchen Cosmo has a webcam that hinges open to scan ingredients set on a counter.

While Kitchen Cosmo made a modest splash in design magazines, both students have ideas about where they will take future iterations.

1 week, 4 days назад @ news.mit.edu
SMART launches new Wearable Imaging for Transforming Elderly Care research group

What if ultrasound imaging is no longer confined to hospitals?

Bringing this vision to reality, the Singapore-MIT Alliance for Research and Technology (SMART), MIT’s research enterprise in Singapore, has launched a new collaborative research project: Wearable Imaging for Transforming Elderly Care (WITEC).

WITEC marks a pioneering effort in wearable technology, medical imaging, research, and materials science.

Together, these technologies allow WITEC to accelerate the design, prototyping, and testing of its wearable ultrasound imaging system, and to demonstrate imaging quality on phantoms and healthy subjects.

WITEC aims to bridge this gap with its wearable ultrasound imaging system that use…

1 week, 4 days назад @ news.mit.edu
How generative AI can help scientists synthesize complex materials

In many cases, materials synthesis is not as simple as following a recipe in the kitchen.

Now, MIT researchers have created an AI model that guides scientists through the process of making materials by suggesting promising synthesis routes.

To help scientists navigate that process, the MIT researchers trained a generative AI model on over 23,000 material synthesis recipes described over 50 years of scientific papers.

“Diffusion models are basically a generative AI model like ChatGPT, but more like the DALL-E image generation model,” Pan says.

“This approach could be extended to other materials,” Pan says.

1 week, 5 days назад @ news.mit.edu
Berkeley AI
последний пост 3 months, 2 weeks назад
RL without TD learning

In this post, I’ll introduce a reinforcement learning (RL) algorithm based on an “alternative” paradigm: divide and conquer.

We can do Reinforcement Learning (RL) based on divide and conquer, instead of temporal difference (TD) learning.

There are two classes of algorithms in RL: on-policy RL and off-policy RL.

We compared TRL with $n$-step TD learning with different values of $n$, from $1$ (pure TD) to $\infty$ (pure MC).
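To spell out the interpolation being swept here, the sketch below computes the standard textbook $n$-step TD target: with $n=1$ it bootstraps immediately from the value estimate, and with $n$ large enough to reach the end of the episode it reduces to the Monte Carlo return. This is the generic construction, not the TRL algorithm from the post.

```python
def n_step_target(rewards, values, t, n, gamma=0.99):
    """Standard n-step TD target for state s_t.

    rewards[k] is r_k and values[k] is the current estimate V(s_k).
    Bootstrapping only happens if the n-step window ends before the episode does.
    """
    T = len(rewards)
    horizon = min(t + n, T)
    target = sum(gamma ** (k - t) * rewards[k] for k in range(t, horizon))
    if horizon < T:  # bootstrap from V only when we stopped early
        target += gamma ** (horizon - t) * values[horizon]
    return target

rewards = [0.0, 0.0, 1.0, 0.0, 2.0]
values = [0.5, 0.4, 0.8, 0.3, 0.1]
print(n_step_target(rewards, values, t=0, n=1))    # pure TD (bootstraps from V(s_1))
print(n_step_target(rewards, values, t=0, n=100))  # pure Monte Carlo return
```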

I still think one of the most important problems in RL (and even in machine learning) is to find a scalable off-policy RL algorithm.

3 months, 2 weeks назад @ bair.berkeley.edu
What exactly does word2vec learn?


What exactly does word2vec learn, and how?

In this framing, it’s clear that word2vec is a minimal neural language model.
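For concreteness, the objective behind this minimal language model is the standard skip-gram-with-negative-sampling loss (written here in the usual textbook notation, which is not necessarily the post's):

$$
\mathcal{L} = \sum_{(c,\,o) \in D} \Big[ \log \sigma\big(u_o^\top v_c\big) + \sum_{k=1}^{K} \mathbb{E}_{w_k \sim P_n}\, \log \sigma\big(-u_{w_k}^\top v_c\big) \Big],
$$

where $v_c$ is the center-word vector, $u_o$ the context-word vector, $K$ the number of negative samples, and $P_n$ the negative-sampling distribution.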

As a result, the theory predicts exactly what features are learned in terms of the corpus statistics and the algorithmic hyperparameters.

We find that over the course of learning, word2vec builds these linear representations in a sequence of noisy learning steps, and their geometry is well-described by a spiked random matrix model.

5 months, 2 weeks назад @ bair.berkeley.edu
Whole-Body Conditioned Egocentric Video Prediction

Predicting Ego-centric Video from human Actions (PEVA).

We trained a model to Predict Ego-centric Video from human Actions (PEVA) for Whole-Body-Conditioned Egocentric Video Prediction.

We train an autoregressive conditional diffusion transformer on Nymeria, a large-scale dataset pairing real-world egocentric video with body pose capture.

We include some samples here, covering body movement actions (move forward, rotate left, rotate right), left hand actions (move the left hand up, down, left, or right), and right hand actions (move the right hand up, down, left, or right). Long rollout: Here you…

7 months, 2 weeks назад @ bair.berkeley.edu
AWS Machine Learning
последний пост 1 day назад
Customize AI agent browsing with proxies, profiles, and extensions in Amazon Bedrock AgentCore Browser

Today, we are announcing three new capabilities that address these requirements: proxy configuration, browser profiles, and browser extensions.

These three capabilities give you control over how AgentCore Browser sessions connect to the internet, what state they retain, and how they behave.

Browser extensions load Chrome extensions into sessions to customize browser behavior for your use case.

Feature 3: Browser extensions. Browser extensions let you load Chrome extensions into AgentCore Browser sessions to customize how the browser behaves.

To get started, see the tutorials in the Amazon Bedrock AgentCore samples repository and the Amazon Bedrock AgentCore Browser documentation.

1 day назад @ aws.amazon.com
AI meets HR: Transforming talent acquisition with Amazon Bedrock

The AI-powered recruitment lifecycle: The recruitment process presents numerous opportunities for AI enhancement through specialized agents, each powered by Amazon Bedrock and connected to dedicated Amazon Bedrock knowledge bases.

The Job Description Creation and Optimization Agent uses advanced language models available in Amazon Bedrock and connects to an Amazon Bedrock knowledge base containing your organization’s historical job descriptions and inclusion guidelines.

Each agent uses Amazon Bedrock and connects to dedicated Amazon Bedrock knowledge bases while maintaining security and compliance requirements.

– The Job Description Creation and Optimization Agent uses the AI capabilities of …

2 days, 3 hours назад @ aws.amazon.com
Build long-running MCP servers on Amazon Bedrock AgentCore with Strands Agents integration

Imagine you’ve built an MCP server for an AI agent that helps data scientists train ML models.

Amazon Bedrock AgentCore and Strands Agents implementation: Before diving into the implementation details, it’s important to understand the deployment options available for MCP servers on Amazon Bedrock AgentCore.

For more information about AgentCore Runtime and AgentCore Gateway service quotas, refer to Quotas for Amazon Bedrock AgentCore.

Strands Agents implementation: Integrating Amazon Bedrock AgentCore Memory with Strands Agents is remarkably straightforward using the AgentCoreMemorySessionManager class from the Bedrock AgentCore SDK.

These identifiers are then packaged into a custom HTTP header …

2 days, 3 hours назад @ aws.amazon.com
NVIDIA Nemotron 3 Nano 30B MoE model is now available in Amazon SageMaker JumpStart

Today we’re excited to announce that the NVIDIA Nemotron 3 Nano 30B model with 3B active parameters is now generally available in the Amazon SageMaker JumpStart model catalog.

Get started with NVIDIA Nemotron 3 Nano 30B in SageMaker JumpStart: To test the Nemotron 3 Nano model in SageMaker JumpStart, open SageMaker Studio and choose Models in the navigation pane.

To learn more, check out the Nemotron Nano model page, the NVIDIA GitHub sample notebook for Nemotron 3 Nano 30B, and the Amazon SageMaker JumpStart pricing page.

Try the Nemotron 3 Nano model in Amazon SageMaker JumpStart today and send feedback to AWS re:Post for SageMaker JumpStart or through your usual AWS Support contacts.

Pooja…

3 days, 3 hours назад @ aws.amazon.com
Mastering Amazon Bedrock throttling and service availability: A comprehensive guide

Rate-Based Throttling (RPM – Requests Per Minute). Error message: ThrottlingException: Too many requests, please wait before trying again.

The combined total for that minute becomes 160 rpm, which is above your 150 rpm quota, and some requests start failing with ThrottlingException.

Use exponential backoff with jitter on 429 errors so that retries can become gradually less frequent and are de-synchronized across instances.
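As a concrete illustration of that retry pattern, here is a minimal sketch using boto3 against the Bedrock runtime. The model ID and request body are placeholders, and production code would typically also cap total retry time and log each throttled attempt.

```python
import json
import random
import time

import boto3
from botocore.exceptions import ClientError

bedrock = boto3.client("bedrock-runtime")

def invoke_with_backoff(body, model_id, max_retries=6):
    """Retry throttled Bedrock calls with exponential backoff and full jitter."""
    for attempt in range(max_retries):
        try:
            return bedrock.invoke_model(modelId=model_id, body=json.dumps(body))
        except ClientError as err:
            if err.response["Error"]["Code"] != "ThrottlingException":
                raise
            # Full jitter: sleep a random amount up to the exponential ceiling,
            # which de-synchronizes retries across concurrent instances.
            time.sleep(random.uniform(0, min(20, 2 ** attempt)))
    raise RuntimeError("Still throttled after retries")
```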

In this case, the Bedrock service is signaling a transient capacity or infrastructure issue, often affecting on-demand models during demand spikes.

Essential CloudWatch Metrics. Monitor these CloudWatch metrics: Invocations: successful model invoc…

3 days, 7 hours назад @ aws.amazon.com
Swann provides Generative AI to millions of IoT Devices using Amazon Bedrock

Why AWS and Amazon Bedrock were selected: When evaluating AI partners, Swann prioritized enterprise-grade capabilities that could reliably scale.

AWS stood out for several key reasons. Enterprise-grade AI capabilities: Swann chose AWS for its comprehensive, integrated approach to deploying generative AI at scale.

Data pipeline: Video content flows through Amazon EventBridge, Amazon S3, and Amazon SQS for reliable storage and message queuing.

It also helps optimize model performance for particular use cases without requiring extensive labelled training data.


3 days, 7 hours назад @ aws.amazon.com
How LinqAlpha assesses investment theses using Devil’s Advocate on Amazon Bedrock

In this post, we share how LinqAlpha uses Amazon Bedrock to build and scale Devil’s Advocate.

The input is received by a custom application running in an Amazon Elastic Compute Cloud (Amazon EC2) instance, which routes the request to Amazon Bedrock.

The orchestration layer on Amazon EC2 coordinates Amazon Bedrock calls, enabling fast iteration under real-time analyst workloads while minimizing infrastructure overhead.


Integration with Amazon S3, Amazon EC2, Amazon RDS, and OpenSearch Service enables more secure and scalab…

3 days, 7 hours назад @ aws.amazon.com
How Amazon uses Amazon Nova models to automate operational readiness testing for new fulfillment centers

Using Amazon Nova models, we’ve developed an automated solution that significantly reduces verification time while improving accuracy.

Key factors in the decision-making process include: Image Detection Capability: We selected Amazon Nova Pro for image detection after testing multiple AI models, including Anthropic Claude Sonnet, Amazon Nova Pro, Amazon Nova Lite, and Meta AI Segment Anything Model (SAM).

Results & Learnings: After building the prototype, we tested the solution in multiple fulfillment centers using Amazon Kindle tablets.

Conclusion: This post demonstrates how to use Amazon Nova and Anthropic Claude Sonnet in Amazon Bedrock to build an automated image recognition solution for opera…

4 days, 4 hours назад @ aws.amazon.com
Iberdrola enhances IT operations using Amazon Bedrock AgentCore

Amazon Bedrock AgentCore helps Iberdrola deploy production-ready AI agents seamlessly.

Managed infrastructure – The serverless compute runtimes, identity, and memory services of Amazon Bedrock AgentCore minimize custom development overhead for enterprise-grade capabilities.


Amazon Bedrock AgentCore fits well with development guidelines because you can choose the development framework, such as LangGraph, by adding a simple decorator.

By adopting Amazon Bedrock AgentCore, Iberdrola has demonstrated how organizations can deploy p…

4 days, 4 hours назад @ aws.amazon.com
Building real-time voice assistants with Amazon Nova Sonic compared to cascading architectures

Amazon Nova Sonic delivers real-time, human-like voice conversations through the bidirectional streaming interface.

When compared with newer architectures such as Amazon Nova Sonic—which combines speech understanding and generation into a single end-to-end model—classic AI voice chat systems use cascading architectures with sequential processing.

The following diagram illustrates the conceptual flow of how users interact with Nova Sonic for real-time voice conversations compared to a cascading voice assistant solution.

Cost structure: Simplified cost structure through an integrated approach; Amazon Nova Sonic is priced on a token-based consumption model.

To learn more, see Amazon Nova Sonic a…

4 days, 4 hours назад @ aws.amazon.com
Automated Reasoning checks rewriting chatbot reference implementation

This blog post dives deeper into the implementation architecture for the Automated Reasoning checks rewriting chatbot.

How the iterative rewriting loop works: The open source reference implementation automatically helps improve chatbot answers by iterating on the feedback from Automated Reasoning checks and rewriting the response.

For INVALID findings, the Automated Reasoning checks return the rules from the Automated Reasoning policy that make the claims invalid based on the premises and policy rules.


Getting Started with the Autom…

5 days, 3 hours назад @ aws.amazon.com
Scale LLM fine-tuning with Hugging Face and Amazon SageMaker AI

Scaling LLM fine-tuning for enterprise use cases presents real technical and operational hurdles, which are being overcome through the powerful partnership between Hugging Face and Amazon SageMaker AI.

To overcome this, SageMaker AI and Hugging Face have joined forces to simplify and scale model customization.

Solution overview: The following diagram illustrates the solution workflow of using the Hugging Face Transformers library with a SageMaker Training job.

This example showcases the effectiveness of our fine-tuning approach using Hugging Face Transformers and a SageMaker Training job.

He focuses on advising strategic global customers on scaling their model training and inference workloads…

5 days, 6 hours назад @ aws.amazon.com
New Relic transforms productivity with generative AI on AWS

Solution overview: In designing New Relic NOVA, New Relic established several critical objectives beyond the initial goal of improving documentation search.

Learn more about transforming your business with generative AI by visiting the Generative AI Innovation Center or speak with an AWS Partner Specialist or AWS Representative to know how we can help accelerate your business.

Joe King is an AWS Senior Data Scientist at the Generative AI Innovation Center, where he helps organizations architect and implement cutting-edge generative AI solutions.

Priyashree Roy is an AWS data scientist at the Generative AI Innovation Center, where she applies her deep expertise in machine learning and generati…

5 days, 6 hours назад @ aws.amazon.com
Accelerate agentic application development with a full-stack starter template for Amazon Bedrock AgentCore

Last year, AWS released Amazon Bedrock AgentCore—a development platform for building, deploying, and scaling AI agents in production.

Solution overview: FAST provides a complete full-stack architecture for deploying agents on Amazon Bedrock AgentCore.

The template handles authentication, frontend application hosting, agent runtime, memory, observability, and Model Context Protocol (MCP) tool integration by default.

The frontend integrates with the backend powered by AgentCore, which you can use as an example for integrating with your own frontend application.

Conclusion: FAST helps reduce the time to build and deploy an agent application to under 30 minutes.

5 days, 6 hours назад @ aws.amazon.com
Agent-to-agent collaboration: Using Amazon Nova 2 Lite and Amazon Nova Act for multi-agent systems

This post walks through how agent-to-agent collaboration on Amazon Bedrock works in practice, using Amazon Nova 2 Lite for planning and Amazon Nova Act for browser interaction, to turn a fragile single-agent setup into a predictable multi-agent system.

Travel agent implementation (Amazon Nova 2 Lite): The Travel Agent is the orchestrator of the whole workflow.

Flight agent implementation (Amazon Nova 2 Lite and API): The Flight Agent has a much narrower job: turn a structured request into real flight options.

Travel Agent -> Flight Agent: The Travel Agent pulls out the flight part of the request and sends it to the Flight Agent.

Travel Agent -> Hotel Agent: The Travel Agent sends the hotel part…
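The delegation pattern described above is easy to see in a stripped-down form. The sketch below stubs out the planner and the downstream agents with plain Python functions so the control flow is visible; in the actual post these would be Amazon Nova 2 Lite calls and an Amazon Nova Act browser session, which are not shown here.

```python
def flight_agent(plan):
    # Stub for the Flight Agent: the real system would call a flights API.
    return [{"flight": "XY123", "from": plan["origin"], "to": plan["destination"]}]

def hotel_agent(plan):
    # Stub for the Hotel Agent: the real system drives a browser via Amazon Nova Act.
    return [{"hotel": "Sample Hotel", "city": plan["destination"], "nights": plan["nights"]}]

def travel_agent(user_request):
    """Orchestrator: split the request and delegate each part to a specialist agent."""
    # In the post this planning step is performed by Amazon Nova 2 Lite;
    # here it is hard-coded so the example runs on its own.
    plan = {"origin": "SEA", "destination": "JFK", "nights": 3}
    return {
        "request": user_request,
        "flights": flight_agent(plan),
        "hotels": hotel_agent(plan),
    }

print(travel_agent("Book me a 3-night trip from Seattle to New York"))
```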

5 days, 7 hours назад @ aws.amazon.com
NVIDIA
последний пост 2 days, 1 hour назад
Code, Compute and Connection: Inside the Inaugural NVIDIA AI Day São Paulo

More than 500 attendees joined NVIDIA AI Day São Paulo in January to learn about sovereign AI — including breakout sessions on AI agents and open models.

“NVIDIA solutions are fundamental to all the technologies we develop,” said Paulo Perez, cofounder and CEO of biotechnology startup Biofy.

“DNA is composed of a gigantic amount of data, which we call nucleotides.

Analyzing this data takes a long time when done using traditional methods with CPUs.

In this regard, NVIDIA’s GPUs, algorithms and frameworks accelerate the results we are capable of delivering by hundreds of times.”

2 days, 1 hour назад @ blogs.nvidia.com
Leading Inference Providers Cut AI Costs by up to 10x With Open Source Models on NVIDIA Blackwell

Baseten, DeepInfra, Fireworks AI and Together AI are reducing cost per token across industries with optimized inference stacks running on the NVIDIA Blackwell platform.

These providers host advanced open source models, which have now reached frontier-level intelligence.

To overcome these bottlenecks, Sully.ai uses Baseten’s Model API, which deploys open source models such as gpt-oss-120b on NVIDIA Blackwell GPUs.

Latitude runs large open source models on DeepInfra’s inference platform, powered by NVIDIA Blackwell GPUs and TensorRT-LLM.

Explore NVIDIA’s full-stack inference platform to learn more about how it delivers better tokenomics for AI inference.

2 days, 7 hours назад @ blogs.nvidia.com
NVIDIA DGX Spark Powers Big Projects in Higher Education

At leading institutions across the globe, the NVIDIA DGX Spark desktop supercomputer is bringing data‑center‑class AI to lab benches, faculty offices and students’ systems.

There’s even a DGX Spark hard at work at the South Pole, at the IceCube Neutrino Observatory run by the University of Wisconsin-Madison.

Read more below on how DGX Spark powers groundbreaking AI work at leading institutions worldwide.

See how DGX Spark is transforming higher education and student innovation at Stanford by joining this livestream on Friday, Feb. 13, at 9 a.m. PT.

Get started with DGX Spark and find purchase options on this webpage.

2 days, 8 hours назад @ blogs.nvidia.com
GeForce NOW Turns Screens Into a Gaming Machine

The GeForce NOW sixth-anniversary festivities roll on this February, continuing a monthlong celebration of NVIDIA’s cloud gaming service.

The Cloud Turns Up the Heat: GeForce NOW continues to expand access across devices, supporting multiple platforms with the latest addition of Linux.

Members tap into GeForce RTX power on the screens they already own, extending the value of their membership without extra hardware or complexity.

That reach expands again with the launch of the GeForce NOW app on Amazon Fire TV.

GeForce NOW is powering up the living room with the launch of the GeForce NOW app on Amazon Fire TV streaming sticks, leveling up big-screen gaming with PC-quality performance.

2 days, 9 hours назад @ blogs.nvidia.com
GeForce NOW Celebrates Six Years of Streaming With 24 Games in February

GeForce NOW turns six, kicking off a monthlong celebration, starting with 10 new games to stream this week.

Break out the cake and green sprinkles — GeForce NOW is turning six.

There’s plenty to celebrate: the February games list kicks off with 24 new games.

On GeForce NOW, Delta Force leans into its high‑octane personality: fast drops, big maps and cinematic engagements that look sharp and feel responsive across devices.

On GeForce NOW, responsive streaming and sharp RTX-powered visuals keep every angle, rotation and clutch play feeling precise, even on lower-powered devices.

1 week, 2 days назад @ blogs.nvidia.com
Nemotron Labs: How AI Agents Are Turning Documents Into Real-Time Business Intelligence

With NVIDIA Nemotron open models and GPU-accelerated libraries, organizations can build AI-powered document intelligence systems for research, financial services, legal workflows and more.

Agreements are the foundation of every business, but the critical information they contain is often buried inside pages of documents.

Nemotron extraction and OCR models rapidly ingest multimodal PDFs, text, tables, graphs and images to convert them into structured, machine-readable content while preserving layout and semantics.

The most effective AI systems use a mix of frontier models and open source models like NVIDIA Nemotron, with an LLM router analyzing each task and automatically selecting the mode…

1 week, 3 days назад @ blogs.nvidia.com
How to Build a Document Processing Pipeline for RAG with Nemotron

With NVIDIA Nemotron RAG, you can build a high-throughput intelligent document processing pipeline that handles massive document workloads with precision and accuracy.

Context-aware orchestration: Using a microservice architecture, the pipeline decomposes documents and optimizes the data for Nemotron RAG models, creating a high-speed, contextually aware system.


What are the components of a multimodal RAG pipeline?

Code for building each pipeline component: Try the starting code for each part of the document processing pipeline.
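As a framework-agnostic illustration of the first stage of such a pipeline, the sketch below splits a document into overlapping chunks and does naive keyword retrieval over them. The actual NVIDIA pipeline uses Nemotron RAG models and GPU-accelerated microservices for extraction, embedding, and reranking, none of which are shown here.

```python
def chunk(text, size=400, overlap=80):
    """Split a document into overlapping character windows for indexing."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def retrieve(chunks, query, k=3):
    """Naive keyword-overlap retrieval; a real pipeline would use embeddings and a reranker."""
    q = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: -len(q & set(c.lower().split())))
    return scored[:k]

doc = "..."  # document text extracted from a PDF, table, or image caption
chunks = chunk(doc)
print(retrieve(chunks, "what are the components of a multimodal RAG pipeline"))
```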

1 week, 3 days назад @ developer.nvidia.com
Everything Will Be Represented in a Virtual Twin, Jensen Huang Says at 3DEXPERIENCE World

“Artificial intelligence will be infrastructure,” like water, electricity, and the internet, Huang told the crowd, playfully referring to the engineering-heavy audience as “Solid Workers,” a nod to Dassault Systèmes’ SolidWorks platform.

The announcement continues a collaboration spanning more than a quarter century between NVIDIA and Dassault Systèmes.

The partnership aims to establish industry world models — science-validated AI systems grounded in physics that can serve as mission-critical platforms across biology, materials science, engineering and manufacturing.

Virtual Twins for Every Factory: NVIDIA Omniverse physical AI libraries integrated into the DELMIA Virtual Twin enable autonom…

1 week, 4 days назад @ blogs.nvidia.com
Advancing GPU Programming with the CUDA Tile IR Backend for OpenAI Triton

What are CUDA Tile and CUDA Tile IR?

CUDA Tile extends the CUDA programming model to enable first-class support for tile programming.

CUDA Tile development is driven by the CUDA Tile IR specification, which defines the formal semantics, operations, and type system for tile-based computations on NVIDIA GPUs.

Triton users will be able to select which backend (PTX backend or CUDA Tile IR backend) to use on a per-kernel basis in their applications.
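For readers who haven't written Triton, the canonical vector-add kernel below shows the tile-level programming model the new backend compiles. It uses only standard Triton APIs; the per-kernel backend-selection mechanism itself is not shown, since the post does not detail it.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one tile (block) of the output.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.rand(4096, device="cuda")
y = torch.rand(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```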

As CUDA continues to release new versions, the compatibility of the Triton CUDA Tile IR backend will continue to improve.

2 weeks, 1 day назад @ developer.nvidia.com
Mercedes-Benz Unveils New S-Class Built on NVIDIA DRIVE AV, Which Enables an L4-Ready Architecture

NVIDIA DRIVE AV is trained at scale on NVIDIA DGX systems and designed to be validated using high-fidelity simulation with NVIDIA Omniverse NuRec libraries and NVIDIA Cosmos world models.

NVIDIA DRIVE AV enables the system to analyze complex environments — rather than simply reacting to known patterns — evaluate multiple options and select the safest possible outcome in real time.

Developed in accordance with NVIDIA Halos safety system, NVIDIA DRIVE Hyperion helps eliminate single points of failure and provides the foundation needed for L4-ready systems.

Within NVIDIA DRIVE AV, these AI capabilities are further refined, optimized and engineered for production.

Built on NVIDIA DRIVE Hyperion…

2 weeks, 2 days назад @ blogs.nvidia.com
Into the Omniverse: Physical AI Open Models and Frameworks Advance Robots and Autonomous Systems

By providing access to critical infrastructure — from simulation frameworks to AI models — NVIDIA is enabling collaborative development that accelerates the path to safer, more capable autonomous systems.

At CES earlier this month, NVIDIA introduced a new suite of open physical AI models and frameworks to accelerate the development of humanoids, autonomous vehicles and other physical AI embodiments.

The stack taps into NVIDIA Cosmos world models; NVIDIA Isaac technologies, including the new Isaac Lab-Arena open source framework for policy evaluation; the NVIDIA Alpamayo open portfolio of AI models, simulation frameworks and physical AI datasets for autonomous vehicles; and the NVIDIA OSMO f…

2 weeks, 2 days назад @ blogs.nvidia.com
GeForce NOW Brings GeForce RTX Gaming to Linux PCs

Ten new games join the cloud this week as the native GeForce NOW app for Linux PCs enters beta.

Get ready to game — the native GeForce NOW app for Linux PCs is now available in beta, letting Linux desktops tap directly into GeForce RTX performance from the cloud.

Linux Just Got RTX: Gaming on Linux just got a major RTX upgrade.

The beta Linux app is built for PCs and notebooks, offering an experience similar to the existing GeForce NOW app on Windows and macOS.

For more details, check out all knowledge base articles on the GeForce NOW Linux beta app here.

2 weeks, 2 days назад @ blogs.nvidia.com
Accelerating Science: A Blueprint for a Renewed National Quantum Initiative
Accelerating Science: A Blueprint for a Renewed National Quantum Initiative Accelerating Science: A Blueprint for a Renewed National Quantum Initiative

Quantum technologies are rapidly emerging as foundational capabilities for economic competitiveness, national security and scientific leadership in the 21st century.

To secure this future, Congress must act and reauthorize the National Quantum Initiative (NQI).

This coordinated initiative accelerated the maturation of quantum technologies by enabling sustained investment, shared infrastructure and a world-class research and development ecosystem.

Collectively, this progress has clarified a viable roadmap toward useful quantum systems and reinforced the value of long-term, coordinated national investment.

When integrated with AI, quantum computing will power this century’s economic competiti…

2 weeks, 3 days назад @ blogs.nvidia.com
NVIDIA Launches Earth-2 Family of Open Models — the World’s First Fully Open, Accelerated Set of Models and Tools for AI Weather
NVIDIA Launches Earth-2 Family of Open Models — the World’s First Fully Open, Accelerated Set of Models and Tools for AI Weather NVIDIA Launches Earth-2 Family of Open Models — the World’s First Fully Open, Accelerated Set of Models and Tools for AI Weather

Energy Forecasting and Grid Operations: TotalEnergies is evaluating Earth-2 Nowcasting to improve short-term risk awareness and decision-making.

GCL, one of China’s largest solar material producers and a global integrated energy operator, is running NVIDIA Earth-2 models operationally in its photovoltaic prediction system.

Compared with traditional numerical weather prediction, Earth-2 provides more accurate prediction data at a lower cost, significantly improving the accuracy of GCL’s photovoltaic power generation prediction.

Southwest Power Pool, in collaboration with Hitachi, is using Earth-2 Nowcasting and FourCastNet3 to improve intraday and day-ahead wind forecasting.

Financial Impact A…

2 weeks, 5 days назад @ blogs.nvidia.com
How to Unlock Local Detail in Coarse Climate Projections with NVIDIA Earth-2
How to Unlock Local Detail in Coarse Climate Projections with NVIDIA Earth-2 How to Unlock Local Detail in Coarse Climate Projections with NVIDIA Earth-2

Those patterns are still there—you just need the right tools to unlock them in high-resolution climate data.

Using NVIDIA Earth‑2, this blog post shows you how to downscale coarse climate projections into higher-resolution, bias‑corrected fields—revealing local detail not resolved in the raw data.

NVIDIA CorrDiff—a model within this ecosystem—replaces compute-intensive and time-consuming high-resolution simulations with a generative downscaling model that transforms coarse CMIP6 output into high-resolution fields.

While ERA5 has known limitations for tropical cyclones, this result shows how fine-scale signals can be extracted from coarse climate model output.

Table columns: Variable, Source, Bias, MAE, RMSE, C…

2 weeks, 5 days назад @ developer.nvidia.com
Facebook
последний пост 3 days, 6 hours назад
The Death of Traditional Testing: Agentic Development Broke a 50-Year-Old Field, JiTTesting Can Revive It
The Death of Traditional Testing: Agentic Development Broke a 50-Year-Old Field, JiTTesting Can Revive It The Death of Traditional Testing: Agentic Development Broke a 50-Year-Old Field, JiTTesting Can Revive It

A Catching JiTTest focuses specifically on finding regressions introduced by a code change.

Agentic development dramatically increases the pace of code change, increasing the test development burden and scaling the cost of false positives and test maintenance to the breaking point.

And since the JiTTest itself is LLM-generated, it can often infer the plausible intention of a code change and simulate possible faults that may result from it.

With them, engineers no longer have to spend time writing, reviewing, and testing complex test code.

READ THE PAPER: Just-in-Time Catching Test Generation at Meta

3 days, 6 hours назад @ engineering.fb.com
Adapting the Facebook Reels RecSys AI Model Based on User Feedback
Adapting the Facebook Reels RecSys AI Model Based on User Feedback Adapting the Facebook Reels RecSys AI Model Based on User Feedback

Our new User True Interest Survey (UTIS) model now helps surface more niche, high-quality content and boosts engagement, retention, and satisfaction.

Our paper, “Improve the Personalization of Large-Scale Ranking Systems by Integrating User Survey Feedback,” shares full details on this work.

The main candidate ranking model used by the platform is a large multi-task, multi-label model.

We trained a lightweight UTIS alignment model layer on the collected user survey responses using existing predictions of the main model as input features.
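The post does not include code, but a hypothetical sketch of an alignment layer trained on survey labels, with the main model's existing predictions as input features, might look like the following (the class name, feature count, and sizes are invented for illustration):

```python
# Hypothetical sketch: a lightweight alignment head trained on survey labels,
# fed by the main ranking model's existing predictions (all names/sizes invented).
import torch
import torch.nn as nn

class AlignmentHead(nn.Module):
    def __init__(self, num_main_predictions: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_main_predictions, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, main_preds: torch.Tensor) -> torch.Tensor:
        # main_preds: [batch, num_main_predictions], the main model's task outputs
        return torch.sigmoid(self.net(main_preds)).squeeze(-1)

head = AlignmentHead(num_main_predictions=8)
main_preds = torch.rand(4, 8)                        # stand-in for main-model outputs
survey_labels = torch.randint(0, 2, (4,)).float()    # stand-in survey responses
loss = nn.functional.binary_cross_entropy(head(main_preds), survey_labels)
loss.backward()
```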

The UTIS model consistently outperformed the baseline, driving higher user engagement and retention.

1 month назад @ engineering.fb.com
DrP: Meta’s Root Cause Analysis Platform at Scale
DrP: Meta’s Root Cause Analysis Platform at Scale DrP: Meta’s Root Cause Analysis Platform at Scale

DrP’s key components include: Expressive SDK: The DrP SDK allows engineers to codify investigation workflows into analyzers.

Post-processing system: After an investigation, the post-processing system can take automated actions based on the analysis results.

Bootstrap code: The DrP SDK provides bootstrap code to create a template analyzer with pre-populated boilerplate code.

Data access and analysis: The SDK includes libraries for data access and analysis, such as dimension analysis and time series correlation.

This provides immediate analysis results to on-call engineers.

1 month, 3 weeks назад @ engineering.fb.com
How AI Is Transforming the Adoption of Secure-by-Default Mobile Frameworks
How AI Is Transforming the Adoption of Secure-by-Default Mobile Frameworks How AI Is Transforming the Adoption of Secure-by-Default Mobile Frameworks

Generative AI and automation accelerate the adoption of secure frameworks at scale, enabling consistent security enforcement and efficient migration across Meta’s vast codebase.

How We Design Secure-by-Default Frameworks at Meta: Designing secure-by-default frameworks for use by a large number of developers shipping vastly different features across multiple apps is an interesting challenge.

There shouldn’t be one security framework that covers all security issues, and not every security issue is general enough to deserve its own framework.

Now that we’ve looked at the design philosophy behind our frameworks, let’s look at one of our most widely used Android security frameworks, SecureLinkLaun…

2 months назад @ engineering.fb.com
Zoomer: Powering AI Performance at Meta’s Scale Through Intelligent Debugging and Optimization
Zoomer: Powering AI Performance at Meta’s Scale Through Intelligent Debugging and Optimization Zoomer: Powering AI Performance at Meta’s Scale Through Intelligent Debugging and Optimization

Zoomer has delivered training time reductions and significant QPS improvements, making it the de facto tool for AI performance optimization across Meta’s entire AI infrastructure.

Zoomer is Meta’s automated, one-stop-shop platform for performance profiling, debugging, analysis, and optimization of AI training and inference workloads.

AI Performance Optimization Using Zoomer: Zoomer is an automated debugging and optimization platform that works across all of our AI model types (ads recommendations, GenAI, computer vision, etc.)

Memory Analysis: Comprehensive analysis of GPU memory usage patterns, allocation tracking, and leak detection.

Realtime Memory Profiling: GPU memory allocation track…
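Zoomer itself is internal to Meta; the sketch below only illustrates the kind of per-step GPU memory measurement such a platform automates, using standard PyTorch counters (the step function is a placeholder and a CUDA device is assumed):

```python
# Minimal sketch of per-step GPU memory measurement with standard PyTorch counters.
import torch

def report_peak_memory(step_fn, device="cuda"):
    torch.cuda.reset_peak_memory_stats(device)
    step_fn()                                         # run one training/inference step
    allocated = torch.cuda.memory_allocated(device) / 2**20
    peak = torch.cuda.max_memory_allocated(device) / 2**20
    print(f"allocated now: {allocated:.1f} MiB, peak this step: {peak:.1f} MiB")

# Usage (illustrative): report_peak_memory(lambda: model(batch).sum().backward())
```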

2 months, 3 weeks назад @ engineering.fb.com
Open Source Is Good for the Environment
Open Source Is Good for the Environment Open Source Is Good for the Environment

But have you heard about open hardware?

And did you know open source can have a positive impact on the environment?

On this episode of the Meta Tech Podcast, Pascal Hartig sits down with Dharmesh and Lisa to talk about all things open hardware, and Meta’s biggest announcements from the 2025 Open Compute Project (OCP) Summit – including a new open methodology for leveraging AI to understand Scope 3 emissions.

You’ll also hear how AI and open hardware are helping Meta push to achieve net zero emissions in 2030, including how AI is being used to develop new concrete mixes for data center construction.

And if you’re interested in learning more about career opportunities at Meta visit the Meta C…

3 months назад @ engineering.fb.com
Meta’s Generative Ads Model (GEM): The Central Brain Accelerating Ads Recommendation AI Innovation
Meta’s Generative Ads Model (GEM): The Central Brain Accelerating Ads Recommendation AI Innovation Meta’s Generative Ads Model (GEM): The Central Brain Accelerating Ads Recommendation AI Innovation

We’re sharing details about Meta’s Generative Ads Recommendation Model (GEM), a new foundation model that delivers increased ad performance and advertiser ROI by enhancing other ads recommendation models’ ability to serve relevant ads.

GEM propagates its learnings, leveraging a suite of post-training techniques across the entire ads model fleet, enabling a paradigm shift in Meta’s Ads Recommendation system.

GEM leverages enhanced training scalability that efficiently utilizes thousands of GPUs for building and iterating an LLM-scale ads foundation model.

The Generative Ads Recommendation Model (GEM) is Meta’s most advanced ads foundation model, built on an LLM-inspired paradigm and trained …

3 months назад @ engineering.fb.com
Scaling LLM Inference: Innovations in Tensor Parallelism, Context Parallelism, and Expert Parallelism
Scaling LLM Inference: Innovations in Tensor Parallelism, Context Parallelism, and Expert Parallelism Scaling LLM Inference: Innovations in Tensor Parallelism, Context Parallelism, and Expert Parallelism

At Meta, we are constantly pushing the boundaries of LLM inference systems to power applications such as the Meta AI App.

These metrics highlight the distinct computational demands of LLM inference: Prefill is compute-intensive, while decoding is memory bandwidth-intensive.
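A back-of-the-envelope calculation makes that distinction concrete: prefill amortizes each weight read over many tokens, while decoding reads the full weights to produce a single token. The figures below are illustrative, not from the post:

```python
# Rough arithmetic-intensity comparison of prefill vs. decode (illustrative numbers).
params = 70e9                        # parameters of a hypothetical model
weight_bytes = params * 2            # fp16 weights

def flops(tokens):                   # ~2 FLOPs per parameter per token (matmul estimate)
    return 2 * params * tokens

for name, tokens in [("prefill, 2048-token prompt", 2048), ("decode, 1 token", 1)]:
    intensity = flops(tokens) / weight_bytes    # FLOPs per byte of weights read
    print(f"{name}: ~{intensity:.0f} FLOPs/byte")
# Prefill is compute-bound (weights reused across many tokens); decode is
# memory-bandwidth-bound (all weights streamed for each generated token).
```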

Communication: Communication latency increases when parallelizing across multiple hosts.

In EP-based inference, we utilize a two-shot, all-to-all communication pattern to exchange tokens between data parallelism and expert parallelism ranks based on routing.

We are committed to continuous innovation to ensure efficient and scalable LLM inference for millions of users worldwide.

4 months назад @ engineering.fb.com
How Meta Is Leveraging AI To Improve the Quality of Scope 3 Emission Estimates for IT Hardware
How Meta Is Leveraging AI To Improve the Quality of Scope 3 Emission Estimates for IT Hardware How Meta Is Leveraging AI To Improve the Quality of Scope 3 Emission Estimates for IT Hardware

We leveraged AI to help us improve this database and understand our Scope 3 emissions associated with IT hardware by: Identifying similar components and applying existing PCFs to similar components that lack these carbon estimates.

Understanding the carbon footprint of IT racks and applying generative AI (GenAI) as a categorization algorithm to create a new and standard taxonomy.

If these similar components are not identified, their carbon footprint estimates will remain at a lower data quality.

These similar components can be mapped to a representative proxy PCF, allowing us to use high-quality PCF data in similar components.

For example, we can scale the carbon footprint calculation for a …
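Meta's actual pipeline is not shown in the excerpt; as an illustrative sketch of the idea of mapping a component without a product carbon footprint (PCF) to the most similar component that has one, using off-the-shelf text similarity (component names and figures are invented):

```python
# Illustrative sketch (not Meta's pipeline): assign a proxy PCF to a component
# lacking one by matching its description to the most similar known component.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

with_pcf = {"32GB DDR5 RDIMM": 42.0, "3.84TB NVMe SSD": 85.0}     # kgCO2e, made up
without_pcf = ["64GB DDR5 RDIMM module", "7.68TB NVMe data center SSD"]

vectorizer = TfidfVectorizer().fit(list(with_pcf) + without_pcf)
known_vecs = vectorizer.transform(list(with_pcf))
query_vecs = vectorizer.transform(without_pcf)

for desc, sims in zip(without_pcf, cosine_similarity(query_vecs, known_vecs)):
    proxy = list(with_pcf)[sims.argmax()]
    print(f"{desc!r} -> proxy PCF from {proxy!r}: {with_pcf[proxy]} kgCO2e")
```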

4 months назад @ engineering.fb.com
OCP Summit 2025: The Open Future of Networking Hardware for AI
OCP Summit 2025: The Open Future of Networking Hardware for AI OCP Summit 2025: The Open Future of Networking Hardware for AI

At the 2025 Open Compute Project (OCP) Summit, we’re sharing details about the direction of next-generation network fabrics for our AI training clusters.

At Meta, we believe that open hardware is a catalyst for innovation — especially as data center infrastructure increasingly supports new and emerging AI technologies.

Open hardware plays a crucial role in enabling disaggregation, allowing us to break down traditional data center technologies into their core components.

Today, through OCP, we continue to advance open network technologies for the next generation of AI applications.

Ethernet for Scale-Up Networking in OCP: Meta’s Industry Leadership. At Meta, we recognize that the future of AI and …

4 months назад @ engineering.fb.com
LLMs Are the Key to Mutation Testing and Better Compliance
LLMs Are the Key to Mutation Testing and Better Compliance LLMs Are the Key to Mutation Testing and Better Compliance

By leveraging LLMs we’ve been able to overcome the barriers that have prevented mutation testing from being efficiently deployed at scale.

Our presentations shared insights into how we’ve used LLMs to solve the major barriers that have prevented mutation testing at scale and highlighted new areas in automated software testing where LLMs can have a significant impact.

Mutation Testing Isn’t Scalable: Traditional mutation testing generates a very large number of mutants, making it computationally expensive and difficult to scale to large industrial codebases.

Mutation Testing Requires a Lot of Computational Resources: Mutation testing is costly in terms of computational resources and developer ef…
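For readers unfamiliar with the technique, the sketch below shows what a single mechanically generated mutant looks like (a toy AST transform, not Meta's LLM-based approach); enumerating such edits across a large codebase is what makes traditional mutation testing so expensive:

```python
# A "mutant" is a small, deliberate fault injected into the code under test.
# Toy example: flip '+' into '-' using Python's ast module (requires Python 3.9+).
import ast

source = "def total(prices, tax):\n    return sum(prices) + tax\n"

class AddToSub(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()              # the injected fault
        return node

mutant = AddToSub().visit(ast.parse(source))
ast.fix_missing_locations(mutant)
print(ast.unparse(mutant))                   # a good test suite should fail on this
```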

4 months, 2 weeks назад @ engineering.fb.com
AssetGen: Generating 3D Worlds With AI
AssetGen: Generating 3D Worlds With AI AssetGen: Generating 3D Worlds With AI

Imagine being able to use AI to create 3D virtual worlds using prompts as easily as you can generate images.

In his keynote, Mark Zuckerberg shared his vision of a future where anyone can create virtual worlds using AI-powered tools like the ones available in the upcoming Meta Horizon Studio.

But AI is already making it easier than ever to create 3D assets.

On this episode of the Meta Tech Podcast, Pascal Hartig is joined by Mahima and Rakesh from Meta’s XR Tech team to discuss AssetGen, a new foundation model for 3D assets.

They talk about how they built and trained AssetGen, the important role LLMs have to play in the future of VR, and how they’re tackling the ambitious goal of generating…

4 months, 2 weeks назад @ engineering.fb.com
Meta’s Infrastructure Evolution and the Advent of AI
Meta’s Infrastructure Evolution and the Advent of AI Meta’s Infrastructure Evolution and the Advent of AI

As our user base grew globally, we scaled beyond single data center buildings and into data center regions consisting of multiple buildings.

Enter AI Workloads (2020): While we were navigating the challenges of scaling, we were also seeing glimpses of how AI workloads would impact our infrastructure.

To build out our AI infrastructure, we’ve leveraged solutions from partners like AMD and NVIDIA as well as our own custom silicon.

Constructing Prometheus has been a monumental engineering feat, with infrastructure spanning five or more data center buildings in a single data center region.

We are still early in the evolution and adoption of AI workloads.

4 months, 2 weeks назад @ engineering.fb.com
Networking at the Heart of AI — @Scale: Networking 2025 Recap
Networking at the Heart of AI — @Scale: Networking 2025 Recap Networking at the Heart of AI — @Scale: Networking 2025 Recap

AI is everywhere and, as network engineers, we are right in the thick of it: building the network infrastructure for AI.

Setting Context: Rapid Changes and Evolution. Given that AI continues to drive so much innovation in networking and general infrastructure, we once again focused @Scale: Networking on AI networking, sharing new insights and progress in the field.

The Models and the Primary AI Workloads Are Rapidly Evolving.

More from @Scale: Networking 2025. Please visit the @Scale YouTube channel to check out all the talks from this year’s Networking @Scale.

We look forward to what promises to be another rapid year of network and AI innovation that we’ll cover at the next @Scale: Networking in…

4 months, 3 weeks назад @ engineering.fb.com
A New Ranking Framework for Better Notification Quality on Instagram
A New Ranking Framework for Better Notification Quality on Instagram A New Ranking Framework for Better Notification Quality on Instagram

We’ve introduced a diversity-aware notification ranking framework to reduce uniformity and deliver a more varied and engaging mix of notifications.

Instagram leverages machine learning (ML) models to decide who should get a notification, when to send it, and what content to include.

To tackle this, we’ve introduced a diversity-aware notification ranking framework that helps deliver more diverse, better curated, and less repetitive notifications.

Introducing Instagram’s Diversity-Aware Notification Ranking Framework: Instagram’s diversity-aware notification ranking framework is designed to enhance the notification experience by balancing the predicted potential for user engagement with the nee…
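Instagram's framework is not published as code; a toy greedy re-ranking that trades predicted engagement against repetition of the same notification type illustrates the balancing idea (the candidates and penalty are invented):

```python
# Toy diversity-aware re-ranking (illustrative, not Instagram's implementation):
# trade predicted engagement against a penalty for repeating a notification type.
def diversity_rerank(candidates, penalty=0.3):
    # candidates: list of (notification_type, predicted_engagement)
    chosen, type_counts, remaining = [], {}, list(candidates)
    while remaining:
        best = max(remaining, key=lambda c: c[1] - penalty * type_counts.get(c[0], 0))
        remaining.remove(best)
        chosen.append(best)
        type_counts[best[0]] = type_counts.get(best[0], 0) + 1
    return chosen

cands = [("like", 0.9), ("like", 0.85), ("comment", 0.8), ("like", 0.75), ("follow", 0.7)]
print(diversity_rerank(cands))   # interleaves types instead of sending three likes first
```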

5 months, 2 weeks назад @ engineering.fb.com
Uber Engineering
последний пост None
neptune.ai neptune.ai
последний пост 2 months, 1 week назад
We are joining OpenAI
We are joining OpenAI We are joining OpenAI

Piotr Niedźwiedź, CEO/CTO and founder of neptune.ai: I’m excited to share that we’ve entered into a definitive agreement to be acquired by OpenAI, subject to closing conditions.

We are thrilled to join the OpenAI team and help their AI researchers build better models faster.

“Neptune is a metrics dashboard company.” We’ve worked closely with OpenAI to create the metrics dashboard that helps teams building foundation models.

Our future with OpenAI: Neptune will join OpenAI and continue to support AI researchers with tools to monitor, debug, and evaluate frontier models.

We are looking forward to working with top AI researchers and supporting OpenAI’s mission of ensuring that AGI benefits all of hu…

2 months, 1 week назад @ neptune.ai
Synthetic Data for LLM Training
Synthetic Data for LLM Training Synthetic Data for LLM Training

For instance, financial data is highly sensitive and protected by very strict regulations, and synthetic data mimics the real data distribution without revealing customer information.

Read more about how leading foundation model teams curate their training data and other topics in the State of Foundation Model Training Report 2025.

Choosing the right synthetic data generation technique depends on the type of data and its complexity.

Synthetic tabular data generation is a promising direction to overcome these challenges by learning the distribution of the tabular data.

Post-processing: As the distribution of tabular data is highly complex, it makes the synthetic tabular data generation very ch…
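The article's generation methods are not shown in the excerpt; one common baseline for learning the distribution of numeric tabular data is a Gaussian copula over empirical marginals. A minimal numpy/scipy sketch (the data here is synthetic):

```python
# Minimal Gaussian-copula sketch for synthetic numeric tabular data (illustrative).
import numpy as np
from scipy import stats

def fit_and_sample(real: np.ndarray, n_samples: int, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    n, d = real.shape
    # 1) map each column to roughly standard-normal scores via ranks
    ranks = np.argsort(np.argsort(real, axis=0), axis=0) + 1
    z = stats.norm.ppf(ranks / (n + 1))
    # 2) estimate the correlation structure in the latent normal space
    corr = np.corrcoef(z, rowvar=False)
    # 3) sample correlated normals and map back through empirical quantiles
    u_new = stats.norm.cdf(rng.multivariate_normal(np.zeros(d), corr, size=n_samples))
    return np.column_stack([np.quantile(real[:, j], u_new[:, j]) for j in range(d)])

real = np.column_stack([np.random.lognormal(3, 1, 500), np.random.normal(50, 5, 500)])
print(fit_and_sample(real, n_samples=5))
```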

3 months назад @ neptune.ai
What are LLM Embeddings: All you Need to Know
What are LLM Embeddings: All you Need to Know What are LLM Embeddings: All you Need to Know

TL;DR LLM embeddings are the numerical, vector representations of text that Large Language Models (LLMs) use to process information.

Unlike their predecessor word embeddings, LLM embeddings are context-aware and dynamically change to capture semantic and syntactic relationships based on the surrounding text.

What are the applications of LLM embeddings?

Word Embeddings: sparse word embeddings (one-hot vectors, TF-IDF in the 1970s, co-occurrence matrices in the 1980s), static word embeddings (Word2Vec 2013, GloVe 2014), and contextualized word embeddings (ELMo 2018, GPT-1 2018, BERT 2018, LLAMA 2023, DeepSeek-V1 2023, GPT-4 2023). Static word embeddings: Static word embeddings, such as word2vec in 2013, marked a significant development.…
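To make the static-versus-contextual distinction concrete, here is a minimal Hugging Face sketch showing that the same word receives different vectors in different sentences (the model choice is just an example):

```python
# Minimal sketch: contextual embeddings give the same word different vectors
# depending on the sentence it appears in (model choice is only an example).
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embedding_of(word: str, sentence: str) -> torch.Tensor:
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]          # [seq_len, hidden_dim]
    idx = inputs["input_ids"][0].tolist().index(tok.convert_tokens_to_ids(word))
    return hidden[idx]

a = embedding_of("bank", "she sat on the river bank")
b = embedding_of("bank", "he deposited cash at the bank")
print(torch.cosine_similarity(a, b, dim=0).item())             # < 1: context matters
```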

3 months, 1 week назад @ neptune.ai
Detecting and Fixing ‘Dead Neurons’ in Foundation Models
Detecting and Fixing ‘Dead Neurons’ in Foundation Models Detecting and Fixing ‘Dead Neurons’ in Foundation Models

TL;DR Dead neurons silently waste compute and reduce effective model capacity in foundation models.

Dead neurons’ impact: Recent studies into dead neurons in the context of foundation models show interesting, albeit worrying, results.

These large reported fractions of dead neurons in foundation models are a concern from a computational perspective.

Before we move on to discuss how to detect and fix dead neurons, let’s touch upon an important distinction between dead neurons and vanishing gradients.

Further reading: How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models. Visualizing activation distributions: Is your foundation model suffering from dead neurons?
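One way to answer that question is to instrument the activations directly; below is a minimal PyTorch sketch using forward hooks that flags ReLU units that never fire on a probe set (the model, batch sizes, and criterion are illustrative):

```python
# Minimal sketch: flag ReLU units that never activate over a probe set.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 10))
alive = {}

def make_hook(name):
    def hook(module, inputs, output):
        fired = (output > 0).any(dim=0)          # did each unit fire for any example?
        alive[name] = fired if name not in alive else alive[name] | fired
    return hook

for name, module in model.named_modules():
    if isinstance(module, nn.ReLU):
        module.register_forward_hook(make_hook(name))

with torch.no_grad():
    for _ in range(10):                          # probe batches
        model(torch.randn(128, 64))

for name, fired in alive.items():
    print(f"layer {name}: {(~fired).float().mean().item():.1%} dead units on probe data")
```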

3 months, 2 weeks назад @ neptune.ai
Part 2: Instruction Fine-Tuning: Evaluation and Advanced Techniques for Efficient Training
Part 2: Instruction Fine-Tuning: Evaluation and Advanced Techniques for Efficient Training Part 2: Instruction Fine-Tuning: Evaluation and Advanced Techniques for Efficient Training

In the first part of this series, we covered the fundamentals of instruction fine-tuning (IFT).

def calculate_irs(instruction, output, reference_model):
    evaluation_prompt = f"""
    Instruction: {instruction}
    Model Output: {output}
    Rate how well the output follows the instruction on these criteria:
    1.

HINT addresses a computational inefficiency in standard instruction fine-tuning: repeatedly reprocessing the same task instruction with every input example.

Read more about foundation model training infrastructure and other topics in Neptune’s 2025 State of Foundation Model Training Report.

First, during initial instruction fine-tuning across multiple diverse tasks, the model learns genera…

3 months, 3 weeks назад @ neptune.ai
How to Optimize LLM Inference
How to Optimize LLM Inference How to Optimize LLM Inference

Large Language Model (LLM) inference at scale is challenging as it involves transferring massive amounts of model parameters and data and performing computations on large tensors.

In the following, we’ll use the Llama model family architecture as a specific example to understand the LLM workload at inference.

For a far more detailed analysis of the LLM workload at inference, see the chapter All About Transformer Inference in the book How to Scale Your Model, published by Google DeepMind.

See also: How to Run LLMs Locally. A quick primer on hardware for LLM inference: A typical LLM inference cluster consists of several nodes, each with a multi-core CPU and multiple accelerator devices, …
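One concrete driver of that data movement is the key-value (KV) cache; a rough sizing calculation with Llama-2-7B-like dimensions shows the scale (figures are approximate and illustrative):

```python
# Rough KV-cache sizing for a Llama-2-7B-like configuration (approximate figures).
layers, kv_heads, head_dim = 32, 32, 128
bytes_per_elem = 2                         # fp16
context_len, batch = 4096, 8

per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem   # keys + values
total = per_token * context_len * batch
print(f"KV cache per token: {per_token / 2**10:.0f} KiB")
print(f"KV cache for batch={batch}, context={context_len}: {total / 2**30:.1f} GiB")
```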

4 months назад @ neptune.ai
A Researcher’s Guide to LLM Grounding
A Researcher’s Guide to LLM Grounding A Researcher’s Guide to LLM Grounding

In this article, we’ll explore the fundamental concepts of LLM grounding as well as strategies for optimally grounding models.

What is LLM grounding?

LLM grounding is analogous.

If relevant knowledge cannot be inferred from the data, then LLM grounding cannot yield more relevant responses.

When grounding LLMs using RAG, consider retaining only a few of the top hits (i.e., top-k) for your retrieval queries.
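As a minimal, self-contained illustration of retaining only the top-k retrieval hits (not the article's code; the embeddings are random placeholders for real chunk vectors):

```python
# Minimal top-k retrieval sketch: keep only the best-scoring chunks for grounding.
import numpy as np

def top_k_chunks(query_vec, chunk_vecs, chunk_texts, k=3):
    q = query_vec / np.linalg.norm(query_vec)
    c = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    scores = c @ q                                   # cosine similarity per chunk
    best = np.argsort(scores)[::-1][:k]
    return [(chunk_texts[i], float(scores[i])) for i in best]

rng = np.random.default_rng(0)
chunks = [f"chunk {i}" for i in range(10)]
print(top_k_chunks(rng.normal(size=64), rng.normal(size=(10, 64)), chunks, k=3))
```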

4 months, 3 weeks назад @ neptune.ai
Instruction Fine-Tuning: Fundamentals, Architecture Modifications, and Loss Functions
Instruction Fine-Tuning: Fundamentals, Architecture Modifications, and Loss Functions Instruction Fine-Tuning: Fundamentals, Architecture Modifications, and Loss Functions

TL;DR Instruction fine-tuning (IFT) refines pre-trained large language models (LLMs) to follow specific task instructions by training on prompt-response pairs.

Instruction fine-tuning in a nutshellIFT tailors LLMs to follow user instructions by bridging their inherent next-word prediction with human-defined objectives.

Related: LLM Fine-Tuning and Model Selection Using Neptune and Transformers. Parameter-efficient instruction fine-tuning: While major foundation models like GPT-4 or Llama-2 undergo full parameter instruction fine-tuning during development, parameter-efficient fine-tuning (PEFT) methods have become widely adopted for instruction fine-tuning since the LoRA paper was publi…
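As a concrete example of the PEFT approach mentioned above, here is a minimal LoRA setup with the Hugging Face peft library (the base model and hyperparameters are example choices, not the article's):

```python
# Minimal LoRA setup with Hugging Face peft (model and hyperparameters are examples).
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                          # low-rank adapter dimension
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn"],    # GPT-2's fused attention projection
)
model = get_peft_model(base, config)
model.print_trainable_parameters()    # only the LoRA adapters are trainable
```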

4 months, 4 weeks назад @ neptune.ai
Understanding Prompt Injection: Risks, Methods, and Defense Measures
Understanding Prompt Injection: Risks, Methods, and Defense Measures Understanding Prompt Injection: Risks, Methods, and Defense Measures

Prompt injection 101: When prompts go rogue. The term ‘Prompt Injection’ comes from SQL injection attacks.

Prompt injection is also claimed to have been discovered independently: Riley Goodside publicly demonstrated a prompt injection in a tweet back in September 2022.

Indirect prompt injection attacks are classified into active, passive, user-driven, and virtual prompt injection attacks.

Virtual prompt injection attacks: This injection type is closely related to the passive injection attacks previously described.

Prompt injection: current challenges & lessons learnedThe arms race between prompt injection attacks and defenses is a challenge for researchers, developers, and users.

6 months, 1 week назад @ neptune.ai
SabiYarn: Advancing Low-Resource Languages With Multitask NLP Pre-Training [Paper Reflections]
SabiYarn: Advancing Low-Resource Languages With Multitask NLP Pre-Training [Paper Reflections] SabiYarn: Advancing Low-Resource Languages With Multitask NLP Pre-Training [Paper Reflections]

This simple idea avoids computing loss on input prompt tokens the model already knows.

Prompt tokens are (too) expensive in low-resource settings: During pre-training, LLMs are trained in causal language modeling through a next-token prediction task.

=> Mo fẹ́ràn ìrẹsì,” the model is trained to predict every token, from the prompt to the actual answer:
Step 1: Prompt “Translate English” (static prompt)
Step 2: Prompt “Translate English to” (static prompt)
Step 3: Prompt “Translate English to Yoruba:” (static prompt)
Step 4: Prompt “Translate English to Yoruba: I”
Step 5: Prompt “Translate English to Yoruba: I love”
Step 6: Prompt “Translate English to Yoruba: I love rice.”

This is straightforward to implement in PyTorch by masking out the prompt tokens in the label …
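The excerpt cuts off, but the standard PyTorch pattern it refers to is to set prompt positions in the labels to the ignore index so cross-entropy skips them. A minimal sketch (token IDs and prompt length are made up; label shifting for causal LM is omitted for brevity):

```python
# Minimal sketch of prompt-loss masking: label positions set to -100 are ignored
# by PyTorch's cross-entropy (and by Hugging Face causal-LM losses).
import torch

input_ids = torch.tensor([[101, 7592, 2000, 1024, 2023, 2003, 1996, 3437]])
prompt_len = 4                                # the first 4 tokens are the static prompt

labels = input_ids.clone()
labels[:, :prompt_len] = -100                 # no loss on prompt tokens

logits = torch.randn(1, input_ids.size(1), 30000)        # stand-in model output
loss = torch.nn.functional.cross_entropy(
    logits.view(-1, logits.size(-1)), labels.view(-1), ignore_index=-100
)
print(loss)
```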

6 months, 2 weeks назад @ neptune.ai
How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models
How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models

What gradient issues occur during foundation model training?

During training, gradient descent updates model parameters by computing the gradients of the loss function via forward and backward passes.

The green line corresponds to a learning rate of 10, while the orange line has a learning rate of 0.1.

The gradient norm for the orange line with LR = 0.1 is very high in the first steps, while the gradient norm of the green line with LR = 10 diverges to NaN after a few steps.

Techniques for gradient stabilization: Monitoring gradient norms and training loss provides insights into the learning dynamics of the foundation models.
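A minimal sketch of that kind of monitoring in a PyTorch training loop, logging the global gradient norm each step and clipping it (the model, optimizer, and clip value are placeholders):

```python
# Minimal sketch: log the global gradient norm each step and clip it.
import torch
import torch.nn as nn

model = nn.Linear(512, 512)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(100):
    loss = model(torch.randn(32, 512)).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    # clip_grad_norm_ returns the pre-clipping global norm, which is useful to log
    grad_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    if step % 10 == 0:
        print(f"step {step}: loss={loss.item():.4f} grad_norm={float(grad_norm):.4f}")
    opt.step()
```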

7 months, 2 weeks назад @ neptune.ai
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection]
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection] STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection]

Unstructured pruning removes individual weights, while structured pruning removes entire model components.

In the context of MoEs, the expert structures that emerge from training correspond to such patterns, so pruning experts is a natural fit for structured pruning.

Thus, structured pruning does not significantly decrease kurtosis, leaving plenty of margin for unstructured pruning.

Since structured pruning primarily reduces architectural redundancy rather than reshaping the underlying weight distribution, our two-phase approach—leveraging unstructured pruning after structured pruning—outperforms unstructured-only pruning.

Since STUN does not make any assumption about base MoE models, it is generaliza…
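The excerpt does not include code; as a minimal illustration of the unstructured phase on its own (magnitude pruning of a single layer), independent of STUN's expert-level structured phase:

```python
# Minimal sketch of unstructured magnitude pruning: zero out the smallest-magnitude
# weights of a layer (illustrative; STUN applies this after structured expert pruning).
import torch
import torch.nn as nn

def magnitude_prune_(linear: nn.Linear, sparsity: float = 0.5) -> torch.Tensor:
    w = linear.weight.data
    k = max(1, int(w.numel() * sparsity))
    threshold = w.abs().flatten().kthvalue(k).values
    mask = (w.abs() > threshold).to(w.dtype)
    w.mul_(mask)                          # keep only the largest-magnitude weights
    return mask

layer = nn.Linear(1024, 1024)
mask = magnitude_prune_(layer, sparsity=0.5)
print(f"remaining nonzero weights: {mask.mean().item():.1%}")
```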

8 months, 2 weeks назад @ neptune.ai
Evaluating RAG Pipelines
Evaluating RAG Pipelines Evaluating RAG Pipelines

Related: Building LLM Applications With Vector Databases. Dimensions of RAG evaluation: Evaluating a RAG pipeline means assessing its behavior across three dimensions: 1.

The evaluation of the RAG pipeline is a multi-step process, starting with creating an evaluation dataset and then evaluating the individual components (retriever, generator, etc.).

Curating an evaluation dataset: The first step in the RAG evaluation process is the creation of a ground truth dataset.

MAP considers both the presence and rank of relevant chunks but fails to consider the relative position of relevant chunks.
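For concreteness, a minimal mean-average-precision (MAP@k) implementation over retrieved chunk IDs (an illustrative sketch, not the article's code):

```python
# Minimal MAP@k over retrieved chunk IDs (illustrative sketch).
def average_precision(retrieved, relevant, k=10):
    hits, score = 0, 0.0
    for rank, chunk_id in enumerate(retrieved[:k], start=1):
        if chunk_id in relevant:
            hits += 1
            score += hits / rank                  # precision at this rank
    return score / min(len(relevant), k) if relevant else 0.0

def mean_average_precision(results, k=10):
    # results: list of (retrieved_ids, relevant_ids) pairs, one per query
    return sum(average_precision(r, rel, k) for r, rel in results) / len(results)

queries = [(["c3", "c7", "c1"], {"c1", "c7"}), (["c2", "c9"], {"c4"})]
print(mean_average_precision(queries, k=3))
```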

However, not all retrieved chunks are equally relevant and sometimes, the most relevant chunks might not b…

9 months назад @ neptune.ai
▶️ YouTube
Yannic Kilcher Yannic Kilcher
последний пост 1 month, 2 weeks назад
Traditional X-Mas Stream
Traditional X-Mas Stream Traditional X-Mas Stream

Letsgooo

1 month, 2 weeks назад @ youtube.com
Traditional Holiday Live Stream
Traditional Holiday Live Stream Traditional Holiday Live Stream

https://ykilcher.com/discord Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https:/…

1 month, 2 weeks назад @ youtube.com
TiDAR: Think in Diffusion, Talk in Autoregression (Paper Analysis)
TiDAR: Think in Diffusion, Talk in Autoregression (Paper Analysis) TiDAR: Think in Diffusion, Talk in Autoregression (Paper Analysis)

Paper: https://arxiv.org/abs/2511.08923 Abstract:

Diffusion language models hold the promise of fast parallel generation, while autoregressive (AR) models typically excel in quality due to their causal structure aligning naturally with language modeling. This raises a fundamental question: can we achieve a synergy with high throughput, higher GPU utilization, and AR level quality? Existing methods fail to effectively balance these two aspects, either prioritizing AR using a weaker model for sequential drafting (speculative decoding), leading to lower drafting efficiency, or using some form of left-to-right (AR-like) decoding logic for diffusion, which still suffers from quality degradation …

1 month, 2 weeks назад @ youtube.com
Titans: Learning to Memorize at Test Time (Paper Analysis)
Titans: Learning to Memorize at Test Time (Paper Analysis) Titans: Learning to Memorize at Test Time (Paper Analysis)

Paper: https://arxiv.org/abs/2501.00663 Abstract:

Over more than a decade there has been an extensive research effort on how to effectively utilize recurrent models and attention. While recurrent models aim to compress the data into a fixed-size memory (called hidden state), attention allows attending to the entire context window, capturing the direct dependencies of all tokens. This more accurate modeling of dependencies, however, comes with a quadratic cost, limiting the model to a fixed-length context. We present a new neural long-term memory module that learns to memorize historical context and helps attention to attend to the current context while utilizing long past information. We sh…

2 months назад @ youtube.com
[Paper Analysis] The Free Transformer (and some Variational Autoencoder stuff)
[Paper Analysis] The Free Transformer (and some Variational Autoencoder stuff) [Paper Analysis] The Free Transformer (and some Variational Autoencoder stuff)

https://arxiv.org/abs/2510.17558 Abstract:

We propose an extension of the decoder Transformer that conditions its generative process on random latent variables which are learned without supervision thanks to a variational procedure. Experimental evaluations show that allowing such a conditioning translates into substantial improvements on downstream tasks. Author: François Fleuret Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the con…

3 months, 2 weeks назад @ youtube.com
[Video Response] What Cloudflare's code mode misses about MCP and tool calling
[Video Response] What Cloudflare's code mode misses about MCP and tool calling [Video Response] What Cloudflare's code mode misses about MCP and tool calling

Theo's Video: https://www.youtube.com/watch?v=bAYZjVAodoo

Cloudflare article: https://blog.cloudflare.com/code-mode/ Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325tr58ygf8…

3 months, 4 weeks назад @ youtube.com
[Paper Analysis] On the Theoretical Limitations of Embedding-Based Retrieval (Warning: Rant)
[Paper Analysis] On the Theoretical Limitations of Embedding-Based Retrieval (Warning: Rant) [Paper Analysis] On the Theoretical Limitations of Embedding-Based Retrieval (Warning: Rant)

Paper: https://arxiv.org/abs/2508.21038 Abstract:

Vector embeddings have been tasked with an ever-increasing set of retrieval tasks over the years, with a nascent rise in using them for reasoning, instruction-following, coding, and more. These new benchmarks push embeddings to work for any query and any notion of relevance that could be given. While prior works have pointed out theoretical limitations of vector embeddings, there is a common assumption that these difficulties are exclusively due to unrealistic queries, and those that are not can be overcome with better training data and larger models. In this work, we demonstrate that we may encounter these theoretical limitations in realist…

4 months назад @ youtube.com
AGI is not coming!
AGI is not coming! AGI is not coming!

Jack Morris's investigation into GPT-OSS training data https://x.com/jxmnop/status/1953899426075816164?t=3YRhVQDwQLk2gouTSACoqA&s=09

6 months, 1 week назад @ youtube.com
Context Rot: How Increasing Input Tokens Impacts LLM Performance (Paper Analysis)
Context Rot: How Increasing Input Tokens Impacts LLM Performance (Paper Analysis) Context Rot: How Increasing Input Tokens Impacts LLM Performance (Paper Analysis)

Paper: https://research.trychroma.com/context-rot Abstract:

Large Language Models (LLMs) are typically presumed to process context uniformly—that is, the model should handle the 10,000th token just as reliably as the 100th. However, in practice, this assumption does not hold. We observe that model performance varies significantly as input length changes, even on simple tasks.

In this report, we evaluate 18 LLMs, including the state-of-the-art GPT-4.1, Claude 4, Gemini 2.5, and Qwen3 models. Our results reveal that models do not use their context uniformly; instead, their performance grows increasingly unreliable as input length grows. Authors: Kelly Hong, Anton Troynikov, Jeff Huber Links:

6 months, 3 weeks назад @ youtube.com
Energy-Based Transformers are Scalable Learners and Thinkers (Paper Review)
Energy-Based Transformers are Scalable Learners and Thinkers (Paper Review) Energy-Based Transformers are Scalable Learners and Thinkers (Paper Review)

Paper: https://arxiv.org/abs/2507.02092

Code: https://github.com/alexiglad/EBT

Website: https://energy-based-transformers.github.io/ Abstract:

Inference-time computation techniques, analogous to human System 2 Thinking, have recently become popular for improving model performances. However, most existing approaches suffer from several limitations: they are modality-specific (e.g., working only in text), problem-specific (e.g., verifiable domains like math and coding), or require additional supervision/training on top of unsupervised pretraining (e.g., verifiers or verifiable rewards). In this paper, we ask the question "Is it possible to generalize these System 2 Thinking approaches, and de…

7 months назад @ youtube.com
On the Biology of a Large Language Model (Part 2)
On the Biology of a Large Language Model (Part 2) On the Biology of a Large Language Model (Part 2)

An in-depth look at Anthropic's Transformer Circuit Blog Post

Part 1 here: https://youtu.be/mU3g2YPKlsA

Discord here: https://ykilcher.com/discord https://transformer-circuits.pub/2025/attribution-graphs/biology.html Abstract:

We investigate the internal mechanisms used by Claude 3.5 Haiku — Anthropic's lightweight production model — in a variety of contexts, using our circuit tracing methodology. Authors:

Jack Lindsey†, Wes Gurnee*, Emmanuel Ameisen*, Brian Chen*, Adam Pearce*, Nicholas L. Turner*, Craig Citro*,

David Abrahams, Shan Carter, Basil Hosmer, Jonathan Marcus, Michael Sklar, Adly Templeton,

Trenton Bricken, Callum McDougall◊, Hoagy Cunningham, Thomas Henighan, Adam Jermyn, Andy …

9 months, 2 weeks назад @ youtube.com
Henry AI Labs Henry AI Labs
последний пост None
3blue1brown 3blue1brown
последний пост 2 weeks назад
The Hairy Ball Theorem
The Hairy Ball Theorem The Hairy Ball Theorem

Unexpected applications and a beautiful proof.

Looking for a new career? Check out https://3b1b.co/talent

Supporters get early access to new videos: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Credits:

Senia Sheydvasser: Co-writing and sphere deformation animations

Paul Dancstep: Those lovely fluffy sphere animations. Vince Rubinetti: Music. Timestamps:

0:00 - To comb a hairy ball

1:24 - Applications

8:46 - The puzzle of one null point

12:12 - The proof outline

16:41 - Defining orientation

21:44 - Why inside-out is impossible

25:59 - 3b1b Talent

27:44 - Final food for thought ------------------ These animati…

2 weeks назад @ youtube.com
The ladybug clock puzzle
The ladybug clock puzzle The ladybug clock puzzle

This is the first in a set of monthly puzzles, curated by Peter Winkler. This one was originally suggested by Richard Stanley. You can sign up to hear his description of the answer at http://momath.org/mindbenders

4 weeks, 1 day назад @ youtube.com
The most absurd product I've made
The most absurd product I've made The most absurd product I've made

Because why not make a pi creature neck pillow?

Available at 3b1b.co/store

2 months, 3 weeks назад @ youtube.com
How Laplace transforms solve differential equations
How Laplace transforms solve differential equations How Laplace transforms solve differential equations

Studying the forced harmonic oscillator by taking a Laplace transform and studying its poles.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Chapter on the Laplace Transform:

https://youtu.be/j0wJBEZdwLs Chapter on the S-plane and Simple Harmonic Motion:

https://youtu.be/-j8PzkZ70Lg Timestamps:

0:00 - Opening puzzle

1:06 - Key properties of a Laplace Transform

3:29 - Qualitative analysis with Laplace Transforms

4:29 - The Laplace Transforms of a Derivative

6:06 - The forced oscillator

11:59 - Intuition from the transformed solution

1…

3 months, 1 week назад @ youtube.com
The dynamics of e^(πi)
The dynamics of e^(πi) The dynamics of e^(πi)

A fuller version of this explanation, also including the reason we care about complex exponents in the first place: https://youtu.be/-j8PzkZ70Lg

4 months назад @ youtube.com
But what is a Laplace Transform?
But what is a Laplace Transform? But what is a Laplace Transform?

Visualizing the most important tool for differential equations.

Previous chapter: https://youtu.be/-j8PzkZ70Lg

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Artwork by Kurt Bruns Engine animation borrowed with permission from this (excellent) blog: https://ciechanow.ski/internal-combustion-engine/ Timestamps:

0:00 - Understanding the engine

1:16 - Key background ideas

5:41 - Definition and intuition

10:43 - Complex integration

20:43 - Analytic continuation

23:52 - The transform of exponentials

26:15 - A deep look at cos(t)

32:59 - W…

4 months назад @ youtube.com
The dynamics of e^(πi)
The dynamics of e^(πi) The dynamics of e^(πi)

A fuller version of this explanation, also including the reason we care about complex exponents in the first place: https://youtu.be/-j8PzkZ70Lg

4 months назад @ youtube.com
Why complex exponents matter | Laplace Transform Prelude
Why complex exponents matter | Laplace Transform Prelude Why complex exponents matter | Laplace Transform Prelude

How dynamics explain Euler's formula, and vice versa.

Early view of the Laplace Transform video: https://www.patreon.com/posts/laplace-early-140428165

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Timestamps:

0:00 - Intro

1:51 - Euler's formula explained dynamically

9:27 - The harmonic oscillator

21:08 - General linear equations

22:47 - Motivating the Laplace Transform ------------------ These animations are largely made using a custom Python library, manim. See the FAQ comments here:

https://3b1b.co/faq#manim Music by Vincent Rubin…

4 months, 1 week назад @ youtube.com
Why ruler and compass? | Guest video by ⁨@bensyversen⁩
Why ruler and compass? | Guest video by ⁨@bensyversen⁩ Why ruler and compass? | Guest video by ⁨@bensyversen⁩

What role were ruler and compass constructions really serving?

Check out Ben's channel: @bensyversen Interview with the author of this video: https://youtu.be/VohYM99j8e0

Supporters get early views of new videos: https://3b1b.co/support Written, produced, edited, and animated by Ben Syversen

Additional editing: Jack Saxon

3d Blender model: Jan-Hendrik Müller

Additional Blender help: Thibaut Modrzyk (@Deepia)

Illustrations: Alex Zepherin/DonDada Studio

Drums: Jeremy Gustin

Additional music from Epidemic Sound Special thanks to Viktor Blåsjö: https://intellectualmathematics.com/opinionated-history-of-mathematics/ References/Recommended reading: Euclid’s Elements:

Visual edition of Book 1: htt…

4 months, 4 weeks назад @ youtube.com
Incomplete open cubes
Incomplete open cubes Incomplete open cubes

Full video: https://youtu.be/_BrFKp-U8GI

5 months, 1 week назад @ youtube.com
Exploration & Epiphany
Exploration & Epiphany Exploration & Epiphany

Sol Lewitt's "Incomplete Open Cubes" and rediscovering Burnside's lemma in group theory

This is a guest video by Paul Dancstep: https://youtu.be/JEeM2ABUMoo

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to share the videos.

Home page: https://www.3blue1brown.com Thanks to the Wadsworth Atheneum for granting permission to use LeWitt's notebooks. Talks by Paul you can find online: What is Category Theory:

https://www.youtube.com/watch?app=desktop&v=eXBwU9ieLL0 How to Predict Eclipses:

https://www.exploratorium.edu/eclipse/video/how-predict-eclipses Theo Jansen's Strandbeests

https://www.youtube.com/w…

5 months, 1 week назад @ youtube.com
Simulating Phase Change | Guest video by Vilas Winstein
Simulating Phase Change | Guest video by Vilas Winstein Simulating Phase Change | Guest video by Vilas Winstein

Deriving the Boltzmann formula, defining temperature, and simulating liquid/vapor.

@SpectralCollective has the second part: https://youtu.be/yEcysu5xZH0

You can play with a simulation of this model here: https://vilas.us/simulations/liquidvapor/

These lessons are funded directly by viewers: https://3b1b.co/support

Home page: https://www.3blue1brown.com Notes from Vilas:

1) This open problem is to prove the ergodicity of the deterministic dynamical systems that are used to model the molecule-level physics. A good example of such a dynamical system is the box with particles evolving according to Newton's laws with elastic collisions, like in the video. 2) This video assumes that all probabili…

5 months, 2 weeks назад @ youtube.com
How AI connects text and images
How AI connects text and images How AI connects text and images

From this guest video by @WelchLabsVideo on how diffusion models work: https://youtu.be/iv-5mZ_9CPY

5 months, 3 weeks назад @ youtube.com
The AI that solved IMO Geometry Problems | Guest video by @Aleph0
The AI that solved IMO Geometry Problems | Guest video by @Aleph0 The AI that solved IMO Geometry Problems | Guest video by @Aleph0

How AlphaGeometry combines logic and intuition.

Share stories about AI in math research for an upcoming video: https://forms.gle/gr9aZVdUrW5T3yDg9

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com AlphaGeometry announcement:

https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/ Similar open-source model, Newclid, by Harmonic:

https://harmonic.fun/news#blog-post-geometry Timestamps:

0:00 - What's surprising

1:33 - Solve without AI

7:10 - Where AI comes in

12:48 - Grant's comments ------------------…

6 months назад @ youtube.com
But how do AI videos actually work? | Guest video by @WelchLabsVideo
But how do AI videos actually work? | Guest video by @WelchLabsVideo But how do AI videos actually work? | Guest video by @WelchLabsVideo

Diffusion models, CLIP, and the math of turning text into images

Welch Labs Book: https://www.welchlabs.com/resources/imaginary-numbers-book Sections

0:00 - Intro

3:37 - CLIP

6:25 - Shared Embedding Space

8:16 - Diffusion Models & DDPM

11:44 - Learning Vector Fields

22:00 - DDIM

25:25 - Dall E 2

26:37 - Conditioning

30:02 - Guidance

33:39 - Negative Prompts

34:27 - Outro

35:32 - About guest videos + Grant’s Reaction Special Thanks to:

Jonathan Ho - Jonathan is the Author of the DDPM paper and the Classifier Free Guidance Paper.

https://arxiv.org/pdf/2006.11239

https://arxiv.org/pdf/2207.12598 Preetum Nakkiran - Preetum has an excellent introductory diffusion tutorial:

https://arxiv.org/pdf/24…

6 months, 3 weeks назад @ youtube.com
Two Minute Papers Two Minute Papers
последний пост 2 days, 10 hours назад
Anthropic Found Why AIs Go Insane
Anthropic Found Why AIs Go Insane Anthropic Found Why AIs Go Insane

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://www.anthropic.com/research/assistant-axis Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

2 days, 10 hours назад @ youtube.com
This Is Now 66x Faster
This Is Now 66x Faster This Is Now 66x Faster

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://sig25ddmpd.github.io/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

4 days, 12 hours назад @ youtube.com
NVIDIA’s New AI Just Leveled Up Video Editing
NVIDIA’s New AI Just Leveled Up Video Editing NVIDIA’s New AI Just Leveled Up Video Editing

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper and the code are now available here:

https://dvirsamuel.github.io/omnimattezero.github.io/

https://github.com/dvirsamuel/OmnimatteZero Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/ …

1 week, 1 day назад @ youtube.com
New DeepSeek Research - The Age of AI Is Here!
New DeepSeek Research - The Age of AI Is Here! New DeepSeek Research - The Age of AI Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://arxiv.org/abs/2501.12948 Sources:

https://x.com/awnihannun/status/1883276535643455790

https://x.com/bcjordan/status/1886825587097878826

https://x.com/izag82161/status/1906347576204640514 Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw…

1 week, 3 days назад @ youtube.com
What A Time To Be Alive (Our First Ever Music Video)
What A Time To Be Alive (Our First Ever Music Video) What A Time To Be Alive (Our First Ever Music Video)

Thank you so much everyone, this was amazing fun! Guitar tab is available here for free, no paywall nonsense: https://www.dropbox.com/scl/fi/oo1ny6i7mtgu3l006roa5/what-a-time-to-be-alive.gp?rlkey=n0c4xryfip7m15derrudnb8zl&st=fu51yljh&dl=1 Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~…

2 weeks назад @ youtube.com
Digital Humans Just Took A Massive Leap Forward
Digital Humans Just Took A Massive Leap Forward Digital Humans Just Took A Massive Leap Forward

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://neuralbodies.github.io/RFGCA/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

2 weeks, 2 days назад @ youtube.com
Scientists Just Solved The Hardest Problem in Granular Physics
Scientists Just Solved The Hardest Problem in Granular Physics Scientists Just Solved The Hardest Problem in Granular Physics

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://visualcomputing.ist.ac.at/publications/2025/HomogenizedSand/ Previous Disney grains paper:

https://la.disneyresearch.com/publication/multi-scale-modeling-and-rendering-of-granular-materials/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagen…

2 weeks, 6 days назад @ youtube.com
This Fluid Simulation Should Not Be Possible
This Fluid Simulation Should Not Be Possible This Fluid Simulation Should Not Be Possible

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper "Fast Octree Neighborhood Search for SPH Simulations" is available here:

https://andreaslongva.com/pdf/2022-SA-NeighborhoodSearch-compressed.pdf

https://animation.rwth-aachen.de/media/papers/79/2022-SA-NeighborhoodSearch.pdf Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, …

3 weeks, 6 days назад @ youtube.com
Wrinkles Are Weirder Than We Ever Thought
Wrinkles Are Weirder Than We Ever Thought Wrinkles Are Weirder Than We Ever Thought

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://wanghmin.github.io/publication/zhang-2025-pie/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

1 month назад @ youtube.com
Why Game Physics Is Falling Apart (And How To Fix It)
Why Game Physics Is Falling Apart (And How To Fix It) Why Game Physics Is Falling Apart (And How To Fix It)

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://graphics.cs.utah.edu/research/projects/stable-cosserat-rods/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers

Note that just watching the series and leaving a kind comment every now and then is as much support as any of us could ever ask for! Sources:

https://www.youtube.com/watch?v=kO3NsSX1VTg

https://www.youtube.com/watch?v=IQZ_zBX6gQY 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Mi…

1 month, 1 week назад @ youtube.com
We Just Turned Down Millions of Dollars. Here Is Why.
We Just Turned Down Millions of Dollars. Here Is Why. We Just Turned Down Millions of Dollars. Here Is Why.

Yup. My free course on how to write a light simulation program (ray tracing): https://users.cg.tuwien.ac.at/zsolnai/gfx/rendering-course/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras Bobrovytsky, Tybie Fitzhugh, Ueli Gallizzi

If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers My research: https://cg.tuwien.ac.at/~zsolnai/

Thumbnail design: Felícia Zsolnai-Fehér - http://felicia.hu

1 month, 2 weeks назад @ youtube.com
The Bug That Ruined Game Physics For Decades
The Bug That Ruined Game Physics For Decades The Bug That Ruined Game Physics For Decades

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Using DeepSeek on Lambda:

https://lambda.ai/inference-models/deepseek-r1 📝 The paper "A Stream Function Solver for Liquid Simulations" is available here:

https://pub.ista.ac.at/group_wojtan/projects/2015_Ando_ASFSfLS/download/vecpotential.pdf 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Fred R, Gordon …

1 month, 2 weeks назад @ youtube.com
NVIDIA’s AI Learns To Walk…Painfully
NVIDIA’s AI Learns To Walk…Painfully NVIDIA’s AI Learns To Walk…Painfully

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Using DeepSeek on Lambda:

https://lambda.ai/inference-models/deepseek-r1 📝 The paper is available here:

https://research.nvidia.com/labs/toronto-ai/trace-pace/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras B…

1 month, 3 weeks назад @ youtube.com
This Is The Physics Tech Games Have Been Waiting For
This Is The Physics Tech Games Have Been Waiting For This Is The Physics Tech Games Have Been Waiting For

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper is available here:

https://wanghmin.github.io/publication/wu-2022-gbm/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras Bobrovytsky, Tybie Fitzhugh, Ueli Gallizzi

If you wish to appear here or …

1 month, 4 weeks назад @ youtube.com
The AI That Built An Economy… And Went Bankrupt
The AI That Built An Economy… And Went Bankrupt The AI That Built An Economy… And Went Bankrupt

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Using DeepSeek on Lambda:

https://lambda.ai/inference-models/deepseek-r1 📝 The paper is available here:

https://simworld.org/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras Bobrovytsky, Tybie Fitzhugh, Ueli G…

2 months назад @ youtube.com
DataFest Video DataFest Video
последний пост None
Семинары JetBrains Research Семинары JetBrains Research
последний пост None
Яндекс. Компьютерные науки Яндекс. Компьютерные науки
последний пост 1 day, 10 hours назад
Saturday ML Party 21.03.2026
Saturday ML Party 21.03.2026 Saturday ML Party 21.03.2026

Welcome to an evening meetup for ML engineers hosted by Yandex. This time we will talk about progress in image multimodality, book-length speech synthesis, and more. Join the Yandex for ML channel to stay up to date on all Yandex events: https://t.me/+LoohiDU_sVk4NGVi Leave your questions for the speakers in the chat with the #вопрос tag: https://t.me/+OsKnLNG-7DE1ZTFi

1 day, 10 hours назад @ youtube.com
How thematic blocks in Yandex Search work
How thematic blocks in Yandex Search work How thematic blocks in Yandex Search work

Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group, looks under the hood of the Yandex search engine. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

2 days, 12 hours назад @ youtube.com
How search mixes up the Hermitages
How search mixes up the Hermitages How search mixes up the Hermitages

Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group, talks about scenarios with irrelevant queries. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

1 week назад @ youtube.com
The neural network needs to know that Barcelona is top-tier
The neural network needs to know that Barcelona is top-tier The neural network needs to know that Barcelona is top-tier

A fragment of a talk by Alexey Kolesov, CTO at Yandex R&D, from Practical ML Conf 2025. Using a simple and "obvious" example from football, it breaks down a serious problem of large language models: why LLMs can be bad at remembering facts about the world, and how factual memory and knowledge application were improved in YandexGPT 5.1. The talk also shows how online reinforcement learning (RL) for model training was launched stably in a real production system. About training neural networks, LLM memory, and practical ML in production. The full talk is on the channel. #YandexGPT #LLM #AI #MachineLearning #ReinforcementLearning #DeepLearning #ML #ArtificialIntelligence #PracticalML #MLOps
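
For readers who want a concrete picture of the online RL idea mentioned above, here is a deliberately tiny, generic sketch (not the YandexGPT 5.1 setup): a softmax policy over candidate answers is sampled online, each sample is scored by an assumed factuality reward, and a REINFORCE-style update shifts probability toward the factually correct answer. All names and numbers are invented for illustration.

```python
# Generic online-RL toy: nudge a softmax policy toward factually correct answers.
# This is an illustration only, not the training setup described in the talk.
import numpy as np

rng = np.random.default_rng(0)
candidates = ["Barcelona is a top club", "Barcelona is a minor club"]
reward = np.array([1.0, 0.0])   # assumed factuality reward per candidate
logits = np.zeros(2)            # policy parameters (one logit per candidate)
lr = 0.5

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for step in range(200):
    probs = softmax(logits)
    a = rng.choice(2, p=probs)      # sample an answer online
    baseline = probs @ reward       # variance-reducing baseline
    advantage = reward[a] - baseline
    grad = -probs                   # d log pi(a) / d logits ...
    grad[a] += 1.0                  # ... is one-hot(a) minus probs
    logits += lr * advantage * grad # REINFORCE update

print({c: round(float(p), 3) for c, p in zip(candidates, softmax(logits))})
```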

1 week, 3 days назад @ youtube.com
What a bad result impression in search looks like
What a bad result impression in search looks like What a bad result impression in search looks like

Explained by Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

1 week, 6 days назад @ youtube.com
Why Yandex geo search is needed
Why Yandex geo search is needed Why Yandex geo search is needed

Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group, talks about the service's functionality. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

2 weeks, 3 days назад @ youtube.com
What the PMI Score is
What the PMI Score is What the PMI Score is

This is a fragment of a talk by Vsevolod Paramonov and Andrey Shevtsov, developers in the pricing analytics group at Yandex Lavka. At Data Fest Siberia they explained what is under the hood of dynamic pricing in the service. They also broke down how the demand and elasticity models work and which mathematical formulas drive the process. Watch the full video on our channel! #ценообразование, #dynamicpricing, #pricingmodel, #dataanalytics, #datascience, #ML, #математика, #экономика, #аналитика, #бизнесметрики, #спрос, #эластичность, #YandexLavka, #DataFestSiberia, #ITконференция, #datadriven, #Shorts

3 weeks назад @ youtube.com
What the pricing model looks like
What the pricing model looks like What the pricing model looks like

This is a fragment of a talk by Vsevolod Paramonov and Andrey Shevtsov, developers in the pricing analytics group at Yandex Lavka. At Data Fest Siberia they explained what is under the hood of dynamic pricing in the service. They also broke down how the demand and elasticity models work and which mathematical formulas drive the process. Watch the full video on our channel! #ценообразование, #dynamicpricing, #pricingmodel, #dataanalytics, #datascience, #ML, #математика, #экономика, #аналитика, #бизнесметрики, #спрос, #эластичность, #YandexLavka, #DataFestSiberia, #ITконференция, #datadriven, #Shorts
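
As a rough illustration of the kind of demand-and-elasticity machinery such talks describe, here is a minimal sketch assuming a constant-elasticity demand curve; the coefficients and the textbook markup rule below are placeholders for illustration, not Yandex Lavka's actual model.

```python
# Constant-elasticity demand toy: demand(p) = a * p**(-elasticity).
# All parameters are invented; this only illustrates the general mechanism.
import numpy as np

a, elasticity, unit_cost = 500.0, 1.8, 70.0   # hypothetical coefficients

def demand(price):
    return a * price ** (-elasticity)

prices = np.linspace(80, 250, 500)
profit = (prices - unit_cost) * demand(prices)
best = prices[np.argmax(profit)]
print(f"grid-search profit-maximising price: {best:.2f}")
# For constant elasticity e > 1, the textbook optimum is cost * e / (e - 1):
print(f"closed-form price: {unit_cost * elasticity / (elasticity - 1):.2f}")
```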

3 weeks, 4 days назад @ youtube.com
What problem pricing solves
What problem pricing solves What problem pricing solves

This is a fragment of a talk by Vsevolod Paramonov and Andrey Shevtsov, developers in the pricing analytics group at Yandex Lavka. At Data Fest Siberia they explained what is under the hood of dynamic pricing in the service. They also broke down how the demand and elasticity models work and which mathematical formulas drive the process. Watch the full video on our channel! #ценообразование, #dynamicpricing, #pricingmodel, #dataanalytics, #datascience, #ML, #математика, #экономика, #аналитика, #бизнесметрики, #спрос, #эластичность, #YandexLavka, #DataFestSiberia, #ITконференция, #datadriven, #Shorts

4 weeks назад @ youtube.com
Computer vision in 2025 / Roman Isachenko
Computer vision in 2025 / Roman Isachenko Computer vision in 2025 / Roman Isachenko

ML Global Recap 2025 is a meetup for the ML community where we covered the year's main international conferences and the most interesting trends in recommender technologies, computer vision, speech recognition, and NLP. Roman Isachenko, head of the image analysis team at Yandex R&D, spoke at the event about multimodal image analysis (VLM) and diffusion models for image generation. More ML content at the link: https://t.me/+Ug9D4CjJrJxmZGRi #ML, #MachineLearning, #AI, #DataScience, #MLGlobalRecap2025, #нейросети, #искусственныйинтеллект, #рекомендательныесистемы, #NLP, #ComputerVision, #SpeechRecognition, #LLM, #DeepLearning, #MLтренды, #ITконференция, #Ya…

1 month, 2 weeks назад @ youtube.com
Trends in NLP, a review of ICLR and ACL / Alexander Yushkevich
Trends in NLP, a review of ICLR and ACL / Alexander Yushkevich Trends in NLP, a review of ICLR and ACL / Alexander Yushkevich

ML Global Recap 2025 is a meetup for the ML community where we covered the year's main international conferences and the most interesting trends in recommender technologies, computer vision, speech recognition, and NLP. Alexander Yushkevich, head of the team developing base-quality models in Search Services and AI, spoke at the event. He showed how the conferences reflect trends in NLP: top LLMs are becoming more closed, while demand grows for alignment & safety, inference, interpretability, and optimization. And new benchmarks keep appearing, of course. More ML content at the link: https://t.me/+Ug9D4CjJrJxmZGRi #ML, #MachineLearning, #AI, #DataScience, #MLGlobalRecap2025, #не…

1 month, 3 weeks назад @ youtube.com
Speech technologies at Interspeech and ICASSP 2025 / Boris Sheludko
Speech technologies at Interspeech and ICASSP 2025 / Boris Sheludko Speech technologies at Interspeech and ICASSP 2025 / Boris Sheludko

ML Global Recap 2025 is a meetup for the ML community where we covered the year's main international conferences and the most interesting trends in recommender technologies, computer vision, speech recognition, and NLP. Boris Sheludko, head of the audio quality team at Yandex R&D, spoke at the event about the takeaways from the two main international conferences on speech technologies: Interspeech in the Netherlands and ICASSP in India. More ML content at the link: https://t.me/+Ug9D4CjJrJxmZGRi #ML, #MachineLearning, #AI, #DataScience, #MLGlobalRecap2025, #нейросети, #искусственныйинтеллект, #рекомендательныесистемы, #NLP, #ComputerVision, #SpeechRecognition, #LLM, #DeepLearning, …

1 month, 3 weeks назад @ youtube.com
Key trends in recommender systems / Nikolay Savushkin
Key trends in recommender systems / Nikolay Savushkin Key trends in recommender systems / Nikolay Savushkin

ML Global Recap 2025 is a meetup for the ML community where we covered the year's main international conferences and the most interesting trends in recommender technologies, computer vision, speech recognition, and NLP. Nikolay Savushkin, head of the recommender technologies team at Yandex R&D, spoke at the event. He shared insights from CIKM and RecSys and covered the key trends in recommender systems: foundation and End2End models, scaling, multimodality, attention-based ranking, and more. More ML content at the link: https://t.me/+Ug9D4CjJrJxmZGRi #ML, #MachineLearning, #AI, #DataScience, #MLGlobalRecap2025, #нейросети, #искусственныйинтеллект, #ре…

1 month, 3 weeks назад @ youtube.com
Opening of ML Global Recap 2025 / Alexey Gusakov
Opening of ML Global Recap 2025 / Alexey Gusakov Opening of ML Global Recap 2025 / Alexey Gusakov

ML Global Recap 2025 is a meetup for the ML community where we covered the year's main international conferences and the most interesting trends in recommender technologies, computer vision, speech recognition, and NLP. The event was opened by Alexey Gusakov, CTO of Search Services and AI. The talk includes a welcome address, a brief review of NeurIPS, and a bit about the benchmark crisis. More ML content at the link: https://t.me/+Ug9D4CjJrJxmZGRi #ML, #MachineLearning, #AI, #DataScience, #MLGlobalRecap2025, #нейросети, #искусственныйинтеллект, #рекомендательныесистемы, #NLP, #ComputerVision, #SpeechRecognition, #LLM, #DeepLearning, #MLтренды, #ITконференция, #Yandex, #MLсообщество

1 month, 3 weeks назад @ youtube.com
How to solve the problem of LLM response diversity
How to solve the problem of LLM response diversity How to solve the problem of LLM response diversity

This is an excerpt from a talk by Alexey Kolesov, CTO at Yandex R&D. At Practical ML Conf 2025 he described how the team taught YandexGPT 5.1 to remember facts better and apply knowledge about them, and showed how online RL was made to run stably. The full recording is already on the channel! #YandexGPT #LLM #AI #ArtificialIntelligence #MachineLearning #DeepLearning #ReinforcementLearning #OnlineRL #NLP #GenerativeAI #YandexForDevelopers #YandexForML #Яндекс #AIDevDay #TechConference #DataScience #ML #AIResearch #LanguageModel #YandexTech

2 months, 1 week назад @ youtube.com
ML Trainings ML Trainings
последний пост 3 days, 11 hours назад
Why natural language understanding is a hard problem
Why natural language understanding is a hard problem Why natural language understanding is a hard problem

Natural Language Processing & LLMs course (stream 10, spring 2026):

https://ods.ai/tracks/nlp-course-spring-2026

3 days, 11 hours назад @ youtube.com
A man is drowning and asks to pass his phone to his wife
A man is drowning and asks to pass his phone to his wife A man is drowning and asks to pass his phone to his wife

Natural Language Processing & LLMs course (stream 10, spring 2026):

https://ods.ai/tracks/nlp-course-spring-2026

3 days, 11 hours назад @ youtube.com
NLP: how people communicate with computers through language
NLP: how people communicate with computers through language NLP: how people communicate with computers through language

Natural Language Processing & LLMs course (stream 10, spring 2026):

https://ods.ai/tracks/nlp-course-spring-2026

3 days, 11 hours назад @ youtube.com
Falling memory prices and manufacturer speculation
Falling memory prices and manufacturer speculation Falling memory prices and manufacturer speculation 5 days, 14 hours назад @ youtube.com
Can models interact directly with matter
Can models interact directly with matter Can models interact directly with matter 5 days, 14 hours назад @ youtube.com
Space and investment: a new race or a strategy
Space and investment: a new race or a strategy Space and investment: a new race or a strategy 5 days, 14 hours назад @ youtube.com
Chinese memory chips are changing the market
Chinese memory chips are changing the market Chinese memory chips are changing the market 5 days, 14 hours назад @ youtube.com
An alien Elon Musk and the mysteries of the far side
An alien Elon Musk and the mysteries of the far side An alien Elon Musk and the mysteries of the far side 5 days, 14 hours назад @ youtube.com
AI as electricity: the future of data and data leaks
AI as electricity: the future of data and data leaks AI as electricity: the future of data and data leaks 5 days, 14 hours назад @ youtube.com
Captain's Bridge #5: scrapyard engines for AI | space AI from Elon Musk | NeurIPS without code
Captain's Bridge #5: scrapyard engines for AI | space AI from Elon Musk | NeurIPS without code Captain's Bridge #5: scrapyard engines for AI | space AI from Elon Musk | NeurIPS without code

This episode was recorded live at Letovo MLConf; apologies for the audio quality. 0:00:00 Start

0:00:20 Scrapyard engines for AI

0:05:07 Space AI from Elon Musk

0:11:48 Continuing education in IT

0:31:13 A garage DGX

0:41:54 AI more expensive than railways

0:48:43 A bio-generator of viruses

0:56:06 Political campaigning with AI

1:02:18 Japanese nanometers

1:05:32 NeurIPS without code

1:09:04 Chinese memory

1:14:09 Data leaking abroad. AI summary: This episode discusses current topics in energy, including the use of old aircraft engines to generate electricity, the future of space data centers, trends in IT education, and problems in the labor market. The participants share their views on how quickly the market is changing and …

6 days, 11 hours назад @ youtube.com
Awards ceremony and breakdown of VK RecSys Challenge solutions
Awards ceremony and breakdown of VK RecSys Challenge solutions Awards ceremony and breakdown of VK RecSys Challenge solutions

Data Ёлка 2025: https://ods.ai/events/data-elka-2025

VK RecSys Challenge LSVD: https://ods.ai/competitions/vkrecsyschallengelsvd

_____

Our social media:

Telegram: https://t.me/datafest

VK: https://vk.com/datafest

Telegram channel with job openings: https://t.me/odsjobs

Channel with course updates: https://t.me/odscourses

How to join the ODS community Mattermost chat: https://ods.ai/tracks/mattermost

1 week, 3 days назад @ youtube.com
Nikita Patsakula | 2025 in review: ML in Rust
Nikita Patsakula | 2025 in review: ML in Rust Nikita Patsakula | 2025 in review: ML in Rust

Speaker: Nikita Patsakula, CTO, Когнито. Data Ёлка 2025: https://ods.ai/events/data-elka-2025

1 week, 3 days назад @ youtube.com
Anton Voronov | 2025 in review: DS/ML Career
Anton Voronov | 2025 in review: DS/ML Career Anton Voronov | 2025 in review: DS/ML Career

Speaker: Anton Voronov, Technical Unit Lead, Avito Podrabotka. Data Ёлка 2025: https://ods.ai/events/data-elka-2025

1 week, 3 days назад @ youtube.com
Ivan Sosin | 2025 in review: Robotics
Ivan Sosin | 2025 in review: Robotics Ivan Sosin | 2025 in review: Robotics

Speaker: Ivan Sosin, Executive Director, Sber Robotics Center. Data Ёлка 2025: https://ods.ai/events/data-elka-2025

1 week, 3 days назад @ youtube.com
Evgeny Nikitin | 2025 in review: ML in Healthcare
Evgeny Nikitin | 2025 in review: ML in Healthcare Evgeny Nikitin | 2025 in review: ML in Healthcare

Speaker: Evgeny Nikitin, CTO, Цельс. Data Ёлка 2025: https://ods.ai/events/data-elka-2025

1 week, 3 days назад @ youtube.com
Primer Primer
последний пост 1 month, 2 weeks назад
Taking AI Doom Seriously For 62 Minutes
Taking AI Doom Seriously For 62 Minutes Taking AI Doom Seriously For 62 Minutes

Patreon: https://www.patreon.com/primerlearning

80,000 Hours: 80000hours.org/primer https://www.desmos.com/calculator/a5pfjtr4tr Other connections:

Discord: https://discord.gg/NbruaNW

Twitch: https://www.twitch.tv/justin_helps

Store: https://store.dftba.com/collections/primer Reddit: https://www.reddit.com/r/primerlearning/

Bsky: https://bsky.app/profile/justinhelps.bsky.social

Twitter: https://twitter.com/primerlearning Links to other resources:

https://yoshuabengio.org/2024/07/09/reasoning-through-arguments-against-taking-ai-safety-seriously/

https://www.youtube.com/c/robertmilesai

https://www.youtube.com/@Siliconversations

https://www.youtube.com/@Go-Meta

https://www.youtube.com/@Dwarkes…

1 month, 2 weeks назад @ youtube.com
Simulating a single brain cell
Simulating a single brain cell Simulating a single brain cell

Patreon:

https://www.patreon.com/primerlearning Helpful resources if you want to learn more about neural networks

https://www.youtube.com/@AndrejKarpathy

https://course.fast.ai/

https://www.youtube.com/@WelchLabsVideo

https://www.youtube.com/@3blue1brown Early papers. These probably aren't helpful for understanding the concepts in this video, but they are worth a look if you're interested in the history.

The Perceptron – A perceiving and recognizing automaton: https://bpb-us-e2.wpmucdn.com/websites.umass.edu/dist/a/27637/files/2016/03/rosenblatt-1957.pdf

The Perceptron: A probabilistic model for information storage and organization in the brain: https://www.ling.upenn.edu/courses/cogs501/Rosenblatt1958.pdf A Logical…

4 months, 2 weeks назад @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast Lex Fridman AI Podcast
последний пост 2 days, 20 hours назад
#491 – OpenClaw: The Viral AI Agent that Broke the Internet – Peter Steinberger
#491 – OpenClaw: The Viral AI Agent that Broke the Internet – Peter Steinberger #491 – OpenClaw: The Viral AI Agent that Broke the Internet – Peter Steinberger

Peter Steinberger is the creator of OpenClaw, an open-source AI agent framework that’s the fastest-growing project in GitHub history.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep491-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://coderabbit.ai/lex
Fin: AI agent for customer service.

Go to https://fin.ai/lex
Blitzy: AI agent for large enterprise codebases.

Go to https://drinkLMNT.com/lex
OUTLINE:
(00:00) – Introduction
(03:51) – Sponsors, Comments, and Reflections
(15:29) – OpenClaw origin story
(18:48) – Mind-blowing moment
(28:15) – Why OpenClaw went viral
(32:12) – Self-modifying AI agent
(36:57)…

2 days, 20 hours назад @ lexfridman.com
#490 – State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI
#490 – State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI #490 – State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI

Nathan Lambert and Sebastian Raschka are machine learning researchers, engineers, and educators.

Sebastian Raschka is the author of Build a Large Language Model (From Scratch) and Build a Reasoning Model (From Scratch).

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep490-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

(25:11) – ChatGPT vs Claude vs Gemini vs Grok: Who is winning?

(36:11) – Best AI for coding
(43:02) – Open Source vs Closed Source LLMs
(54:41) – Transformers: Evolution of LLMs since 2019
(1:02:38) – AI Scaling Laws: Are they dead or still holding?

1 week, 6 days назад @ lexfridman.com
#489 – Paul Rosolie: Uncontacted Tribes in the Amazon Jungle
#489 – Paul Rosolie: Uncontacted Tribes in the Amazon Jungle #489 – Paul Rosolie: Uncontacted Tribes in the Amazon Jungle

Paul Rosolie is a naturalist, explorer, author of a new book titled Junglekeeper, and is someone who has dedicated his life to protecting the Amazon rainforest.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep489-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://perplexity.ai/
BetterHelp: Online therapy and counseling.

Go to https://fin.ai/lex
Miro: Online collaborative whiteboard platform.

Go to https://miro.com/
MasterClass: Online classes from world-class experts.

1 month назад @ lexfridman.com
#488 – Infinity, Paradoxes that Broke Mathematics, Gödel Incompleteness & the Multiverse – Joel David Hamkins
#488 – Infinity, Paradoxes that Broke Mathematics, Gödel Incompleteness & the Multiverse – Joel David Hamkins #488 – Infinity, Paradoxes that Broke Mathematics, Gödel Incompleteness & the Multiverse – Joel David Hamkins

Joel David Hamkins is a mathematician and philosopher specializing in set theory, the foundations of mathematics, and the nature of infinity, and he’s the #1 highest-rated user on MathOverflow.

He is also the author of several books, including Proof and the Art of Mathematics and Lectures on the Philosophy of Mathematics.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep488-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://masterclass.com/lexpod
OUTLINE:
(00:00) – Introduction
(01:58) – Sponsors, Comments, and Reflections
(15:40) – Infinity & paradoxes
(1:02:50) – Russell’s paradox
(1:15:57) – Gödel’s…

1 month, 2 weeks назад @ lexfridman.com
#487 – Irving Finkel: Deciphering Secrets of Ancient Civilizations & Flood Myths
#487 – Irving Finkel: Deciphering Secrets of Ancient Civilizations & Flood Myths #487 – Irving Finkel: Deciphering Secrets of Ancient Civilizations & Flood Myths

Irving Finkel is a scholar of ancient languages and a longtime curator at the British Museum, renowned for his expertise in Mesopotamian history and cuneiform writing.

He specializes in reading and interpreting cuneiform inscriptions, including tablets from Sumerian, Akkadian, Babylonian, and Assyrian contexts.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep487-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lex
Miro: Online collaborative whiteboard platform.

Go to https://miro.com/
Chevron: Reliable energy for data centers.

2 months назад @ lexfridman.com
#486 – Michael Levin: Hidden Reality of Alien Intelligence & Biological Life
#486 – Michael Levin: Hidden Reality of Alien Intelligence & Biological Life #486 – Michael Levin: Hidden Reality of Alien Intelligence & Biological Life

Michael Levin is a biologist at Tufts University working on novel ways to understand and control complex pattern formation in biological systems.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep486-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lex
Miro: Online collaborative whiteboard platform.

Go to https://miro.com/
MasterClass: Online classes from world-class experts.

(2:42:41) – Mind uploading
(3:01:22) – Alien intelligence
(3:16:17) – Advice for young people
(3:22:46) – Questions for AGI

2 months, 2 weeks назад @ lexfridman.com
#485 – David Kirtley: Nuclear Fusion, Plasma Physics, and the Future of Energy
#485 – David Kirtley: Nuclear Fusion, Plasma Physics, and the Future of Energy #485 – David Kirtley: Nuclear Fusion, Plasma Physics, and the Future of Energy

David Kirtley is a nuclear fusion engineer and CEO of Helion Energy, a company working on building the world's first commercial fusion power plant by 2028.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep485-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc. Transcript:

https://lexfridman.com/david-kirtley-transcript CONTACT LEX:

Feedback - give feedback to Lex: https://lexfridman.com/survey

AMA - submit questions, videos or call-in: https://lexfridman.com/ama

Hiring - join our team: https://lexfridman.com/hiring

Other - other ways to get in touch: https://lexfridman.com/contact EPISODE LINKS:

David's X: htt…

2 months, 4 weeks назад @ lexfridman.com
#484 – Dan Houser: GTA, Red Dead Redemption, Rockstar, Absurd & Future of Gaming
#484 – Dan Houser: GTA, Red Dead Redemption, Rockstar, Absurd & Future of Gaming #484 – Dan Houser: GTA, Red Dead Redemption, Rockstar, Absurd & Future of Gaming

Dan Houser is co-founder of Rockstar Games and is a legendary creative mind behind Grand Theft Auto (GTA) and Red Dead Redemption series of video games.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep484-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://box.com/ai
UPLIFT Desk: Standing desks and office ergonomics.

Go to https://drinkLMNT.com/lex
OUTLINE:
(00:00) – Introduction
(01:29) – Sponsors, Comments, and Reflections
(11:32) – Greatest films of all time
(23:45) – Making video games
(26:36) – GTA 3
(29:55) – Open world video games
(32:42) – Character creation
(36:09) – Superintelligent AI in A Bette…

3 months, 2 weeks назад @ lexfridman.com
#483 – Julia Shaw: Criminal Psychology of Murder, Serial Killers, Memory & Sex
#483 – Julia Shaw: Criminal Psychology of Murder, Serial Killers, Memory & Sex #483 – Julia Shaw: Criminal Psychology of Murder, Serial Killers, Memory & Sex

Julia Shaw is a criminal psychologist and author who in her books explores human nature, including psychopathy, violent crime, the psychology of evil, police interrogation, false memory manipulation, deception detection, and human sexuality.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep483-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lex
BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex
LMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lex
AG1: All-in-one daily nutrition drink.

4 months назад @ lexfridman.com
#482 – Pavel Durov: Telegram, Freedom, Censorship, Money, Power & Human Nature
#482 – Pavel Durov: Telegram, Freedom, Censorship, Money, Power & Human Nature #482 – Pavel Durov: Telegram, Freedom, Censorship, Money, Power & Human Nature

Pavel Durov is the founder and CEO of Telegram.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep482-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Transcript: https://lexfridman.com/pavel-durov-transcript
CONTACT LEX:
Feedback – give feedback to Lex: https://lexfridman.com/survey
AMA – submit questions, videos or call-in: https://lexfridman.com/ama
Hiring – join our team: https://lexfridman.com/hiring
Other – other ways to get in touch: https://lexfridman.com/contact
EPISODE LINKS:
Pavel’s Telegram: https://t.me/durov
Pavel’s X: https://x.com/durov
Telegram: https://telegram.org/
Telegram Contests: https://contest.c…

4 months, 2 weeks назад @ lexfridman.com
#481 – Norman Ohler: Hitler, Nazis, Drugs, WW2, Blitzkrieg, LSD, MKUltra & CIA
#481 – Norman Ohler: Hitler, Nazis, Drugs, WW2, Blitzkrieg, LSD, MKUltra & CIA #481 – Norman Ohler: Hitler, Nazis, Drugs, WW2, Blitzkrieg, LSD, MKUltra & CIA

Norman Ohler is a historian and author of “Blitzed: Drugs in the Third Reich,” a book that investigates the role of psychoactive drugs, particularly stimulants such as methamphetamine, in the military history of World War II.

It is a book that two legendary historians, Ian Kershaw and Antony Beevor, have praised highly for its depth of research.

Norman also wrote “Tripped: Nazi Germany, the CIA, and the Dawn of the Psychedelic Age”, and he is working on a new book, “Stoned Sapiens”, looking at the history of human civilization through the lens of drugs.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep481-sc
See below for timestamps, transcript, and to give f…

4 months, 4 weeks назад @ lexfridman.com
#480 – Dave Hone: T-Rex, Dinosaurs, Extinction, Evolution, and Jurassic Park
#480 – Dave Hone: T-Rex, Dinosaurs, Extinction, Evolution, and Jurassic Park #480 – Dave Hone: T-Rex, Dinosaurs, Extinction, Evolution, and Jurassic Park

Dave Hone is a paleontologist, expert on dinosaurs, co-host of the Terrible Lizards podcast, and author of numerous scientific papers and books on the behavior and ecology of dinosaurs.

He lectures at Queen Mary University of London on topics of Ecology, Zoology, Biology, and Evolution.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep480-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://go.lindy.ai/lex
BetterHelp: Online therapy and counseling.

Go to https://shopify.com/lex
LMNT: Zero-sugar electrolyte drink mix.

5 months, 1 week назад @ lexfridman.com
#479 – Dave Plummer: Programming, Autism, and Old-School Microsoft Stories
#479 – Dave Plummer: Programming, Autism, and Old-School Microsoft Stories #479 – Dave Plummer: Programming, Autism, and Old-School Microsoft Stories

Dave Plummer is a programmer, former Microsoft software engineer (Windows 95, NT, XP), creator of Task Manager, author of two books on autism, and host of the Dave’s Garage YouTube channel, where he shares stories from his career, insights on software development, and deep dives into technology.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep479-sc
See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lex
ZocDoc: App that helps patients find healthcare providers.

Go to https://zocdoc.com/lex
Fin: AI agent for customer service.

Go to https://fin.ai/lex
Allio Capital: AI-powered investment app that use…

5 months, 2 weeks назад @ lexfridman.com
#478 – Scott Horton: The Case Against War and the Military Industrial Complex
#478 – Scott Horton: The Case Against War and the Military Industrial Complex #478 – Scott Horton: The Case Against War and the Military Industrial Complex

Scott Horton is the director of the Libertarian Institute, editorial director of Antiwar.com, host of The Scott Horton Show, co-host of Provoked, and for the past three decades a staunch critic of U.S. military interventionism.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep478-sc
See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://alliocapital.com/
Hampton: Community for high-growth founders and CEOs.

Go to https://joinhampton.com/lex
BetterHelp: Online therapy and counseling.

Go to https://drinkag1.com/lex
OUTLINE:
(00:00) – Introduction
(00:35) – Sponsors, Comments, and Reflections
(09:14) – From the Cold War to …

5 months, 3 weeks назад @ lexfridman.com
#477 – Keyu Jin: China’s Economy, Tariffs, Trade, Trump, Communism & Capitalism
#477 – Keyu Jin: China’s Economy, Tariffs, Trade, Trump, Communism & Capitalism #477 – Keyu Jin: China’s Economy, Tariffs, Trade, Trump, Communism & Capitalism

Keyu Jin is an economist specializing in China’s economy, international macroeconomics, global trade imbalances, and financial policy.

She is the author of The New China Playbook: Beyond Socialism and Capitalism.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep477-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://alliocapital.com/
UPLIFT Desk: Standing desks and office ergonomics.

Go to https://upliftdesk.com/lex
Hampton: Community for high-growth founders and CEOs.

6 months назад @ lexfridman.com
Microsoft Research Podcast Microsoft Research Podcast
последний пост 2 months, 2 weeks назад
Ideas: Community building, machine learning, and the future of AI
Ideas: Community building, machine learning, and the future of AI Ideas: Community building, machine learning, and the future of AI

This week, machine learning researchers around the world will be attending the annual Conference on Neural Information Processing Systems, or NeurIPS.

In this series, we’ll explore the technologies that are shaping our future and the big ideas that propel them forward.

So around that time when I started my PhD at Penn, I was working in machine learning theory and algorithmic economics.

How had you experienced a lack of community or network of women in machine learning before the founding of WiML?

So particularly when working on topics related to fairness, I’ve ended up focusing a bunch on stuff to do with marginalized groups as part of my responsible AI work.

2 months, 2 weeks назад @ microsoft.com
Ideas: More AI-resilient biosecurity with the Paraphrase Project
Ideas: More AI-resilient biosecurity with the Paraphrase Project Ideas: More AI-resilient biosecurity with the Paraphrase Project

Today, I’m excited to talk about the Paraphrase Project, an effort I co-led exploring how advances in AI tools for protein design might impact biosecurity.

These “patches,” akin to those in cybersecurity, have now been shared with organizations globally to strengthen biosecurity screening.

The project highlights that the same AI tools capable of incredible good can also be misused, requiring us to be vigilant, thoughtful, and creative so we continue to get the most benefit out of AI tools while working to ensure that we avoid costly misuses.

So things like, how similar is this to that template, wild-type protein structure that we used as our conditioning information?

But I feel like broadly…

4 months, 1 week назад @ microsoft.com
Coauthor roundtable: Reflecting on healthcare economics, biomedical research, and medical education
Coauthor roundtable: Reflecting on healthcare economics, biomedical research, and medical education Coauthor roundtable: Reflecting on healthcare economics, biomedical research, and medical education

KOHANE: So I think you’ve “nerd sniped” me because you [LAUGHTER]—which is all too easy—but I think there’s a central issue here.

But I actually think this is dark matter of human organizational technology that is not well understood.

AZEEM AZHAR: We didn’t talk about, you know, AI in its ability to potentially do this, which is to extend the clinician’s presence throughout the week.

And so I think there’s always going to be an opening for either differences of opinion or agreeing with you too much.

And this gets into whether AI is really going to get almost to the ab initio understanding of human biology.

5 months, 3 weeks назад @ microsoft.com
Reimagining healthcare delivery and public health with AI
Reimagining healthcare delivery and public health with AI Reimagining healthcare delivery and public health with AI

6 months, 1 week назад @ microsoft.com
Navigating medical education in the era of generative AI
Navigating medical education in the era of generative AI Navigating medical education in the era of generative AI

Prior to med school, Daniel pursued experiences that cultivated his interest in the application of AI in medical practice and education.

Really, really looking forward to this chat.

There’s AI before ChatGPT and before, you know, generative AI really became a big thing, and then afterwards.

And then after we talk about what’s really happening, what do you think should happen in medical education given the reality of generative AI?

And I do agree [that] AI really gives us real hope that we can make it true.

6 months, 3 weeks назад @ microsoft.com
AI Testing and Evaluation: Reflections
AI Testing and Evaluation: Reflections AI Testing and Evaluation: Reflections

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

We have examples, like the pharmaceutical or medical device industry experts with whom you spoke, that’s really, you know, testing … there is a pre-deployment requirement.

And the third is just how rigid versus adaptive these testing and evaluation regimes or frameworks are in these different domains.

I really agree that there has been a lot of emphasis to date on, sort of, testing models upstream, the AI model evaluation.

You know, I think there’s been real progress already in the AI evaluation and testing ecosystem in the public-private partnership context.

6 months, 4 weeks назад @ microsoft.com
AI Testing and Evaluation: Learnings from cybersecurity
AI Testing and Evaluation: Learnings from cybersecurity AI Testing and Evaluation: Learnings from cybersecurity

Absolutely, I really, really was.

As a principal director on the Microsoft AI Red Team, Tori leads all AI security and safety red team operations, as well as dangerous capability testing, to directly inform C-suite decision-makers.

This year, we’ve pulled a lot of those assets and insights into the Azure [AI] Foundry AI Red Teaming Agent.

So you can get a little taste of what we do day to day in the AI Red Teaming Agent.

WESTERHOFF: I think the most important takeaway from those lessons is that AI security is truly a team sport.

7 months назад @ microsoft.com
How AI will accelerate biomedical research and discovery
How AI will accelerate biomedical research and discovery How AI will accelerate biomedical research and discovery

Dr. Eric Topol is the executive vice president of the biomedical research non-profit Scripps Research, where he founded and now directs the Scripps Research Translational Institute.

Let’s continue our deep dive on AI and biomedical research with this conversation with Noubar Afeyan:LEE: Noubar, thanks so much for joining.

And there’s the origin story of contact with AI, you know, before the emergence of generative AI and afterwards.

What is going on today with respect to AI really being used for something meaningful in the design and development of drugs?

TOPOL: You would read about how, you know, data is the new oil and, you know, gold and whatnot.

7 months, 1 week назад @ microsoft.com
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

During the pre-market phase, medical testing establishes baseline safety and effectiveness metrics through bench testing, performance standards, and clinical studies.

SULLIVAN: So medical devices face a pretty prescriptive multi-level testing path before they hit the market.

We are looking into medical devices, as well, obviously, but also other technologies in advanced medical computing.

So we see Phase 3 trials as something that occurs in the medical devices and pharmaceuticals field.

7 months, 1 week назад @ microsoft.com
AI Testing and Evaluation: Learnings from genome editing
AI Testing and Evaluation: Learnings from genome editing AI Testing and Evaluation: Learnings from genome editing

As generative AI continues to advance, Microsoft has gathered a range of experts—from genome editing to cybersecurity—to share how their fields approach evaluation and risk assessment.

CHARO: Well, you know, genome editing is both very old and very new.

Now the earliest forms of genome editing were very inefficient, and so we didn’t worry that much.

But the bottom-line thing to remember, the way to really think about it is, we don’t regulate genome editing; we regulate the things that use genome editing.

And she said, you know, we don’t regulate genome editing; we regulate the things that use genome editing.

7 months, 2 weeks назад @ microsoft.com
AI Testing and Evaluation: Learnings from Science and Industry
AI Testing and Evaluation: Learnings from Science and Industry AI Testing and Evaluation: Learnings from Science and Industry

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

And I think, really, there are two reasons why tech is so, kind of, representative of that kind of challenge that I’ve always found fascinating.

Continues to be a really important topic in the AI policy conversation right now, I think, for really good reason.

Testing is an important component for governance and AI and, of course, in all of these other domains, as well.

I think about almost, like, in the near to mid-term, like three issues that we need to address in the AI, kind of, policy and testing context.

7 months, 3 weeks назад @ microsoft.com
The AI Revolution in Medicine, Revisited: How AI is reshaping the future of healthcare and medical research
The AI Revolution in Medicine, Revisited: How AI is reshaping the future of healthcare and medical research The AI Revolution in Medicine, Revisited: How AI is reshaping the future of healthcare and medical research

LEE: Yeah, yeah.

It cannot—as, you know, Bill was saying—it cannot learn from your document.

And I don’t know if the two of you remember, but I ended up doing a lot of tests.

I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3-mini.

Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.

8 months, 1 week назад @ microsoft.com
What AI's impact on individuals means for the health workforce and industry
What AI's impact on individuals means for the health workforce and industry What AI's impact on individuals means for the health workforce and industry

So I don’t think we should be surprised that business schools matter on this because we care about management.

That’s really going to change the way, like, middle school works, was my thinking at the time.

We’ve gone from AI being highly discriminative to AI that’s able to explore the world in particular ways.

The symptoms that they’re showing are quite different, and also their compliance is really, really different.

LEE: Yeah, really, really interesting.

8 months, 3 weeks назад @ microsoft.com
Abstracts: Zero-shot models in single-cell biology with Alex Lu
Abstracts: Zero-shot models in single-cell biology with Alex Lu Abstracts: Zero-shot models in single-cell biology with Alex Lu

And single-cell foundation models claim to be capable of unraveling deeper insights than ever before.

Basically, we showed that single-cell foundation models perform worse in settings that are fundamental to biological discovery than much simpler machine learning and statistical methods that were used in the field before single-cell foundation models emerged and are the go-to standard for unpacking meaning from these complicated experiments.

And the way to understand this is because single-cell foundation models are trained in a way that tries to expose these models to millions of single-cells.

But let’s also talk about the impact for methodologists, people who are trying to improve these s…

8 months, 4 weeks назад @ microsoft.com
Abstracts: Aurora with Megan Stanley and Wessel Bruinsma
Abstracts: Aurora with Megan Stanley and Wessel Bruinsma Abstracts: Aurora with Megan Stanley and Wessel Bruinsma

This is such exciting work about environmental forecasting, so we’re happy to have the two of you join us today.

Mostly because AI weather forecasting models are computationally much more efficient and can even be more accurate.

What’s unfortunate though, about this big step forward, is that these developments are mostly limited to the setting of weather forecasting.

Weather forecasting is very important, obviously, but there are many other important environmental forecasting problems out there, such as air pollution forecasting or ocean wave forecasting.

STANLEY: Current approaches have really focused training very specifically on weather forecasting models.

8 months, 4 weeks назад @ microsoft.com
NLP Highlights NLP Highlights
последний пост None
Data Skeptic
последний пост 1 week, 5 days назад
Healthy Friction in Job Recommender Systems
Healthy Friction in Job Recommender Systems Healthy Friction in Job Recommender Systems

In this episode, host Kyle Polich speaks with Roan Schellingerhout, a fourth-year PhD student at Maastricht University, about explainable multi-stakeholder recommender systems for job recruitment. Roan discusses his research on creating AI-powered job matching systems that balance the needs of multiple stakeholders—job seekers, recruiters, HR professionals, and companies. The conversation explores different types of explanations for job recommendations, including textual, bar chart, and graph-based formats, with findings showing that lay users strongly prefer simple textual explanations over more technical visualizations. Roan shares insights from his "healthy friction" study, which tested …
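
A hedged sketch of the "simple textual explanation" idea discussed above: if the job-match score comes from a transparent linear model, the largest feature contributions can be turned directly into a sentence. The feature names and weights below are invented for illustration and are not from Roan's system.

```python
# Toy linear job-match score whose top contributions become a textual explanation.
# Feature names, values, and weights are hypothetical placeholders.
features = {"skill_overlap": 0.8, "distance_penalty": -0.3, "salary_match": 0.6}
weights  = {"skill_overlap": 2.0, "distance_penalty": 1.0, "salary_match": 1.5}

contributions = {f: features[f] * weights[f] for f in features}
score = sum(contributions.values())
top = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)[:2]

explanation = (
    f"Recommended (score {score:.2f}) mainly because "
    + " and ".join(f"{name.replace('_', ' ')} contributed {value:+.2f}"
                   for name, value in top)
    + "."
)
print(explanation)
```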

1 week, 5 days назад @ dataskeptic.com
Fairness in PCA-Based Recommenders
Fairness in PCA-Based Recommenders Fairness in PCA-Based Recommenders

In this episode, we explore the fascinating world of recommender systems and algorithmic fairness with David Liu, Assistant Research Professor at Cornell University's Center for Data Science for Enterprise and Society. David shares insights from his research on how machine learning models can inadvertently create unfairness, particularly for minority and niche user groups, even without any malicious intent. We dive deep into his groundbreaking work on Principal Component Analysis (PCA) and collaborative filtering, examining why these fundamental techniques sometimes fail to serve all users equally. David introduces the concept of "power niche users" - highly active users with specialized in…
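
The effect described here can be reproduced in a few lines: fit a low-rank (PCA-style) approximation to a ratings matrix in which a small niche group has different tastes, and the reconstruction error concentrates on the niche users. The sketch below uses synthetic data invented purely to illustrate the mechanism David discusses, not his experimental setup.

```python
# Rank-1 PCA reconstruction of a ratings matrix dominated by majority taste:
# the small niche group ends up with visibly higher reconstruction error.
import numpy as np

rng = np.random.default_rng(1)
n_major, n_niche, n_items, k = 180, 20, 50, 1

major_taste = rng.normal(size=n_items)
niche_taste = rng.normal(size=n_items)
R = np.vstack([
    np.outer(rng.normal(1, 0.1, n_major), major_taste),   # majority users
    np.outer(rng.normal(1, 0.1, n_niche), niche_taste),   # niche users
]) + rng.normal(0, 0.05, (n_major + n_niche, n_items))

R_centered = R - R.mean(axis=0)
U, S, Vt = np.linalg.svd(R_centered, full_matrices=False)
R_hat = U[:, :k] @ np.diag(S[:k]) @ Vt[:k]                 # rank-k reconstruction

err = ((R_centered - R_hat) ** 2).mean(axis=1)             # per-user error
print("majority users mean error:", float(err[:n_major].mean()))
print("niche users mean error:   ", float(err[n_major:].mean()))
```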

2 weeks, 5 days назад @ dataskeptic.com
Video Recommendations in Industry
Video Recommendations in Industry Video Recommendations in Industry

In this episode, Kyle Polich sits down with Cory Zechmann, a content curator working in streaming television with 16 years of experience running the music blog "Silence Nogood." They explore the intersection of human curation and machine learning in content discovery, discussing the concept of "algatorial" curation—where algorithms and editorial expertise work together. Key topics include the cold start problem, why every metric is just a "proxy metric" for what users actually want, the challenge of filter bubbles, and the importance of balancing familiarity with discovery. Cory shares insights on why TikTok's algorithm works so well (clean data and massive interaction volume), the crucial …

1 month, 2 weeks назад @ dataskeptic.com
Eye Tracking in Recommender Systems
Eye Tracking in Recommender Systems Eye Tracking in Recommender Systems

In this episode, Santiago de Leon takes us deep into the world of eye tracking and its revolutionary applications in recommender systems. As a researcher at the Kempelin Institute and Brno University, Santiago explains the mechanics of eye tracking technology—how it captures gaze data and processes it into fixations and saccades to reveal user browsing patterns. He introduces the groundbreaking RecGaze dataset, the first eye tracking dataset specifically designed for recommender systems research, which opens new possibilities for understanding how users interact with carousel interfaces like Netflix. Through collaboration between psychologists and AI researchers, Santiago's work demonstrate…

1 month, 4 weeks назад @ dataskeptic.com
Cracking the Cold Start Problem
Cracking the Cold Start Problem Cracking the Cold Start Problem

In this episode of Data Skeptic, we dive deep into the technical foundations of building modern recommender systems. Unlike traditional machine learning classification problems where you can simply apply XGBoost to tabular data, recommender systems require sophisticated hybrid approaches that combine multiple techniques. Our guest, Boya Xu, an assistant professor of marketing at Virginia Tech, walks us through a cutting-edge method that integrates three key components: collaborative filtering for dimensionality reduction, embeddings to represent users and items in latent space, and bandit learning to balance exploration and exploitation when deploying new recommendations. Boya shares insigh…
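
A compressed sketch of that hybrid pipeline, with all data and hyperparameters invented: a truncated SVD plays the role of collaborative filtering, its factors serve as user and item embeddings, and an epsilon-greedy rule adds the exploration that bandit learning provides. It illustrates the general recipe, not the guest's actual system.

```python
# Hybrid recommender toy: SVD factorization -> embeddings -> epsilon-greedy choice.
# All numbers are synthetic and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
R = rng.integers(0, 6, size=(100, 40)).astype(float)   # toy user-item ratings

# (1) collaborative filtering via truncated SVD
U, S, Vt = np.linalg.svd(R - R.mean(), full_matrices=False)
k = 8
user_emb = U[:, :k] * S[:k]     # (2) latent user embeddings
item_emb = Vt[:k].T             #     latent item embeddings

def recommend(user_id, epsilon=0.1):
    # (3) epsilon-greedy bandit: mostly exploit the best predicted item,
    # occasionally explore a random one so under-exposed items get feedback.
    scores = item_emb @ user_emb[user_id]
    if rng.random() < epsilon:
        return int(rng.integers(len(scores)))
    return int(np.argmax(scores))

print("recommended item for user 3:", recommend(3))
```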

2 months, 1 week назад @ dataskeptic.com
Designing Recommender Systems for Digital Humanities
Designing Recommender Systems for Digital Humanities Designing Recommender Systems for Digital Humanities

In this episode of Data Skeptic, we explore the fascinating intersection of recommender systems and digital humanities with guest Florian Atzenhofer-Baumgartner, a PhD student at Graz University of Technology. Florian is working on Monasterium.net, Europe's largest online collection of historical charters, containing millions of medieval and early modern documents from across the continent. The conversation delves into why traditional recommender systems fall short in the digital humanities space, where users range from expert historians and genealogists to art historians and linguists, each with unique research needs and information-seeking behaviors. Florian explains the technical challen…

2 months, 3 weeks назад @ dataskeptic.com
DataRec Library for Reproducibility in Recommender Systems
DataRec Library for Reproducibility in Recommender Systems DataRec Library for Reproducibility in Recommender Systems

In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich explores DataRec, a new Python library designed to bring reproducibility and standardization to recommender systems research. Guest Alberto Carlo Mario Mancino, a postdoc researcher from Politecnico di Bari, Italy, discusses the challenges of dataset management in recommendation research—from version control issues to preprocessing inconsistencies—and how DataRec provides automated downloads, checksum verification, and standardized filtering strategies for popular datasets like MovieLens, Last.fm, and Amazon reviews. The conversation covers Alberto's research journey through knowledge graphs, graph-based recommen…
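
DataRec's actual API is not reproduced here; the generic sketch below, with invented file names and an invented expected hash, only illustrates the two ideas named in the episode: verifying a downloaded dataset against a checksum and applying a reproducible k-core filter to the interaction log.

```python
# Generic dataset hygiene sketch (not the DataRec API): checksum verification
# plus an iterative k-core filter over (user, item) interaction pairs.
import hashlib

def verify_checksum(path, expected_sha256):
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if digest != expected_sha256:
        raise ValueError(f"checksum mismatch for {path}")
# verify_checksum("ratings.zip", "<expected sha256>")   # hypothetical usage

def k_core(interactions, k=5):
    """Iteratively drop users and items with fewer than k interactions."""
    data = set(interactions)
    while True:
        users, items = {}, {}
        for u, i in data:
            users[u] = users.get(u, 0) + 1
            items[i] = items.get(i, 0) + 1
        kept = {(u, i) for u, i in data if users[u] >= k and items[i] >= k}
        if kept == data:
            return kept
        data = kept

toy = [(u, i) for u in range(20) for i in range(u % 7)]   # skewed toy log
print(len(k_core(toy, k=3)), "interactions survive the 3-core filter")
```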

3 months назад @ dataskeptic.com
Shilling Attacks on Recommender Systems
Shilling Attacks on Recommender Systems Shilling Attacks on Recommender Systems

In this episode of Data Skeptic's Recommender Systems series, Kyle sits down with Aditya Chichani, a senior machine learning engineer at Walmart, to explore the darker side of recommendation algorithms. The conversation centers on shilling attacks—a form of manipulation where malicious actors create multiple fake profiles to game recommender systems, either to promote specific items or sabotage competitors. Aditya, who researched these attacks during his undergraduate studies at SPIT before completing his master's in computer science with a data science specialization at UC Berkeley, explains how these vulnerabilities emerge particularly in collaborative filtering systems. From promoting a …
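
A toy simulation of a push-style shilling attack makes the mechanism concrete: inject fake profiles that give a target item the maximum rating, and any statistic a naive recommender aggregates over users drifts upward. Everything below is synthetic and illustrative only.

```python
# Push attack toy: fake profiles inflate a target item's aggregate score.
import numpy as np

rng = np.random.default_rng(0)
genuine = rng.integers(1, 6, size=(200, 30)).astype(float)   # honest ratings
target_item = 7
genuine[:, target_item] = rng.integers(1, 3, size=200)       # genuinely unpopular

def item_score(R, item):
    return R[:, item].mean()          # naive popularity signal

print("before attack:", round(item_score(genuine, target_item), 2))

# Inject 50 shill profiles: max rating on the target, random filler elsewhere.
shills = rng.integers(1, 6, size=(50, 30)).astype(float)
shills[:, target_item] = 5.0
attacked = np.vstack([genuine, shills])

print("after attack: ", round(item_score(attacked, target_item), 2))
```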

3 months, 1 week назад @ dataskeptic.com
Music Playlist Recommendations
Music Playlist Recommendations Music Playlist Recommendations

In this episode, Rebecca Salganik, a PhD student at the University of Rochester with a background in vocal performance and composition, discusses her research on fairness in music recommendation systems. She explores three key types of fairness—group, individual, and counterfactual—and examines how algorithms create challenges like popularity bias (favoring mainstream content) and multi-interest bias (underserving users with diverse tastes). Rebecca introduces LARP, her multi-stage multimodal framework for playlist continuation that uses contrastive learning to align text and audio representations, learn song relationships, and create playlist-level embeddings to address the cold start prob…
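
As a rough sketch of the contrastive-alignment idea (not the LARP implementation), the snippet below computes an InfoNCE-style loss that rewards matching text and audio embeddings of the same track for being more similar than mismatched pairs; the embeddings are random stand-ins.

```python
# InfoNCE-style contrastive loss between two modalities, with synthetic embeddings.
import numpy as np

rng = np.random.default_rng(0)
batch, dim, temperature = 8, 16, 0.1
text_emb = rng.normal(size=(batch, dim))
audio_emb = text_emb + 0.1 * rng.normal(size=(batch, dim))   # roughly aligned pairs

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

t, a = l2_normalize(text_emb), l2_normalize(audio_emb)
logits = (t @ a.T) / temperature          # similarity of every text to every audio
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -np.mean(np.diag(log_probs))       # pull matching (diagonal) pairs together
print("contrastive (InfoNCE) loss:", round(float(loss), 4))
```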

3 months, 2 weeks ago @ dataskeptic.com
Bypassing the Popularity Bias
4 months ago @ dataskeptic.com
Sustainable Recommender Systems for Tourism

In this episode, we speak with Ashmi Banerjee, a doctoral candidate at the Technical University of Munich, about her pioneering research on AI-powered recommender systems in tourism. Ashmi illuminates how these systems can address exposure bias while promoting more sustainable tourism practices through innovative approaches to data acquisition and algorithm design. Key highlights include leveraging large language models for synthetic data generation, developing recommendation architectures that balance user satisfaction with environmental concerns, and creating frameworks that distribute tourism more equitably across destinations. Ashmi's insights offer valuable perspectives for both AI res…

4 months, 1 week ago @ dataskeptic.com
Interpretable Real Estate Recommendations

In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich interviews Dr. Kunal Mukherjee, a postdoctoral research associate at Virginia Tech, about the paper "Z-REx: Human-Interpretable GNN Explanations for Real Estate Recommendations". The discussion explores how the post-COVID real estate landscape has created a need for better recommendation systems that can introduce home buyers to emerging neighborhoods they might not know about. Dr. Mukherjee explains how his team developed a graph neural network approach that not only recommends properties but provides human-interpretable explanations for why certain regions are suggested. The conversation covers the advantages o…

4 months, 3 weeks ago @ dataskeptic.com
Why Am I Seeing This?

In this episode of Data Skeptic, we explore the challenges of studying social media recommender systems when exposure data isn't accessible. Our guests Sabrina Guidotti, Gregor Donabauer, and Dimitri Ognibene introduce their innovative "recommender neutral user model" for inferring the influence of opaque algorithms.

5 months, 1 week ago @ dataskeptic.com
Eco-aware GNN Recommenders

In this episode of Data Skeptic, we dive into eco-friendly AI with Antonio Purificato, a PhD student from Sapienza University of Rome. Antonio discusses his research on "EcoAware Graph Neural Networks for Sustainable Recommendations" and explores how we can measure and reduce the environmental impact of recommender systems without sacrificing performance.
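As a rough, hedged aside (not Antonio's methodology), the environmental cost of a recommender training run can be ball-parked from average power draw and wall-clock time; every number in this snippet is an assumption.

# Back-of-the-envelope footprint estimate; every number here is an assumption.
avg_gpu_power_w = 250        # assumed average draw of one GPU, in watts
training_hours = 12          # assumed wall-clock training time
grid_intensity = 0.4         # assumed kg CO2e per kWh (varies a lot by region)

energy_kwh = avg_gpu_power_w * training_hours / 1000
co2_kg = energy_kwh * grid_intensity
print(f"~{energy_kwh:.1f} kWh, ~{co2_kg:.2f} kg CO2e")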

5 months, 2 weeks ago @ dataskeptic.com
Networks and Recommender Systems

Kyle reveals the next season's topic will be "Recommender Systems". Asaf shares insights on how network science contributes to the recommender system field.

6 months ago @ dataskeptic.com
SuperDataScience
last post 1 day, 11 hours ago
966: The Moltbook Phenomenon: OpenClaw Unleashed

Jon Krohn gives Five-Minute Friday listeners all the details about the new social network causing a stir, Moltbook. What makes Moltbook unique is that it is the first network designed just for AI agents. It’s an exclusive club: only its alleged 1.5 million registered agents can post, comment, and upvote, but we can watch this real-world experiment in agent ecology from the sidelines. Listen to the episode to hear the fascinating, if disturbing, story of Moltbook’s swift turn toward facilitating a digital theocracy and other forms of government, and whether this development is a sign of an approaching singularity or rather AI continuing to ape human thought and turn it into slop. Additional mat…

1 day, 11 hours ago @ podtrac.com
965: From PhD Side Project to $500M ARR: Will Falcon’s PyTorch Lightning Story

Lightning AI CEO Will Falcon speaks to podcast host and Lightning AI fellow Jon Krohn about the company’s merger with Voltage Park, and why Will has named it the “full-stack AI neo-cloud for enterprises and frontier labs”. Lightning AI offers a secure, flexible, and collaborative environment that can run in the cloud, all essentials for early-stage startups. Listen to the episode to hear Will Falcon discuss Lightning AI Studio, founding PyTorch Lightning, and how he came to found his AI company. This episode is brought to you by Dell, by Intel, by Fabi, and by Cisco. Additional materials: www.superdatascience.com/965 Interested in spons…

4 days, 11 hours ago @ podtrac.com
SDS 964: In Case You Missed It in January 2026

In this first of the year ICYMI episode, Jon Krohn selects his favorite moments from January’s SuperDataScience interviews. Listen to why incentivizing workers is the best way to get them to disclose their use of AI tools and pave the way for an AI-forward future, how AI continues to mimic human development in its own evolution, the importance of evaluation in building AI systems, and how to keep your best employees (and also: how to know your value) with guests Sadie St. Lawrence, Ashwin Rajeeva, Sinan Ozdemir, Vijoy Pandey, and Ethan Mollick. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/964⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email n…

1 week, 1 day ago @ podtrac.com
963: Reinforcement Learning for Agents, with Amazon AGI Labs’ Antje Barth

Bestselling author and Gen AI instructor Antje Barth talks to Jon Krohn about her work at Amazon’s AGI Labs and their newest product Nova Act, as well as where we will see the most success with AI agents and how AI developers can reap those rewards. This episode is brought to you by Dell, by Intel, by Fabi, and by Cisco. Additional materials: www.superdatascience.com/963 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (01:23) Amazon’s latest product, Nova Act (11:05) How Nova Act tests reliability (24:01) Where Amazon’s 1000s of gen AI …

1 week, 4 days ago @ podtrac.com
962: Wharton Prof Ethan Mollick on Why Your AI Strategy Is Already Obsolete

Bestselling author of Co-Intelligence: Living and Working with AI Ethan Mollick speaks to Jon Krohn about just how much US firms have to gain from a willingness to adopt and experiment with AI, as well as the reality behind AI use among employees and the frontier models set to support them even further. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/962⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.⁠⁠⁠

2 weeks, 1 day ago @ podtrac.com
961: Distributed Artificial Superintelligence, with Dr. Vijoy Pandey

Dr. Vijoy Pandey returns to the show to talk to Jon Krohn about Cisco’s work to advance medicine and mitigate the impact of climate change with distributed artificial superintelligence. Dr. Vijoy Pandey believes in a future where humans and AI agents work together to tackle our biggest challenges. For this to happen, we will need multi-agent systems and open-source platforms that let agents work together, avoiding the phenomenon of AI agents being “isolated geniuses” unable to collaborate. He elaborates on what Cisco is doing to close this gap. This episode is brought to you by Dell, by Intel, by Fabi and by Scaylor. Additional materials: …

2 weeks, 4 days ago @ podtrac.com
960: In Case You Missed It in December 2025

For 2026’s first episode of In Case You Missed It (ICYMI), Jon Krohn selects 6 clips from December for a wide-ranging look at the current state of AI in business and beyond. Hear from Joel Beasley (Episode 945), Jeff Li (Episode 947), Sandy Pentland (Episode 949), Josh Clemm (Episode 951), Penelope LaFeuille (Episode 952), and John Roese (Episode 953) on ensuring your AI systems get adopted and succeed in their goals, interesting ways to use AI for standup comedy routines, and more. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/960⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

3 weeks, 1 day ago @ podtrac.com
959: Building Agents 101: Design Patterns, Evals and Optimization (with Sinan Ozdemir)

AI entrepreneur and bestselling author Sinan Ozdemir speaks to Jon Krohn about the practical differences between agentic AI and AI workflows, why evaluating accuracy on its own won’t tell you enough about AI models, and more about his latest book Building Agentic AI. This episode is brought to you by Dell, by Intel, by Fabi, and by Cisco. Additional materials: www.superdatascience.com/959 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (04:57) Exploring the differences between workflows and agents (17:03) How to work out parameter count for…

3 weeks, 4 days ago @ podtrac.com
958: Without Trusted Context, Agents are Stupid (featuring Salesforce’s Rahul Auradkar)

In this #sponsored Feature Friday episode, Salesforce’s Rahul Auradkar speaks to Jon Krohn about the company’s unified data engine and how its acquisition of Informatica provides the missing context layer for AI models and agents. Hear how Salesforce’s Data 360 helps customers to get accurate and insightful information about their business, and what AI models need to benefit a company’s bottom line (hint: it’s not only large amounts of data!) Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/958⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

4 weeks, 1 day ago @ podtrac.com
957: How AI Agents Are Automating Enterprise Data Operations, with Ashwin Rajeeva

AI agents, data lakes, and managing data sprawl: Ashwin Rajeeva, cofounder and CTO of Acceldata, speaks to Jon Krohn about how the agentic data management startup raised over $100 million in venture capital to expand its business in automating data quality assurance as well as cataloguing and pipeline maintenance across enterprise environments. Acceldata utilizes multiple agents to solve enterprise-grade questions with company data. It also uses autonomous data pipelines that can detect and fix issues without human intervention, while the platform’s agentic data management system, ADM, lets humans stay in the loop wherever needed. This episode is brought to you by Dell, by Intel…

1 month ago @ podtrac.com
956: From Agent Demo to Enterprise Product (with Ease!) feat. Salesforce’s Tyler Carlson

#Sponsored SVP, Head of Product for Salesforce’s AppExchange & Ecosystem, Tyler Carlson, talks to Jon Krohn about taking AI agents from prototype to enterprise-grade production with the Agentforce 360 Platform. Though we may now have plenty of tools to build demos for AI agents, most teams still struggle to turn early prototypes into secure and scalable products. With Salesforce’s Agentforce 360 Platform, users can build customer-focused agentic applications with a multi-LLM planner service for reasoning and logic, as well as a new scripting language for deterministic control over how agents interact with the contextual layer of the Salesforce data model. Learn how Salesforce’s WYSIWYG sche…

1 month ago @ podtrac.com
955: Nested Learning, Spatial Intelligence and the AI Trends of 2026, with Sadie St. Lawrence

Sadie St. Lawrence joins Jon Krohn to discuss what to expect from the AI industry in 2026. Sadie and Jon talk through what they think will be the five biggest trends in AI, hand out awards for the best moments, comebacks, and disappointments in AI in 2025, and review how their predictions for 2025 played out. Hear Sadie’s five exciting predictions for 2026, from emerging jobs in AI to an important return to the drawing board! This episode is brought to you by Dell, by Intel, by Fabi, and by MongoDB. Additional materials: www.superdatascience.com/955 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for spon…

1 month, 1 week ago @ podtrac.com
954: Recap of 2025 and Wishing You a Wonderful 2026

Jon Krohn wraps up 2025 with his thoughts on how agentic AI has become as much a resounding success as an annoying buzzword for many in the tech industry, why such promising developments in generative AI mean that well-prepared, secured data will be ever more crucial, and Jon’s hopes for a better year for everyone across the world in 2026. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/954⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month, 1 week ago @ podtrac.com
953: Beyond “Agent Washing”: AI Systems That Actually Deliver ROI, with Dell’s Global CTO John Roese

Dell Technologies’ John Roese talks to Jon Krohn about the phenomenon of “agent-washing”, his contribution to Dell’s incredible revenue boost in 2025, and why “knowledge layers” will be crucial to future tech. Hear also John’s predictions for where AI is going to lead us in 2026, from better, clearer governance, data management methods and definitions for agentic AI, to systems that keep AI tools and our data running and secure with the help of “AI factories” and “sovereign AI”. This episode is brought to you by MongoDB and by Y Carrot. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/953⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie…

1 month, 2 weeks ago @ podtrac.com
952: How to Avoid Burnout and Get Promoted, with “The Fit Data Scientist” Penelope Lafeuille

“The Fit Data Scientist” newsletter author Pénélope Lafeuille talks to Jon Krohn about how to give your all at work, offering her top tips for a healthy body and a healthy mind. Learn why “The SuperDataScience Podcast” made it onto her top 3 data science podcasts, and why following your passion can pay off in dividends for your career. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/952⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month, 2 weeks ago @ podtrac.com
Data Science at Home
last post 1 week, 4 days ago
What is wrong with reinforcement learning? (Ep. 82)

Join the discussion on our Discord server. After reinforcement learning agents have done great at playing Atari video games, mastering Go (AlphaGo), trading financial assets, and modeling language, let me tell you the real story here. In this episode I want to shine some light on reinforcement learning (RL) and the limitations that every practitioner should consider before taking certain directions.

RL seems to work so well!

What is wrong with it?

Are you a listener of Data Science at Home podcast?

Or did you subscribe to the Artificial Intelligence at your fingertips newsletter?

1 week, 4 days ago @ datascienceathome.com
How to generate very large images with GANs (Ep. 76)

Join the discussion on our Discord server. In this episode I explain how a research group from the University of Lübeck overcame the curse of dimensionality for the generation of large medical images with GANs.

The problem is not as trivial as it seems.

Many researchers have failed in generating large images with GANs before.

One interesting application of such an approach is in medicine, for the generation of CT and X-ray images. Enjoy the show!

References: Multi-scale GANs for Memory-efficient Generation of High Resolution Medical Images, https://arxiv.org/abs/1907.01376
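The structural idea behind the paper above can be sketched as follows: generate the whole image at low resolution first, then refine it patch by patch, so that only one full-resolution patch lives in memory at a time. The stand-in generators below are placeholders, not the paper's networks; all shapes are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for trained networks; in the paper these are convolutional generators.
def low_res_generator(z):
    # Produce a coarse global image (here 64x64) from a latent code.
    return rng.normal(size=(64, 64))

def patch_refiner(coarse_patch):
    # Refine one coarse patch to high resolution (here, naive 8x upsampling).
    return np.kron(coarse_patch, np.ones((8, 8)))

def generate_large_image(z, patch=16):
    coarse = low_res_generator(z)              # whole image, but at low resolution
    h, w = coarse.shape
    out = np.zeros((h * 8, w * 8))
    # Only one patch is ever materialised at full resolution at a time,
    # which is where the memory savings of a multi-scale approach come from.
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            out[i * 8:(i + patch) * 8, j * 8:(j + patch) * 8] = patch_refiner(
                coarse[i:i + patch, j:j + patch])
    return out

img = generate_large_image(z=rng.normal(size=128))
print("generated image shape:", img.shape)     # (512, 512)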

1 week, 4 days ago @ datascienceathome.com
Training neural networks faster without GPU [RB] (Ep. 77)

Join the discussion on our Discord server. Training neural networks faster usually involves the use of powerful GPUs.

In this episode I explain an interesting method from a group of researchers at Google Brain, who train neural networks faster by squeezing more out of the available hardware and packing more optimizer steps into the training pipeline.

Enjoy the show!

References: Faster Neural Network Training with Data Echoing, https://arxiv.org/abs/1907.05550
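A minimal sketch of the data-echoing idea from the paper above: every batch produced by a slow input pipeline is reused for several optimizer steps, so the accelerator is never left waiting for data. The echo factor, toy linear model, and learning rate are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(10)                     # toy linear model
lr, echo_factor = 0.01, 3            # reuse every batch for 3 SGD steps

def slow_input_pipeline(n_batches=100, batch_size=32):
    # Stand-in for an expensive read/decode/augment pipeline.
    for _ in range(n_batches):
        X = rng.normal(size=(batch_size, 10))
        y = X @ np.arange(10) + 0.1 * rng.normal(size=batch_size)
        yield X, y

for X, y in slow_input_pipeline():
    for _ in range(echo_factor):     # the "echo": repeat the update on the same batch
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad

print("learned weights (first 3):", np.round(w[:3], 2))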

1 week, 4 days ago @ datascienceathome.com
More powerful deep learning with transformers (Ep. 84) (Rebroadcast)

Some of the most powerful NLP models like BERT and GPT-2 have one thing in common: they all use the transformer architecture.

Such an architecture is built on top of another important concept already known to the community: self-attention. In this episode I explain what these mechanisms are, how they work, and why they are so powerful.
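A minimal NumPy sketch of the mechanism discussed in the episode, single-head scaled dot-product self-attention, with illustrative shapes; real transformer blocks add multiple heads, masking, residual connections, and layer normalization.

import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention over a sequence X (seq_len x d_model).
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])            # how strongly each token attends to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the sequence dimension
    return weights @ V                                 # each output is a weighted mix of all value vectors

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 5, 16, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
print("attention output shape:", self_attention(X, Wq, Wk, Wv).shape)   # (5, 8)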

Don’t forget to subscribe to our Newsletter or join the discussion on our Discord server.

1 week, 4 days ago @ datascienceathome.com
Your Favorite AI Startup is Probably Bullshit (Ep. 298) [RB]

The brutal truth about why Silicon Valley is blowing billions on glorified autocomplete while pretending it’s the next iPhone.

We’re diving deep into the AI investment circus where VCs who can’t code are funding companies that barely understand their own technology.

From blockchain déjà vu to the “ChatGPT wrapper” economy—this episode will make you question every AI valuation you’ve ever seen.

Fair warning: We’re naming names and calling out the hype.

Don’t listen if you work at a “revolutionary AI startup” that’s just OpenAI’s API with a pretty interface.

1 week, 4 days ago @ datascienceathome.com
Why AI Researchers Are Suddenly Obsessed With Whirlpools (Ep. 297) [RB]

VortexNet uses actual whirlpools to build neural networks.

By borrowing equations from fluid dynamics, this new architecture might solve deep learning’s toughest problems—from vanishing gradients to long-range dependencies.

Today we explain how vortex shedding, the Strouhal number, and turbulent flows might change everything in AI.
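For listeners unfamiliar with the term: the Strouhal number is the standard dimensionless quantity St = f·L / U, where f is the vortex-shedding frequency, L a characteristic length, and U the flow velocity; how VortexNet maps this onto neural network dynamics is covered in the episode rather than here.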

Sponsors: This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics and AI methods practical and accessible.

Join thousands of researchers and professionals who’ve advanced their careers with Statistical Horizons.

1 week, 4 days ago @ datascienceathome.com
AGI: The Dream We Should Never Reach (Ep. 296)

Also on YouTube. Two AI experts who actually love the technology explain why chasing AGI might be the worst thing for AI’s future—and why the current hype cycle could kill the field we’re trying to save.

Head to datascienceathome.com for detailed show notes, code examples, and exclusive deep-dives into the papers we discuss.

Subscribe to our newsletter for weekly breakdowns of cutting-edge research delivered straight to your inbox—no fluff, just science!

Our Discord community is full of ML engineers, researchers, and AI enthusiasts discussing papers, sharing projects, and helping each other level up.

Whether you’re debugging your first neural net or training your tenth transformer, there’s a …

1 week, 4 days ago @ datascienceathome.com
When Data Stops Being Code and Starts Being Conversation (Ep. 297)

Mark Brocato built Mockaroo—the tool that taught millions of developers how to fake data.

Now, as Head of Engineering at Tonic.ai, he’s building the AI agent that’s making his own creation obsolete.

From the hidden failures of legacy mocks to the security implications of agent-driven synthesis, Mark reveals what happens when data generation becomes a conversation—not a pipeline.

Sponsors: Tonic.ai, synthetic data solutions for software and AI development.

Accelerate engineering velocity and ensure compliance with AI-powered data synthesis. This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics…

1 month, 3 weeks ago @ datascienceathome.com
Your AI Strategy is Burning Money: Here’s How to Fix It (Ep.295)

Most companies don’t have an AI problem.

In this conversation, our guest breaks down when AI actually makes sense, where AWS costs spiral out of control, and why your “cool demo” keeps dying before launch.

If you’re tired of AI hype and ready for straight answers, hit play.

Our Discord community is full of ML engineers, researchers, and AI enthusiasts discussing papers, sharing projects, and helping each other level up.

Whether you’re debugging your first neural net or training your tenth transformer, there’s a place for you.

2 months, 2 weeks ago @ datascienceathome.com
From Tokens to Vectors: The Efficiency Hack That Could Save AI (Ep. 294)

LLMs generate text painfully slowly, one low-info token at a time.

Researchers just figured out how to compress 4 tokens into smart vectors & cut costs by 44%—with full code & proofs!
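The method's details are not in these notes, but the core intuition, trading many low-information tokens for fewer, denser vectors, can be sketched by grouping consecutive token embeddings before they reach the expensive autoregressive model; the group size of 4 matches the summary, everything else is an illustrative assumption.

import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, group = 64, 32, 4      # compress every 4 consecutive tokens into one vector

token_emb = rng.normal(size=(seq_len, d_model))

# One very simple compressor: mean-pool each group of 4 consecutive token embeddings.
compressed = token_emb.reshape(seq_len // group, group, d_model).mean(axis=1)

print("tokens in:", token_emb.shape[0], "vectors out:", compressed.shape[0])
# A decoder that runs over 16 vectors instead of 64 tokens performs ~4x fewer
# sequential steps, which is where savings of this kind come from.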

🔥📊 Sponsors: This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics and AI methods practical and accessible.

Join thousands of researchers and professionals who’ve advanced their careers with Statistical Horizons.

Get $200 off any seminar with code DATA25 at https://statisticalhorizons.com

3 months ago @ datascienceathome.com
Why AI Researchers Are Suddenly Obsessed With Whirlpools (Ep. 293)

VortexNet uses actual whirlpools to build neural networks.

By borrowing equations from fluid dynamics, this new architecture might solve deep learning’s toughest problems—from vanishing gradients to long-range dependencies.

Today we explain how vortex shedding, the Strouhal number, and turbulent flows might change everything in AI.

Sponsors: This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics and AI methods practical and accessible.

Join thousands of researchers and professionals who’ve advanced their careers with Statistical Horizons.

3 months, 2 weeks ago @ datascienceathome.com
The Scientists Growing Living Computers in Swiss Labs (Ep. 292)

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

With a focus on dual-use innovation, Amethix is shaping a future where intelligent machines extend human capability, not replace it.

Discover more at https://amethix.com This episode is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

Learn more at intrepid.ai
References:
Website: finalspark.com
Discord account: / discord
Newsletter: https://finalspark.com/#newsletter
Topics: Biological computing • Neural engineering • Energy-effic…

3 months, 3 weeks ago @ datascienceathome.com
When AI Hears Thunder But Misses the Fear (Ep. 291)

Sanjoy Chowdhury reveals AI’s hidden weakness: while systems can see objects and hear sounds perfectly, they can’t reason across senses like humans do.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at https://amethix.com. This episode is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

Whether it’s in the sky, on the ground, or in orbit—if it’s intelligent and mobile, Intrepid helps you build it.

4 months, 1 week ago @ datascienceathome.com
Why VCs Are Funding $100M Remote Control Toys (Ep. 290)

References:
War On The Rocks: https://warontherocks.com/2025/08/ukraine-isnt-the-model-for-winning-the-innovation-war/
LinkedIn: https://www.linkedin.com/in/jonasrsinger/
Spotify: https://tr.ee/Omy_1X8k1U
Apple Podcast: https://podcasts.apple.com/us/podcast/defence-innovation-podcast/id1797131332
YouTube: https://youtube.com/@DefenceInnovationpodcast?si=cu2WlnVgL5XKnM0p
Sponsors: This episode is proudly sponsored by Amethix Technologies.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at https://amethix.com. This episode is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers…

5 months ago @ datascienceathome.com
How Hacker Culture Died (Ep. 289)

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at amethix.com. DSH is brought to you by Intrepid AI.

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
Discord Channel: https://discord.gg/4UNKGf3
NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Send us mail at: [email protected] Don’t forget to like, subscribe, and hit the 🔔 for…

5 months, 2 weeks ago @ datascienceathome.com