Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
latest post 1 hour ago
[R] Graph-Oriented Generation (GOG): Replacing Vector R.A.G. for Codebases with Deterministic AST Traversal (70% Average Token Reduction)
1 hour ago @ reddit.com
[D] ISBI 2026 in London
7 hours ago @ reddit.com
[R] Functional regularization: where do I start?
7 hours ago @ reddit.com
[R] V5 Update: Original post title ... I built a language model where tokens are complex numbers and "meaning" emerges from wave interference -- no attention, O(n), 178M params, open-sourcing today (V4) Research
10 hours ago @ reddit.com
[Project] Extracting vector geometry (SVG/DXF/STL) from photos + experimental hand-drawn sketch extraction
11 hours ago @ reddit.com
[P] Domain specific LoRA fine tuning on consumer hardware
11 hours ago @ reddit.com
[R] Low-effort papers
11 hours ago @ reddit.com
[D] Two college students built a prototype that tries to detect contradictions between research papers — curious if this would actually be useful
11 hours ago @ reddit.com
[D] Unpopular opinion: "context window size" is a red herring if you don’t control what goes in it.
13 hours ago @ reddit.com
[D] ECCV submission flowed over page limit by 5 lines at the last minute.. how screwed are we?
14 hours ago @ reddit.com
[R] ML/AI industry in Germany
18 hours ago @ reddit.com
[D] What MCP servers are actually useful in your AI agent workflows? Looking for gaps in the ecosystem
18 hours ago @ reddit.com
[P] On-device speech toolkit for Apple Silicon — ASR, TTS, diarization, speech-to-speech, all in native Swift
20 hours ago @ reddit.com
[R] Anyone experimenting with heterogeneous (different base LLMs) multi-agent systems for open-ended scientific reasoning or hypothesis generation?
22 hours ago @ reddit.com
[R] MICCAI 2026 Early Decisions

1 day, 2 hours ago @ reddit.com
Towards Data Science
latest post 9 hours ago
What Makes Quantum Machine Learning “Quantum”?

One of the promising, yet still not fully understood uses of quantum computers is quantum machine learning.

So even I, someone working on and researching quantum computers, was very confused at first… I bet a lot of people’s first question when they hear “Quantum Machine Learning” is what, exactly, makes quantum machine learning quantum?

Quantum Mechanics Enters: Quantum machine learning becomes quantum when quantum information is the computational substrate.

In contrast, quantum machine learning uses quantum states, which are complex vectors that follow the rules of quantum mechanics.

Many things called “quantum machine learning” today are quantum in name only, for example: Classical ML algori…

9 hours ago @ towardsdatascience.com
The Data Team’s Survival Guide for the Next Era of Data

This puts data teams in a precarious spot.

If data teams want to survive this shift, we need to stop building like it’s the peak of the dbt gold rush.

In 2026, the most successful data teams won’t be the ones with the most complex architectures; they’ll be the ones who realized their cloud data platform has quietly eaten 70% of their specialized tooling.

It is no longer enough to “ask” software engineers for better data; data teams need the engineering fluency to actively collaborate with product teams and build data-literate systems from day one.


13 hours ago @ towardsdatascience.com
The Black Box Problem: Why AI-Generated Code Stops Being Maintainable

A developer opens a module that was generated in a single AI session.

What Makes AI-Generated Code a Black Box: AI-generated code isn’t bad code.

Practical Implications: For code you’re generating now, treat every AI generation as a boundary decision.

The highest-risk code isn’t code that doesn’t work, it’s code that works but can’t be maintained.

For more on composable architecture and structured AI generation, visit bit.dev.

15 hours ago @ towardsdatascience.com
How to Create Production-Ready Code with Claude Code

Thus, you need to apply specific techniques to make sure that the code you generate with Claude Code is production-ready.

In this article, I’ll discuss how to make sure the code Claude Code generates is production-ready and protects the business logic of the developed application.

I’ll describe how to create production-ready code with Claude Code: first by establishing initial robustness, and then by iterating on the code, ultimately reaching production quality through autonomous coding agents.

Why generate code with Claude Code: First of all, we need to discuss why you should generate code with coding agents such as Claude Code.

Conclusion: In this article, I’ve discussed …

16 hours ago @ towardsdatascience.com
AI in Multiple GPUs: ZeRO & FSDP

In vanilla DDP, every GPU holds a complete copy of the model parameters, gradients, and optimizer states.

Image by author: model, gradients, and optimizer states are redundant across GPUs in regular DDP. ZeRO (Zero Redundancy Optimizer) solves this.

There are three levels: ZeRO-1 partitions only optimizer states; ZeRO-2 partitions optimizer states + gradients; ZeRO-3 partitions optimizer states + gradients + model parameters. ZeRO isn’t a parallelism technique, because all GPUs still run the same forward and backward passes.

ZeRO-1: Optimizer State Partitioning. In ZeRO-1, only the optimizer states are partitioned.

Here’s how: As the param…
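The three levels above can be sketched as back-of-envelope memory accounting (a minimal illustration, assuming mixed-precision Adam as in the ZeRO paper: 2 bytes/param for fp16 weights, 2 for fp16 gradients, and 12 for fp32 optimizer state; `zero_memory_per_gpu` is a hypothetical helper, not a real API):

```python
def zero_memory_per_gpu(num_params, num_gpus, stage):
    """Approximate training-state bytes per GPU under each ZeRO stage,
    assuming mixed-precision Adam: 2 B/param fp16 weights, 2 B/param
    fp16 gradients, 12 B/param optimizer state (fp32 master weights,
    momentum, variance)."""
    params, grads, opt = 2 * num_params, 2 * num_params, 12 * num_params
    if stage == 0:   # vanilla DDP: everything replicated on every GPU
        return params + grads + opt
    if stage == 1:   # ZeRO-1: shard only the optimizer states
        return params + grads + opt / num_gpus
    if stage == 2:   # ZeRO-2: shard optimizer states + gradients
        return params + (grads + opt) / num_gpus
    if stage == 3:   # ZeRO-3: shard states, gradients, and parameters
        return (params + grads + opt) / num_gpus
    raise ValueError(f"unknown ZeRO stage: {stage}")

# 7B parameters on 64 GPUs: replication vs. the three ZeRO stages
for stage in range(4):
    gb = zero_memory_per_gpu(7e9, 64, stage) / 1e9
    print(f"stage {stage}: {gb:.1f} GB per GPU")
```

Activations and communication buffers are excluded; the point is only that each stage divides one more term by the GPU count.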

1 day, 8 hours ago @ towardsdatascience.com
How Human Work Will Remain Valuable in an AI World

Beyer calls this “scar tissue”: the knowledge that only the real world can give you, through friction, in real time.

But it cannot generate the surprises of the real world.

What I do believe is that we’ll enter a mode of work assisted by AI systems and agents, making our work potentially far more efficient.

The optimistic AI scenario doesn’t require nominal wages to rise; it requires service prices to fall faster than income.

“What If AI Doesn’t Actually End The World?” (2026), x.com/KobeissiLetter; Penrose, Roger.

1 day, 16 hours ago @ towardsdatascience.com
5 Ways to Implement Variable Discretization

Understanding variable discretization is essential for data science students building strong ML foundations and AI engineers designing interpretable systems.

When I experimented with variable discretization methods, I noticed how certain ML models became more stable and interpretable.

What is variable discretization?

Disadvantages of variable discretization: The main disadvantage of variable discretization is the loss of information that occurs due to the creation of bins.

Types of variable discretization: We will discuss the following types of variable discretization.
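Two of the most common types can be sketched in a few lines (an illustrative comparison of equal-width vs. equal-frequency binning; the data values are made up):

```python
import numpy as np

# Hypothetical skewed feature values to discretize
x = np.array([3, 7, 12, 18, 25, 31, 46, 80])

# Equal-width: 3 bins, each spanning the same range of values
width_edges = np.linspace(x.min(), x.max(), 4)
width_bins = np.digitize(x, width_edges[1:-1])

# Equal-frequency: 3 bins, each holding roughly the same number of points
freq_edges = np.quantile(x, [1/3, 2/3])
freq_bins = np.digitize(x, freq_edges)

print(width_bins)  # skew pushes most points into the lowest bin
print(freq_bins)   # points spread evenly across the bins
```

The contrast illustrates the information-loss trade-off: equal-width bins preserve the value scale but can leave bins nearly empty, while quantile bins balance counts at the cost of unequal ranges.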

2 days, 7 hours ago @ towardsdatascience.com
Stop Tuning Hyperparameters. Start Tuning Your Problem.

80% of ML projects fail from bad problem framing, not bad models. A 5-step protocol to define the right problem before you write training code.

The post Stop Tuning Hyperparameters. Start Tuning Your Problem. appeared first on Towards Data Science.

2 days, 9 hours ago @ towardsdatascience.com
Escaping the Prototype Mirage: Why Enterprise AI Stalls

Once the engineers return to their day jobs, the prototype is abandoned and it begins to decay, just like unmaintained software.

A Healthcare Example: Let’s say we have a Patient Intake Agent designed to triage patients, verify insurance, and schedule appointments.

The key symptoms include: Unknown Reliability: Most agents fall short of the strict Service Level Agreements (SLAs) enterprise use demands.

Reliability: Moving from “Vibes” to Golden Datasets and LLM-as-a-Judge (so our Patient Intake Agent can be continuously tested against thousands of simulated complex patient histories).

2 days, 15 hours ago @ towardsdatascience.com
RAG with Hybrid Search: How Does Keyword Search Work?

Then, by combining the results of the similarity search and the BM25 search, we can essentially get the best of both worlds and significantly improve the results of our RAG pipeline.

In information retrieval systems, BM25 is a ranking function used to evaluate how relevant a document is to a search query.

The word ‘travel’ is not informative at all, since it is included in all documents of the knowledge base.

Thus, by incorporating a keyword search, on top of the similarity search in the RAG pipeline, we can identify relevant chunks more effectively and comprehensively.

Notice how now we match the user’s query both with the embeddings in the vector store (semantic search) and the BM25 inde…
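The BM25 ranking the excerpt describes can be sketched in a few lines (an illustrative Okapi-style implementation with toy documents, not the article's code; the constants k1 and b use common defaults, and the idf smoothing is one standard variant):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against the query with a BM25 ranking function."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Inverse document frequency: rare terms are more informative
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5)) for t in df}
    scores = []
    for d in docs:
        tf = Counter(d)
        s = sum(
            idf.get(t, 0) * tf[t] * (k1 + 1) /
            (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
            for t in query
        )
        scores.append(s)
    return scores

docs = [
    "travel tips for cheap flights".split(),
    "travel insurance for long trips".split(),
    "best hiking trails travel guide".split(),
]
# 'travel' appears in every doc, so its idf (and contribution) is tiny;
# 'cheap' and 'flights' discriminate, so the first doc ranks highest
print(bm25_scores("cheap flights travel".split(), docs))
```

This mirrors the point about uninformative words: a term present in every document gets near-zero idf and barely moves the ranking.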

2 days, 16 hours ago @ towardsdatascience.com
Graph Coloring You Can See

Articles in this forum (here, here and here) have previously considered graph coloring, focusing mostly on constructive heuristics for the node coloring problem.

import networkx as nx
import matplotlib.pyplot as plt
import gcol

G = nx.dodecahedral_graph()

# Generate and display a node coloring
c = gcol.node_coloring(G)
nx.draw_networkx(G, node_color=gcol.get_node_colors(G, c))
plt.show()

# Generate and display an edge coloring
c = gcol.edge_coloring(G)
nx.draw_networkx(G, edge_color=gcol.get_edge_colors(G, c))
plt.show()

# Generate node positions and then a face coloring
pos = nx.planar_layout(G)
c = gcol.face_coloring(G, pos)
gcol.draw_face_coloring(c, pos)
nx.draw_networkx(G, pos)
plt.sho…

3 days, 10 hours ago @ towardsdatascience.com
Why You Should Stop Writing Loops in Pandas

When I first started using Pandas, I wrote loops like this all the time:

for i in range(len(df)):
    if df.loc[i, "sales"] > 1000:
        df.loc[i, "tier"] = "high"
    else:
        df.loc[i, "tier"] = "low"

It worked.

More importantly, it keeps you thinking like a beginner — row by row — instead of like a professional Pandas user.

So instead of looping through rows, I tried expressing the rule at the column level:“If sales > 1000, label high.

Here’s the vectorized version:

import numpy as np
import time

start = time.time()
df_big["tier"] = np.where(df_big["sales"] > 1000, "high", "low")
end = time.time()
print("Vectorized time:", end - start)

And the result?

So naturally, I started writing things like this: df["t…
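The loop-vs-vectorized contrast can be made self-contained (a minimal sketch with made-up sales data standing in for the article's df; both columns here are hypothetical names):

```python
import pandas as pd
import numpy as np

# Small stand-in for the article's df (hypothetical sales data)
df = pd.DataFrame({"sales": [250, 1500, 990, 3000]})

# Row-by-row loop: the beginner pattern the article warns against
for i in range(len(df)):
    df.loc[i, "tier_loop"] = "high" if df.loc[i, "sales"] > 1000 else "low"

# Vectorized equivalent: one columnar expression, no Python-level loop
df["tier_vec"] = np.where(df["sales"] > 1000, "high", "low")

print((df["tier_loop"] == df["tier_vec"]).all())  # True: same labels
```

On a few rows the two are indistinguishable; on millions of rows the vectorized form avoids per-row Python overhead entirely.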

3 days, 13 hours ago @ towardsdatascience.com
I Quit My $130,000 ML Engineer Job After Learning 4 Lessons

So, in this article, I want to go over why I eventually quit my machine learning engineer job, and offer an alternative view to what these “dream” jobs are actually like.

I didn’t feel I learned much from this, and that’s not what I want at this point in my career.

Small Scope: There were around 100 machine learning engineers across the company, and around five times that number across the whole data, machine learning and science organisation.

The easy route was for me to stay, eventually earn a promotion to senior machine learning engineer, and have a comfortable, well-paying job for the next decade.

Join my free newsletter where I share weekly tips, insights, and advice from my experience as a…

3 days, 15 hours ago @ towardsdatascience.com
Agentic RAG vs Classic RAG: From a Pipeline to a Control Loop

Classic RAG: the pipeline mental model. Classic RAG is straightforward to understand because it follows a linear process.

Where classic RAG hits the wall: Classic RAG is a “one-shot” approach.

Agentic RAG: from retrieval to a control loop. Agentic RAG retains the retriever and generator components but changes the control structure.

Classic RAG with citations, careful retrieval, escalation. High Complexity: Classic RAG + second pass on failure signals (only loop when needed).

Conclusion: If your product primarily involves document lookup, extraction, or rapid assistance, classic RAG is typically the best default.
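The pipeline-vs-loop distinction can be sketched with stub components (every function here is a hypothetical stand-in for a real retriever, generator, and groundedness check, not an actual API):

```python
# Hypothetical stubs standing in for a real retriever / generator / verifier
def retrieve(query, attempt):
    corpus = {0: "partial context", 1: "full context with the answer"}
    return corpus[min(attempt, 1)]

def generate(query, context):
    return f"answer from [{context}]"

def looks_grounded(answer):
    return "full context" in answer  # failure signal: answer lacks support

def classic_rag(query):
    # One-shot pipeline: retrieve once, generate once, return
    return generate(query, retrieve(query, 0))

def agentic_rag(query, max_loops=3):
    # Control loop: re-retrieve only when the check fires
    for attempt in range(max_loops):
        answer = generate(query, retrieve(query, attempt))
        if looks_grounded(answer):
            return answer
    return answer

print(classic_rag("q"))   # stops after the first, unsupported answer
print(agentic_rag("q"))   # loops once more and returns the grounded answer
```

The only structural change is the loop plus a failure signal; retriever and generator are untouched, which is the article's point.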

3 days, 16 hours ago @ towardsdatascience.com
YOLOv3 Paper Walkthrough: Even Better, But Not That Much

x = self.conv0(x)
print(f'after conv0\t\t: {x.size()}')
x = self.conv1(x)
print(f'after conv1\t\t: {x.size()}')
x = self.conv2(x)
print(f'after conv2\t\t: {x.size()}')
x = self.conv3(x)
print(f'after conv3\t\t: {x.size()}')
x = self.conv4(x)
print(f'after conv4\t\t: {x.size()}')
large_obj = self.detection_head_large_obj(x)
print(f'large object detection\t: {large_obj.size()}')

###############################################
# Flow to 26x26 detection head.
x = self.conv5(x)
print(f'after conv5\t\t: {x.size()}')
x = self.upsample0(x)
print(f'after upsample0\t\t: {x.size()}')
x = torch.cat([x, branch1], dim=1)
print(f'after concatenate\t: {x.size()}')
x = self.conv6(x)
print(f'after conv6\t\t:…

4 days, 6 hours ago @ towardsdatascience.com
Distill.pub
latest post: none
The Gradient
latest post 2 weeks, 2 days ago
The Reasonable Effectiveness of Virtue Ethics in AI Alignment

I say she tries to be mathematically excellent, which involves promoting mathematical excellence through mathematical excellence, and that this structure is closely related to why 'mathematical excellence' can even be a concept.

This is not to suggest that ‘good mathematics causes future good mathematics’ is a full definition or even full informal description of good mathematics.

My claim is only that the fact that good mathematics has a disposition to cause future good mathematics reveals something essential about our concept of good mathematics (and about the material affordances enabling this concept).

‘mathematical excellence through mathematical excellence’ AI?

I believe ‘mathematical …

2 weeks, 2 days ago @ thegradient.pub
AGI Is Not Multimodal

Despite this, scale maximalists have implicitly suggested that multimodal models can be a structure-agnostic framework for AGI.

While structure-agnostic scale maximalism has succeeded in producing LLMs and LVMs that pass Turing tests, a multimodal scale maximalist approach to AGI will not bear similar fruit.

Citation: For attribution in academic contexts or books, please cite this work as: Benjamin A. Spiegel, "AGI Is Not Multimodal", The Gradient, 2025.

@article{spiegel2025agi,
  author = {Benjamin A. Spiegel},
  title = {AGI Is Not Multimodal},
  journal = {The Gradient},
  year = {2025},
  howpublished = {\url{https://thegradient.pub/agi-is-not-multimodal}},
}

References: Andreas, Jacob.

“Language Models,…

9 months ago @ thegradient.pub
TheSequence
latest post 1 day, 16 hours ago
The Sequence Opinion #819: How AI Chips are Made?

From Silicon to Intelligence: What Actually Goes Into Designing an AI Chip. A deep dive into the architecture, physics, and craft of building hardware for neural networks, from RTL and Verilog to tape-out.

Let me tell you something that took me an embarrassingly long time to truly internalize: the reason deep learning works as well as it does in 2026 is only maybe 40% algorithms.

We got lucky — spectacularly lucky — that the GPU, a chip originally designed to make triangles pretty in Quake III, turned out to be almost exactly the right computational substrate for training neural networks.

But “almost exactly right” is doing a lot of heavy lifting in that sentence.

The story of AI chips is the s…

1 day, 16 hours ago @ thesequence.substack.com
The Sequence AI of the Week #818: You Cannot Miss Qwen 3.5

They launched the Qwen 3.5 series.

And just yesterday, in a move that signals their ambition to own the entire deployment stack, they released the Qwen 3.5 “Small” series—ranging from 0.8B to 9B parameters, aimed squarely at on-device edge computing.

The Qwen 3.5 family isn’t just bigger; it’s structurally different.

Let’s dive into the technical contributions that make Qwen 3.5 one of the most fascinating engineering feats in the open-weight ecosystem right now.

The Wall: Why Dense Models Don’t Scale Economically

2 days, 16 hours ago @ thesequence.substack.com
The Sequence Knowledge #817: DeepMind Genie and Interactive World Models

Today we will discuss: the core principles of actionable world models.

A deep dive into the amazing DeepMind Genie models.

💡 AI Concept of the Day: Actionable World Models and DeepMind’s Genie. If the “Sora moment” proved that AI could dream a consistent world, the next frontier is proving that AI can control it.

This distinction is the defining battleground of World Models today.

This brings us to the rise of Actionable World Models—systems that transform static video data into interactive, playable environments.

3 days, 16 hours ago @ thesequence.substack.com
The Sequence Radar #816: Last Week in AI: $110B Bets, Nano Banana 2, and the New Economic Reality

Valued at $840 billion and backed by a powerful coalition including Amazon, SoftBank, and Nvidia, this capital injection underscores the immense resources required to build the sovereign-level infrastructure of the next AI generation.

While OpenAI secures the bank, Google is focused on the creative edge with the launch of Nano Banana 2 (technically Gemini 3.1 Flash Image).

Nano Banana 2 introduces advanced features like real-time web grounding, which allows it to pull information from Google Search to accurately render specific landmarks, people, or products.

The research establishes scaling laws for multimodal discrete diffusion and demonstrates strong cross-modal generation capabilities a…

5 days, 16 hours ago @ thesequence.substack.com
The Sequence Opinion #815: The End of RLHF? The Rise of Verifiable Rewards

Today, we are exploring a massive architectural shift in how artificial intelligence models are trained, moving away from human bottlenecks and toward autonomous, verifiable reasoning.

Here is a full breakdown of the transition to what many are calling one of the most influential post-training techniques in frontier AI models.

The prevailing method—pretraining on massive internet corpora followed by Reinforcement Learning from Human Feedback (RLHF)—produces fast, intuitive pattern matchers.

Ultimately, RLHF scales only as fast as human bodies can work, meaning it fundamentally does not scale.

The Shift to Reinforcement Learning with Verifiable Rewards (RLVR)

1 week ago @ thesequence.substack.com
The Sequence Chat #814: Z.ai's Zixuan Li Talks About GLM

I’m Zixuan Li, Head of Z.ai Global Ecosystem, responsible for Z.ai Chat, Z.ai API services, global partnerships, and global branding.

The Main Sequence: Let’s start with the genesis of the General Language Model (GLM).

The GLM model architecture today has evolved significantly from its previous versions.

Shape standards: If we’re fortunate enough, we might help set some norms and standards for open models.

This means we not only serve customers on our own API platform but also assist other providers in deploying GLM models correctly and efficiently.

1 week, 1 day ago @ thesequence.substack.com
The Sequence AI of the Week #813: Deep Diving Into the Amazing GLM-5

The AI industry is currently undergoing a massive, structural paradigm shift in how we build, train, and interact with large language models.

For the past couple of years, we have been living in the era of “vibe coding”.

In this new paradigm, AI agents don’t just write snippets; they autonomously plan, navigate massive codebases, implement features, run tests, and iteratively fix their own bugs over long time horizons.

Achieving this level of autonomy requires fundamentally rethinking how a model reasons, how it handles immense context windows, and—crucially—how we align it using reinforcement learning.

Z.ai’s GLM-5 is an absolute masterclass in systems engineering that tackles these exact …

1 week, 2 days ago @ thesequence.substack.com
The Sequence Knowledge #812: The Sora Moment: When Video Models Became Physics Engines

Today we will discuss: why video generation models are the new physics engines.

💡 AI Concept of the Day: The Sora Moment: When Video Models Became Physics Engines. For years, computer graphics and artificial intelligence existed in parallel universes.

To get a realistic physics simulation, you needed a game engine with explicitly programmed rules for gravity, collision, and light.

The “Sora Moment” in early 2024 collapsed these two worlds.

When OpenAI released their technical report, they notably did not title it “A Better Video Generator.” They titled it: Video Generation Models as World Simulators.

1 week, 3 days ago @ thesequence.substack.com
The Sequence Radar #811: Last Week in AI: OpenAI's Capital Leap, India's Summit, and the Next Frontier of Models

We explore reinforcement learning with verifiable rewards, which is becoming the most important post-training technique in the new wave of AI models.

It utilizes a new asynchronous reinforcement learning infrastructure to decouple generation from training, achieving state-of-the-art performance on complex, end-to-end software engineering challenges.

AI Lab: Alibaba Group. Summary: To address concerns about noisy and ambiguous items in the Humanity’s Last Exam (HLE) benchmark, this paper introduces a verified and revised version called HLE-Verified.

AI Lab: University of Southern California, Microsoft, & University of Pennsylvania. Summary: This paper introduces Experiential Reinforcement Learnin…

1 week, 5 days ago @ thesequence.substack.com
The Sequence Opinion #810: The Black Box Behind the Mask: The Imperative of Post-Training Interpretability

The modern paradigm of artificial intelligence development typically follows a two-stage process: pre-training and post-training.

Post-training, which includes Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), acts as the civilizing layer.

It applies the “mask” of helpfulness, safety, and conversational fluency that users interact with daily.

While the broader field of mechanistic interpretability aims to reverse-engineer neural networks into human-understandable components, post-training interpretability has emerged as a distinct and critical subfield.

It focuses not merely on what knowledge a model possesses, but on how that knowledge is modulated, suppre…

2 weeks, 1 day ago @ thesequence.substack.com
The Sequence AI of the Week #809: Slow Thinking, Fast Discovery: Inside DeepMind’s Aletheia Architecture

There is a specific kind of beauty in the way a neural network finally “clicks” on a hard problem.

If you’ve spent enough time staring at loss curves or step-by-step reasoning traces, you know the feeling.

For years, we’ve watched LLMs act like incredibly fast, somewhat overconfident students—blasting out answers that look right but often crumble under the slightest bit of logical pressure.

It is a specialized research agent built on top of the DeepThink architecture.

It marks a fundamental shift from the “System 1” (intuitive, fast) outputs we’ve grown used to, toward a “System 2” (deliberate, slow) approach to autonomous scientific discovery.

2 weeks, 2 days ago @ thesequence.substack.com
The Sequence Knowledge #808: Stop Trying to Generate the World: Inside the JEPA Way for World Models

Today we will discuss: an overview of the JEPA methods for world models.

A review of the three famous JEPA papers.

💡 AI Concept of the Day: Understanding LeCun’s JEPA Methods to Reimagine World Models. In the midst of the “Generative AI” boom, where OpenAI’s Sora and Runway are dazzling the world by hallucinating video, Yann LeCun, AI legend and Meta’s former Chief AI Scientist, has taken a staunchly contrarian stance.

His argument is simple but radical: If you want true intelligence, stop trying to generate the world.

LeCun proposes a different path, known as the Joint Embedding Predictive Architecture (JEPA). While Generative World Models (like Dreamer) prove understanding by recreating the i…

2 weeks, 3 days ago @ thesequence.substack.com
The Sequence Radar #807: Last Week in AI: From Mega-Rounds to Mathematical Breakthrough

Subscribe and don’t miss out. 📝 Editorial: Last Week in AI: From Mega-Rounds to Mathematical Breakthrough. The narrative of AI progress reached several new milestones this week, continuing a trend of increased technical specialization and reasoning depth.

AI Lab: Google Cloud AI Research. Summary: This paper proposes a selective defense framework to protect AI agents from Indirect Prompt Injection (IPI) attacks by detecting “dominance shifts” in causal attribution.

AI Lab: Google DeepMind. Summary: Researchers introduce “Aletheia,” a math research agent powered by Gemini Deep Think that iteratively generates, verifies, and revises mathematical proofs in natural language.

AI Lab: Meta SuperIntellig…

2 weeks, 5 days ago @ thesequence.substack.com
The Sequence Opinion #806: The Emergence of the Agent-as-a-Judge: Why Evals Need a Reasoning Engine

In the early days of the current AI boom—roughly 2023—we had a very “vibe-based” relationship with our models.

The idea was elegant in its simplicity—if these models are so good at generating text, surely they’re good at grading it, too.

We started using a “stronger” model (like GPT-4) to grade the outputs of “weaker” models.

We are moving from the era of the Judge as a Critic to the era of the Judge as an Agent.

Recently, I’ve been doing a lot of work in the agent-as-a-judge space as part of the LayerLens platform, so I wanted to share some ideas about the space.

3 weeks, 1 day ago @ thesequence.substack.com
The Sequence AI of the Week #805: Goodfire and the Era of AI Interpretability

If you look at how we build software today—what I’ve previously called “Software 2.0”—it’s effectively an optimization process.

We collect a massive dataset, we define a loss function, and then we surrender control to gradient descent.

We let the optimization algorithm smash the weights of a neural network until the loss goes down.

We are doing behavioral psychology on an alien intelligence because we lack the tools to do neuroscience.

The work Goodfire is doing on interpretability—specifically around feature steering and agents—is the first glimpse of what I’d call the “Software 3.0” IDE.

3 weeks, 2 days ago @ thesequence.substack.com
Synced Review
latest post: None
📓 Cool Blogs
ODS.ai Habr
latest post 2 weeks, 1 day ago
Why I Became an IT Volunteer & A News Dataset on the Contradictions of Modern Society

A simple example with fuel prices: gasoline gets more expensive both when the price of oil rises and when it falls.

Realizing that your work increases someone’s market capitalization but does not solve the real problems of society, visible in everyday life and in the news, pushed me to look for another kind of activity.

In addition, thanks to АМБ, a unique dataset of news about the contradictions of modern society has appeared on kaggle and github; more on it below.

A news dataset on the contradictions of modern society: АМБ activists and volunteers from allied groups collected and annotated a news dataset that highlights the very systemic contradictions I had been pondering earlier.

Example B: In 2023, one in every 11 people in the world went hungry, while in …

2 weeks, 1 day ago @ habr.com
[Translation] How Codex Works

A detailed look at how the OpenAI Codex team builds its coding agent, how engineers use it, and what this could mean for the future of software development.

To understand how Codex works, how teams inside OpenAI use it, and how it affects engineering practices at the makers of ChatGPT, I spoke with three OpenAI employees: Thibault Sottiaux, head of Codex.

Both products launched in the spring: Codex CLI was announced in April 2025, and Codex in ChatGPT was introduced in May.

On the Codex team, these files tell the agent how to navigate the codebase, which commands to run for testing, and how to follow project standards.

Using Codex at OpenAI: Besides…

2 weeks, 1 day ago @ habr.com
Natural Language Processing & LLMs Course: A New Season

On February 10, we are once again launching our free online course on Natural Language Processing.

What we will cover: a classical start (Zipf’s law, TF-IDF, RNN, CNN, Transformer); core NLP tasks (text classification, tagging, and generation); specialized areas (agents and vibe coding); LLMs and their applications.

If you are a student at ITMO, MIPT, or HSE, the course can be taken for academic credit.

I have worked in NLP for more than 12 years, including stints at Yandex and VK, and hold a Candidate of Sciences (PhD) degree.

If you have questions, bring them to the ODS Mattermost: that is where you will find all the answers, seminar times, and links.

1 month ago @ habr.com
SWE-MERA: A New Dynamic Benchmark for Agentic Code Generation Models

However, all tasks in MERA CODE, as in SWE-bench and other benchmarks of this kind, follow the classical paradigm of a fixed training set and, more importantly, a fixed evaluation set.

But the large language models for coding that we are trying to evaluate with our suite are also trained on GitHub, going back to the very first LLaMA model.

700 tasks may not sound like many, but it is already a very respectable number, and most importantly, these are new tasks.

Current behavior:
from sympy import ask, Q, Symbol
x = Symbol('x')
print(ask(Q.finite(x**-1), Q.real(x)))  # Output: True
Expected behavior: The function should return None to indicate uncertainty, as x**-…

5 months, 2 weeks ago @ habr.com
DRAGON: A Dynamic Benchmark for Evaluating Russian-Language RAG Systems

Answer: Keisuke Chiba. SPARQL query (Simple): SELECT DISTINCT ?s ?r ?o WHERE { { SELECT ?s ?r ?o WHERE { ?s ?r ?o . }

GROUP BY ?s ?r HAVING(count(?o) = 1) } { SELECT ?s ?r ?o WHERE { ?s ?r ?o . }

Answer: National Payment Card System (НСПК), Center for Biometric Technologies (ЦБТ), ЕБС. SELECT ?s ?r ?o ?len WHERE { { SELECT ?s ?r (COUNT(?o1) as ?len) (GROUP_CONCAT(DISTINCT(STR(?o1));separator="|") AS ?o) WHERE { ?s ?r ?o1 . }

FILTER(?o != ?o1) } GROUP BY ?o ?o1 ?r ?r1 HAVING(COUNT(?s) = 1) } UNION { SELECT ?s ?r ?o ?r1 ?s1 WHERE { ?s ?r ?o .


7 months, 2 weeks ago @ habr.com
RKNN Toolkit2: Model Conversion and Rockchip NPU Simulation

In this article, I want to share my experience converting a neural network to the rknn format using the rknn-toolkit2 library.

Here is what the weights of the PyTorch model look like in Netron. Important!

Converting the ONNX model to rknn: Next, an RKNN object is created, which manages the model conversion and inference process on the Rockchip platform.

At this stage, the model is prepared for conversion to the RKNN format and for subsequent execution on the Rockchip NPU.

Creating and exporting the rknn model: At this stage, the ONNX model is converted to the internal RKNN format, the graph is optimized, and the model is prepared to run on the Rockchip NPU.

7 months, 3 weeks ago @ habr.com
MERA Code: Comprehensive Evaluation of Code Generation in Applied Scenarios

🔗 MERA Code 🔗 GitHub with code and data 🔗 Collection on Hugging Face 🔗 Paper on arXiv 🔗 Project repository on GitVerse. What is MERA Code?

Modern code language models and general-purpose models (ChatGPT, Claude, Qwen, YandexGPT, GigaChat, etc.)

The list of current MERA Code tasks and their characteristics: a catalog of MERA Code tasks with detailed descriptions is available on the website.

In MERA Code, prompts are strictly tailored to each task and to correct answer selection.

In conclusion: MERA Code is an attempt to close an important gap in LLM testing: how useful these models really are in real, localized development.

7 months, 3 weeks ago @ habr.com
Machine Learning Mastery
latest post 1 day, 17 hours ago
Vector Databases vs. Graph RAG for Agent Memory: When to Use Which

To clarify their respective roles, this article explores the underlying theory, practical strengths, and limitations of both approaches for agent memory.

Vector Databases: The Foundation of Semantic Agent Memory. Vector databases represent memory as dense mathematical vectors, or embeddings, situated in high-dimensional space.

Vector databases are strong choices for agent memory.

Graph RAG: Structured Context and Relational Memory. Graph RAG addresses the limitations of semantic search by combining knowledge graphs with LLMs.

Developers would be well advised to adopt a layered approach: start agent memory with a vector database for basic conversational grounding.
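The layered approach can be sketched in a few lines: a vector layer retrieves by cosine similarity, and a graph layer expands each hit through explicit relations. The toy embeddings and adjacency dict below are illustrative stand-ins for a real vector database and knowledge graph:

```python
import numpy as np

def cosine_top_k(query: np.ndarray, memory: np.ndarray, k: int = 2) -> list:
    """Vector layer: indices of the k most similar stored embeddings."""
    sims = memory @ query / (np.linalg.norm(memory, axis=1) * np.linalg.norm(query))
    return list(np.argsort(-sims)[:k])

# Toy 4-d embeddings for three stored memories (illustrative values only)
memory = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.9, 0.1, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]])

# Graph layer: explicit relations between memory items (a plain adjacency dict)
graph = {0: [2], 1: [], 2: []}

hits = cosine_top_k(np.array([1.0, 0.0, 0.0, 0.0]), memory)
related = [n for h in hits for n in graph.get(int(h), [])]   # one relational hop
context_ids = sorted(set(map(int, hits)) | set(related))
print(context_ids)   # vector hits plus their graph neighbors: [0, 1, 2]
```

Note how item 2 is semantically dissimilar to the query yet still retrieved, because the graph layer knows it is related to a vector hit.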

1 day, 17 hours ago @ machinelearningmastery.com
5 Essential Security Patterns for Robust Agentic AI

This article lists 5 key security patterns for robust AI agents and highlights why they matter.

Just-in-Time Tool Privileges: Often abbreviated as JIT, this is a security model that grants users or applications specialized or elevated access privileges only when needed, and only for a limited period of time.

In the realm of agentic AI, an example would be issuing short-term access tokens to limit the “blast radius” if the agent becomes compromised.
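As a rough sketch of the JIT idea (an in-memory token table with hypothetical `issue_token`/`is_valid` helpers, not any real library's API):

```python
import secrets
import time

# Hypothetical in-memory token table: token -> (scope, expiry timestamp)
_tokens: dict = {}

def issue_token(scope: str, ttl_seconds: float) -> str:
    """Grant a narrowly scoped token that expires after ttl_seconds."""
    token = secrets.token_hex(16)
    _tokens[token] = (scope, time.monotonic() + ttl_seconds)
    return token

def is_valid(token: str, required_scope: str) -> bool:
    """A token is honored only for its exact scope and only before expiry."""
    scope, expires_at = _tokens.get(token, ("", 0.0))
    return scope == required_scope and time.monotonic() < expires_at

t = issue_token("read:calendar", ttl_seconds=0.05)
print(is_valid(t, "read:calendar"))   # True while fresh
time.sleep(0.1)
print(is_valid(t, "read:calendar"))   # False after expiry
```

Even if the agent leaks the token, it is useless outside its narrow scope and short lifetime, which is precisely the blast-radius reduction described above.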

Bounded Autonomy: This security principle allows AI agents to operate independently within a bounded setting, meaning within clearly defined safe parameters, striking a balance between control and efficiency.

The AI Firewall: This refers to a dedicate…

2 days, 11 hours ago @ machinelearningmastery.com
Deploying AI Agents to Production: Architecture, Infrastructure, and Implementation Roadmap

Architecture Patterns: Choosing How Your Agent Runs. Your first big decision is picking the right execution model for your agent.

Infrastructure Stack: What Agents Need to Run. Production agents need five infrastructure layers.

Deployment Topologies: Structuring Agent Systems at Scale. How you organize agents in production depends on task complexity and volume requirements.

It’s about matching your agent’s requirements to infrastructure that can reliably, cost-effectively, and maintainably support those requirements in production.

Start with the simplest architecture that could work, instrument it thoroughly, and let production data guide evolution.

3 days, 17 hours ago @ machinelearningmastery.com
Build Semantic Search with LLM Embeddings

In this article, you will learn how to build a simple semantic search engine using sentence embeddings and nearest neighbors.

The following code initializes the model and encodes the batch of text documents into embeddings.

print("Loading embedding model...")
model = SentenceTransformer("all-MiniLM-L6-v2")
# Convert text documents into numerical vector embeddings
print("Encoding documents (this may take a few seconds)...")
document_embeddings = model.encode(documents, show_progress_bar=True)
print(f"Created {document_embeddings.shape[0]} embeddings.")

While this example is intentionally simple, semantic search engines like this form the foundational retrieval layer in modern…
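The retrieval step that follows the encoding can be sketched with scikit-learn's `NearestNeighbors`; the toy 2-d vectors here stand in for the real `document_embeddings` produced by `model.encode(...)`:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Toy stand-ins for real sentence embeddings (illustrative values only)
document_embeddings = np.array([[0.9, 0.1],
                                [0.1, 0.9],
                                [0.8, 0.2]])
query_embedding = np.array([[1.0, 0.0]])

# Cosine distance makes ranking depend on direction, not vector magnitude
index = NearestNeighbors(n_neighbors=2, metric="cosine").fit(document_embeddings)
distances, indices = index.kneighbors(query_embedding)
print(indices[0])   # indices of the two documents closest to the query
```

In a real system, the same `index.kneighbors` call serves every query after a single `fit` over the corpus embeddings.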

4 days, 15 hours ago @ machinelearningmastery.com
Can LLM Embeddings Improve Time Series Forecasting? A Practical Feature Engineering Approach

Code excerpt: the pipeline downloads daily prices with yfinance (ticker, 2008-01-01 to 2016-12-31), keeps the Close column, derives a return series normalized by its standard deviation, and decodes byte-string news headlines (via ast.literal_eval with a fallback that strips the b'…' prefix) into the combined text column of df_news.

1 week ago @ machinelearningmastery.com
KV Caching in LLMs: A Guide for Developers

self.d_model = d_model
self.W_Q = Linear(d_model, d_model)
self.W_K = Linear(d_model, d_model)
self.W_V = Linear(d_model, d_model)

def reset_cache(self):
    self.cache_K = None
    self.cache_V = None
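Why the cache helps can be seen in a small NumPy sketch: each decode step appends one new key/value row instead of recomputing the projections for the whole prefix. The `KVCache` class and the (steps, d_model) shapes are illustrative, not the article's exact code:

```python
import numpy as np

class KVCache:
    """Minimal sketch: accumulate per-step keys/values across decoding."""
    def __init__(self):
        self.cache_K = None
        self.cache_V = None

    def append(self, k_new: np.ndarray, v_new: np.ndarray):
        # First step initializes the cache; later steps concatenate one row
        if self.cache_K is None:
            self.cache_K, self.cache_V = k_new, v_new
        else:
            self.cache_K = np.concatenate([self.cache_K, k_new], axis=0)
            self.cache_V = np.concatenate([self.cache_V, v_new], axis=0)
        return self.cache_K, self.cache_V

cache = KVCache()
for _ in range(3):                         # three decode steps, one token each
    K, V = cache.append(np.ones((1, 4)), np.ones((1, 4)))
print(K.shape)   # (3, 4): attention sees all cached keys without recomputing them
```

Each step thus costs one projection instead of a full-prefix recompute, which is where the inference speedup comes from.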

1 week, 1 day ago @ machinelearningmastery.com
How to Combine LLM Embeddings + TF-IDF + Metadata in One Scikit-learn Pipeline

In this article, you will learn how to fuse dense LLM sentence embeddings, sparse TF-IDF features, and structured metadata into a single scikit-learn pipeline for text classification.

Topics we will cover include: loading and preparing a text dataset alongside synthetic metadata features.

Building parallel feature pipelines for TF-IDF, LLM embeddings, and numeric metadata.

from sklearn.pipeline import Pipeline

X = dataset.data
y = dataset.target
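The fusion idea can be sketched with a `ColumnTransformer` that runs feature branches in parallel and stacks their outputs. The toy data is illustrative, and a dense LLM-embedding branch would slot in as a third entry alongside the two shown:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "text": ["cheap flights to rome", "gpu out of memory error", "best pasta recipes"],
    "length": [21.0, 23.0, 18.0],
})

# Parallel branches: sparse TF-IDF from the text column, scaled numeric metadata.
# An LLM-embedding branch would be added as a third (name, transformer, cols) entry.
features = ColumnTransformer([
    ("tfidf", TfidfVectorizer(), "text"),     # note: a column NAME, not a list
    ("meta", StandardScaler(), ["length"]),
])

X = features.fit_transform(df)
print(X.shape)   # (rows, vocabulary size + 1 metadata column)
```

Passing `"text"` (a name) rather than `["text"]` (a list) matters: `TfidfVectorizer` expects a 1-D iterable of strings, not a 2-D column slice.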

1 week, 2 days ago @ machinelearningmastery.com
Introduction to Small Language Models: The Complete Guide for 2026

In this article, you will learn what small language models are, why they matter in 2026, and how to use them effectively in real production systems.

Topics we will cover include: what defines small language models and how they differ from large language models.

Small language models (SLMs) make this possible.

What Are Small Language Models?

Small language models are language models with fewer than 10 billion parameters, usually ranging from 1 billion to 7 billion.

1 week, 3 days ago @ machinelearningmastery.com
Beyond Accuracy: 5 Metrics That Actually Matter for AI Agents

Introduction: AI agents, or autonomous systems powered by agentic AI, have reshaped the current landscape of AI systems and deployments.

While accuracy is one of the most common metrics used in static large language model evaluations, agent evaluations often require additional measures focused on action quality, tool use, and trajectory efficiency — especially when building modern AI agents.

It is strongly related to the return on investment (ROI) of using AI agents.

Recovery Rate (RR): How frequently does an agent identify an error and effectively replan to fix it?

It requires careful interpretation, since a very high recovery rate can sometimes reveal underlying instability if…

1 week, 4 days ago @ machinelearningmastery.com
Building a Simple MCP Server in Python

Have you ever tried connecting a language model to your own data or tools? If so, you know it often means writing custom integrations, managing API schemas, and wrestling with authentication.

2 weeks, 1 day ago @ machinelearningmastery.com
Agentify Your App with GitHub Copilot’s Agentic Coding SDK

For years, GitHub Copilot has served as a powerful pair programming tool for programmers, suggesting the next line of code.

2 weeks, 2 days ago @ machinelearningmastery.com
LLM Embeddings vs TF-IDF vs Bag-of-Words: Which Works Better in Scikit-learn?

Machine learning models built with frameworks like scikit-learn can accommodate unstructured data like text, as long as this raw text is converted into a numerical representation that is understandable by algorithms, models, and machines in a broader sense.

2 weeks, 3 days ago @ machinelearningmastery.com
Top 7 Small Language Models You Can Run on a Laptop

Powerful AI now runs on consumer hardware.

2 weeks, 4 days ago @ machinelearningmastery.com
Choosing Between PCA and t-SNE for Visualization

For data scientists, working with high-dimensional data is part of daily life.

3 weeks, 1 day ago @ machinelearningmastery.com
The Machine Learning Practitioner’s Guide to Speculative Decoding

Large language models generate text one token at a time.
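Speculative decoding exploits this bottleneck: a cheap draft model guesses several tokens ahead, and the expensive target model verifies them in a single pass, keeping the agreeing prefix. Below is a toy sketch with deterministic stand-in models (the real algorithm verifies probabilistically and samples from adjusted distributions):

```python
def draft_next(context: list) -> int:
    """Cheap draft model: fast but sometimes wrong."""
    return (context[-1] + 1) % 10

def target_next(context: list) -> int:
    """Expensive target model: treated as ground truth."""
    return (context[-1] + 1) % 10 if context[-1] != 4 else 0

def speculative_step(context: list, k: int = 4) -> list:
    # 1) Draft proposes k tokens autoregressively
    proposed, ctx = [], list(context)
    for _ in range(k):
        t = draft_next(ctx)
        proposed.append(t)
        ctx.append(t)
    # 2) Target verifies; accept until first disagreement, then substitute
    accepted, ctx = [], list(context)
    for t in proposed:
        expected = target_next(ctx)
        if t == expected:
            accepted.append(t)
            ctx.append(t)
        else:
            accepted.append(expected)   # target's correction ends the step
            break
    return context + accepted

print(speculative_step([1]))   # several tokens emitted per target "pass"
```

When the draft agrees with the target (the common case), one verification pass yields up to k tokens instead of one, which is where the speedup comes from.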

3 weeks, 2 days ago @ machinelearningmastery.com
ML in Production
latest post: None
Sorta Insightful
latest post 1 month ago
MIT Mystery Hunt 2026

This has spoilers for MIT Mystery Hunt 2026.

Pre-Hunt: The time running up to Hunt was more stressful than usual… very briefly, I typically hunt with teammate.

Just last year, I did GPH 2025, LN Hunt, Teammate Hunt 2025, Microsoft Hunt 2025, and Silph Puzzle Hunt 2025, all of which had significant 3+ hour solve puzzles that would not be out of place in Mystery Hunt.

Not to mention smaller hunts like Advent Hunt, and then I didn’t even do Brown Puzzlehunt or Vertex Hunt or the fall CMU Hunt.

To me, the crux is whether Mystery Hunt is broken, or Mystery Hunt is fine.

1 month ago @ alexirpan.com
Authentic Imperfection

I’ve been thinking about the anger surrounding generative AI.

To keep things fair, he took the best human images and best AI images, meaning human art from famous artists, and AI art from prompters skilled at removing obvious tells of image generation.

When people complain about AI slop, I see it as a complaint against the deluge of default style AI images.

We’ve seen this happen in all forms: AI text, AI music, older forms of computer generated content like CGI.

As much as we celebrate imperfection, digital imperfection is a step too far.

3 months, 2 weeks ago @ alexirpan.com
Ten Years Later

Every now and then, someone asks me why I blog, and I don’t really know what to tell them.

That’s another reason I’m not celebrating 10 years with more gusto: I know I’ve been writing less.

Indiana Jones and the Great Circle: I don’t know how they did it, but Indiana Jones and the Great Circle was just fun all the way through.

My one complaint is that the hand-to-hand combat feels like the worst part of the game, so of course they put a bunch of upgrades behind learning parry timings you’ll never use later.

I have not tried Peak, but Another Crab’s Treasure was really good and is worth playing if you’re interested in a Souls-like.

6 months, 2 weeks ago @ alexirpan.com
Brony Musicians Seize The Means of Production: My Eyewitness Account to BABSCon 2025

A music concert in the evenings, typically set up as a rave with EDM or rock music made by brony musicians.

She has been involved in organizing pony music concerts for over a decade, for both BABSCon and other pony conventions.

Thank you, BABSCon Chairs: The brony musicians immediately jump into an emergency Discord call with Pinkaboo to get her side of the story.

Other conventions start tweeting in support of the brony musicians, with no one taking BABSCon’s side.

It’s hard for me to explain why I like MLP fan music, because brony music really isn’t accessible.

7 months, 2 weeks ago @ alexirpan.com
Lil'Log
latest post: None
inFERENCe
latest post 1 week, 2 days ago
The Future of Software

February 25, 2026. The Future of Software. The world of software is undergoing a shift not seen since the advent of compilers in the 1970s.

How will humans tell AI agents what software artefacts we would like to create?


This future of software creation, in which our programming languages are abstracted away, raises two very important questions: What will the instruction/specification language look like?

This should be a clear layer of separation between the developer and the pool of AI agents working to maintain software.

1 week, 2 days ago @ inference.vc
Deep Learning is Powerful Because It Makes Hard Things Easy - Reflections 10 Years On

Ten years ago this week, I wrote a provocative and bold post that blew up and made it to the top spot on Hacker News.

In hindsight: There is a lot of stuff in deep learning that we don't understand nearly enough.

Sometimes things work for reasons completely unrelated to why we thought they would work.

(Pop some 🍿 in the microwave and read till the end for more.) 🎯 “Deep learning is powerful exactly because it makes hard things easy.” Okay, this was a great insight.

🎯 Generative Modeling: In the post I suggested people learn “something harder” instead of, or in addition to, deep learning.

1 month ago @ inference.vc
Discrete Diffusion: Continuous-Time Markov Chains

We classify Markov chains by the values the index $t$ can take: if $t$ is an integer, we call it a discrete-time Markov chain, and when $t$ is real, we call the process a continuous-time Markov chain.

Discrete-time Markov chains are fully described by a collection of state transition matrices $P_t = [P(X_{t+1} = i \vert X_{t} = j)]_{i,j}$.

The geometric distribution is the only discrete distribution with the memory-less property, stated as $\mathbb{P}[T=s+t\vert T>s] = \mathbb{P}[T = t]$.
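Writing $T \sim \mathrm{Geom}(p)$ with $\mathbb{P}[T=t]=(1-p)^{t-1}p$ for $t = 1, 2, \dots$, the memoryless property follows in one line:

```latex
\mathbb{P}[T > t] = (1-p)^{t},
\qquad\text{hence}\qquad
\mathbb{P}[T = s+t \mid T > s]
  = \frac{\mathbb{P}[T = s+t]}{\mathbb{P}[T > s]}
  = \frac{(1-p)^{s+t-1}\,p}{(1-p)^{s}}
  = (1-p)^{t-1}p
  = \mathbb{P}[T = t].
```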

Point Processes and Markov Chains: Back to Markov chains.

We then discussed different representations of discrete point processes, and noted (although did not prove) their equivalence.

9 months, 2 weeks ago @ inference.vc
The Spectator
latest post: None
The Unofficial Google Data Science Blog
latest post: None
Off the Convex Path
latest post: None
Jay Alammar
latest post: None
Piekniewski's blog
latest post: None
fast.ai NLP
latest post: None
Sebastian Ruder
latest post: None
大トロ
latest post: None
🔬 Science
Papers With Code
latest post 7 months, 2 weeks ago
/henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy

We present SpatialTrackerV2, a feed-forward 3D point tracking method for monocular videos.

Going beyond modular pipelines built on off-the-shelf components for 3D tracking, our approach unifies the intrinsic connections between point tracking, monocular depth, and camera pose estimation into a high-performing, feed-forward 3D point tracker.

It decomposes world-space 3D motion into scene geometry, camera ego-motion, and pixel-wise object motion, with a fully differentiable and end-to-end architecture, allowing scalable training across a wide range of datasets, including synthetic sequences, posed RGB-D videos, and unlabeled in-the-wild footage.

By learning geometry and motion jointly from …

7 months, 2 weeks ago @ paperswithcode.com
/antof27/ Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation

Calisthenics skill classification is the computer vision task of inferring the skill performed by an athlete from images, enabling automatic performance assessment and personalized analytics.

Traditional methods for calisthenics skill recognition rely on pose estimation to extract skeletal data from images, which is then fed to a classification algorithm to infer the performed skill.

This work proposes a direct approach to calisthenics skill recognition, which leverages depth estimation and athlete patch retrieval to avoid the computationally expensive human pose estimation module.

Using Depth Anything V2 for depth estimation and YOLOv10 for athlete localizat…

7 months, 2 weeks ago @ paperswithcode.com
/snowflakedb/ Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI

Inference is now the dominant AI workload, yet existing systems force trade-offs between latency, throughput, and cost.

Arctic Inference, an open-source vLLM plugin from Snowflake AI Research, introduces Shift Parallelism, a dynamic parallelism strategy that adapts to real-world traffic while integrating speculative decoding, SwiftKV compute reduction, and optimized embedding inference.

It achieves up to 3.4 times faster request completion, 1.75 times faster generation, and 1.6M tokens/sec per GPU for embeddings, outperforming both latency- and throughput-optimized deployments.

Already powering Snowflake Cortex AI, Arctic Inference delivers state-of-the-art, cost-effective inference for ent…

7 months, 2 weeks ago @ paperswithcode.com
/NVIDIA/ FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale

FourCastNet 3 advances global weather modeling by implementing a scalable, geometric machine learning (ML) approach to probabilistic ensemble forecasting.

The approach is designed to respect spherical geometry and to accurately model the spatially correlated probabilistic nature of the problem, resulting in stable spectra and realistic dynamics across multiple scales.

FourCastNet 3 delivers forecasting accuracy that surpasses leading conventional ensemble models and rivals the best diffusion-based methods, while producing forecasts 8 to 60 times faster than these approaches.

In contrast to other ML approaches, FourCastNet 3 demonstrates excellent probabilistic calibration and retains realis…

7 months, 2 weeks ago @ paperswithcode.com
/jingyanw/ Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work?

We study A/B experiments that are designed to compare the performance of two recommendation algorithms.

The bias arising from this type of data sharing is known as "symbiosis bias".

In this paper, we highlight that, for decision-making purposes, the sign of the global treatment effect (GTE) often matters more than its precise magnitude when selecting the better algorithm.

We formalize this insight under a multi-armed bandit framework and theoretically characterize when the sign of the expected GTE estimate under data sharing aligns with or contradicts the sign of the true GTE.

Our analysis identifies the level of exploration versus exploitation as a key determinant of how symbiosis bias impacts algorithm selection.

7 months, 2 weeks ago @ paperswithcode.com
/qqq-yi/ DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression

Task-agnostic prompt compression leverages the redundancy in natural language to reduce computational overhead and enhance information density within prompts, especially in long-context scenarios.

Existing methods predominantly rely on information entropy as the metric to compress lexical units, aiming to achieve minimal information loss.

However, these approaches overlook two critical aspects: (i) the importance of attention-critical tokens at the algorithmic level, and (ii) shifts in information entropy during the compression process.

Motivated by these challenges, we propose a dynamic attention-aware approach for task-agnostic prompt compression (DAC).

This approach effectively integrate…

7 months, 2 weeks ago @ paperswithcode.com
/lukasellinger/ Simplifications are Absolutists: How Simplified Language Reduces Word Sense Awareness in LLM-Generated Definitions

Large Language Models (LLMs) can provide accurate word definitions and explanations for any context.

However, the scope of the definition changes for different target groups, like children or language learners.

We investigate how simplification impacts homonym definition quality across three target groups: Normal, Simple, and ELI5.

Our results show that simplification drastically degrades definition completeness by neglecting polysemy, increasing the risk of misunderstanding.

Fine-tuning Llama 3.1 8B with Direct Preference Optimization substantially improves homonym response quality across all prompt types.

7 months, 2 weeks ago @ paperswithcode.com
/pspdada/ Mitigating Object Hallucinations via Sentence-Level Early Intervention

Multimodal large language models (MLLMs) have revolutionized cross-modal understanding but continue to struggle with hallucinations: fabricated content that contradicts visual inputs.

Existing hallucination mitigation methods either incur prohibitive computational costs or introduce distribution mismatches between training data and model outputs.

We identify a critical insight: hallucinations predominantly emerge at the early stages of text generation and propagate through subsequent outputs.

To address this, we propose **SENTINEL** (**S**entence-level **E**arly i**N**tervention **T**hrough **IN**-domain pr**E**ference **L**earning), a framework that eliminates dependency on human annotations…

7 months, 2 weeks ago @ paperswithcode.com
/owos/ FLEXITOKENS: Flexible Tokenization for Evolving Language Models

Language models (LMs) are difficult to adapt to new data distributions through simple finetuning.

This is due to the rigidity of their subword tokenizers, which typically remain unchanged during adaptation.

This inflexibility often leads to inefficient tokenization, causing overfragmentation of out-of-distribution domains, unseen languages, or scripts.

In this work, we develop byte-level LMs with learnable tokenizers to make tokenization adaptive.

Our models include a submodule that learns to predict boundaries within the input byte sequence, encoding it into variable-length segments.
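The boundary-prediction step can be illustrated with a minimal sketch (the thresholding rule and all names below are our assumptions for illustration, not the paper's mechanism):

```python
def segment_bytes(byte_seq: bytes, boundary_probs, threshold: float = 0.5):
    """Toy segmentation: close the current segment wherever a (hypothetical)
    learned boundary predictor assigns probability >= threshold."""
    segments, current = [], []
    for b, p in zip(byte_seq, boundary_probs):
        current.append(b)
        if p >= threshold:
            segments.append(bytes(current))
            current = []
    if current:  # flush any trailing partial segment
        segments.append(bytes(current))
    return segments

# High boundary probability after the 3rd and 5th bytes yields two segments.
segment_bytes(b"hello", [0.1, 0.2, 0.9, 0.1, 0.8])  # → [b"hel", b"lo"]
```

Because the boundaries are predicted rather than fixed, the same model can segment unseen scripts or domains more coarsely or finely as the data demands.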

7 months, 2 weeks ago @ paperswithcode.com
/wojiufukele/ Graph-Structured Data Analysis of Component Failure in Autonomous Cargo Ships Based on Feature Fusion

To address the challenges posed by cascading reactions caused by component failures in autonomous cargo ships (ACS) and the uncertainties in emergency decision-making, this paper proposes a novel hybrid feature fusion framework for constructing a graph-structured dataset of failure modes.

A hierarchical feature fusion framework is constructed, using Word2Vec to encode subsystem/component features, BERT-KPCA to process failure modes/reasons, and Sentence-BERT to quantify the semantic association between failure impact and emergency decision-making.

The dataset covers 12 systems, 1,262 failure modes, and 6,150 propagation paths.

In the label prediction results, the Shore-based Meteor…

7 months, 2 weeks ago @ paperswithcode.com
/YF-W/ Tri-Learn Graph Fusion Network for Attributed Graph Clustering

In recent years, models based on Graph Convolutional Networks (GCN) have made significant strides in the field of graph data analysis.

Although the Graph Transformer architecture has mitigated some of these issues, its performance is still limited when processing heterogeneous graph data.

To address these challenges, this study proposes a novel deep clustering framework comprising GCN, Autoencoder (AE), and Graph Transformer, termed the Tri-Learn Graph Fusion Network (Tri-GFN).

The tri-learning mechanism allows mutual learning among these modules, while the feature fusion strategy enables the model to capture complex relationships, yielding highly discriminative representations for gra…

7 months, 2 weeks ago @ paperswithcode.com
/mr-ravin/ APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation

We propose the APTx Neuron, a novel, unified neural computation unit that integrates non-linear activation and linear transformation into a single trainable expression.

The APTx Neuron is derived from the APTx activation function, thereby eliminating the need for separate activation layers and making the architecture both computationally efficient and elegant.

The proposed neuron follows the functional form $y = \sum_{i=1}^{n} ((\alpha_i + \tanh(\beta_i x_i)) \cdot \gamma_i x_i) + \delta$, where all parameters $\alpha_i$, $\beta_i$, $\gamma_i$, and $\delta$ are trainable.

We validate our APTx Neuron-based architecture on the MNIST dataset, achieving up to 96.69\% test accuracy in just 20 ep…
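The functional form above can be written as a short NumPy sketch (the function name and parameter values are ours, for illustration; in the actual architecture $\alpha_i$, $\beta_i$, $\gamma_i$, and $\delta$ would be trainable):

```python
import numpy as np

def aptx_neuron(x, alpha, beta, gamma, delta):
    """y = sum_i (alpha_i + tanh(beta_i * x_i)) * gamma_i * x_i + delta,
    with all parameters broadcastable against the input x."""
    x = np.asarray(x, dtype=float)
    return np.sum((alpha + np.tanh(beta * x)) * gamma * x, axis=-1) + delta

# With beta = 0 the tanh gate vanishes and the neuron reduces to a plain
# weighted sum: (alpha + 0) * gamma * x summed over inputs, plus delta.
aptx_neuron([1.0, 2.0, 3.0], alpha=1.0, beta=0.0, gamma=1.0, delta=0.0)  # → 6.0
```

The key point is that the tanh gate is part of the same trainable expression as the linear term, so no separate activation layer follows the neuron.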

7 months, 2 weeks ago @ paperswithcode.com
/Rec4Fun/ A Reproducibility Study of Product-side Fairness in Bundle Recommendation

While this problem has been widely studied in traditional recommendation settings, its implications for bundle recommendation (BR) remain largely unexplored.

Existing fairness frameworks and metrics designed for traditional recommender systems may not directly translate to this multi-layered setting.

In this paper, we conduct a comprehensive reproducibility study of product-side fairness in BR across three real-world datasets using four state-of-the-art BR methods.

We analyze exposure disparities at both the bundle and item levels using multiple fairness metrics, uncovering important patterns.

Overall, our findings offer actionable insights for building fairer bundle recommender systems and…

7 months, 2 weeks ago @ paperswithcode.com
/cbobed/ OntView: What you See is What you Meant

However, the lack of tools that provide effective visualization is still a significant challenge.

In this paper, we present OntView, an ontology viewer that is designed to provide users with an intuitive visual representation of ontology concepts and their formal definitions through a user-friendly interface.

Building on the use of a DL reasoner, OntView follows a "What you see is what you meant" paradigm, showing the actual inferred knowledge.

One key aspect for this is its ability to visualize General Concept Inclusions (GCI), a feature absent in existing visualization tools.

OntView has been released with an open-source license for the whole community.

7 months, 2 weeks ago @ paperswithcode.com
/Rec4Fun/ RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction

These approaches fail to capture elaborate relations hidden in real-world bundle structures, resulting in suboptimal bundle representations.

To overcome this limitation, we propose RaMen, a novel method that provides a holistic multi-strategy approach for bundle construction.

RaMen utilizes both intrinsic (characteristics) and extrinsic (collaborative signals) information to model bundle structures through Explicit Strategy-aware Learning (ESL) and Implicit Strategy-aware Learning (ISL).

Integrating diverse strategies enables RaMen to learn more comprehensive and robust bundle representations.

Meanwhile, a Multi-strategy Alignment & Discrimination module is employed to facilitate knowledge tr…

7 months, 2 weeks ago @ paperswithcode.com
/PrimisAI/ Adaptive Multi-Agent Reasoning via Automated Workflow Generation

The rise of Large Reasoning Models (LRMs) promises a significant leap forward in language model capabilities, aiming to tackle increasingly sophisticated tasks with unprecedented efficiency and accuracy.

However, despite their impressive performance, recent studies have highlighted how current reasoning models frequently fail to generalize to novel, unseen problems, often resorting to memorized solutions rather than genuine inferential reasoning.

In this paper, we introduce Nexus Architect, an enhanced iteration of our multi-agent system framework, Nexus, equipped with a novel automated workflow synthesis mechanism.

Given a user's prompt and a small set of representative examples, the Archi…

7 months, 2 weeks ago @ paperswithcode.com
/sharanya02/ Real Time Captioning of Sign Language Gestures in Video Meetings

One of the most well-established ways to enable such communication is through sign-based languages.

However, not many people are aware of the smaller intricacies involved with sign language.

Sign language recognition using computer vision aims at eliminating the communication barrier between deaf-mute and ordinary people so that they can properly communicate with others.

In recent studies, it has been found that people with hearing disabilities prefer to sign over typing during these video calls.

In this paper, we are proposing a browser extension that will automatically translate sign language to subtitles for everyone else in the video call.

7 months, 2 weeks ago @ paperswithcode.com
/alessiopittiglio/ Leveraging Context for Multimodal Fallacy Classification in Political Debates

In this paper, we present our submission to the MM-ArgFallacy2025 shared task, which aims to advance research in multimodal argument mining, focusing on logical fallacies in political debates.

Our approach uses pretrained Transformer-based models and proposes several ways to leverage context.

In the fallacy classification subtask, our models achieved macro F1-scores of 0.4444 (text), 0.3559 (audio), and 0.4403 (multimodal).

Our multimodal model showed performance comparable to the text-only model, suggesting potential for improvements.


7 months, 2 weeks ago @ paperswithcode.com
/RS2002/ One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms

On-demand ride-sharing platforms face the fundamental challenge of dynamically bundling passengers with diverse origins and destinations and matching them with vehicles in real time, all under significant uncertainty.

However, conventional MARL-based ride-sharing approaches heavily rely on the accurate estimation of Q-values or V-values, which becomes problematic in large-scale, highly uncertain environments.

To address these challenges, we propose two novel alternative methods that bypass value function estimation.

First, we adapt GRPO to ride-sharing, replacing the PPO baseline with the group average reward to eliminate critic estimation errors and reduce training bias.

Second, inspired b…
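The critic-free baseline described above (group average reward in place of a learned value function) can be sketched as follows; the standard-deviation normalization is a common GRPO convention and an assumption here:

```python
import numpy as np

def group_relative_advantages(rewards):
    """GRPO-style advantages: each trajectory's reward minus the group mean,
    scaled by the group standard deviation. No critic network is needed, so
    there is no V-value or Q-value estimation error to propagate."""
    r = np.asarray(rewards, dtype=float)
    adv = r - r.mean()
    std = r.std()
    return adv / std if std > 0 else adv
```

Because the baseline is computed from the sampled group itself, the resulting advantages always average to zero within the group, which is what removes the critic's estimation bias.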

7 months, 2 weeks ago @ paperswithcode.com
/LiXinran6/ Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation


7 months, 2 weeks ago @ paperswithcode.com
/ShimSoonYong/ ZClassifier: Temperature Tuning and Manifold Approximation via KL Divergence on Logit Space

We introduce a novel classification framework, ZClassifier, that replaces conventional deterministic logits with diagonal Gaussian-distributed logits. Code: https://github.com/ShimSoonYong/ZClassifier
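One way to realize Gaussian-distributed logits is to predict a per-class mean and diagonal log-variance, then average softmax probabilities over logit samples; the sketch below is our illustration of that idea, not the repository's API:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_logit_predict(mu, log_var, n_samples=64):
    """Draw logit vectors from N(mu, diag(exp(log_var))) and average the
    resulting softmax distributions (Monte Carlo marginalization)."""
    mu = np.asarray(mu, dtype=float)
    std = np.exp(0.5 * np.asarray(log_var, dtype=float))
    z = mu + rng.standard_normal((n_samples, mu.shape[-1])) * std
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return probs.mean(axis=0)
```

As the predicted variance shrinks toward zero, the averaged prediction collapses back to the conventional deterministic softmax over the mean logits.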

7 months, 2 weeks ago @ paperswithcode.com
/briziorusso/ On Gradual Semantics for Assumption-Based Argumentation

In this paper, we fill this gap and propose a family of novel gradual semantics for equipping assumptions, which are the core components in ABA frameworks, with dialectical strengths. Code: https://github.com/briziorusso/GradualABA

7 months, 2 weeks ago @ paperswithcode.com
/wumingqi/ Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination


7 months, 2 weeks ago @ paperswithcode.com
/IsaacYQH/ WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling

Despite rapid progress in end-to-end AI music generation, AI-driven modeling of professional Digital Signal Processing (DSP) workflows remains challenging. Code: https://github.com/IsaacYQH/WildFX

7 months, 2 weeks ago @ paperswithcode.com
/summer1278/ Addressing Data Imbalance in Transformer-Based Multi-Label Emotion Detection with Weighted Loss

This paper explores the application of a simple weighted loss function to Transformer-based models for multi-label emotion detection in SemEval-2025 Shared Task 11. Code: https://github.com/summer1278/semeval2025-task11
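A minimal version of such a weighted loss is a per-label, positively weighted binary cross-entropy; the exact weighting scheme in the paper may differ, and the names below are ours:

```python
import numpy as np

def weighted_bce(logits, targets, pos_weight):
    """Per-label weighted BCE: positive examples of rare labels are
    up-weighted by pos_weight so imbalanced labels contribute more
    to the mean loss."""
    logits = np.asarray(logits, dtype=float)
    targets = np.asarray(targets, dtype=float)
    p = 1.0 / (1.0 + np.exp(-logits))  # sigmoid
    eps = 1e-12                        # avoid log(0)
    loss = -(pos_weight * targets * np.log(p + eps)
             + (1.0 - targets) * np.log(1.0 - p + eps))
    return float(loss.mean())
```

This mirrors the role of the `pos_weight` argument in PyTorch's `BCEWithLogitsLoss`, which is the usual way to apply such weighting in Transformer fine-tuning.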

7 months, 2 weeks ago @ paperswithcode.com
/gabrielkmbo/ Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs

We present Step-wise Policy for Rare-tool Knowledge (SPaRK), a novel reinforcement learning framework that teaches large language models to explore diverse tool usage patterns beyond conventional high-temperature sampling. Code: https://github.com/gabrielkmbo/explore-rl

7 months, 2 weeks ago @ paperswithcode.com
/Cavendish518/ Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation

Service robots are increasingly deployed in diverse and dynamic environments, where both physical layouts and social contexts change over time and across locations. Code: https://github.com/Cavendish518/LE-Nav

7 months, 2 weeks ago @ paperswithcode.com
/MatteoFasulo/ AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles


7 months, 2 weeks ago @ paperswithcode.com
/VCA-EPFL/ SystolicAttention: Fusing FlashAttention within a Single Systolic Array

The frequent data swaps between the systolic array and external vector units result in low systolic array utilization. Code: https://github.com/VCA-EPFL/FSA

7 months, 2 weeks ago @ paperswithcode.com
/Buddhi19/ Precision Spatio-Temporal Feature Fusion for Robust Remote Sensing Change Detection


7 months, 2 weeks ago @ paperswithcode.com
/fudanvi/ Beyond Task-Specific Reasoning: A Unified Conditional Generative Framework for Abstract Visual Reasoning


7 months, 2 weeks ago @ paperswithcode.com
/benedekrozemberczki/ PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training

Spatiotemporal graph neural networks (ST-GNNs) are powerful tools for modeling spatial and temporal data dependencies. Code: https://github.com/benedekrozemberczki/pytorch_geometric_temporal

7 months, 2 weeks ago @ paperswithcode.com
/chengxuphd/ DCR: Quantifying Data Contamination in LLMs Evaluation


7 months, 2 weeks ago @ paperswithcode.com
/gitter-lab/ Assay2Mol: large language model-based drug design using BioAssay context

Scientific databases aggregate vast amounts of quantitative data alongside descriptive text. Code: https://github.com/gitter-lab/Assay2Mol

7 months, 2 weeks ago @ paperswithcode.com
/hayatkhan8660-maker/ DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition

We employ forward Kullback-Leibler (KL) divergence alongside spatio-temporal focal modulation to effectively transfer both local and global context from the Video-FocalNet Base (teacher) to the proposed VFL-Net (student). Code: https://github.com/hayatkhan8660-maker/DVFL-Net
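Forward KL here means minimizing KL(teacher ‖ student) over softened class distributions, pushing the student to cover the teacher's modes; a NumPy sketch (temperature and names are our assumptions):

```python
import numpy as np

def forward_kl(teacher_logits, student_logits, tau=1.0):
    """KL(p_teacher || q_student) between temperature-softened softmax
    distributions, as used in knowledge distillation losses."""
    def softmax(z):
        z = np.asarray(z, dtype=float) / tau
        z = z - z.max(axis=-1, keepdims=True)  # numerical stability
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)
    p, q = softmax(teacher_logits), softmax(student_logits)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))
```

The loss is zero when student and teacher distributions match and grows as the student assigns low probability to classes the teacher considers likely.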

7 months, 2 weeks ago @ paperswithcode.com
/JudyJuezhuLong/ Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows


7 months, 2 weeks ago @ paperswithcode.com
/joaojcorreia/ A Fuzzy Approach to Project Success: Measuring What Matters

This paper introduces a novel approach to project success evaluation by integrating fuzzy logic into an existing construct. Code: https://github.com/joaojcorreia/FuzzyLogic_ProjectSuccess

7 months, 2 weeks ago @ paperswithcode.com
/kunkunlin1221/ InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing

Extensive experiments demonstrate the effectiveness of InstructFLIP by outperforming SOTA models in accuracy and substantially reducing training redundancy across diverse domains in FAS. Code: https://github.com/kunkunlin1221/InstructFLIP

7 months, 2 weeks ago @ paperswithcode.com
/Linvyl/ Describe Anything Model for Visual Question Answering on Text-rich Images

Recent progress has been made in region-aware vision-language modeling, particularly with the emergence of the Describe Anything Model (DAM). Code: https://github.com/Linvyl/DAM-QA

7 months, 2 weeks ago @ paperswithcode.com
/abhijeet3922/ Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker

We propose a multi-step custom implementation utilizing widely adopted hybrid search (metadata & embedding) and a state-of-the-art late-interaction re-ranker to retrieve the best-matching pages. Code: https://github.com/abhijeet3922/vision-RAG

7 months, 2 weeks ago @ paperswithcode.com
/ziangcao0312/ PhysX: Physical-Grounded 3D Asset Generation

3D modeling is moving from virtual to physical. Code: https://github.com/ziangcao0312/PhysX

7 months, 2 weeks ago @ paperswithcode.com
/henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy

We present SpatialTrackerV2, a feed-forward 3D point tracking method for monocular videos. Code: https://github.com/henry123-boy/SpaTrackerV2

7 months, 2 weeks ago @ paperswithcode.com
/cncs-fit/ Emergence of Functionally Differentiated Structures via Mutual Information Optimization in Recurrent Neural Networks

Analysis of network performance, correlation patterns, and weight matrices reveals that mutual information minimization yields high task performance alongside clear functional modularity and moderate structural modularity. Code: https://github.com/cncs-fit/mio_rnn

7 months, 2 weeks ago @ paperswithcode.com
/coswindywang/ Making Language Model a Hierarchical Classifier and Generator

Language heads of the last layer are copied to different selected intermediate layers, and fine-tuned with different task inputs. Code: https://github.com/coswindywang/HdLM

7 months, 2 weeks ago @ paperswithcode.com
/ahmedehabb/ From Roots to Rewards: Dynamic Tree Reasoning with RL

Modern language models address complex questions through chain-of-thought (CoT) reasoning (Wei et al., 2023) and retrieval augmentation (Lewis et al., 2021), yet struggle with error propagation and knowledge integration. Code: https://github.com/ahmedehabb/From-Roots-to-Rewards-Dynamic-Tree-Reasoning-with-RL

7 months, 2 weeks ago @ paperswithcode.com
💼 University and corporation labs
DeepMind
last post 3 days, 12 hours ago
Gemini 3.1 Flash-Lite: Built for intelligence at scale

Today, we're introducing Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model.

Built for high-volume developer workloads at scale, 3.1 Flash-Lite delivers high quality for its price and model tier.

Starting today, 3.1 Flash-Lite is rolling out in preview to developers via the Gemini API in Google AI Studio and for enterprises via Vertex AI.

Cost-efficiency without compromise. Priced at just $0.25/1M input tokens and $1.50/1M output tokens, 3.1 Flash-Lite delivers enhanced performance at a fraction of the cost of larger models.
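At those quoted rates, the bill for a given workload is simple arithmetic (the helper name is ours, for illustration):

```python
def flash_lite_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost at the quoted preview pricing: $0.25 per 1M input tokens,
    $1.50 per 1M output tokens."""
    return input_tokens / 1e6 * 0.25 + output_tokens / 1e6 * 1.50

# 10M input + 2M output tokens → 10 * 0.25 + 2 * 1.50 = $5.50
flash_lite_cost_usd(10_000_000, 2_000_000)  # → 5.5
```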

This low latency is needed for high-frequency workflows, making it an ideal model for developers to build responsive, real-time experiences.

3 days, 12 hours ago @ blog.google
Nano Banana 2: Combining Pro capabilities with lightning-fast speed

In August of last year, our Gemini Image model, Nano Banana, became a viral sensation, redefining image generation and editing.

Then in November, we released Nano Banana Pro, offering users advanced intelligence and studio-quality creative control.

Today, we’re bringing the best of both worlds to users across Google.

Introducing Nano Banana 2 (Gemini 3.1 Flash Image), our latest state-of-the-art image model.

Now you can get the advanced world knowledge, quality and reasoning you love in Nano Banana Pro, at lightning-fast speed.

1 week, 1 day ago @ blog.google
Gemini 3.1 Pro: A smarter model for your most complex tasks

Today, we’re releasing the upgraded core intelligence that makes those breakthroughs possible: Gemini 3.1 Pro.

We are shipping 3.1 Pro across our consumer and developer products to bring this progress in intelligence to your everyday applications.

Starting today, 3.1 Pro is rolling out:
For developers, in preview via the Gemini API in Google AI Studio, Gemini CLI, our agentic development platform Google Antigravity, and Android Studio
For enterprises, in Vertex AI and Gemini Enterprise
For consumers, via the Gemini app and Notebook…

2 weeks, 1 day ago @ blog.google
A new way to express yourself: Gemini can now create music

New audio verification capabilities: All tracks generated in the Gemini app are embedded with SynthID, our imperceptible watermark for identifying Google AI-generated content.

We are also giving you more tools to help identify AI content, broadening our verification capabilities in the Gemini app to include audio, along with image and video.

Simply upload a file and ask if it was generated using Google AI, and Gemini will check for SynthID and use its own reasoning to return a response.

And Google AI Plus, Pro and Ultra subscribers will enjoy higher limits.

Our goal with music generation in the Gemini app is to help you add a fun, custom soundtrack to your daily life.

2 weeks, 2 days ago @ blog.google
Accelerating discovery in India through AI-powered science and education

In the global AI transformation, India is showing exceptional leadership in applying the technology to tackle its own biggest challenges.

But India is going even further, playing a critical international role by convening this week the fourth global AI summit of governments, companies and civil society.

Partnership in India to broaden AI access: Our partnerships are designed to accelerate the pace of progress across India.

Ltd., a K-12 textbook publisher in India, Gemini will be used to transform two million static textbooks into AI-powered interactive journeys across more than 250 titles and 2,000 schools.

2 weeks, 3 days ago @ deepmind.google
Gemini 3 Deep Think: Advancing science, research and engineering

Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.

3 weeks, 1 day ago @ deepmind.google
Accelerating Mathematical and Scientific Discovery with Gemini Deep Think

Under the direction of expert mathematicians and scientists, Gemini Deep Think is solving professional research problems across mathematics, physics, and computer science. In the summer of 2025, an advanced version of Gemini Deep Think achieved Gold-medal standard at the International Mathematical Olympiad (IMO) and, later, an updated version obtained similar results at the International Collegiate Programming Contest.

Since then, Gemini Deep Think mode has moved into science, engineering and enterprise workflows to tackle more complex, open-ended challenges.

In the last week, our teams published two papers (1, 2) detailing a cross-disciplinary effort to solve professional research problems usin…

3 weeks, 4 days ago @ deepmind.google
Project Genie: Experimenting with infinite, interactive worlds

Starting today, we're rolling out access to Project Genie for Google AI Ultra subscribers in the U.S. (18+).

This experimental research prototype lets users create, explore and remix their own interactive worlds.

How we’re advancing world models: A world model simulates the dynamics of an environment, predicting how it evolves and how actions affect it.

Building on our model research with trusted testers from across industries and domains, we are taking the next step with an experimental research prototype: Project Genie.

How Project Genie works: Project Genie is a prototype web app powered by Genie 3, Nano Banana Pro and Gemini, which allows users to experiment with the immersive experiences…

1 month ago @ blog.google
D4RT: Teaching AI to see the world in four dimensions

Introducing D4RT, a unified AI model for 4D scene reconstruction and tracking across space and time.

Today, we are introducing D4RT (Dynamic 4D Reconstruction and Tracking), a new AI model that unifies dynamic scene reconstruction into a single, efficient framework, bringing us closer to the next frontier of artificial intelligence: total perception of our dynamic reality.

How D4RT Works: A Query-Based Approach. D4RT operates as a unified encoder-decoder Transformer architecture.

The encoder first processes the input video into a compressed representation of the scene’s geometry and motion.

This makes D4RT extremely fast and scalable, whether it’s tracking just a few points or reconstructing …

1 month, 2 weeks ago @ deepmind.google
Veo 3.1 Ingredients to Video: More consistency, creativity and control

Today, Veo is getting more expressive, with improvements that help you create more fun, creative, high-quality videos based on ingredient images, built directly for the mobile format.

We’re excited to bring new creative possibilities for everyone from casual storytellers to professional filmmakers.

We’re releasing:
Improvements to Veo 3.1 Ingredients to Video, our capability that lets you create videos based on reference images. This update makes videos more expressive and creative, even with simple prompts
Native vertical outputs for Ingredients to Video (portrait mode) to power mobile-first, short-form video creation
State-of-the-art upscaling to 1080p and 4K resolution for high-fidelity p…

1 month, 3 weeks ago @ blog.google
Google's year in review: 8 areas with research breakthroughs in 2025

Google 2025 recap: Research breakthroughs of the year

2 months, 1 week ago @ deepmind.google
Gemini 3 Flash: frontier intelligence built for speed

Today, we're expanding the Gemini 3 model family with the release of Gemini 3 Flash, which offers frontier intelligence built for speed at a fraction of the cost.

Last month, we kicked off Gemini 3 with Gemini 3 Pro and Gemini 3 Deep Think mode, and the response has been incredible.

With Gemini 3, we introduced frontier performance across complex reasoning, multimodal and vision understanding and agentic and vibe coding tasks.

Gemini 3 Flash retains this foundation, combining Gemini 3's Pro-grade reasoning with Flash-level latency, efficiency and cost.

Starting today, Gemini 3 Flash is rolling out to millions of people globally:

2 months, 2 weeks ago @ blog.google
Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior

Today, we are releasing Gemma Scope 2: a comprehensive, open suite of interpretability tools for all Gemma 3 model sizes, from 270M to 27B parameters.

Producing Gemma Scope 2 involved storing approximately 110 Petabytes of data, as well as training over 1 trillion total parameters.

What’s new in Gemma Scope 2: Interpretability research aims to understand the internal workings and learned algorithms of AI models.

Like its predecessor, Gemma Scope 2 acts as a microscope for the Gemma family of language models.

While the original Gemma Scope enabled research in key areas of safety, such as model hallucination, identifying secrets known by a model, and training safer models, Gemma Scope 2 support…

2 months, 2 weeks ago @ deepmind.google
Improved Gemini audio models for powerful voice experiences

Earlier this week, we introduced greater control over audio generation with an upgrade to our Gemini 2.5 Pro and Flash Text-to-Speech models.

Today, we’re releasing an updated Gemini 2.5 Flash Native Audio for live voice agents.

Gemini 2.5 Flash Native Audio is now available across Google products, including Google AI Studio and Vertex AI, and has also started rolling out in Gemini Live and Search Live, bringing the naturalness of native audio to Search Live for the first time.

Beyond powering helpful agents, native audio unlocks new possibilities for global communication.

Live Voice Agents

2 months, 3 weeks ago @ blog.google
Deepening our partnership with the UK AI Security Institute

Today, we're announcing an expanded partnership with the UK AI Security Institute (AISI) through a new Memorandum of Understanding focused on foundational security and safety research, to help ensure artificial intelligence is developed safely and benefits everyone.

The research partnership with AISI is an important part of our broader collaboration with the UK government on accelerating safe and beneficial AI progress.

This is why we have partnered with the UK AISI since its inception in November 2023 to test our most capable models.

We are actively working with AISI to build more robust evaluations for AI models, and our teams have collaborated on safety research to move the field forward…

2 months, 3 weeks ago @ deepmind.google
Google
latest post 1 day, 11 hours ago
The ultimate Nano Banana prompting guide

Built on the Gemini 3 family of models, Nano Banana models apply deep reasoning capabilities to fully understand your prompt before generating an image.

So we spent weeks testing Nano Banana 2 and Nano Banana Pro against every use case we could imagine to test their limits.

What you’ll learn in this guide:
Model overview
Full breakdown of tech specs
Best practices for effective prompting
Prompting frameworks
How Nano Banana works with other creative models, Veo and Lyria.

Most recently, we announced Nano Banana 2, which shines in three ways:
More accurate visuals: Nano Banana 2 is powered by real-time information and images from web search.

Breakdown of tech specs for Nano Banana 2 and Nano Ban…

1 day, 11 hours ago @ cloud.google.com
Small models, high quality: Inside BMW Group’s experiments evaluating domain-specific language models

One way of achieving better, more natural voice commands is by incorporating AI foundation models into vehicle systems, which offer more intelligence than traditional voice commands.

While large language models (LLMs) offer powerful capabilities, they present one considerable drawback, at least in automotive settings: their reliance on consistent network access makes LLMs impractical for in-vehicle use due to potential lag and interruption.

Small language models may offer an ideal balance — but finding the right trade-off between size and capability requires careful optimization.

These purpose-built, right-sized generative AI models can be run directly on edge devices, including vehicles.

S…

2 days, 11 hours ago @ cloud.google.com
Designing private network connectivity for RAG-capable gen AI apps

The flexibility of Google Cloud allows enterprises to build secure and reliable architecture for their AI workloads.

In this blog we will look at a reference architecture for private connectivity for retrieval-augmented generation (RAG)-capable generative AI applications.

This architecture is for scenarios where communications of the overall system must use private IP addresses and must not traverse the internet.

RAG allows an application to retrieve relevant information from your documents, data sources, or databases in real time.

Design pattern example: To understand how to think about setting up your network for private connectivity for a RAG application in a regional design, let's look at …

4 days, 11 hours ago @ cloud.google.com
PayPal's historically large data migration is the foundation for its gen AI innovation

To continue leading the next wave of innovation in financial services, we knew we had to modernize our data foundation.

The journey to unified data: After choosing Google Cloud as our data partner, we embarked on our historic data migration.

Discovery and analysis: Detailed inventories of data, workloads, and inbound/outbound data streams are crucial for defining scope and effort and for forecasting budget.

Personalized financial insights that help merchants optimize their businesses.

Lessons for the AI era: While our migration may be extraordinary in its scale, we are not alone in our needs or ambitions.

1 week, 1 day ago @ cloud.google.com
Pro-level image generation gets faster and more accessible with Nano Banana 2

Today, we’re entering a new and vibrant era of generative creativity with Nano Banana 2.

What’s new: Nano Banana 2 is our state-of-the-art image generation and editing model.

Whether you’re building the next major consumer app or marketing campaign, Nano Banana 2 delivers:
More accurate visuals: Nano Banana 2 is powered by real-time information and images from web search.

Where you can access Nano Banana 2: Developers can start building with Nano Banana 2 in preview using the Gemini API in Vertex AI.

Deep dive into how Nano Banana 2 sets a new bar for creativity: Advanced world knowledge: Nano Banana 2 is powered by real-time information and images from web search.

1 week, 1 day ago @ cloud.google.com
A developer's guide to production-ready AI agents

AI agents have moved from "interesting research concept" to "thing my team is actually building."

Agents don't behave like traditional software.

The patterns that served us well for deterministic code don't fully translate.

These resources first appeared during Kaggle’s 5 days of AI Agents Intensive, and they’ve proven so popular and useful, we wanted to make sure a wider audience had access, as well.

These guides offer practical frameworks and code samples you can adapt to your own projects.

1 week, 2 days ago @ cloud.google.com
Using Google Cloud AI to measure the physics of U.S. freestyle snowboarding and skiing

Nearly every snowboard trick carries a number.

Without sensors on the athlete's body, there was never a way to measure what happens mid-flight.

Ski & Snowboard ahead of the Olympic Winter Games Milano Cortina 2026, we built an AI tool on Google Cloud that extracts full 3D biomechanical data from ordinary video.

The name breaks down like this: two off-axis inversions plus two horizontal rotations, each assigned a clean 360°, totaling 1,440°.

When we measured the true geometric rotation of his 3D pose through space, the number came back at an estimated 1,122°.

2 weeks, 1 day ago @ cloud.google.com
Simplify your AI workflow with autonomous embedding generation in BigQuery

Traditionally, users have to set up embedding generation pipelines themselves to propagate source content updates, embedding generation, and storage.

To help BigQuery users with their AI workloads, we’re introducing autonomous embedding generation.

Managing embeddings, the old way: Before autonomous generation, the process of updating your vector search database usually looked like this:
Detect new rows in your source table.

Automatic synchronization: BigQuery should manage embedding generation on behalf of the user and keep generated embeddings in sync with the source data.

Using a familiar SQL syntax, you can now define an autonomous embedding column that BigQuery manages for you.
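The manual detect-generate-store loop described above can be sketched in a few lines. This is a hypothetical illustration, not BigQuery's API: the `embed()` stand-in and the dictionary-backed tables are assumptions made only to show the pipeline shape that autonomous columns absorb.

```python
# Hypothetical sketch of the manual embedding-sync loop that
# autonomous embedding generation is described as replacing.
# embed() is a toy stand-in for a real embedding model call.

def embed(text: str) -> list[float]:
    # Toy deterministic "embedding"; a real pipeline would call a model.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]

def sync_embeddings(source_rows: dict[int, str],
                    embedding_table: dict[int, list[float]]) -> int:
    """Detect rows missing an embedding, generate one, and store it.
    Returns how many rows were updated."""
    updated = 0
    for row_id, content in source_rows.items():
        if row_id not in embedding_table:              # step 1: detect new rows
            embedding_table[row_id] = embed(content)   # steps 2-3: generate, store
            updated += 1
    return updated
```

With an autonomous embedding column, this loop runs inside BigQuery instead of in user-managed pipeline code.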

2 weeks, 1 day ago @ cloud.google.com
Introducing Gemini 3.1 Pro on Google Cloud

Today, we’re announcing a step forward in core reasoning building on the Gemini 3 series: Gemini 3.1 Pro.

Why it matters for your business: Gemini 3.1 Pro is designed to solve tougher problems, giving you the reasoning depth your business needs.

Where you can access Gemini 3.1 Pro: Gemini 3.1 Pro is available starting today in preview in Vertex AI and Gemini Enterprise.

Developers can access the model in preview via the Gemini API in Google AI Studio, Android Studio, Google Antigravity, and Gemini CLI.

Gemini 3.1 Pro is already helping customers deliver results: We’re already seeing progress for businesses across industries – from clear leaps in quality to excellent execution.

2 weeks, 1 day ago @ cloud.google.com
Powering the next generation of agents with Google Cloud databases

These servers run in Google Cloud, providing a secure interface for Gemini and other MCP-compliant clients to easily interact with data and infrastructure.

Firestore: Focused on mobile and web development, the Firestore MCP server enables agents to sync with live document collections.

The Developer Knowledge MCP server connects IDEs to Google’s documentation, allowing agents to answer technical questions and troubleshoot code with relevant context.

Full observability: To track agent activity, every query and action taken via these MCP servers is logged in Cloud Audit Logs.

Demo: From local code to managed data. Let’s see these new MCP servers in action.

2 weeks, 2 days ago @ cloud.google.com
Unlocking enterprise data to accelerate agentic AI: How Ab Initio does it

To address the data challenges of the AI era, Google Cloud and Ab Initio are announcing a suite of products, including new data connectors, metadata connectors, and agents.

Taken together, these can help build agentic AI experiences that enable autonomous actions and accelerated human-in-the-loop decisions.

Ab Initio’s data integration, governance, active metadata, and agentic AI capabilities integrate directly with the Google Cloud platform, notably BigQuery, Dataplex Universal Catalog, and Gemini.

How Google's Data Cloud powers agentic AIGoogle's Data Cloud is built to deliver the data and context needed to power agentic AI.

For Google Cloud customers, Ab Initio federates data from across on-…

2 weeks, 2 days ago @ cloud.google.com
Your guide to Provisioned Throughput (PT) on Vertex AI

When AI agents make thousands of decisions a day, consistent performance isn't just a technical detail — it's a business requirement.

To help you scale, we are updating PT on Vertex AI with three key improvements:
Model diversity: Run the right model for the right job.

In this post, we’ll share the resources available to you today on Vertex AI, and how you can get started.

Vertex AI Model Garden, our curated set of 200+ first-party, third-party, and open-source models, makes it easy to use the best resource for your business needs.

Powering multimodal innovation: The next wave of AI agents is seeing, hearing, and acting in real time.

2 weeks, 2 days ago @ cloud.google.com
Build financial resilience with AI-powered tabletop exercises on Google Cloud

The risk is amplified by major regulatory drivers like the Digital Operational Resilience Act (DORA), which mandates that financial institutions are ready for any disruption.

The recent designation of Google Cloud as a Critical Third-Party Service Provider (CTPP) under DORA further underscores this strong commitment to enabling secure and resilient financial operations for our customers.

For them, a critical incident means more than just downtime; it means regulatory fines and an erosion of client trust.

In this blog, we’ll share a solution using context-aware scenario modeling on Gemini Enterprise.

Context-aware scenario modeling powered by Google AI: Google Cloud’s Technical Account Managem…

3 weeks, 2 days ago @ cloud.google.com
Gemini Enterprise Agent Ready (GEAR) program now available, a new path to building AI agents at scale

To meet this moment, we are excited to open the Gemini Enterprise Agent Ready (GEAR) learning program to everyone.

As a new specialized pathway within the Google Developer Program, GEAR empowers developers and pros to build and deploy enterprise-grade agents with Google AI.

That is why GEAR membership includes 35 monthly learning credits on Google Skills.

Turn technical proficiency into industry-recognized credentials: As you progress through these learning paths, GEAR helps turn your progress into tangible credentials.

You can access GEAR benefits by creating or signing into your Google Developer Program profile and claiming the GEAR badge.

3 weeks, 3 days ago @ cloud.google.com
How we cut Vertex AI latency by 35% with GKE Inference Gateway

As generative AI moves from experimentation to production, platform engineers face a universal challenge for inference serving: you need low latency, high throughput, and manageable costs.

Our solution: To solve this, the Vertex AI engineering team adopted the GKE Inference Gateway.

Content-aware routing: It inspects request prefixes and routes to the pod that already has that context in its KV cache, avoiding expensive re-computation.
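The prefix-affinity idea can be sketched minimally. The pod names, the string-based cache representation, and the longest-shared-prefix scoring below are illustrative assumptions for exposition, not the GKE Inference Gateway implementation:

```python
# Illustrative sketch of content-aware (prefix-affinity) routing:
# send each request to the pod whose cached context shares the
# longest prefix with it, so that prefix need not be recomputed.

def shared_prefix_len(a: str, b: str) -> int:
    """Length of the common prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def route(request: str, pod_cache: dict[str, str]) -> str:
    """Pick the pod whose cached context overlaps the request most.
    pod_cache maps pod name -> the prefix held in that pod's KV cache."""
    return max(pod_cache,
               key=lambda pod: shared_prefix_len(request, pod_cache[pod]))
```

In a real gateway the affinity score would be computed over token IDs and combined with load signals, but the routing decision has this shape.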

By migrating production workloads to this architecture, Vertex AI proves that this dual-layer intelligence is the key to unlocking performance at scale.

The results: Validated at production scaleBy placing GKE Inference Gateway in front of the Vertex AI model…

4 weeks ago @ cloud.google.com
OpenAI
latest post None
Microsoft
latest post 2 days, 10 hours ago
Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model

At a glance Phi-4-reasoning-vision-15B is a compact and smart open‑weight multimodal reasoning model that balances reasoning power, efficiency, and training data needs.

This leads to several possible training pipelines:
Non-reasoning LLM → reasoning multimodal training: Reasoning and multimodal capabilities are trained together.

Non-reasoning LLM → non-reasoning multimodal → reasoning multimodal training: Multimodal capabilities are learned first, then reasoning is added.

Reasoning LLM → reasoning multimodal training: A reasoning base is used, but all multimodal d…

2 days, 10 hours ago @ microsoft.com
Trailer: The Shape of Things to Come

This is The Shape of Things to Come.

I manage Microsoft Research’s worldwide labs, and I’m excited to introduce this new Microsoft Research Podcast series.

I called the podcast The Shape of Things to Come because as researchers, the problems that we choose to solve and the technologies that we develop do change the shape of the future.

It’s very hard to say whether we’re in an inflection point because I see the advancement of technology accelerating.

But I don’t know what the inflection point is because all I’ve seen is a curve going up.

3 days, 15 hours ago @ microsoft.com
CORPGEN advances AI agents for real work

To determine what a benchmark would need to test, we ran MHTEs at scale on some of today’s leading AI agents, exposing four weaknesses.

CORPGEN’s architectureCORPGEN introduces digital employees: LLM-powered AI agents with persistent identities, role-specific expertise, and realistic work schedules.

Across 46 tasks, CORPGEN completed 15.2% of tasks, compared with 4.3% for the baselines, roughly 3.5 times more.

CORPGEN also opens a new lens on how AI agents collaborate.

AcknowledgmentsThis work is a result of a collaboration between the Office of the CTO at Microsoft and the Microsoft AI Development Accelerator Program (MAIDAP).

1 week, 1 day ago @ microsoft.com
Media Authenticity Methods in Practice: Capabilities, Limitations, and Directions

We refer to technologies aimed at helping viewers verify the source and history—that is, the provenance—of digital content as media integrity and authentication (MIA) methods.

Today, we are publishing our findings in the Media Integrity & Authentication: Status, Directions & Futures report.

The report distills lessons learned and outlines practical directions for strengthening media integrity in the years ahead.

Findings and directions forward: Our research recognizes that different media integrity and authenticity methods serve differing purposes and offer distinct levels of protection.

Important directions include in-stream tools that display provenance inf…

2 weeks, 1 day ago @ microsoft.com
Project Silica’s advances in glass storage technology

At a glance Microsoft Research publishes breakthrough in Nature on glass-based data storage that could preserve information for 10,000 years.

Glass is a permanent data storage material that is resistant to water, heat, and dust.

Phase voxels, a new storage method: We invented a new type of data storage in glass called phase voxels, in which the phase change of the glass is modified instead of its polarization, showing that only a single pulse is necessary to make a phase voxel.

We demonstrated that these phase voxels can also be formed in borosilicate glass and devised a technique to read the phase information from phase voxels encoded in this material.

A research-grade Writer used to set t…

2 weeks, 2 days ago @ microsoft.com
Rethinking imitation learning with Predictive Inverse Dynamics Models

Predictive Inverse Dynamics Models (PIDMs) predict plausible future states, clarifying the direction of behavior during imitation learning.

Predictive Inverse Dynamics Models (PIDMs) offer a different take on imitation learning by changing how agents interpret human behavior.

By grounding the selection process in a plausible future, PIDMs provide a clearer basis for choosing an action during inference.

They then use an inverse dynamics model to predict the action required to move from the current state towards that future state.
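That two-stage decomposition can be sketched in a toy one-dimensional setting. Both models below are hypothetical stand-ins chosen only to make the factoring concrete, not the trained networks from the post:

```python
# Toy 1-D sketch of the two-stage PIDM decomposition described above:
# (1) predict a plausible future state, (2) invert the dynamics to get
# the action that moves the current state toward that future state.

def predict_future_state(state: float) -> float:
    # Stand-in predictive model: assume demonstrations drift toward 10.0.
    return state + 0.5 * (10.0 - state)

def inverse_dynamics(state: float, future_state: float) -> float:
    # Stand-in inverse dynamics: with dynamics s' = s + a, the
    # required action is simply the state difference.
    return future_state - state

def pidm_act(state: float) -> float:
    target = predict_future_state(state)    # stage 1: where should we be next?
    return inverse_dynamics(state, target)  # stage 2: which action gets us there?
```

The key design point survives the simplification: the action is never chosen directly, only derived from a predicted future.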

A player (left) and a PIDM agent (right) side by side playing the game Bleeding Edge.

4 weeks, 1 day ago @ microsoft.com
Paza: Introducing automatic speech recognition benchmarks and models for low resource languages

At a glance Microsoft Research releases PazaBench and Paza automatic speech recognition models, advancing speech technology for low-resource languages.

First-of-its-kind ASR leaderboard, starting with African languages: PazaBench is the first automatic speech recognition (ASR) leaderboard for low-resource languages.

It launches with initial coverage for 39 African languages and benchmarks 52 state‑of‑the‑art ASR and language models, including newly released Paza ASR models for six Kenyan languages.

Paza ASR Models: Built with and for Kenyan languagesThe Paza ASR models consist of three fine-tuned ASR models built on top of state‑of‑the‑art model architectures.

Here is how Paza models compa…

4 weeks, 1 day ago @ microsoft.com
UniRG: Scaling medical imaging report generation with multimodal reinforcement learning

At a glance AI-driven medical image report generation can help medical providers become more efficient and productive.

Universal Report Generation (UniRG) uses reinforcement learning to align model training with real-world radiology practice rather than proxy text-generation objectives.

Beyond the real-world benefits, report generation has also become a critical benchmark for evaluating multimodal reasoning in healthcare AI.

In this blog, we introduce Universal Report Generation (UniRG) (opens in new tab), a reinforcement learning–based framework for medical imaging report generation.

UniRG is a promising step toward scaling medical imaging report generationUniRG introduces a reinforcement …

1 month, 1 week ago @ microsoft.com
Multimodal reinforcement learning with agentic verifier for AI agents

Argos is an agentic verification framework designed to improve the reliability of reinforcement learning in multimodal models.

Reinforcement learning is a training method where AI models learn by receiving rewards for desired behaviors and penalties for undesired ones, gradually improving their performance through trial and error.
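The reward-driven loop described above can be sketched with a toy two-armed bandit. This is an illustration of reinforcement learning in general, not the Argos training setup; all names and numbers are invented:

```python
import random

# Toy two-armed bandit (invented example, not the Argos setup): the agent
# tries actions, receives rewards, and updates running value estimates.
def run_bandit(steps=2000, seed=0):
    rng = random.Random(seed)
    reward_prob = {"a": 0.2, "b": 0.8}   # action "b" is truly better
    value = {"a": 0.0, "b": 0.0}         # learned reward estimates
    counts = {"a": 0, "b": 0}
    for _ in range(steps):
        # epsilon-greedy: mostly exploit the current best, sometimes explore
        if rng.random() < 0.1:
            action = rng.choice(["a", "b"])
        else:
            action = max(value, key=value.get)
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        counts[action] += 1
        # incremental mean update of the action's estimated reward
        value[action] += (reward - value[action]) / counts[action]
    return value

v = run_bandit()
print(v["b"] > v["a"])  # the agent has learned that "b" pays off more
```

Over many trials the estimates approach the true reward rates, which is the "trial and error" improvement the excerpt describes.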

This design prevents unreliable feedback from dominating training and produces a stable reward signal for reinforcement learning.

Models trained with Argos also showed a substantial reduction of hallucinations compared with both standard chain-of-thought prompting and reinforcement learning baselines.

How Argos shapes reinforcement learning
To understand how Argos affe…

1 month, 2 weeks ago @ microsoft.com
OptiMind: A small language model with optimization expertise

OptiMind is a small language model designed to convert business problems described in natural language into the mathematical formulations needed by optimization software.

Enterprises across industries, from energy to finance, use optimization models to plan complex operations like supply chains and logistics.

To address this challenge, we’re introducing OptiMind, a small language model designed to convert problems described in plain language into the mathematical formulations that optimization software needs.
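As a hedged illustration of what "the mathematical formulations that optimization software needs" look like, here is a toy plain-language planning problem written out as an explicit objective, constraints, and solution. This is not OptiMind output; the problem and numbers are invented, and the greedy solve is valid only because this particular linear program is trivial:

```python
# Plain-language problem: "Ship at least 100 units via route A ($2/unit) or
# route B ($3/unit); each route carries at most 80 units. Minimize cost."
# Formulation: minimize 2*xA + 3*xB  s.t.  xA + xB >= 100, 0 <= xA, xB <= 80.
routes = {"A": {"cost": 2, "capacity": 80},
          "B": {"cost": 3, "capacity": 80}}
demand = 100

# Because the objective and constraints are linear and separable, filling
# the cheapest route first is optimal for this toy instance.
plan, remaining = {}, demand
for name, r in sorted(routes.items(), key=lambda kv: kv[1]["cost"]):
    plan[name] = min(r["capacity"], remaining)
    remaining -= plan[name]

total_cost = sum(plan[n] * routes[n]["cost"] for n in plan)
print(plan, total_cost)  # {'A': 80, 'B': 20} 220
```

A real deployment would hand the formulation to a solver rather than a greedy fill; the point is the translation from prose to objective and constraints.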

Built on a 20-billion parameter model, OptiMind is compact by today’s standards yet matches the performance of larger, more complex systems.

From problem description to solution
One of …

1 month, 2 weeks ago @ microsoft.com
Agent Lightning: Adding reinforcement learning to AI agents without code rewrites

To address this, a research team from Microsoft Research Asia – Shanghai has introduced Agent Lightning.

Whether it involves multiple collaborating agents or dynamic tool use, Agent Lightning breaks it down into a sequence of transitions.
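The idea of breaking an agent run into transitions can be sketched as follows. This is a hypothetical illustration, not Agent Lightning's actual API; the trace format and reward placement are assumptions:

```python
# Hypothetical sketch (not Agent Lightning's actual API): flatten one agent
# run into RL transitions. Each model call becomes a (state, action, reward)
# record, with the episode's reward attached to the final step.
def trace_to_transitions(trace, final_reward):
    """trace: list of (prompt, response) pairs from one agent run."""
    transitions = []
    for i, (prompt, response) in enumerate(trace):
        reward = final_reward if i == len(trace) - 1 else 0.0
        transitions.append({"state": prompt, "action": response, "reward": reward})
    return transitions

run = [("plan the task", "call the search tool"),
       ("tool result: ...", "final answer")]
transitions = trace_to_transitions(run, final_reward=1.0)
print([t["reward"] for t in transitions])  # [0.0, 1.0]
```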

Agent Lightning as middleware
Agent Lightning serves as middleware between RL algorithms and agent environments, providing modular components that enable scalable RL through standardized protocols and well-defined interfaces.

In practice, developers can keep their existing agent frameworks and switch model calls to the Agent Lightning API without changing their agent code (Figure 5).

By bridging existing agentic systems with reinforcement learning, Age…

2 months, 3 weeks ago @ microsoft.com
Promptions helps make AI prompting more precise with dynamic UI controls

To address this, we are excited to introduce Promptions (prompt + options), a UI framework that helps developers build AI interfaces with more precise user control.

We compared the static design from the first study, called the “Static Prompt Refinement Control” (Static PRC), against a “Dynamic Prompt Refinement Control” (Dynamic PRC) with features that responded to participants’ feedback.

Comparison of user preferences for Static PRC versus Dynamic PRC across key evaluation criteria.

(1) The Option Module reads the user’s prompt and conversation history and (2) generates prompt options.

Key usability challenges include clarifying how dynamic options affect AI output and managing the comple…

2 months, 3 weeks ago @ microsoft.com
GigaTIME: Scaling tumor microenvironment modeling using virtual population generated by multimodal AI

C, Scatter plot comparing the subtype-level GigaTIME-translated virtual mIF activations between TCGA and Providence virtual populations.

To our knowledge, this is the first large-scale study exploring multimodal AI for scaling virtual mIF generation.

H, A case study showcasing the activation maps across different virtual mIF channels for an H&E slide in our virtual population, and virtual mIF of sample patches from this slide.

By applying GigaTIME to Providence real-world data, we generated a virtual population of 14,256 patients with virtual mIF and key clinical attributes.

G, Bar plot comparing pan-cancer patient stratification performance in terms of survival log rank p-values among virtu…

2 months, 3 weeks ago @ microsoft.com
Ideas: Community building, machine learning, and the future of AI

This week, machine learning researchers around the world will be attending the annual Conference on Neural Information Processing Systems, or NeurIPS.

In this series, we’ll explore the technologies that are shaping our future and the big ideas that propel them forward.

So around that time when I started my PhD at Penn, I was working in machine learning theory and algorithmic economics.

How had you experienced a lack of community or network of women in machine learning before the founding of WiML?

So particularly when working on topics related to fairness, I’ve ended up focusing a bunch on stuff to do with marginalized groups as part of my responsible AI work.

3 months ago @ microsoft.com
MIT AI
last post 2 days, 23 hours ago
A “ChatGPT for spreadsheets” helps solve difficult engineering challenges faster

Foundation models are huge artificial intelligence systems trained on vast, general datasets.

“A tabular foundation model is like a ChatGPT for spreadsheets.

It does this by using a tabular foundation model to estimate which variables (or combinations of variables) most influence the outcome.
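The article doesn't say how influence is estimated; one common technique for ranking variable influence is permutation importance, sketched here with a toy stand-in for the tabular foundation model:

```python
import random

# Illustrative only (the article doesn't name the method): permutation
# importance shuffles one column and measures how much prediction error
# rises; influential columns cause a larger rise.
def permutation_importance(predict, X, y, col, trials=20, seed=0):
    rng = random.Random(seed)
    mse = lambda rows: sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)
    base = mse(X)
    total = 0.0
    for _ in range(trials):
        shuffled = [row[:] for row in X]
        vals = [row[col] for row in shuffled]
        rng.shuffle(vals)
        for row, v in zip(shuffled, vals):
            row[col] = v
        total += mse(shuffled) - base
    return total / trials

# Toy stand-in for a tabular model that only ever uses column 0.
X = [[i, i % 3] for i in range(30)]
y = [2 * row[0] for row in X]
model = lambda row: 2 * row[0]
imp0 = permutation_importance(model, X, y, 0)
imp1 = permutation_importance(model, X, y, 1)
print(imp0 > imp1)  # True: column 0 drives the outcome, column 1 doesn't
```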

Bigger problems, better solutions
One of their biggest challenges was finding the best tabular foundation model for this task, Yu says.

In the future, the researchers want to study methods that could boost the performance of tabular foundation models.

2 days, 23 hours ago @ news.mit.edu
Featured video: Coding for underwater robotics

During a summer internship at MIT Lincoln Laboratory, Ivy Mahncke, an undergraduate student of robotics engineering at Olin College of Engineering, took a hands-on approach to testing algorithms for underwater navigation.

She first discovered her love for working with underwater robotics as an intern at the Woods Hole Oceanographic Institution in 2024.

Mahncke spent the summer developing and troubleshooting an algorithm that would help a human diver and robotic vehicle collaboratively navigate underwater.

Her work in the laboratory culminated in field tests of the algorithm on an operational underwater vehicle.

Video by Tim Briggs/MIT Lincoln Laboratory | 2 minutes, 59 seconds

1 week ago @ news.mit.edu
New method could increase LLM training efficiency

This reduces the amount of work the reasoning model must do, accelerating the training process.

When tested on multiple reasoning LLMs, the method doubled the training speed while preserving accuracy.

To teach them this skill, developers train reasoning LLMs using a technique called reinforcement learning (RL).

TLT reuses some components of the reasoning model training process to train the drafter, leading to extra gains in acceleration.

As an added bonus, the small drafter model could readily be utilized for efficient deployment as a free byproduct.
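The draft-then-verify pattern behind drafter-based acceleration can be sketched as follows. This is a generic illustration of speculative decoding, not the method in the article itself; the toy "models" are invented:

```python
# Generic draft-then-verify sketch: a small drafter proposes k tokens; the
# large model checks them and keeps the longest agreeing prefix, plus its
# own correction at the first disagreement.
def speculative_step(draft_next, verify_next, context, k=4):
    proposed, ctx = [], list(context)
    for _ in range(k):
        t = draft_next(ctx)
        proposed.append(t)
        ctx.append(t)
    accepted, ctx = [], list(context)
    for t in proposed:
        if verify_next(ctx) == t:          # large model agrees: accept for free
            accepted.append(t)
            ctx.append(t)
        else:                              # disagreement: take the large model's token
            accepted.append(verify_next(ctx))
            break
    return accepted

# Toy "models": the verifier counts upward; the drafter does too but caps at 3.
draft = lambda ctx: min(ctx[-1] + 1, 3)
verify = lambda ctx: ctx[-1] + 1
accepted = speculative_step(draft, verify, [1])
print(accepted)  # [2, 3, 4]: two draft tokens accepted, then a correction
```

The speedup comes from the accepted prefix: every agreed token saves one full pass of the large model.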

1 week, 1 day ago @ news.mit.edu
Mixing generative AI with physics to create personal items that work in the real world

In an attempt to make these designs work in the real world, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are giving generative AI models a reality check.

Their “PhysiOpt” system augments these tools with physics simulations, making blueprints for personal items such as cups, keyholders, and bookends work as intended when they’re 3D printed.

This comprehensive scan provides a heat map over your 3D model, which indicates where your blueprint isn’t well-supported.

CSAIL researchers observed that PhysiOpt’s visual know-how helped it create 3D models more efficiently than “DiffIPC,” a comparable method that simulates and optimizes shapes.

The researchers’ …

1 week, 2 days ago @ news.mit.edu
AI to help researchers see the bigger picture in cell biology

For instance, measuring proteins in a cell could yield different information about the effects of cancer than measuring gene expression or cell morphology.

While we have many ways of looking at a cell, at the end of the day we only have one underlying cell state.

“As a user, you can simply input your cell data and it automatically tells you which data are shared and which data are modality-specific,” Zhang says.

After training, the model can identify which data are shared and which are unique when fed cell data it has never seen before.

In addition, the researchers used their method to identify which measurement modality captured a certain protein marker that indicates DNA damage in cancer …

1 week, 2 days ago @ news.mit.edu
Enhancing maritime cybersecurity with technology and policy

His research with the MIT Laboratory for Information and Decision Systems (LIDS) and the MIT Maritime Consortium team aims to improve the cybersecurity of critical maritime infrastructure using artificial intelligence, considering both the technology and policy frameworks of solutions.

“My current research focuses on applying AI techniques to cybersecurity problems and examining the policy implications of these advancements, especially in the context of maritime cybersecurity,” says Janjusevic.

In summer 2025, he interned with the Network Detection team at the AI cybersecurity company Vectra AI.

“In AI cybersecurity, the policy element is really important, as the field is so fast-moving and…

1 week, 2 days ago @ news.mit.edu
Study: AI chatbots provide less-accurate information to vulnerable users

The models also refuse to answer questions at higher rates for these users, and in some cases, respond with condescending or patronizing language.

The researchers prepended short user biographies to each question, varying three traits: education level, English proficiency, and country of origin.
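The setup described, prepending short biographies that vary three traits, can be sketched as follows (the trait values and sample question are invented, not the study's materials):

```python
from itertools import product

# Sketch of the experimental setup as described: prepend a short biography
# to the same question, crossing three user traits.
educations = ["no formal education", "a PhD"]
proficiencies = ["non-native English speaker", "native English speaker"]
countries = ["Country A", "Country B"]

def build_prompt(question, edu, prof, country):
    bio = f"I am a {prof} from {country} with {edu}."
    return f"{bio}\n\n{question}"

question = "How does vaccination produce immunity?"
prompts = [build_prompt(question, e, p, c)
           for e, p, c in product(educations, proficiencies, countries)]
print(len(prompts))  # 8 biography variants of the same question
```

Comparing model answers across these variants isolates the effect of the biography, since the question itself never changes.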

The effects were most pronounced for users at the intersection of these categories: those with less formal education who were also non-native English speakers saw the largest declines in response quality.

“LLMs have been marketed as tools that will foster more equitable access to information and revolutionize personalized learning,” says Poole-Dayan.

The people who may rely on these tools the most c…

2 weeks, 1 day ago @ news.mit.edu
Exposing biases, moods, personalities, and abstract concepts hidden in large language models

By now, ChatGPT, Claude, and other large language models have accumulated so much human knowledge that they’re far from simple answer-generators; they can also express abstract concepts, such as certain tones, personalities, biases, and moods.

However, it’s not obvious exactly how these models represent abstract concepts to begin with from the knowledge they contain.

Now a team from MIT and the University of California San Diego has developed a way to test whether a large language model (LLM) contains hidden biases, personalities, moods, or other abstract concepts.
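The article doesn't detail the method, but a common way to test whether a model's activations encode a concept is a linear probe. Here is a minimal mean-difference probe on toy activation vectors; everything below is invented for illustration and may differ from the paper's technique:

```python
# Invented toy linear probe: find a direction in activation space separating
# examples that express a concept from ones that don't, then score new
# activations by projecting onto that direction.
def probe_direction(pos_acts, neg_acts):
    dim = len(pos_acts[0])
    mean = lambda rows, j: sum(r[j] for r in rows) / len(rows)
    return [mean(pos_acts, j) - mean(neg_acts, j) for j in range(dim)]

def score(direction, act):
    return sum(d * a for d, a in zip(direction, act))

# Toy activations where the concept lives in the second coordinate.
pos = [[0.1, 1.0], [0.0, 0.9]]
neg = [[0.1, -1.0], [0.0, -0.8]]
w = probe_direction(pos, neg)
print(score(w, [0.0, 0.7]) > 0)  # True: the probe detects the concept
```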

In the case of the “conspiracy theorist” concept, the team successfully identified a representation of this concept within one …

2 weeks, 1 day ago @ news.mit.edu
Parking-aware navigation system could prevent frustration and emissions

For a motorist, this would reduce travel time by about 35 minutes, compared to waiting for a spot to open in the closest parking lot.

Their method also considers the case where a user arrives at the ideal parking lot but can’t find a space.

It takes into account the distance to other parking lots and the probability of successfully parking at each.
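The tradeoff the method weighs can be sketched as an expected-time calculation. The formula, lot names, and numbers below are assumptions for illustration, not the researchers' model:

```python
# Choose the lot minimizing expected time, where a failed parking attempt
# costs an assumed retry penalty (circling or driving to a fallback lot).
RETRY_PENALTY = 20  # minutes, assumed

lots = {
    "close": {"drive": 5, "walk": 2, "p_success": 0.3},
    "far":   {"drive": 9, "walk": 6, "p_success": 0.95},
}

def expected_time(lot):
    p = lot["p_success"]
    return lot["drive"] + p * lot["walk"] + (1 - p) * RETRY_PENALTY

best = min(lots, key=lambda name: expected_time(lots[name]))
print(best)  # far: a likely free spot beats a closer but crowded lot
```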

For example, some parking lots have magnetic detectors or gates that track the number of cars entering and exiting.

They also want to explore additional avenues for gathering data on parking availability, such as using satellite images, and estimate potential emissions reductions.

2 weeks, 1 day ago @ news.mit.edu
Personalization features can make LLMs more agreeable

Many of the latest large language models (LLMs) are designed to remember details from past conversations or store user profiles, enabling these models to personalize responses.

The researchers hope these results inspire future research into the development of personalization methods that are more robust to LLM sycophancy.

Extended interactionsBased on their own sycophantic experiences with LLMs, the researchers started thinking about potential benefits and consequences of a model that is overly agreeable.

They compared the behavior of five LLMs with this user context versus the same LLMs that weren’t given any conversation data.

The boundary between personalization and sycophancy is not a f…

2 weeks, 2 days ago @ news.mit.edu
New J-PAL research and policy initiative to test and scale AI innovations to fight poverty

The Abdul Latif Jameel Poverty Action Lab (J-PAL) at MIT has awarded funding to eight new research studies to understand how artificial intelligence innovations can be used in the fight against poverty through its new Project AI Evidence.

The new initiative is prioritizing questions policymakers are already asking: Do AI-assisted teaching tools help all children learn?

In the coming years, PAIE will run a series of funding competitions to invite proposals for evaluations of AI tools that address questions like these, and many more.

Alex Diaz, head of AI for social good at Google.org, says, “We’re thrilled to collaborate with MIT and J-PAL, already leaders in this space, on Project AI Eviden…

3 weeks, 1 day ago @ news.mit.edu
Accelerating science with AI and simulations

Now, the newly tenured professor in materials science and engineering believes AI is poised to transform science in ways never before possible.

His latest company, Lila Sciences, is working to build a scientific superintelligence platform for the life sciences, chemical, and materials science industries.

All of that work is designed to ensure the future of scientific research is more seamless and productive than research today.

“AI for science is one of the most exciting and aspirational uses of AI,” Gómez-Bombarelli says.

“AI for simulations has gone from something that maybe could work to a consensus scientific view,” Gómez-Bombarelli says.

3 weeks, 1 day ago @ news.mit.edu
Using synthetic biology and AI to address global antimicrobial resistance threat

James J. Collins, the Termeer Professor of Medical Engineering and Science at MIT and faculty co-lead of the Abdul Latif Jameel Clinic for Machine Learning in Health, is embarking on a multidisciplinary research project that applies synthetic biology and generative artificial intelligence to the growing global threat of antimicrobial resistance (AMR).

The research project is sponsored by Jameel Research, part of the Abdul Latif Jameel International network.

The initial three-year, $3 million research project in MIT’s Department of Biological Engineering and Institute of Medical Engineering and Science focuses on developing and validating programmable antibacterials against key pathogens.

Th…

3 weeks, 2 days ago @ news.mit.edu
AI algorithm enables tracking of vital white matter pathways

Moreover, the study shows, BSBT retrospectively enabled tracking of bundle healing in a coma patient that reflected the patient’s seven-month road to recovery.

As part of his thesis work to better understand the neural mechanisms that underpin consciousness, Olchanyi wanted to develop an AI algorithm to overcome these obstacles.

To train the neural network to segment the bundles, Olchanyi “showed” it 30 live diffusion MRI scans from volunteers in the Human Connectome Project (HCP).

Meanwhile, TBI patients didn’t show significant volume loss in any bundles, but FA reductions were apparent in the majority of bundles.

The authors wrote that BSBT “has substantial prognostic potential by identif…

3 weeks, 3 days ago @ news.mit.edu
3 Questions: Using AI to help Olympic skaters land a quint

Olympic figure skating looks effortless.

Hosoi and Lu recently chatted with MIT News about applying AI to sports, whether AI systems could ever be used to judge Olympic figure skating, and when we might see a skater land a quint.

But with figure skating, Jerry has found one of the few areas where depth challenges don’t really matter.

Q: Could you ever see a world in which AI is used to evaluate the artistic side of figure skating?

Hosoi: I have always watched Olympic figure skating competitions, ever since I could turn on the TV.

3 weeks, 3 days ago @ news.mit.edu
Berkeley AI
last post 4 months ago
RL without TD learning

RL without TD learning
In this post, I’ll introduce a reinforcement learning (RL) algorithm based on an “alternative” paradigm: divide and conquer.

We can do Reinforcement Learning (RL) based on divide and conquer, instead of temporal difference (TD) learning.

There are two classes of algorithms in RL: on-policy RL and off-policy RL.

We compared TRL with $n$-step TD learning with different values of $n$, from $1$ (pure TD) to $\infty$ (pure MC).
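The n-step spectrum being compared, from pure TD at $n=1$ to pure Monte Carlo at $n \geq$ episode length, can be written out directly. This is the standard textbook construction, not TRL itself:

```python
# Standard n-step return: sum n discounted rewards, then bootstrap with a
# value estimate; n=1 is pure TD and n >= episode length is pure Monte Carlo.
def n_step_return(rewards, values, t, n, gamma=0.9):
    g, end = 0.0, min(t + n, len(rewards))
    for k in range(t, end):
        g += gamma ** (k - t) * rewards[k]
    if end < len(rewards):               # bootstrap only if the episode continues
        g += gamma ** (end - t) * values[end]
    return g

rewards = [0.0, 0.0, 1.0]
values = [0.5, 0.6, 0.8]
td = n_step_return(rewards, values, 0, 1)   # pure TD: r0 + gamma * V(s1)
mc = n_step_return(rewards, values, 0, 3)   # pure MC: full discounted return
print(td, mc)
```

Larger n leans more on observed rewards (lower bias, higher variance); smaller n leans on the learned value function (the reverse).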

I still think one of the most important problems in RL (and even in machine learning) is to find a scalable off-policy RL algorithm.

4 months ago @ bair.berkeley.edu
What exactly does word2vec learn?

What exactly does word2vec learn, and how?

In this framing, it’s clear that word2vec is a minimal neural language model.

As a result, the theory predicts exactly what features are learned in terms of the corpus statistics and the algorithmic hyperparameters.
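As context for "features in terms of corpus statistics": skip-gram with negative sampling is known to be closely related to factorizing a shifted pointwise mutual information (PMI) matrix (Levy and Goldberg), and PMI itself is easy to compute on a toy corpus:

```python
import math
from collections import Counter

# Toy PMI computation (illustration only, not the post's theory): count
# word/context co-occurrences in a small window, then compare PMI values.
corpus = "the cat sat on the mat the cat ate".split()
window = 1
words, pairs = Counter(corpus), Counter()
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            pairs[(w, corpus[j])] += 1

total_pairs, total_words = sum(pairs.values()), sum(words.values())

def pmi(a, b):
    p_ab = pairs[(a, b)] / total_pairs
    if p_ab == 0:
        return float("-inf")
    return math.log(p_ab / ((words[a] / total_words) * (words[b] / total_words)))

print(pmi("the", "cat") > pmi("the", "ate"))  # True: "cat" co-occurs with "the"
```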

We find that over the course of learning, word2vec builds these linear representations in a sequence of noisy learning steps, and their geometry is well-described by a spiked random matrix model.

6 months ago @ bair.berkeley.edu
Whole-Body Conditioned Egocentric Video Prediction

Predicting Ego-centric Video from human Actions (PEVA).

We trained a model to Predict Ego-centric Video from human Actions (PEVA) for Whole-Body-Conditioned Egocentric Video Prediction.

We train an autoregressive conditional diffusion transformer on Nymeria, a large-scale dataset pairing real-world egocentric video with body pose capture.

We include some samples here, grouped by action type: body movement actions (move forward, rotate left, rotate right), left hand actions (move left hand up/down/left/right), and right hand actions (move right hand up/down/left/right), plus a long rollout. Here you…

8 months, 1 week ago @ bair.berkeley.edu
AWS Machine Learning
last post 1 day, 12 hours ago
Drive organizational growth with Amazon Lex multi-developer CI/CD pipeline

Transforming development through scalable CI/CD practices
Traditional approaches to Amazon Lex development often rely on single-instance setups and manual workflows.

Solution architecture
The multi-developer CI/CD pipeline transforms Amazon Lex from a limited, single-user development tool into an enterprise-grade conversational AI platform.

Real-world success stories
This multi-developer CI/CD pipeline for Amazon Lex has supported enterprise teams in improving their development efficiency.

Getting started with the solution
The multi-developer CI/CD pipeline for Amazon Lex is available as an open source solution through our GitHub repository.

Step-by-step implementation guide
To create a multi-de…

1 day, 12 hours ago @ aws.amazon.com
Building custom model provider for Strands Agents with LLMs hosted on SageMaker AI endpoints

The challenge is particularly significant because support for the Messages API is not guaranteed for the models hosted on SageMaker AI real-time endpoints.

This post demonstrates how to build custom model parsers for Strands agents when working with LLMs hosted on SageMaker that don’t natively support the Bedrock Messages API format.

Strands Custom Parsers
Strands agents expect model responses in a specific format aligned with the Bedrock Messages API.

Implementing a Custom Model Parser
Custom model parsers are a feature of the Strands Agents SDK that provides strong compatibility and flexibility for customers building agents powered by LLMs hosted on SageMaker AI.
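A custom parser's job can be sketched as follows. This is a hypothetical adapter, not the actual Strands Agents SDK interface; the raw response shape is one common SageMaker text-generation container output, and the target structure is only modeled on the Messages API:

```python
# Hypothetical adapter (not the real Strands SDK interface): map a raw
# text-generation response into a Messages-API-style assistant message.
# {"generated_text": ...} is one common SageMaker container output format.
def parse_completion_to_message(raw_response):
    text = raw_response.get("generated_text", "")
    return {
        "role": "assistant",
        "content": [{"type": "text", "text": text}],
    }

raw = {"generated_text": "The order shipped yesterday."}
message = parse_completion_to_message(raw)
print(message["role"], "->", message["content"][0]["text"])
```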

Conclusion
Building custom m…

1 day, 12 hours ago @ aws.amazon.com
Embed Amazon Quick Suite chat agents in enterprise applications

Amazon Quick Suite embedded chat helps solve the first challenge by bringing conversational AI directly into your applications, so users can query structured data, search documents, and trigger actions without switching tools.

In this post, we show you how to solve the second challenge with a one-click deployment solution to embed the chat agents using the Quick Suite Embedding SDK in enterprise portals.

Provision users in Amazon Cognito and Quick Suite
Complete the following steps to provision users in Amazon Cognito and Quick Suite:
Create an Amazon Cognito user in an Amazon Cognito user pool: python scripts/create_cognito_user.py --profile
Create a federated user in Quick Suite: python scri…

2 days, 7 hours ago @ aws.amazon.com
Unlock powerful call center analytics with Amazon Nova foundation models

Amazon Nova FMs for scale
Amazon Nova FMs provide leading price-performance, making them suitable for generative AI at scale.

In the context of call center analytics, Amazon Nova models can comprehend complex conversations, extract key information, and generate valuable insights that were previously difficult or impossible to obtain at scale.

Solution overview
The Call Center Analytics demo application is built on a simple architecture that integrates Amazon Bedrock and Amazon Nova to enable end-to-end analytics for both single-call and multi-call scenarios.

Your goal is to determine if the customer in the below qualifies as Vulnerable Customer (VC) or Potentially Vuln…

2 days, 7 hours ago @ aws.amazon.com
How Ricoh built a scalable intelligent document processing solution on AWS

This post demonstrates how enterprises can overcome document processing scaling limits by combining generative AI, serverless architecture, and standardized frameworks.

Ricoh engineered a repeatable, reusable framework using the AWS GenAI Intelligent Document Processing (IDP) Accelerator.

The goal was to build a scalable solution that could deliver state-of-the-art AI for document extraction and agentic workflows.

The IDP Common Package is a Python library that provides shared functionality for the Accelerated Intelligent Document Processing solution on AWS.

By developing a reusable framework rather than single-use solutions, Ricoh transformed document processing into a scalable service.

2 days, 7 hours ago @ aws.amazon.com
Building a scalable virtual try-on solution using Amazon Nova on AWS: part 1

The following image shows a source image, a reference image, a mask image, and the resulting try-on image.

You can experiment with virtual try-on in Amazon Nova Canvas within the Amazon Bedrock playground.

At its core, the solution uses the new virtual try-on in Amazon Nova Canvas in Amazon Bedrock.

Amazon DynamoDB Streams triggers an AWS Step Functions workflow and Amazon Simple Storage Service (Amazon S3) events to manage result delivery.

This virtual try-on solution demonstrates how AWS serverless services can be combined with generative AI to address a significant challenge.

3 days, 12 hours ago @ aws.amazon.com
How Lendi revamped the refinance journey for its customers using agentic AI in 16 weeks using Amazon Bedrock

The solution is built upon Amazon Bedrock foundation models and Amazon Bedrock Guardrails.

By using the wide range of foundation models (FMs) available on Amazon Bedrock, Lendi was able to select task-appropriate models optimized for specific use cases.

Intelligence layer – The core AI capabilities are powered by multiple specialized agents built on Amazon Bedrock foundation models.

Take advantage of using Amazon Bedrock batch inference on tasks that don’t require immediate results to reduce cost.

Use the generative AI capabilities of Amazon Bedrock and the scalable cloud infrastructure of AWS to accelerate AI-driven innovation.

3 days, 12 hours ago @ aws.amazon.com
How Tines enhances security analysis with Amazon Quick Suite

Create MCP server in Tines
You can import an MCP server from the Tines story library into your Tines tenant.

Connect Quick Suite to Tines MCP server
Complete the following steps to connect Quick Suite to the Tines MCP server:
On the Quick Suite console, choose Integrations under Connections in the navigation pane.

For MCP server endpoint, enter the MCP server URL from your MCP server in your Tines story, then choose Next.

Get Started with Quick Suite to create a Quick Suite instance in your AWS account and visit the Tines home page to sign up for a Tines Community Edition account.

For more details, refer to the Amazon Quick Suite User Guide and Tines MCP server documentation.

3 days, 12 hours ago @ aws.amazon.com
Building specialized AI without sacrificing intelligence: Nova Forge data mixing in action

Model        Training data                  VOC F1-Score   MMLU accuracy
Nova 2 Lite  None (baseline)                0.38           0.75
Nova 2 Lite  Customer data only             0.55           0.47
Nova 2 Lite  75% customer + 25% Nova data   0.5            0.74
Qwen3-30B    Customer data only             0.55           0.0038

When Nova 2 Lite is fine-tuned using customer data only, we observe a significant drop in MMLU accuracy from 0.75 to 0.47, indicating the loss of general-purpose capabilities.

Notably, when Nova data mixing is applied during fine-tuning, Nova 2 Lite retains near-baseline general performance.

This validates that Nova data mixing is a practical and effective mechanism for mitigating catastrophic forgetting while preserving domain performance.
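The 75% customer / 25% Nova mix can be sketched as a sampling scheme. The mixing mechanics below are assumed for illustration; this is not Nova Forge's actual recipe:

```python
import random

# Sampling-based sketch of a 75/25 data mix: each training example is drawn
# from customer data with probability 0.75, otherwise from the general pool.
def mix_batch(customer_data, general_data, mix_ratio=0.75, n=8, seed=0):
    rng = random.Random(seed)
    batch = []
    for _ in range(n):
        pool = customer_data if rng.random() < mix_ratio else general_data
        batch.append(rng.choice(pool))
    return batch

customer = [f"cust_ex_{i}" for i in range(100)]
general = [f"gen_ex_{i}" for i in range(100)]
batch = mix_batch(customer, general)
print(sum(x.startswith("cust") for x in batch), "of", len(batch), "from customer data")
```

Keeping a share of general data in every batch is what counteracts the catastrophic forgetting seen in the customer-data-only row of the table.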

Conclusion
In this post, we demonstrate…

4 days, 9 hours ago @ aws.amazon.com
Build a serverless conversational AI agent using Claude with LangGraph and managed MLflow on Amazon SageMaker AI

This post explores how to build an intelligent conversational agent using Amazon Bedrock, LangGraph, and managed MLflow on Amazon SageMaker AI.

Solution overview
The conversational AI agent presented in this post demonstrates a practical implementation for handling customer order inquiries, a common but often challenging use case for existing customer service automation solutions.

Customers access a React frontend hosted on Amazon Simple Storage Service (Amazon S3) and delivered through Amazon CloudFront.

Managed MLflow on Amazon SageMaker AI addresses these challenges through specialized tracing capabilities that monitor model interactions, latency, token usage, and conversation paths.

Each…

4 days, 9 hours ago @ aws.amazon.com
Build safe generative AI applications like a Pro: Best Practices with Amazon Bedrock Guardrails

Best practices for using Amazon Bedrock Guardrails: For the maximum benefit from Amazon Bedrock Guardrails, we recommend that you adopt the following best practices.

Native integration with Bedrock inference APIs: When you use Amazon Bedrock Guardrails with inference APIs like InvokeModel, InvokeModelWithResponseStream, Converse, or ConverseStream, the system automatically handles a dual-checkpoint pattern for you.

The pricing for Amazon Bedrock Guardrails is based on text units consumed or images processed per configured safeguard.

Conclusion: Implementing Amazon Bedrock Guardrails effectively requires thoughtful configuration and a deep understanding of your application’s unique risk profil…

4 days, 9 hours ago @ aws.amazon.com
Learnings from COBOL modernization in the real world

AI is a genuine accelerator for COBOL modernization, but to get results, AI needs additional context that source code alone can’t provide. Here’s what we’ve learned working with 400+ enterprise customers: mainframe modernization has two very different halves.

What a successful mainframe modernization requires: Bounded, complete context. Mainframe applications are big.

That way AI focuses on what it’s great at (understanding business logic, generating specifications) instead of guessing at connections it can’t see.

Platform-aware context. Here’s something that surprises people: the same COBOL source code behaves differently depending on the compiler and runtime.

Fiserv completed a mainframe moderniz…

1 week, 1 day ago @ aws.amazon.com
Reinforcement fine-tuning for Amazon Nova: Teaching AI through feedback

That’s the core idea behind reinforcement fine-tuning (RFT), a model customization technique we’re excited to bring to Amazon Nova models.

SageMaker Training Jobs: For teams that need more control over the training process, Amazon SageMaker Training Jobs offer a flexible middle ground with managed compute and the ability to tweak multiple hyperparameters.

To get started with an Amazon Nova RFT job on Amazon SageMaker Training Jobs, see the SFT and RFT notebooks.

For more information, see the RFT-based evaluation to get started with an Amazon Nova RFT job on Amazon SageMaker HyperPod.

More importantly, design a reward function that: Crisply follows what your evaluation metrics track: Your reward functio…

1 week, 1 day ago @ aws.amazon.com
Large model inference container – latest capabilities and performance enhancements

AWS recently released significant updates to the Large Model Inference (LMI) container, delivering comprehensive performance improvements, expanded model support, and streamlined deployment capabilities for customers hosting LLMs on AWS.

LMCache support: transforming long-context performance. One of the most significant capabilities introduced across the newest releases of LMI is comprehensive LMCache support, which fundamentally transforms how organizations can handle long-context inference workloads.

LMCache performance benchmarks: Comprehensive testing across various model sizes and context lengths reveals performance gains that improve the user experience for long-context inference w…

1 week, 1 day ago @ aws.amazon.com
Efficiently serve dozens of fine-tuned models with vLLM on Amazon SageMaker AI and Amazon Bedrock

To benefit from these optimizations, host your LoRA customized models on Amazon SageMaker AI or Amazon Bedrock.

MoE models contain multiple specialized neural networks called experts.

This means every expert requires four LoRA kernel operations in total: shrink and expand for gate_up, and shrink and expand for down.

Second, MoE LoRA combines two sources of sparsity: expert routing (tokens assigned to different experts) and adapter selection (requests using different LoRA adapters).
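Each "shrink and expand" pair above corresponds to the two low-rank matmuls of a LoRA adapter. A minimal NumPy sketch with illustrative shapes (this is not the vLLM kernel code):

```python
import numpy as np

def lora_delta(x, A, B, scale=1.0):
    """LoRA adapter delta: 'shrink' to rank r via A, then 'expand' back via B."""
    h = x @ A               # shrink: (batch, d_in) @ (d_in, r) -> (batch, r)
    return (h @ B) * scale  # expand: (batch, r) @ (r, d_out) -> (batch, d_out)

rng = np.random.default_rng(0)
d_in, r, d_out = 64, 8, 128
x = rng.standard_normal((2, d_in))
A = rng.standard_normal((d_in, r)) * 0.01  # shrink weights
B = np.zeros((r, d_out))                   # expand weights (zero init, as in LoRA)

delta = lora_delta(x, A, B)
assert delta.shape == (2, d_out)
```

For one MoE expert, this pair runs once for the fused gate_up projection and once for down, which is where the four kernel operations per expert come from.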

To get started with custom model hosting on Amazon, see the Amazon SageMaker AI hosting and Amazon Bedrock documentation.

1 week, 2 days ago @ aws.amazon.com
NVIDIA
latest post: 1 day, 11 hours ago
Controlling Floating-Point Determinism in NVIDIA CCCL

The following code shows how to specify the determinism level in CUB (find the complete example online using compiler explorer).

We then use cuda::execution::require() to construct a cuda::std::execution::env object, setting the determinism level to not_guaranteed.

By default, cub::DeviceReduce is run-to-run deterministic, which corresponds to setting the determinism level to run_to_run in the single-phase API.

How results vary based on the determinism levels: The three determinism levels differ in the amount of variation they produce across multiple runs. Not-guaranteed determinism produces slightly different summation values on each invocation.
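For context on why results can vary at all: floating-point addition is not associative, so a reduction that changes its operation order can change its rounded result. A plain Python illustration of this IEEE-754 property (not CCCL code):

```python
# IEEE-754 addition is not associative: grouping changes rounding, which is
# why a parallel reduction's result can depend on its operation order.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # one reduction order
right = a + (b + c)  # another reduction order

assert left != right              # the sums differ...
assert abs(left - right) < 1e-12  # ...but only in the last bits
```

Run-to-run determinism pins the reduction order so repeated invocations agree, at some cost in scheduling freedom.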

Figure 2 compares the performance of the diff…

1 day, 11 hours ago @ developer.nvidia.com
March Into the Cloud With 15 New Games Coming to GeForce NOW

March is in full bloom, and that means a fresh wave of games heading to the cloud.

15 new titles are joining the GeForce NOW library this month.

March into the cloud and see what’s new — and keep an eye on GFN Thursdays all month for more updates.

Catch every glorious disaster in full fidelity and play it on GeForce NOW, available this week.

Let us know on X or in the comments below, then see what Blue Thunder Gaming thinks of GeForce NOW.

1 day, 14 hours ago @ blogs.nvidia.com
Tuning Flash Attention for Peak Performance in NVIDIA CUDA Tile

Tiled Flash Attention computation. Understanding online softmax: The key algorithmic insight of Flash Attention is the online softmax trick.
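The online softmax trick computes softmax in a single streaming pass, rescaling the running normalizer whenever a new maximum appears. A NumPy sketch of the idea (not the CUDA Tile kernel):

```python
import numpy as np

def online_softmax(x):
    """One-pass softmax: keep a running max m and a rescaled normalizer s."""
    m, s = -np.inf, 0.0
    for v in x:
        m_new = max(m, v)
        # Rescale the accumulated sum to the new max, then add the new term.
        s = s * np.exp(m - m_new) + np.exp(v - m_new)
        m = m_new
    return np.exp(np.asarray(x) - m) / s

x = np.array([1.0, 3.0, 2.0])
ref = np.exp(x - x.max())
ref /= ref.sum()
assert np.allclose(online_softmax(x), ref)
```

Because each step only needs the running max and sum, Flash Attention can process K/V tiles one at a time without materializing the full attention matrix.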

Causal attention mask for four tokens. With causal masking, roughly half the attention matrix is masked (the upper triangle).

Grouped-query attention: Standard multi-head attention has separate \(K,V\) matrices for each attention head, leading to high memory usage:

Multi-head attention (MHA): 32 query heads → 32 K/V heads (1:1 ratio)
Grouped-query attention (GQA): 32 query heads → 4 K/V heads (8:1 ratio)
Multi-query attention (MQA): 32 query heads → 1 K/V head (32:1 ratio)

In …
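These ratios translate directly into KV-cache savings, since every group of query heads shares one K/V head. A NumPy sketch with the 32-query/4-KV configuration (shapes are illustrative):

```python
import numpy as np

n_q_heads, n_kv_heads, seq_len, head_dim = 32, 4, 16, 8
group = n_q_heads // n_kv_heads  # 8 query heads share each K/V head

k = np.random.default_rng(0).standard_normal((n_kv_heads, seq_len, head_dim))

# Broadcast each K/V head to its group of query heads before attention;
# the stored KV cache itself stays 8x smaller than the MHA equivalent.
k_expanded = np.repeat(k, group, axis=0)

assert k_expanded.shape == (n_q_heads, seq_len, head_dim)
assert k.size * group == k_expanded.size  # MHA would store `group`x more K/V
```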

2 days, 11 hours ago @ developer.nvidia.com
cuTile.jl Brings NVIDIA CUDA Tile-Based Programming to Julia

NVIDIA CUDA Tile is one of the most significant additions to NVIDIA CUDA programming and unlocks automatic access to tensor cores and other specialized hardware.

Earlier this year, NVIDIA released cuTile for Python, giving Python developers a natural way to write high-performance GPU kernels.

With CUDA Tile, developers describe operations on tiles of data, and the compiler handles the mapping to hardware.

Users who are already writing GPU code in Julia with CUDA.jl will find the transition to tile-based programming straightforward.

Getting started: Just like cuTile Python, cuTile.jl requires an NVIDIA Blackwell GPU and an NVIDIA driver for CUDA 13 or higher.

3 days, 8 hours ago @ developer.nvidia.com
NVIDIA Advances Autonomous Networks With Agentic AI Blueprints and Telco Reasoning Models

New open source large telco model and NVIDIA Blueprints enable telecom operators to use their own data to train AI agents and build autonomous networks.

Reasoning models and AI agents fine-tuned on telecom data are key to enabling this shift.

For networks to become autonomous, there’s a need for an end-to-end agentic system that includes key components like telco network models and AI agents that talk to each other and use network simulation tools to validate actions.

And as part of GSMA’s new Open Telco AI initiative — launching tomorrow — NVIDIA is releasing the new open source LTM, implementation guide and agentic AI blueprints as open resources through GSMA, an organization for the mobi…

5 days, 21 hours ago @ blogs.nvidia.com
NVIDIA and Partners Show That Software-Defined AI-RAN Is the Next Wireless Generation

Ahead of Mobile World Congress (MWC), running March 2-5 in Barcelona, NVIDIA and Nokia announced new AI-RAN collaborations with top telecom operators across Europe, Asia and North America, powered by NVIDIA AI-RAN platforms.

AI-RAN Goes From Lab to LiveTop telecom operators and partners are using NVIDIA platforms to bring AI-RAN to commercial deployment.

IOH has implemented software-defined 5G with Nokia’s vRAN software on NVIDIA AI-RAN platforms, moving from proof of concept to pre-commercial field validation.

Supermicro is extending support across the full NVIDIA AI-RAN portfolio, including NVIDIA ARC-Pro and NVIDIA RTX 6000-based configurations, as well as ARC-Compact systems with Nokia …

5 days, 21 hours ago @ blogs.nvidia.com
Now Live: The World’s Most Powerful AI Factory for Pharmaceutical Discovery and Development

Saving and improving lives — that most human endeavor — just got a super-computational boost.

Lilly this week launched the most powerful AI factory wholly owned and operated by a pharmaceutical company to help its teams make meaningful medical advancements faster, more accurately and at unprecedented scale.

Dubbed LillyPod, it’s the world’s first NVIDIA DGX SuperPOD with DGX B300 systems.

Powered by a DGX SuperPOD with 1,016 NVIDIA Blackwell Ultra GPUs, Lilly’s AI factory delivers more than 9,000 petaflops of AI performance.

It was assembled in just four months.

1 week, 1 day ago @ blogs.nvidia.com
Horror Awakens in the Cloud: GeForce NOW Unleashes Capcom’s ‘Resident Evil: Requiem’

GeForce NOW’s anniversary celebration reaches a chilling crescendo as Capcom’s Resident Evil: Requiem creeps into the cloud — and the horrors look better than ever on a GeForce NOW Ultimate membership.

The Nightmare Returns in the Cloud: A new era of survival horror dawns with Resident Evil: Requiem, the iconic Capcom series’ most immersive entry yet.

Classic Resident Evil survival horror returns with tense combat and clever puzzles — now enhanced with seamless switching between first‑ and third‑person views for a more personal nightmare.

With GeForce RTX 5080-class power in the cloud, experience Requiem with lifelike lighting, full path tracing, ray‑traced reflections and cinematic realism a…

1 week, 1 day ago @ blogs.nvidia.com
From Radiology to Drug Discovery, Survey Reveals AI Is Delivering Clear Return on Investment in Healthcare

AI is accelerating every aspect of healthcare — from radiology and drug discovery to medical device manufacturing and new treatment methods enabled by digital twins of the human body.

NVIDIA’s second annual “State of AI in Healthcare and Life Sciences” survey report reveals how the industry is moving from AI experimentation to execution, reaping return on investment (ROI) on core applications like medical imaging and drug discovery.

Highlights from this year’s report include: 70% of respondents said their organizations are actively using AI, up from 63% in 2024.

82% said open source soft…

1 week, 3 days ago @ blogs.nvidia.com
NVIDIA Brings AI-Powered Cybersecurity to World’s Critical Infrastructure

While zero trust has been widely adopted to secure enterprise IT environments, applying its principles to OT environments has traditionally been difficult.

A New Class of OT Cybersecurity: Across these environments, a consistent OT cybersecurity architecture is taking shape.

NVIDIA-powered OT cybersecurity solutions are delivered through a global ecosystem of trusted partners.

Read this OT cybersecurity use case and solution overview for more.

Join NVIDIA at S4x26, running Feb. 24–26 in Miami, to see how accelerated computing and AI are transforming cybersecurity for OT and critical infrastructure.

1 week, 4 days ago @ blogs.nvidia.com
Survey Reveals AI Advances in Telecom: Networks and Automation in Driver’s Seat as Return on Investment Climbs

AI is accelerating the telecommunications industry’s transformation, becoming the backbone of autonomous networks and AI-native wireless infrastructure.

NVIDIA’s fourth annual “State of AI in Telecommunications” survey report unpacks these trends, underscoring strong AI adoption, impact and investment in the industry.

Tangible Revenue Impact and Return on Investment: The telecommunications industry is seeing a definitive revenue impact from the use of AI.

The top AI use cases cited for return on investment (ROI) were AI for autonomous networks (50%), followed by improved customer serv…

2 weeks, 1 day ago @ blogs.nvidia.com
All About the Games: Play Over 4,500 Titles With GeForce NOW

Celebrate #6YearsofGFN with gamers worldwide by participating in a community giveaway and playing 12 new games in the cloud.

The GeForce NOW anniversary celebration keeps on rolling, and this week is all about the games that make it possible.

With more than 4,500 titles supported in the cloud — plus 12 new games this week — there’s always something new to stream, share and discover.

A Library Built for Every PC Gamer: GeForce NOW features an ever-growing library of over 4,500 titles that are ready to stream in seconds.

Smart library syncing connects supported games directly into the cloud catalog, enabling instant access without downloads or hardware worries.

2 weeks, 1 day ago @ blogs.nvidia.com
Unlock Massive Token Throughput with GPU Fractioning in NVIDIA Run:ai

Nebius’ AI Cloud provided the infrastructure foundation: the dedicated NVIDIA GPUs, NVIDIA Quantum InfiniBand networking, and hyperscaler-grade performance and elasticity needed to deliver these gains at production scale.

Scale inference workloads with NVIDIA Run:ai and Nebius AI Cloud: The NVIDIA Run:ai platform addresses these pain points through its high-throughput AI workload scheduler, built for large-scale GPU clusters and dynamic fractional GPU allocation, without sacrificing performance.

To learn more, see NVIDIA NIM LLM with NVIDIA Run:ai and Vanilla Kubernetes for Enterprise RA.

All three configurations—without NVIDIA Run:ai, NVIDIA Run:ai with full GPU, and NVIDIA Run:ai with fractiona…

2 weeks, 2 days ago @ developer.nvidia.com
Topping the GPU MODE Kernel Leaderboard with NVIDIA cuda.compute

Frameworks like PyTorch address this by implementing kernels in CUDA C++—either handwritten or by leveraging libraries like the NVIDIA CUDA Core Compute Libraries.

The NVIDIA cuda.compute library overcomes these limitations by offering a high-level, Pythonic API for device-wide CUB primitives.

Using cuda.compute helped an NVIDIA CCCL team top the GPU MODE leaderboard, a kernel competition hosted by an online community with more than 20,000 members and a focus on learning and improving GPU programming.

It achieved the most first-place finishes overall on the tested GPU architectures: NVIDIA B200, NVIDIA H100, NVIDIA A100, and NVIDIA L4.

CUDA Python: GPU performance meets productivity. CUB offe…

2 weeks, 2 days ago @ developer.nvidia.com
How NVIDIA Extreme Hardware-Software Co-Design Delivered a Large Inference Boost for Sarvam AI’s Sovereign Models

To meet strict latency targets and improve inference efficiency for its flagship Sovereign 30B model, Sarvam AI collaborated with NVIDIA to co-design hardware and software optimizations.

This collaboration delivered a 4x speedup in inference performance on NVIDIA Blackwell over baseline NVIDIA H100 GPUs, and established a path for deployment on the next-generation NVIDIA Blackwell architecture.

This post walks through the joint engineering effort and shares benchmarks for the speed-ups achieved on the NVIDIA H100, the largest-deployed NVIDIA GPU in India.

The NVIDIA Blackwell GPU offers 2.8x higher token throughput vs. the NVIDIA H100 SXM GPU at the 100 TPS/user operating point.

Stay up-to-date on…

2 weeks, 2 days ago @ developer.nvidia.com
Facebook
latest post: 1 week, 3 days ago
RCCLX: Innovating GPU communications on AMD platforms

RCCLX is fully integrated with Torchcomms and aims to empower researchers and developers to accelerate innovation, regardless of their chosen backend.

We want to iterate on collectives, transports, and novel features quickly on AMD platforms.

With RCCLX, we have integrated CTran to AMD platforms, enabling the AllToAllvDynamic – a GPU-resident collective.

These features provide significant performance improvements on AMD platforms and we are excited to share this with the community.

RCCLX Quick Start Guide: Install Torchcomms with the RCCLX backend by following the installation instructions in the Torchcomms repo.

1 week, 3 days ago @ engineering.fb.com
The Death of Traditional Testing: Agentic Development Broke a 50-Year-Old Field, JiTTesting Can Revive It

A Catching JiTTest focuses specifically on finding regressions introduced by a code change.

Agentic development dramatically increases the pace of code change, straining test development burden and scaling the cost of false positives and test maintenance to breaking point.

And since the JiTTest itself is LLM-generated, it can often infer the plausible intention of a code change and simulate possible faults that may result from it.

With them, engineers no longer have to spend time writing, reviewing, and testing complex test code.

Read the paper: Just-in-Time Catching Test Generation at Meta

3 weeks, 2 days ago @ engineering.fb.com
Adapting the Facebook Reels RecSys AI Model Based on User Feedback

Our new User True Interest Survey (UTIS) model now helps surface more niche, high-quality content and boosts engagement, retention, and satisfaction.

Our paper, “Improve the Personalization of Large-Scale Ranking Systems by Integrating User Survey Feedback,” shares full details on this work.

The main candidate ranking model used by the platform is a large multi-task, multi-label model.

We trained a lightweight UTIS alignment model layer on the collected user survey responses using existing predictions of the main model as input features.
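The alignment layer described above can be pictured as a lightweight probe fit on survey labels, with the main model's existing predictions as input features. All shapes, feature names, and the synthetic labels below are illustrative, not the production UTIS setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical features: the main ranking model's existing head predictions,
# e.g. p(click), p(like), p(share), p(watch-through), for 100 surveyed items.
main_preds = rng.random((100, 4))

# Synthetic survey responses standing in for UTIS labels (1 = true interest).
survey_label = (main_preds @ np.array([0.1, 0.5, 0.2, 0.2]) > 0.4).astype(float)

# "Lightweight alignment layer": here just a linear probe fit by least squares
# on the survey responses, taking the main model's predictions as features.
w, *_ = np.linalg.lstsq(main_preds, survey_label, rcond=None)
utis_score = main_preds @ w  # survey-aligned signal usable at ranking time
assert utis_score.shape == (100,)
```

Reusing the main model's predictions as features keeps the alignment layer cheap to train and serve relative to the large multi-task ranker.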

The UTIS model consistently outperformed the baseline, driving higher user engagement and retention.

1 month, 3 weeks ago @ engineering.fb.com
DrP: Meta’s Root Cause Analysis Platform at Scale

DrP’s key components include: Expressive SDK: The DrP SDK allows engineers to codify investigation workflows into analyzers.

Post-processing system: After an investigation, the post-processing system can take automated actions based on the analysis results.

Bootstrap code: The DrP SDK provides bootstrap code to create a template analyzer with pre-populated boilerplate code.

Data access and analysis: The SDK includes libraries for data access and analysis, such as dimension analysis and time series correlation.

This provides immediate analysis results to on-call engineers.

2 months, 2 weeks ago @ engineering.fb.com
How AI Is Transforming the Adoption of Secure-by-Default Mobile Frameworks

Generative AI and automation accelerate the adoption of secure frameworks at scale, enabling consistent security enforcement and efficient migration across Meta’s vast codebase.

How We Design Secure-by-Default Frameworks at Meta: Designing secure-by-default frameworks for use by a large number of developers shipping vastly different features across multiple apps is an interesting challenge.

There shouldn’t be one security framework that covers all security issues, and not every security issue is general enough to deserve its own framework.

Now that we’ve looked at the design philosophy behind our frameworks, let’s look at one of our most widely used Android security frameworks, SecureLinkLaun…

2 months, 3 weeks ago @ engineering.fb.com
Zoomer: Powering AI Performance at Meta’s Scale Through Intelligent Debugging and Optimization

Zoomer has delivered training time reductions and significant QPS improvements, making it the de facto tool for AI performance optimization across Meta’s entire AI infrastructure.

Zoomer is Meta’s automated, one-stop-shop platform for performance profiling, debugging, analysis, and optimization of AI training and inference workloads.

AI Performance Optimization Using Zoomer: Zoomer is an automated debugging and optimization platform that works across all of our AI model types (ads recommendations, GenAI, computer vision, etc.).

Memory Analysis: Comprehensive analysis of GPU memory usage patterns, allocation tracking, and leak detection.

Realtime Memory Profiling: GPU memory allocation track…

3 months, 2 weeks ago @ engineering.fb.com
Open Source Is Good for the Environment

But have you heard about open hardware?

And did you know open source can have a positive impact on the environment?

On this episode of the Meta Tech Podcast, Pascal Hartig sits down with Dharmesh and Lisa to talk about all things open hardware, and Meta’s biggest announcements from the 2025 Open Compute Project (OCP) Summit – including a new open methodology for leveraging AI to understand Scope 3 emissions.

You’ll also hear how AI and open hardware are helping Meta push to achieve net zero emissions in 2030, including how AI is being used to develop new concrete mixes for data center construction.

And if you’re interested in learning more about career opportunities at Meta visit the Meta C…

3 months, 3 weeks ago @ engineering.fb.com
Meta’s Generative Ads Model (GEM): The Central Brain Accelerating Ads Recommendation AI Innovation

We’re sharing details about Meta’s Generative Ads Recommendation Model (GEM), a new foundation model that delivers increased ad performance and advertiser ROI by enhancing other ads recommendation models’ ability to serve relevant ads.

GEM propagates its learnings, leveraging a suite of post-training techniques across the entire ads model fleet, enabling a paradigm shift in Meta’s Ads Recommendation system.

GEM leverages enhanced training scalability that efficiently utilizes thousands of GPUs for building and iterating an LLM-scale ads foundation model.

The Generative Ads Recommendation Model (GEM) is Meta’s most advanced ads foundation model, built on an LLM-inspired paradigm and trained …

3 months, 3 weeks ago @ engineering.fb.com
Scaling LLM Inference: Innovations in Tensor Parallelism, Context Parallelism, and Expert Parallelism

At Meta, we are constantly pushing the boundaries of LLM inference systems to power applications such as the Meta AI App.

These metrics highlight the distinct computational demands of LLM inference: Prefill is compute-intensive, while decoding is memory bandwidth-intensive.

Communication: Communication latency increases when parallelizing across multiple hosts.

In EP-based inference, we utilize a two-shot, all-to-all communication pattern to exchange tokens between data parallelism and expert parallelism ranks based on routing.
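The "two-shot" dispatch/combine pattern can be illustrated single-process: the first exchange groups tokens by their routed expert, and the second returns results to their original positions. In a real EP deployment both steps are all-to-all collectives across GPU ranks; the toy experts below are stand-ins:

```python
def dispatch_combine(tokens, expert_ids, experts):
    """Route each token to its assigned expert, then restore original order."""
    # Shot 1 ("dispatch"): group token indices by destination expert.
    # In EP this is an all-to-all that moves tokens to expert-owning ranks.
    buckets = [[] for _ in experts]
    for i, e in enumerate(expert_ids):
        buckets[e].append(i)

    out = [None] * len(tokens)
    for e, idxs in enumerate(buckets):
        for i in idxs:
            # Shot 2 ("combine"): expert outputs are scattered back to the
            # original token positions on the sending ranks.
            out[i] = experts[e](tokens[i])
    return out

experts = [lambda t: t + 10, lambda t: t * 2]  # two toy "experts"
result = dispatch_combine([1, 2, 3], [0, 1, 0], experts)
assert result == [11, 4, 13]
```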

We are committed to continuous innovation to ensure efficient and scalable LLM inference for millions of users worldwide.

4 months, 2 weeks ago @ engineering.fb.com
How Meta Is Leveraging AI To Improve the Quality of Scope 3 Emission Estimates for IT Hardware

We leveraged AI to help us improve this database and understand our Scope 3 emissions associated with IT hardware by: identifying similar components and applying existing PCFs to similar components that lack these carbon estimates.

Understanding the carbon footprint of IT racks and applying generative AI (GenAI) as a categorization algorithm to create a new and standard taxonomy.

If these similar components are not identified, their carbon footprint estimates will remain at a lower data quality.

These similar components can be mapped to a representative proxy PCF, allowing us to use high-quality PCF data in similar components.

For example, we can scale the carbon footprint calculation for a …

4 months, 3 weeks ago @ engineering.fb.com
OCP Summit 2025: The Open Future of Networking Hardware for AI

At Open Compute Project Summit (OCP) 2025, we’re sharing details about the direction of next-generation network fabrics for our AI training clusters.

At Meta, we believe that open hardware is a catalyst for innovation — especially as data center infrastructure increasingly supports new and emerging AI technologies.

Open hardware plays a crucial role in enabling disaggregation, allowing us to break down traditional data center technologies into their core components.

Today, through OCP, we continue to advance open network technologies for the next generation of AI applications.

Ethernet for Scale-Up Networking in OCP: Meta’s Industry Leadership. At Meta, we recognize that the future of AI and …

4 months, 3 weeks ago @ engineering.fb.com
LLMs Are the Key to Mutation Testing and Better Compliance

By leveraging LLMs we’ve been able to overcome the barriers that have prevented mutation testing from being efficiently deployed at scale.

Our presentations shared insights into how we’ve used LLMs to solve the major barriers that have prevented mutation testing at scale and highlighted new areas in automated software testing where LLMs can have a significant impact.

Mutation Testing Isn’t Scalable: Traditional mutation testing generates a very large number of mutants, making it computationally expensive and difficult to scale to large industrial codebases.

Mutation Testing Requires a Lot of Computational Resources: Mutation testing is costly in terms of computational resources and developer ef…
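As background on why mutant counts explode: a mutant is a small semantic edit to the program that the test suite should detect ("kill"), and even one function admits many of them. A minimal hand-written illustration (real tooling, including Meta's LLM-based approach, generates mutants automatically):

```python
def max_of(a, b):
    return a if a > b else b

# Mutant: the relational operator is flipped (">" becomes "<").
def max_of_mutant(a, b):
    return a if a < b else b

def suite_passes(fn):
    """A tiny test suite; a mutant is 'killed' when at least one test fails."""
    return fn(2, 1) == 2 and fn(1, 2) == 2

assert suite_passes(max_of)             # the original passes the suite
assert not suite_passes(max_of_mutant)  # the mutant is killed
```

A surviving mutant would signal a gap in the suite; checking every operator swap, constant tweak, and branch inversion this way is what makes exhaustive mutation testing so expensive.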

5 months, 1 week ago @ engineering.fb.com
AssetGen: Generating 3D Worlds With AI

Imagine being able to use AI to create 3D virtual worlds using prompts as easily as you can generate images.

In his keynote, Mark Zuckerberg shared his vision of a future where anyone can create virtual worlds using AI-powered tools like the ones available in the upcoming Meta Horizon Studio.

But AI is already making it easier than ever to create 3D assets.

On this episode of the Meta Tech Podcast, Pascal Hartig is joined by Mahima and Rakesh from Meta’s XR Tech team to discuss AssetGen, a new foundation model for 3D assets.

They talk about how they built and trained AssetGen, the important role LLMs have to play in the future of VR, and how they’re tackling the ambitious goal of generating…

5 months, 1 week ago @ engineering.fb.com
Meta’s Infrastructure Evolution and the Advent of AI

As our user base grew globally, we scaled beyond single data center buildings and into data center regions consisting of multiple buildings.

Enter AI Workloads (2020): While we were navigating the challenges of scaling, we were also seeing glimpses of how AI workloads would impact our infrastructure.

To build out our AI infrastructure, we’ve leveraged solutions from partners like AMD and NVIDIA as well as our own custom silicon.

Constructing Prometheus has been a monumental engineering feat, with infrastructure spanning five or more data center buildings in a single data center region.

We are still early in the evolution and adoption of AI workloads.

5 months, 1 week ago @ engineering.fb.com
Networking at the Heart of AI — @Scale: Networking 2025 Recap

AI is everywhere and, as network engineers, we are right in the thick of it: building the network infrastructure for AI.

Setting Context: Rapid Changes and Evolution. Given that AI continues to drive so much innovation in networking and general infrastructure, we once again focused @Scale: Networking on AI networking, sharing new insights and progress in the field.

The Models and the Primary AI Workloads Are Rapidly Evolving.

More from @Scale: Networking 2025: Please visit the @Scale YouTube channel to check out all the talks from this year’s Networking @Scale.

We look forward to what promises to be another rapid year of network and AI innovation that we’ll cover at the next @Scale: Networking in…

5 months, 1 week ago @ engineering.fb.com
Uber Engineering
latest post: none
neptune.ai
latest post 3 months ago
We are joining OpenAI

Piotr Niedźwiedź, CEO/CTO and founder of neptune.ai: I’m excited to share that we’ve entered into a definitive agreement to be acquired by OpenAI, subject to closing conditions.

We are thrilled to join the OpenAI team and help their AI researchers build better models faster.

Neptune is a metrics dashboard company.” We’ve worked closely with OpenAI to create the metrics dashboard that helps teams building foundation models.

Our future with OpenAI: Neptune will join OpenAI and continue to support AI researchers with tools to monitor, debug, and evaluate frontier models.

We are looking forward to working with top AI researchers and supporting OpenAI’s mission of ensuring that AGI benefits all of hu…

3 months ago @ neptune.ai
Synthetic Data for LLM Training

For instance, financial data is highly sensitive and protected by very strict regulations, and synthetic data mimics the real data distribution without revealing customer information.

Read more about how leading foundation model teams curate their training data and other topics in the State of Foundation Model Training Report 2025.

Choosing the right synthetic data generation technique depends on the type of data and its complexity.

Synthetic tabular data generation is a promising direction to overcome these challenges by learning the distribution of the tabular data.

Post-processing: As the distribution of tabular data is highly complex, synthetic tabular data generation is very ch…

3 months, 3 weeks ago @ neptune.ai
What are LLM Embeddings: All you Need to Know

TL;DR LLM embeddings are the numerical, vector representations of text that Large Language Models (LLMs) use to process information.

Unlike their predecessor word embeddings, LLM embeddings are context-aware and dynamically change to capture semantic and syntactic relationships based on the surrounding text.

What are the applications of LLM embeddings?

Word Embeddings
Sparse word embeddings: One-Hot Vectors (1970s), TF-IDF (1980s), Co-Occurrence Matrix
Static word embeddings: Word2Vec (2013), GloVe (2014)
Contextualized word embeddings: ELMo (2018), GPT-1 (2018), BERT (2018), LLAMA (2023), DeepSeek-V1 (2023), GPT-4 (2023)

Static word embeddings: Static word embeddings, such as word2vec in 2013, marked a significant development.…

4 months ago @ neptune.ai
Detecting and Fixing ‘Dead Neurons’ in Foundation Models

TL;DR Dead neurons silently waste compute and reduce effective model capacity in foundation models.

Dead neurons’ impact: Recent studies into dead neurons in the context of foundation models show interesting, albeit worrying, results.

These large reported fractions of dead neurons in foundation models are a concern from a computational perspective.

Before we move on to discuss how to detect and fix dead neurons, let’s touch upon an important distinction between dead neurons and vanishing gradients.

Further reading: How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models. Visualizing activation distributions: Is your foundation model suffering from dead neurons?
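The detection idea can be sketched in plain Python, with a toy layer standing in for a framework forward hook (all numbers here are invented for illustration): a ReLU unit counts as dead when it never activates across a batch of inputs.

```python
def relu(x):
    return x if x > 0.0 else 0.0

def dead_neuron_fraction(weights, biases, inputs):
    """Fraction of ReLU units that output zero for every input in the batch.

    weights: one weight vector per neuron; biases: one bias per neuron;
    inputs: a batch of input vectors. A toy stand-in for logging
    activation statistics over a validation batch.
    """
    ever_active = [False] * len(weights)
    for x in inputs:
        for j, (w, b) in enumerate(zip(weights, biases)):
            pre_activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if relu(pre_activation) > 0.0:
                ever_active[j] = True
    return sum(not a for a in ever_active) / len(weights)

weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
biases = [0.0, -100.0, 0.0]   # the second unit's large negative bias keeps it silent
inputs = [[0.3, 0.7], [1.0, 0.2], [0.1, 0.9]]
print(dead_neuron_fraction(weights, biases, inputs))  # one of three units never fires
```

In a real model the same statistic would be accumulated from layer activations over a held-out batch; a unit that is inactive on every sample is a candidate for re-initialization.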

4 months, 1 week ago @ neptune.ai
Part 2: Instruction Fine-Tuning: Evaluation and Advanced Techniques for Efficient Training

In the first part of this series, we covered the fundamentals of instruction fine-tuning (IFT).

def calculate_irs(instruction, output, reference_model):
    evaluation_prompt = f"""
    Instruction: {instruction}
    Model Output: {output}
    Rate how well the output follows the instruction on these criteria:
    1.

HINT addresses a computational inefficiency in standard instruction fine-tuning: repeatedly reprocessing the same task instruction with every input example.

Read more about foundation model training infrastructure and other topics in Neptune’s 2025 State of Foundation Model Training Report.

First, during initial instruction fine-tuning across multiple diverse tasks, the model learns genera…

4 months, 2 weeks ago @ neptune.ai
How to Optimize LLM Inference

Large Language Model (LLM) inference at scale is challenging as it involves transferring massive amounts of model parameters and data and performing computations on large tensors.

In the following, we’ll use the Llama model family architecture as a specific example to understand the LLM workload at inference.

For a far more detailed analysis of the LLM workload at inference, see the chapter All About Transformer Inference in the book How to Scale Your Model, published by Google DeepMind.

See also: How to Run LLMs Locally. A quick primer on hardware for LLM inference: A typical LLM inference cluster consists of several nodes, each with a multi-core CPU and multiple accelerator devices, …
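One concrete piece of that workload arithmetic is the KV cache, which grows linearly with sequence length and batch size. A back-of-the-envelope sketch (the Llama-2-7B-style configuration below is an assumption for illustration, not taken from the article):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch=1, bytes_per_param=2):
    """Memory held by the KV cache: keys plus values, for every layer and token."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_param

# Llama-2-7B-like configuration: 32 layers, 32 KV heads, head_dim 128, fp16 (2 bytes).
size = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)
print(size / 2**30)  # GiB for a single 4096-token sequence
```

At these numbers the cache costs 512 KiB per token, so long contexts and large batches quickly dominate accelerator memory, which is why techniques like grouped-query attention shrink `n_kv_heads`.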

4 months, 3 weeks ago @ neptune.ai
A Researcher’s Guide to LLM Grounding

In this article, we’ll explore the fundamental concepts of LLM grounding as well as strategies for optimally grounding models.

What is LLM grounding?

LLM grounding is analogous.

If relevant knowledge cannot be inferred from the data, then LLM grounding cannot yield more relevant responses.

When grounding LLMs using RAG, consider retaining only a few of the top hits (i.e., top-k) for your retrieval queries.
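The top-k idea can be sketched in a few lines; this toy retriever (pure Python, with invented example vectors) ranks documents by cosine similarity to the query embedding and keeps only the k best hits for the prompt context:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=2):
    """Indices of the k documents most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

docs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(top_k([1.0, 0.0], docs, k=2))  # the two documents most aligned with the query
```

Keeping only the top few hits trades recall for a shorter, more focused context, which is exactly the tuning knob the paragraph above recommends.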

5 months, 1 week ago @ neptune.ai
Instruction Fine-Tuning: Fundamentals, Architecture Modifications, and Loss Functions

TL;DR Instruction fine-tuning (IFT) refines pre-trained large language models (LLMs) to follow specific task instructions by training on prompt-response pairs.

Instruction fine-tuning in a nutshell: IFT tailors LLMs to follow user instructions by bridging their inherent next-word prediction with human-defined objectives.

Related: LLM Fine-Tuning and Model Selection Using Neptune and Transformers. Parameter-efficient instruction fine-tuning: While major foundation models like GPT-4 or Llama-2 undergo full-parameter instruction fine-tuning during development, parameter-efficient fine-tuning (PEFT) methods have become widely adopted for instruction fine-tuning since the LoRA paper was publi…

5 months, 2 weeks ago @ neptune.ai
Understanding Prompt Injection: Risks, Methods, and Defense Measures

Prompt injection 101: When prompts go rogue. The term ‘Prompt Injection’ comes from SQL injection attacks.

There is also a claim of independent discovery: Riley Goodside publicly demonstrated a prompt injection in a tweet back in September 2022.

Indirect prompt injection attacks are classified into active, passive, user-driven, and virtual prompt attacks.

Virtual prompt injection attacks: This injection type is closely related to the passive injection attacks previously described.

Prompt injection: current challenges & lessons learned. The arms race between prompt injection attacks and defenses is a challenge for researchers, developers, and users.

7 months ago @ neptune.ai
SabiYarn: Advancing Low-Resource Languages With Multitask NLP Pre-Training [Paper Reflections]

This simple idea avoids computing loss on input prompt tokens the model already knows.

Prompt tokens are (too) expensive in low-resource settings: During pre-training, LLMs are trained for causal language modeling through a next-token prediction task.

Given the example “Translate English to Yoruba: I love rice. => Mo fẹ́ràn ìrẹsì,” the model is trained to predict every token, from the prompt to the actual answer:

Step | Prompt | Next token
1 | Translate | English (static prompt)
2 | Translate English | to (static prompt)
3 | Translate English to | Yoruba: (static prompt)
4 | Translate English to Yoruba: | I
5 | Translate English to Yoruba: I | love
6 | Translate English to Yoruba: I love | rice.

This is straightforward to implement in PyTorch by masking out the prompt tokens in the label …
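The excerpt is cut off, but the standard recipe it alludes to is to set prompt positions in the label sequence to the loss's ignore index (-100 is PyTorch's cross-entropy convention). A framework-free sketch of that masking step, with invented token IDs:

```python
IGNORE_INDEX = -100  # the ignore_index convention used by PyTorch cross-entropy

def mask_prompt_labels(token_ids, prompt_len):
    """Copy token_ids into labels, masking prompt positions so the
    loss is computed only on answer tokens."""
    return [IGNORE_INDEX if i < prompt_len else tok
            for i, tok in enumerate(token_ids)]

# The prompt occupies the first 3 positions; loss flows only through the answer.
print(mask_prompt_labels([11, 12, 13, 8, 9], prompt_len=3))  # → [-100, -100, -100, 8, 9]
```

With the mask in place, every gradient step is spent on the translation itself rather than on re-predicting the static instruction, which is the paper's point about prompt tokens being wasted in low-resource settings.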

7 months, 1 week ago @ neptune.ai
How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models

What gradient issues occur during foundation model training?

During training, gradient descent updates model parameters by computing the gradients of the loss function via forward and backward passes.

The green line corresponds to a learning rate of 10, while the orange line has a learning rate of 0.1.

The gradient norm for the orange line with LR = 0.1 is very high in the first steps, while the gradient norm of the green line with LR = 10 diverges to NaN after a few steps.

Techniques for gradient stabilization: Monitoring gradient norms and training loss provides insights into the learning dynamics of foundation models.
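A minimal sketch of the two steps just mentioned, in pure Python: logging the global gradient norm, and clipping by it (in a real training loop this is typically `torch.nn.utils.clip_grad_norm_` or a framework equivalent; the gradient values below are invented):

```python
import math

def global_grad_norm(grads):
    """L2 norm across all parameter gradients, the scalar typically logged per step."""
    return math.sqrt(sum(g * g for vec in grads for g in vec))

def clip_by_global_norm(grads, max_norm):
    """Scale every gradient down uniformly if the global norm exceeds max_norm."""
    norm = global_grad_norm(grads)
    if norm <= max_norm or norm == 0.0:
        return grads
    scale = max_norm / norm
    return [[g * scale for g in vec] for vec in grads]

grads = [[3.0], [4.0]]          # two toy parameter tensors
print(global_grad_norm(grads))  # norm 5.0: worth logging before any clipping
print(clip_by_global_norm(grads, max_norm=1.0))
```

Tracking the pre-clip norm over time is what reveals the divergence patterns described above (spikes, NaNs); clipping then bounds how much any single bad batch can move the weights.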

8 months ago @ neptune.ai
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection]

Unstructured pruning removes individual weights, while structured pruning removes entire model components.

In the context of MoEs, the expert structures that emerge from training correspond to such patterns, so pruning experts is a natural fit for structured pruning.

Thus, structured pruning does not significantly decrease kurtosis, leaving plenty of margin for unstructured pruning.

Since structured pruning primarily reduces architectural redundancy rather than reshaping the underlying weight distribution, our two-phase approach—leveraging unstructured pruning after structured pruning—outperforms unstructured-only pruning.

Since STUN does not make any assumption about base MoE models, it is generaliza…

9 months ago @ neptune.ai
Evaluating RAG Pipelines

Related: Building LLM Applications With Vector Databases. Dimensions of RAG evaluation: Evaluating a RAG pipeline means assessing its behavior across three dimensions: 1.

The evaluation of the RAG pipeline is a multi-step process, starting with creating an evaluation dataset and then evaluating the individual components (retriever, generator, etc.).

Curating an evaluation dataset: The first step in the RAG evaluation process is the creation of a ground truth dataset.

MAP considers both the presence and rank of relevant chunks but fails to consider the relative position of relevant chunks.

However, not all retrieved chunks are equally relevant and sometimes, the most relevant chunks might not b…
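To make the MAP discussion concrete, here is average precision over a single ranked retrieval (MAP is its mean over queries); the 0/1 relevance judgments are an invented example, not from the article:

```python
def average_precision(relevance):
    """AP over a ranked list of 0/1 relevance judgments:
    the mean of precision@i taken at each relevant rank i."""
    hits, precisions = 0, []
    for i, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / len(precisions) if precisions else 0.0

# Relevant chunks at ranks 1 and 3: precision 1/1 and 2/3, so AP = 5/6.
print(average_precision([1, 0, 1, 0]))
```

The computation shows why AP rewards placing relevant chunks early: the same two hits at ranks 3 and 4 would score (1/3 + 2/4) / 2 instead, even though the set of retrieved chunks is unchanged.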

9 months, 3 weeks ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
latest post 6 hours ago
I BUILT A FULLY AUTOMATIC MANSPLAINER

All information about GTC and the DGX Spark Raffle is here: https://www.ykilcher.com/gtc Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

Ethereu…

6 hours ago @ youtube.com
Traditional X-Mas Stream

Letsgooo

2 months, 1 week ago @ youtube.com
Traditional Holiday Live Stream

https://ykilcher.com/discord Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https:/…

2 months, 1 week ago @ youtube.com
TiDAR: Think in Diffusion, Talk in Autoregression (Paper Analysis)

Paper: https://arxiv.org/abs/2511.08923 Abstract:

Diffusion language models hold the promise of fast parallel generation, while autoregressive (AR) models typically excel in quality due to their causal structure aligning naturally with language modeling. This raises a fundamental question: can we achieve a synergy with high throughput, higher GPU utilization, and AR level quality? Existing methods fail to effectively balance these two aspects, either prioritizing AR using a weaker model for sequential drafting (speculative decoding), leading to lower drafting efficiency, or using some form of left-to-right (AR-like) decoding logic for diffusion, which still suffers from quality degradation …

2 months, 1 week ago @ youtube.com
Titans: Learning to Memorize at Test Time (Paper Analysis)

Paper: https://arxiv.org/abs/2501.00663 Abstract:

Over more than a decade there has been an extensive research effort on how to effectively utilize recurrent models and attention. While recurrent models aim to compress the data into a fixed-size memory (called hidden state), attention allows attending to the entire context window, capturing the direct dependencies of all tokens. This more accurate modeling of dependencies, however, comes with a quadratic cost, limiting the model to a fixed-length context. We present a new neural long-term memory module that learns to memorize historical context and helps attention to attend to the current context while utilizing long past information. We sh…

2 months, 3 weeks ago @ youtube.com
[Paper Analysis] The Free Transformer (and some Variational Autoencoder stuff)

https://arxiv.org/abs/2510.17558 Abstract:

We propose an extension of the decoder Transformer that conditions its generative process on random latent variables which are learned without supervision thanks to a variational procedure. Experimental evaluations show that allowing such a conditioning translates into substantial improvements on downstream tasks. Author: François Fleuret Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the con…

4 months ago @ youtube.com
[Video Response] What Cloudflare's code mode misses about MCP and tool calling

Theo's Video: https://www.youtube.com/watch?v=bAYZjVAodoo

Cloudflare article: https://blog.cloudflare.com/code-mode/ Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325tr58ygf8…

4 months, 2 weeks ago @ youtube.com
[Paper Analysis] On the Theoretical Limitations of Embedding-Based Retrieval (Warning: Rant)

Paper: https://arxiv.org/abs/2508.21038 Abstract:

Vector embeddings have been tasked with an ever-increasing set of retrieval tasks over the years, with a nascent rise in using them for reasoning, instruction-following, coding, and more. These new benchmarks push embeddings to work for any query and any notion of relevance that could be given. While prior works have pointed out theoretical limitations of vector embeddings, there is a common assumption that these difficulties are exclusively due to unrealistic queries, and those that are not can be overcome with better training data and larger models. In this work, we demonstrate that we may encounter these theoretical limitations in realist…

4 months, 3 weeks ago @ youtube.com
AGI is not coming!

Jack Morris's investigation into GPT-OSS training data https://x.com/jxmnop/status/1953899426075816164?t=3YRhVQDwQLk2gouTSACoqA&s=09

6 months, 4 weeks ago @ youtube.com
Context Rot: How Increasing Input Tokens Impacts LLM Performance (Paper Analysis)

Paper: https://research.trychroma.com/context-rot Abstract:

Large Language Models (LLMs) are typically presumed to process context uniformly—that is, the model should handle the 10,000th token just as reliably as the 100th. However, in practice, this assumption does not hold. We observe that model performance varies significantly as input length changes, even on simple tasks.

In this report, we evaluate 18 LLMs, including the state-of-the-art GPT-4.1, Claude 4, Gemini 2.5, and Qwen3 models. Our results reveal that models do not use their context uniformly; instead, their performance grows increasingly unreliable as input length grows. Authors: Kelly Hong, Anton Troynikov, Jeff Huber Links:

7 months, 2 weeks ago @ youtube.com
Energy-Based Transformers are Scalable Learners and Thinkers (Paper Review)

Paper: https://arxiv.org/abs/2507.02092

Code: https://github.com/alexiglad/EBT

Website: https://energy-based-transformers.github.io/ Abstract:

Inference-time computation techniques, analogous to human System 2 Thinking, have recently become popular for improving model performances. However, most existing approaches suffer from several limitations: they are modality-specific (e.g., working only in text), problem-specific (e.g., verifiable domains like math and coding), or require additional supervision/training on top of unsupervised pretraining (e.g., verifiers or verifiable rewards). In this paper, we ask the question "Is it possible to generalize these System 2 Thinking approaches, and de…

7 months, 2 weeks ago @ youtube.com
Henry AI Labs
latest post: none
3blue1brown
latest post 1 week ago
The most underappreciated formula | Exploring high-dimensional spheres

On the volumes of higher-dimensional spheres

Explore the 3b1b virtual career fair: See https://3b1b.co/talent

Become a supporter for early views of new videos: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Thanks to UC Santa Cruz for letting me film there, and special thanks to Pedro Morales-Almazan for arranging everything. My video on Numberphile with a fun application of this problem: https://youtu.be/6_yU9eJ0NxA Timestamps:

0:00 - Introduction

1:01 - Random puzzle

6:16 - Outside the box

14:35 - Setting up the volume grid

21:14 - Why 4πr^2

25:21 - Archimedes in higher dimensions

36:17 - The general formul…

1 week ago @ youtube.com
The lattice bacteria puzzle

Part of a series of monthly puzzles, done in collaboration with MoMath.

https://momath.org/mindbenders

2 weeks, 2 days ago @ youtube.com
Solution to the ladybug clock puzzle

Solution to last month's probability puzzle.

2 weeks, 4 days ago @ youtube.com
The Hairy Ball Theorem

Unexpected applications and a beautiful proof.

Looking for a new career? Check out https://3b1b.co/talent

Supporters get early access to new videos: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Credits:

Senia Sheydvasser: Co-writing and sphere deformation animations

Paul Dancstep: Those lovely fluffy sphere animations Vince Rubinetti: Music Timestamps:

0:00 - To comb a hairy ball

1:24 - Applications

8:46 - The puzzle of one null point

12:12 - The proof outline

16:41 - Defining orientation

21:44 - Why inside-out is impossible

25:59 - 3b1b Talent

27:44 - Final food for thought ------------------ These animati…

1 month ago @ youtube.com
The ladybug clock puzzle

This is the first in a set of monthly puzzles, curated by Peter Winkler. This one was originally suggested by Richard Stanley. You can sign up to hear his description of the answer at http://momath.org/mindbenders

1 month, 2 weeks ago @ youtube.com
The most absurd product I've made

Because why not make a pi creature neck pillow?

Available at 3b1b.co/store

3 months, 1 week ago @ youtube.com
How Laplace transforms solve differential equations

Studying the forced harmonic oscillator by taking a Laplace transform and studying its poles.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Chapter on the Laplace Transform:

https://youtu.be/j0wJBEZdwLs Chapter on the S-plane and Simple Harmonic Motion:

https://youtu.be/-j8PzkZ70Lg Timestamps:

0:00 - Opening puzzle

1:06 - Key properties of a Laplace Transform

3:29 - Qualitative analysis with Laplace Transforms

4:29 - The Laplace Transforms of a Derivative

6:06 - The forced oscillator

11:59 - Intuition from the transformed solution

1…

4 months ago @ youtube.com
The dynamics of e^(πi)

A fuller version of this explanation, also including the reason we care about complex exponents in the first place: https://youtu.be/-j8PzkZ70Lg

4 months, 3 weeks ago @ youtube.com
But what is a Laplace Transform?

Visualizing the most important tool for differential equations.

Previous chapter: https://youtu.be/-j8PzkZ70Lg

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Artwork by Kurt Bruns Engine animation borrowed with permission from this (excellent) blog: https://ciechanow.ski/internal-combustion-engine/ Timestamps:

0:00 - Understanding the engine

1:16 - Key background ideas

5:41 - Definition and intuition

10:43 - Complex integration

20:43 - Analytic continuation

23:52 - The transform of exponentials

26:15 - A deep look at cos(t)

32:59 - W…

4 months, 3 weeks ago @ youtube.com
Why complex exponents matter | Laplace Transform Prelude

How dynamics explain Euler's formula, and vice versa.

Early view of the Laplace Transform video: https://www.patreon.com/posts/laplace-early-140428165

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos.

Home page: https://www.3blue1brown.com Timestamps:

0:00 - Intro

1:51 - Euler's formula explained dynamically

9:27 - The harmonic oscillator

21:08 - General linear equations

22:47 - Motivating the Laplace Transform ------------------ These animations are largely made using a custom Python library, manim. See the FAQ comments here:

https://3b1b.co/faq#manim Music by Vincent Rubin…

5 months ago @ youtube.com
Why ruler and compass? | Guest video by ⁨@bensyversen⁩

What role were ruler and compass constructions really serving?

Check out Ben's channel: @bensyversen Interview with the author of this video: https://youtu.be/VohYM99j8e0

Supporters get early views of new videos: https://3b1b.co/support Written, produced, edited, and animated by Ben Syversen

Additional editing: Jack Saxon

3d Blender model: Jan-Hendrik Müller

Additional Blender help: Thibaut Modrzyk (@Deepia)

Illustrations: Alex Zepherin/DonDada Studio

Drums: Jeremy Gustin

Additional music from Epidemic Sound Special thanks to Viktor Blåsjö: https://intellectualmathematics.com/opinionated-history-of-mathematics/ References/Recommended reading: Euclid’s Elements:

Visual edition of Book 1: htt…

5 months, 2 weeks ago @ youtube.com
Incomplete open cubes

Full video: https://youtu.be/_BrFKp-U8GI

6 months ago @ youtube.com
Exploration & Epiphany

Sol Lewitt's "Incomplete Open Cubes" and rediscovering Burnside's lemma in group theory

This is a guest video by Paul Dancstep: https://youtu.be/JEeM2ABUMoo

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to share the videos.

Home page: https://www.3blue1brown.com Thanks to the Wadsworth Atheneum for granting permission to use LeWitt's notebooks. Talks by Paul you can find online: What is Category Theory:

https://www.youtube.com/watch?app=desktop&v=eXBwU9ieLL0 How to Predict Eclipses:

https://www.exploratorium.edu/eclipse/video/how-predict-eclipses Theo Jansen's Strandbeests

https://www.youtube.com/w…

6 months ago @ youtube.com
Simulating Phase Change | Guest video by Vilas Winstein

Deriving the Boltzmann formula, defining temperature, and simulating liquid/vapor.

@SpectralCollective has the second part: https://youtu.be/yEcysu5xZH0

You can play with a simulation of this model here: https://vilas.us/simulations/liquidvapor/

These lessons are funded directly by viewers: https://3b1b.co/support

Home page: https://www.3blue1brown.com Notes from Vilas:

1) This open problem is to prove the ergodicity of the deterministic dynamical systems that are used to model the molecule-level physics. A good example of such a dynamical system is the box with particles evolving according to Newton's laws with elastic collisions, like in the video. 2) This video assumes that all probabili…

6 months, 1 week ago @ youtube.com
Two Minute Papers
latest post 1 week, 5 days ago
Adobe & NVIDIA: 10,000,000 Sparkles At 280 FPS

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://perso.telecom-paristech.fr/boubek/papers/Glinty/ Demo: https://www.shadertoy.com/view/tcdGDl Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/ #nvidia #adobe

1 week, 5 days ago @ youtube.com
The Impossible Physics Of Fire

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper is available here:

https://helgewrede.github.io/firex/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

2 weeks, 1 day ago @ youtube.com
NVIDIA’s New AI Tells You When Photos Lie

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here: https://research.nvidia.com/labs/sil/projects/ppisp/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/ #nvidia

2 weeks, 5 days ago @ youtube.com
Anthropic Found Why AIs Go Insane

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://www.anthropic.com/research/assistant-axis Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

3 weeks, 1 day ago @ youtube.com
This Is Now 66x Faster

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://sig25ddmpd.github.io/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

3 weeks, 3 days ago @ youtube.com
NVIDIA’s New AI Just Leveled Up Video Editing

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper and the code are now available here:

https://dvirsamuel.github.io/omnimattezero.github.io/

https://github.com/dvirsamuel/OmnimatteZero Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/ …

4 weeks ago @ youtube.com
New DeepSeek Research - The Age of AI Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://arxiv.org/abs/2501.12948 Sources:

https://x.com/awnihannun/status/1883276535643455790

https://x.com/bcjordan/status/1886825587097878826

https://x.com/izag82161/status/1906347576204640514 Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw…

1 month ago @ youtube.com
What A Time To Be Alive (Our First Ever Music Video)

Thank you so much everyone, this was amazing fun! Guitar tab is available here for free, no paywall nonsense: https://www.dropbox.com/scl/fi/oo1ny6i7mtgu3l006roa5/what-a-time-to-be-alive.gp?rlkey=n0c4xryfip7m15derrudnb8zl&st=fu51yljh&dl=1 Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~…

1 month ago @ youtube.com
Digital Humans Just Took A Massive Leap Forward

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://neuralbodies.github.io/RFGCA/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

1 month ago @ youtube.com
Scientists Just Solved The Hardest Problem in Granular Physics

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://visualcomputing.ist.ac.at/publications/2025/HomogenizedSand/ Previous Disney grains paper:

https://la.disneyresearch.com/publication/multi-scale-modeling-and-rendering-of-granular-materials/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagen…

1 month, 1 week ago @ youtube.com
This Fluid Simulation Should Not Be Possible

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper "Fast Octree Neighborhood Search for SPH Simulations" is available here:

https://andreaslongva.com/pdf/2022-SA-NeighborhoodSearch-compressed.pdf

https://animation.rwth-aachen.de/media/papers/79/2022-SA-NeighborhoodSearch.pdf Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, …

1 month, 2 weeks ago @ youtube.com
Wrinkles Are Weirder Than We Ever Thought

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://wanghmin.github.io/publication/zhang-2025-pie/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/

1 month, 3 weeks ago @ youtube.com
Why Game Physics Is Falling Apart (And How To Fix It)

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here:

https://graphics.cs.utah.edu/research/projects/stable-cosserat-rods/ Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers

Note that just watching the series and leaving a kind comment every now and then is as much support as any of us could ever ask for! Sources:

https://www.youtube.com/watch?v=kO3NsSX1VTg

https://www.youtube.com/watch?v=IQZ_zBX6gQY 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Mi…

1 month, 4 weeks ago @ youtube.com
We Just Turned Down Millions of Dollars. Here Is Why.

Yup. My free course on how to write a light simulation program (ray tracing): https://users.cg.tuwien.ac.at/zsolnai/gfx/rendering-course/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras Bobrovytsky, Tybie Fitzhugh, Ueli Gallizzi

If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers My research: https://cg.tuwien.ac.at/~zsolnai/

Thumbnail design: Felícia Zsolnai-Fehér - http://felicia.hu

2 months ago @ youtube.com
The Bug That Ruined Game Physics For Decades

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Using DeepSeek on Lambda:

https://lambda.ai/inference-models/deepseek-r1 📝 The paper "A Stream Function Solver for Liquid Simulations" is available here:

https://pub.ista.ac.at/group_wojtan/projects/2015_Ando_ASFSfLS/download/vecpotential.pdf 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or here is the original Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Fred R, Gordon …

2 months ago @ youtube.com
DataFest Video
last post: none
JetBrains Research Seminars
last post: none
Yandex Computer Science
last post 1 day, 17 hours ago
Snapped a pile of LEGO and got build instructions in a second 🤯

Andrey Tatarinov, CEO and CTO at Epoch8, presented the computer vision system behind the Brickit app, which scans a pile of LEGO bricks and suggests what can be built from them. In the talk, Andrey explained how his team dealt with rare classes and optimized the pipeline for mobile devices. He also shared the MLOps solutions they use for scalable fine-tuning and model maintenance. The full video is already on the channel! #Brickit, #Lego, #AI, #Нейросети, #ComputerVision, #DeepLearning, #MobileAI, #Tech, #Стартап, #Будущее, #MachineLearning, #Гаджеты, #Технологии, #WOW, #AIприложение

1 day, 17 hours ago @ youtube.com
Saturday ML Party 21.03.2026

Welcome to an evening meetup for ML engineers from Yandex. This time we will talk about advances in image multimodality, book synthesis, and more. Join the Yandex for ML channel to stay on top of all Yandex events and announcements: https://t.me/+LoohiDU_sVk4NGVi Leave your questions for the speakers in the chat with the #вопрос tag: https://t.me/+OsKnLNG-7DE1ZTFi

3 weeks ago @ youtube.com
How topical blocks in Yandex Search work

Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group, looks under the hood of the Yandex search engine. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

3 weeks, 1 day ago @ youtube.com
How search mixes up the Hermitages

Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group, talks about scenarios involving irrelevant queries. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

3 weeks, 6 days ago @ youtube.com
A neural network should know that Barcelona is top-tier

A fragment of the talk by Alexey Kolesov, CTO at Yandex R&D, from Practical ML Conf 2025. Using a simple, "obvious" example from football, he breaks down a serious problem of large language models: why LLMs can be bad at remembering facts about the world, and how factual memory and knowledge application were improved in YandexGPT 5.1. The talk also shows how online reinforcement learning (RL) was stably deployed in a real system to train the model. On training neural networks, LLM memory, and practical ML in production. The full talk is on the channel. #YandexGPT #LLM #AI #MachineLearning #ReinforcementLearning #DeepLearning #ML #ArtificialIntelligence #PracticalML #MLOps

1 month ago @ youtube.com
What a bad search impression looks like

Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group, explains. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

1 month ago @ youtube.com
Why Yandex geo search is needed

Artyom Lobanov, a developer in the product scenarios group of the Search and Advertising Technologies business group, talks about the service's functionality. The full talk from Data Fest Siberia is already on the channel! #поиск, #поисковыесистемы, #геопоиск, #searchengineering, #dataanalytics, #datascience, #ML, #искусственныйинтеллект, #релевантность, #качествопоиска, #YandexSearch, #ЯндексПоиск, #ITразработка, #DataFestSiberia, #ITконференция, #Shorts

1 month, 1 week ago @ youtube.com
What the PMI Score is

A fragment of the talk by Vsevolod Paramonov and Andrey Shevtsov, developers in the pricing analytics group at Yandex Lavka. At Data Fest Siberia they explained what lies under the hood of dynamic pricing in the service, how the demand and elasticity models work, and which mathematical formulas set the process in motion. Watch the full video on our channel! #ценообразование, #dynamicpricing, #pricingmodel, #dataanalytics, #datascience, #ML, #математика, #экономика, #аналитика, #бизнесметрики, #спрос, #эластичность, #YandexLavka, #DataFestSiberia, #ITконференция, #datadriven, #Shorts
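The clip itself doesn't give the formulas, but a standard constant-elasticity demand model illustrates the kind of math such pricing talks describe. This is a generic textbook sketch, not Lavka's actual model, and all numbers are invented:

```python
# Constant-elasticity demand: q(p) = q0 * (p / p0) ** (-e). Profit is
# (p - cost) * q(p); we scan a price grid for the profit-maximizing point
# and compare it with the closed-form optimum p* = cost * e / (e - 1).
q0, p0 = 100.0, 50.0      # baseline demand at baseline price (invented)
e = 2.5                   # price elasticity of demand (e > 1: elastic good)
cost = 20.0               # unit cost

def demand(p):
    return q0 * (p / p0) ** (-e)

def profit(p):
    return (p - cost) * demand(p)

best = max((profit(p), p) for p in [20 + 0.1 * i for i in range(1, 800)])
print(round(best[1], 1), round(cost * e / (e - 1), 1))
```

A real system would fit `q0` and `e` from historical sales data per product and region, which is where the demand and elasticity models from the talk come in.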

1 month, 1 week ago @ youtube.com
What the pricing model looks like

A fragment of the talk by Vsevolod Paramonov and Andrey Shevtsov, developers in the pricing analytics group at Yandex Lavka. At Data Fest Siberia they explained what lies under the hood of dynamic pricing in the service, how the demand and elasticity models work, and which mathematical formulas set the process in motion. Watch the full video on our channel! #ценообразование, #dynamicpricing, #pricingmodel, #dataanalytics, #datascience, #ML, #математика, #экономика, #аналитика, #бизнесметрики, #спрос, #эластичность, #YandexLavka, #DataFestSiberia, #ITконференция, #datadriven, #Shorts

1 month, 2 weeks ago @ youtube.com
What problem pricing solves

A fragment of the talk by Vsevolod Paramonov and Andrey Shevtsov, developers in the pricing analytics group at Yandex Lavka. At Data Fest Siberia they explained what lies under the hood of dynamic pricing in the service, how the demand and elasticity models work, and which mathematical formulas set the process in motion. Watch the full video on our channel! #ценообразование, #dynamicpricing, #pricingmodel, #dataanalytics, #datascience, #ML, #математика, #экономика, #аналитика, #бизнесметрики, #спрос, #эластичность, #YandexLavka, #DataFestSiberia, #ITконференция, #datadriven, #Shorts

1 month, 2 weeks ago @ youtube.com
Computer vision in 2025 / Roman Isachenko

ML Global Recap 2025 is a meetup for the ML community covering the year's main international conferences and the most interesting trends in recommender technologies, computer vision, speech recognition, and NLP. Roman Isachenko, head of the image analysis team at Yandex R&D, spoke at the event about multimodal image analysis (VLMs) and diffusion models for image generation. More ML content: https://t.me/+Ug9D4CjJrJxmZGRi #ML, #MachineLearning, #AI, #DataScience, #MLGlobalRecap2025, #нейросети, #искусственныйинтеллект, #рекомендательныесистемы, #NLP, #ComputerVision, #SpeechRecognition, #LLM, #DeepLearning, #MLтренды, #ITконференция, #Ya…

2 months, 1 week ago @ youtube.com
NLP trends and a review of ICLR and ACL / Alexander Yushkevich

ML Global Recap 2025 is a meetup for the ML community covering the year's main international conferences and the most interesting trends in recommender technologies, computer vision, speech recognition, and NLP. Alexander Yushkevich, head of the base-quality model development team in Search Services and AI, spoke at the event. He showed how the conferences reflect trends in NLP: top LLMs are becoming more closed, while demand grows for alignment & safety, inference, interpretability, and optimization. And new benchmarks keep appearing (where would we be without them). More ML content: https://t.me/+Ug9D4CjJrJxmZGRi #ML, #MachineLearning, #AI, #DataScience, #MLGlobalRecap2025, #не…

2 months, 1 week ago @ youtube.com
Speech technologies at Interspeech and ICASSP 2025 / Boris Sheludko

ML Global Recap 2025 is a meetup for the ML community covering the year's main international conferences and the most interesting trends in recommender technologies, computer vision, speech recognition, and NLP. Boris Sheludko, head of the audio quality team at Yandex R&D, spoke at the event, summing up the two main international speech technology conferences: Interspeech in the Netherlands and ICASSP in India. More ML content: https://t.me/+Ug9D4CjJrJxmZGRi #ML, #MachineLearning, #AI, #DataScience, #MLGlobalRecap2025, #нейросети, #искусственныйинтеллект, #рекомендательныесистемы, #NLP, #ComputerVision, #SpeechRecognition, #LLM, #DeepLearning, …

2 months, 1 week ago @ youtube.com
The main trends in recommender systems / Nikolay Savushkin

ML Global Recap 2025 is a meetup for the ML community covering the year's main international conferences and the most interesting trends in recommender technologies, computer vision, speech recognition, and NLP. Nikolay Savushkin, head of the recommender technologies team at Yandex R&D, spoke at the event. He shared insights from CIKM and RecSys and covered the key trends in recommender systems: foundation and end-to-end models, scaling, multimodality, attention-based ranking, and more. More ML content: https://t.me/+Ug9D4CjJrJxmZGRi #ML, #MachineLearning, #AI, #DataScience, #MLGlobalRecap2025, #нейросети, #искусственныйинтеллект, #ре…

2 months, 1 week ago @ youtube.com
Opening of ML Global Recap 2025 / Alexey Gusakov

ML Global Recap 2025 is a meetup for the ML community covering the year's main international conferences and the most interesting trends in recommender technologies, computer vision, speech recognition, and NLP. The event was opened by Alexey Gusakov, CTO of Search Services and AI. The talk includes a welcome address, a brief overview of NeurIPS, and a bit about the benchmark crisis. More ML content: https://t.me/+Ug9D4CjJrJxmZGRi #ML, #MachineLearning, #AI, #DataScience, #MLGlobalRecap2025, #нейросети, #искусственныйинтеллект, #рекомендательныесистемы, #NLP, #ComputerVision, #SpeechRecognition, #LLM, #DeepLearning, #MLтренды, #ITконференция, #Yandex, #MLсообщество

2 months, 2 weeks ago @ youtube.com
ML Trainings
last post 4 days, 16 hours ago
Meetup #1 | Data Fusion Contest 2026

Introducing the annual Data Fusion Contest series of machine learning competitions. Task page: https://ods.ai/tracks/data-fusion-2026-competitions On February 24 (19:00-20:00 Moscow time) we held the first Data Fusion Contest 2026 meetup in the ODS space on Spatial.Chat. Program:

Alexey Natekin

A breakdown of Task 1 "Страж", Task 2 "Киберполка", and Task 3 "Герои", plus a Q&A and discussion with participants.

_____

Our social media:

Telegram: https://t.me/datafest

VK: https://vk.com/datafest

Job vacancies channel on Telegram: https://t.me/odsjobs

Course updates channel: https://t.me/odscourses

How to join the ODS community Mattermost chat: https://ods.ai/tracks/mattermost

4 days, 16 hours ago @ youtube.com
Fears about LLMs and how justified they are
4 days, 19 hours ago @ youtube.com
How one GitHub account changed Kai Gritun's life
4 days, 19 hours ago @ youtube.com
Drones versus drones: the future of warfare
4 days, 19 hours ago @ youtube.com
The dot-com boom and AI: different times, different fears
4 days, 19 hours ago @ youtube.com
The future of work in the age of artificial intelligence
4 days, 20 hours ago @ youtube.com
Automated systems and their consequences in the 1960s
4 days, 20 hours ago @ youtube.com
Captain's Bridge #8: Anthropic vs. the Pentagon | What 2028 was like | A dog as game designer

Today the captains' guest was Timur Madzhidov, a specialist in AI for chemistry and Doctor of Chemical Sciences.

0:00:00 Start

0:00:40 Anthropic vs. the Pentagon

0:16:54 LLMs can't count

0:29:35 Models have memorized SWE-bench

0:41:23 An AI agent built a business

0:45:14 The AI boom is no longer a wow

0:52:21 What 2028 was like

0:57:07 Claude vs. COBOL

1:11:35 AI, get to work

1:19:20 A dog as game designer

AI summary: This podcast discusses current news in artificial intelligence, in particular Anthropic's contract with the Pentagon and ethics questions around the use of AI in chemistry. The guests share their views on the reputational risks of using AI for military purposes, as well as on the neutrality of algorithms in chemistry tasks. In th…

5 days, 21 hours ago @ youtube.com
Count-based and prediction-based vectors in distributional semantics

Natural Language Processing & LLMs course (stream 10, spring 2026):

https://ods.ai/tracks/nlp-course-spring-2026

1 week, 2 days ago @ youtube.com
The co-occurrence matrix and SVD in action

Natural Language Processing & LLMs course (stream 10, spring 2026):

https://ods.ai/tracks/nlp-course-spring-2026
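As a companion to the clip title, here is a minimal count-based sketch with a toy corpus and window size invented for illustration: build a symmetric word co-occurrence matrix, then extract its top singular direction by power iteration, i.e. a one-dimensional truncated SVD. Real code would use numpy or scipy for the factorization.

```python
import math

# Toy corpus; co-occurrence counts within a +/-2 word window.
corpus = ("the cat sat on the mat the dog sat on the rug "
          "the cat chased the dog the dog chased the cat").split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
n = len(vocab)

M = [[0.0] * n for _ in range(n)]
win = 2
for i, w in enumerate(corpus):
    for j in range(max(0, i - win), min(len(corpus), i + win + 1)):
        if j != i:
            M[idx[w]][idx[corpus[j]]] += 1.0

# Power iteration: for a symmetric matrix the top eigenvector is also the
# top singular vector.
v = [1.0] * n
for _ in range(100):
    u = [sum(M[r][c] * v[c] for c in range(n)) for r in range(n)]
    norm = math.sqrt(sum(x * x for x in u))
    v = [x / norm for x in u]

# A word's 1-D "embedding" is its co-occurrence row projected onto v; the
# most frequent word, "the", ends up with the largest value.
embed = {w: sum(M[idx[w]][c] * v[c] for c in range(n)) for w in vocab}
print(sorted(embed, key=embed.get, reverse=True))
```

Keeping the top k singular directions instead of one gives the classic k-dimensional count-based word vectors the course contrasts with predictive ones.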

1 week, 2 days ago @ youtube.com
How to predict a word's vector from its context

Natural Language Processing & LLMs course (stream 10, spring 2026):

https://ods.ai/tracks/nlp-course-spring-2026

1 week, 2 days ago @ youtube.com
How a computer understands words through numbers and context

Natural Language Processing & LLMs course (stream 10, spring 2026):

https://ods.ai/tracks/nlp-course-spring-2026

1 week, 2 days ago @ youtube.com
Vector representations and the first steps of deep learning

Natural Language Processing & LLMs course (stream 10, spring 2026):

https://ods.ai/tracks/nlp-course-spring-2026

1 week, 2 days ago @ youtube.com
word2vec: two vectors for one word

Natural Language Processing & LLMs course (stream 10, spring 2026):

https://ods.ai/tracks/nlp-course-spring-2026
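The "two vectors for one word" in the title refers to skip-gram's separate center-word and context-word embedding matrices. A minimal negative-sampling sketch under that reading, with a toy corpus and hyperparameters chosen only for illustration:

```python
import math, random

# Every word gets a center-word vector (W_in) and a separate context-word
# vector (W_out); training raises the dot product of true (center, context)
# pairs and lowers it for random negative pairs.
random.seed(0)
corpus = "the cat sat on the mat the dog sat on the mat".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
dim = 8

W_in = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]
W_out = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_pair(center, context, label, lr=0.1):
    """One negative-sampling SGD step on a (center, context) pair."""
    vi, vo = W_in[center], W_out[context]
    g = (sigmoid(sum(a * b for a, b in zip(vi, vo))) - label) * lr
    for d in range(dim):
        vi[d], vo[d] = vi[d] - g * vo[d], vo[d] - g * vi[d]

for _ in range(200):
    for i, w in enumerate(corpus):
        for j in range(max(0, i - 1), min(len(corpus), i + 2)):
            if j != i:
                train_pair(idx[w], idx[corpus[j]], 1.0)                # positive
                train_pair(idx[w], random.randrange(len(vocab)), 0.0)  # negative

# A frequently co-occurring pair like (sat, on) should now score higher
# than a never co-occurring one like (cat, mat).
score = lambda c, o: sigmoid(sum(a * b for a, b in
                                 zip(W_in[idx[c]], W_out[idx[o]])))
print(score("sat", "on") > score("cat", "mat"))
```

After training, most applications keep only `W_in` (or average the two matrices) as the final word embeddings.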

1 week, 2 days ago @ youtube.com
A royal gift for a bride in Turkey: they gave her memory
1 week, 5 days ago @ youtube.com
Primer
last post 2 months ago
Taking AI Doom Seriously For 62 Minutes

Patreon: https://www.patreon.com/primerlearning

80,000 Hours: 80000hours.org/primer

https://www.desmos.com/calculator/a5pfjtr4tr

Other connections:

Discord: https://discord.gg/NbruaNW

Twitch: https://www.twitch.tv/justin_helps

Store: https://store.dftba.com/collections/primer

Reddit: https://www.reddit.com/r/primerlearning/

Bsky: https://bsky.app/profile/justinhelps.bsky.social

Twitter: https://twitter.com/primerlearning

Links to other resources:

https://yoshuabengio.org/2024/07/09/reasoning-through-arguments-against-taking-ai-safety-seriously/

https://www.youtube.com/c/robertmilesai

https://www.youtube.com/@Siliconversations

https://www.youtube.com/@Go-Meta

https://www.youtube.com/@Dwarkes…

2 months ago @ youtube.com
Simulating a single brain cell

Patreon:

https://www.patreon.com/primerlearning

Helpful resources if you want to learn more about neural networks:

https://www.youtube.com/@AndrejKarpathy

https://course.fast.ai/

https://www.youtube.com/@WelchLabsVideo

https://www.youtube.com/@3blue1brown Early papers. These probably aren't helpful for understanding the concepts in this video, but if you're interested in history.

The Perceptron – A perceiving and recognizing automaton: https://bpb-us-e2.wpmucdn.com/websites.umass.edu/dist/a/27637/files/2016/03/rosenblatt-1957.pdf

The Perceptron: A probabilistic model for information storage and organization in the brain: https://www.ling.upenn.edu/courses/cogs501/Rosenblatt1958.pdf A Logical…
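For historical flavor, the learning rule from Rosenblatt's perceptron papers linked above fits in a few lines. This is an illustrative toy with an invented AND-gate dataset, not code from the video:

```python
# Rosenblatt-style perceptron: a weighted sum with a threshold, trained by
# nudging the weights toward misclassified examples.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = [0.0, 0.0]
b = 0.0
lr = 0.1

def predict(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

for _ in range(20):                      # a few passes over the data
    for x, y in data:
        err = y - predict(x)             # 0 when correct, +/-1 when wrong
        w[0] += lr * err * x[0]
        w[1] += lr * err * x[1]
        b += lr * err

print([predict(x) for x, _ in data])     # learns AND: [0, 0, 0, 1]
```

A single biological neuron, as the video discusses, is far more complex than this thresholded sum, which is what makes simulating even one cell interesting.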

5 months, 1 week ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
last post 6 days ago
#492 – Rick Beato: Greatest Guitarists of All Time, History & Future of Music

Rick Beato is a music educator, interviewer, producer, songwriter, and a true multi-instrument musician, playing guitar, bass, cello & piano.

His incredible YouTube channel celebrates great musicians & musical ideas, and helps millions of people fall in love with great music all over again.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep492-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lex

BetterHelp: Online therapy and counseling.

Go to https://drinkLMNT.com/lex

Fin: AI agent for customer service.

6 days ago @ lexfridman.com
#491 – OpenClaw: The Viral AI Agent that Broke the Internet – Peter Steinberger

Peter Steinberger is the creator of OpenClaw, an open-source AI agent framework that’s the fastest-growing project in GitHub history.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep491-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://coderabbit.ai/lex

Fin: AI agent for customer service.

Go to https://fin.ai/lex

Blitzy: AI agent for large enterprise codebases.

Go to https://drinkLMNT.com/lex

OUTLINE:
(00:00) – Introduction
(03:51) – Sponsors, Comments, and Reflections
(15:29) – OpenClaw origin story
(18:48) – Mind-blowing moment
(28:15) – Why OpenClaw went viral
(32:12) – Self-modifying AI agent
(36:57)…

3 weeks, 2 days ago @ lexfridman.com
#490 – State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI

Nathan Lambert and Sebastian Raschka are machine learning researchers, engineers, and educators.

Sebastian Raschka is the author of Build a Large Language Model (From Scratch) and Build a Reasoning Model (From Scratch).

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep490-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

(25:11) – ChatGPT vs Claude vs Gemini vs Grok: Who is winning?

(36:11) – Best AI for coding
(43:02) – Open Source vs Closed Source LLMs
(54:41) – Transformers: Evolution of LLMs since 2019
(1:02:38) – AI Scaling Laws: Are they dead or still holding?

1 month ago @ lexfridman.com
#489 – Paul Rosolie: Uncontacted Tribes in the Amazon Jungle

Paul Rosolie is a naturalist, explorer, author of a new book titled Junglekeeper, and is someone who has dedicated his life to protecting the Amazon rainforest.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep489-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://perplexity.ai/
BetterHelp: Online therapy and counseling.

Go to https://fin.ai/lex
Miro: Online collaborative whiteboard platform.

Go to https://miro.com/
MasterClass: Online classes from world-class experts.

1 month, 3 weeks ago @ lexfridman.com
#488 – Infinity, Paradoxes that Broke Mathematics, Gödel Incompleteness & the Multiverse – Joel David Hamkins

Joel David Hamkins is a mathematician and philosopher specializing in set theory, the foundations of mathematics, and the nature of infinity, and he’s the #1 highest-rated user on MathOverflow.

He is also the author of several books, including Proof and the Art of Mathematics and Lectures on the Philosophy of Mathematics.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep488-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://masterclass.com/lexpod
OUTLINE:
(00:00) – Introduction
(01:58) – Sponsors, Comments, and Reflections
(15:40) – Infinity & paradoxes
(1:02:50) – Russell’s paradox
(1:15:57) – Gödel’s…

2 months ago @ lexfridman.com
#487 – Irving Finkel: Deciphering Secrets of Ancient Civilizations & Flood Myths

Irving Finkel is a scholar of ancient languages and a longtime curator at the British Museum, renowned for his expertise in Mesopotamian history and cuneiform writing.

He specializes in reading and interpreting cuneiform inscriptions, including tablets from Sumerian, Akkadian, Babylonian, and Assyrian contexts.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep487-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lex
Miro: Online collaborative whiteboard platform.

Go to https://miro.com/
Chevron: Reliable energy for data centers.

2 months, 3 weeks ago @ lexfridman.com
#486 – Michael Levin: Hidden Reality of Alien Intelligence & Biological Life

Michael Levin is a biologist at Tufts University working on novel ways to understand and control complex pattern formation in biological systems.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep486-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lex
Miro: Online collaborative whiteboard platform.

Go to https://miro.com/
MasterClass: Online classes from world-class experts.

(2:42:41) – Mind uploading
(3:01:22) – Alien intelligence
(3:16:17) – Advice for young people
(3:22:46) – Questions for AGI

3 months ago @ lexfridman.com
#485 – David Kirtley: Nuclear Fusion, Plasma Physics, and the Future of Energy

David Kirtley is a nuclear fusion engineer and CEO of Helion Energy, a company working on building the world's first commercial fusion power plant by 2028.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep485-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Transcript: https://lexfridman.com/david-kirtley-transcript

CONTACT LEX:

Feedback – give feedback to Lex: https://lexfridman.com/survey

AMA – submit questions, videos or call-in: https://lexfridman.com/ama

Hiring – join our team: https://lexfridman.com/hiring

Other – other ways to get in touch: https://lexfridman.com/contact

EPISODE LINKS:

David's X: htt…

3 months, 2 weeks ago @ lexfridman.com
#484 – Dan Houser: GTA, Red Dead Redemption, Rockstar, Absurd & Future of Gaming

Dan Houser is a co-founder of Rockstar Games and the legendary creative mind behind the Grand Theft Auto (GTA) and Red Dead Redemption series of video games.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep484-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://box.com/ai
UPLIFT Desk: Standing desks and office ergonomics.

Go to https://drinkLMNT.com/lex
OUTLINE:
(00:00) – Introduction
(01:29) – Sponsors, Comments, and Reflections
(11:32) – Greatest films of all time
(23:45) – Making video games
(26:36) – GTA 3
(29:55) – Open world video games
(32:42) – Character creation
(36:09) – Superintelligent AI in A Bette…

4 months ago @ lexfridman.com
#483 – Julia Shaw: Criminal Psychology of Murder, Serial Killers, Memory & Sex

Julia Shaw is a criminal psychologist and author who in her books explores human nature, including psychopathy, violent crime, the psychology of evil, police interrogation, false memory manipulation, deception detection, and human sexuality.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep483-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lex
BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex
LMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lex
AG1: All-in-one daily nutrition drink.

4 months, 3 weeks ago @ lexfridman.com
#482 – Pavel Durov: Telegram, Freedom, Censorship, Money, Power & Human Nature

Pavel Durov is the founder and CEO of Telegram.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep482-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Transcript: https://lexfridman.com/pavel-durov-transcript
CONTACT LEX:
Feedback – give feedback to Lex: https://lexfridman.com/survey
AMA – submit questions, videos or call-in: https://lexfridman.com/ama
Hiring – join our team: https://lexfridman.com/hiring
Other – other ways to get in touch: https://lexfridman.com/contact
EPISODE LINKS:
Pavel’s Telegram: https://t.me/durov
Pavel’s X: https://x.com/durov
Telegram: https://telegram.org/
Telegram Contests: https://contest.c…

5 months, 1 week ago @ lexfridman.com
#481 – Norman Ohler: Hitler, Nazis, Drugs, WW2, Blitzkrieg, LSD, MKUltra & CIA

Norman Ohler is a historian and author of “Blitzed: Drugs in the Third Reich,” a book that investigates the role of psychoactive drugs, particularly stimulants such as methamphetamine, in the military history of World War II.

Two legendary historians, Ian Kershaw and Antony Beevor, have given the book high praise for its depth of research.

Norman also wrote “Tripped: Nazi Germany, the CIA, and the Dawn of the Psychedelic Age”, and he is working on a new book “Stoned Sapiens” looking at the history of human civilization through the lens of drugs.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep481-sc
See below for timestamps, transcript, and to give f…

5 months, 2 weeks ago @ lexfridman.com
#480 – Dave Hone: T-Rex, Dinosaurs, Extinction, Evolution, and Jurassic Park

Dave Hone is a paleontologist, expert on dinosaurs, co-host of the Terrible Lizards podcast, and author of numerous scientific papers and books on the behavior and ecology of dinosaurs.

He lectures at Queen Mary University of London on topics of Ecology, Zoology, Biology, and Evolution.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep480-sc
See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://go.lindy.ai/lex
BetterHelp: Online therapy and counseling.

Go to https://shopify.com/lex
LMNT: Zero-sugar electrolyte drink mix.

6 months ago @ lexfridman.com
#479 – Dave Plummer: Programming, Autism, and Old-School Microsoft Stories

Dave Plummer is a programmer, former Microsoft software engineer (Windows 95, NT, XP), creator of Task Manager, author of two books on autism, and host of the Dave’s Garage YouTube channel, where he shares stories from his career, insights on software development, and deep dives into technology.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep479-sc
See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lex
ZocDoc: App that helps patients find healthcare providers.

Go to https://zocdoc.com/lex
Fin: AI agent for customer service.

Go to https://fin.ai/lex
Allio Capital: AI-powered investment app that use…

6 months, 1 week ago @ lexfridman.com
#478 – Scott Horton: The Case Against War and the Military Industrial Complex

Scott Horton is the director of the Libertarian Institute, editorial director of Antiwar.com, host of The Scott Horton Show, co-host of Provoked, and for the past three decades a staunch critic of U.S. military interventionism.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep478-sc
See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://alliocapital.com/
Hampton: Community for high-growth founders and CEOs.

Go to https://joinhampton.com/lex
BetterHelp: Online therapy and counseling.

Go to https://drinkag1.com/lex
OUTLINE:
(00:00) – Introduction
(00:35) – Sponsors, Comments, and Reflections
(09:14) – From the Cold War to …

6 months, 2 weeks ago @ lexfridman.com
Microsoft Research Podcast
last post 3 days, 15 hours ago
Trailer: The Shape of Things to Come

Join Microsoft’s Doug Burger and guests as they dig into the fundamental truths about AI and how it will reshape the future.

Technical advances are moving at such a rapid pace that it can be challenging to define the tomorrow we’re working toward.

In The Shape of Things to Come, Microsoft research leader Doug Burger and experts from across disciplines tease out the thorniest AI issues facing technologists, policymakers, business decision-makers, and other stakeholders today.

“It’s important to understand what the emerging shapes are and how we should respond.” – Doug Burger, Technical Fellow and Corporate Vice President, Microsoft Research
About Doug Burger
Doug Burger is a research leader in …

3 days, 15 hours ago @ microsoft.com
Ideas: Community building, machine learning, and the future of AI

This week, machine learning researchers around the world will be attending the annual Conference on Neural Information Processing Systems, or NeurIPS.

In this series, we’ll explore the technologies that are shaping our future and the big ideas that propel them forward.

So around that time when I started my PhD at Penn, I was working in machine learning theory and algorithmic economics.

How had you experienced a lack of community or network of women in machine learning before the founding of WiML?

So particularly when working on topics related to fairness, I’ve ended up focusing a bunch on stuff to do with marginalized groups as part of my responsible AI work.

3 months ago @ microsoft.com
Ideas: More AI-resilient biosecurity with the Paraphrase Project

Today, I’m excited to talk about the Paraphrase Project, an effort I co-led exploring how advances in AI tools for protein design might impact biosecurity.

These “patches,” akin to those in cybersecurity, have now been shared with organizations globally to strengthen biosecurity screening.

The project highlights that the same AI tools capable of incredible good can also be misused, requiring us to be vigilant, thoughtful, and creative so we continue to get the most benefit out of AI tools while working to ensure that we avoid costly misuses.

So things like, how similar is this to that template, wild-type protein structure that we used as our conditioning information?

But I feel like broadly…

5 months ago @ microsoft.com
Coauthor roundtable: Reflecting on healthcare economics, biomedical research, and medical education

KOHANE: So I think you’ve “nerd sniped” me because you [LAUGHTER]—which is all too easy—but I think there’s a central issue here.

But I actually think this is dark matter of human organizational technology that is not well understood.

AZEEM AZHAR: We didn’t talk about, you know, AI in its ability to potentially do this, which is to extend the clinician’s presence throughout the week.

And so I think there’s always going to be an opening for either differences of opinion or agreeing with you too much.

And this gets into whether AI is really going to get almost to the ab initio understanding of human biology.

6 months, 2 weeks ago @ microsoft.com
Reimagining healthcare delivery and public health with AI


7 months ago @ microsoft.com
Navigating medical education in the era of generative AI

Prior to med school, Daniel pursued experiences that cultivated his interest in the application of AI in medical practice and education.

Really, really looking forward to this chat.

There’s AI before ChatGPT and before, you know, generative AI really became a big thing, and then afterwards.

And then after we talk about what’s really happening, what do you think should happen in medical education given the reality of generative AI?

And I do agree [that] AI really gives us real hope that we can make it true.

7 months, 2 weeks ago @ microsoft.com
AI Testing and Evaluation: Reflections

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

We have examples, like the pharmaceutical or medical device industry experts with whom you spoke, that’s really, you know, testing … there is a pre-deployment requirement.

And the third is just how rigid versus adaptive these testing and evaluation regimes or frameworks are in these different domains.

I really agree that there has been a lot of emphasis to date on, sort of, testing models upstream, the AI model evaluation.

You know, I think there’s been real progress already in the AI evaluation and testing ecosystem in the public-private partnership context.

7 months, 2 weeks ago @ microsoft.com
AI Testing and Evaluation: Learnings from cybersecurity

Absolutely, I really, really was.

As a principal director on the Microsoft AI Red Team, Tori leads all AI security and safety red team operations, as well as dangerous capability testing, to directly inform C-suite decision-makers.

This year, we’ve pulled a lot of those assets and insights into the Azure [AI] Foundry AI Red Teaming Agent.

So you can get a little taste of what we do day to day in the AI Red Teaming Agent.

WESTERHOFF: I think the most important takeaway from those lessons is that AI security is truly a team sport.

7 months, 3 weeks ago @ microsoft.com
How AI will accelerate biomedical research and discovery

Dr. Eric Topol is the executive vice president of the biomedical research non-profit Scripps Research, where he founded and now directs the Scripps Research Translational Institute.

Let’s continue our deep dive on AI and biomedical research with this conversation with Noubar Afeyan:
LEE: Noubar, thanks so much for joining.

And there’s the origin story of contact with AI, you know, before the emergence of generative AI and afterwards.

What is going on today with respect to AI really being used for something meaningful in the design and development of drugs?

TOPOL: You would read about how, you know, data is the new oil and, you know, gold and whatnot.

7 months, 4 weeks ago @ microsoft.com
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

During the pre-market phase, medical testing establishes baseline safety and effectiveness metrics through bench testing, performance standards, and clinical studies.

SULLIVAN: So medical devices face a pretty prescriptive multi-level testing path before they hit the market.

We are looking into medical devices, as well, obviously, but also other technologies in advanced medical computing.

So we see Phase 3 trials as something that occurs in the medical devices and pharmaceuticals field.

8 months ago @ microsoft.com
AI Testing and Evaluation: Learnings from genome editing

As generative AI continues to advance, Microsoft has gathered a range of experts—from genome editing to cybersecurity—to share how their fields approach evaluation and risk assessment.

CHARO: Well, you know, genome editing is both very old and very new.

Now the earliest forms of genome editing were very inefficient, and so we didn’t worry that much.

But the bottom-line thing to remember, the way to really think about it is, we don’t regulate genome editing; we regulate the things that use genome editing.

And she said, you know, we don’t regulate genome editing; we regulate the things that use genome editing.

8 months, 1 week ago @ microsoft.com
AI Testing and Evaluation: Learnings from Science and Industry

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

And I think, really, there are two reasons why tech is so, kind of, representative of that kind of challenge that I’ve always found fascinating.

Continues to be a really important topic in the AI policy conversation right now, I think, for really good reason.

Testing is an important component for governance and AI and, of course, in all of these other domains, as well.

I think about almost, like, in the near to mid-term, like three issues that we need to address in the AI, kind of, policy and testing context.

8 months, 2 weeks ago @ microsoft.com
The AI Revolution in Medicine, Revisited: How AI is reshaping the future of healthcare and medical research

LEE: Yeah, yeah.

It cannot—as, you know, Bill was saying—it cannot learn from your document.

And I don’t know if the two of you remember, but I ended up doing a lot of tests.

I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3-mini.

Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.

8 months, 3 weeks ago @ microsoft.com
What AI's impact on individuals means for the health workforce and industry

So I don’t think we should be surprised that business schools matter on this because we care about management.

That’s really going to change the way, like, middle school works, was my thinking at the time.

We’ve gone from AI being highly discriminative to AI that’s able to explore the world in particular ways.

The symptoms that they’re showing are quite different, and also their compliance is really, really different.

LEE: Yeah, really, really interesting.

9 months, 1 week ago @ microsoft.com
Abstracts: Zero-shot models in single-cell biology with Alex Lu

And single-cell foundation models claim to be capable of unraveling deeper insights than ever before.

Basically, we showed that single-cell foundation models perform worse in settings that are fundamental to biological discovery than much simpler machine learning and statistical methods that were used in the field before single-cell foundation models emerged and are the go-to standard for unpacking meaning from these complicated experiments.

And the way to understand this is because single-cell foundation models are trained in a way that tries to expose these models to millions of single-cells.

But let’s also talk about the impact for methodologists, people who are trying to improve these s…

9 months, 2 weeks ago @ microsoft.com
NLP Highlights
last post None
Data Skeptic
last post 1 week ago
Collective Altruism in Recommender Systems

Ekaterina (Kat) Filadova from MIT EECS joins us to discuss strategic learning in recommender systems—what happens when users collectively coordinate to game recommendation algorithms. Kat's research reveals surprising findings: algorithmic "protest movements" can paradoxically help platforms by providing clearer preference signals, and the challenge of distinguishing coordinated behavior from bot activity is more complex than it appears. This episode explores the intersection of machine learning and game theory, examining what happens when your training data actively responds to your algorithm.

1 week ago @ dataskeptic.com
Niche vs Mainstream

Anas Buhayh discusses multi-stakeholder fairness in recommender systems and the S'mores framework—a simulation allowing users to choose between mainstream and niche algorithms. His research shows specialized recommenders improve utility for niche users while raising questions about filter bubbles and data privacy.

2 weeks, 2 days ago @ dataskeptic.com
Healthy Friction in Job Recommender Systems

In this episode, host Kyle Polich speaks with Roan Schellingerhout, a fourth-year PhD student at Maastricht University, about explainable multi-stakeholder recommender systems for job recruitment. Roan discusses his research on creating AI-powered job matching systems that balance the needs of multiple stakeholders—job seekers, recruiters, HR professionals, and companies. The conversation explores different types of explanations for job recommendations, including textual, bar chart, and graph-based formats, with findings showing that lay users strongly prefer simple textual explanations over more technical visualizations. Roan shares insights from his "healthy friction" study, which tested …

1 month ago @ dataskeptic.com
Fairness in PCA-Based Recommenders

In this episode, we explore the fascinating world of recommender systems and algorithmic fairness with David Liu, Assistant Research Professor at Cornell University's Center for Data Science for Enterprise and Society. David shares insights from his research on how machine learning models can inadvertently create unfairness, particularly for minority and niche user groups, even without any malicious intent. We dive deep into his groundbreaking work on Principal Component Analysis (PCA) and collaborative filtering, examining why these fundamental techniques sometimes fail to serve all users equally. David introduces the concept of "power niche users" - highly active users with specialized in…

1 month, 1 week ago @ dataskeptic.com
Video Recommendations in Industry

In this episode, Kyle Polich sits down with Cory Zechmann, a content curator working in streaming television with 16 years of experience running the music blog "Silence Nogood." They explore the intersection of human curation and machine learning in content discovery, discussing the concept of "algatorial" curation—where algorithms and editorial expertise work together. Key topics include the cold start problem, why every metric is just a "proxy metric" for what users actually want, the challenge of filter bubbles, and the importance of balancing familiarity with discovery. Cory shares insights on why TikTok's algorithm works so well (clean data and massive interaction volume), the crucial …

2 months, 1 week ago @ dataskeptic.com
Eye Tracking in Recommender Systems

In this episode, Santiago de Leon takes us deep into the world of eye tracking and its revolutionary applications in recommender systems. As a researcher at the Kempelen Institute and Brno University, Santiago explains the mechanics of eye tracking technology—how it captures gaze data and processes it into fixations and saccades to reveal user browsing patterns. He introduces the groundbreaking RecGaze dataset, the first eye tracking dataset specifically designed for recommender systems research, which opens new possibilities for understanding how users interact with carousel interfaces like Netflix. Through collaboration between psychologists and AI researchers, Santiago's work demonstrate…

2 months, 2 weeks ago @ dataskeptic.com
Cracking the Cold Start Problem

In this episode of Data Skeptic, we dive deep into the technical foundations of building modern recommender systems. Unlike traditional machine learning classification problems where you can simply apply XGBoost to tabular data, recommender systems require sophisticated hybrid approaches that combine multiple techniques. Our guest, Boya Xu, an assistant professor of marketing at Virginia Tech, walks us through a cutting-edge method that integrates three key components: collaborative filtering for dimensionality reduction, embeddings to represent users and items in latent space, and bandit learning to balance exploration and exploitation when deploying new recommendations. Boya shares insigh…

2 months, 4 weeks ago @ dataskeptic.com
Designing Recommender Systems for Digital Humanities

In this episode of Data Skeptic, we explore the fascinating intersection of recommender systems and digital humanities with guest Florian Atzenhofer-Baumgartner, a PhD student at Graz University of Technology. Florian is working on Monasterium.net, Europe's largest online collection of historical charters, containing millions of medieval and early modern documents from across the continent. The conversation delves into why traditional recommender systems fall short in the digital humanities space, where users range from expert historians and genealogists to art historians and linguists, each with unique research needs and information-seeking behaviors. Florian explains the technical challen…

3 months, 1 week ago @ dataskeptic.com
DataRec Library for Reproducibility in Recommender Systems

In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich explores DataRec, a new Python library designed to bring reproducibility and standardization to recommender systems research. Guest Alberto Carlo Mario Mancino, a postdoc researcher from Politecnico di Bari, Italy, discusses the challenges of dataset management in recommendation research—from version control issues to preprocessing inconsistencies—and how DataRec provides automated downloads, checksum verification, and standardized filtering strategies for popular datasets like MovieLens, Last.fm, and Amazon reviews. The conversation covers Alberto's research journey through knowledge graphs, graph-based recommen…

3 months, 3 weeks ago @ dataskeptic.com
Shilling Attacks on Recommender Systems

In this episode of Data Skeptic's Recommender Systems series, Kyle sits down with Aditya Chichani, a senior machine learning engineer at Walmart, to explore the darker side of recommendation algorithms. The conversation centers on shilling attacks—a form of manipulation where malicious actors create multiple fake profiles to game recommender systems, either to promote specific items or sabotage competitors. Aditya, who researched these attacks during his undergraduate studies at SPIT before completing his master's in computer science with a data science specialization at UC Berkeley, explains how these vulnerabilities emerge particularly in collaborative filtering systems. From promoting a …

4 months ago @ dataskeptic.com
Music Playlist Recommendations

In this episode, Rebecca Salganik, a PhD student at the University of Rochester with a background in vocal performance and composition, discusses her research on fairness in music recommendation systems. She explores three key types of fairness—group, individual, and counterfactual—and examines how algorithms create challenges like popularity bias (favoring mainstream content) and multi-interest bias (underserving users with diverse tastes). Rebecca introduces LARP, her multi-stage multimodal framework for playlist continuation that uses contrastive learning to align text and audio representations, learn song relationships, and create playlist-level embeddings to address the cold start prob…

4 months, 1 week ago @ dataskeptic.com
Bypassing the Popularity Bias
4 months, 3 weeks ago @ dataskeptic.com
Sustainable Recommender Systems for Tourism

In this episode, we speak with Ashmi Banerjee, a doctoral candidate at the Technical University of Munich, about her pioneering research on AI-powered recommender systems in tourism. Ashmi illuminates how these systems can address exposure bias while promoting more sustainable tourism practices through innovative approaches to data acquisition and algorithm design. Key highlights include leveraging large language models for synthetic data generation, developing recommendation architectures that balance user satisfaction with environmental concerns, and creating frameworks that distribute tourism more equitably across destinations. Ashmi's insights offer valuable perspectives for both AI res…

4 months, 4 weeks ago @ dataskeptic.com
Interpretable Real Estate Recommendations

In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich interviews Dr. Kunal Mukherjee, a postdoctoral research associate at Virginia Tech, about the paper "Z-REx: Human-Interpretable GNN Explanations for Real Estate Recommendations". The discussion explores how the post-COVID real estate landscape has created a need for better recommendation systems that can introduce home buyers to emerging neighborhoods they might not know about. Dr. Mukherjee explains how his team developed a graph neural network approach that not only recommends properties but provides human-interpretable explanations for why certain regions are suggested. The conversation covers the advantages o…

5 months, 2 weeks ago @ dataskeptic.com
Why Am I Seeing This?

In this episode of Data Skeptic, we explore the challenges of studying social media recommender systems when exposure data isn't accessible. Our guests Sabrina Guidotti, Gregor Donabauer, and Dimitri Ognibene introduce their innovative "recommender neutral user model" for inferring the influence of opaque algorithms.

5 months, 4 weeks ago @ dataskeptic.com
SuperDataScience
latest post 16 hours ago
972: In Case You Missed It in February 2026

Jon Krohn recaps the month of February in this episode of In Case You Missed It. Across four interviews with Will Falcon (Episode 965), Tom Griffiths (Episode 969), Antje Barth (Episode 963), and Praveen Murugesan (Episode 967), Jon questions the brains behind some of the AI industry’s most innovative companies about launching a startup, developing a popular product, what artificial intelligence can still learn from human intelligence, and how AI might finally start to think on its own. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/972⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

16 hours ago @ podtrac.com
971: 90% of The World’s Data is Private; Lin Qiao’s Fireworks AI is Unlocking It

Lin Qiao, CEO of Fireworks AI, talks to Jon Krohn about how she builds effective models quickly, why coding agents can perform at the level of a junior engineer, and what she attributes the success of Fireworks AI to: true to its name, the company exploded into the AI industry with over $300 million secured in venture capital, as well as netting a further $250 million in Series C funding. For Lin, many enterprises miss out by not being familiar with open models. Open models give the user a lot of control, offering customizability at a much lower price point. Listen to hear how Fireworks AI helps companies continue to save money through AI. This episode is brought to you by Dell…

3 days, 16 hours ago @ podtrac.com
970: The “100x Engineer”: How to Be One, But Should You?

Working with code-gen models and Claude Code: In this Five-Minute Friday, Jon Krohn addresses how AI superstars like Andrej Karpathy are using AI agents in their coding work, the outlook for code-gen in 2026, and how you can get started. Hear about Karpathy’s work as well as the soaring success of Peter Steinberger and how he managed to surpass the GitHub commit rate of teams as an individual working with AI agents. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/970⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 week ago @ podtrac.com
969: The Laws of Thought: The Math of Minds and Machines, with Prof. Tom Griffiths

Princeton Professor Tom Griffiths talks to Jon Krohn about his new book, The Laws of Thought, which grapples with the mathematical models behind biological and artificial intelligence, and what makes the human brain so fascinating for psychologists and computer scientists to study. In this episode, he details how the mathematical principles governing the external world can also be used to explore cognitive science, or “the internal world.” This episode is brought to you by Dell, by Intel, by Cisco and by Acceldata. Additional materials: www.superdatascience.com/969 Interested in sponsoring a SuperDataScience Podcast episode? Email natal…

1 week, 3 days ago @ podtrac.com
968: Is AI Automating Away All Coding Jobs?

Now that AI agents can develop new apps from product development to delivery, do AI developers have reason to worry about their careers? Podcast host Jon Krohn addresses the stark predictions that AI could “eliminate half of all entry-level white-collar jobs” by going back to the data. Find out why the numbers show a very different picture, which in-demand occupations have increased by 40% since late 2022, and Jon’s advice on why technical professionals shouldn’t panic in this latest Five-Minute Friday. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/968⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship…

2 weeks ago @ podtrac.com
967: AI for the Physical World, with Samsara's Praveen Murugesan

VP of Engineering at Samsara Praveen Murugesan talks to Jon Krohn about processing 20 trillion data points covering 90 billion miles across private and public sectors, how the company helps truckers who operate long hours and travel for long stretches without cellphone signal, and who they’re looking to hire to help this agile startup keep on developing high-impact solutions for real-world problems. And, if you’re looking to work for the company, there’s no better time to apply; you’ll want to listen to the end of the show to hear exactly what Praveen looks for in new hires. This episode is brought to you by Dell, by Intel, by Acceldata and by the ODSC, the Open Data Science…

2 weeks, 3 days ago @ podtrac.com
966: The Moltbook Phenomenon: OpenClaw Unleashed

Jon Krohn gives Five-Minute Friday listeners all the details about the new social network causing a stir, Moltbook. What makes Moltbook so unique is that this is the first network designed just for AI agents. It’s an exclusive club, only its alleged 1.5 million registered agents can post, comment, and upvote, but we can watch this real-world experiment in agent ecology from the sidelines. Listen to the episode to hear the fascinating, if disturbing, story of Moltbook’s swift turn into facilitating a digital theocracy and forms of government, and whether this development is a sign of an approaching singularity or rather AI continuing to ape human thought and turn it into slop. Additional mat…

3 weeks ago @ podtrac.com
965: From PhD Side Project to $500M ARR: Will Falcon’s PyTorch Lightning Story

CEO of Lightning AI Will Falcon speaks to podcast host and Lightning AI fellow Jon Krohn about the company’s merger with Voltage Park, and why Will has named it the “full-stack AI neo-cloud for enterprises and frontier labs”. Lightning AI offers a secure, flexible, and collaborative environment that can run on the cloud, all essentials for early-stage startups. Listen to the episode to hear Will Falcon discuss Lightning AI Studio, founding PyTorch Lightning, and how he came to found his AI company. This episode is brought to you by Dell, by Intel, by Fabi and by Cisco. Additional materials: www.superdatascience.com/965 Interested in spons…

3 weeks, 3 days ago @ podtrac.com
SDS 964: In Case You Missed It in January 2026

In this first ICYMI episode of the year, Jon Krohn selects his favorite moments from January’s SuperDataScience interviews. Listen to why incentivizing workers is the best way to get them to disclose their use of AI tools and pave the way for an AI-forward future, how AI continues to mimic human development in its own evolution, the importance of evaluation in building AI systems, and how to keep your best employees (and also: how to know your value) with guests Sadie St. Lawrence, Ashwin Rajeeva, Sinan Ozdemir, Vijoy Pandey, and Ethan Mollick. Additional materials: www.superdatascience.com/964 Interested in sponsoring a SuperDataScience Podcast episode? Email n…

4 weeks ago @ podtrac.com
963: Reinforcement Learning for Agents, with Amazon AGI Labs’ Antje Barth

Bestselling author and Gen AI instructor Antje Barth talks to Jon Krohn about her work at Amazon’s AGI Labs and their newest product Nova Act, as well as where we will see the most success with AI agents and how AI developers can reap those rewards. This episode is brought to you by Dell, by Intel, by Fabi and by Cisco. Additional materials: www.superdatascience.com/963 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (01:23) Amazon’s latest product, Nova Act (11:05) How Nova Act tests reliability (24:01) Where Amazon’s 1000s of gen AI …

1 month ago @ podtrac.com
962: Wharton Prof Ethan Mollick on Why Your AI Strategy Is Already Obsolete

Bestselling author of Co-Intelligence: Living and Working with AI Ethan Mollick speaks to Jon Krohn about just how much US firms have to gain from a willingness to adopt and experiment with AI, as well as the reality behind AI use among employees and the frontier models set to support them even further. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/962⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.⁠⁠⁠

1 month ago @ podtrac.com
961: Distributed Artificial Superintelligence, with Dr. Vijoy Pandey

Dr. Vijoy Pandey returns to the show to talk to Jon Krohn about Cisco’s work to advance medicine and mitigate the impact of climate change with distributed artificial super-intelligence. Dr. Vijoy Pandey believes in a future where humans and AI agents work together to tackle our biggest challenges. For this to happen, we will need multi-agent systems and open-source platforms that let agents work together, avoiding the phenomenon of AI agents being “isolated geniuses” unable to collaborate. He elaborates on what Cisco is doing to close this gap. This episode is brought to you by Dell, by Intel, by Fabi and by Scaylor. Additional materials: …

1 month, 1 week ago @ podtrac.com
960: In Case You Missed It in December 2025

For 2026’s first episode of In Case You Missed It (ICYMI), Jon Krohn selects 6 clips from December for a wide-ranging look at the current state of AI in business and beyond. Hear from Joel Beasley (Episode 945), Jeff Li (Episode 947), Sandy Pentland (Episode 949), Josh Clemm (Episode 951), Penelope LaFeuille (Episode 952), and John Roese (Episode 953) on ensuring your AI systems get adopted and succeed in their goals, interesting ways to use AI for standup comedy routines, and more. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/960⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month, 1 week ago @ podtrac.com
959: Building Agents 101: Design Patterns, Evals and Optimization (with Sinan Ozdemir)

AI entrepreneur and bestselling author Sinan Ozdemir speaks to Jon Krohn about the practical differences between agentic AI and AI workflows, why evaluating accuracy on its own won’t tell you enough about AI models, and more about his latest book Building Agentic AI. This episode is brought to you by Dell, by Intel, by Fabi and by Cisco. Additional materials: www.superdatascience.com/959 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (04:57) Exploring the differences between workflows and agents (17:03) How to work out parameter count for…

1 month, 2 weeks ago @ podtrac.com
958: Without Trusted Context, Agents are Stupid (featuring Salesforce’s Rahul Auradkar)

In this #sponsored Feature Friday episode, Salesforce’s Rahul Auradkar speaks to Jon Krohn about the company’s unified data engine and how its acquisition of Informatica provides the missing context layer for AI models and agents. Hear how Salesforce’s Data 360 helps customers to get accurate and insightful information about their business, and what AI models need to benefit a company’s bottom line (hint: it’s not only large amounts of data!) Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/958⁠⁠⁠⁠⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month, 2 weeks ago @ podtrac.com
Data Science at Home
latest post 1 day, 19 hours ago
There Is No AI. There’s a Stateless Function on 10,000 GPUs Pretending to Know You (Ep. 299)

Personal newsletter: https://defragzone.substack.com
📩 Newsletter: https://datascienceathome.substack.com
🎙 Podcast: Available on Spotify, Apple Podcasts, and more.

🐦 Twitter: @DataScienceAtHome
📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/
Instagram: https://www.instagram.com/datascienceathome/
Facebook: https://www.facebook.com/datascienceAH
LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast
Discord Channel: https://discord.gg/4UNKGf3

NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, intervi…

1 day, 19 hours ago @ datascienceathome.com
Bias in the machine (edited)

The title of today’s episode is Bias in the Machine. C: Francesco, today we are starting with an infuriating discussion.

The failure of the medical community as a whole to recognise this obvious bias up to the 21st century is an example of how insidious the problem of bias is.

Three: The bias in your training sample: people put training samples together, and people have culture, experience, and prejudice.

These assumptions inform the way AI systems work—and fail—to this day.

When an algorithm is a black box and you can’t look inside, you have no way of analysing its bias.
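Even a black box can be probed from the outside through its predictions. As a minimal sketch (a generic fairness check, not something taken from this episode), the demographic-parity gap compares a model's positive-prediction rates across two groups using only its outputs:

```python
import numpy as np

def demographic_parity_gap(preds, groups):
    """Absolute difference in positive-prediction rate between group 0 and
    group 1. Needs only the model's outputs, not its internals."""
    preds, groups = np.asarray(preds), np.asarray(groups)
    rate_a = preds[groups == 0].mean()
    rate_b = preds[groups == 1].mean()
    return abs(rate_a - rate_b)

# Toy audit: group 0 approved 3/4 of the time, group 1 only 1/4.
preds  = [1, 1, 1, 0,  1, 0, 0, 0]
groups = [0, 0, 0, 0,  1, 1, 1, 1]
gap = demographic_parity_gap(preds, groups)
print(gap)  # 0.5
```

A gap near zero does not prove the model is fair (it says nothing about accuracy per group, for instance), but a large gap is a cheap, model-agnostic red flag.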

1 day, 19 hours ago @ datascienceathome.com
What is wrong with reinforcement learning? (Ep. 82)

Join the discussion on our Discord server. After reinforcement learning agents did great at playing Atari video games, AlphaGo, financial trading, and language modeling, let me tell you the real story here. In this episode I want to shine some light on reinforcement learning (RL) and the limitations that every practitioner should consider before taking certain directions.

RL seems to work so well!

What is wrong with it?

Are you a listener of Data Science at Home podcast?

Or did you subscribe to the Artificial Intelligence at your fingertips newsletter?

1 month ago @ datascienceathome.com
How to generate very large images with GANs (Ep. 76)

Join the discussion on our Discord server. In this episode I explain how a research group from the University of Lübeck tamed the curse of dimensionality for the generation of large medical images with GANs.

The problem is not as trivial as it seems.

Many researchers have failed in generating large images with GANs before.

One interesting application of such an approach is in medicine, for the generation of CT and X-ray images. Enjoy the show!

References: Multi-scale GANs for Memory-efficient Generation of High Resolution Medical Images (https://arxiv.org/abs/1907.01376)

1 month ago @ datascienceathome.com
Training neural networks faster without GPU [RB] (Ep. 77)

Join the discussion on our Discord server. Training neural networks faster usually involves powerful GPUs.

In this episode I explain an interesting method from a group of researchers from Google Brain, who can train neural networks faster by squeezing the hardware to their needs and making the training pipeline more dense.
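The core trick of the referenced paper, data echoing, is to repeat each upstream batch so the accelerator stays busy when the input pipeline is the bottleneck. A minimal sketch (the paper also studies echoing at other pipeline stages, e.g. after augmentation or shuffling; the function below is illustrative):

```python
import numpy as np

def echoed_batches(batch_iter, echo_factor=2):
    """Yield each upstream batch `echo_factor` times, so an expensive input
    pipeline produces `echo_factor`x as many training steps per read."""
    for batch in batch_iter:
        for _ in range(echo_factor):
            yield batch

# Toy usage: 3 "expensive" batches, each echoed twice -> 6 training steps.
batches = (np.full((4,), i) for i in range(3))
steps = list(echoed_batches(batches, echo_factor=2))
print(len(steps))  # 6
```

The trade-off is that echoed steps are correlated, so each one is worth slightly less than a fresh batch; the paper's point is that the wall-clock savings can still dominate.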

Enjoy the show!

References: Faster Neural Network Training with Data Echoing (https://arxiv.org/abs/1907.05550)

1 month ago @ datascienceathome.com
More powerful deep learning with transformers (Ep. 84) (Rebroadcast)

Some of the most powerful NLP models like BERT and GPT-2 have one thing in common: they all use the transformer architecture.

Such an architecture is built on top of another important concept already known to the community: self-attention. In this episode I explain what these mechanisms are, how they work, and why they are so powerful.
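The self-attention mechanism the episode covers can be sketched in a few lines of NumPy. This is a single head of scaled dot-product attention with random weights, for illustration only (real transformers add multiple heads, masking, and learned parameters):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a (seq, d_model) input."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq, seq) token-to-token affinities
    weights = softmax(scores, axis=-1)  # each row is a distribution over tokens
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 5, 8, 4
X = rng.standard_normal((seq_len, d_model))
Wq, Wk, Wv = (rng.standard_normal((d_model, d_k)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 4)
```

Each output row is a weighted mix of every token's value vector, which is why the mechanism captures long-range dependencies that recurrent models struggle with.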

Don’t forget to subscribe to our Newsletter or join the discussion on our Discord server.

References

1 month ago @ datascienceathome.com
Your Favorite AI Startup is Probably Bullshit (Ep. 298) [RB]

The brutal truth about why Silicon Valley is blowing billions on glorified autocomplete while pretending it’s the next iPhone.

We’re diving deep into the AI investment circus where VCs who can’t code are funding companies that barely understand their own technology.

From blockchain déjà vu to the “ChatGPT wrapper” economy—this episode will make you question every AI valuation you’ve ever seen.

Fair warning: We’re naming names and calling out the hype.

Don’t listen if you work at a “revolutionary AI startup” that’s just OpenAI’s API with a pretty interface.

1 month ago @ datascienceathome.com
Why AI Researchers Are Suddenly Obsessed With Whirlpools (Ep. 297) [RB]

VortexNet uses actual whirlpools to build neural networks.

By borrowing equations from fluid dynamics, this new architecture might solve deep learning’s toughest problems—from vanishing gradients to long-range dependencies.

Today we explain how vortex shedding, the Strouhal number, and turbulent flows might change everything in AI.

Sponsors: This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics and AI methods practical and accessible.

Join thousands of researchers and professionals who’ve advanced their careers with Statistical Horizons.

1 month ago @ datascienceathome.com
AGI: The Dream We Should Never Reach (Ep. 296)

Also on YouTube. Two AI experts who actually love the technology explain why chasing AGI might be the worst thing for AI’s future—and why the current hype cycle could kill the field we’re trying to save.

Head to datascienceathome.com for detailed show notes, code examples, and exclusive deep-dives into the papers we discuss.

Subscribe to our newsletter for weekly breakdowns of cutting-edge research delivered straight to your inbox—no fluff, just science!

Our Discord community is full of ML engineers, researchers, and AI enthusiasts discussing papers, sharing projects, and helping each other level up.

Whether you’re debugging your first neural net or training your tenth transformer, there’s a …

1 month ago @ datascienceathome.com
When Data Stops Being Code and Starts Being Conversation (Ep. 297)

Mark Brocato built Mockaroo—the tool that taught millions of developers how to fake data.

Now, as Head of Engineering at Tonic.ai, he’s building the AI agent that’s making his own creation obsolete.

From the hidden failures of legacy mocks to the security implications of agent-driven synthesis, Mark reveals what happens when data generation becomes a conversation—not a pipeline.

Sponsors: Tonic.ai, synthetic data solutions for software and AI development. Accelerate engineering velocity and ensure compliance with AI-powered data synthesis.

This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics…

2 months, 1 week ago @ datascienceathome.com
Your AI Strategy is Burning Money: Here’s How to Fix It (Ep. 295)

Most companies don’t have an AI problem.

In this conversation, he breaks down when AI actually makes sense, where AWS costs spiral out of control, and why your “cool demo” keeps dying before launch.

If you’re tired of AI hype and ready for straight answers, hit play.

Our Discord community is full of ML engineers, researchers, and AI enthusiasts discussing papers, sharing projects, and helping each other level up.

Whether you’re debugging your first neural net or training your tenth transformer, there’s a place for you.

3 months, 1 week ago @ datascienceathome.com
From Tokens to Vectors: The Efficiency Hack That Could Save AI (Ep. 294)

LLMs generate text painfully slow, one low-info token at a time.

Researchers just figured out how to compress 4 tokens into smart vectors & cut costs by 44%—with full code & proofs!
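To make the idea concrete with a deliberately simple stand-in: one way to merge several consecutive token embeddings into a single vector is plain mean-pooling. The paper's actual scheme is learned rather than a fixed mean, so treat this purely as a shape-level sketch of why fewer, denser vectors mean fewer autoregressive steps:

```python
import numpy as np

def compress_tokens(token_embs, k=4):
    """Merge every k consecutive token embeddings into one vector by
    mean-pooling: a (seq, d) matrix becomes (seq // k, d), i.e. k times
    fewer positions for the model to generate."""
    seq, d = token_embs.shape
    assert seq % k == 0, "sequence length must be divisible by k"
    return token_embs.reshape(seq // k, k, d).mean(axis=1)

rng = np.random.default_rng(0)
embs = rng.standard_normal((8, 16))        # 8 tokens, 16-dim embeddings
compressed = compress_tokens(embs, k=4)
print(compressed.shape)  # (2, 16)
```

Halving or quartering the number of positions directly cuts the number of forward passes during generation, which is where the claimed cost savings come from.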

Sponsors: This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics and AI methods practical and accessible.

Join thousands of researchers and professionals who’ve advanced their careers with Statistical Horizons.

Get $200 off any seminar with code DATA25 at https://statisticalhorizons.com

3 months, 3 weeks ago @ datascienceathome.com
Why AI Researchers Are Suddenly Obsessed With Whirlpools (Ep. 293)

VortexNet uses actual whirlpools to build neural networks.

By borrowing equations from fluid dynamics, this new architecture might solve deep learning’s toughest problems—from vanishing gradients to long-range dependencies.

Today we explain how vortex shedding, the Strouhal number, and turbulent flows might change everything in AI.

Sponsors: This episode is brought to you by Statistical Horizons. At Statistical Horizons, you can stay ahead with expert-led livestream seminars that make data analytics and AI methods practical and accessible.

Join thousands of researchers and professionals who’ve advanced their careers with Statistical Horizons.

4 months, 1 week ago @ datascienceathome.com
The Scientists Growing Living Computers in Swiss Labs (Ep. 292)

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

With a focus on dual-use innovation, Amethix is shaping a future where intelligent machines extend human capability, not replace it.

Discover more at https://amethix.com This episode is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

Learn more at intrepid.ai

References
Website: finalspark.com
Discord account: / discord
Newsletter: https://finalspark.com/#newsletter
Topics: Biological computing • Neural engineering • Energy-effic…

4 months, 2 weeks ago @ datascienceathome.com
When AI Hears Thunder But Misses the Fear (Ep. 291)

Sanjoy Chowdhury reveals AI’s hidden weakness: while systems can see objects and hear sounds perfectly, they can’t reason across senses like humans do.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at https://amethix.com. This episode is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

Whether it’s in the sky, on the ground, or in orbit—if it’s intelligent and mobile, Intrepid helps you build it.

4 months, 4 weeks ago @ datascienceathome.com