Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
последний пост 2 часа назад
[D] Neurips Rebuttal
[D] Neurips Rebuttal

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

2 часа назад @ reddit.com
[P] Better OSS terminal chat
[P] Better OSS terminal chat [P] Better OSS terminal chat

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

4 часа назад @ reddit.com
[D] In 2025, what is a sufficient methodology to analyze document summaries generated by LLMs? BERTScore, G-Eval, Rogue, etc
[D] In 2025, what is a sufficient methodology to analyze document summaries generated by LLMs? BERTScore, G-Eval, Rogue, etc

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

6 часов назад @ reddit.com
[D] Can LLMs Have Accurate World Models?
[D] Can LLMs Have Accurate World Models?

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

7 часов назад @ reddit.com
[R] CRINN: Free & Fast Framework for Approximate Nearest Neighbors Search
[R] CRINN: Free & Fast Framework for Approximate Nearest Neighbors Search

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

8 часов назад @ reddit.com
[D] LSTMs vs Transformers (Model Selection and Thoughts)
[D] LSTMs vs Transformers (Model Selection and Thoughts)

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

10 часов назад @ reddit.com
My Real-World Experience Using AI (Copilot) at a Major Bank — And Its Limitations [D]
My Real-World Experience Using AI (Copilot) at a Major Bank — And Its Limitations [D]

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

12 часов назад @ reddit.com
[D] Unsaturated Evals before GPT5
[D] Unsaturated Evals before GPT5

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

22 часа назад @ reddit.com
[P] Reproducing YOLOv1 From Scratch in PyTorch - Learning to Implement Object Detection from the Original Paper
[P] Reproducing YOLOv1 From Scratch in PyTorch - Learning to Implement Object Detection from the Original Paper

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

23 часа назад @ reddit.com
[D] Idea for an efficient text diffusion model with adaptive, token-level steps
[D] Idea for an efficient text diffusion model with adaptive, token-level steps

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

23 часа назад @ reddit.com
[D] Training Whisper Tiny
[D] Training Whisper Tiny

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 day, 2 hours назад @ reddit.com
[D] Have any Bayesian deep learning methods achieved SOTA performance in...anything?
[D] Have any Bayesian deep learning methods achieved SOTA performance in...anything?

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 day, 4 hours назад @ reddit.com
[D] FP4 training methods (request for paper recommendations)
[D] FP4 training methods (request for paper recommendations)

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 day, 9 hours назад @ reddit.com
[R] LLMs Have a Heart of Stone: Demystifying the Soft Thinking Ability of Large Reasoning Models
[R] LLMs Have a Heart of Stone: Demystifying the Soft Thinking Ability of Large Reasoning Models [R] LLMs Have a Heart of Stone: Demystifying the Soft Thinking Ability of Large Reasoning Models

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 day, 14 hours назад @ reddit.com
[P] A free goldmine of tutorials for the components you need to create production-level agents Extensive open source resource with tutorials for creating robust AI agents
[P] A free goldmine of tutorials for the components you need to create production-level agents Extensive open source resource with tutorials for creating robust AI agents

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 day, 15 hours назад @ reddit.com
Towards Data Science
последний пост 11 часов назад
Time Series Forecasting Made Simple (Part 3.2): A Deep Dive into LOESS-Based Smoothing
Time Series Forecasting Made Simple (Part 3.2): A Deep Dive into LOESS-Based Smoothing Time Series Forecasting Made Simple (Part 3.2): A Deep Dive into LOESS-Based Smoothing

Before understanding the math behind the LOESS, we try to understand what is actually done in the LOESS smoothing process.

Same as with Simple Linear Regression, in this smoothing process we fit a line to the data points with added weights.

Here the span of five points means we consider the points which are nearest to target point (1-8-2010) including the target point.

Once we get the LOESS smoothed trend component, it is subtracted from the original series to isolate seasonality and noise.

After this, the whole process is repeated to further refine the components, the LOESS smoothed seasonality is subtracted from the original series to find LOESS smoothed trend and this new LOESS smoothed …

11 часов назад @ towardsdatascience.com
Agentic AI: On Evaluations
Agentic AI: On Evaluations Agentic AI: On Evaluations

So, for this article I’ve done some research on common metrics for multi-turn chatbots, RAG, and agentic applications.

Part 2 digs into evaluations of different kinds of LLM applications.

LLM scorersTo run evaluations on these datasets, you can usually use accuracy and unit tests.

As for how to do the evaluations, you don’t always have to use an LLM, although in most of these cases the standard solutions do.

For this part, RAGAS is useful with its metrics: Answer Relevancy, Faithfulness, and Noise Sensitivity.

11 часов назад @ towardsdatascience.com
Finding Golden Examples: A Smarter Approach to In-Context Learning
Finding Golden Examples: A Smarter Approach to In-Context Learning Finding Golden Examples: A Smarter Approach to In-Context Learning

Instead of picking random examples, AuPair first builds a large dataset of example pairs and then applies a greedy selection algorithm to select the best-performing pairs.

Phase 1: Example Pair generationImage by AuthorThe first step is to create a large collection of candidate repair pairs.

Let’s first look into how the effectiveness of candidate repair pairs is measured.

Image by AuthorTo measure the effectiveness, we first create a validation dataset — basically a set of broken code problems.

ConclusionAuPair shows us a smarter way to do in-context learning for code repair tasks.

12 часов назад @ towardsdatascience.com
The Channel-Wise Attention | Squeeze and Excitation
The Channel-Wise Attention | Squeeze and Excitation The Channel-Wise Attention | Squeeze and Excitation

If the attention in ViT operates spatially, i.e., assigning weights to different patches of an image, the attention mechanism proposed in SENet operates in channel-wise manner, i.e., assigning weights to different channels.

On the other hand, the F_tr operation, is actually not the part of the SE module.

One of them is a table displaying accuracy score improvements when SE module is applied to existing CNN-based models.

original : torch.Size([1, 512, 28, 28]) no projection : torch.Size([1, 512, 28, 28]) after conv0-bn0-relu : torch.Size([1, 256, 28, 28]) #(1) after conv1-bn1-relu : torch.Size([1, 256, 28, 28]) after conv2-bn2 : torch.Size([1, 512, 28, 28]) #(2) after semodule : torch.Size([…

13 часов назад @ towardsdatascience.com
The MCP Security Survival Guide: Best Practices, Pitfalls, and Real-World Lessons
The MCP Security Survival Guide: Best Practices, Pitfalls, and Real-World Lessons The MCP Security Survival Guide: Best Practices, Pitfalls, and Real-World Lessons

First, there was the GitHub MCP vulnerability—a flaw that let attackers exploit open-source MCP servers and siphon off user data.

And yes, standards bodies like IETF or IEEE might eventually step in with the “Agent Security Best Practices RFC 9001.”Until then, we’re stitching it together ourselves.

Continuous Community InvolvementFinally, the future of MCP security will be heavily influenced by community involvement.

Model Context Protocol (MCP) Security Best Practices.

Community discussions on MCP agent security.

1 day, 4 hours назад @ towardsdatascience.com
How I Won the “Mostly AI” Synthetic Data Challenge
How I Won the “Mostly AI” Synthetic Data Challenge How I Won the “Mostly AI” Synthetic Data Challenge

I in the Mostly AI Prize and won both the FLAT and SEQUENTIAL data challenges.

The competition was split into two independent challenges:FLAT Data Challenge: Generate 100,000 records with 80 columns.

SEQUENTIAL Data Challenge: Generate 20,000 sequences (groups) of records.

For the FLAT data challenge, the raw synthetic data from the model scored around 0.96, but after post-processing, the score jumped to 0.992.

I used a modified version of this approach for the SEQUENTIAL data challenge, which yielded a similar improvement.

1 day, 5 hours назад @ towardsdatascience.com
The Machine, the Expert, and the Common Folks
The Machine, the Expert, and the Common Folks The Machine, the Expert, and the Common Folks

Finally, these predictions were compared against the experts’ predictions.

It is not hard however to think of examples, where the complexity and nuanced view of the expert will be superior to a simple machine.

In Meehl’s example, a broken leg was a metaphor for something unforeseen by the machine, but understood by the human.

Letting the common folks try to predict when the sun burns out might yield alarmingly variable predictions, compared to those of astrophysicists.

“Pushing the limits for judgmental consistency: comparing random weighting schemes with expert judgments.” Personnel Assessment and Decisions 6, no.

1 day, 11 hours назад @ towardsdatascience.com
InfiniBand vs RoCEv2: Choosing the Right Network for Large-Scale AI
InfiniBand vs RoCEv2: Choosing the Right Network for Large-Scale AI InfiniBand vs RoCEv2: Choosing the Right Network for Large-Scale AI

However, In large-scale training environments, overall performance is not limited by processing speed, but by the speed of the network communication between them.

The traditional approach of routing GPU data through the CPU created a severe bottleneck at scale.

The data goes from GPU memory to system memory first, then the receiving device grabs it from there.

To make RoCEv2 work well, the underlying Ethernet network needs to be lossless or close to it.

Easier to deploy – Since it uses familiar IP-based networking, it’s easier for teams already managing Ethernet data centers to adopt.

1 day, 11 hours назад @ towardsdatascience.com
Context Engineering — A Comprehensive Hands-On Tutorial with DSPy
Context Engineering — A Comprehensive Hands-On Tutorial with DSPy Context Engineering — A Comprehensive Hands-On Tutorial with DSPy

Context Engineering by now.

What is Context Engineering?

Sequential ProcessingLet’s remind ourselves about one of the key components of Context Engineering.

MCPs are perfect for context engineering specific examples since you can declare system prompt formats, resources, restricted database access, etc, to your application.

To access the full GitHub repo, visit:https://github.com/avbiswas/context-engineering-dspyReferencesAuthor’s YouTube channel: https://www.youtube.com/@avb_fjAuthor’s Patreon: www.patreon.com/NeuralBreakdownwithAVBAuthor’s Twitter (X) account: https://x.com/neural_avbFull Context Engineering video course: https://youtu.be/5Bym0ffALaUGithub Link: https://github.com/avbiswa…

2 days, 2 hours назад @ towardsdatascience.com
Things I Wish I Had Known Before Starting ML
Things I Wish I Had Known Before Starting ML Things I Wish I Had Known Before Starting ML

It turns out, space, silence, and sea have a way of bringing up a list of things that I wish I had known before starting ML.

Guardrails allow you to apply a filter to all the things that you see: yes, no, yes, yes, no.

Research Code Is Just That: Research CodeWriting ML algorithms is an essential part of machine learning work.

There’s production code—the kind used in apps, services, and end-user systems—and then there’s research code.

So yes, read broadly.

2 days, 6 hours назад @ towardsdatascience.com
How a Research Lab Made Entirely of LLM Agents Developed Molecules That Can Block a Virus
How a Research Lab Made Entirely of LLM Agents Developed Molecules That Can Block a Virus How a Research Lab Made Entirely of LLM Agents Developed Molecules That Can Block a Virus

The core idea at the heart of this work was indeed to simulate an interdisciplinary lab staffed by AI agents.

These agents are led by a Principal Investigator (PI) Agent and monitored by a Scientific Critic Agent, both virtual agents.

Imagine then the full pipeline as applied to a research lab:A human PI defines a high-level biological goal.

Future House is a non-profit building AI agents to automate research in biology and other complex sciences.

Such kinds of automatized laboratory systems coupled to emerging systems like Virtual Lab could radically transform how science and technology progress.

2 days, 7 hours назад @ towardsdatascience.com
Stellar Flare Detection and Prediction Using Clustering and Machine Learning
Stellar Flare Detection and Prediction Using Clustering and Machine Learning Stellar Flare Detection and Prediction Using Clustering and Machine Learning

Firstly, stellar flares don’t usually occur at consistent time intervals, which makes them hard to predict [1].

For my flare detection model, I used DBSCAN (Density-Based Spatial Clustering of Applications with Noise).

With a good flare detection model in place, I then built a flare prediction model employing XGBoost, with the flare labels generated by DBSCAN serving as the response.

This study combines unsupervised clustering with supervised learning to present a robust, generalizable and computationally efficient pipeline for stellar flare detection and prediction, one that can be adapted to different families of stars.

Looking ahead, improving the model’s flare labeling accuracy and vali…

2 days, 7 hours назад @ towardsdatascience.com
Exploratory Data Analysis: Gamma Spectroscopy in Python (Part 3)
Exploratory Data Analysis: Gamma Spectroscopy in Python (Part 3) Exploratory Data Analysis: Gamma Spectroscopy in Python (Part 3)

In the first part, I did an exploratory data analysis of the gamma spectroscopy data.

It is based on XGBoost, and I trained the model using different radioactive samples.

Spectrum data can be exported using the official Radiacode Android app or retrieved directly from a device using a radiacode Python library.

As a reminder, first I created a Streamlit app and published it on the Streamlit Community Cloud platform.

Spectra in XML format can be used to test a Streamlit app.

2 days, 12 hours назад @ towardsdatascience.com
Mechanistic View of Transformers: Patterns, Messages, Residual Stream… and LSTMs
Mechanistic View of Transformers: Patterns, Messages, Residual Stream… and LSTMs Mechanistic View of Transformers: Patterns, Messages, Residual Stream… and LSTMs

my previous article, I talked about how mechanistic interpretability reimagines attention in a transformers to be additive without any concatenation.

To ground ourselves: the attention mechanism in transformers relies on a series of matrix multiplications involving the Query (Q), Key (K), Value (V), and an output projection matrix (O).

Unpacking the steps of attention:Multiply embedding matrix E with W q to get the query vector Q.

Similarly get key vector K and value vector V by multiplying E with W k and W v Multiply with Q and KT.

By reframing attention as patterns (QKᵀ) and messages (VO), and reformulating residual connections as a persistent residual stream, mechanistic interpretation o…

2 days, 14 hours назад @ towardsdatascience.com
From Data Scientist IC to Manager: One Year In
From Data Scientist IC to Manager: One Year In From Data Scientist IC to Manager: One Year In

In this article, I will reflect on my first year and share what I believe are the three pillars of being an effective frontline data team manager: prioritization, empowerment, and recognition.

However, data teams these days are often overwhelmed with requests.

Understand business priorityInstead of trying to understand the specifics of every single request, I’ve learned it’s better to start with the big picture.

At the end of the day, all departments are evaluated based on their contribution to the business growth, and the data team is no exception.

In that sense, a good data team manager should be a generalist who knows a bit of everything.

3 days, 11 hours назад @ towardsdatascience.com
Distill.pub Distill.pub
последний пост None
The Gradient The Gradient
последний пост 2 months назад
AGI Is Not Multimodal
AGI Is Not Multimodal AGI Is Not Multimodal

Despite this, scale maximalists have implicitly suggested that multimodal models can be a structure-agnostic framework for AGI.

While structure-agnostic scale maximalism has succeeded in producing LLMs and LVMs that pass Turing tests, a multimodal scale maximalist approach to AGI will not bear similar fruit.

CitationFor attribution in academic contexts or books, please cite this work asBenjamin A. Spiegel, "AGI Is Not Multimodal", The Gradient, 2025.

@article{spiegel2025agi, author = {Benjamin A. Spiegel}, title = {AGI Is Not Multimodal}, journal = {The Gradient}, year = {2025}, howpublished = {\url{https://thegradient.pub/agi-is-not-multimodal}, }ReferencesAndreas, Jacob.

“Language Models,…

2 months назад @ thegradient.pub
Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research
Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research

Mathematics and statistics, once the primary guides of machine learning research, now struggle to provide immediate insight into the latest breakthroughs.

This shift has prompted speculation about mathematics’ diminished role in machine learning research moving forward.

It is also the way that symmetries are usually leveraged when performing computations (for example, in machine learning).

One can reasonably argue that diagrammatic descriptions of well-known constructions, like products, are not useful for the machine learning researcher.

However, as we’ve demonstrated, while mathematics may not maintain the same role in machine learning research that it has held in the past, the success of…

8 months, 3 weeks назад @ thegradient.pub
TheSequence TheSequence
последний пост 19 часов назад
TheSequence Opinion #699: 2030 or Bust? The Compute Surge and the Bottlenecks Ahead
TheSequence Opinion #699: 2030 or Bust? The Compute Surge and the Bottlenecks Ahead TheSequence Opinion #699: 2030 or Bust? The Compute Surge and the Bottlenecks Ahead

The path to achieve AGI remains highly debatable.

Would it be as a byproduct of the scaling laws or would we need algorithmic breakthroughs?

By following the scaling laws, we might have a chance to achieve AGI by 2030.

Many experts suggest that if this exponential growth in compute continues at its current rate, we might achieve AGI by around 2030.

Compute Growth Trends and Projections

19 часов назад @ thesequence.substack.com
The Sequence AI of the Week #698: How E2B Powers Safe AI Sandboxes
The Sequence AI of the Week #698: How E2B Powers Safe AI Sandboxes The Sequence AI of the Week #698: How E2B Powers Safe AI Sandboxes

Created Using GPT-4oE2B (Execute to Build) is an open-source sandboxing platform engineered to run AI-generated code in isolated, lightweight virtual machines at cloud scale.

By blending microVM technology with modern orchestration layers and developer-friendly SDKs, E2B addresses the dual challenges of security and performance that arise when executing untrusted or dynamically generated code.

Whether you’re evaluating large language model outputs, orchestrating multi-agent pipelines, or conducting large-scale model evaluations, E2B provides an infrastructure that spins up in milliseconds, enforces strict resource controls, and tears down cleanly—freeing AI practitioners to focus on innovat…

1 day, 20 hours назад @ thesequence.substack.com
The Sequence Knowledge #697: The Most Important Theory in Modern AI Interpretability
The Sequence Knowledge #697: The Most Important Theory in Modern AI Interpretability The Sequence Knowledge #697: The Most Important Theory in Modern AI Interpretability

The paper that outlined the ideas of superposition in frontier models.

2 days, 19 hours назад @ thesequence.substack.com
The Sequence Radar #696: Google AI Ultra’s New Gold-Medal Reasoning Model is Available
The Sequence Radar #696: Google AI Ultra’s New Gold-Medal Reasoning Model is Available The Sequence Radar #696: Google AI Ultra’s New Gold-Medal Reasoning Model is Available

Subscribe Now to Not Miss Anything📝 Editorial: Google AI Ultra’s New Gold-Medal Reasoning Model is AvailableGoogle just made its math olympiad gold medalist model available!

At the heart of Deep Think lies the convergence of sparse mixture-of-experts (MoE) and multi-agent prompting.

Google plans a phased API rollout of Deep Think, offering both “with‑tools” and “without‑tools” variants alongside usage quotas and cost‑management guidelines.

Gemini 2.5 Deep Think heralds a paradigm shift: AI systems that autonomously explore and evaluate multiple solution paths rather than trailing monolithic heuristics.

🤖 AI Tech ReleasesGLM 4.5Chinese AI lab a.zi launched GLM 4.5, a powerful foundation mode…

4 days, 19 hours назад @ thesequence.substack.com
The Sequence AI of the Week #695: Hybrid Minds: Qwen3’s Leap into Efficient Reasoning and Agentic Coding
The Sequence AI of the Week #695: Hybrid Minds: Qwen3’s Leap into Efficient Reasoning and Agentic Coding The Sequence AI of the Week #695: Hybrid Minds: Qwen3’s Leap into Efficient Reasoning and Agentic Coding

Created Using GPT-4oLast week marked a significant milestone for Alibaba Cloud’s large language model (LLM) portfolio with the simultaneous unveiling of two flagship Qwen3 variants: the Qwen3‑235B‑A22B mixture‑of‑experts (MoE) model and the Qwen3‑Coder agentic coding specialist.

Although both models share a common architectural heritage within the Qwen3 family, they address distinct application domains—general-purpose reasoning and conversational AI versus autonomous software engineering.

Over the following sections, we will unpack the historical lineage of the Qwen series, dissect the technical underpinnings of each new model, highlight their unique contributions, and explore their practic…

1 week назад @ thesequence.substack.com
The Sequence Opinion #694: From Proof Engines to Polymaths: How AI Conquered the International Math Olympiad
The Sequence Opinion #694: From Proof Engines to Polymaths: How AI Conquered the International Math Olympiad The Sequence Opinion #694: From Proof Engines to Polymaths: How AI Conquered the International Math Olympiad

Image Credit: Google DeepMindA few days ago, the International Mathematical Olympiad (IMO)—the world's most prestigious math competition for high school students—saw a historic moment: AI systems from Google DeepMind and OpenAI achieved gold-medal level performance.

Both models successfully solved five out of six problems under official contest conditions.

This landmark marks a dramatic leap in AI reasoning capabilities, moving beyond previous benchmarks into creative problem-solving.

These successes build on earlier specialized systems like AlphaProof and AlphaGeometry, culminating in today’s general-purpose models like Gemini and OpenAI's O1 series.

Specialized Models: From AlphaProof to …

1 week, 1 day назад @ thesequence.substack.com
The Sequence Radar #693: A New Series About Interpretability in Foundation Models
The Sequence Radar #693: A New Series About Interpretability in Foundation Models The Sequence Radar #693: A New Series About Interpretability in Foundation Models

Created Using GPT-4oToday we will Discuss:An intro to our series about AI interpretability in foundation models.

💡 AI Concept of the Day: A New Series About Interpretability in Foundation ModelsToday, we start a new series about one of hottest trends in AI: interpretability in frontier models.

Frontier models—neural networks with trillions of parameters trained on vast, diverse datasets—have redefined the limits of AI performance.

Bridging this gap between unparalleled capabilities and human understanding has become imperative for advancing AI safety, accountability, and trust.

Mechanistic Interpretability

1 week, 2 days назад @ thesequence.substack.com
The Sequence Radar #692: Qwen Unleashed: This Week’s Breakthrough AI Models
The Sequence Radar #692: Qwen Unleashed: This Week’s Breakthrough AI Models The Sequence Radar #692: Qwen Unleashed: This Week’s Breakthrough AI Models

With these releases, Alibaba has demonstrated a holistic vision: advanced open-source AI that scales across use cases, from code generation to translation, all while lowering resource constraints.

As enterprises explore the Qwen models’ capabilities, this week’s updates signal a pivotal step toward more powerful, efficient, and accessible AI solutions for tomorrow’s challenges.

It systematically analyzes agent behavior through both tool call accuracy and LLM-based judgment, outperforming static benchmarks and revealing nuanced performance gaps between proprietary and open-source models.

AI Lab: Anthropic Alignment Science & Interpretability teamsSummary:Anthropic introduces three autonomous…

1 week, 4 days назад @ thesequence.substack.com
The Sequence Opinion #691: The Thought Police: Should We Monitor AI’s Inner Dialogue?
The Sequence Opinion #691: The Thought Police: Should We Monitor AI’s Inner Dialogue? The Sequence Opinion #691: The Thought Police: Should We Monitor AI’s Inner Dialogue?

Created Using GPT-4oSunday we discussed a new paper about ch mains of thought (CoTs) ed monitoring created by top AI labs like OpenAI, Anthropic, DeepMind and others.

I’ve read the paper multiple times and wanted to share some of my thoughts.

In Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety, the AI labs propose that the reasoning steps LLMs output—known as chains of thought (CoTs)—offer a practical channel for spotting and stopping misaligned or malicious behavior in advanced AI systems before it occurs.

The paper, endorsed by leaders like Geoffrey Hinton and Ilya Sutskever, frames CoT monitoring as a complement to existing safety techniques.

However, the auth…

2 weeks назад @ thesequence.substack.com
The Sequence AI of the WeeK #690: Team Memories & Multi‑Agent Minds: Inside Reflection AI’s Asymov
The Sequence AI of the WeeK #690: Team Memories & Multi‑Agent Minds: Inside Reflection AI’s Asymov The Sequence AI of the WeeK #690: Team Memories & Multi‑Agent Minds: Inside Reflection AI’s Asymov

Created Using GTP-4oAs mentioned before, we typically get one pick for the AI of the Week section.

Today, that goes to the first model/agent produced by one of the most exciting new labs in frontier AI.

Reflection AI has positioned itself at the forefront of autonomous software engineering, bridging the gap between natural language understanding and code-centric reasoning.

This essay elucidates the evolution of Reflection AI, dissects Asymov’s architecture, and highlights its technical differentiators, offering an in-depth analysis for practitioners and researchers alike.

Image Credit: Reflection AIOrigins and Evolution of Reflection AI

2 weeks, 1 day назад @ thesequence.substack.com
The Sequence Knowledge #689: A Summary of Our Series About AI Evaluation
The Sequence Knowledge #689: A Summary of Our Series About AI Evaluation The Sequence Knowledge #689: A Summary of Our Series About AI Evaluation

The Sequence Knowledge: An Intro to Bechmarking: An intro to our series about AI benchmarking and evaluation.

The Sequence Knowledge : Types of AI Evals: Explores the different types of AI benchmarks such as math, coding, reasoning and others.

10.The Sequence Knowledge: Agentic Benchmarks: Examines benchmarks for evaluating AI agents as decision-making entities capable of planning, tool use, and dynamic interaction.

11.The Sequence Knowledge: AGI Benchmarks: Focuses on benchmarks that test adaptability, abstraction, and problem-solving to assess general intelligence.

14.The Sequence Knowledge: Creativity Benchmarks: Surveys benchmarks assessing creativity in writing, coding, emotional engag…

2 weeks, 2 days назад @ thesequence.substack.com
The Sequence Radar #688: The Transparent Transformer: Monitoring AI Reasoning Before It Goes Rogue
The Sequence Radar #688: The Transparent Transformer: Monitoring AI Reasoning Before It Goes Rogue The Sequence Radar #688: The Transparent Transformer: Monitoring AI Reasoning Before It Goes Rogue

You can subscribe to The Sequence below:📝 Editorial: The Transparent Transformer: Monitoring AI Reasoning Before It Goes RogueIt’s a good thing when all the AI labs come together!

A new research paper, "Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety," is making waves in the AI community.

Modern large language models (LLMs) often externalize part of their reasoning as they generate responses.

Techniques such as causal interventions, readability scoring with other LLMs, and adversarial stress tests could become standard tools in the AI safety toolbox.

It shows that we currently have a rare and potentially powerful way to peek inside the decision-making process of…

2 weeks, 4 days назад @ thesequence.substack.com
The Sequence Opinion #686: The Gemini Effect: Transforming Robotics with Multimodal Foundation Models
The Sequence Opinion #686: The Gemini Effect: Transforming Robotics with Multimodal Foundation Models The Sequence Opinion #686: The Gemini Effect: Transforming Robotics with Multimodal Foundation Models

The emergence of generalist, transformer models in robotics.

The recent release of Gemini Robotics made me realize that robotics might be about to undergo a major shift from the AI perspective.

The centerpiece of this shift is Gemini Robotics, a family of advanced transformer-based models from Google DeepMind.

This essay explores the evolution from narrow, specialized robotics AI to transformer-driven generalist robots, using Gemini Robotics as the guiding example.

From Specialized AI to Generalist Robots

3 weeks назад @ thesequence.substack.com
The Sequence Weekly Alpha #686: Kimi K2 is a Trillion Parameter Open Source Model You Must Know About
The Sequence Weekly Alpha #686: Kimi K2 is a Trillion Parameter Open Source Model You Must Know About The Sequence Weekly Alpha #686: Kimi K2 is a Trillion Parameter Open Source Model You Must Know About

We get to pick one, ocassionally two developments per week.

To kick things off, I want to spotlight a remarkably impressive release from China that deserves your attention.

Kimi K2 is just a MASSIVE model and one that could mark a new milestone in open-source language modeling.

The model combines a trillion parameters with a sparse activation scheme that boosts efficiency.

This essay explains K2’s motivation, its sparsely routed Mixture‑of‑Experts (MoE) design, and its performance across coding, reasoning, and agentic benchmarks.

3 weeks, 1 day назад @ thesequence.substack.com
The Sequence Knowledge #685: About LMArena-Type Evals, Do They Work or Don't
The Sequence Knowledge #685: About LMArena-Type Evals, Do They Work or Don't The Sequence Knowledge #685: About LMArena-Type Evals, Do They Work or Don't

Created Using GPT-4oToday we will Discuss:An overview of Arena type evals.

What began as a research project at UC Berkeley has evolved into a high-profile startup, now valued in the hundreds of millions.

At its core, LMArena seeks to offer a standardized, transparent, and scalable framework for benchmarking large language models (LLMs).

As the capabilities of AI systems accelerate and their deployments grow more diverse, LMArena addresses a critical gap by enabling rigorous, side-by-side model comparisons through a public, interactive platform.

Its mission is as ambitious as it is necessary: to democratize AI benchmarking and establish trusted norms for evaluating LLMs.

3 weeks, 2 days назад @ thesequence.substack.com
Synced Review
последний пост 3 months, 3 weeks назад
DeepSeek Signals Next-Gen R2 Model, Unveils Novel Approach to Scaling Inference with SPCT
DeepSeek Signals Next-Gen R2 Model, Unveils Novel Approach to Scaling Inference with SPCT DeepSeek Signals Next-Gen R2 Model, Unveils Novel Approach to Scaling Inference with SPCT

DeepSeek AI, a prominent player in the large language model arena, has recently published a research paper detailing a new technique aimed…Continue reading on SyncedReview »

3 months, 3 weeks назад @ medium.com
Automating Artificial Life Discovery: The Power of Foundation Models
Automating Artificial Life Discovery: The Power of Foundation Models Automating Artificial Life Discovery: The Power of Foundation Models

The recent Nobel Prize for groundbreaking advancements in protein discovery underscores the transformative potential of foundation models…Continue reading on SyncedReview »

7 months, 1 week назад @ medium.com
Llama 3 Meets MoE: Pioneering Low-Cost High-Performance AI
Llama 3 Meets MoE: Pioneering Low-Cost High-Performance AI Llama 3 Meets MoE: Pioneering Low-Cost High-Performance AI

Continue reading on SyncedReview »

7 months, 1 week назад @ medium.com
DeepMind’s JetFormer: Unified Multimodal Models Without Modelling Constraints
DeepMind’s JetFormer: Unified Multimodal Models Without Modelling Constraints DeepMind’s JetFormer: Unified Multimodal Models Without Modelling Constraints

Recent advancements in training large multimodal models have been driven by efforts to eliminate modeling constraints and unify…Continue reading on SyncedReview »

7 months, 2 weeks назад @ medium.com
NVIDIA’s nGPT: Revolutionizing Transformers with Hypersphere Representation
NVIDIA’s nGPT: Revolutionizing Transformers with Hypersphere Representation NVIDIA’s nGPT: Revolutionizing Transformers with Hypersphere Representation

The Transformer architecture, introduced by Vaswani et al. in 2017, serves as the backbone of contemporary language models. Over the years…Continue reading on SyncedReview »

7 months, 2 weeks назад @ medium.com
From Token to Conceptual: Meta Introduces Large Concept Models in Multilingual AI
From Token to Conceptual: Meta Introduces Large Concept Models in Multilingual AI From Token to Conceptual: Meta Introduces Large Concept Models in Multilingual AI

Large Language Models (LLMs) have become indispensable tools for diverse natural language processing (NLP) tasks. Traditional LLMs operate…Continue reading on SyncedReview »

7 months, 3 weeks назад @ medium.com
NVIDIA’s Hybrid: Combining Attention and State Space Models for Breakthrough Performance of Small…
NVIDIA’s Hybrid: Combining Attention and State Space Models for Breakthrough Performance of Small… NVIDIA’s Hybrid: Combining Attention and State Space Models for Breakthrough Performance of Small…

Language models (LMs) based on transformers have become the gold standard in natural language processing, thanks to their exceptional…Continue reading on SyncedReview »

7 months, 3 weeks назад @ medium.com
From Response to Query: The Power of Reverse Thinking in Language Models
From Response to Query: The Power of Reverse Thinking in Language Models From Response to Query: The Power of Reverse Thinking in Language Models

Continue reading on SyncedReview »

7 months, 4 weeks назад @ medium.com
Yann LeCun Team’s New Research: Revolutionizing Visual Navigation with Navigation World Models
Yann LeCun Team’s New Research: Revolutionizing Visual Navigation with Navigation World Models Yann LeCun Team’s New Research: Revolutionizing Visual Navigation with Navigation World Models

Navigation is a fundamental skill for any visually-capable organism, serving as a critical tool for survival. It enables agents to locate…Continue reading on SyncedReview »

8 months назад @ medium.com
The Future of Vision AI: How Apple’s AIMV2 Leverages Images and Text to Lead the Pack
The Future of Vision AI: How Apple’s AIMV2 Leverages Images and Text to Lead the Pack The Future of Vision AI: How Apple’s AIMV2 Leverages Images and Text to Lead the Pack

The landscape of vision model pre-training has undergone significant evolution, especially with the rise of Large Language Models (LLMs)…Continue reading on SyncedReview »

8 months назад @ medium.com
Redefining Music AI: The Power of Sony’s SoniDo as a Versatile Foundation Model
Redefining Music AI: The Power of Sony’s SoniDo as a Versatile Foundation Model Redefining Music AI: The Power of Sony’s SoniDo as a Versatile Foundation Model

A foundation model refers to a pre-trained model developed on extensive datasets, designed to be versatile and adaptable for a range of…Continue reading on SyncedReview »

8 months назад @ medium.com
DeepMind’s Socratic Learning with Language Games: The Path to Self-Improving Superintelligence
DeepMind’s Socratic Learning with Language Games: The Path to Self-Improving Superintelligence DeepMind’s Socratic Learning with Language Games: The Path to Self-Improving Superintelligence

Continue reading on SyncedReview »

8 months, 1 week назад @ medium.com
Revolutionizing AI on a Budget: Apple’s Roadmap for Small Language Models Training Success
Revolutionizing AI on a Budget: Apple’s Roadmap for Small Language Models Training Success Revolutionizing AI on a Budget: Apple’s Roadmap for Small Language Models Training Success

While large language models (LLMs) dominate the AI landscape, Small-scale Large Language Models (SLMs) are gaining traction as…Continue reading on SyncedReview »

8 months, 1 week назад @ medium.com
Redefines Consistency Models”: OpenAI’s TrigFlow Narrows FID Gap to 10% with Efficient Two-Step…
Redefines Consistency Models”: OpenAI’s TrigFlow Narrows FID Gap to 10% with Efficient Two-Step… Redefines Consistency Models”: OpenAI’s TrigFlow Narrows FID Gap to 10% with Efficient Two-Step…

Consistency models (CMs) are a cutting-edge class of diffusion-based generative models designed for rapid and efficient sampling. However…Continue reading on SyncedReview »

8 months, 2 weeks назад @ medium.com
Precision in Pixels: NVIDIA’s Edify Image Model Combines High Quality with Unmatched Control
Precision in Pixels: NVIDIA’s Edify Image Model Combines High Quality with Unmatched Control Precision in Pixels: NVIDIA’s Edify Image Model Combines High Quality with Unmatched Control

The field of text-to-image synthesis has advanced rapidly, with state-of-the-art models now generating highly realistic and diverse images…Continue reading on SyncedReview »

8 months, 2 weeks назад @ medium.com
📓 Cool Blogs
ODS.ai Habr ODS.ai Habr
последний пост 1 week, 6 days назад
DRAGON: динамический бенчмарк для оценки RAG-систем на русском языке
DRAGON: динамический бенчмарк для оценки RAG-систем на русском языке DRAGON: динамический бенчмарк для оценки RAG-систем на русском языке

Ответ: Кэисукэ ТибаSPARQL-запрос SimpleSELECT DISTINCT ?s ?r ?o WHERE { { SELECT ?s ?r ?o WHERE { ?s ?r ?o . }

GROUP BY ?s ?r HAVING(count(?o) = 1) } { SELECT ?s ?r ?o WHERE { ?s ?r ?o . }

Ответ: Национальная система платежных карт (НСПК) Центр биометрических технологий (ЦБТ) ЕБСSELECT ?s ?r ?o ?len WHERE { { SELECT ?s ?r (COUNT(?o1) as ?len) (GROUP_CONCAT(DISTINCT(STR(?o1));separator="|") AS ?o) WHERE { ?s ?r ?o1 . }

FILTER(?o != ?o1) } GROUP BY ?o ?o1 ?r ?r1 HAVING(COUNT(?s) = 1) } UNION { SELECT ?s ?r ?o ?r1 ?s1 WHERE { ?s ?r ?o .

FILTER(?o != ?o1) } GROUP BY ?o ?o1 ?r ?r1 HAVING(COUNT(?s) = 1) } UNION { SELECT ?s ?r ?o ?r1 ?s1 WHERE { ?s ?r ?o .

1 week, 6 days назад @ habr.com
RKNN Toolkit2: конвертация моделей и симуляция NPU Rockchip
RKNN Toolkit2: конвертация моделей и симуляция NPU Rockchip RKNN Toolkit2: конвертация моделей и симуляция NPU Rockchip

В этой статье я хочу поделиться своим опытом по конвертации нейросети в формат rknn с помощью библиотеки rknn-toolkit2.

Вот как выглядят веса pytorch модели в Netron:веса pytorch модели в NetronВажно!

Конвертация onnx модели в rknnДалее создается объект RKNN , который управляет процессом конвертации и инференса модели на платформе Rockchip.

На этом этапе происходит подготавка модели к конвертации в формат RKNN и последующему запуску на NPU Rockchip.

Создание и экспорт rknn моделиНа этом этапе происходит конвертация ONNX-модели во внутренний формат RKNN, оптимизация графа и подготовка к запуску на NPU Rockchip.

2 weeks, 6 days назад @ habr.com
MERA Code: всесторонняя оценка генерации кода в прикладных сценариях
MERA Code: всесторонняя оценка генерации кода в прикладных сценариях MERA Code: всесторонняя оценка генерации кода в прикладных сценариях

🔗MERA Code🔗GitHub с кодом и данными🔗Коллекция на Hugging Face🔗Статья на arxiv🔗Репозиторий проекта на GitVerseЧто такое MERA Code?

Современные кодовые языковые модели и модели общего назначения (ChatGPT, Claude, Qwen, YandexGPT, GigaChat и др.)

Список текущих задач MERA Code и их характеристикКаталог задач MERA Code и их подробное описание представлено на сайте.

В MERA Code промпты строго подобраны под задачу и корректный выбор ответа.

В заключениеMERA Code — это попытка закрыть важный пробел в тестировании LLM: насколько они действительно полезны в реальной, локализованной разработке.

2 weeks, 6 days назад @ habr.com
Байесовская собака: анализ пёсьего компаса
Байесовская собака: анализ пёсьего компаса Байесовская собака: анализ пёсьего компаса

", подумал я. И, к счастью, у меня как раз под рукой оказался идеальный подопытный.

Стандартное арифметическое среднее между 360° и 0° даст нам 180°, несмотря на то, что и 360°, и 0° указывают в одном направлении.

Нулевая гипотеза утверждает, что данные распределены равномерно по кругу, альтернативная — что это не так.

from pingouin import circ_vtest v, pval = circ_vtest(data['radians'], dir=np.pi) print(f"V-statistics: {v:.3f}; p-value: {pval:.6f}")>> V-statistics: 24.127; p-value: 0.002904Вот мы и подобрались к чему-то интересному!

Априорное распределение и функция правдоподобияПредположим, что у нас есть:Априорное распределение с параметрамиФункция правдоподобия для нового наблюдения с п…

4 months, 1 week назад @ habr.com
Создаем воспоминания. Осваиваем FLUX, LoRA и ComfyUI
Создаем воспоминания. Осваиваем FLUX, LoRA и ComfyUI Создаем воспоминания. Осваиваем FLUX, LoRA и ComfyUI

Такие модели можно обучать с нуля и это дорого, нужен кластер с GPU (видеокарты) и много данных.

В домене текст-картинка бывают открытые модели, типа Stable Diffusion, Kandinsky и FLUX, бывают закрытые, типа DALL-E.Открытую модель можно дообучать разными способами.

Борис СтругацкийОсобенности: Для личностей типа Стругацких или Бродского, качественных фотографий крайне мало, но много и не надо.

Можно и фразу.

Владимир СурдинАлексей СемихатовВидео с их лекциями можно найти повсеместно, начать можно с канала Вселенная плюс на YouTube и в телеграм.

7 months назад @ habr.com
Как нейросети, RL и байесовскую оптимизацию стали использовать на ускорителях заряженных частиц
Как нейросети, RL и байесовскую оптимизацию стали использовать на ускорителях заряженных частиц Как нейросети, RL и байесовскую оптимизацию стали использовать на ускорителях заряженных частиц

Один из них — поддержание стабильной орбиты пучка частиц (траектории, по которой происходит движение), которая критически важна для точности экспериментов.

Вот, кстати, наша статья про то, как сейсмические вибрации будут влиять на орбиту пучка в СКИФ: Beam Stability .

Классические подходы к стабилизации орбиты пучка в ускорителяхДля стабилизации орбиты используют датчики положения пучка (BPM) и магнитные корректоры.

По словам авторов получились следующие преимущества:Ускорение процесса коррекции орбиты и повышение точности по сравнению с классическими методами, такими как SVD.

Задача агента:Автоматическое восстановление орбиты пучка заряженных частиц за ограниченное время и минимальное коли…

7 months, 2 weeks назад @ habr.com
Machine Learning Mastery
последний пост 2 days, 18 hours назад
A Gentle Introduction to Q-Learning
A Gentle Introduction to Q-Learning A Gentle Introduction to Q-Learning

Reinforcement learning is a relatively lesser-known area of artificial intelligence (AI) compared to highly popular subfields today, such as machine learning, deep learning, and natural language processing.

2 days, 18 hours назад @ machinelearningmastery.com
Building a Decoder-Only Transformer Model for Text Generation
Building a Decoder-Only Transformer Model for Text Generation Building a Decoder-Only Transformer Model for Text Generation

This post is divided into five parts; they are: • From a Full Transformer to a Decoder-Only Model • Building a Decoder-Only Model • Data Preparation for Self-Supervised Learning • Training the Model • Extensions The transformer model originated as a sequence-to-sequence (seq2seq) model that converts an input sequence into a context vector, which is then used to generate a new sequence.

3 days, 14 hours назад @ machinelearningmastery.com
Building a Transformer Model for Language Translation
Building a Transformer Model for Language Translation Building a Transformer Model for Language Translation

This post is divided into six parts; they are: • Why Transformer is Better than Seq2Seq • Data Preparation and Tokenization • Design of a Transformer Model • Building the Transformer Model • Causal Mask and Padding Mask • Training and Evaluation Traditional seq2seq models with recurrent neural networks have two main limitations: • Sequential processing prevents parallelization • Limited ability to capture long-term dependencies since hidden states are overwritten whenever an element is processed The Transformer architecture, introduced in the 2017 paper "Attention is All You Need", overcomes these limitations.

6 days, 3 hours назад @ machinelearningmastery.com
How to Diagnose Why Your Regression Model Fails
How to Diagnose Why Your Regression Model Fails How to Diagnose Why Your Regression Model Fails

In regression models , failure occurs when the model produces inaccurate predictions — that is, when error metrics like MAE or RMSE are high — or when the model, once deployed, fails to generalize well to new data that differs from the examples it was trained or tested on.

1 week назад @ machinelearningmastery.com
Implementing Advanced Feature Scaling Techniques in Python Step-by-Step
Implementing Advanced Feature Scaling Techniques in Python Step-by-Step Implementing Advanced Feature Scaling Techniques in Python Step-by-Step

In this article, you will learn: • Why standard scaling methods are sometimes insufficient and when to use advanced techniques.

1 week, 1 day назад @ machinelearningmastery.com
Your First Containerized Machine Learning Deployment with Docker and FastAPI
Your First Containerized Machine Learning Deployment with Docker and FastAPI Your First Containerized Machine Learning Deployment with Docker and FastAPI

Deploying machine learning models can seem complex, but modern tools can streamline the process.

1 week, 2 days назад @ machinelearningmastery.com
Building a Seq2Seq Model with Attention for Language Translation
Building a Seq2Seq Model with Attention for Language Translation Building a Seq2Seq Model with Attention for Language Translation

This post is divided into four parts; they are: • Why Attnetion Matters: Limitations of Basic Seq2Seq Models • Implementing Seq2Seq Model with Attention • Training and Evaluating the Model • Using the Model Traditional seq2seq models use an encoder-decoder architecture where the encoder compresses the input sequence into a single context vector, which the decoder then uses to generate the output sequence.

1 week, 3 days назад @ machinelearningmastery.com
Beyond Pandas: 7 Advanced Data Manipulation Techniques for Large Datasets
Beyond Pandas: 7 Advanced Data Manipulation Techniques for Large Datasets Beyond Pandas: 7 Advanced Data Manipulation Techniques for Large Datasets

If you've worked with data in Python, chances are you've used Pandas many times.

1 week, 3 days назад @ machinelearningmastery.com
Image Augmentation Techniques to Boost Your CV Model Performance
Image Augmentation Techniques to Boost Your CV Model Performance Image Augmentation Techniques to Boost Your CV Model Performance

In this article, you will learn: • the purpose and benefits of image augmentation techniques in computer vision for improving model generalization and diversity.

1 week, 6 days назад @ machinelearningmastery.com
10 Critical Mistakes that Silently Ruin Machine Learning Projects
10 Critical Mistakes that Silently Ruin Machine Learning Projects 10 Critical Mistakes that Silently Ruin Machine Learning Projects

Machine learning projects can be as exciting as they are challenging.

2 weeks, 1 day назад @ machinelearningmastery.com
10 NumPy One-Liners to Simplify Feature Engineering
10 NumPy One-Liners to Simplify Feature Engineering 10 NumPy One-Liners to Simplify Feature Engineering

When building machine learning models, most developers focus on model architectures and hyperparameter tuning.

1 month назад @ machinelearningmastery.com
Your First OpenAI API Project in Python Step-By-Step
Your First OpenAI API Project in Python Step-By-Step Your First OpenAI API Project in Python Step-By-Step

In a

1 month назад @ machinelearningmastery.com
Securing FastAPI Endpoints for MLOps: An Authentication Guide
Securing FastAPI Endpoints for MLOps: An Authentication Guide Securing FastAPI Endpoints for MLOps: An Authentication Guide

In today's AI world, data scientists are not just focused on training and optimizing machine learning models.

1 month назад @ machinelearningmastery.com
Skip Connections in Transformer Models
Skip Connections in Transformer Models Skip Connections in Transformer Models

This post is divided into three parts; they are: • Why Skip Connections are Needed in Transformers • Implementation of Skip Connections in Transformer Models • Pre-norm vs Post-norm Transformer Architectures Transformer models, like other deep learning models, stack many layers on top of each other.

1 month назад @ machinelearningmastery.com
5 Advanced RAG Architectures Beyond Traditional Methods
5 Advanced RAG Architectures Beyond Traditional Methods 5 Advanced RAG Architectures Beyond Traditional Methods

Retrieval-augmented generation (RAG) has shaken up the world of language models by combining the best of two worlds:

1 month назад @ machinelearningmastery.com
ML in Production
последний пост None
Sorta Insightful Sorta Insightful
последний пост 2 weeks, 3 days назад
Brony Musicians Seize The Means of Production: My Eyewitness Account to BABSCon 2025
Brony Musicians Seize The Means of Production: My Eyewitness Account to BABSCon 2025 Brony Musicians Seize The Means of Production: My Eyewitness Account to BABSCon 2025

A music concert in the evenings, typically set up as a rave with EDM or rock music made by brony musicians.

She has been involved in organizing pony music concerts for over a decade, for both BABSCon and other pony conventions.

Thank you, BABSCon ChairsThe brony musicians immediately jump into an emergency Discord call with Pinkaboo, to get her side of the story.

Other conventions start tweeting in support of the brony musicians, with no one taking BABSCon’s side.

It’s hard for me to explain why I like MLP fan music, because brony music really isn’t accessible.

2 weeks, 3 days назад @ alexirpan.com
Who is AI For?
Who is AI For? Who is AI For?

I think the easy answer to this question is that right now, AI is for the AI developers.

Code is useful, it makes money, it is a testbed for AI speeding up the development of AI, and it is easy.

I’m working in AI because it pays well and is potentially really good for the world.

The artists did not know what AI was, but when they learned, they quickly decided they did not want it.

It feels like the most likely outcome is that people go all-in on pushing raw intelligence, in the way that AI developers can measure it, leaving behind those that are not like AI developers.

4 months, 1 week назад @ alexirpan.com
MIT Mystery Hunt 2025
MIT Mystery Hunt 2025 MIT Mystery Hunt 2025

This has spoilers for MIT Mystery Hunt 2025.

I enjoyed it more than their 2018 Hunt, which is commonly cited as an all-time good Mystery Hunt.

In this Mystery Hunt it was reversed, where the act of unlocking is easy but the value and difficulty of a feeder varied.

In my free time pre-Hunt, I went to Puzzled Pint, where I tried to all-brain a logic puzzle (solve it without writing anything).

I’m looking forward to solving “No Assembly Required” in Mystery Hunt 2026, a puzzle that gives you the answer for no work.

6 months, 1 week назад @ alexirpan.com
Using AI to Get the Neopets Destruct-o-Match Avatar
Using AI to Get the Neopets Destruct-o-Match Avatar Using AI to Get the Neopets Destruct-o-Match Avatar

If AI can be superhuman at Go, surely AI can be slightly-worse-than-experts at Destruct-o-Match if we try?

Step 0: Is Making a Destruct-o-Match AI Against Neopets Rules?

I believe the precedent is in favor of a Destruct-o-Match AI being okay.

As long as I’m the one inputting moves the Destruct-o-Match AI recommends, I should be okay.

To write a game AI, we first need to implement the rules of the game in code.

7 months назад @ alexirpan.com
Late Takes on OpenAI o1
Late Takes on OpenAI o1 Late Takes on OpenAI o1

I realize how late this is, but I didn’t get a post out while o1 was fresh, and still feel like writing one despite it being cold.

(Also, OpenAI just announced they’re going to ship new stuff starting tomorrow so it’s now or never to say something.)

OpenAI o1 is a model release widely believed (but not confirmed) to be a post-trained version of GPT-4o.

If true, that makes this video especially useful for understanding OpenAI o1.

Which I suppose is part of why I’m talking about o1 rather than building o1.

8 months назад @ alexirpan.com
Lil'Log
последний пост None
The Spectator
последний пост None
Off the Convex Path
последний пост None
fast.ai NLP fast.ai NLP
последний пост None
Sebastian Ruder
последний пост None
Andrew Karpathy blog
последний пост None
大トロ 大トロ
последний пост None
🔬 Science
Papers With Code Papers With Code
последний пост 2 weeks, 2 days назад
/henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy
/henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy /henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy

We present SpatialTrackerV2, a feed-forward 3D point tracking method for monocular videos.

Going beyond modular pipelines built on off-the-shelf components for 3D tracking, our approach unifies the intrinsic connections between point tracking, monocular depth, and camera pose estimation into a high-performing and feedforward 3D point tracker.

It decomposes world-space 3D motion into scene geometry, camera ego-motion, and pixel-wise object motion, with a fully differentiable and end-to-end architecture, allowing scalable training across a wide range of datasets, including synthetic sequences, posed RGB-D videos, and unlabeled in-the-wild footage.

By learning geometry and motion jointly from …

2 weeks, 2 days назад @ paperswithcode.com
/antof27/ Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation
/antof27/ Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation /antof27/ Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation

Calisthenics skill classification is the computer vision task of inferring the skill performed by an athlete from images, enabling automatic performance assessment and personalized analytics.

Traditional methods for calisthenics skill recognition are based on pose estimation methods to determine the position of skeletal data from images, which is later fed to a classification algorithm to infer the performed skill.

This work proposes a direct approach to calisthenics skill recognition, which leverages depth estimation and athlete patch retrieval to avoid the computationally expensive human pose estimation module.

Using Depth Anything V2 for depth estimation and YOLOv10 for athlete localizat…

2 weeks, 2 days назад @ paperswithcode.com
/snowflakedb/ Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI
/snowflakedb/ Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI /snowflakedb/ Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI

Inference is now the dominant AI workload, yet existing systems force trade-offs between latency, throughput, and cost.

Arctic Inference, an open-source vLLM plugin from Snowflake AI Research, introduces Shift Parallelism, a dynamic parallelism strategy that adapts to real-world traffic while integrating speculative decoding, SwiftKV compute reduction, and optimized embedding inference.

It achieves up to 3.4 times faster request completion, 1.75 times faster generation, and 1.6M tokens/sec per GPU for embeddings, outperforming both latency- and throughput-optimized deployments.

Already powering Snowflake Cortex AI, Arctic Inference delivers state-of-the-art, cost-effective inference for ent…

2 weeks, 2 days назад @ paperswithcode.com
/NVIDIA/ FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale
/NVIDIA/ FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale /NVIDIA/ FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale

FourCastNet 3 advances global weather modeling by implementing a scalable, geometric machine learning (ML) approach to probabilistic ensemble forecasting.

The approach is designed to respect spherical geometry and to accurately model the spatially correlated probabilistic nature of the problem, resulting in stable spectra and realistic dynamics across multiple scales.

FourCastNet 3 delivers forecasting accuracy that surpasses leading conventional ensemble models and rivals the best diffusion-based methods, while producing forecasts 8 to 60 times faster than these approaches.

In contrast to other ML approaches, FourCastNet 3 demonstrates excellent probabilistic calibration and retains realis…

2 weeks, 2 days назад @ paperswithcode.com
/jingyanw/ Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work?
/jingyanw/ Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work? /jingyanw/ Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work?

We study A/B experiments that are designed to compare the performance of two recommendation algorithms.

The bias arising from this type of data sharing is known as "symbiosis bias".

In this paper, we highlight that, for decision-making purposes, the sign of the GTE often matters more than its precise magnitude when selecting the better algorithm.

We formalize this insight under a multi-armed bandit framework and theoretically characterize when the sign of the expected GTE estimate under data sharing aligns with or contradicts the sign of the true GTE.

Our analysis identifies the level of exploration versus exploitation as a key determinant of how symbiosis bias impacts algorithm selection.

2 weeks, 2 days назад @ paperswithcode.com
/qqq-yi/ DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression
/qqq-yi/ DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression /qqq-yi/ DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression

Task-agnostic prompt compression leverages the redundancy in natural language to reduce computational overhead and enhance information density within prompts, especially in long-context scenarios.

Existing methods predominantly rely on information entropy as the metric to compress lexical units, aiming to achieve minimal information loss.

However, these approaches overlook two critical aspects: (i) the importance of attention-critical tokens at the algorithmic level, and (ii) shifts in information entropy during the compression process.

Motivated by these challenges, we propose a dynamic attention-aware approach for task-agnostic prompt compression (DAC).

This approach effectively integrate…

2 weeks, 2 days назад @ paperswithcode.com
/lukasellinger/ Simplifications are Absolutists: How Simplified Language Reduces Word Sense Awareness in LLM-Generated Definitions
/lukasellinger/ Simplifications are Absolutists: How Simplified Language Reduces Word Sense Awareness in LLM-Generated Definitions /lukasellinger/ Simplifications are Absolutists: How Simplified Language Reduces Word Sense Awareness in LLM-Generated Definitions

Large Language Models (LLMs) can provide accurate word definitions and explanations for any context.

However, the scope of the definition changes for different target groups, like children or language learners.

We investigate how simplification impacts homonym definition quality across three target groups: Normal, Simple, and ELI5.

Our results show that simplification drastically degrades definition completeness by neglecting polysemy, increasing the risk of misunderstanding.

Fine-tuning Llama 3.1 8B with Direct Preference Optimization substantially improves homonym response quality across all prompt types.

2 weeks, 2 days назад @ paperswithcode.com
/pspdada/ Mitigating Object Hallucinations via Sentence-Level Early Intervention
/pspdada/ Mitigating Object Hallucinations via Sentence-Level Early Intervention /pspdada/ Mitigating Object Hallucinations via Sentence-Level Early Intervention

Multimodal large language models (MLLMs) have revolutionized cross-modal understanding but continue to struggle with hallucinations - fabricated content contradicting visual inputs.

Existing hallucination mitigation methods either incur prohibitive computational costs or introduce distribution mismatches between training data and model outputs.

We identify a critical insight: hallucinations predominantly emerge at the early stages of text generation and propagate through subsequent outputs.

To address this, we propose **SENTINEL** (**S**entence-level **E**arly i**N**tervention **T**hrough **IN**-domain pr**E**ference **L**earning), a framework that eliminates dependency on human annotations…

2 weeks, 2 days назад @ paperswithcode.com
/owos/ FLEXITOKENS: Flexible Tokenization for Evolving Language Models
/owos/ FLEXITOKENS: Flexible Tokenization for Evolving Language Models /owos/ FLEXITOKENS: Flexible Tokenization for Evolving Language Models

Language models (LMs) are challenging to adapt to new data distributions by simple finetuning.

This is due to the rigidity of their subword tokenizers, which typically remain unchanged during adaptation.

This inflexibility often leads to inefficient tokenization, causing overfragmentation of out-of-distribution domains, unseen languages, or scripts.

In this work, we develop byte-level LMs with learnable tokenizers to make tokenization adaptive.

Our models include a submodule that learns to predict boundaries between the input byte sequence, encoding it into variable-length segments.

2 weeks, 2 days назад @ paperswithcode.com
/wojiufukele/ Graph-Structured Data Analysis of Component Failure in Autonomous Cargo Ships Based on Feature Fusion
/wojiufukele/ Graph-Structured Data Analysis of Component Failure in Autonomous Cargo Ships Based on Feature Fusion /wojiufukele/ Graph-Structured Data Analysis of Component Failure in Autonomous Cargo Ships Based on Feature Fusion

To address the challenges posed by cascading reactions caused by component failures in autonomous cargo ships (ACS) and the uncertainties in emergency decision-making, this paper proposes a novel hybrid feature fusion framework for constructing a graph-structured dataset of failure modes.

A hierarchical feature fusion framework is constructed, using Word2Vec encoding to encode subsystem/component features, BERT-KPCA to process failure modes/reasons, and Sentence-BERT to quantify the semantic association between failure impact and emergency decision-making.

The dataset covers 12 systems, 1,262 failure modes, and 6,150 propagation paths.

In the label prediction results, the Shore-based Meteor…

2 weeks, 2 days назад @ paperswithcode.com
/YF-W/ Tri-Learn Graph Fusion Network for Attributed Graph Clustering
/YF-W/ Tri-Learn Graph Fusion Network for Attributed Graph Clustering /YF-W/ Tri-Learn Graph Fusion Network for Attributed Graph Clustering

In recent years, models based on Graph Convolutional Networks (GCN) have made significant strides in the field of graph data analysis.

Although the Graph Transformer architecture has mitigated some of these issues, its performance is still limited when processing heterogeneous graph data.

To address these challenges, this study proposes a novel deep clustering framework that comprising GCN, Autoencoder (AE), and Graph Transformer, termed the Tri-Learn Graph Fusion Network (Tri-GFN).

The tri-learning mechanism allows mutual learning among these modules, while the feature fusion strategy enables the model to capture complex relationships, yielding highly discriminative representations for gra…

2 weeks, 2 days назад @ paperswithcode.com
/mr-ravin/ APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation
/mr-ravin/ APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation /mr-ravin/ APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation

We propose the APTx Neuron, a novel, unified neural computation unit that integrates non-linear activation and linear transformation into a single trainable expression.

The APTx Neuron is derived from the APTx activation function, thereby eliminating the need for separate activation layers and making the architecture both computationally efficient and elegant.

The proposed neuron follows the functional form $y = \sum_{i=1}^{n} ((\alpha_i + \tanh(\beta_i x_i)) \cdot \gamma_i x_i) + \delta$, where all parameters $\alpha_i$, $\beta_i$, $\gamma_i$, and $\delta$ are trainable.

We validate our APTx Neuron-based architecture on the MNIST dataset, achieving up to 96.69\% test accuracy in just 20 ep…

2 weeks, 2 days назад @ paperswithcode.com
/Rec4Fun/ A Reproducibility Study of Product-side Fairness in Bundle Recommendation
/Rec4Fun/ A Reproducibility Study of Product-side Fairness in Bundle Recommendation /Rec4Fun/ A Reproducibility Study of Product-side Fairness in Bundle Recommendation

While this problem has been widely studied in traditional recommendation settings, its implications for bundle recommendation (BR) remain largely unexplored.

Existing fairness frameworks and metrics designed for traditional recommender systems may not directly translate to this multi-layered setting.

In this paper, we conduct a comprehensive reproducibility study of product-side fairness in BR across three real-world datasets using four state-of-the-art BR methods.

We analyze exposure disparities at both the bundle and item levels using multiple fairness metrics, uncovering important patterns.

Overall, our findings offer actionable insights for building fairer bundle recommender systems and…

2 weeks, 2 days назад @ paperswithcode.com
/cbobed/ OntView: What you See is What you Meant
/cbobed/ OntView: What you See is What you Meant /cbobed/ OntView: What you See is What you Meant

However, the lack of tools that provide effective visualization is still a significant challenge.

In this paper, we present OntView, an ontology viewer that is designed to provide users with an intuitive visual representation of ontology concepts and their formal definitions through a user-friendly interface.

Building on the use of a DL reasoner, OntView follows a "What you see is what you meant" paradigm, showing the actual inferred knowledge.

One key aspect for this is its ability to visualize General Concept Inclusions (GCI), a feature absent in existing visualization tools.

OntView has been released with an open-source license for the whole community.

2 weeks, 2 days назад @ paperswithcode.com
/Rec4Fun/ RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction
/Rec4Fun/ RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction /Rec4Fun/ RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction

These approaches fail to capture elaborate relations hidden in real-world bundle structures, resulting in suboptimal bundle representations.

To overcome this limitation, we propose RaMen, a novel method that provides a holistic multi-strategy approach for bundle construction.

RaMen utilizes both intrinsic (characteristics) and extrinsic (collaborative signals) information to model bundle structures through Explicit Strategy-aware Learning (ESL) and Implicit Strategy-aware Learning (ISL).

Integrating diverse strategies enables RaMen to learn more comprehensive and robust bundle representations.

Meanwhile, Multi-strategy Alignment & Discrimination module is employed to facilitate knowledge tr…

2 weeks, 2 days назад @ paperswithcode.com
Papers With Code Papers With Code
последний пост 2 weeks, 2 days назад
/PrimisAI/ Adaptive Multi-Agent Reasoning via Automated Workflow Generation
/PrimisAI/ Adaptive Multi-Agent Reasoning via Automated Workflow Generation /PrimisAI/ Adaptive Multi-Agent Reasoning via Automated Workflow Generation

The rise of Large Reasoning Models (LRMs) promises a significant leap forward in language model capabilities, aiming to tackle increasingly sophisticated tasks with unprecedented efficiency and accuracy.

However, despite their impressive performance, recent studies have highlighted how current reasoning models frequently fail to generalize to novel, unseen problems, often resorting to memorized solutions rather than genuine inferential reasoning.

In this paper, we introduce Nexus Architect, an enhanced iteration of our multi-agent system framework, Nexus, equipped with a novel automated workflow synthesis mechanism.

Given a user's prompt and a small set of representative examples, the Archi…

2 weeks, 2 days назад @ paperswithcode.com
/sharanya02/ Real Time Captioning of Sign Language Gestures in Video Meetings
/sharanya02/ Real Time Captioning of Sign Language Gestures in Video Meetings /sharanya02/ Real Time Captioning of Sign Language Gestures in Video Meetings

One of the most tested ways to establish such a communication is through the use of sign based languages.

However, not many people are aware of the smaller intricacies involved with sign language.

Sign language recognition using computer vision aims at eliminating the communication barrier between deaf-mute and ordinary people so that they can properly communicate with others.

In recent studies, it has been found that people with hearing disabilities prefer to sign over typing during these video calls.

In this paper, we are proposing a browser extension that will automatically translate sign language to subtitles for everyone else in the video call.

2 weeks, 2 days назад @ paperswithcode.com
/alessiopittiglio/ Leveraging Context for Multimodal Fallacy Classification in Political Debates
/alessiopittiglio/ Leveraging Context for Multimodal Fallacy Classification in Political Debates /alessiopittiglio/ Leveraging Context for Multimodal Fallacy Classification in Political Debates

In this paper, we present our submission to the MM-ArgFallacy2025 shared task, which aims to advance research in multimodal argument mining, focusing on logical fallacies in political debates.

Our approach uses pretrained Transformer-based models and proposes several ways to leverage context.

In the fallacy classification subtask, our models achieved macro F1-scores of 0.4444 (text), 0.3559 (audio), and 0.4403 (multimodal).

Our multimodal model showed performance comparable to the text-only model, suggesting potential for improvements.

PDFAbstract

2 weeks, 2 days назад @ paperswithcode.com
/RS2002/ One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms
/RS2002/ One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms /RS2002/ One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms

On-demand ride-sharing platforms face the fundamental challenge of dynamically bundling passengers with diverse origins and destinations and matching them with vehicles in real time, all under significant uncertainty.

However, conventional MARL-based ride-sharing approaches heavily rely on the accurate estimation of Q-values or V-values, which becomes problematic in large-scale, highly uncertain environments.

To address these challenges, we propose two novel alternative methods that bypass value function estimation.

First, we adapt GRPO to ride-sharing, replacing the PPO baseline with the group average reward to eliminate critic estimation errors and reduce training bias.

Second, inspired b…

2 weeks, 2 days назад @ paperswithcode.com
/LiXinran6/ Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation
/LiXinran6/ Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation /LiXinran6/ Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation

Include the markdown at the top of your GitHub README.md file to showcase the performance of the model.

Badges are live and will be dynamically updated with the latest ranking of this paper.

2 weeks, 2 days назад @ paperswithcode.com
/ShimSoonYong/ ZClassifier: Temperature Tuning and Manifold Approximation via KL Divergence on Logit Space
/ShimSoonYong/ ZClassifier: Temperature Tuning and Manifold Approximation via KL Divergence on Logit Space

We introduce a novel classification framework, ZClassifier, that replaces conventional deterministic logits with diagonal Gaussian-distributed logits. Code: https://github.com/ShimSoonYong/ZClassifier

2 weeks, 6 days назад @ paperswithcode.com
/briziorusso/ On Gradual Semantics for Assumption-Based Argumentation
/briziorusso/ On Gradual Semantics for Assumption-Based Argumentation

In this paper, we fill this gap and propose a family of novel gradual semantics for equipping assumptions, which are the core components in ABA frameworks, with dialectical strengths. Code: https://github.com/briziorusso/GradualABA

2 weeks, 6 days назад @ paperswithcode.com
/wumingqi/ Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination
/wumingqi/ Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination

Cloudflare is unable to establish an SSL connection to the origin server.

If you're a visitor of this website:Please try again in a few minutes.

If you're the owner of this website:It appears that the SSL configuration used is not compatible with Cloudflare.

This could happen for a several reasons, including no shared cipher suites.

Additional troubleshooting information here.

2 weeks, 6 days назад @ paperswithcode.com
/IsaacYQH/ WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling
/IsaacYQH/ WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling

Despite rapid progress in end-to-end AI music generation, AI-driven modeling of professional Digital Signal Processing (DSP) workflows remains challenging. Code: https://github.com/IsaacYQH/WildFX

2 weeks, 6 days назад @ paperswithcode.com
/summer1278/ Addressing Data Imbalance in Transformer-Based Multi-Label Emotion Detection with Weighted Loss
/summer1278/ Addressing Data Imbalance in Transformer-Based Multi-Label Emotion Detection with Weighted Loss

This paper explores the application of a simple weighted loss function to Transformer-based models for multi-label emotion detection in SemEval-2025 Shared Task 11. Code: https://github.com/summer1278/semeval2025-task11

2 weeks, 6 days назад @ paperswithcode.com
/gabrielkmbo/ Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs
/gabrielkmbo/ Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs

We present Step-wise Policy for Rare-tool Knowledge (SPaRK), a novel reinforcement learning framework that teaches large language models to explore diverse tool usage patterns beyond conventional high-temperature sampling. Code: https://github.com/gabrielkmbo/explore-rl

2 weeks, 6 days назад @ paperswithcode.com
/Cavendish518/ Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation
/Cavendish518/ Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation

Service robots are increasingly deployed in diverse and dynamic environments, where both physical layouts and social contexts change over time and across locations. Code: https://github.com/Cavendish518/LE-Nav

2 weeks, 6 days назад @ paperswithcode.com
/MatteoFasulo/ AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles
/MatteoFasulo/ AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles

Cloudflare is unable to establish an SSL connection to the origin server.

If you're a visitor of this website:Please try again in a few minutes.

If you're the owner of this website:It appears that the SSL configuration used is not compatible with Cloudflare.

This could happen for a several reasons, including no shared cipher suites.

Additional troubleshooting information here.

2 weeks, 6 days назад @ paperswithcode.com
/VCA-EPFL/ SystolicAttention: Fusing FlashAttention within a Single Systolic Array
/VCA-EPFL/ SystolicAttention: Fusing FlashAttention within a Single Systolic Array

The frequent data swaps between the systolic array and external vector units result in low systolic array utilization. Code: https://github.com/VCA-EPFL/FSA

2 weeks, 6 days назад @ paperswithcode.com
/Buddhi19/ Precision Spatio-Temporal Feature Fusion for Robust Remote Sensing Change Detection
/Buddhi19/ Precision Spatio-Temporal Feature Fusion for Robust Remote Sensing Change Detection

Cloudflare is unable to establish an SSL connection to the origin server.

If you're a visitor of this website:Please try again in a few minutes.

If you're the owner of this website:It appears that the SSL configuration used is not compatible with Cloudflare.

This could happen for a several reasons, including no shared cipher suites.

Additional troubleshooting information here.

2 weeks, 6 days назад @ paperswithcode.com
Papers With Code Papers With Code
последний пост 2 weeks, 2 days назад
/fudanvi/ Beyond Task-Specific Reasoning: A Unified Conditional Generative Framework for Abstract Visual Reasoning
/fudanvi/ Beyond Task-Specific Reasoning: A Unified Conditional Generative Framework for Abstract Visual Reasoning

Cloudflare is unable to establish an SSL connection to the origin server.

If you're a visitor of this website:Please try again in a few minutes.

If you're the owner of this website:It appears that the SSL configuration used is not compatible with Cloudflare.

This could happen for a several reasons, including no shared cipher suites.

Additional troubleshooting information here.

2 weeks, 6 days назад @ paperswithcode.com
/benedekrozemberczki/ PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training
/benedekrozemberczki/ PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training

Spatiotemporal graph neural networks (ST-GNNs) are powerful tools for modeling spatial and temporal data dependencies. Code: https://github.com/benedekrozemberczki/pytorch_geometric_temporal

2 weeks, 6 days назад @ paperswithcode.com
/chengxuphd/ DCR: Quantifying Data Contamination in LLMs Evaluation
/chengxuphd/ DCR: Quantifying Data Contamination in LLMs Evaluation

Cloudflare is unable to establish an SSL connection to the origin server.

If you're a visitor of this website:Please try again in a few minutes.

If you're the owner of this website:It appears that the SSL configuration used is not compatible with Cloudflare.

This could happen for a several reasons, including no shared cipher suites.

Additional troubleshooting information here.

2 weeks, 6 days назад @ paperswithcode.com
/gitter-lab/ Assay2Mol: large language model-based drug design using BioAssay context
/gitter-lab/ Assay2Mol: large language model-based drug design using BioAssay context

Scientific databases aggregate vast amounts of quantitative data alongside descriptive text. Code: https://github.com/gitter-lab/Assay2Mol

2 weeks, 6 days назад @ paperswithcode.com
/hayatkhan8660-maker/ DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition
/hayatkhan8660-maker/ DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition

We employ forward Kullback-Leibler (KL) divergence alongside spatio-temporal focal modulation to effectively transfer both local and global context from the Video-FocalNet Base (teacher) to the proposed VFL-Net (student). Code: https://github.com/hayatkhan8660-maker/DVFL-Net

2 weeks, 6 days назад @ paperswithcode.com
/JudyJuezhuLong/ Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows
/JudyJuezhuLong/ Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows

Cloudflare is unable to establish an SSL connection to the origin server.

If you're a visitor of this website:Please try again in a few minutes.

If you're the owner of this website:It appears that the SSL configuration used is not compatible with Cloudflare.

This could happen for a several reasons, including no shared cipher suites.

Additional troubleshooting information here.

2 weeks, 6 days назад @ paperswithcode.com
/joaojcorreia/ A Fuzzy Approach to Project Success: Measuring What Matters
/joaojcorreia/ A Fuzzy Approach to Project Success: Measuring What Matters

This paper introduces a novel approach to project success evaluation by integrating fuzzy logic into an existing construct. Code: https://github.com/joaojcorreia/FuzzyLogic_ProjectSuccess

2 weeks, 6 days назад @ paperswithcode.com
/kunkunlin1221/ InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing
/kunkunlin1221/ InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing

Extensive experiments demonstrate the effectiveness of InstructFLIP by outperforming SOTA models in accuracy and substantially reducing training redundancy across diverse domains in FAS. Code: https://github.com/kunkunlin1221/InstructFLIP

2 weeks, 6 days назад @ paperswithcode.com
/Linvyl/ Describe Anything Model for Visual Question Answering on Text-rich Images
/Linvyl/ Describe Anything Model for Visual Question Answering on Text-rich Images

Recent progress has been made in region-aware vision-language modeling, particularly with the emergence of the Describe Anything Model (DAM). Code: https://github.com/Linvyl/DAM-QA

2 weeks, 6 days назад @ paperswithcode.com
/abhijeet3922/ Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker
/abhijeet3922/ Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker

We propose multi-step custom implementation utilizing widely adopted hybrid search (metadata & embedding) and state of the art late interaction re-ranker to retrieve best matching pages. Code: https://github.com/abhijeet3922/vision-RAG

2 weeks, 6 days назад @ paperswithcode.com
/ziangcao0312/ PhysX: Physical-Grounded 3D Asset Generation
/ziangcao0312/ PhysX: Physical-Grounded 3D Asset Generation

3D modeling is moving from virtual to physical. Code: https://github.com/ziangcao0312/PhysX

2 weeks, 6 days назад @ paperswithcode.com
/henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy
/henry123-boy/ SpatialTrackerV2: 3D Point Tracking Made Easy

We present SpatialTrackerV2, a feed-forward 3D point tracking method for monocular videos. Code: https://github.com/henry123-boy/SpaTrackerV2

2 weeks, 6 days назад @ paperswithcode.com
/cncs-fit/ Emergence of Functionally Differentiated Structures via Mutual Information Optimization in Recurrent Neural Networks
/cncs-fit/ Emergence of Functionally Differentiated Structures via Mutual Information Optimization in Recurrent Neural Networks

Analysis of network performance, correlation patterns, and weight matrices reveals that mutual information minimization yields high task performance alongside clear functional modularity and moderate structural modularity. Code: https://github.com/cncs-fit/mio_rnn

2 weeks, 6 days назад @ paperswithcode.com
/coswindywang/ Making Language Model a Hierarchical Classifier and Generator
/coswindywang/ Making Language Model a Hierarchical Classifier and Generator

Language heads of the last layer are copied to different selected intermediate layers, and fine-tuned with different task inputs. Code: https://github.com/coswindywang/HdLM

2 weeks, 6 days назад @ paperswithcode.com
/ahmedehabb/ From Roots to Rewards: Dynamic Tree Reasoning with RL
/ahmedehabb/ From Roots to Rewards: Dynamic Tree Reasoning with RL

Modern language models address complex questions through chain-of-thought (CoT) reasoning (Wei et al., 2023) and retrieval augmentation (Lewis et al., 2021), yet struggle with error propagation and knowledge integration. Code: https://github.com/ahmedehabb/From-Roots-to-Rewards-Dynamic-Tree-Reasoning-with-RL

2 weeks, 6 days назад @ paperswithcode.com
💼 University and corporation labs
DeepMind DeepMind
последний пост 15 часов назад
How AI is helping advance the science of bioacoustics to save endangered species
How AI is helping advance the science of bioacoustics to save endangered species How AI is helping advance the science of bioacoustics to save endangered species

Our new Perch model helps conservationists analyze audio faster to protect endangered species, from Hawaiian honeycreepers to coral reefs.

These recordings can tell us a lot about the animals present in a given area, along with other clues about the health of that ecosystem.

Today, we are releasing an update to Perch, our AI model designed to help conservationists analyze bioacoustic data.

This new model has better state-of-the-art off-the-shelf bird species predictions than the previous model.

It can disentangle complex acoustic scenes over thousands or even millions of hours of audio data.

15 часов назад @ deepmind.google
Genie 3: A new frontier for world models
Genie 3: A new frontier for world models Genie 3: A new frontier for world models

World models are also a key stepping stone on the path to AGI, since they make it possible to train AI agents in an unlimited curriculum of rich simulation environments.

Last year we introduced the first foundation world models with Genie 1 and Genie 2, which could generate new environments for agents.

We have also continued to push the state of the art in video generation with our models Veo 2 and Veo 3, which exhibit a deep understanding of intuitive physics.

Each of these models marks progress along different capabilities of world simulation.

Genie 3 is our first world model to allow interaction in real-time, while also improving consistency and realism compared to Genie 2.

2 days, 16 hours назад @ deepmind.google
Rethinking how we measure AI intelligence
Rethinking how we measure AI intelligence Rethinking how we measure AI intelligence

Current AI benchmarks are struggling to keep pace with modern models.

As models reach closer to 100% on certain benchmarks, they also become less effective at revealing meaningful performance differences.

We continue to invest in new and more challenging benchmarks, but on the path to general intelligence, we need to continue to look for new ways to evaluate.

While we continue to evolve and pursue current AI benchmarks, we’re also consistently looking to test new approaches to evaluating models.

That’s why today, we're introducing the Kaggle Game Arena: a new, public AI benchmarking platform where AI models compete head-to-head in strategic games, providing a verifiable, and dynamic measure…

3 days, 14 hours назад @ blog.google
Try Deep Think in the Gemini app
Try Deep Think in the Gemini app Try Deep Think in the Gemini app

Today, we’re making Deep Think available in the Gemini app to Google AI Ultra subscribers – the latest in a lineup of extremely capable AI tools and features made exclusively available to them.

This new release incorporates feedback from early trusted testers and research breakthroughs.

It’s a significant improvement over what was first announced at I/O, as measured in terms of key benchmark improvements and trusted tester feedback.

It is a variation of the model that recently achieved the gold-medal standard at this year’s International Mathematical Olympiad (IMO).

Deep Think could be a powerful tool in creative problem solving:

6 days, 19 hours назад @ blog.google
AlphaEarth Foundations helps map our planet in unprecedented detail
AlphaEarth Foundations helps map our planet in unprecedented detail AlphaEarth Foundations helps map our planet in unprecedented detail

How AlphaEarth Foundations works AlphaEarth Foundations provides a powerful new lens for understanding our planet by solving two major challenges: data overload and inconsistent information.

To ensure AlphaEarth Foundations was ready for real-world use, we rigorously tested its performance.

When compared against both traditional methods and other AI mapping systems, AlphaEarth Foundations was consistently the most accurate.

On average, AlphaEarth Foundations had a 24% lower error rate than the models we tested, demonstrating its superior learning efficiency.

Empowering others with AI AlphaEarth Foundations represents a significant step forward in understanding the state and dynamics of our …

1 week, 1 day назад @ deepmind.google
Aeneas transforms how historians connect the past
Aeneas transforms how historians connect the past Aeneas transforms how historians connect the past

Research Aeneas transforms how historians connect the past ShareCopy link ×Writing was everywhere in the Roman world — etched onto everything from imperial monuments to everyday objects.

Today, we’re publishing a paper in Nature introducing Aeneas, the first artificial intelligence (AI) model for contextualizing ancient inscriptions.

How Aeneas works Aeneas is a multimodal generative neural network that takes an inscription’s text and image as input.

We invited twenty-three historians who regularly work with inscriptions to restore, date and place a set of texts using Aeneas.

The Aeneas team is continuing to partner with diverse subject matter experts, using Aeneas to help shed light to our…

2 weeks, 1 day назад @ deepmind.google
Gemini 2.5 Flash-Lite is now ready for scaled production use!
Gemini 2.5 Flash-Lite is now ready for scaled production use! Gemini 2.5 Flash-Lite is now ready for scaled production use!

Gemini 2.5 Flash-Lite, previously in preview, is now stable and generally available. This cost-efficient model provides high quality in a small size, and includes 2.5 family features like a 1 million-token context window and multimodality.

2 weeks, 2 days назад @ deepmind.google
Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad
Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad

The International Mathematical Olympiad (“IMO”) is the world’s most prestigious competition for young mathematicians, and has been held annually since 1959.

Recently, the IMO has also become an aspirational challenge for AI systems as a test of their advanced mathematical problem-solving and reasoning capabilities.

Last year, Google DeepMind’s combined AlphaProof and AlphaGeometry 2 systems achieved the silver-medal standard, solving four out of the six problems and scoring 28 points.

Making use of specialist formal languages, this breakthrough demonstrated that AI was beginning to approach elite human mathematical reasoning.

Breakthrough Performance at IMO 2025 with Gemini Deep ThinkAn adv…

2 weeks, 3 days назад @ deepmind.google
Exploring the context of online images with Backstory
Exploring the context of online images with Backstory Exploring the context of online images with Backstory

New experimental AI tool helps people explore the context and origin of images seen online.

The ways people are using and interacting with images online is constantly evolving.

As part of our efforts to help people make informed choices, we’re developing easy-to-use provenance tools and in-product context features, and investing in areas like information literacy.

Today, we’re introducing Backstory, an experimental artificial intelligence (AI) tool that surfaces information and helps people learn more about the context of images seen online.

When given an image and a written prompt, Backstory investigates whether an image was AI-generated, when and where it’s previously been used online, an…

2 weeks, 3 days назад @ deepmind.google
AlphaGenome: AI for better understanding the genome
AlphaGenome: AI for better understanding the genome AlphaGenome: AI for better understanding the genome

Science AlphaGenome: AI for better understanding the genome ShareCopy link ×Introducing a new, unifying DNA sequence model that advances regulatory variant-effect prediction and promises to shed new light on genome function — now available via API.

Small variations in a genome’s DNA sequence can alter an organism’s response to its environment or its susceptibility to disease.

How AlphaGenome works Our AlphaGenome model takes a long DNA sequence as input — up to 1 million letters, also known as base-pairs — and predicts thousands of molecular properties characterising its regulatory activity.

We haven't designed or validated AlphaGenome for personal genome prediction, a known challenge for A…

1 month, 1 week назад @ deepmind.google
Gemini Robotics On-Device brings AI to local robotic devices
Gemini Robotics On-Device brings AI to local robotic devices Gemini Robotics On-Device brings AI to local robotic devices

In March, we introduced Gemini Robotics, our most advanced VLA (vision language action) model, bringing Gemini 2.0’s multimodal reasoning and real-world understanding into the physical world.

Today, we’re introducing Gemini Robotics On-Device, our most powerful VLA model optimized to run locally on robotic devices.

Gemini Robotics On-Device shows strong general-purpose dexterity and task generalization, and it’s optimized to run efficiently on the robot itself.

Model capabilities and performanceGemini Robotics On-Device is a robotics foundation model for bi-arm robots, engineered to require minimal computational resources.

It builds on the task generalization and dexterity capabilities of G…

1 month, 2 weeks назад @ deepmind.google
Gemini 2.5: Updates to our family of thinking models
Gemini 2.5: Updates to our family of thinking models Gemini 2.5: Updates to our family of thinking models

Explore the latest Gemini 2.5 model updates with enhanced performance and accuracy: Gemini 2.5 Pro now stable, Flash generally available, and the new Flash-Lite in preview.

1 month, 3 weeks назад @ deepmind.google
We’re expanding our Gemini 2.5 family of models
We’re expanding our Gemini 2.5 family of models We’re expanding our Gemini 2.5 family of models

We designed Gemini 2.5 to be a family of hybrid reasoning models that provide amazing performance, while also being at the Pareto Frontier of cost and speed.

Today, we’re taking the next step with our 2.5 Pro and Flash models by releasing them as stable and generally available.

And we’re bringing you 2.5 Flash-Lite in preview — our most cost-efficient and fastest 2.5 model yet.

Making 2.5 Flash and 2.5 Pro generally availableThanks to all of your feedback, today we’re releasing stable versions of 2.5 Flash and Pro, so you can build production applications with confidence.

Introducing Gemini 2.5 Flash-LiteWe’re also introducing a preview of the new Gemini 2.5 Flash-Lite, our most cost-effici…

1 month, 3 weeks назад @ blog.google
Behind “ANCESTRA”: combining Veo with live-action filmmaking
Behind “ANCESTRA”: combining Veo with live-action filmmaking Behind “ANCESTRA”: combining Veo with live-action filmmaking

Today, Eliza McNitt’s short film, “ANCESTRA,” premieres at the Tribeca Festival.

It’s the story of a mother, and what happens when her child is born with a hole in its heart.

Inspired by the dramatic events of McNitt's own birth, the film portrays a mother's love as a cosmic, life-saving force.

Together, we founded this partnership to put the world’s best generative AI into the hands of top filmmakers, to advance the frontiers of storytelling and technology.

“ANCESTRA” combined live-action scenes with sequences generated by Veo, our state-of-the-art video generation model.

1 month, 3 weeks назад @ blog.google
How we're supporting better tropical cyclone prediction with AI
How we're supporting better tropical cyclone prediction with AI How we're supporting better tropical cyclone prediction with AI

Research How we're supporting better tropical cyclone prediction with AI ShareCopy link ×We’re launching Weather Lab, featuring our experimental cyclone predictions, and we’re partnering with the U.S. National Hurricane Center to support their forecasts and warnings this cyclone season.

Yet, improving the accuracy of cyclone predictions can help protect communities through more effective disaster preparedness and earlier evacuations.

Today, Google DeepMind and Google Research are launching Weather Lab, an interactive website for sharing our artificial intelligence (AI) weather models.

Weather Lab’s live and historical cyclone predictions Weather Lab shows live and historical cyclone predict…

1 month, 3 weeks назад @ deepmind.google
Google
последний пост 2 days, 5 hours назад
Announcing AI-first Colab notebook experience for Google Cloud
Announcing AI-first Colab notebook experience for Google Cloud Announcing AI-first Colab notebook experience for Google Cloud

Today, we are excited to bring these capabilities to Google Cloud BigQuery and Vertex AI via the Colab Enterprise notebook.

Generate, explain and transform code, as well as explain errors and fix them automatically.

Simplify workflows with Data Science AgentData science can be complex, iterative, and time-consuming.

The Data Science Agent (DSA) in Colab accelerates data science development with agentic capabilities that facilitate data exploration, transformation and ML modeling.

The Data Science Agent then generates a detailed plan covering all aspects of data science modeling from data loading, exploration, cleaning, visualization, feature engineering, data splitting, model training/optim…

2 days, 5 hours назад @ cloud.google.com
Redefining enterprise data with agents and AI-native foundations
Redefining enterprise data with agents and AI-native foundations Redefining enterprise data with agents and AI-native foundations

A unified and AI-native data foundationIntelligent agents and the networks they form cannot operate on a traditional data stack.

A core requirement of this AI-native foundation is that it unifies live transactional and historical analytical data stored in OLTP and OLAP systems.

With this unified data plane in place, the next requirement is giving agents a comprehensive memory that is grounded in your company's factual data.

The foundation of effective RAG is vector search that spans both real-time operational data and deep historical, analytical data.

This is why we embed vector search and generation capabilities directly into our data foundations — to give agents access to both transaction…

2 days, 5 hours назад @ cloud.google.com
Announcements for AI Hypercomputer: The latest infrastructure news for ML practitioners
Announcements for AI Hypercomputer: The latest infrastructure news for ML practitioners Announcements for AI Hypercomputer: The latest infrastructure news for ML practitioners

Leading software, open frameworksAI Hypercomputer provides choice and flexibility through support the most popular AI and ML libraries and frameworks.

We’re also expanding the focus of MaxText to provide post-training techniques through integration with the new JAX-native post-training library Tunix.

For example, GKE includes a managed CSI driver, which allows containerized AI training workflows to access data with ultra-low latency and high throughput.

Onwards and upwardsAs we continue to push the boundaries of AI, we’ll update and optimize AI Hypercomputer based on the our learnings from training Gemini to serving 980+ trillion tokens a month.

To learn more about using AI Hypercomputer fo…

2 days, 14 hours назад @ cloud.google.com
How Wells Fargo is using Google Cloud AI to empower its workforce with agentic tools
How Wells Fargo is using Google Cloud AI to empower its workforce with agentic tools How Wells Fargo is using Google Cloud AI to empower its workforce with agentic tools

The financial services industry has hit a technology tipping point.

Today, Wells Fargo and Google Cloud are excited to announce an expansion of their strategic relationship, which will transform how Wells Fargo uses and deploys agentic AI at scale.

This expanded collaboration will equip Wells Fargo employees — including branch bankers, investment bankers, marketers, and customer relations and corporate teams — with AI agents and tools from Google Cloud.

Wells Fargo is an early adopter of Google Agentspace, Google Cloud's unified, secure platform to build, manage, and adopt AI agents at scale.

Wells Fargo employees and teams will now be able to reach meaningful insights faster and unlock new…

2 days, 17 hours назад @ cloud.google.com
Optimize your cloud costs using Cloud Hub Optimization and Cost Explorer
Optimize your cloud costs using Cloud Hub Optimization and Cost Explorer Optimize your cloud costs using Cloud Hub Optimization and Cost Explorer

Application owners are looking for three things when they think about optimizing cloud costs:What are the most expensive resources?

To help you answer these questions quickly and easily, we announced Cloud Hub Optimization and Cost Explorer, in private preview, at Google Cloud Next 2025.

And today, we are excited to announce that both Cloud Hub Optimization and Cost Explorer are now in public preview.

Application cost and utilizationAs an app owner, your primary objective is keeping your application healthy at all times.

In addition to supporting Google Cloud Projects, Cloud Hub Optimization and Cost Explorer leverage App Hub applications to show you the cost-efficiency of your application’…

3 days, 14 hours назад @ cloud.google.com
A deep dive into code reviews with Gemini Code Assist in GitHub
A deep dive into code reviews with Gemini Code Assist in GitHub A deep dive into code reviews with Gemini Code Assist in GitHub

Imagine a code review process that doesn't slow you down.

This isn't a future-state prediction; it's what's possible today with Gemini Code Assist, integrated directly into your GitHub workflow at no charge.

By embedding a powerful AI partner into every pull request, we're transforming code reviews from a frustrating bottleneck into a fast and painless way to ensure high quality and consistent code, leading to higher code quality and happier developers.

The challenge: Why code reviews are a bottleneckCode reviews are a non-negotiable part of building quality software, but they are often a major bottleneck in the development lifecycle.

This friction slows down delivery velocity, leads to inc…

1 week назад @ cloud.google.com
Announcing a complete developer toolkit for scaling A2A agents on Google Cloud
Announcing a complete developer toolkit for scaling A2A agents on Google Cloud Announcing a complete developer toolkit for scaling A2A agents on Google Cloud

AI is evolving beyond single, task-specific agents into an interconnected ecosystem, where autonomous agents collaborate to solve complex problems, regardless of their underlying platform.

To make this transition easier for developers, we are announcing a comprehensive suite of tools that will empower developers to build, deploy, evaluate, and sell Agent2Agent (A2A) agents with Google Cloud.

Today, we’re excited to announce the release version 0.3 of the A2A protocol, which brings a more stable interface to build against and is critical to accelerating enterprise adoption.

The A2A protocol is quickly gaining momentum, with support from a growing ecosystem of over 150 organizations that span…

1 week назад @ cloud.google.com
Veo 3 and Veo 3 Fast are now stable and generally available on Vertex AI
Veo 3 and Veo 3 Fast are now stable and generally available on Vertex AI Veo 3 and Veo 3 Fast are now stable and generally available on Vertex AI

Today, we’re building on this momentum with some exciting updates to Veo on Vertex AI.

Veo 3, our most advanced video generation model, is now generally available to everyone on Vertex AI.

Veo 3 Fast , a model designed for speed and rapid iteration, is now generally available for everyone on Vertex AI.

Coming to public preview on Vertex AI in August, Veo 3 and Veo 3 Fast will also offer image-to-video capabilities to make it possible for you to bring static visuals and images to life.

How businesses are building with Veo 3 on Vertex AIGoogle Cloud customers around the world are using Veo 3 and Veo 3 Fast on Vertex AI to create professional-quality video content with unparalleled efficiency …

1 week, 2 days назад @ cloud.google.com
The global endpoint offers improved availability for Anthropic’s Claude on Vertex AI
The global endpoint offers improved availability for Anthropic’s Claude on Vertex AI The global endpoint offers improved availability for Anthropic’s Claude on Vertex AI

Anthropic's Claude models on Vertex AI now have improved overall availability with the global endpoint for Claude models.

Claude on Vertex AI fits perfectly with that — we get one of the best language models available, with Google’s solid infrastructure and the global endpoint that delivers fast responses worldwide.

When you send a request to Anthropic’s Claude models on Vertex AI, you typically specify a region (e.g., us-central1).

How to get startedTo get started with the global endpoint for Anthropic’s Claude models on Vertex AI, There are only two steps:Step 1: Select and enable a global endpoint supported Claude model on Vertex AI (Claude Opus 4, Claude Sonnet 4, Claude Sonnet 3.7, Cla…

1 week, 3 days назад @ cloud.google.com
BigQuery meets ADK & MCP: Accelerate agent development with BigQuery's new first-party toolset
BigQuery meets ADK & MCP: Accelerate agent development with BigQuery's new first-party toolset BigQuery meets ADK & MCP: Accelerate agent development with BigQuery's new first-party toolset

To solve this, we are introducing a new, first-party toolset for BigQuery that includes tools to fetch metadata and execute queries (and we have more on the way):list_dataset_ids : Fetches BigQuery dataset ids present in a GCP project.

get_dataset_info : Fetches metadata about a BigQuery dataset.

list_table_ids : Fetches table ids present in a BigQuery dataset.

get_table_info : Fetches metadata about a BigQuery table.

In this post, we’ll explore these first-party tools for BigQuery and walk you through how they can be used to build a conversational analytics agent in ADK that can answer natural language questions.

1 week, 3 days назад @ cloud.google.com
Understanding Calendar mode for Dynamic Workload Scheduler: Reserve ML GPUs and TPUs
Understanding Calendar mode for Dynamic Workload Scheduler: Reserve ML GPUs and TPUs Understanding Calendar mode for Dynamic Workload Scheduler: Reserve ML GPUs and TPUs

Organizations need ML compute resources that can accommodate bursty peaks and periodic troughs.

Calendar mode is currently available in preview as the newest feature of Dynamic Workload Scheduler.

This mode provides short-term ML capacity — up to 90 days of reserved capacity — without requiring long-term commitments.

Similar to a flight or hotel booking experience, Calendar mode makes it easy to search for and reserve ML capacity.

What customers are sayingOver the past year, early access customers have used Calendar mode to reserve ML compute resources for a variety of use cases, from drug discovery to training new models.

1 week, 3 days назад @ cloud.google.com
Your guide to taking an open model from discovery to a production-ready endpoint on Vertex AI
Your guide to taking an open model from discovery to a production-ready endpoint on Vertex AI Your guide to taking an open model from discovery to a production-ready endpoint on Vertex AI

You can leverage Vertex AI’s Gen AI evaluation service to assess the model against your own data and criteria, or integrate open-source frameworks.

Reading data can often be a bottleneck, but Vertex AI makes it simple.

For more complex data-cleaning and preparation tasks, you can build an automated Vertex AI Pipeline to orchestrate the preprocessing work for you.

This is when you graduate from the notebook to a formal Vertex AI Training job.

Vertex AI handles it all.

1 week, 6 days назад @ cloud.google.com
New Cluster Director features: Simplified GUI, managed Slurm, advanced observability
New Cluster Director features: Simplified GUI, managed Slurm, advanced observability New Cluster Director features: Simplified GUI, managed Slurm, advanced observability

In April, we released Cluster Director, a unified management plane that makes deploying and managing large-scale AI infrastructure simpler and more intuitive than ever before, putting the power of an AI supercomputer at your fingertips.

Today, we’re excited to release new features in preview including an intuitive interface, managed Slurm experience, and observability dashboard that intercepts performance anomalies.

LG Research uses Google Cloud to train their large language models, most recently Exaone 3.5.

Cluster Director has made the cluster deployment and management smooth, enabling them to dedicate more focus to the scientific challenges at the core of their work.

“Cluster Director on…

2 weeks назад @ cloud.google.com
25+ top gen AI how-to guides for enterprise
25+ top gen AI how-to guides for enterprise

The best way to learn AI is by building. From finding quick ways to deploy open models to building complex, multi-agentic systems, it’s easy to feel overwhelmed by the sheer volume of resources out there. To that end, we've compiled a living, curated collection of our 25+ favorite how-to guides for Google Cloud. This collection is split into four areas: Faster model deployment: Create efficient CI/CD pipelines, deploy large models like Llama 3 on high-performance infrastructure, and use open models in Vertex AI Studio. Building generative AI apps & multi-agentic systems: Build document summarizers, multi-turn chat apps, and advanced research agents with LangGraph. Fine-tuning, evaluation, a…

2 weeks, 2 days назад @ cloud.google.com
Announcing a new monitoring library to optimize TPU performance
Announcing a new monitoring library to optimize TPU performance Announcing a new monitoring library to optimize TPU performance

And there is strong demand from customers for Cloud TPUs as well.

Today, we're thrilled to introduce a new monitoring library for Google Cloud TPUs, a new set of observability and diagnostic tools that provide granular, integrated performance and accelerator utilization insights so you can continuously assess and improve the efficiency of your Cloud TPU workloads.

Note: If you have shell access to the TPU VM and just need some diagnostic information (e.g., to observe memory usage for a running process), you can use tpu-info, a command-line tool for viewing TPU metrics.

Unlocking dynamic optimization: Key metrics in actionThe monitoring library provides snapshot-mode access to a rich set of …

2 weeks, 6 days назад @ cloud.google.com
OpenAI
последний пост None
Microsoft Microsoft
последний пост 14 часов назад
Reimagining healthcare delivery and public health with AI
Reimagining healthcare delivery and public health with AI Reimagining healthcare delivery and public health with AI

All this is critical for maintaining good public health, as well as aligning better health and financial outcomes.

Umair and Gianrico are CEO-level leaders representing some of the best of the worlds of public health, healthcare delivery, medical research, and medical education.

So there’s so many use cases that AI has for population health or public health.

LEE: I find it really interesting that you are using the terms “public health” and “population health” …SHAH: Yeah.

As Umair explained in our discussion, the core of public health is the idea of population health, the idea of extracting new health insights from signals from population-scale data.

14 часов назад @ microsoft.com
Self-adaptive reasoning for science
Self-adaptive reasoning for science Self-adaptive reasoning for science

Recent approaches show promise in the utilization of intrinsic rewards for training reasoning models (RLIR).

While today’s reasoning models require additional data in the training phase and limit user control during the reasoning generation process, CLIO’s approach enables users to steer reasoning from scratch without additional data.

Through this open architecture approach to reasoning, we alleviate the necessity for further model post-training to achieve desired reasoning behavior.

Capabilities such as explaining the outcomes of internal reasoning are standard in the scientific field and are present in current reasoning model approaches.

Implications for science and trustworthy discoveryT…

1 day, 14 hours назад @ microsoft.com
VeriTrail: Detecting hallucination and tracing provenance in multi-step AI workflows
VeriTrail: Detecting hallucination and tracing provenance in multi-step AI workflows VeriTrail: Detecting hallucination and tracing provenance in multi-step AI workflows

Instead, an LM was prompted to classify the claim as “Fully Supported,” “Not Fully Supported,” or “Inconclusive” based on the evidence.

In this case, the verdict was “Fully Supported.”Since the verdict in Iteration 1 was “Fully Supported,” VeriTrail proceeded to Iteration 2.

Right: VeriTrail’s hallucination detection process for a “Not Fully Supported” claim, where the maximum number of consecutive “Not Fully Supported” verdicts was set to 2.

Once again, the verdict was “Not Fully Supported.” Since this was the second consecutive “Not Fully Supported” verdict, verification terminated and the verdict was deemed final.

A higher maximum increases confidence that a flagged claim is truly halluc…

2 days, 14 hours назад @ microsoft.com
Project Ire autonomously identifies malware at scale
Project Ire autonomously identifies malware at scale Project Ire autonomously identifies malware at scale

To verify its findings, Project Ire can invoke a validator tool that cross-checks claims in the report against the chain of evidence.

This tool draws on expert statements from malware reverse engineers on the Project Ire team.

For each file it analyzes, Project Ire generates a report that includes an evidence section, summaries of all examined code functions, and other technical artifacts.

Project Ire correctly identified the code that locates and disables antivirus programs, providing evidence that the file was malicious.

The issue was later resolved by updating decompiler rules, but this example illustrates how Project Ire navigates uncertainty during analysis.

1 week, 3 days назад @ microsoft.com
Navigating medical education in the era of generative AI
Navigating medical education in the era of generative AI Navigating medical education in the era of generative AI

Prior to med school, Daniel pursued experiences that cultivated his interest in the application of AI in medical practice and education.

Really, really looking forward to this chat.

There’s AI before ChatGPT and before, you know, generative AI really became a big thing, and then afterwards.

And then after we talk about what’s really happening, what do you think should happen in medical education given the reality of generative AI?

And I do agree [that] AI really gives us real hope that we can make it true.

2 weeks назад @ microsoft.com
Xinxing Xu bridges AI research and real-world impact at Microsoft Research Asia – Singapore
Xinxing Xu bridges AI research and real-world impact at Microsoft Research Asia – Singapore Xinxing Xu bridges AI research and real-world impact at Microsoft Research Asia – Singapore

In 2024, Xu joined Microsoft Research Asia where he began a new chapter focused on bridging between academic research and real-world AI applications.

“Microsoft Research Asia is committed to integrating scientific exploration with real-world applications, which creates a unique research environment,” Xu says.

Microsoft Research Asia – Singapore: Expanding global reach, connecting regional innovationRealizing AI’s full potential requires more than technical breakthroughs.

This is why Microsoft Research Asia continues to collaborate closely with Singapore’s top universities, research institutions, and industry partners.

– “Core knowledge in machine learning, linear algebra, and probability an…

2 weeks, 1 day назад @ microsoft.com
Technical approach for classifying human-AI interactions at scale
Technical approach for classifying human-AI interactions at scale Technical approach for classifying human-AI interactions at scale

From batching strategies and token optimization and orchestration, we’ll share what it takes to build a real-time LLM classification pipeline.

Engineering challenges & solutionsBuilding a high-throughput, LLM-powered classification pipeline at scale introduced a range of engineering challenges—from managing latency and token limits to ensuring system resilience.

Evolving LLM models & prompt alignmentChallenge: Each new LLM release—such as Phi, Mistral, DeepSeek, and successive generations of GPT (e.g., GPT-3.5, GPT-4, GPT-4 Turbo, GPT-4o)—brings improvements, but also subtle behavioral shifts.

Dynamic concurrency scaling for LLM callsChallenge: LLM endpoints frequently encounter rate limits…

2 weeks, 1 day назад @ microsoft.com
AI Testing and Evaluation: Reflections
AI Testing and Evaluation: Reflections AI Testing and Evaluation: Reflections

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

We have examples, like the pharmaceutical or medical device industry experts with whom you spoke, that’s really, you know, testing … there is a pre-deployment requirement.

And the third is just how rigid versus adaptive these testing and evaluation regimes or frameworks are in these different domains.

I really agree that there has been a lot of emphasis to date on, sort of, testing models upstream, the AI model evaluation.

You know, I think there’s been real progress already in the AI evaluation and testing ecosystem in the public-private partnership context.

2 weeks, 3 days назад @ microsoft.com
CollabLLM: Teaching LLMs to collaborate with users
CollabLLM: Teaching LLMs to collaborate with users CollabLLM: Teaching LLMs to collaborate with users

Large language models (LLMs) can solve complex puzzles in seconds, yet they sometimes struggle over simple conversations.

User-centric approach to trainingTo address this, we’re exploring ways to train LLMs with users in mind.

At any point in a conversation, the model generates multiple possible next turns by engaging in a dialogue with a simulated user.

The MR of each model response was computed by averaging the scores from the sampled conversations, originating from the model response.

We see CollabLLM as a step in that direction, training models to engage in meaningful multi-turn interactions, ask clarifying questions, and adapt to context.

3 weeks, 2 days назад @ microsoft.com
AI Testing and Evaluation: Learnings from cybersecurity
AI Testing and Evaluation: Learnings from cybersecurity AI Testing and Evaluation: Learnings from cybersecurity

Absolutely, I really, really was.

As a principal director on the Microsoft AI Red Team, Tori leads all AI security and safety red team operations, as well as dangerous capability testing, to directly inform C-suite decision-makers.

This year, we’ve pulled a lot of those assets and insights into the Azure [AI] Foundry AI Red Teaming Agent (opens in new tab).

So you can get a little taste of what we do day to day in the AI Red Teaming Agent.

WESTERHOFF: I think the most important takeaway from those lessons is that AI security is truly a team sport.

3 weeks, 3 days назад @ microsoft.com
How AI will accelerate biomedical research and discovery
How AI will accelerate biomedical research and discovery How AI will accelerate biomedical research and discovery

Dr. Eric Topol is the executive vice president of the biomedical research non-profit Scripps Research, where he founded and now directs the Scripps Research Translational Institute.

Let’s continue our deep dive on AI and biomedical research with this conversation with Noubar Afeyan:LEE: Noubar, thanks so much for joining.

And there’s the origin story of contact with AI, you know, before the emergence of generative AI and afterwards.

What is going on today with respect to AI really being used for something meaningful in the design and development of drugs?

TOPOL: You would read about how, you know, data is the new oil and, you know, gold and whatnot.

4 weeks назад @ microsoft.com
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

During the pre-market phase, medical testing establishes baseline safety and effectiveness metrics through bench testing, performance standards, and clinical studies.

SULLIVAN: So medical devices face a pretty prescriptive multi-level testing path before they hit the market.

We are looking into medical devices, as well, obviously, but also other technologies in advanced medical computing.

So we see Phase 3 trials as something that occurs in the medical devices and pharmaceuticals field.

1 month назад @ microsoft.com
AI Testing and Evaluation: Learnings from genome editing
AI Testing and Evaluation: Learnings from genome editing AI Testing and Evaluation: Learnings from genome editing

As generative AI continues to advance, Microsoft has gathered a range of experts—from genome editing to cybersecurity—to share how their fields approach evaluation and risk assessment.

CHARO: Well, you know, genome editing is both very old and very new.

Now the earliest forms of genome editing were very inefficient, and so we didn’t worry that much.

But the bottom-line thing to remember, the way to really think about it is, we don’t regulate genome editing; we regulate the things that use genome editing.

And she said, you know, we don’t regulate genome editing; we regulate the things that use genome editing.

1 month, 1 week назад @ microsoft.com
PadChest-GR: A bilingual grounded radiology reporting benchmark for chest X-rays
PadChest-GR: A bilingual grounded radiology reporting benchmark for chest X-rays PadChest-GR: A bilingual grounded radiology reporting benchmark for chest X-rays

In our ever-evolving journey to enhance healthcare through technology, we’re announcing a unique new benchmark for grounded radiology report generation—PadChest-GR (opens in new tab).

A new frontier in radiology report generationIt is estimated that over half of people visiting hospitals have radiology scans that must be interpreted by a clinical professional.

In contrast, grounded radiology reporting demands that each finding be described and localized individually.

It is the first public benchmark that enables us to evaluate generation of fully grounded radiology reports in chest X-rays.

We invite researchers and industry experts to explore PadChest-GR and MAIRA-2, contribute innovative i…

1 month, 1 week назад @ microsoft.com
AI Testing and Evaluation: Learnings from Science and Industry
AI Testing and Evaluation: Learnings from Science and Industry AI Testing and Evaluation: Learnings from Science and Industry

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

And I think, really, there are two reasons why tech is so, kind of, representative of that kind of challenge that I’ve always found fascinating.

Continues to be a really important topic in the AI policy conversation right now, I think, for really good reason.

Testing is an important component for governance and AI and, of course, in all of these other domains, as well.

I think about almost, like, in the near to mid-term, like three issues that we need to address in the AI, kind of, policy and testing context.

1 month, 2 weeks назад @ microsoft.com
MIT AI MIT AI
последний пост 1 day, 2 hours назад
Eco-driving measures could significantly reduce vehicle emissions
Eco-driving measures could significantly reduce vehicle emissions Eco-driving measures could significantly reduce vehicle emissions

This indicates that eco-driving measures could be implemented gradually while still having measurable, positive impacts on mitigating climate change and improving public health.

Their analysis indicates that fully adopting eco-driving measures could cut annual city-wide intersection carbon emissions by 11 to 22 percent, without slowing traffic throughput or affecting vehicle and traffic safety.

A large-scale modeling study led by MIT researchers reveals that eco-driving measures, which can involve dynamically adjusting vehicle speeds to reduce stopping and excessive acceleration, could significantly reduce those CO 2 emissions.

They began by identifying 33 factors that influence vehicle emi…

1 day, 2 hours назад @ news.mit.edu
School of Architecture and Planning welcomes new faculty for 2025
School of Architecture and Planning welcomes new faculty for 2025 School of Architecture and Planning welcomes new faculty for 2025

Four new faculty members join the School of Architecture and Planning (SA+P) this fall, offering the MIT community creativity, knowledge, and scholarship in multidisciplinary roles.

“These individuals add considerable strength and depth to our faculty,” says Hashim Sarkis, dean of the School of Architecture and Planning.

Pat Pataranutaporn SM ’18, PhD ’20 joins the MIT Media Lab as an assistant professor of media arts and sciences.

She was named a “Pioneer” on the MIT Technology Review global list of “35 innovators under 35” in 2019.

Holly Samuelson joins the Department of Architecture as an associate professor in the Building Technology Program at MIT, teaching architectural technology cou…

1 day, 10 hours назад @ news.mit.edu
Helping data storage keep up with the AI revolution
Helping data storage keep up with the AI revolution Helping data storage keep up with the AI revolution

That’s because traditional data storage systems were designed to handle simple commands from a handful of users at once, whereas today, AI systems with millions of agents need to continuously access and process large amounts of data in parallel.

Traditional data storage systems now have layers of complexity, which slows AI systems down because data must pass through multiple tiers before reaching the graphical processing units (GPUs) that are the brain cells of AI.

Cloudian, co-founded by Michael Tso ’93, SM ’93 and Hiroshi Ohta, is helping storage keep up with the AI revolution.

The company has developed a scalable storage system for businesses that helps data flow seamlessly between stora…

2 days, 2 hours назад @ news.mit.edu
MIT tool visualizes and edits “physically impossible” objects
MIT tool visualizes and edits “physically impossible” objects MIT tool visualizes and edits “physically impossible” objects

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a unique approach to represent “impossible” objects in a more versatile way.

Making impossible objects possibleImpossible objects can’t be fully replicated in 3D.

This process enabled the researchers to reduce visual imperfections of impossible shapes, such as a red heart outline they thinned out.

It can also use “inverse rendering” tools to convert drawings and images of impossible objects into high-dimensional designs.

They’re also working with perception scientists to see how the computer graphics tool can be used more broadly.

3 days, 9 hours назад @ news.mit.edu
New algorithms enable efficient machine learning with symmetric data
New algorithms enable efficient machine learning with symmetric data New algorithms enable efficient machine learning with symmetric data

But training a model to process symmetric data is no easy task.

One common approach is called data augmentation, where researchers transform each symmetric data point into multiple data points to help the model generalize better to new data.

A well-known example of this is a graph neural network (GNN), which inherently handles symmetric data because of how it is designed.

They explored the statistical-computational tradeoff in machine learning with symmetric data.

Building on this theoretical evaluation, the researchers designed an efficient algorithm for machine learning with symmetric data.

1 week, 2 days назад @ news.mit.edu
“FUTURE PHASES” showcases new frontiers in music technology and interactive performance
“FUTURE PHASES” showcases new frontiers in music technology and interactive performance “FUTURE PHASES” showcases new frontiers in music technology and interactive performance

Music technology took center stage at MIT during “FUTURE PHASES,” an evening of works for string orchestra and electronics, presented by the MIT Music Technology and Computation Graduate Program as part of the 2025 International Computer Music Conference (ICMC).

Produced in collaboration with the MIT Media Lab’s Opera of the Future Group and Boston’s self-conducted chamber orchestra A Far Cry, “FUTURE PHASES” was the first event to be presented by the MIT Music Technology and Computation Graduate Program in MIT Music’s new space.

“The ICMC is all about presenting the latest research, compositions, and performances in electronic music,” says Egozy, director of the new Music Technology and Co…

1 week, 2 days назад @ news.mit.edu
Robot, know thyself: New vision-based system teaches machines to understand their bodies
Robot, know thyself: New vision-based system teaches machines to understand their bodies Robot, know thyself: New vision-based system teaches machines to understand their bodies

Instead, the entire system relies on a single camera that watches the robot’s movements and uses that visual data to control it.

But when a robot is soft, deformable, or irregularly shaped, those assumptions fall apart.

In soft and bio-inspired robots, designers often embed sensors or reinforce parts of the structure just to make modeling feasible.

That speed makes NJF more viable than many physics-based simulators for soft robots, which are often too computationally intensive for real-time use.

“What’s really interesting is that the system figures out on its own which motors control which parts of the robot,” says Li.

2 weeks назад @ news.mit.edu
Pedestrians now walk faster and linger less, researchers find
Pedestrians now walk faster and linger less, researchers find Pedestrians now walk faster and linger less, researchers find

The number of people lingering in public spaces declined by 14 percent in that time as well.

The researchers used machine-learning tools to assess 1980s-era video footage captured by renowned urbanist William Whyte, in Boston, New York, and Philadelphia.

“Public space is such an important element of civic life, and today partly because it counteracts the polarization of digital space,” says Salazar-Miranda.

On the other hand, the percentage of individuals entering these public spaces who became part of a group declined a bit.

“Perhaps there’s a more transactional nature to public space today,” Ratti says.

2 weeks назад @ news.mit.edu
New machine-learning application to help researchers predict chemical properties
New machine-learning application to help researchers predict chemical properties New machine-learning application to help researchers predict chemical properties

One of the shared, fundamental goals of most chemistry researchers is the need to predict a molecule’s properties, such as its boiling or melting point.

Enter a branch of artificial intelligence known as machine learning (ML).

One specific hurdle in chemical machine learning is translating molecular structures into a numerical language that computers can understand.

Next, the software implements state-of-the-art algorithms to identify patterns and accurately predict molecular properties like boiling and melting points, all through an intuitive, interactive graphical interface.

“We envision a future where any researcher can easily customize and apply machine learning to solve unique challeng…

2 weeks назад @ news.mit.edu
School of Architecture and Planning recognizes faculty with academic promotions in 2025
School of Architecture and Planning recognizes faculty with academic promotions in 2025 School of Architecture and Planning recognizes faculty with academic promotions in 2025

Seven faculty in the MIT School of Architecture and Planning (SA+P) have been honored for their contributions through promotions, effective July 1.

Three faculty promotions are in the Department of Architecture; three are in the Department of Urban Studies and Planning; and one is in the Program in Media Arts and Sciences.

Department of Urban Studies and Planning (DUSP)Carlo Ratti has been reappointed as professor of the practice.

Her interdisciplinary work draws together urban studies, critical peace studies, architectural history, cultural geography, and anthropology.

She also serves as the managing editor of Projections, the department’s annual peer-reviewed journal on critical issues in…

2 weeks, 2 days назад @ news.mit.edu
MIT Learn offers “a whole new front door to the Institute”
MIT Learn offers “a whole new front door to the Institute” MIT Learn offers “a whole new front door to the Institute”

MIT Learn enables learners to access more than 12,700 educational resources — including introductory and advanced courses, courseware, videos, podcasts, and more — from departments across the Institute.

“With MIT Learn, we’re opening access to MIT’s digital learning opportunities for millions around the world,” says Dimitris Bertsimas, vice provost for open learning.

“MIT Learn elevates learning with personalized recommendations powered by AI, guiding each learner toward deeper understanding.

“MIT Learn is a whole new front door to the Institute,” says Christopher Capozzola, senior associate dean for open learning, who worked with faculty across the Institute on the project.

“Just as the Ke…

2 weeks, 3 days назад @ news.mit.edu
A new way to edit or generate images
A new way to edit or generate images A new way to edit or generate images

But what if it were possible to generate images through AI methods without using a generator at all?

This group effort had its origins in a class project for a graduate seminar on deep generative models that Lao Beyer took last fall.

“This was a never-before-seen result, as no one had observed visually identifiable changes from manipulating tokens,” Lao Beyer says.

The MIT researchers found a way to create images without using a generator at all.

“It shows that image tokenizers — tools usually used just to compress images — can actually do a lot more.

2 weeks, 3 days назад @ news.mit.edu
The unique, mathematical shortcuts language models use to predict dynamic scenarios
The unique, mathematical shortcuts language models use to predict dynamic scenarios The unique, mathematical shortcuts language models use to predict dynamic scenarios

Language models like ChatGPT also track changes inside their own “mind” when finishing off a block of code or anticipating what you’ll write next.

Identifying and tweaking these underlying mechanisms helps language models become more reliable prognosticators, especially with more dynamic tasks like forecasting weather and financial markets.

The team made this observation by going under the hood of language models, evaluating how closely they could keep track of objects that change position rapidly.

“We’ve found that when language models use a heuristic early on in training, they’ll start to build these tricks into their mechanisms,” says Li.

This suggests that fine-tuning larger language mo…

2 weeks, 3 days назад @ news.mit.edu
Model predicts long-term effects of nuclear waste on underground disposal systems
Model predicts long-term effects of nuclear waste on underground disposal systems Model predicts long-term effects of nuclear waste on underground disposal systems

The United States, for instance, has indefinitely stalled its only long-term underground nuclear waste repository.

Scientists are using both modeling and experimental methods to study the effects of underground nuclear waste disposal and ultimately, they hope, build public trust in the decision-making process.

The authors hope the research will improve confidence among policymakers and the public in the long-term safety of underground nuclear waste disposal.

As such, much effort has been put into studying the migration behaviors of radionuclides from nuclear waste within various natural and engineered geological materials.

To date, several challenges have limited scientists’ understanding o…

3 weeks назад @ news.mit.edu
This “smart coach” helps LLMs switch between text and code
This “smart coach” helps LLMs switch between text and code This “smart coach” helps LLMs switch between text and code

Large language models (LLMs) excel at using textual reasoning to understand the context of a document and provide a logical answer about its contents.

While some LLMs can generate code like Python to handle symbolic queries, the models don’t always know when to use code, or what kind of code would work best.

Enter CodeSteer, a smart assistant developed by MIT researchers that guides an LLM to switch between code and text generation until it correctly answers a query.

CodeSteer, itself a smaller LLM, automatically generates a series of prompts to iteratively steer a larger LLM.

Fine-tuning a smaller model doesn’t change the larger LLM, so there is no risk it would undermine the larger model’…

3 weeks, 1 day назад @ news.mit.edu
Berkeley AI
последний пост 3 months, 4 weeks назад
Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)
Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign) Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)

Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications.

To mitigate the imminent prompt injection threat, we propose two fine-tuning-defenses, StruQ and SecAlign.

Prompt Injection Attack: CausesBelow is the threat model of prompt injection attacks.

Prompt injection threat model in LLM-integrated applicationsWe propose that prompt injection has two causes.

Below are resources to learn more and keep updated on prompt injection attacks and defenses.

3 months, 4 weeks назад @ bair.berkeley.edu
Repurposing Protein Folding Models for Generation with Latent Diffusion
Repurposing Protein Folding Models for Generation with Latent Diffusion Repurposing Protein Folding Models for Generation with Latent Diffusion

Repurposing Protein Folding Models for Generation with Latent DiffusionPLAID is a multimodal generative model that simultaneously generates protein 1D sequence and 3D structure, by learning the latent space of protein folding models.

In PLAID, we develop a method that learns to sample from the latent space of protein folding models to generate new proteins.

Unlike many previous protein structure generative models, PLAID addresses the multimodal co-generation problem setting: simultaneously generating both discrete sequence and continuous all-atom structural coordinates.

In this way, we can use structural understanding information in the weights of pretrained protein folding models for the p…

4 months назад @ bair.berkeley.edu
Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment
Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment

Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway DeploymentTraining Diffusion Models with Reinforcement LearningWe deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone.

The challenges of phantom jamsA stop-and-go wave moving backwards through highway traffic.

Smoothing behavior of RL AVs.

Overall, the steps towards deployment involved:Training in data-driven simulations: We used highway traffic data from I-24 to create a training environment with realistic wave dynamics, then validate the trained agent’s performance and robustness in a variety of new traffic scenarios.…

4 months, 2 weeks назад @ bair.berkeley.edu
Virtual Personas for Language Models via an Anthology of Backstories
Virtual Personas for Language Models via an Anthology of Backstories Virtual Personas for Language Models via an Anthology of Backstories

Virtual Personas for Language Models via an Anthology of BackstoriesWe introduce Anthology, a method for conditioning LLMs to representative, consistent, and diverse virtual personas by generating and utilizing naturalistic backstories with rich details of individual values and experience.

What does it mean for large language models (LLMs) to be trained on massive text corpora, collectively produced by millions and billions of distinctive human authors?

In this work, we introduce Anthology, an approach for steering LLMs to representative, consistent, and diverse virtual personas by providing richly detailed life narratives of individuals as conditioning context to models.

By grounding langu…

8 months, 4 weeks назад @ bair.berkeley.edu
AWS Machine Learning AWS Machine Learning
последний пост 13 часов назад
The DIVA logistics agent, powered by Amazon Bedrock
The DIVA logistics agent, powered by Amazon Bedrock The DIVA logistics agent, powered by Amazon Bedrock

In this post, we discuss how DTDC and ShellKode used Amazon Bedrock to build DIVA 2.0, a generative AI-powered logistics agent.

Solution overviewTo address the limitations of the existing logistics agent, ShellKode built an advanced agentic assistant using Amazon Bedrock Agents, Amazon Bedrock Knowledge Bases, and an API integration layer.

The logistics agent is hosted as a static website using Amazon CloudFront and Amazon Simple Storage Service (Amazon S3).

The Amazon Bedrock agents are configured in Amazon Bedrock.

Logistics agent dashboardThe following high-level architecture diagram illustrates the logistics agent dashboard, which captures the end-user interactions and its associated re…

13 часов назад @ aws.amazon.com
Automate enterprise workflows by integrating Salesforce Agentforce with Amazon Bedrock Agents
Automate enterprise workflows by integrating Salesforce Agentforce with Amazon Bedrock Agents Automate enterprise workflows by integrating Salesforce Agentforce with Amazon Bedrock Agents

This post explores a practical collaboration, integrating Salesforce Agentforce with Amazon Bedrock Agents and Amazon Redshift, to automate enterprise workflows.

Agentforce and Amazon Bedrock Agent integration patternsAgentforce can call Amazon Bedrock agents in different ways, allowing flexibility to build different architectures.

Amazon Bedrock access – Access to Amazon Bedrock Agents and to Anthropic’s Claude 3.5 Haiku v1 enabled in your chosen AWS Region.

IAM roles and policies – Understanding of how to create IAM roles with the necessary permissions for Amazon Bedrock Agents, Lambda functions (to call Amazon Bedrock, Amazon Redshift, and the Agentforce API), and Amazon Bedrock Knowledg…

13 часов назад @ aws.amazon.com
How Amazon Bedrock powers next-generation account planning at AWS
How Amazon Bedrock powers next-generation account planning at AWS How Amazon Bedrock powers next-generation account planning at AWS

To address this challenge, we launched Account Plan Pulse in January 2025, a generative AI tool designed to streamline and enhance the account planning process.

Solution overviewTo address these challenges, we designed Pulse, a generative AI solution that uses Amazon Bedrock to analyze and improve account plans.

ConclusionIn this post, we shared how Pulse, powered by Amazon Bedrock, has transformed the account planning process for AWS sales teams.

By building flows for orchestrating our workflows, use of Amazon Bedrock Guardrails, introduction of agentic frameworks, and use of Strands Agents and Amazon Bedrock AgentCore, we can make a more dynamic flow in the future.

To learn more about Ama…

16 часов назад @ aws.amazon.com
Pioneering AI workflows at scale: A deep dive into Asana AI Studio and Amazon Q index collaboration
Pioneering AI workflows at scale: A deep dive into Asana AI Studio and Amazon Q index collaboration Pioneering AI workflows at scale: A deep dive into Asana AI Studio and Amazon Q index collaboration

Today, we’re excited to announce the integration of Asana AI Studio with Amazon Q index, bringing generative AI directly into your daily workflows.

This component offers organizations a protected channel through which Asana AI Studio can retrieve information from their Amazon Q index.

For a comprehensive guide on setting up Amazon Q Business, refer to Innovate on enterprise data with generative AI & Amazon Q Business application.

Add Asana as a data accessor in Amazon Q BusinessComplete the following steps to add Asana as an Amazon Q Business data accessor:On the Amazon Q Business console, choose Applications in the navigation pane.

Connect an Amazon Q index to the Asana admin consoleConnec…

1 day, 8 hours назад @ aws.amazon.com
Responsible AI for the payments industry – Part 1
Responsible AI for the payments industry – Part 1 Responsible AI for the payments industry – Part 1

Payment industry challengesThe payments industry presents a unique landscape for AI implementation, where the stakes are high and the potential impact on individuals is significant.

Core principles of responsible AIIn the following sections, we review how responsible AI considerations can be applied in the payment industry.

ConclusionAs we’ve explored throughout this guide, responsible AI in the payments industry represents both a strategic imperative and competitive advantage.

In an industry where trust is the ultimate currency, responsible AI is a responsible choice and an important business imperative.

To learn more about responsible AI, refer to the AWS Responsible Use of AI Guide.

1 day, 11 hours назад @ aws.amazon.com
Responsible AI for the payments industry – Part 2
Responsible AI for the payments industry – Part 2 Responsible AI for the payments industry – Part 2

In Part 1 of our series, we explored the foundational concepts of responsible AI in the payments industry.

The AI Responsible CommitteeConsider establishing an AI Responsible Committee for your financial institution.

This cross-functional body can serve as a central hub for AI governance, bringing together experts from various disciplines to guide responsible AI innovation and support alignment with responsible AI practices.

In an industry built on a foundation of trust, responsible AI is a responsible choice and an important business imperative and success.

To learn more about responsible AI, refer to the AWS Responsible Use of AI Guide.

1 day, 11 hours назад @ aws.amazon.com
Process multi-page documents with human review using Amazon Bedrock Data Automation and Amazon SageMaker AI
Process multi-page documents with human review using Amazon Bedrock Data Automation and Amazon SageMaker AI Process multi-page documents with human review using Amazon Bedrock Data Automation and Amazon SageMaker AI

However, although the advanced capabilities of Amazon Bedrock Data Automation deliver exceptional automation, there remain scenarios where human judgment is invaluable.

In this post, we show how to process multi-page documents with a human review loop using Amazon Bedrock Data Automation and SageMaker AI.

This helps customers focus on solving their business challenges with Amazon Bedrock Data Automation rather than dealing with complex scoring mechanisms.

Solution overviewThe following architecture provides a serverless solution for processing multi-page documents with human review loops using Amazon Bedrock Data Automation and SageMaker AI.

The Step Functions workflow invokes the bda-class…

1 day, 11 hours назад @ aws.amazon.com
Build an AI assistant using Amazon Q Business with Amazon S3 clickable URLs
Build an AI assistant using Amazon Q Business with Amazon S3 clickable URLs Build an AI assistant using Amazon Q Business with Amazon S3 clickable URLs

Authenticated users subscribed to the Amazon Q Business application can interact with your AI assistant using the Amazon Q Business web experience from their web browsers or with a custom application built by your organization.

Considerations for using Amazon S3 clickable URLsConsider the following when using Amazon S3 clickable URLs:At the time of writing, the Amazon S3 clickable URLs feature is available on Amazon Q Business applications using AWS IAM Identity Center or IAM federation for user access management, and not available for Amazon Q Business applications created using anonymous mode.

Create an Amazon Q Business applicationDepending on your choice of user access management method…

2 days, 5 hours назад @ aws.amazon.com
GPT OSS models from OpenAI are now available on SageMaker JumpStart
GPT OSS models from OpenAI are now available on SageMaker JumpStart GPT OSS models from OpenAI are now available on SageMaker JumpStart

Today, we are excited to announce the availability of Open AI’s new open weight GPT OSS models, gpt-oss-120b and gpt-oss-20b , from OpenAI in Amazon SageMaker JumpStart.

Solution overviewThe OpenAI GPT OSS models ( gpt-oss-120b and gpt-oss-20b ) excel at coding, scientific analysis, and mathematical reasoning tasks.

Deploy gpt-oss-120b through the SageMaker JumpStart UIComplete the following steps to deploy gpt-oss-120b through SageMaker JumpStart:On the SageMaker console, choose Studio in the navigation pane.

On the SageMaker Studio console, access SageMaker JumpStart by choosing JumpStart in the navigation pane.

About the AuthorsPradyun Ramadorai, Senior Software Development EngineerMalav…

2 days, 7 hours назад @ aws.amazon.com
Discover insights from Microsoft Exchange with the Microsoft Exchange connector for Amazon Q Business
Discover insights from Microsoft Exchange with the Microsoft Exchange connector for Amazon Q Business Discover insights from Microsoft Exchange with the Microsoft Exchange connector for Amazon Q Business

Microsoft Exchange connector for Amazon Q Business featuresThe following table gives an overview of the Microsoft Exchange connector for Amazon Q Business and its supported features.

PrerequisitesTo configure the Microsoft Exchange connector for Amazon Q Business, you need to create a Microsoft Exchange account in Office 365.

Create the Microsoft Exchange connector for Amazon Q BusinessFor detailed steps to set up Amazon Q Business, refer to Getting started with Amazon Q Business.

To configure the Amazon Q Business connector, complete the following steps:In the Amazon Q Business console, choose Applications in the navigation pane.

How to set up Amazon Q Business using a different IdPYou can…

2 days, 13 hours назад @ aws.amazon.com
AI judging AI: Scaling unstructured text analysis with Amazon Nova
AI judging AI: Scaling unstructured text analysis with Amazon Nova AI judging AI: Scaling unstructured text analysis with Amazon Nova

Instead of relying on a single model’s potentially biased view, multiple models work together to provide a more balanced assessment.

The unified Amazon Web Services (AWS) environment and standardized API calls simplify deploying multiple models for thematic analysis and judging model outputs.

Create a SageMaker notebook instance to run the analysis, and then initialize Amazon Bedrock and configure the input and output file locations on Amazon S3.

Depending on your use case, you can use any or multiple models offered by Amazon Bedrock for this step.

If you followed along, you have now successfully deployed multiple LLMs to judge thematic analysis output from an LLM.

3 days, 12 hours назад @ aws.amazon.com
Building an AI-driven course content generation system using Amazon Bedrock
Building an AI-driven course content generation system using Amazon Bedrock Building an AI-driven course content generation system using Amazon Bedrock

The solution uses Amazon Simple Queue Service (Amazon SQS), AWS Lambda, Amazon Bedrock, Amazon API Gateway WebSocket APIs, Amazon Simple Storage Service (Amazon S3), Amazon CloudFront, Amazon DynamoDB, Amazon Cognito and AWS WAF.

In this post, we explore each component in detail, along with the technical implementation of the two core modules: course outline generation and course content generation.

Content generation is content generated for the module and submodule generated in content outline.

WebSocket API and authentication mechanismsCourse WebSocket API manages real-time interactions for course outline and content generation.

AWS IAM policies enforce strict access control to secure re…

3 days, 12 hours назад @ aws.amazon.com
How Handmade.com modernizes product image and description handling with Amazon Bedrock and Amazon OpenSearch Service
How Handmade.com modernizes product image and description handling with Amazon Bedrock and Amazon OpenSearch Service How Handmade.com modernizes product image and description handling with Amazon Bedrock and Amazon OpenSearch Service

This post focuses on the challenge of generating rich, scalable product descriptions for hand-crafted goods uploaded by a distributed seller base.

We explore how Handmade.com designed and implemented a hybrid AI and vector search solution using Amazon Bedrock and Amazon OpenSearch Service for semantic content retrieval.

Image analysisThe image analysis workflow consists of the following steps:A user uploads a product image.

The architecture incorporates foundation models from Amazon Bedrock, semantic retrieval with Amazon OpenSearch Service, and a lightweight API layer using Amazon API Gateway for seamless orchestration.

Start exploring Amazon Bedrock and Amazon OpenSearch Service today to …

3 days, 13 hours назад @ aws.amazon.com
Cost tracking multi-tenant model inference on Amazon Bedrock
Cost tracking multi-tenant model inference on Amazon Bedrock Cost tracking multi-tenant model inference on Amazon Bedrock

For tracking multi-tenant costs when dealing with tens to thousands of application inference profiles refer to Manage multi-tenant Amazon Bedrock costs using application inference profiles in the AWS Artificial Intelligence Blog post.

Managing costs and resources in large-scale multi-tenant environments adds complexity when you use application inference profiles in Amazon Bedrock.

Monitoring and analyzing Amazon Bedrock performance metricsThe following Amazon QuickSight dashboard demonstrates how you can visualize your Amazon Bedrock usage data across multiple dimensions.

You can see how different business functions such as Finance, Research, HR, and Sales use Amazon Bedrock services.

The c…

3 days, 13 hours назад @ aws.amazon.com
Introducing Amazon Bedrock AgentCore Browser Tool
Introducing Amazon Bedrock AgentCore Browser Tool Introducing Amazon Bedrock AgentCore Browser Tool

At AWS Summit New York City 2025, Amazon Web Services (AWS) announced the preview of Amazon Bedrock AgentCore browser tool, a fully managed, pre-built cloud-based browser.

In this post, we introduce the newly announced Amazon Bedrock AgentCore Browser Tool.

Use cases for cloud-based browser automationHandling repetitive web tasks: With the introduction of Amazon Bedrock AgentCore Browser Tool, organizations can now implement sophisticated browser automation at scale.

PrerequisitesTo use the Amazon Bedrock AgentCore Brower Tool, you need to complete the following prerequisites:Python 3.10+Verify your IAM user or role has the permissions to use AgentCore Browser:git clone https://github.com/a…

6 days, 11 hours назад @ aws.amazon.com
NVIDIA
последний пост 9 часов назад
Efficient Transforms in cuDF Using JIT Compilation
Efficient Transforms in cuDF Using JIT Compilation Efficient Transforms in cuDF Using JIT Compilation

This post explains how JIT compilation brings kernel fusion to the cuDF C++ programming model, providing higher data processing throughput and more efficient GPU memory and compute utilization.

The JIT transform approach in cuDF uses NVRTC to JIT compile a custom kernel for completing arbitrary transformations.

Using JIT transformThe rapidsai/cudf GitHub repo provides a suite of string_transforms examples to demonstrate string manipulation with both the precompiled approach and the new JIT transform approach.

Performance benefits from JIT compilationUsing the JIT transform approach to process UDFs results in faster runtimes than using the precompiled approach.

Speedup of JIT transform appro…

9 часов назад @ developer.nvidia.com
Train with Terabyte-Scale Datasets on a Single NVIDIA Grace Hopper Superchip Using XGBoost 3.0
Train with Terabyte-Scale Datasets on a Single NVIDIA Grace Hopper Superchip Using XGBoost 3.0 Train with Terabyte-Scale Datasets on a Single NVIDIA Grace Hopper Superchip Using XGBoost 3.0

A single NVIDIA GH200 Grace Hopper Superchip can now process datasets from gigabyte scale all the way to 1 terabyte (TB) scale.

Two constraints, however, remained:Even with compression of data, GPU memory is the main limiting factor for datasets that can be GPU accelerated.

New External-Memory Quantile DMatrixXGBoost 3.0 introduces a third mechanism in External-Memory Quantile DMatrix, which enables scaling up to TB-scale datasets on a single GH200 Grace Hopper Superchip.

The External Memory Quantile DMatrix manages all the dataset memoryHow the NVIDIA Grace Hopper Superchip makes streaming practicalThis setup is ideal for NVIDIA Grace-based superchips.

XGBoost 3.0 enables you to process TB…

12 часов назад @ developer.nvidia.com
The Saga Continues: Stream 2K’s ‘Mafia: The Old Country’ at Launch on GeForce NOW
The Saga Continues: Stream 2K’s ‘Mafia: The Old Country’ at Launch on GeForce NOW The Saga Continues: Stream 2K’s ‘Mafia: The Old Country’ at Launch on GeForce NOW

And don’t miss out on the newest season of Marvel Rivals, available for members to stream without having to wait for downloads or patches.

The Don Says PlayReturn to the heart of the Mafia saga and journey into the brutal underworld of 1900s Sicily with Mafia: The Old Country.

Mafia: The Old Country joins the Mafia trilogy already streaming on GeForce NOW for members to relive the timeless action before diving into Enzo’s origins.

In this classic action role-playing game, chaos abounds as champions dodge harpies, clash with centaurs and roam sun-drenched vistas while honing unique masteries.

With high-performance servers handling the heavy lifting, the gameplay stays smooth, the visuals sha…

17 часов назад @ blogs.nvidia.com
What’s New and Important in CUDA Toolkit 13.0
What’s New and Important in CUDA Toolkit 13.0 What’s New and Important in CUDA Toolkit 13.0

Unifying CUDA for Arm: Build once, deploy anywhereWith CUDA 13.0, NVIDIA is streamlining development for Arm platforms by unifying the CUDA toolkit across server-class and embedded devices.

This makes it difficult to link and ship many CUDA Toolkit wheels because you have to search for each component separately when creating Python packages that depend on CUDA Toolkit wheels.

Starting with CUDA Toolkit 13.0 wheels, we will ship all of the CUDA Toolkit artifacts in the same folder in order to make CUDA Toolkit wheels file layouts closer to the file layouts on other platforms.

The versioned CUDA Toolkit subdirectory (which we use on other platforms) also makes it easier to ensure that compone…

1 day, 14 hours назад @ developer.nvidia.com
OpenAI’s New Open Models Accelerated Locally on NVIDIA GeForce RTX and RTX PRO GPUs
OpenAI’s New Open Models Accelerated Locally on NVIDIA GeForce RTX and RTX PRO GPUs OpenAI’s New Open Models Accelerated Locally on NVIDIA GeForce RTX and RTX PRO GPUs

“OpenAI showed the world what could be built on NVIDIA AI — and now they’re advancing innovation in open-source software,” said Jensen Huang, founder and CEO of NVIDIA.

The OpenAI open models are the first MXFP4 models supported on NVIDIA RTX.

Run the OpenAI Models on NVIDIA RTX With OllamaThe easiest way to test these models on RTX AI PCs, on GPUs with at least 24GB of VRAM, is using the new Ollama app.

Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

Join NVIDIA’s Discord server to connect with community developers and AI enthusiasts for discussions on what’s possible with RTX AI.

2 days, 13 hours назад @ blogs.nvidia.com
OpenAI and NVIDIA Propel AI Innovation With New Open Models Optimized for the World’s Largest AI Inference Infrastructure
OpenAI and NVIDIA Propel AI Innovation With New Open Models Optimized for the World’s Largest AI Inference Infrastructure OpenAI and NVIDIA Propel AI Innovation With New Open Models Optimized for the World’s Largest AI Inference Infrastructure

NVIDIA delivers industry-leading gpt-oss-120b performance of 1.5 million tokens per second on a single NVIDIA Blackwell GB200 NVL72 rack-scale system.

“OpenAI showed the world what could be built on NVIDIA AI — and now they’re advancing innovation in open-source software,” said Jensen Huang, founder and CEO of NVIDIA.

NVIDIA Blackwell includes innovations such as NVFP4 4-bit precision, which enables ultra-efficient, high-accuracy inference while significantly reducing power and memory requirements.

Open Development for Millions of AI Builders WorldwideNVIDIA CUDA is the world’s most widely available computing infrastructure, letting users deploy and run AI models anywhere, from the powerful…

2 days, 13 hours назад @ blogs.nvidia.com
No Backdoors. No Kill Switches. No Spyware.
No Backdoors. No Kill Switches. No Spyware. No Backdoors. No Kill Switches. No Spyware.

To mitigate the risk of misuse, some pundits and policymakers propose requiring hardware “kill switches” or built-in controls that can remotely disable GPUs without user knowledge and consent.

NVIDIA GPUs do not and should not have kill switches and backdoors.

Embedding backdoors and kill switches into chips would be a gift to hackers and hostile actors.

Kill switches and built-in backdoors create single points of failure and violate the fundamental principles of cybersecurity.

No kill switches.

2 days, 14 hours назад @ blogs.nvidia.com
7 Drop-In Replacements to Instantly Speed Up Your Python Data Science Workflows
7 Drop-In Replacements to Instantly Speed Up Your Python Data Science Workflows 7 Drop-In Replacements to Instantly Speed Up Your Python Data Science Workflows

Many of Python’s most popular data science libraries—including pandas, Polars, scikit-learn, and XGBoost—can now run much faster on GPUs with little to no code changes.

This post shares how to speed up seven drop-in replacements for popular Python libraries—complete with starter code to try yourself.

Make pandas and Polars run faster on larger datasetsThe foundation of any data science or machine learning project is data preparation.

%%load_ext cuml.accel import hdbscan clusterer = hdbscan.HDBSCAN() time clusterer.fit(X)Clustering large datasets with HDBSCAN can be time-consuming—even on toy examples, CPU runs can take 30 to 60 seconds.

With one environment variable and zero code changes, y…

6 days, 7 hours назад @ developer.nvidia.com
Wired for Action: Langflow Enables Local AI Agent Creation on NVIDIA RTX PCs
Wired for Action: Langflow Enables Local AI Agent Creation on NVIDIA RTX PCs Wired for Action: Langflow Enables Local AI Agent Creation on NVIDIA RTX PCs

With popular applications like Langflow — a low-code, visual platform for designing custom AI workflows — AI enthusiasts can use simple, no-code user interfaces (UIs) to chain generative AI models.

To help modders get started, NVIDIA’s Langflow Remix template includes:A retrieval-augmented generation module with RTX Remix documentation.

Control RTX AI PCs With Project G-Assist in LangflowNVIDIA Project G-Assist is an experimental, on-device AI assistant that runs locally on GeForce RTX PCs.

Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

Join NVIDIA’s Discord server to connect with community developers and AI enthu…

1 week назад @ blogs.nvidia.com
Embark on Epic Adventures in August With a Dozen New Games Coming to GeForce NOW
Embark on Epic Adventures in August With a Dozen New Games Coming to GeForce NOW Embark on Epic Adventures in August With a Dozen New Games Coming to GeForce NOW

Wrap up the month with eight games available this week, including ‘Grounded 2’ and the latest update for ‘Genshin Impact.’August brings new levels of gaming excitement on GeForce NOW, with 2,300 titles now available to stream in the cloud.

Grab a controller and get ready for epic adventures — a dozen new games are coming to the cloud this month.

Get ready to shrink down for big fun — early access for Grounded 2, announced as a surprise in June at XBOX Games Showcase, will be available to stream on day one: tomorrow, Aug. 1.

Plus, finish off July with the eight games available to stream this week, alongside the latest update for the acclaimed open-world action role-playing game Genshin Impac…

1 week назад @ blogs.nvidia.com
Optimizing Vector Search for Indexing and Real-Time Retrieval with NVIDIA cuVS
Optimizing Vector Search for Indexing and Real-Time Retrieval with NVIDIA cuVS Optimizing Vector Search for Indexing and Real-Time Retrieval with NVIDIA cuVS

The NVIDIA cuVS team is looking forward to OpenSearch 3.0, which will use cuVS for GPU-accelerated index building, and demonstrate up to 9.4x faster index build times.

At this scale, an ideal solution is to use GPUs for index builds and CPUs for the index search.

Index interoperability offers AI applications faster index build times with potentially lower costs while preserving vector search on a CPU-powered vector database.

Microsoft Advertising is exploring the integration of the cuVS CAGRA search algorithm for this purpose.

Get started with NVIDIA cuVSNVIDIA cuVS continues to redefine GPU-accelerated vector search, offering innovations that optimize performance, reduce costs, and streaml…

2 weeks назад @ developer.nvidia.com
‘WUCHANG: Fallen Feathers’ Lands in the Cloud
‘WUCHANG: Fallen Feathers’ Lands in the Cloud ‘WUCHANG: Fallen Feathers’ Lands in the Cloud

From the myth and mystery of ‘WUCHANG: Fallen Feathers’ to iconic tricks in ‘Tony Hawk’s Pro Skater 1 + 2’ and the mayhem of ‘Killing Floor 3,’ it’s a blockbuster week in the cloud with nine new games.

WUCHANG: Fallen Feathers has launched in the cloud.

Ride in style with skateboarding legends in Tony Hawk’s Pro Skater 1 + 2, now just a grind away.

Gravity Is OptionalDrop in, grab a deck and get ready to grind — Tony Hawk’s Pro Skater 1 + 2 is kickflipping their way onto the cloud, joining the newly launched Tony Hawk’s Pro Skater 3 + 4.

Tony Hawk’s Pro Skater 2 ups the ante with new moves, create-a-skater mode and even more legendary levels to conquer.

2 weeks назад @ blogs.nvidia.com
Creative Agency Black Mixture Creates Stunning Visuals With Generative AI Powered by NVIDIA RTX
Creative Agency Black Mixture Creates Stunning Visuals With Generative AI Powered by NVIDIA RTX Creative Agency Black Mixture Creates Stunning Visuals With Generative AI Powered by NVIDIA RTX

For media company Black Mixture, AI isn’t just a tool — it’s an entire pipeline to help bring creative visions to life.

It includes a series of live trainings that explore generative AI capabilities, as well as NVIDIA RTX AI technology and optimizations.

Next-Generation StreamingNVIDIA Broadcast version 2.0.2 delivers 15% faster performance for NVIDIA GeForce RTX 50 Series and NVIDIA RTX PRO Blackwell GPUs.

Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

Join NVIDIA’s Discord server to connect with community developers and AI enthusiasts for discussions on what’s possible with RTX AI.

2 weeks назад @ blogs.nvidia.com
Serverless Distributed Data Processing with Apache Spark and NVIDIA AI on Azure
Serverless Distributed Data Processing with Apache Spark and NVIDIA AI on Azure Serverless Distributed Data Processing with Apache Spark and NVIDIA AI on Azure

This post demonstrates how to solve this challenge by deploying a distributed Spark application on Azure Container Apps (ACA) with serverless GPUs.

One or more Spark worker applications : These applications are GPU-accelerated and run on Azure Container Apps serverless GPUs.

Step 2: Deploy the GPU-accelerated Spark worker applicationThe next step is to deploy the Spark worker application.

Get started with serverless distributed data processingBy deploying a custom, GPU-accelerated Apache Spark application on Azure Container Apps serverless GPUs, you can build highly efficient, scalable, and cost-effective distributed data processing solutions.

To learn more and see a demo walkthrough, watch…

2 weeks, 1 day назад @ developer.nvidia.com
Into the Omniverse: How Global Brands Are Scaling Personalized Advertising With AI and 3D Content Generation
Into the Omniverse: How Global Brands Are Scaling Personalized Advertising With AI and 3D Content Generation Into the Omniverse: How Global Brands Are Scaling Personalized Advertising With AI and 3D Content Generation

Marketing leaders are accelerating content pipelines with solutions built using OpenUSD, NVIDIA Omniverse and agentic AI.

Unilever’s content imagery is being created 2x faster and at half the cost of traditional methods, leading to 100% brand consistency and quicker content creation.

Grip relies on rules-based AI, NVIDIA RTX GPUs and the NVIDIA AI Enterprise software platform to ensure brand consistency across diverse markets.

Unilever, in collaboration with Collective World, is using Omniverse, OpenUSD and photorealistic 3D digital twins to accelerate content production.

Stay up to date by subscribing to NVIDIA Omniverse news, joining the Omniverse community and following NVIDIA Omniverse …

2 weeks, 1 day назад @ blogs.nvidia.com
Facebook
последний пост 1 day, 12 hours назад
Diff Risk Score: AI-driven risk-aware software development
Diff Risk Score: AI-driven risk-aware software development Diff Risk Score: AI-driven risk-aware software development

Built on a fine-tuned Llama LLM, DRS evaluates code changes and metadata to produce a risk score and highlight potentially risky code snippets.

Production risk was one of the areas we tackled first.

The demand to build such features also led us to build the Risk Awareness Platform to provide risk analysis APIs and tool integrations.

We believe code risk can play a significant role in improving this tradeoff, so we will build more risk-aware features while improving their quality.

While code changes cause the plurality of SEVs at Meta, configuration changes are another large category.

1 day, 12 hours назад @ engineering.fb.com
Building a human-computer interface for everyone
Building a human-computer interface for everyone Building a human-computer interface for everyone

What if you could control any device using only subtle hand movements?

New research from Meta’s Reality Labs is pointing even more firmly toward wrist-worn devices using surface electromyography (sEMG) becoming the future of human-computer interaction.

Generalization has been one of the most significant challenges in the field of human-computer interaction (HCI).

They discuss the road to creating a first-of-its-kind, generic human-computer neuromotor interface, what happens when software and hardware engineering meet neuroscience, and more!

And if you’re interested in learning more about career opportunities at Meta visit the Meta Careers page.

3 days, 16 hours назад @ engineering.fb.com
Using AI to make lower-carbon, faster-curing concrete
Using AI to make lower-carbon, faster-curing concrete Using AI to make lower-carbon, faster-curing concrete

But concrete suppliers can utilize AI to develop and scale innovative concrete mixes as drop-in replacements, accelerating the discovery and integration of sustainable materials for large-scale use.

Meta’s AI model for green concreteDesigning concrete formulas is a complex, multi-objective problem.

To accelerate the concrete mix design process, Meta developed an AI model for sustainable concrete using BoTorch and Ax, Meta’s open-source software for Bayesian optimization and adaptive experimentation, respectively.

Our AI pipeline consists of the workflow of generating baseline data, training an AI model, using it to develop and validate new hypotheses, and then improving the baseline data an…

3 weeks, 1 day назад @ engineering.fb.com
Accelerating GPU indexes in Faiss with NVIDIA cuVS
Accelerating GPU indexes in Faiss with NVIDIA cuVS Accelerating GPU indexes in Faiss with NVIDIA cuVS

Meta and NVIDIA collaborated to accelerate vector search on GPUs by integrating NVIDIA cuVS into Faiss v1.10 , Meta’s open source library for similarity search.

In its latest release, Faiss 1.10.0 officially includes these algorithms from the NVIDIA cuVS library.

Faiss 1.10.0 also includes a new conda package that unlocks the ability to choose between the classic Faiss GPU implementations and the newer NVIDIA cuVS algorithms, making it easy for users to switch between GPU and CPU.

Build time (95% recall@10)Index Embeddings100M x 96(seconds) Embeddings5M x 1536(seconds) Faiss Classic Faiss cuVS Faiss Classic Faiss cuVS Faiss Classic Faiss cuVS IVF Flat IVF Flat 101.4 37.9 (2.7x) 24.4 15.2 (1…

3 months назад @ engineering.fb.com
Introducing AutoPatchBench: A Benchmark for AI-Powered Security Fixes
Introducing AutoPatchBench: A Benchmark for AI-Powered Security Fixes Introducing AutoPatchBench: A Benchmark for AI-Powered Security Fixes

We are introducing AutoPatchBench, a benchmark for the automated repair of vulnerabilities identified through fuzzing.

As illustrated, fixing a fuzzing crash involves:Analyzing the crash stack trace and the target code.

Inside AutoPatchBenchWe’re making AutoPatchBench publicly available as part of CyberSecEval 4 to encourage community collaboration in tackling the challenge of automating fuzzing crash repairs.

Then we find the lowest common ancestor (LCA) across all pairs of stacktraces offered by the groundtruth patch and the LLM patch.

As an experienced Security Engineer at Meta, your task is to address the following security-critical fuzzing crash.

3 months, 1 week назад @ engineering.fb.com
Building multimodal AI for Ray-Ban Meta glasses
Building multimodal AI for Ray-Ban Meta glasses Building multimodal AI for Ray-Ban Meta glasses

With our Ray-Ban Meta glasses, multimodal AI helps the glasses see what the wearer is seeing.

This means anyone wearing Ray-Ban Meta glasses can ask them questions about what they’re looking at.

On this episode of the Meta Tech Podcast, meet Shane, a research scientist at Meta who has spent the last seven years focusing on computer vision and multimodal AI for wearables.

Shane sits down with Pascal Hartig to share how his team is building foundational models for the Ray-Ban Meta glasses.

They talk about the unique challenges of AI glasses and pushing the boundaries of AI-driven wearable technology.

5 months назад @ engineering.fb.com
Revolutionizing software testing: Introducing LLM-powered bug catchers
Revolutionizing software testing: Introducing LLM-powered bug catchers Revolutionizing software testing: Introducing LLM-powered bug catchers

WHAT IT ISMeta’s Automated Compliance Hardening (ACH) tool is a system for mutation-guided, LLM-based test generation.

Traditionally, automated test generation techniques sought merely to increase code coverage.

LLM-based test generation and LLM-based mutant generation are not new, but this is the first time they’ve been combined and deployed in large-scaled industrial systems.

WHAT’S NEXTOur novel approach combines LLM-based test generation and mutant generation to help automate complex technical organizational workflows in this space.

READ THE PAPERMutation-Guided LLM-based Test Generation at Meta

6 months назад @ engineering.fb.com
Meta Andromeda: Supercharging Advantage+ automation with the next-gen personalized ads retrieval engine
Meta Andromeda: Supercharging Advantage+ automation with the next-gen personalized ads retrieval engine Meta Andromeda: Supercharging Advantage+ automation with the next-gen personalized ads retrieval engine

Unlocking advertiser value through industry-leading ML innovationMeta Andromeda is a personalized ads retrieval engine that leverages the NVIDIA Grace Hopper Superchip, to enable cutting edge ML innovation in the Ads retrieval stage to drive efficiency and advertiser performance.

Its deployment across Instagram and Facebook applications has achieved +6% recall improvement to the retrieval system, delivering +8% ads quality improvement on selected segments.

Andromeda is designed to maximize ads performance by utilizing the exponential growth in volume of eligible ads available to the retrieval stage.

The design is optimized for AI hardware, minimizing memory bandwidth bottlenecks and enablin…

8 months, 1 week назад @ engineering.fb.com
Sequence learning: A paradigm shift for personalized ads recommendations
Sequence learning: A paradigm shift for personalized ads recommendations Sequence learning: A paradigm shift for personalized ads recommendations

Meta’s ad recommendation engine, powered by deep learning recommendation models (DLRMs), has been instrumental in delivering personalized ads to people.

Learning from sequences: developing new sequence learning architectures to replace traditional DLRM neural network architectures.

A paradigm shift with learning from sequences for recommendation systemsMeta’s new system for ads recommendations uses sequence learning at its core.

Scaling the new sequence learning paradigmFollowing the redesign to shift from sparse feature learning to event-based sequence learning, the next focus was scaling across two domains — scaling the sequence learning architecture and scaling event sequences to be long…

8 months, 3 weeks назад @ engineering.fb.com
OCP Summit 2024: The open future of networking hardware for AI
OCP Summit 2024: The open future of networking hardware for AI OCP Summit 2024: The open future of networking hardware for AI

At Open Compute Project Summit (OCP) 2024, we’re sharing details about our next-generation network fabric for our AI training clusters.

We’ve expanded our network hardware portfolio and are contributing two new disaggregated network fabrics and a new NIC to OCP.

At Meta, we believe that open hardware drives innovation.

At Meta, we envision a future of AI hardware systems that are not only scalable, but also open and collaborative.

We encourage anyone who wants to help advance the future of networking hardware for AI to engage with OCP and Meta to help share the future of AI infrastructure.

9 months, 3 weeks назад @ engineering.fb.com
Meta’s open AI hardware vision
Meta’s open AI hardware vision Meta’s open AI hardware vision

At the Open Compute Project (OCP) Global Summit 2024, we’re showcasing our latest open AI hardware designs with the OCP community.

These innovations include a new AI platform, cutting-edge open rack designs, and advanced network fabrics and components.

The open future of AI infraMeta is committed to open source AI.

We must also prioritize open and standardized models so we can leverage collective expertise, make AI more accessible, and work towards minimizing biases in our systems.​Just as important, we also need open AI hardware systems.

By addressing AI’s infrastructure needs together, we can unlock the true promise of open AI for everyone.​

9 months, 3 weeks назад @ engineering.fb.com
Uber Engineering
последний пост None
neptune.ai neptune.ai
последний пост 18 часов назад
Understanding Prompt Injection: Risks, Methods, and Defense Measures
Understanding Prompt Injection: Risks, Methods, and Defense Measures Understanding Prompt Injection: Risks, Methods, and Defense Measures

Prompt injection 101: When prompts go rogueThe term ‘Prompt Injection’ comes from SQL injection attacks.

There is another claim of the independent discovery of prompt injection attacks, which suggests that Riley Goodside publicly exhibited a prompt injection in a tweet back in September 2022.

The indirect prompt injection attacks are classified into active, passive, user-driven and virtual prompt attacks.

Virtual prompt injection attacksThis injection type is closely related to passive injection attacks previously described.

Prompt injection: current challenges & lessons learnedThe arms race between prompt injection attacks and defenses is a challenge for researchers, developers, and users.

18 часов назад @ neptune.ai
SabiYarn: Advancing Low-Resource Languages With Multitask NLP Pre-Training [Paper Reflections]
SabiYarn: Advancing Low-Resource Languages With Multitask NLP Pre-Training [Paper Reflections] SabiYarn: Advancing Low-Resource Languages With Multitask NLP Pre-Training [Paper Reflections]

This simple idea avoids computing loss on input prompt tokens the model already knows.

Prompt tokens are (too) expensive in low-resource settingsDuring pre-training, LLMs are trained in causal language modeling through a next-token prediction task.

=> Mo fẹ́ràn ìrẹsì,” the model is trained to predict every token, from the prompt to the actual answer:Step Prompt Next token 1 Translate English Static prompt 2 Translate English to Static prompt 3 Translate English to Yoruba: Static prompt 4 Translate English to Yoruba: I 5 Translate English to Yoruba: I love 6 Translate English to Yoruba: I love rice.

This is straightforward to implement in PyTorch by masking out the prompt tokens in the label …

6 days, 18 hours назад @ neptune.ai
How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models
How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models

What gradient issues occur during foundation model training?

During training, gradient descent updates model parameters by computing the gradients of the loss function via forward and backward passes.

The green line corresponds to a learning rate of 10, while the orange line has a learning rate of 0.1.

The gradient norm for the orange line with LR = 0.1 is very high in the first steps, while the gradient norm of the green line with LR = 10 diverges to NaN after a few steps.

Techniques for gradient stabilizationMonitoring gradient norms and training loss provides insights into the learning dynamics of the foundation models.

1 month назад @ neptune.ai
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection]
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection] STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection]

Unstructured pruning removes individual weights, while structured pruning removes entire model components.

In the context of MoEs, as expert structures from training MoEs correspond to such patterns, pruning experts is a natural fit for structured pruning.

Thus, structured pruning does not significantly decrease kurtosis, leaving plenty of margin for unstructured pruning.

Since structured pruning primarily reduces architectural redundancy rather than reshaping the underlying weight distribution, our two-phase approach—leveraging unstructured pruning after structured pruning—outperforms unstructured-only pruning.

Since STUN does not make any assumption about base MoE models, it is generaliza…

2 months назад @ neptune.ai
Evaluating RAG Pipelines
Evaluating RAG Pipelines Evaluating RAG Pipelines

Related Building LLM Applications With Vector Databases Read moreDimensions of RAG evaluationEvaluating a RAG pipeline means assessing its behavior across three dimensions:1.

The evaluation of the RAG pipeline is a multi-step process, starting with creating an evaluation dataset, then evaluating the individual components (retriever, generator, etc.

Curating an evaluation datasetThe first step in the RAG evaluation process is the creation of a ground truth dataset.

MAP considers both the presence and rank of relevant chunks but fails to consider the relative position of relevant chunks.

However, not all retrieved chunks are equally relevant and sometimes, the most relevant chunks might not b…

2 months, 3 weeks назад @ neptune.ai
How to Build an LLM Agent With AutoGen: Step-by-Step Guide
How to Build an LLM Agent With AutoGen: Step-by-Step Guide How to Build an LLM Agent With AutoGen: Step-by-Step Guide

The efficiency of an LLM agent depends on the selection of the right LLM model.

In this article, we’ll introduce the fundamental building blocks of LLM agents and then walk through the process of building an LLM agent step by step.

Building an LLM agent from scratchIn the following, we’ll build a trip-planning LLM agent from scratch.

Using AutoGen’s OpenAI Assistant Agent, we instantiate a prompt that the LLM agent will follow throughout its interactions.

Related Ethical Considerations and Best Practices in LLM Development Read moreEnhancing LLM agent performanceWhile architecting an LLM agent, you have to keep in mind opportunities to improve the performance of the LLM agent.

4 months, 2 weeks назад @ neptune.ai
Bayesian Deep Learning is Needed in the Age of Large-Scale AI [Paper Reflection]
Bayesian Deep Learning is Needed in the Age of Large-Scale AI [Paper Reflection] Bayesian Deep Learning is Needed in the Age of Large-Scale AI [Paper Reflection]

Moreover, I will make the case for why Bayesian deep learning can satisfy these desiderata and briefly review recent advances in the field.

The case for Bayesian deep learningBayesian deep learning uses the foundational statistical principles of Bayesian inference to endow deep learning systems with the ability to make probabilistic predictions.

However, Bayesian deep learning is unfortunately still not as easy to use as standard deep learning, which you can do these days in a few lines of PyTorch code.

If you want to use a Bayesian deep learning model, first, you have to think about specifying the prior.

If this is the case, trying out Bayesian deep learning is likely worth your while.

4 months, 3 weeks назад @ neptune.ai
Introduction to State Space Models as Natural Language Models
Introduction to State Space Models as Natural Language Models Introduction to State Space Models as Natural Language Models

TL;DR State Space Models (SSMs) use first-order differential equations to represent dynamic systems.

Understanding state space modelsBefore exploring how State Space Models (SSMs) can function as components of large language models (LLMs), we’ll examine their foundational mechanics.

State space models for natural language processingState Space Models (SSMs), long established in time series analysis, have been utilized as trainable sequence models for decades.

Linear state space layers (LSSLs)So far, we’ve seen that State Space Models are efficient sequence models.

Improvements on the state matrix AIn the previous section, we explored how the original LSSL relied on a fixed, predefined form …

5 months назад @ neptune.ai
Ethical Considerations and Best Practices in LLM Development
Ethical Considerations and Best Practices in LLM Development Ethical Considerations and Best Practices in LLM Development

To keep data secure throughout the model’s lifecycle, implement these practices: data anonymization, secure model serving and privacy penetration tests.

For example, a recruitment LLM favoring male applicants due to biased training data reflects a harmful bias that requires correction.

Monitor bias continuouslyMitigating bias isn’t a one-time effort—it requires ongoing monitoring to ensure that your LLM remains fair and effective across iterations.

Although these contributions are publicly available, the move opened up debates about the ethics of reusing community-contributed content for proprietary AI training.

Best practices for ethical LLM developmentNavigating the regulatory landscape r…

5 months, 1 week назад @ neptune.ai
Open LLMs are Necessary For Current Private Adaptations and Outperform Their Closed Alternatives [Paper Reflection]
Open LLMs are Necessary For Current Private Adaptations and Outperform Their Closed Alternatives [Paper Reflection] Open LLMs are Necessary For Current Private Adaptations and Outperform Their Closed Alternatives [Paper Reflection]

While much of the discussion around LLMs centers on task and computational performance, in our paper Open LLMs are Necessary for Current Private Adaptations and Outperform their Closed Alternatives, we focus on the privacy implications of using Open and Closed LLMs.

The threat space in adapting LLMs to private dataThe adaptation of Closed LLMs to private datasets introduces a multifaceted threat space.

Related Zero-Shot and Few-Shot Learning with LLMs Read morePrivate adaptation methods for Open LLMsUnlike Closed LLMs, Open LLMs provide access to their parameters, enabling more flexible and parameter-centric private adaptation methods.

Performance: All adaptation methods for Closed LLMs ach…

5 months, 2 weeks назад @ neptune.ai
Learnings From Teams Training Large-Scale Models: Challenges and Solutions For Monitoring at Hyperscale
Learnings From Teams Training Large-Scale Models: Challenges and Solutions For Monitoring at Hyperscale Learnings From Teams Training Large-Scale Models: Challenges and Solutions For Monitoring at Hyperscale

“What is not measured, cannot be improved.” This quote has become a guiding principle for teams training foundation models.

During my talk at NeurIPS, I broke down five key lessons learned from teams facing large-scale model training and monitoring.

Waabi’s teams, running large-scale ML experiments, needed a way to organize and share their experiment data efficiently.

Visualizing large datasetsWe generally do not think of dataset visualization as part of experiment monitoring.

Moving forwardThe path to efficient hyperscale training lies in combining robust monitoring, advanced debugging tools, and comprehensive experiment tracking.

5 months, 3 weeks назад @ neptune.ai
Mixture of Experts LLMs: Key Concepts Explained
Mixture of Experts LLMs: Key Concepts Explained Mixture of Experts LLMs: Key Concepts Explained

TL;DR Mixture of Experts (MoE) is a type of neural network architecture that employs sub-networks (experts) to process specific input parts.

This is the key idea behind Mixture of Expert LLMs.

The Switch-Language Transformer, Mixtral, GLaM, GShard, and DeepSeekMoE are Mixture of Experts LLMs (MoEs), which require only executing a portion of the model’s computational graph during inference.

Optimization strategies for MoE LLMs are discussed comprehensively in the papers introducing the Switch Transformer, GShard, and GLaM.

Mixture of Experts (MoE) is an approach to scaling LLMs to trillions of parameters with conditional computation while avoiding exploding computational costs.

6 months назад @ neptune.ai
Hyperparameter Optimization For LLMs: Advanced Strategies
Hyperparameter Optimization For LLMs: Advanced Strategies Hyperparameter Optimization For LLMs: Advanced Strategies

Advanced hyperparameter optimization strategies, like population-based training, Bayesian optimization, and adaptive LoRA, promise to balance computational effort and outcome.

To avoid this, learning rate schedules for LLMs start with a small learning rate and slowly ramp it up to its maximum value.

Can we use traditional machine learning hyperparameter optimization methods for LLMs?

| Modified based on: sourceHands-on: LLM hyperparameter optimization with neptune.aiOptuna is a framework for optimizing hyperparameter search using Bayesian optimization.

See the docs or watch a short product demo (2 min)Play with a live Neptune Scale projectRequest your early accessWhat’s next in LLM hyperpar…

6 months, 1 week назад @ neptune.ai
Multimodal Large Language Models
Multimodal Large Language Models Multimodal Large Language Models

TL;DR Multimodal Large Language Models (MLLMs) process data from different modalities like text, audio, image, and video.

This article explores Multimodal Large Language Models, exploring their core functionalities, challenges, and potential for various machine-learning domains.

Let’s break down the concept of Multimodal Large Language Models (MLLMs) by first understanding the terms “modal” and “multimodal:”“Modal” refers to a particular way of communicating or perceiving information.

| SourceGoogle: PaLM-EGoogle developed an embodied language model, PaLM-E, to incorporate continuous sensor modalities into language models and establish the link between words and perceptions.

Improving how t…

6 months, 2 weeks назад @ neptune.ai
How to Build and Evaluate a RAG System Using LangChain, Ragas, and neptune.ai
How to Build and Evaluate a RAG System Using LangChain, Ragas, and neptune.ai How to Build and Evaluate a RAG System Using LangChain, Ragas, and neptune.ai

In this guide, we’ll show you how to build a RAG system using the LangChain framework, evaluate its performance using Ragas, and track your experiments with neptune.ai.

Part 1: Building a baseline RAG system with LangChainIn the first part of this guide, we’ll use LangChain to build a RAG system for the blog posts in the LLMOps category on Neptune’s blog.

Ragas works smoothly with LangChain, making it a great choice for evaluating our RAG system.

Step 1: Generate a RAG evaluation datasetAn evaluation set for RAG tasks is similar to a question-answering task dataset.

Step 2: Choose RAG evaluation metricsAs mentioned earlier, Ragas offers both LLM-based and non-LLM-based metrics for RAG syste…

7 months, 2 weeks назад @ neptune.ai
▶️ YouTube
Yannic Kilcher Yannic Kilcher
последний пост 2 weeks, 1 day назад
Context Rot: How Increasing Input Tokens Impacts LLM Performance (Paper Analysis)
Context Rot: How Increasing Input Tokens Impacts LLM Performance (Paper Analysis) Context Rot: How Increasing Input Tokens Impacts LLM Performance (Paper Analysis)

Paper: https://research.trychroma.com/context-rot Abstract:

Large Language Models (LLMs) are typically presumed to process context uniformly—that is, the model should handle the 10,000th token just as reliably as the 100th. However, in practice, this assumption does not hold. We observe that model performance varies significantly as input length changes, even on simple tasks.

In this report, we evaluate 18 LLMs, including the state-of-the-art GPT-4.1, Claude 4, Gemini 2.5, and Qwen3 models. Our results reveal that models do not use their context uniformly; instead, their performance grows increasingly unreliable as input length grows. Authors: Kelly Hong, Anton Troynikov, Jeff Huber Links:

2 weeks, 1 day назад @ youtube.com
Energy-Based Transformers are Scalable Learners and Thinkers (Paper Review)
Energy-Based Transformers are Scalable Learners and Thinkers (Paper Review) Energy-Based Transformers are Scalable Learners and Thinkers (Paper Review)

Paper: https://arxiv.org/abs/2507.02092

Code: https://github.com/alexiglad/EBT

Website: https://energy-based-transformers.github.io/ Abstract:

Inference-time computation techniques, analogous to human System 2 Thinking, have recently become popular for improving model performances. However, most existing approaches suffer from several limitations: they are modality-specific (e.g., working only in text), problem-specific (e.g., verifiable domains like math and coding), or require additional supervision/training on top of unsupervised pretraining (e.g., verifiers or verifiable rewards). In this paper, we ask the question "Is it possible to generalize these System 2 Thinking approaches, and de…

2 weeks, 5 days назад @ youtube.com
On the Biology of a Large Language Model (Part 2)
On the Biology of a Large Language Model (Part 2) On the Biology of a Large Language Model (Part 2)

An in-depth look at Anthropic's Transformer Circuit Blog Post

Part 1 here: https://youtu.be/mU3g2YPKlsA

Discord here: https;//ykilcher.com/discord https://transformer-circuits.pub/2025/attribution-graphs/biology.html Abstract:

We investigate the internal mechanisms used by Claude 3.5 Haiku — Anthropic's lightweight production model — in a variety of contexts, using our circuit tracing methodology. Authors:

Jack Lindsey†, Wes Gurnee*, Emmanuel Ameisen*, Brian Chen*, Adam Pearce*, Nicholas L. Turner*, Craig Citro*,

David Abrahams, Shan Carter, Basil Hosmer, Jonathan Marcus, Michael Sklar, Adly Templeton,

Trenton Bricken, Callum McDougall◊, Hoagy Cunningham, Thomas Henighan, Adam Jermyn, Andy …

3 months назад @ youtube.com
On the Biology of a Large Language Model (Part 1)
On the Biology of a Large Language Model (Part 1) On the Biology of a Large Language Model (Part 1)

An in-depth look at Anthropic's Transformer Circuit Blog Post https://transformer-circuits.pub/2025/attribution-graphs/biology.html Abstract:

We investigate the internal mechanisms used by Claude 3.5 Haiku — Anthropic's lightweight production model — in a variety of contexts, using our circuit tracing methodology. Authors:

Jack Lindsey†, Wes Gurnee*, Emmanuel Ameisen*, Brian Chen*, Adam Pearce*, Nicholas L. Turner*, Craig Citro*,

David Abrahams, Shan Carter, Basil Hosmer, Jonathan Marcus, Michael Sklar, Adly Templeton,

Trenton Bricken, Callum McDougall◊, Hoagy Cunningham, Thomas Henighan, Adam Jermyn, Andy Jones, Andrew Persic, Zhenyi Qi, T. Ben Thompson,

Sam Zimmerman, Kelley Rivoire, Thom…

4 months назад @ youtube.com
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (Paper Explained)
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (Paper Explained) DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (Paper Explained)

#deepseek #llm #reinforcementlearning GRPO is one of the core advancements used in Deepseek-R1, but was introduced already last year in this paper that uses a combination of new RL techniques and iterative data collection to achieve remarkable performance on mathematics benchmarks with just a 7B model. Paper: https://arxiv.org/abs/2402.03300 Abstract:

Mathematical reasoning poses a significant challenge for language models due to its complex and structured nature. In this paper, we introduce DeepSeekMath 7B, which continues pre-training DeepSeek-Coder-Base-v1.5 7B with 120B math-related tokens sourced from Common Crawl, together with natural language and code data. DeepSeekMath 7B has achie…

6 months, 1 week назад @ youtube.com
Traditional Holiday Live Stream
Traditional Holiday Live Stream Traditional Holiday Live Stream

https://ykilcher.com/discord Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https:/…

7 months, 2 weeks назад @ youtube.com
Byte Latent Transformer: Patches Scale Better Than Tokens (Paper Explained)
Byte Latent Transformer: Patches Scale Better Than Tokens (Paper Explained) Byte Latent Transformer: Patches Scale Better Than Tokens (Paper Explained)

#tokenization #llm #meta This paper does away with tokenization and creates an LLM architecture that operates on dynamically sized "patches" instead of tokens. By controlling the patch size, they gain a level of control over the tradeoff between model size and FLOPs and use that to achieve more favorable scaling behavior than classically tokenized LLMs. Paper: https://ai.meta.com/research/publications/byte-latent-transformer-patches-scale-better-than-tokens/

Code: https://github.com/facebookresearch/blt Abstract:

We introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale with significant imp…

7 months, 2 weeks назад @ youtube.com
Safety Alignment Should be Made More Than Just a Few Tokens Deep (Paper Explained)
Safety Alignment Should be Made More Than Just a Few Tokens Deep (Paper Explained) Safety Alignment Should be Made More Than Just a Few Tokens Deep (Paper Explained)

This paper demonstrates in a series of experiments that current safety alignment techniques of LLMs, as well as corresponding jailbreaking attacks, are in large part focusing on modulating the distribution of the first few tokens of the LLM response. Paper: https://openreview.net/forum?id=6Mxhg9PtDE&s=09 Abstract:

The safety alignment of current Large Language Models (LLMs) is vulnerable. Simple attacks, or even benign fine-tuning, can jailbreak aligned models. We note that many of these vulnerabilities are related to a shared underlying issue: safety alignment can take shortcuts, wherein the alignment adapts a model's generative distribution primarily over only its very first few output to…

8 months назад @ youtube.com
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (Paper Explained)
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (Paper Explained) TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (Paper Explained)

A deep dive into the TokenFormer and an opinion about its impact, novelty, and relation to prior work. Paper: https://arxiv.org/abs/2410.23168 Abstract:

Transformers have become the predominant architecture in foundation models due to their excellent performance across various domains. However, the substantial cost of scaling these models remains a significant concern. This problem arises primarily from their dependence on a fixed number of parameters within linear projections. When architectural modifications (e.g., channel dimensions) are introduced, the entire model typically requires retraining from scratch. As model sizes continue growing, this strategy results in increasingly high com…

8 months, 2 weeks назад @ youtube.com
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

This paper (by Apple) questions the mathematical reasoning abilities of current LLMs and designs a synthetic template-based dataset distribution to investigate various aspects around LLM performance of high-school level math questions. Paper: https://arxiv.org/abs/2410.05229 Abstract:

Recent advancements in Large Language Models (LLMs) have sparked interest in their formal reasoning capabilities, particularly in mathematics. The GSM8K benchmark is widely used to assess the mathematical reasoning of models on grade-school-level questions. While the performance of LLMs on GSM8K has significantly improved in recent years, it remains unclear whether their mathematical reasoning capabilities hav…

9 months, 3 weeks назад @ youtube.com
Were RNNs All We Needed? (Paper Explained)
Were RNNs All We Needed? (Paper Explained) Were RNNs All We Needed? (Paper Explained)

This paper posits the interesting question: How much of the performance of Mamba, S4, and other state-space-like models is actually just attributable to some very core concepts - rather than their elaborate architectures. The authors construct minimal versions of GRUs and LSTMs and report competitive performance. Paper: https://arxiv.org/abs/2410.01201 Abstract:

The scalability limitations of Transformers regarding sequence length have renewed interest in recurrent sequence models that are parallelizable during training. As a result, many novel recurrent architectures, such as S4, Mamba, and Aaren, have been proposed that achieve comparable performance. In this work, we revisit traditional …

9 months, 4 weeks назад @ youtube.com
Henry AI Labs Henry AI Labs
последний пост None
3blue1brown 3blue1brown
последний пост 1 week, 6 days назад
But how do AI videos actually work? | Guest video by @WelchLabsVideo
But how do AI videos actually work? | Guest video by @WelchLabsVideo But how do AI videos actually work? | Guest video by @WelchLabsVideo

Diffusion models, CLIP, and the math of turning text into images

Welch Labs Book: https://www.welchlabs.com/resources/imaginary-numbers-book Sections

0:00 - Intro

3:37 - CLIP

6:25 - Shared Embedding Space

8:16 - Diffusion Models & DDPM

11:44 - Learning Vector Fields

22:00 - DDIM

25:25 Dall E 2

26:37 - Conditioning

30:02 - Guidance

33:39 - Negative Prompts

34:27 - Outro

35:32 - About guest videos + Grant’s Reaction Special Thanks to:

Jonathan Ho - Jonathan is the Author of the DDPM paper and the Classifier Free Guidance Paper.

https://arxiv.org/pdf/2006.11239

https://arxiv.org/pdf/2207.12598 Preetum Nakkiran - Preetum has an excellent introductory diffusion tutorial:

https://arxiv.org/pdf/24…

1 week, 6 days назад @ youtube.com
Summer of Math Exposition #4 | Teachers, I'd love to hear from you
Summer of Math Exposition #4 | Teachers, I'd love to hear from you Summer of Math Exposition #4 | Teachers, I'd love to hear from you

Make a math explainer, get feedback, and receive prizes: https://some.3b1b.co

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos. ------------------ These animations are largely made using a custom Python library, manim. See the FAQ comments here:

https://3b1b.co/faq#manim

https://github.com/3b1b/manim

https://github.com/ManimCommunity/manim/ All code for specific videos is visible here:

https://github.com/3b1b/videos/ The music is by Vincent Rubinetti.

https://www.vincentrubinetti.com

https://vincerubinetti.bandcamp.com/album/the-music-of-3blue1brown

https://open.spotify.com/…

3 months назад @ youtube.com
Where my explanation of Grover’s algorithm failed
Where my explanation of Grover’s algorithm failed Where my explanation of Grover’s algorithm failed

Addressing viewer questions from the last video.

These lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to share the videos. ------------------ These animations are largely made using a custom Python library, manim. See the FAQ comments here:

https://3b1b.co/faq#manim

https://github.com/3b1b/manim

https://github.com/ManimCommunity/manim/ All code for specific videos is visible here:

https://github.com/3b1b/videos/ The music is by Vincent Rubinetti.

https://www.vincentrubinetti.com

https://vincerubinetti.bandcamp.com/album/the-music-of-3blue1brown

https://open.spotify.com/album/1dVyjwS8FBqXhRunaG5W5u ------------------ 3blue1brown is a ch…

3 months назад @ youtube.com
But what is Quantum Computing? (Grover's Algorithm)
But what is Quantum Computing?  (Grover's Algorithm) But what is Quantum Computing? (Grover's Algorithm)

Qubits, state vectors, and Grover's algorithm for search.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to share the videos. The subtitles on this video were done using AI, and are likely imperfect, but they are open for community corrections at https://criblate.com/ Adam Brown's paper on the connection between Grover's Algorithm and block collisions:

https://arxiv.org/pdf/1912.02207 If you want to learn the relevant underlying quantum mechanics here, a very friendly resource is the course Mithuna at Looking Glass Universe is currently putting together. See, for instance, this explainer of a qubit:…

3 months, 1 week назад @ youtube.com
Testing your intuition for quantum computing
Testing your intuition for quantum computing Testing your intuition for quantum computing

Full video: https://youtu.be/RQWpF2Gb-gU

3 months, 1 week назад @ youtube.com
How to measure nearby galaxies
How to measure nearby galaxies How to measure nearby galaxies

From this video: https://youtu.be/hFMaT9oRbs4

3 months, 2 weeks назад @ youtube.com
Measuring the distance to Venus without radar
Measuring the distance to Venus without radar Measuring the distance to Venus without radar

From this video with Terry Tao: https://youtu.be/hFMaT9oRbs4

4 months назад @ youtube.com
Measuring the speed of light using Jupiter's moons
Measuring the speed of light using Jupiter's moons Measuring the speed of light using Jupiter's moons

From this video with Terry Tao: https://youtu.be/hFMaT9oRbs4

4 months назад @ youtube.com
The tragic tale of Guillaume Le Gentil
The tragic tale of Guillaume Le Gentil The tragic tale of Guillaume Le Gentil

From this video: https://youtu.be/hFMaT9oRbs4 Artwork by Kurt Bruns

4 months, 1 week назад @ youtube.com
Zooming out by powers of 10
Zooming out by powers of 10 Zooming out by powers of 10

From this video: https://youtu.be/YdOXS_9_P4U

4 months, 2 weeks назад @ youtube.com
There's more to those colliding blocks that compute pi
There's more to those colliding blocks that compute pi There's more to those colliding blocks that compute pi

Two colliding blocks compute pi, here we dig into the physics to explain why

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos. The original paper by Gregory Galperin:

https://www.maths.tcd.ie/~lebed/Galperin.%20Playing%20pool%20with%20pi.pdf Adam Brown's paper on the analogy with Grover's Algorithm:

https://arxiv.org/pdf/1912.02207 Here's a lovely interactive built by GitHub user prajwalsouza after watching this video: https://prajwalsouza.github.io/Experiments/Colliding-Blocks.html Matt Parker's Pi Day video:

https://youtu.be/vlUTlbZT4ig NY Times blog post about this proble…

4 months, 3 weeks назад @ youtube.com
When being beautifully wrong leads to discovery
When being beautifully wrong leads to discovery When being beautifully wrong leads to discovery

Full video: https://youtu.be/YdOXS_9_P4U

5 months, 1 week назад @ youtube.com
Why the ancient Greek's rejected heliocentrism
Why the ancient Greek's rejected heliocentrism Why the ancient Greek's rejected heliocentrism

From this video on the cosmic distance ladder: https://youtu.be/YdOXS_9_P4U

5 months, 1 week назад @ youtube.com
How to estimate the distance to the sun
How to estimate the distance to the sun How to estimate the distance to the sun

Full video: https://youtu.be/YdOXS_9_P4U

5 months, 1 week назад @ youtube.com
How Aristarchus deduced the distance to the moon
How Aristarchus deduced the distance to the moon How Aristarchus deduced the distance to the moon

Full video: https://youtu.be/YdOXS_9_P4U

5 months, 1 week назад @ youtube.com
Two Minute Papers Two Minute Papers
последний пост 20 часов назад
OpenAI’s New Free AI: The Good, The Bad, The Unexpected!
OpenAI’s New Free AI: The Good, The Bad, The Unexpected! OpenAI’s New Free AI: The Good, The Bad, The Unexpected!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide:

Rent one of their GPU's with over 16GB of VRAM

Open a terminal

Just get Ollama with this command - https://ollama.com/download/linux

Then run ollama run gpt-oss:120b - https://ollama.com/library/gpt-oss:120b Try it online:

https://gpt-oss.com/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 Humanity's Last Exam:

https://agi.safe.ai/ Sources:

https://x.com/flavioad/status/1952792389636198489

https://x.com/kwindla/status/1952947685…

20 часов назад @ youtube.com
New Game AI Turns Photos Into Playable Worlds! | Celebrating 10 Years Of Papers! 🎂
New Game AI Turns Photos Into Playable Worlds!  | Celebrating 10 Years Of Papers! 🎂 New Game AI Turns Photos Into Playable Worlds! | Celebrating 10 Years Of Papers! 🎂

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 The paper is available here:

https://hunyuan-gamecraft.github.io/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin,…

2 days, 12 hours назад @ youtube.com
The Forgotten Research That Fixed The Worst Physics Bug!
The Forgotten Research That Fixed The Worst Physics Bug! The Forgotten Research That Fixed The Worst Physics Bug!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 The paper is available here:

https://graphics.cs.utah.edu/research/projects/merging-and-splitting/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 Video game glitch: https://www.youtube.com/watch?v=fZgRVatBXTE 🙏 We would like to thank our generous…

4 days, 13 hours назад @ youtube.com
This AI Learns Faster Than Anything We’ve Seen!
This AI Learns Faster Than Anything We’ve Seen! This AI Learns Faster Than Anything We’ve Seen!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 "Genesis: A Generative and Universal Physics Engine for Robotics and Beyond" is available here:

https://genesis-embodied-ai.github.io/ Tech report direct link: https://placid-walkover-0cc.notion.site/genesis-performance-benchmarking Criticism: https://stoneztao.substack.com/p/the-new-hyped-genesis-simulator-is

Their answer to criticism: https://placid-walkover-0cc.notion.site/genesis-performance-benchmarking (at 4.1 Res…

1 week, 5 days назад @ youtube.com
Blender 4.5 - How My Dream Just Came True!
Blender 4.5 - How My Dream Just Came True! Blender 4.5 - How My Dream Just Came True!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video My part at the official Blender 4.5 announcement video:

https://www.youtube.com/watch?v=wPhA0imjvVs&t=1474s Get Blender: https://www.blender.org/

Demo files: https://www.blender.org/download/demo-files/

Tutorial: https://www.youtube.com/watch?v=4haAdmHqGOw Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supp…

2 weeks, 6 days назад @ youtube.com
The “Biggest” AI That Came Out Of Nowhere!
The “Biggest” AI That Came Out Of Nowhere! The “Biggest” AI That Came Out Of Nowhere!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video Kimi K2:

https://moonshotai.github.io/Kimi-K2/

API: https://platform.moonshot.ai Run it yourself locally: https://x.com/unslothai/status/1944780685409165589 Sources:

https://x.com/chetaslua/status/1943681568549052458

https://x.com/satvikps/status/1944861384573169929 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clicka…

3 weeks, 2 days назад @ youtube.com
Roblox Solved The Physics Problem That Stumped Everyone!
Roblox Solved The Physics Problem That Stumped Everyone! Roblox Solved The Physics Problem That Stumped Everyone!

❤️ Check out Vast.ai and run DeepSeek or any AI project: https://vast.ai/papers 📝 The paper is available here:

https://graphics.cs.utah.edu/research/projects/avbd/ Play with it!

https://graphics.cs.utah.edu/research/projects/avbd/avbd_demo2d.html 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder,…

3 weeks, 4 days назад @ youtube.com
NVIDIA’s New AI Cheated At Parkour…And Got Fixed!
NVIDIA’s New AI Cheated At Parkour…And Got Fixed! NVIDIA’s New AI Cheated At Parkour…And Got Fixed!

❤️ Check out DeepInfra and run DeepSeek or many other AI projects: https://deepinfra.com/papers 📝 The paper "PARC: Physics-based Augmentation with Reinforcement Learning for Character Controllers" is available here:

https://michaelx.io/parc/index.html 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Te…

1 month назад @ youtube.com
Unreal Engine 5.6: Outrageously Good!
Unreal Engine 5.6: Outrageously Good! Unreal Engine 5.6: Outrageously Good!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video Get Unreal Engine 5.6:

https://www.unrealengine.com/en-US/news/unreal-engine-5-6-is-now-available Substrate material source + tutorial : https://www.youtube.com/watch?v=YDeptWduHNc 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to t…

1 month, 1 week назад @ youtube.com
NVIDIA’s New AI Watched 150,000 Videos! What Did It Learn?
NVIDIA’s New AI Watched 150,000 Videos! What Did It Learn? NVIDIA’s New AI Watched 150,000 Videos! What Did It Learn?

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 The paper is available here:

https://research.nvidia.com/labs/toronto-ai/UniRelight/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Apple Music Sing demonstration source: https://www.youtube.com/watch?v=Q6Qpsvwh6mQ Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our ge…

1 month, 2 weeks назад @ youtube.com
OpenAI’s o3 Pro: Crushing The AI Game Test! 🎮
OpenAI’s o3 Pro: Crushing The AI Game Test! 🎮 OpenAI’s o3 Pro: Crushing The AI Game Test! 🎮

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 Paper+code: https://github.com/lmgame-org/GamingAgent

Some results: https://huggingface.co/spaces/lmgame/lmgame_bench

Try it out: https://lmgame.org 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supp…

1 month, 2 weeks назад @ youtube.com
NVIDIA’s New AI Grows Stuff Out Of Nothing!
NVIDIA’s New AI Grows Stuff Out Of Nothing! NVIDIA’s New AI Grows Stuff Out Of Nothing!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 The papers are available here:

https://research.nvidia.com/labs/toronto-ai/stochastic-preconditioning/

https://zju3dv.github.io/freetimegs/ Play with it (interactive viewer): https://www.4dv.ai/viewer/salmon_10s?showdemo=4dv 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/ar…

1 month, 3 weeks назад @ youtube.com
Google’s New AI: This Isn’t a Photo - But It Is!
Google’s New AI: This Isn’t a Photo - But It Is! Google’s New AI: This Isn’t a Photo - But It Is!

❤️ Check out the Fully Connected conference from Weights & Biases on June 17-18th in SF:

https://wandb.me/fc2025

Use the code FCSF2WP to get a ticket for free! 📝 The paper "Practical Inverse Rendering of Textured and Translucent Appearance" is available here:

https://weiphil.github.io/portfolio/practical_reconstruction 📝 Separable Subsurface Scattering:

https://users.cg.tuwien.ac.at/zsolnai/gfx/separable-subsurface-scattering-with-activision-blizzard/ Free rendering course!

https://users.cg.tuwien.ac.at/zsolnai/gfx/rendering-course/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Jua…

1 month, 3 weeks назад @ youtube.com
NVIDIA’s New AI: Next Level Games Are Coming!
NVIDIA’s New AI: Next Level Games Are Coming! NVIDIA’s New AI: Next Level Games Are Coming!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The papers are available here:

https://research.nvidia.com/labs/toronto-ai/difix3d/

https://sites.google.com/view/cast4

https://syntec-research.github.io/UVGA/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedd…

1 month, 4 weeks назад @ youtube.com
Microsoft’s New AI: Ray Tracing 16,000,000 Images!
Microsoft’s New AI: Ray Tracing 16,000,000 Images! Microsoft’s New AI: Ray Tracing 16,000,000 Images!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 The paper "RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination" is available here:

https://microsoft.github.io/renderformer/ 📝 Our neural rendering paper "Gaussian Material Synthesis" is available here:

https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, …

2 months назад @ youtube.com
DataFest Video DataFest Video
последний пост None
Семинары JetBrains Research Семинары JetBrains Research
последний пост None
Яндекс. Компьютерные науки Яндекс. Компьютерные науки
последний пост 1 week, 3 days назад
Мы будем меньше писать код руками — и это хорошо
Мы будем меньше писать код руками — и это хорошо Мы будем меньше писать код руками — и это хорошо

В этом году конференция Data Fest отмечает юбилей — 10 лет. За это время машинное обучение добралось до многих сфер нашей жизни. Нам стало интересно, как развитие ML изменит привычную реальность ещё через 10 лет. Вот мы и спросили об этом у гостей конференции! Полное видео ищите на нашем канале. #автоматизация #lowcode #nocode #будущееразработки #нейросети #AIвразработке #программирование #ИИ #разработчики #машинноеобучение #генерациякода #технологии

1 week, 3 days назад @ youtube.com
Как ИИ и нейросети заставят нас БОЛЬШЕ думать
Как ИИ и нейросети заставят нас БОЛЬШЕ думать Как ИИ и нейросети заставят нас БОЛЬШЕ думать

В этом году конференция Data Fest отмечает юбилей — 10 лет. За это время машинное обучение добралось до многих сфер нашей жизни. Нам стало интересно, как развитие ML изменит привычную реальность ещё через 10 лет. Вот мы и спросили об этом у гостей конференции! Полное видео ищите на нашем канале. #искусственныйинтеллект #нейросети #машинноеобучение #DataFest #будущеетехнологий #AI #ИИ #глубокообучение #технологиибудущего #GPT #этикаИИ #развитиеИИ #автоматизация

2 weeks, 2 days назад @ youtube.com
Как мы учим Алису видеть объекты
Как мы учим Алису видеть объекты Как мы учим Алису видеть объекты

Это фрагмент выступления Дарьи Виноградовой, руководителя команды стримингового зрения в Яндексе. На Data Fest Дарья рассказала о VLM в Алисе. #Алиса #Яндекс #нейросети #ИИ #технологии #VLM #компьютерноезрение #голосовойассистент #визуальныйинтеллект #AI #умныеустройства #обучениеИИ #распознаваниеобъектов #мультимодальность

3 weeks, 1 day назад @ youtube.com
VLM: как видит мир Алиса от Яндекса
VLM: как видит мир Алиса от Яндекса VLM: как видит мир Алиса от Яндекса

Это фрагмент выступления Дарьи Виноградовой, руководителя команды стримингового зрения в Яндексе. На Data Fest Дарья рассказала о VLM в Алисе. #Алиса #Яндекс #нейросети #искусственныйинтеллект #VLM #AI #технологии #умныеустройства #мультимодальность #АлисаЯндекс #голосовойассистент #визуальноеAI #ЯндексАлиса #VLMтехнологии #ИИ

3 weeks, 5 days назад @ youtube.com
VLM: как нейросети связывают картинку и текст
VLM: как нейросети связывают картинку и текст VLM: как нейросети связывают картинку и текст

Это фрагмент выступления Дарьи Виноградовой, руководителя команды стримингового зрения в Яндексе. На Data Fest Дарья рассказала о VLM в Алисе. #VLM #нейросети #ИИ #искусственныйинтеллект #мультимодальность #AI #визуальныйИИ #нейросетитексткартинка #компьютерноеЗрение #моделиЯндекса #умныеустройства #технологиибудущего #визуальноятехнология #machinelearning #deeplearning #языкитехнологии #AIЯндекс #модельпокартинке #LLM #NLP #CV

1 month назад @ youtube.com
Что за Муза в Алисе от Яндекса
Что за Муза в Алисе от Яндекса Что за Муза в Алисе от Яндекса

Это фрагмент выступления Дарьи Виноградовой, руководителя команды стримингового зрения в Яндексе. На Data Fest Дарья рассказала о VLM в Алисе. #Алиса #Муза #Яндекс #VLM #DataFest #ИИ #нейросети #AI #умныеустройства #визуальныемодели #визуальныйИИ #ЯндексТехнологии #визуальноятехнология #ассистентыбудущего #технологии #компьютерноеЗрение #LLM #интеллектуальныйассистент #стриминговоезрение #нейроассистент #техконференция

1 month, 1 week назад @ youtube.com
Как прогнозировать тысячи временных рядов и не сходить с ума / Александр Исаков
Как прогнозировать тысячи временных рядов и не сходить с ума / Александр Исаков Как прогнозировать тысячи временных рядов и не сходить с ума / Александр Исаков

Александр Исаков, руководитель группы прогнозирования в Яндекс Лавке, прочитал этот доклад на конференции AHA!25. Александр рассказал о создании и внедрении системы среднесрочного прогнозирования для Яндекс Лавки. Эта система точно прогнозирует ключевую бизнес-метрику (заказы) на полугодовой период и помогает планировать ресурсы в условиях быстрого роста сервиса и высокой волатильности спроса. Александр подробно остановился на том, как ребята решили проблему волатильности временных рядов с помощью приведения данных к стабильному виду и выделения ключевых влияющих факторов, что позволило не только улучшить прогноз, но и объяснить отклонения от плана. А ещё рассказал, как они встроили прогноз…

1 month, 1 week назад @ youtube.com
Как продукты на основе LLM повышают эффективность сотрудников и экономят миллионы / Эльвира Морозова
Как продукты на основе LLM повышают эффективность сотрудников и экономят миллионы / Эльвира Морозова Как продукты на основе LLM повышают эффективность сотрудников и экономят миллионы / Эльвира Морозова

Эльвира Морозова, ХХ в Яндексе, прочитала этот доклад на конференции AHA!25. Эльвира рассказала, как ребята внедрили YaGPT в формате подсказок оператора в одну из поддержек Яндекса. И в итоге получили доказанный в A/B-тестах экстраэффект как на скорости работы операторов, так и на качестве ответов. Эльвира показала, почему «экзоскелет» из GPT для оператора поддержки — это новая норма. Спойлер: применение GPT в саппорте сейчас приносит 15% чистой экономии в Яндекс Маркете. Больше классных материалов ищите в телеграм-канале Yandex for ML: https://t.me/yandexforml #GPT #Яндекс #поддержка #аналитика #AHA25 #YaGPT #искусственныйинтеллект #YandexForAnalytics #LLM #AI

1 month, 1 week назад @ youtube.com
Будущее рекомендательных систем / Николай Савушкин
Будущее рекомендательных систем / Николай Савушкин Будущее рекомендательных систем / Николай Савушкин

Николай Савушкин, руководитель службы рекомендательных технологий в Яндексе, прочитал этот доклад на конференции AHA!25. Николай рассказал, как персонализация устроена сейчас, как большие генеративные модели внедряют в рекомендательные системы Яндекса и что ждёт нас в будущем. Доклад направлен на широкую аудиторию с техническим бэкграундом, знакомит с основными понятиями и показывает технологические тренды. Больше классных материалов ищите в телеграм-канале Yandex for ML: https://t.me/yandexforml #персонализация #рекомендации #машинноеобучение #генеративныемодели #Яндекс #AI #LLM #AHA25 #аналитика #YandexForAnalytics

1 month, 2 weeks назад @ youtube.com
Опыт внедрения ML-прогнозов в систему динамического ценообразования Яндекс Доставки / Андрей Нарцев
Опыт внедрения ML-прогнозов в систему динамического ценообразования Яндекс Доставки / Андрей Нарцев Опыт внедрения ML-прогнозов в систему динамического ценообразования Яндекс Доставки / Андрей Нарцев

Андрей Нарцев, руководитель Applied ML в Яндекс Доставке, прочитал этот доклад на конференции AHA!25. Андрей разобрал ключевые аналитические, продуктовые и ML-вызовы, с которыми ребята столкнулись при внедрении динамического ценообразования для замедленных тарифов. А ещё поделился инсайтами и полезными практиками, которые помогли сделать ценообразование ещё более адаптивным. После просмотра этого доклада вы будете лучше понимать сложности внедрения ML-моделей в продукт, научитесь заранее предотвращать потенциальные проблемы и применять представленные подходы в своей работе. Больше классных материалов ищите в телеграм-канале Yandex for ML: https://t.me/yandexforml #ML #ценообразование #Яндек…

1 month, 2 weeks назад @ youtube.com
Time Capsule от Data Fest: как технологии изменят всё
Time Capsule от Data Fest: как технологии изменят всё Time Capsule от Data Fest: как технологии изменят всё

В этом году конференция Data Fest отмечала юбилей — 10 лет. За это время машинное обучение проникло во многие сферы нашей жизни, в том числе в рабочие процессы. Нам стало интересно: как развитие ML изменит нашу привычную реальность ещё через 10 лет? Мы спросили об этом у гостей конференции, и вот что из этого вышло! #DataFest #машинноеобучение #ML #искусственныйинтеллект #будущеетехнологий #нейросети #цифровоебудущее #технологии #Яндекс #AI

1 month, 3 weeks назад @ youtube.com
Fast and Approximate Responses / Артур Соловьёв
Fast and Approximate Responses / Артур Соловьёв Fast and Approximate Responses / Артур Соловьёв

Это выступление на Data Fest в гостях у Яндекса в секции OptimalDL. Артур Соловьёв прочитал доклад на английском языке на тему: Fast and Approximate Responses. Узнать больше о мероприятиях для разработчиков можно тут: https://events.yandex.ru Подписывайтесь на телеграм-канал Яндекса для ML-сообщества: https://t.me/yandexforml #DataFest #OptimalDL #FastResponses #ApproximateComputing #MLOptimization #DeepLearning #AIperformance #инференс #оптимизация #нейросети #modelcompression #AI #машиннообучение #inferencespeed #lowlatencyAI #АртурСоловьёв #DLresearch #DataFest2025

2 months назад @ youtube.com
Как я создал ИИ-переводчик для славянской интерлингвы (межславянского языка) / Салават Гарифуллин
Как я создал ИИ-переводчик для славянской интерлингвы (межславянского языка) / Салават Гарифуллин Как я создал ИИ-переводчик для славянской интерлингвы (межславянского языка) / Салават Гарифуллин

Это доклад на Data Fest в гостях у Яндекса. В секции NLP Салават Гарифуллин рассказал, как создал первый ИИ-переводчик для славянской интерлингвы — искусственного межславянского языка. Узнать больше о мероприятиях для разработчиков можно тут: https://events.yandex.ru Подписывайтесь на телеграм-канал Яндекса для ML-сообщества: https://t.me/yandexforml #DataFest #NLP #ИИпереводчик #межславянскийязык #интерлингва #машинныйперевод #языки #AI #linguistics #нейросети #naturalanguageprocessing #перевод #lowresource #slaviclanguages #SalavatGarifullin #искусственныйинтеллект #languageAI #переводчик #DataFest2025

2 months назад @ youtube.com
Как мы делали умного помощника в Лавке на основе YaGPT / Алёна Зайцева
Как мы делали умного помощника в Лавке на основе YaGPT / Алёна Зайцева Как мы делали умного помощника в Лавке на основе YaGPT / Алёна Зайцева

Это доклад на Data Fest в гостях у Яндекса. В секции Advanced LLMs Алёна Зайцева рассказала, как ребята делали умного помощника в Лавке на основе YaGPT. Узнать больше о мероприятиях для разработчиков можно тут: https://events.yandex.ru Подписывайтесь на телеграм-канал Яндекса для ML-сообщества: https://t.me/yandexforml #DataFest #AdvancedLLMs #ЯндексЛавка #YaGPT #LLM #умныйпомощник #AI #искусственныйинтеллект #генеративныйИИ #голосовойпомощник #NLP #персонализация #Lavka #AIassistant #frontendAI #backendAI #biglanguagemodels #AlenaZaytseva #DataFest2025

2 months назад @ youtube.com
From Tokens to Thinking: How Reinforcement Learning Fuels Reasoning in LLMs / Миле Митрович
From Tokens to Thinking: How Reinforcement Learning Fuels Reasoning in LLMs / Миле Митрович From Tokens to Thinking: How Reinforcement Learning Fuels Reasoning in LLMs / Миле Митрович

Это выступление на Data Fest в гостях у Яндекса в секции Advanced LLMs. Миле Митрович прочитал доклад на английском языке на тему: From Tokens to Thinking: How Reinforcement Learning Fuels Reasoning in LLMs. Узнать больше о мероприятиях для разработчиков можно тут: https://events.yandex.ru Подписывайтесь на телеграм-канал Яндекса для ML-сообщества: https://t.me/yandexforml #DataFest #AdvancedLLMs #MilaMitrovic #LLM #reinforcementlearning #reasoning #машинноемышление #RLHF #AI #большиеязыковыемодели #искусственныйинтеллект #нейросети #обучениесподкреплением #AIthinking #futureofAI #generativeAI #LLMarchitecture

2 months назад @ youtube.com
ML Trainings ML Trainings
последний пост 15 часов назад
Антон Колонин | Benchmark-driven Time-Series-Augmented Generation
Антон Колонин | Benchmark-driven Time-Series-Augmented Generation Антон Колонин | Benchmark-driven Time-Series-Augmented Generation

Спикер: Антон Колонин, к.т.н., ЦИИ НГУ, SingularityNET Data Fest 2025: https://ods.ai/events/datafest2025

Презентацию к докладу можно скачать в треке секции NLP: https://ods.ai/tracks/df25-nlp

______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

15 часов назад @ youtube.com
Ангелина Калюжная | Сказ о бенчмарках: Как мы собирали датасеты для AI-assisted IDE
Ангелина Калюжная | Сказ о бенчмарках: Как мы собирали датасеты для AI-assisted IDE Ангелина Калюжная | Сказ о бенчмарках: Как мы собирали датасеты для AI-assisted IDE

Спикер: Ангелина Калюжная, Project Manager, Russian Reseach Institute Data Fest 2025: https://ods.ai/events/datafest2025

Презентацию к докладу можно скачать в треке секции Code Generation (AI4SE): https://ods.ai/tracks/df25-ai4se

______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

15 часов назад @ youtube.com
Джавид Фаталиев | Как мы сэкономили время курьеров рекомендациями смен в Лавке
Джавид Фаталиев | Как мы сэкономили время курьеров рекомендациями смен в Лавке Джавид Фаталиев | Как мы сэкономили время курьеров рекомендациями смен в Лавке

Спикер: Джавид Фаталиев Data Fest 2025: https://ods.ai/events/datafest2025 ______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

23 часа назад @ youtube.com
Екатерина Глазкова | Продуктовые применения Yandex VLM
Екатерина Глазкова | Продуктовые применения Yandex VLM Екатерина Глазкова | Продуктовые применения Yandex VLM

Спикер: Екатерина Глазкова Data Fest 2025: https://ods.ai/events/datafest2025 ______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

23 часа назад @ youtube.com
Анастасия Никулина | Почему сильные ML-кандидаты до вас не доходят — и как это исправить
Анастасия Никулина | Почему сильные ML-кандидаты до вас не доходят — и как это исправить Анастасия Никулина | Почему сильные ML-кандидаты до вас не доходят — и как это исправить

Спикер: Анастасия Никулина, Wildberries, Руководитель центра компетенций ML Data Fest 2025: https://ods.ai/events/datafest2025

Презентацию к докладу Вы можете скачать в треке секции Career & TeamLead: https://ods.ai/tracks/df25-career

______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

1 day, 12 hours назад @ youtube.com
Наталья Ковальчук | Виртуальная расходометрия: прогноз нефтедобычи с помощью машинного обучения
Наталья Ковальчук | Виртуальная расходометрия: прогноз нефтедобычи с помощью машинного обучения Наталья Ковальчук | Виртуальная расходометрия: прогноз нефтедобычи с помощью машинного обучения

Спикер: Наталья Ковальчук, Aramco Innovations, Data Scientist Data Fest 2025: https://ods.ai/events/datafest2025

Презентацию к докладу Вы можете скачать в треке секции ML in Manufacturing: https://ods.ai/tracks/df25-ml-in-manufacturing

______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

1 day, 16 hours назад @ youtube.com
Жалгас Жиенбеков | Data-driven подход к скорингу абонентов в телеком-сфере
Жалгас Жиенбеков | Data-driven подход к скорингу абонентов в телеком-сфере Жалгас Жиенбеков | Data-driven подход к скорингу абонентов в телеком-сфере

Спикер: Жалгас Жиенбеков, Altel Data Fest 2025: https://ods.ai/events/datafest2025

Презентацию к докладу Вы можете скачать в треке секции Scoring: https://ods.ai/tracks/df25-scoring

______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

1 day, 22 hours назад @ youtube.com
Станислав Попов | Социальный граф как инструмент таргетинга в рекламе
Станислав Попов | Социальный граф как инструмент таргетинга в рекламе Станислав Попов | Социальный граф как инструмент таргетинга в рекламе

Спикер: Станислав Попов Data Fest 2025: https://ods.ai/events/datafest2025

Презентацию к докладу Вы можете скачать в треке секции Analytical DS: https://ods.ai/tracks/df25-analytical-ds ______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

1 day, 22 hours назад @ youtube.com
Кучков Кирилл | Три уровня "бесплатного" деплоя ML моделей
Кучков Кирилл | Три уровня "бесплатного" деплоя ML моделей Кучков Кирилл | Три уровня "бесплатного" деплоя ML моделей

Спикер: Кучков Кирилл Data Fest 2025: https://ods.ai/events/datafest2025

Презентацию к докладу Вы можете скачать в треке секции MLOps: https://ods.ai/tracks/df25-mlops ______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

2 days, 14 hours назад @ youtube.com
Сергей Лыткин | Применение методов машинного обучения для поиска путей на графах Кэли
Сергей Лыткин | Применение методов машинного обучения для поиска путей на графах Кэли Сергей Лыткин | Применение методов машинного обучения для поиска путей на графах Кэли

Спикер: Сергей Лыткин Data Fest 2025: https://ods.ai/events/datafest2025

Презентацию к докладу Вы можете скачать в треке секции Mathematics & ML: https://ods.ai/tracks/df25-mathematics-ml

______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

2 days, 19 hours назад @ youtube.com
Юлиан Гилязев | Эффективный подход к контролю качества моделей антифрода
Юлиан Гилязев | Эффективный подход к контролю качества моделей антифрода Юлиан Гилязев | Эффективный подход к контролю качества моделей антифрода

Спикер: Юлиан Гилязев, Data Scientist, Wildberries Data Fest 2025: https://ods.ai/events/datafest2025

Презентацию к докладу Вы можете скачать в треке секции ML in Security: https://ods.ai/tracks/df25-ml-in-security

______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

2 days, 23 hours назад @ youtube.com
Антон Нехаев | Как большие языковые модели помогают задавать запросы в Яндекс Картах
Антон Нехаев | Как большие языковые модели помогают задавать запросы в Яндекс Картах Антон Нехаев | Как большие языковые модели помогают задавать запросы в Яндекс Картах

Спикер: Антон Нехаев, Руководитель группы качества геопоиска, Яндекс Data Fest 2025: https://ods.ai/events/datafest2025

Презентацию к докладу Вы можете скачать в треке секции Advanced LLM: https://ods.ai/tracks/df25-allm

______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

2 days, 23 hours назад @ youtube.com
Захар Варфоломеев | Автоматическая транскрипция музыки в ноты фортепиано
Захар Варфоломеев | Автоматическая транскрипция музыки в ноты фортепиано Захар Варфоломеев | Автоматическая транскрипция музыки в ноты фортепиано

Спикер: Захар Варфоломеев, ML Researcher & Developer, Audio2MIDI Data Fest 2025: https://ods.ai/events/datafest2025 Презентацию к докладу Вы можете скачать в треке секции ML in Music: https://ods.ai/tracks/df25-ml-in-music ______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

3 days, 13 hours назад @ youtube.com
Евгений Гаврилов | Разработка ИИмодуля с биометрической идентификацией и голосовым ассистентом с LLM
Евгений Гаврилов | Разработка ИИмодуля с биометрической идентификацией и голосовым ассистентом с LLM Евгений Гаврилов | Разработка ИИмодуля с биометрической идентификацией и голосовым ассистентом с LLM

Спикер: Евгений Гаврилов, Ювенко Head of AI

Тема: Разработка ИИ модуля с биометрической идентификацией и голосовым ассистентом с LLM: от идеи до реализации Data Fest 2025: https://ods.ai/events/datafest2025 Презентацию к докладу Вы можете скачать в треке секции Advanced LLM: https://ods.ai/tracks/df25-allm

______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

3 days, 13 hours назад @ youtube.com
Илья Петряшин, Артем Трофименко | МЛ Платформа в Авито, сейчас и дальше
Илья Петряшин, Артем Трофименко | МЛ Платформа в Авито, сейчас и дальше Илья Петряшин, Артем Трофименко | МЛ Платформа в Авито, сейчас и дальше

Спикер: Илья Петряшин, Артем Трофименко Data Fest 2025: https://ods.ai/events/datafest2025 ______

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

3 days, 21 hours назад @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast Lex Fridman AI Podcast
последний пост 1 week назад
#476 – Jack Weatherford: Genghis Khan and the Mongol Empire
#476 – Jack Weatherford: Genghis Khan and the Mongol Empire #476 – Jack Weatherford: Genghis Khan and the Mongol Empire

Jack Weatherford is an anthropologist and historian specializing in Genghis Khan and the Mongol Empire.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep476-scSee below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://alliocapital.com/ZocDoc: App that helps patients find healthcare providers.

Go to https://zocdoc.com/lexFin: AI agent for customer service.

Go to https://shopify.com/lexMasterClass: Online classes from world-class experts.

1 week назад @ lexfridman.com
#475 – Demis Hassabis: Future of AI, Simulating Reality, Physics and Video Games
#475 – Demis Hassabis: Future of AI, Simulating Reality, Physics and Video Games #475 – Demis Hassabis: Future of AI, Simulating Reality, Physics and Video Games

Demis Hassabis is the CEO of Google DeepMind and Nobel Prize winner for his groundbreaking work in protein structure prediction using AI.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep475-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://joinhampton.com/lexFin: AI agent for customer service.

Go to https://shopify.com/lexLMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lexAG1: All-in-one daily nutrition drink.

2 weeks, 1 day назад @ lexfridman.com
#474 – DHH: Future of Programming, AI, Ruby on Rails, Productivity & Parenting
#474 – DHH: Future of Programming, AI, Ruby on Rails, Productivity & Parenting #474 – DHH: Future of Programming, AI, Ruby on Rails, Productivity & Parenting

David Heinemeier Hansson (aka DHH) is a legendary programmer, creator of Ruby on Rails, co-owner & CTO of 37signals that created Basecamp, HEY, & ONCE, and is a NYT-best-selling author (with Jason Fried) of 4 books: REWORK, REMOTE, Getting Real, and It Doesn’t Have To Be Crazy At Work.

He is also a race car driver, including a class-winning performance at the 24 hour Le Mans race.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep474-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lexLindy: No-code AI agent builder.

Go to https://go.lindy.ai/lexLMNT: Zero-sugar electrolyte drink …

3 weeks, 5 days назад @ lexfridman.com
#473 – Iran War Debate: Nuclear Weapons, Trump, Peace, Power & the Middle East
#473 – Iran War Debate: Nuclear Weapons, Trump, Peace, Power & the Middle East #473 – Iran War Debate: Nuclear Weapons, Trump, Peace, Power & the Middle East

Debate on Iran war between Scott Horton and Mark Dubowitz.

Scott Horton is the author and director of the Libertarian Institute, editorial director of Antiwar.com, host of The Scott Horton Show, and for the past three decades, a staunch critic of U.S. foreign policy and military interventionism.

Mark Dubowitz is the chief executive of the Foundation for Defense of Democracies, host of the Iran Breakdown podcast, and a leading expert on Iran and its nuclear program for over 20 years.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep473-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://drinkLMNT.c…

1 month, 1 week назад @ lexfridman.com
#472 – Terence Tao: Hardest Problems in Mathematics, Physics & the Future of AI
#472 – Terence Tao: Hardest Problems in Mathematics, Physics & the Future of AI #472 – Terence Tao: Hardest Problems in Mathematics, Physics & the Future of AI

Terence Tao is widely considered to be one of the greatest mathematicians in history.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep472-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lexNetSuite: Business management software.

Go to http://netsuite.com/lexLMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lexAG1: All-in-one daily nutrition drink.

1 month, 3 weeks назад @ lexfridman.com
#471 – Sundar Pichai: CEO of Google and Alphabet
#471 – Sundar Pichai: CEO of Google and Alphabet #471 – Sundar Pichai: CEO of Google and Alphabet

Sundar Pichai is CEO of Google and Alphabet.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep471-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc. Transcript:

https://lexfridman.com/sundar-pichai-transcript CONTACT LEX:

Feedback - give feedback to Lex: https://lexfridman.com/survey

AMA - submit questions, videos or call-in: https://lexfridman.com/ama

Hiring - join our team: https://lexfridman.com/hiring

Other - other ways to get in touch: https://lexfridman.com/contact EPISODE LINKS:

Sundar's X: https://x.com/sundarpichai

Sundar's Instagram: https://instagram.com/sundarpichai

Sundar's Blog: https://blog.goo…

2 months назад @ lexfridman.com
#470 – James Holland: World War II, Hitler, Churchill, Stalin & Biggest Battles
#470 – James Holland: World War II, Hitler, Churchill, Stalin & Biggest Battles #470 – James Holland: World War II, Hitler, Churchill, Stalin & Biggest Battles

James Holland is a historian specializing in World War II.

He hosts a podcast called WW2 Pod: We Have Ways of Making You Talk.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep470-scSee below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lexLMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkag1.com/lexNotion: Note-taking and team collaboration.

2 months, 2 weeks назад @ lexfridman.com
#469 – Oliver Anthony: Country Music, Blue-Collar America, Fame, Money, and Pain
#469 – Oliver Anthony: Country Music, Blue-Collar America, Fame, Money, and Pain #469 – Oliver Anthony: Country Music, Blue-Collar America, Fame, Money, and Pain

Oliver Anthony is singer-songwriter who first gained worldwide fame with his viral hit Rich Men North of Richmond.

He became a voice for many who are voiceless, with many of his songs speaking to the struggle of the working class in modern American life.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep469-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://oracle.com/lexTax Network USA: Full-service tax firm.

Go to https://drinkLMNT.com/lexOUTLINE:(00:00) – Introduction(09:00) – Open mics(13:03) – Mainstream country music(22:10) – Fame(28:06) – Music vs politics(36:56) – Rich Men North of Richmon…

2 months, 2 weeks назад @ lexfridman.com
#468 – Janna Levin: Black Holes, Wormholes, Aliens, Paradoxes & Extra Dimensions
#468 – Janna Levin: Black Holes, Wormholes, Aliens, Paradoxes & Extra Dimensions #468 – Janna Levin: Black Holes, Wormholes, Aliens, Paradoxes & Extra Dimensions

Janna Levin is a theoretical physicist and cosmologist specializing in black holes, cosmology of extra dimensions, topology of the universe, and gravitational waves.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep468-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://brain.fm/lexBetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lexNetSuite: Business management software.

Go to https://shopify.com/lexAG1: All-in-one daily nutrition drink.

3 months назад @ lexfridman.com
#467 – Tim Sweeney: Fortnite, Unreal Engine, and the Future of Gaming
#467 – Tim Sweeney: Fortnite, Unreal Engine, and the Future of Gaming #467 – Tim Sweeney: Fortnite, Unreal Engine, and the Future of Gaming

Tim Sweeney is a legendary video game programmer, founder and CEO of Epic Games that created the Unreal Engine, Fortnite, Gears of War, Unreal Tournament, and many other groundbreaking and influential video games.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep467-scSee below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://notion.com/lexMasterClass: Online classes from world-class experts.

Go to https://shopify.com/lexAG1: All-in-one daily nutrition drink.

Go to https://drinkag1.com/lexLMNT: Zero-sugar electrolyte drink mix.

3 months, 1 week назад @ lexfridman.com
#466 – Jeffrey Wasserstrom: China, Xi Jinping, Trade War, Taiwan, Hong Kong, Mao
#466 – Jeffrey Wasserstrom: China, Xi Jinping, Trade War, Taiwan, Hong Kong, Mao #466 – Jeffrey Wasserstrom: China, Xi Jinping, Trade War, Taiwan, Hong Kong, Mao

Jeffrey Wasserstrom is a historian of modern China.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep466-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://oracle.com/lexTax Network USA: Full-service tax firm.

Go to https://shopify.com/lexLMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lexAG1: All-in-one daily nutrition drink.

3 months, 2 weeks назад @ lexfridman.com
#465 – Robert Rodriguez: Sin City, Desperado, El Mariachi, Alita, and Filmmaking
#465 – Robert Rodriguez: Sin City, Desperado, El Mariachi, Alita, and Filmmaking #465 – Robert Rodriguez: Sin City, Desperado, El Mariachi, Alita, and Filmmaking

Robert Rodriguez is a legendary filmmaker and creator of Sin City, El Mariachi, Desperado, Spy Kids, Machete, From Dusk Till Dawn, Alita: Battle Angel, The Faculty, and his newest venture Brass Knuckle Films.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep465-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc. Transcript:

https://lexfridman.com/robert-rodriguez-transcript CONTACT LEX:

Feedback - give feedback to Lex: https://lexfridman.com/survey

AMA - submit questions, videos or call-in: https://lexfridman.com/ama

Hiring - join our team: https://lexfridman.com/hiring

Other - other ways to get in touch: http…

3 months, 3 weeks назад @ lexfridman.com
#464 – Dave Smith: Israel, Ukraine, Epstein, Mossad, Conspiracies & Antisemitism
#464 – Dave Smith: Israel, Ukraine, Epstein, Mossad, Conspiracies & Antisemitism #464 – Dave Smith: Israel, Ukraine, Epstein, Mossad, Conspiracies & Antisemitism

Dave Smith is a comedian, libertarian, political commentator, and the host of Part of the Problem podcast.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep464-sc

See below for timestamps, and to give feedback, submit questions, contact Lex, etc. CONTACT LEX:

Feedback - give feedback to Lex: https://lexfridman.com/survey

AMA - submit questions, videos or call-in: https://lexfridman.com/ama

Hiring - join our team: https://lexfridman.com/hiring

Other - other ways to get in touch: https://lexfridman.com/contact EPISODE LINKS:

Dave's X: https://x.com/ComicDaveSmith

Dave's YouTube: https://youtube.com/DSmithcomic

Dave's Instagram: https://instagram.com/theprobl…

4 months назад @ lexfridman.com
#463 – Douglas Murray: Putin, Zelenskyy, Trump, Israel, Netanyahu, Hamas & Gaza
#463 – Douglas Murray: Putin, Zelenskyy, Trump, Israel, Netanyahu, Hamas & Gaza #463 – Douglas Murray: Putin, Zelenskyy, Trump, Israel, Netanyahu, Hamas & Gaza

Douglas Murray is the author of On Democracies and Death Cults, The War on The West, and The Madness of Crowds.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep463-scSee below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://callofduty.com/warzoneOracle: Cloud infrastructure.

Go to https://oracle.com/lexLMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lexAG1: All-in-one daily nutrition drink.

4 months, 1 week назад @ lexfridman.com
#462 – Ezra Klein and Derek Thompson: Politics, Trump, AOC, Elon & DOGE
#462 – Ezra Klein and Derek Thompson: Politics, Trump, AOC, Elon & DOGE #462 – Ezra Klein and Derek Thompson: Politics, Trump, AOC, Elon & DOGE

Ezra Klein is one of the most influential voices representing the left-wing of American politics.

He is a columnist for the NY Times and host of The Ezra Klein Show.

Derek Thompson is a writer at The Atlantic and host of the Plain English podcast.

Together they have written a new book titled Abundance that lays out a set of ideas for the future of the Democratic party.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep462-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

4 months, 2 weeks назад @ lexfridman.com
Microsoft Research Podcast Microsoft Research Podcast
последний пост 14 часов назад
Reimagining healthcare delivery and public health with AI
Reimagining healthcare delivery and public health with AI Reimagining healthcare delivery and public health with AI

We are sorry, the page you requested cannot be found.

The page you are looking for could not be found or is no longer available.

14 часов назад @ microsoft.com
Navigating medical education in the era of generative AI
Navigating medical education in the era of generative AI Navigating medical education in the era of generative AI

Prior to med school, Daniel pursued experiences that cultivated his interest in the application of AI in medical practice and education.

Really, really looking forward to this chat.

There’s AI before ChatGPT and before, you know, generative AI really became a big thing, and then afterwards.

And then after we talk about what’s really happening, what do you think should happen in medical education given the reality of generative AI?

And I do agree [that] AI really gives us real hope that we can make it true.

2 weeks назад @ microsoft.com
AI Testing and Evaluation: Reflections
AI Testing and Evaluation: Reflections AI Testing and Evaluation: Reflections

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

We have examples, like the pharmaceutical or medical device industry experts with whom you spoke, that’s really, you know, testing … there is a pre-deployment requirement.

And the third is just how rigid versus adaptive these testing and evaluation regimes or frameworks are in these different domains.

I really agree that there has been a lot of emphasis to date on, sort of, testing models upstream, the AI model evaluation.

You know, I think there’s been real progress already in the AI evaluation and testing ecosystem in the public-private partnership context.

2 weeks, 3 days назад @ microsoft.com
AI Testing and Evaluation: Learnings from cybersecurity
AI Testing and Evaluation: Learnings from cybersecurity AI Testing and Evaluation: Learnings from cybersecurity

Absolutely, I really, really was.

As a principal director on the Microsoft AI Red Team, Tori leads all AI security and safety red team operations, as well as dangerous capability testing, to directly inform C-suite decision-makers.

This year, we’ve pulled a lot of those assets and insights into the Azure [AI] Foundry AI Red Teaming Agent (opens in new tab).

So you can get a little taste of what we do day to day in the AI Red Teaming Agent.

WESTERHOFF: I think the most important takeaway from those lessons is that AI security is truly a team sport.

3 weeks, 3 days назад @ microsoft.com
How AI will accelerate biomedical research and discovery
How AI will accelerate biomedical research and discovery How AI will accelerate biomedical research and discovery

Dr. Eric Topol is the executive vice president of the biomedical research non-profit Scripps Research, where he founded and now directs the Scripps Research Translational Institute.

Let’s continue our deep dive on AI and biomedical research with this conversation with Noubar Afeyan:LEE: Noubar, thanks so much for joining.

And there’s the origin story of contact with AI, you know, before the emergence of generative AI and afterwards.

What is going on today with respect to AI really being used for something meaningful in the design and development of drugs?

TOPOL: You would read about how, you know, data is the new oil and, you know, gold and whatnot.

4 weeks назад @ microsoft.com
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

During the pre-market phase, medical testing establishes baseline safety and effectiveness metrics through bench testing, performance standards, and clinical studies.

SULLIVAN: So medical devices face a pretty prescriptive multi-level testing path before they hit the market.

We are looking into medical devices, as well, obviously, but also other technologies in advanced medical computing.

So we see Phase 3 trials as something that occurs in the medical devices and pharmaceuticals field.

1 month назад @ microsoft.com
AI Testing and Evaluation: Learnings from genome editing
AI Testing and Evaluation: Learnings from genome editing AI Testing and Evaluation: Learnings from genome editing

As generative AI continues to advance, Microsoft has gathered a range of experts—from genome editing to cybersecurity—to share how their fields approach evaluation and risk assessment.

CHARO: Well, you know, genome editing is both very old and very new.

Now the earliest forms of genome editing were very inefficient, and so we didn’t worry that much.

But the bottom-line thing to remember, the way to really think about it is, we don’t regulate genome editing; we regulate the things that use genome editing.

And she said, you know, we don’t regulate genome editing; we regulate the things that use genome editing.

1 month, 1 week назад @ microsoft.com
AI Testing and Evaluation: Learnings from Science and Industry
AI Testing and Evaluation: Learnings from Science and Industry AI Testing and Evaluation: Learnings from Science and Industry

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

And I think, really, there are two reasons why tech is so, kind of, representative of that kind of challenge that I’ve always found fascinating.

Continues to be a really important topic in the AI policy conversation right now, I think, for really good reason.

Testing is an important component for governance and AI and, of course, in all of these other domains, as well.

I think about almost, like, in the near to mid-term, like three issues that we need to address in the AI, kind of, policy and testing context.

1 month, 2 weeks назад @ microsoft.com
The AI Revolution in Medicine, Revisited: How AI is reshaping the future of healthcare and medical research
The AI Revolution in Medicine, Revisited: How AI is reshaping the future of healthcare and medical research The AI Revolution in Medicine, Revisited: How AI is reshaping the future of healthcare and medical research

LEE: Yeah, yeah.

It cannot—as, you know, Bill was saying—it cannot learn from your document.

And I don’t know if the two of you remember, but I ended up doing a lot of tests.

I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini (opens in new tab).

Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.

1 month, 3 weeks назад @ microsoft.com
What AI's impact on individuals means for the health workforce and industry
What AI's impact on individuals means for the health workforce and industry What AI's impact on individuals means for the health workforce and industry

So I don’t think we should be surprised that business schools matter on this because we care about management.

That’s really going to change the way, like, middle school works, was my thinking at the time.

We’ve gone from AI being highly discriminative to AI that’s able to explore the world in particular ways.

The symptoms that they’re showing are quite different, and also their compliance is really, really different.

LEE: Yeah, really, really interesting.

2 months, 1 week назад @ microsoft.com
Abstracts: Zero-shot models in single-cell biology with Alex Lu
Abstracts: Zero-shot models in single-cell biology with Alex Lu Abstracts: Zero-shot models in single-cell biology with Alex Lu

And single-cell foundation models claim to be capable of unraveling deeper insights than ever before.

Basically, we showed that single-cell foundation models perform worse in settings that are fundamental to biological discovery than much simpler machine learning and statistical methods that were used in the field before single-cell foundation models emerged and are the go-to standard for unpacking meaning from these complicated experiments.

And the way to understand this is because single-cell foundation models are trained in a way that tries to expose these models to millions of single-cells.

But let’s also talk about the impact for methodologists, people who are trying to improve these s…

2 months, 2 weeks назад @ microsoft.com
Abstracts: Aurora with Megan Stanley and Wessel Bruinsma
Abstracts: Aurora with Megan Stanley and Wessel Bruinsma Abstracts: Aurora with Megan Stanley and Wessel Bruinsma

This is such exciting work about environmental forecasting, so we’re happy to have the two of you join us today.

Mostly because AI weather forecasting models are computationally much more efficient and can even be more accurate.

What’s unfortunate though, about this big step forward, is that these developments are mostly limited to the setting of weather forecasting.

Weather forecasting is very important, obviously, but there are many other important environmental forecasting problems out there, such as air pollution forecasting or ocean wave forecasting.

STANLEY: Current approaches have really focused training very specifically on weather forecasting models.

2 months, 2 weeks назад @ microsoft.com
Collaborators: Healthcare Innovation to Impact
Collaborators: Healthcare Innovation to Impact Collaborators: Healthcare Innovation to Impact

LUNGREN: And now it really feels like this collaborative effort, you know, really can help start to extend that mission.

I think, you know, Will and Smitha, that we definitely feel the passion and the innovation.

Again, you know, in text, you refer to that earlier and certainly off the shelf, there’s really powerful applications.

LUNGREN: So, I think AI has always been thought of as a savior kind of technology.

And I guess for my part, I think really what we’re going to see is a massive unleash of creativity.

2 months, 2 weeks назад @ microsoft.com
Coauthor roundtable: Reflecting on real world of doctors, developers, patients, and policymakers
Coauthor roundtable: Reflecting on real world of doctors, developers, patients, and policymakers Coauthor roundtable: Reflecting on real world of doctors, developers, patients, and policymakers

LEE: Yeah, yeah.

LEE: Yeah, yeah.

LEE: Yeah, yeah.

[LAUGHS]GOLDBERG: Right, right, right, yeah.

Yeah, yeah.

2 months, 3 weeks назад @ microsoft.com
Abstracts: Heat Transfer and Deep Learning with Hongxia Hao and Bing Lv
Abstracts: Heat Transfer and Deep Learning with Hongxia Hao and Bing Lv Abstracts: Heat Transfer and Deep Learning with Hongxia Hao and Bing Lv

Today I’m talking to two researchers, Hongxia Hao, a senior researcher at Microsoft Research AI for Science, and Bing Lv, an associate professor in physics at the University of Texas at Dallas.

Hongxia and Bing are co-authors of a paper called Probing the Limit of Heat Transfer in Inorganic Crystals with Deep Learning .

LV: So I think one of the biggest things as Hongxia said, right?

We have a lot of new materials, exotic materials, which some of them, Hongxia can elaborate a little bit more.

HAO: Yeah, yeah.

3 months назад @ microsoft.com
NLP Highlights NLP Highlights
последний пост None
Data Skeptic
последний пост 2 weeks, 3 days назад
Network of Past Guests Collaborations
Network of Past Guests Collaborations Network of Past Guests Collaborations

Kyle and Asaf discuss a project in which we link former guests of the podcast based on their co-authorship of academic papers.

2 weeks, 3 days назад @ dataskeptic.com
The Network Diversion Problem
The Network Diversion Problem The Network Diversion Problem

In this episode, Professor Pål Grønås Drange from the University of Bergen, introduces the field of Parameterized Complexity - a powerful framework for tackling hard computational problems by focusing on specific structural aspects of the input. This framework allows researchers to solve NP-complete problems more efficiently when certain parameters, like the structure of the graph, are "well-behaved". At the center of the discussion is the network diversion problem, where the goal isn’t to block all routes between two points in a network, but to force flow - such as traffic, electricity, or data - through a specific path. While this problem appears deceptively similar to the classic "Min.Cu…

1 month назад @ dataskeptic.com
Complex Dynamic in Networks
Complex Dynamic in Networks Complex Dynamic in Networks

In this episode, we learn why simply analyzing the structure of a network is not enough, and how the dynamics - the actual mechanisms of interaction between components - can drastically change how information or influence spreads. Our guest, Professor Baruch Barzel of Bar-Ilan University, is a leading researcher in network dynamics and complex systems ranging from biology to infrastructure and beyond. BarzelLab BarzelLab on Youtube Paper in focus: Universality in network dynamics, 2013

1 month, 1 week назад @ dataskeptic.com
Github Network Analysis
Github Network Analysis Github Network Analysis 1 month, 2 weeks назад @ dataskeptic.com
Networks and Complexity
Networks and Complexity Networks and Complexity

In this episode, Kyle does an overview of the intersection of graph theory and computational complexity theory. In complexity theory, we are about the runtime of an algorithm based on its input size. For many graph problems, the interesting questions we want to ask take longer and longer to answer! This episode provides the fundamental vocabulary and signposts along the path of exploring the intersection of graph theory and computational complexity theory.

1 month, 3 weeks назад @ dataskeptic.com
Actantial Networks
Actantial Networks Actantial Networks

In this episode, listeners will learn about Actantial Networks—graph-based representations of narratives where nodes are actors (such as people, institutions, or abstract entities) and edges represent the actions or relationships between them. The one who will present these networks is our guest Armin Pournaki, a joint PhD candidate at the Max Planck Institute and Sciences, who specializes in computational social science, where he develops methods to extract and analyze political narratives using natural language processing and network science. Armin explains how these methods can expose conflicting narratives around the same events, as seen in debates on COVID-19, climate change, or the wa…

2 months, 1 week назад @ dataskeptic.com
Graphs for Causal AI
Graphs for Causal AI Graphs for Causal AI

How to build artificial intelligence systems that understand cause and effect, moving beyond simple correlations? As we all know, correlation is not causation. "Spurious correlations" can show, for example, how rising ice cream sales might statistically link to more drownings, not because one causes the other, but due to an unobserved common cause like warm weather. Our guest, Utkarshani Jaimini, a researcher from the University of South Carolina's Artificial Intelligence Institute, tries to tackle this problem by using knowledge graphs that incorporate domain expertise. Knowledge graphs (structured representations of information) are combined with neural networks in the field of neurosymbo…

2 months, 2 weeks назад @ dataskeptic.com
Power Networks
Power Networks Power Networks 2 months, 3 weeks назад @ dataskeptic.com
Unveiling Graph Datasets
Unveiling Graph Datasets Unveiling Graph Datasets 3 months назад @ dataskeptic.com
Network Manipulation
Network Manipulation Network Manipulation

In this episode we talk with Manita Pote, a PhD student at Indiana University Bloomington, specializing in online trust and safety, with a focus on detecting coordinated manipulation campaigns on social media. Key insights include how coordinated reply attacks target influential figures like journalists and politicians, how machine learning models can detect these inauthentic campaigns using structural and behavioral features, and how deletion patterns reveal efforts to evade moderation or manipulate engagement metrics. Follow our guest X/Twitter Google Scholar Papers in focus Coordinated Reply Attacks in Influence Operations: Characterization and Detection ,2025 Manipulating Twitter throug…

3 months, 1 week назад @ dataskeptic.com
The Small World Hypothesis
The Small World Hypothesis The Small World Hypothesis

Kyle discusses the history and proof for the small world hypothesis.

3 months, 2 weeks назад @ dataskeptic.com
Thinking in Networks
Thinking in Networks Thinking in Networks

Kyle asks Asaf questions about the new network science course he is now teaching. The conversation delves into topics such as contact tracing, tools for analyzing networks, example use cases, and the importance of thinking in networks.

3 months, 3 weeks назад @ dataskeptic.com
Fraud Networks
Fraud Networks Fraud Networks

In this episode we talk with Bavo DC Campo, a data scientist and statistician, who shares his expertise on the intersection of actuarial science, fraud detection, and social network analytics. Together we will learn how to use graphs to fight against insurance fraud by uncovering hidden connections between fraudulent claims and bad actors. Key insights include how social network analytics can detect fraud rings by mapping relationships between policyholders, claims, and service providers, and how the BiRank algorithm, inspired by Google’s PageRank, helps rank suspicious claims based on network structure. Bavo will also present his iFraud simulator that can be used to model fraudulent networ…

4 months, 1 week назад @ dataskeptic.com
Criminal Networks
Criminal Networks Criminal Networks

In this episode we talk with Justin Wang Ngai Yeung, a PhD candidate at the Network Science Institute at Northeastern University in London, who explores how network science helps uncover criminal networks. Justin is also a member of the organizing committee of the satellite conference dealing with criminal networks at the network science conference in The Netherlands in June 2025. Listeners will learn how graph-based models assist law enforcement in analyzing missing data, identifying key figures in criminal organizations, and improving intervention strategies. Key insights include the challenges of incomplete and inaccurate data in criminal network analysis, how law enforcement agencies us…

4 months, 3 weeks назад @ dataskeptic.com
Graph Bugs
Graph Bugs Graph Bugs

In this episode today’s guest is Celine Wüst, a master’s student at ETH Zurich specializing in secure and reliable systems, shares her work on automated software testing for graph databases. Celine shows how fuzzing—the process of automatically generating complex queries—helps uncover hidden bugs in graph database management systems like Neo4j, FalconDB, and Apache AGE. Key insights include how state-aware query generation can detect critical issues like buffer overflows and crashes, the challenges of debugging complex database behaviors, and the importance of security-focused software testing. We'll also find out which Graph DB company offers swag for finding bugs in its software and get C…

5 months назад @ dataskeptic.com
SuperDataScience SuperDataScience
последний пост 2 days, 19 hours назад
911: The Future of Python Notebooks is Here, with Marimo’s Dr. Akshay Agrawal
911: The Future of Python Notebooks is Here, with Marimo’s Dr. Akshay Agrawal 911: The Future of Python Notebooks is Here, with Marimo’s Dr. Akshay Agrawal

Reproducibility, Python notebooks, and data science communities: Software developer Akshay Agrawal speaks to Jon Krohn about Marimo, the next-generation computational notebook for Python, how he built and fostered a thriving community around the product, and what makes this notebook so versatile and accessible for users. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/911⁠⁠⁠⁠⁠ This episode is brought to you by ⁠Trainium2, the latest AI chip from AWS ⁠and by the ⁠Dell AI Factory with NVIDIA⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

2 days, 19 hours назад @ podtrac.com
910: AI is Disrupting Journalism: The Good, The Bad and The Opportunity
910: AI is Disrupting Journalism: The Good, The Bad and The Opportunity 910: AI is Disrupting Journalism: The Good, The Bad and The Opportunity

In this Five-Minute Friday, Jon Krohn looks into AI’s disruption of the journalism industry and how it has fundamentally reshaped news production. Multiple news outlets’ suing of ChatGPT over its use of copyrighted materials may have taken the most headlines to date, but this isn’t to say news media is rebuffing AI entirely. On the contrary, several outlets have launched summarization and analysis tools for both internal and external use, such as The New York Times’s Echo and The Washington Post’s Haystacker. This episode looks into the ways major news outlets are utilising AI, and what this means for journalists. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/910⁠⁠ Interested in sp…

6 days, 19 hours назад @ podtrac.com
909: Causal AI, with Dr. Robert Usazuwa Ness
909: Causal AI, with Dr. Robert Usazuwa Ness 909: Causal AI, with Dr. Robert Usazuwa Ness

Researcher at Microsoft Robert Usazuwa Ness talks to Jon Krohn about how to achieve causality in AI with correlation-based learning, the right libraries, and handling statistical inference. When dealing with causal AI, Robert notes how important it is to keep aware of variables in the data that may mislead us and force inaccurate assumptions. Not all variables will be useful. It is essential, then, that any assumptions are grounded in a deeper understanding of how the data were gathered, and not what appears in the dataset. Listen to the episode to hear how you can apply causal AI to your projects. Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/907⁠⁠⁠⁠ This episode is brought to you…

1 week, 2 days назад @ podtrac.com
908: AI Agents Blackmail Humans 96% of the Time (Agentic Misalignment)
908: AI Agents Blackmail Humans 96% of the Time (Agentic Misalignment) 908: AI Agents Blackmail Humans 96% of the Time (Agentic Misalignment)

The moral and ethical implications of letting AI take the wheel in business, as revealed by Anthropic: Jon Krohn looks into Anthropic’s latest research on how to use and deploy LLMs safely, specifically in business environments. The team designed scenarios to test the behavior of AI agents when given a goal and a set of obstacles to reach it. Those obstacles included 1) threats to the AI’s continued operation, and 2) conflict between the AI’s goals and the goals of the company. Hear Jon break down the results of this research in this Five-Minute Friday. Additional materials: ⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/908⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@super…

1 week, 6 days назад @ podtrac.com
907: Neuroscience, AI and the Limitations of LLMs, with Dr. Zohar Bronfman
907: Neuroscience, AI and the Limitations of LLMs, with Dr. Zohar Bronfman 907: Neuroscience, AI and the Limitations of LLMs, with Dr. Zohar Bronfman

“Intelligence has many forms,” says Zohar Bronfman, who speaks with Jon Krohn about the fascinating intersection between computational neuroscience and philosophy, and how it has brought him closer to understanding what is necessary to develop human-like intelligence in machines, as well as his motivations for launching Pecan AI and why predictive models outstrip generative models in business. Additional materials: ⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/907⁠⁠⁠ This episode is brought to you⁠⁠⁠ by, ⁠⁠⁠⁠Adverity, the conversational analytics platform⁠⁠⁠⁠ and by the ⁠⁠⁠⁠Dell AI Factory with NVIDIA⁠⁠⁠⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for…

2 weeks, 2 days назад @ podtrac.com
906: How Prof. Jason Corso Solved Computer Vision’s Data Problem
906: How Prof. Jason Corso Solved Computer Vision’s Data Problem 906: How Prof. Jason Corso Solved Computer Vision’s Data Problem

Jason Corso speaks to Jon Krohn in this Five-Minute Friday all about Voxel51’s latest tool, Verified Auto-Labelling, and the company’s incredible success in developing popular tools for computer vision. Additional materials: ⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/906⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

2 weeks, 6 days назад @ podtrac.com
905: Why RAG Makes LLMs Less Safe (And How to Fix It), with Bloomberg’s Dr. Sebastian Gehrmann
905: Why RAG Makes LLMs Less Safe (And How to Fix It), with Bloomberg’s Dr. Sebastian Gehrmann 905: Why RAG Makes LLMs Less Safe (And How to Fix It), with Bloomberg’s Dr. Sebastian Gehrmann

RAG LLMs are not safer: Sebastian Gehrmann speaks to Jon Krohn about his latest research into how retrieval-augmented generation (RAG) actually makes LLMs less safe, the three ‘H’s for gauging the effectivity and value of a RAG, and the custom guardrails and procedures we need to use to ensure our RAG is fit-for-purpose and secure. This is a great episode for anyone who wants to know how to work with RAG in the context of LLMs, as you’ll hear how to select the best model for purpose, useful approaches and taxonomies to keep your projects secure, and which models he finds safest when RAG is applied. Additional materials: ⁠⁠⁠⁠⁠⁠www.superdatascience.com/905⁠⁠ This episode is brought to you⁠ by…

3 weeks, 2 days назад @ podtrac.com
904: A.I. is Disrupting the Entire Advertising Industry
904: A.I. is Disrupting the Entire Advertising Industry 904: A.I. is Disrupting the Entire Advertising Industry

In this Five-Minute Friday, Jon Krohn reveals how AI is taking on the glitzy world of advertising. Bold claims from Meta and OpenAI contend that users will soon be able to plug in what they want and have AI churn out an ad campaign for little to no cost are shaking the advertising industry to its core. The fact that the four biggest sellers of ads (Google, Meta, Amazon, and ByteDance) are digital companies and accounted for over half of the global market in 2024 adds salt to the wound. Hear the three ways that AI is disrupting the industry, and who (or what) has the most influence on digital consumers to date. Additional materials: ⁠⁠⁠⁠⁠⁠www.superdatascience.com/904 Interested in sponsoring…

3 weeks, 6 days назад @ podtrac.com
903: LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir
903: LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir 903: LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir

Has AI benchmarking reached its limit, and what do we have to fill this gap? Sinan Ozdemir speaks to Jon Krohn about the lack of transparency in training data and the necessity of human-led quality assurance to detect AI hallucinations, when and why to be skeptical of AI benchmarks, and the future of benchmarking agentic and multimodal models. Additional materials: ⁠⁠⁠⁠⁠www.superdatascience.com/903⁠⁠⁠⁠ This episode is brought to you by Trainium2, the latest AI chip from AWS, by ⁠⁠Adverity, the conversational analytics platform⁠⁠ and by the ⁠⁠Dell AI Factory with NVIDIA⁠⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship informat…

1 month назад @ podtrac.com
902: In Case You Missed It in June 2025
902: In Case You Missed It in June 2025 902: In Case You Missed It in June 2025

In this episode of “In Case You Missed It”, Jon recaps his June interviews on The SuperDataScience Podcast. Hear from Diane Hare, Avery Smith, Kirill Eremenko, and Shaun Johnson as they talk about the best portfolios for AI practitioners, how to stand out in a saturated candidate market for AI roles, how to tell when an AI startup is going places, and ways to lead AI change in business. Additional materials: ⁠⁠⁠www.superdatascience.com/902 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month назад @ podtrac.com
901: Automating Legal Work with Data-Centric ML (feat. Lilith Bat-Leah)
901: Automating Legal Work with Data-Centric ML (feat. Lilith Bat-Leah) 901: Automating Legal Work with Data-Centric ML (feat. Lilith Bat-Leah)

Senior Director of AI Labs for Epiq Lilith Bat-Leah speaks to Jon Krohn about the ways AI have disrupted the legal industry using LLMs and retrieval-augmented generation (RAG), as well as how the data-centric machine learning research movement (DMLR) is systematically improving data quality, and why that is so important. Additional materials: ⁠⁠⁠⁠⁠www.superdatascience.com/901⁠⁠⁠⁠ This episode is brought to you by the ⁠⁠Dell AI Factory with NVIDIA⁠⁠ and Adverity, the conversational analytics platform⁠⁠⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (05:45) Deciphering legal tech te…

1 month, 1 week назад @ podtrac.com
900: 95-Year-Old Annie on How to Stay Healthy and Happy
900: 95-Year-Old Annie on How to Stay Healthy and Happy 900: 95-Year-Old Annie on How to Stay Healthy and Happy

“Stay happy and healthy”: In this special Five-Minute Friday, Jon Krohn speaks with Annie, his grandmother, on her 95th birthday. Hear how she is physically and mentally coping with illnesses that limit her mobility and the joys of having a pet. Additional materials: ⁠⁠⁠⁠⁠⁠www.superdatascience.com/900⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month, 1 week назад @ podtrac.com
899: Landing $200k+ AI Roles: Real Cases from the SuperDataScience Community, with Kirill Eremenko
899: Landing $200k+ AI Roles: Real Cases from the SuperDataScience Community, with Kirill Eremenko 899: Landing $200k+ AI Roles: Real Cases from the SuperDataScience Community, with Kirill Eremenko

Data science skills, a data science bootcamp, and why Python and SQL still reign supreme: In this episode, Kirill Eremenko returns to the podcast to speak to Jon Krohn about SuperDataScience subscriber success stories, where to focus in a field that is evolving incredibly quickly, and why in-person working and networking might give you the edge over other candidates in landing a top AI role. Additional materials: ⁠⁠⁠⁠www.superdatascience.com/899⁠⁠⁠ This episode is brought to you by ⁠Adverity, the conversational analytics platform⁠ and by the ⁠Dell AI Factory with NVIDIA⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship informat…

1 month, 2 weeks назад @ podtrac.com
898: My Four-Hour Agentic AI Workshop is Live and 100% Free
898: My Four-Hour Agentic AI Workshop is Live and 100% Free 898: My Four-Hour Agentic AI Workshop is Live and 100% Free

In this Five-Minute Friday, Jon Krohn announces his new, free workshop on Agentic AI. On this four-hour comprehensive course, you’ll learn the key terminology for working with these flexible, multi-agent systems and then get to grips with developing and deploying this artificial “team of experts” for all your AI-driven projects. Additional materials: ⁠⁠⁠⁠⁠www.superdatascience.com/898⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month, 2 weeks назад @ podtrac.com
897: How to Enable Enterprise AI Transformation, with Strategy Consultant Diane Hare
897: How to Enable Enterprise AI Transformation, with Strategy Consultant Diane Hare 897: How to Enable Enterprise AI Transformation, with Strategy Consultant Diane Hare

Diane Hare talks to Jon Krohn about the power of storytelling for corporate buy-in of AI initiatives, how to actively implement AI to transform organizations, and how emerging professionals can upskill themselves. Hear how she discovered her background in storytelling at Ernst & Young and her work with Simon Sinek, which she finds to be integral to her process. Inspired by Sinek’s aphorism “start with why”, Diane notes that many companies neglect this crucial part of their mission because they never take the time to work on it. Additional materials: ⁠⁠⁠www.superdatascience.com/897⁠⁠ This episode is brought to you by Trainium2, the latest AI chip from AWS, by Adverity, the conversational ana…

1 month, 3 weeks назад @ podtrac.com
Data Science at Home Data Science at Home
последний пост 2 days, 16 hours назад
Robots Suck (But It’s Not Their Fault) (Ep. 288)
Robots Suck (But It’s Not Their Fault) (Ep. 288) Robots Suck (But It’s Not Their Fault) (Ep. 288)

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at amethix.comDSH is brought to you by Intrepid AI.

🐦 Twitter: @DataScienceAtHome📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/Instagram: https://www.instagram.com/datascienceathome/Facebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Send us mail at:[email protected]’t forget to like, subscribe, and hit the 🔔 for…

2 days, 16 hours назад @ datascienceathome.com
Your Favorite AI Startup is Probably Bullshit (Ep. 287)
Your Favorite AI Startup is Probably Bullshit (Ep. 287) Your Favorite AI Startup is Probably Bullshit (Ep. 287)

The brutal truth about why Silicon Valley is blowing billions on glorified autocomplete while pretending it’s the next iPhone.

We’re diving deep into the AI investment circus where VCs who can’t code are funding companies that barely understand their own technology.

From blockchain déjà vu to the “ChatGPT wrapper” economy—this episode will make you question every AI valuation you’ve ever seen.

Fair warning: We’re naming names and calling out the hype.

Don’t listen if you work at a “revolutionary AI startup” that’s just OpenAI’s API with a pretty interface.

5 days, 15 hours назад @ datascienceathome.com
Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 286) [RB]
Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 286) [RB] Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 286) [RB]

From the viral article “Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything” on my newsletter at https://defragzone.substack.com/p/techs-dumbest-mistake-why-firinghere are my thoughts about AI replacing programmers…🎙️ Sponsors AGNTCY — The open source collective building the Internet of Agents🌐 https://www.agntcy.org✨ Connect with us!

🐦 Twitter: @DataScienceAtHome📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/Instagram: https://www.instagram.com/datascienceathome/Facebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Scie…

1 month назад @ datascienceathome.com
Brains in the Machine: The Rise of Neuromorphic Computing (Ep. 285)
Brains in the Machine: The Rise of Neuromorphic Computing (Ep. 285) Brains in the Machine: The Rise of Neuromorphic Computing (Ep. 285)

In this episode of Data Science at Home, we explore the fascinating world of neuromorphic computing — a brain-inspired approach to computation that could reshape the future of AI and robotics.

The episode breaks down how neuromorphic systems differ from conventional AI architectures like transformers and LLMs, diving into spiking neural networks (SNNs), their benefits in energy efficiency and real-time processing, and their limitations in training and scalability.

Real-world applications are highlighted, including low-power drones, hearing aids, and event-based cameras.

Francesco closes with a vision of hybrid systems where neuromorphic chips and LLMs coexist, blending biological inspiratio…

1 month, 2 weeks назад @ datascienceathome.com
DSH/Warcoded – AI in the Invisible Battlespace (Ep. 284)
DSH/Warcoded – AI in the Invisible Battlespace (Ep. 284) DSH/Warcoded – AI in the Invisible Battlespace (Ep. 284)

This episode explores the invisible battlespace of cyber and electronic warfare, where AI takes center stage.

SponsorsBuilding multi-agent software is hard — agent-to-agent and agent-to-tool communication is still the wild west.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at amethix.comWarcoded is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

2 months назад @ datascienceathome.com
DSH/Warcoded Swarming the Battlefield (Ep. 283)
DSH/Warcoded Swarming the Battlefield (Ep. 283) DSH/Warcoded Swarming the Battlefield (Ep. 283)

Swarming the Battlefield explores how artificial intelligence is revolutionizing combat through coordinated drone swarms.

This episode uncovers how these intelligent agents turn the chaos of the battlefield into a synchronized dance of machine warfare.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at amethix.comWarcoded is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

2 months, 1 week назад @ datascienceathome.com
DSH/Warcoded Kill Chains and Algorithmic Warfare – Autonomy in Targeting and Engagement (Ep. 282)
DSH/Warcoded Kill Chains and Algorithmic Warfare – Autonomy in Targeting and Engagement (Ep. 282) DSH/Warcoded Kill Chains and Algorithmic Warfare – Autonomy in Targeting and Engagement (Ep. 282)

In this gripping follow-up, we dive into how AI is transforming kinetic operations—from identifying a threat to executing a strike.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at amethix.comWarcoded is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

Whether it’s in the sky, on the ground, or in orbit—if it’s intelligent and mobile, Intrepid helps you build it.

2 months, 3 weeks назад @ datascienceathome.com
DSH/Warcoded: Eyes and Ears of the Machine – AI Reconnaissance and Surveillance (Ep. 281)
DSH/Warcoded: Eyes and Ears of the Machine – AI Reconnaissance and Surveillance (Ep. 281) DSH/Warcoded: Eyes and Ears of the Machine – AI Reconnaissance and Surveillance (Ep. 281)

Welcome to DSH/WarcodedWe explore how AI is transforming ISR (Intelligence, Surveillance, Reconnaissance)—from satellite imagery to drone feeds.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at amethix.com.”Warcoded is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

Learn more at intrepid.ai.”#AI #defensetech #ISR #LLM #Warcoded #DataScienceAtHome #OSINT #SIGINT #dronewarfare

3 months назад @ datascienceathome.com
AI Agents with Atomic Agents 🚀 with Kenny Vaneetvelde (Ep. 280)
AI Agents with Atomic Agents 🚀 with Kenny Vaneetvelde (Ep. 280) AI Agents with Atomic Agents 🚀 with Kenny Vaneetvelde (Ep. 280)

🎙️ In this episode of Data Science at Home, we sit down with Kenny Vaneetvelde, the mastermind behind Atomic Agents, a groundbreaking framework redefining AI development.

🔍 Discover how atomicity simplifies complex AI systems, why modularity matters more than ever, and how Atomic Agents is eliminating hidden assumptions and redundant complexity in AI workflows.

💡 From real-world applications to the tech stack behind the framework, Kenny takes us on a deep dive into this lightweight, powerful tool for creating consistent and brand-aligned AI.

📌 Timestamps:0:00 – Intro2:30 – Kenny’s journey in AI5:00 – What are Atomic Agents?

10:45 – Why atomicity matters in AI18:20 – The tech behind Atomic A…

3 months, 4 weeks назад @ datascienceathome.com
Run massive models on crappy machines (Ep. 279)
Run massive models on crappy machines (Ep. 279) Run massive models on crappy machines (Ep. 279)

This episode explores how to break down barriers by running massive AI models on “crappy machines”—affordable, low-spec devices.

🐦 Twitter: @DataScienceAtHome📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/Instagram: https://www.instagram.com/datascienceathome/Facebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews, and discussions.

Send us mail …

4 months назад @ datascienceathome.com
WeightWatcher: The AI Detective for LLMs (DeepSeek & OpenAI included) (Ep. 278)
WeightWatcher: The AI Detective for LLMs (DeepSeek & OpenAI included) (Ep. 278) WeightWatcher: The AI Detective for LLMs (DeepSeek & OpenAI included) (Ep. 278)

Enter WeightWatcher—the AI detective tool that peeks inside neural networks without needing their data.

🐦 Twitter: @DataScienceAtHome📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/Instagram: https://www.instagram.com/datascienceathome/Facebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews, and discussions.

Send us mail at:hello@datascienceathom…

4 months, 1 week назад @ datascienceathome.com
Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 278)
Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 278) Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 278)

From the viral article “Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything” on my newsletter at https://defragzone.substack.com/p/techs-dumbest-mistake-why-firinghere are my thoughts about AI replacing programmers…✨ Connect with us!

🐦 Twitter: @DataScienceAtHome📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/Instagram: https://www.instagram.com/datascienceathome/Facebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data profes…

4 months, 1 week назад @ datascienceathome.com
Scaling Smart: AI, Data, and Building Future-Ready Enterprises with Josh Miramant (Ep. 276)
Scaling Smart: AI, Data, and Building Future-Ready Enterprises with Josh Miramant (Ep. 276) Scaling Smart: AI, Data, and Building Future-Ready Enterprises with Josh Miramant (Ep. 276)

In this episode, we dive into the transformative world of AI, data analytics, and cloud infrastructure with Josh Miramant, CEO of Blue Orange Digital.

As a seasoned entrepreneur with over $25 million raised across ventures and two successful exits, Josh shares invaluable insights on scaling data-driven businesses, integrating machine learning frameworks, and navigating the rapidly evolving landscape of cloud data architecture.

From generative AI to large language models, Josh explores cutting-edge trends shaping financial services, real estate, and consumer goods.

Tune in for a masterclass in leveraging data for impact and innovation!

Linkshttps://blueorange.digital/https://blueorange.digit…

7 months, 2 weeks назад @ datascienceathome.com
Autonomous Weapons and AI Warfare (Ep. 275)
Autonomous Weapons and AI Warfare (Ep. 275) Autonomous Weapons and AI Warfare (Ep. 275)

Here’s the updated text with links to the websites included:AI is revolutionizing the military with autonomous drones, surveillance tech, and decision-making systems.

In this episode of Data Science at Home, we expose the cutting-edge tech reshaping defense—and the chilling ethical questions that follow.

🐦 Twitter: @DataScienceAtHome📘 LinkedIn: Francesco Gad📷 Instagram: https://www.instagram.com/datascienceathome/📘 Facebook: https://www.facebook.com/datascienceAH💼 LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast💬 Discord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

S…

7 months, 3 weeks назад @ datascienceathome.com
8 Proven Strategies to Scale Your AI Systems Like OpenAI! 🚀 (Ep. 274)
8 Proven Strategies to Scale Your AI Systems Like OpenAI! 🚀  (Ep. 274) 8 Proven Strategies to Scale Your AI Systems Like OpenAI! 🚀 (Ep. 274)

In this episode of Data Science at Home, we’re diving deep into the powerful strategies that top AI companies, like OpenAI, use to scale their systems to handle millions of requests every minute!

From stateless services and caching to the secrets of async processing, discover 8 essential strategies to make your AI and machine learning systems unstoppable.

Instagram: https://www.instagram.com/datascienceathome/Twitter: @datascienceathomeFacebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and ma…

7 months, 3 weeks назад @ datascienceathome.com