Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
последний пост 3 часа назад
Tried comparing different AI chatbots and ended up building my own tool for it [P]
Tried comparing different AI chatbots and ended up building my own tool for it [P]

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

3 часа назад @ reddit.com
[D] How can Computational Neuroscience/ML explain the Origin of First-Person Subjectivity: HOW Do I Feel Like “Me”?
[D] How can Computational Neuroscience/ML explain the Origin of First-Person Subjectivity: HOW Do I Feel Like “Me”?

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

4 часа назад @ reddit.com
[D] What are the best industry options for causal ML PhDs?
[D] What are the best industry options for causal ML PhDs?

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

8 часов назад @ reddit.com
[D] Has anyone encountered a successful paper reading group at your company?
[D] Has anyone encountered a successful paper reading group at your company?

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

9 часов назад @ reddit.com
[R] I am building a framework for AI which will allow it to self learn and self evolve
[R] I am building a framework for AI which will allow it to self learn and self evolve

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

10 часов назад @ reddit.com
What are the most effective practices, tools, and methodologies your Data & AI team follows to stay productive, aligned, and impactful? [D]
What are the most effective practices, tools, and methodologies your Data & AI team follows to stay productive, aligned, and impactful? [D]

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

13 часов назад @ reddit.com
[R] How to publish in ML conferences as an independent researcher
[R] How to publish in ML conferences as an independent researcher

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

16 часов назад @ reddit.com
[P] Built a prompt-based automation tool — could this be useful for data scientists too?
[P] Built a prompt-based automation tool — could this be useful for data scientists too?

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

17 часов назад @ reddit.com
[P] Hill Space: Neural networks that actually do perfect arithmetic (10⁻¹⁶ precision)
[P] Hill Space: Neural networks that actually do perfect arithmetic (10⁻¹⁶ precision) [P] Hill Space: Neural networks that actually do perfect arithmetic (10⁻¹⁶ precision)

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

19 часов назад @ reddit.com
[R] Join the Ultimate Challenge: Robopalooza – Space Robotics Competition For $5000!
[R] Join the Ultimate Challenge: Robopalooza – Space Robotics Competition For $5000! [R] Join the Ultimate Challenge: Robopalooza – Space Robotics Competition For $5000!

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 day назад @ reddit.com
[R] I want to publish my ML paper after leaving grad school. What is the easiest way to do so?
[R] I want to publish my ML paper after leaving grad school. What is the easiest way to do so?

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 day, 12 hours назад @ reddit.com
[D] Where to start with contributing to open source ML/AI infra?
[D] Where to start with contributing to open source ML/AI infra?

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 day, 13 hours назад @ reddit.com
[D] Modelling continuous non-Gaussian distributions?
[D] Modelling continuous non-Gaussian distributions? [D] Modelling continuous non-Gaussian distributions?

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 day, 14 hours назад @ reddit.com
[D] Automatic system prompt generation from a task + data
[D] Automatic system prompt generation from a task + data

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 day, 18 hours назад @ reddit.com
[D] Best AI service for programmers
[D] Best AI service for programmers

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again.

if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

If you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 day, 20 hours назад @ reddit.com
Towards Data Science
последний пост 1 day, 14 hours назад
Are You Being Unfair to LLMs?
Are You Being Unfair to LLMs? Are You Being Unfair to LLMs?

Further, LLMs build internal world models and mind maps of the context they analyze [3].

LLMs show signs of creativityWhen faced with new tasks, LLMs do more than just regurgitate memorized content.

Beyond mere long-term memory systems that provide additional in-context data to static LLMs, future approaches could be self-modifying by adapting the core LLM’s weights.

For instance, LLMs can generate realistic training data for traditional AI systems that are subsequently used in industrial automation.

Therefore, regardless of whether current LLMs turn out to be creative, possess world models, or are sentient, we might want to refrain from belittling them.

1 day, 14 hours назад @ towardsdatascience.com
Hitchhiker’s Guide to RAG: From Tiny Files to Tolstoy with OpenAI’s API and LangChain
Hitchhiker’s Guide to RAG: From Tiny Files to Tolstoy with OpenAI’s API and LangChain Hitchhiker’s Guide to RAG: From Tiny Files to Tolstoy with OpenAI’s API and LangChain

, I walked you through setting up a very simple RAG pipeline in Python, using OpenAI’s API, LangChain, and your local files.

Instead, we need to first do the chunking — create smaller chunks of text, and create embeddings for each one.

For instance, for OpenAI’s text-embedding-3-small embedding model, the chunk size limit is 8,191 tokens.

For instance, for OpenAI’s embedding model, the chunk size limit is 8,191 tokens.

Nonetheless, understanding and fine-tuning chunk size and overlap is the foundation for building RAG pipelines that produce meaningful results.

1 day, 15 hours назад @ towardsdatascience.com
Building a Сustom MCP Chatbot
Building a Сustom MCP Chatbot Building a Сustom MCP Chatbot

Image by authorSince we’ve already implemented the MCP server before, this time we will focus on building the MCP client.

The chatbot starts by discovering available MCP capabilities (iterating through all configured MCP servers, establishing connections and requesting their lists of capabilities).

class MCP_ChatBot: """ MCP (Model Context Protocol) ChatBot that connects to multiple MCP servers and provides a conversational interface using Anthropic's Claude.

Found matching prompt: sql_query_prompt Description: Create a SQL query prompt Would you like to use this prompt template?

SummaryIn this article, we built a chatbot that integrates with MCP servers and leverages all the benefits of st…

2 days, 13 hours назад @ towardsdatascience.com
Reducing Time to Value for Data Science Projects: Part 3
Reducing Time to Value for Data Science Projects: Part 3 Reducing Time to Value for Data Science Projects: Part 3

Your evaluation data should also be at least the minimum length that will be used in production, and ideally multiples of that length.

For example, if your model will score data every week then having your evaluation data be a days worth of data is not sufficient.

Validating the business value of your solution links back to what was said earlier about your role as a data scientist.

ConclusionIn this article we have considered how to plan the model experiment phase of your project.

By adhering to the recommendations laid out here, we greatly increase our chances of reducing the time to value of our data science projects by quickly and confidently iterating through the experimentation process.

2 days, 13 hours назад @ towardsdatascience.com
Scene Understanding in Action: Real-World Validation of Multimodal AI Integration
Scene Understanding in Action: Real-World Validation of Multimodal AI Integration Scene Understanding in Action: Real-World Validation of Multimodal AI Integration

of this series on multimodal AI systems, we’ve moved from a broad overview into the technical details that drive the architecture.

To answer this, I’ll walk you through three carefully selected real-world scenarios that put VisionScout’s scene understanding to the test.

These examples show how four AI models work together in a unified framework to deliver scene understanding that no single model could achieve on its own.

Indoor Scene Analysis: Interpreting Spatial Narratives in Living Rooms1.1 Object Detection and Spatial UnderstandingLet’s begin with a typical home living room.

These include three sofas, two potted plants, a television, and several chairs — the key elements used in further…

2 days, 14 hours назад @ towardsdatascience.com
Evaluation-Driven Development for LLM-Powered Products: Lessons from Building in Healthcare
Evaluation-Driven Development for LLM-Powered Products: Lessons from Building in Healthcare Evaluation-Driven Development for LLM-Powered Products: Lessons from Building in Healthcare

As a data scientist at Nuna, Inc., a healthcare AI company, I’ve been spearheading our efforts to embed evaluation-driven development into our products.

Once the baseline system is up and running, it should be run on the development set to generate the first outputs.

2.3 Metric designDesigning evaluation metrics that are actionable and aligned with the feature’s goals is a critical part of evaluation-driven development.

Asynchronously, SMEs should be able to securely review the inputs and outputs from the development set and make comments on them.

To identity improvements with highest ROI, we can revisit the development set segmentation into personas, functionalities and situations describe…

2 days, 15 hours назад @ towardsdatascience.com
Worried About AI? Use It to Your Advantage
Worried About AI? Use It to Your Advantage Worried About AI? Use It to Your Advantage

The utopians are pointing towards a future where data scientists and programmers can focus on management, strategy, and deep thinking, instead of on boring, repetitive tasks.

The pessimists, meanwhile, are dreading a future in which there are no more data scientists and programmers.

In the first part of her new series, Sara unpacks the benefits of prompt engineering during the EDA (exploratory data analysis) process.

A Developer’s Guide to Building Scalable AI: Workflows vs Agents, by Hailey QuachUnderstanding the architectural trade-offs between autonomous agents and orchestrated workflows.

Jens Winkelmann joins our author community with a multidisciplinary background in physics, data scie…

2 days, 22 hours назад @ towardsdatascience.com
The Crucial Role of NUMA Awareness in High-Performance Deep Learning
The Crucial Role of NUMA Awareness in High-Performance Deep Learning The Crucial Role of NUMA Awareness in High-Performance Deep Learning

CPU NUMA Node DiscoveryThe lscpu command provides information about the CPU architecture of a Linux system, including a section describing the NUMA layout.

When NUMA Placement Matters MostThe performance impact of poor NUMA placement depends heavily on the workload characteristics.

NUMA Discovery and Pinning UtilitiesWe begin by defining utility functions for NUMA node discovery and pinning.

Another advantage is that the NUMA placement is inherited by subprocesses, making it unnecessary to re-pin dataloader workers manually.

SummaryIn our constant pursuit of AI/ML workload optimization, topology awareness — including NUMA node placement — is critical.

3 days, 3 hours назад @ towardsdatascience.com
Work Data Is the Next Frontier for GenAI
Work Data Is the Next Frontier for GenAI Work Data Is the Next Frontier for GenAI

Work data is the best quality data humanity has ever producedWork data is obviously much better quality than our public internet content.

If pretraining was done solely on work data, we might not need RLHF and alignment training at all because all that just permeates the training data.

If pretraining was done solely on work data, we might not need RLHF and alignment training at all because all that just permeates the training data.

Overall, I think, work data is probably the best quality data humanity has ever produced because the incentives are aligned.

Work data is—in many cases—explicitly labeledIn many cases, work data comes in input-output pairs.

3 days, 11 hours назад @ towardsdatascience.com
Recap of all types of LLM Agents
Recap of all types of LLM Agents Recap of all types of LLM Agents

Different types of prompting techniques create different types of Agents.

cot_answer = response['message']['content'] response = ollama.chat(model=llm, messages=[ {'role':'user', 'content': f'''Here was your original answer:{cot_answer}Now reflect on whether it was correct or if it was the best approach.

response = ollama.chat(model=llm, messages=[ {'role':'user', 'content': f"Task: {q}{prompt}"} ]) print(response['message']['content'])6) Graph‑of‑Thoughts (GoT) that generalizes CoT into a graph, considering also interconnected branches.

response = ollama.chat(model=llm, messages=[ {'role':'user', 'content':q} ]) print(response['message']['content'])7) Program‑of‑Thoughts (PoT) that special…

3 days, 13 hours назад @ towardsdatascience.com
AI Agents Are Shaping the Future of Work Task by Task, Not Job by Job
AI Agents Are Shaping the Future of Work Task by Task, Not Job by Job AI Agents Are Shaping the Future of Work Task by Task, Not Job by Job

Source: Stanford Paper – Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S.

WorkforceThe Green Zone (Automate) : High worker desire, high AI capability.

: High worker desire, high AI capability.

: Low worker desire, high AI capability.

Conversely, usage is lower at the extremes: Job Zone 1 (e.g., baristas) and Job Zone 5 (e.g., physicians, lawyers) both under-index significantly.

3 days, 13 hours назад @ towardsdatascience.com
How to Perform Effective Data Cleaning for Machine Learning
How to Perform Effective Data Cleaning for Machine Learning How to Perform Effective Data Cleaning for Machine Learning

I then continue discussing three different data cleaning techniques, and some notes to keep in mind when performing data cleaning.

I will go through why you need data cleaning and data cleaning techniques.

DefinitionTo be explicit, I define data cleaning as a step you perform after your data annotation process.

This is the process I use to perform data cleaning with clustering:Embed all of your data samples.

ConclusionIn this article, I have discussed the importance of data annotation and data cleaning.

3 days, 13 hours назад @ towardsdatascience.com
Time Series Forecasting Made Simple (Part 3.1): STL Decomposition
Time Series Forecasting Made Simple (Part 3.1): STL Decomposition Time Series Forecasting Made Simple (Part 3.1): STL Decomposition

This is where we turn to a more advanced decomposition method to better understand the data: STL — Seasonal-Trend decomposition using LOESS.

Averaging both estimates pulls the result back onto July 2010 itself, yielding a true centered moving average for that month.

Final STL TrendLooking at the plot, we can see that the trend line from the moving average almost overlaps with the STL trend for most of the years.

We now have the initial trend using moving averages, so let’s move on to the next step in the STL process.

That’s why we first extract a rough trend using moving averages and then isolate seasonality using a low-pass filter.

3 days, 14 hours назад @ towardsdatascience.com
How to Fine-Tune Small Language Models to Think with Reinforcement Learning
How to Fine-Tune Small Language Models to Think with Reinforcement Learning How to Fine-Tune Small Language Models to Think with Reinforcement Learning

I will cover the major ideas behind training reasoning behaviours into language models, share some simple code snippets, and some practical tips to fine-tune Small Language Models (SLMs) with RL.

This reward model is trained on a dataset where humans have ranked or rated different model responses.

The full language model response will look something like this: User has asked me to count the number of r's in strawberry.

The reasoning process and answer are enclosed within and tags, respectively, i.e., reasoning process here answer here .

The correction reward comes from the environment, and the format reward is awarded if the model correctly generates the ... and ... tags.

4 days, 7 hours назад @ towardsdatascience.com
What I Learned in my First 18 Months as a Freelance Data Scientist
What I Learned in my First 18 Months as a Freelance Data Scientist What I Learned in my First 18 Months as a Freelance Data Scientist

It has been nearly 18 months since I started up my freelancing career — I can hardly believe it!

I envy those of you who live in countries that provide you with health insurance.

Health insurance is expensive.

We are fortunate in that my spouse’s seasonal job offers us health insurance during the season (the winter ski season).

So is what you are paying out of pocket on health insurance.

4 days, 8 hours назад @ towardsdatascience.com
Distill.pub Distill.pub
последний пост None
The Gradient The Gradient
последний пост 1 month, 1 week назад
AGI Is Not Multimodal
AGI Is Not Multimodal AGI Is Not Multimodal

Despite this, scale maximalists have implicitly suggested that multimodal models can be a structure-agnostic framework for AGI.

While structure-agnostic scale maximalism has succeeded in producing LLMs and LVMs that pass Turing tests, a multimodal scale maximalist approach to AGI will not bear similar fruit.

CitationFor attribution in academic contexts or books, please cite this work asBenjamin A. Spiegel, "AGI Is Not Multimodal", The Gradient, 2025.

@article{spiegel2025agi, author = {Benjamin A. Spiegel}, title = {AGI Is Not Multimodal}, journal = {The Gradient}, year = {2025}, howpublished = {\url{https://thegradient.pub/agi-is-not-multimodal}, }ReferencesAndreas, Jacob.

“Language Models,…

1 month, 1 week назад @ thegradient.pub
Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research
Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research

Mathematics and statistics, once the primary guides of machine learning research, now struggle to provide immediate insight into the latest breakthroughs.

This shift has prompted speculation about mathematics’ diminished role in machine learning research moving forward.

It is also the way that symmetries are usually leveraged when performing computations (for example, in machine learning).

One can reasonably argue that diagrammatic descriptions of well-known constructions, like products, are not useful for the machine learning researcher.

However, as we’ve demonstrated, while mathematics may not maintain the same role in machine learning research that it has held in the past, the success of…

7 months, 4 weeks назад @ thegradient.pub
TheSequence TheSequence
последний пост 1 day, 22 hours назад
The Sequence Research #683: Orchestrating Intelligence: Sakana AI’s Multi-Model Tree Search Architecture
The Sequence Research #683: Orchestrating Intelligence: Sakana AI’s Multi-Model Tree Search Architecture The Sequence Research #683: Orchestrating Intelligence: Sakana AI’s Multi-Model Tree Search Architecture

Image Credit: GPT-4oOver the past few years, we’ve witnessed astonishing leaps in the performance of large language models.

But as we ascend the curve of scale, a familiar law of diminishing returns emerges.

Training ever-larger models yields only incremental benefits, and the marginal cost of additional performance grows exponentially.

This is the provocative premise behind Sakana AI’s Multi-LLM AB-MCTS (Adaptive Branching Monte Carlo Tree Search): a framework that challenges the prevailing narrative of model monoliths.

Instead of building ever-bigger brains, Sakana proposes a collective—an adaptive system where multiple models reason together, iteratively, guided by a search tree that mir…

1 day, 22 hours назад @ thesequence.substack.com
The Sequence Opinion #682: The Boundary of Autonomy: When AI Can Go Solo
The Sequence Opinion #682: The Boundary of Autonomy: When AI Can Go Solo The Sequence Opinion #682: The Boundary of Autonomy: When AI Can Go Solo

Created Using GPT-4oEffective autonomy in AI—where a system reliably makes decisions without human intervention—remains one of the most formidable challenges for the new generation of AI systems.

Finding the balance between the precision of algorithmic rigor against the unpredictability of real-world complexities is one of the crown jewels of AI.

But can it be really achieve?

This essay presents some of my most recent ideas about the state of autonomy.

I hope you enjoy it.

2 days, 18 hours назад @ thesequence.substack.com
The Sequence Engineering #681: Building Agents with Amazon Strands
The Sequence Engineering #681: Building Agents with Amazon Strands The Sequence Engineering #681: Building Agents with Amazon Strands

Created Using GPT-4oAmazon Strands Agents is an open-source, model-driven AI agent framework designed to harness the reasoning capabilities of large language models (LLMs).

The framework enables these models to autonomously plan, select tools, execute tasks, and reflect on outcomes, forming a complete agentic loop without requiring developers to explicitly program workflows.

It achieves this by connecting three essential primitives—model, tools, and prompt—allowing developers to construct intelligent agents with minimal code.

This essay provides a deep technical overview of Amazon Strands.

It covers the core architecture, the agentic loop, tool integration, deployment strategies, multi-agen…

3 days, 22 hours назад @ thesequence.substack.com
The Sequence Knowledge #680: Can we Evaluate Creativity in AI Models?
The Sequence Knowledge #680: Can we Evaluate Creativity in AI Models? The Sequence Knowledge #680: Can we Evaluate Creativity in AI Models?

Created Using GPT-4oToday we will Discuss:An overview of AI creativity benchmarks.

A review of OpenAI’s HumanEval benchmark.

💡 AI Concept of the Day: About Creativity BenchmarksEvaluating creativity in artificial intelligence (AI) is a complex endeavor, as it encompasses subjective qualities like originality, coherence, and emotional resonance.

To systematically assess these aspects, researchers have developed specialized benchmarks that test AI models across various creative domains.

This essay explores key benchmarks designed to evaluate AI creativity, highlighting their methodologies and contributions to the field.

4 days, 22 hours назад @ thesequence.substack.com
The Sequence Radar: From Model to Team: Several Models are Better than One: Sakana’s Blueprint for Collective AI
The Sequence Radar: From Model to Team: Several Models are Better than One: Sakana’s Blueprint for Collective AI The Sequence Radar: From Model to Team: Several Models are Better than One: Sakana’s Blueprint for Collective AI

Research: Dive into Sakana AI’s new AB-MCTS model.

You can subscribe to The Sequence below:📝 Editorial: Several Models are Better than One: Sakana’s Blueprint for Collective AISakana AI has rapidly emerged as one of my favorite and most innovative AI research labs in the current landscape.

In the latest evolution of inference-time AI, Sakana AI has introduced a compelling framework that pushes beyond traditional single-model reasoning: Adaptive Branching Monte Carlo Tree Search, or AB-MCTS.

Initially unbiased, the system learns to favor models that historically perform better on certain subtasks, effectively turning a collection of models into a coherent team.

By decoupling capability from …

6 days, 22 hours назад @ thesequence.substack.com
The Sequence Research #678: Sequence to Function at Scale: Inside The AlphaGenome Breakthrough
The Sequence Research #678: Sequence to Function at Scale: Inside The AlphaGenome Breakthrough The Sequence Research #678: Sequence to Function at Scale: Inside The AlphaGenome Breakthrough

Today we are going to deep dive into one of the most impressive AI models applied to science.

DeepMind's release of AlphaGenome marks a significant advancement in the application of deep learning to genomics.

This powerful AI system is designed to interpret the functional landscape of DNA sequences, including the elusive non-coding regions, using an unprecedented combination of scale and precision.

AlphaGenome can process up to 1 million base pairs of input DNA and output predictions across thousands of functional genomic tracks, such as gene expression, splicing, chromatin state, and 3D genome architecture.

More importantly, it offers a unified and highly efficient architecture capable of …

1 week, 1 day назад @ thesequence.substack.com
The Sequence Opinion #677: Glass-Box Transformers: How Circuits Illuminate Deep Learning’s Inner Workings
The Sequence Opinion #677: Glass-Box Transformers: How Circuits Illuminate Deep Learning’s Inner Workings The Sequence Opinion #677: Glass-Box Transformers: How Circuits Illuminate Deep Learning’s Inner Workings

Created Using GPT-4oCircuits are quickly becoming a favorite of the AI research community to tackle the monumental challenge of interpretability.

Today, we are going to explore both the case in favor and against circuits.

As transformer-based models push the boundaries of what AI can do, understanding how they work becomes increasingly urgent.

At the core of this approach lies the concept of circuits: interconnected sets of neurons or attention heads that jointly compute a specific function.

Historical Evolution of the Circuits Approach

1 week, 2 days назад @ thesequence.substack.com
The Sequence Engineering #676: Hacking with Gemini CLI
The Sequence Engineering #676: Hacking with Gemini CLI The Sequence Engineering #676: Hacking with Gemini CLI

Created Using GPT-4oToday we are going to cover one of the most exciting AI releases of last week.

Gemini CLI brings Google’s advanced Gemini 2.5 Pro model to the developer terminal, blending powerful language capabilities with intuitive command-line tools.

Its architecture—built around a ReAct loop, the Model Context Protocol (MCP), and extensible plugins—provides rich interpretability by logging every decision, exposing internal memory, and letting users trace each “Thought” and “Action.” This essay explores how these components integrate to make Gemini CLI transparent, trustworthy, and easy to debug, all without leaving your shell.

Quick Intro

1 week, 3 days назад @ thesequence.substack.com
The Sequence Knowledge #675: Learning to Evaluate Multi-Agent AIs
The Sequence Knowledge #675: Learning to Evaluate Multi-Agent AIs The Sequence Knowledge #675: Learning to Evaluate Multi-Agent AIs

Created Using GPT-4oToday we will Discuss:An overview of multi-agent benchmarks.

An introduction to the Arena-Hard benchmark.

💡 AI Concept of the Day: An Overview of Multi-Agent BenchmarksThe emergence of large language models (LLMs) has catalyzed a shift in AI evaluation paradigms, moving from single-agent benchmarks to more complex, multi-agent collaboration settings.

These benchmarks are designed to assess the ability of autonomous agents to engage in structured coordination, negotiation, and joint task execution across dynamic environments.

As LLMs become increasingly agentic, capable of memory, planning, and communication, multi-agent benchmarks offer a critical framework for testing e…

1 week, 4 days назад @ thesequence.substack.com
TheSequence Radar #674: Transformers in the Genome: How AlphaGenome Reimagines AI-Driven Genomics
TheSequence Radar #674: Transformers in the Genome: How AlphaGenome Reimagines AI-Driven Genomics TheSequence Radar #674: Transformers in the Genome: How AlphaGenome Reimagines AI-Driven Genomics

You can subscribe to The Sequence below:📝 Editorial: Transformers in the Genome: How AlphaGenome Reimagines AI-Driven GenomicsI have been obsessed with AI in genetics for some time so I couldn’t write about anything else today other than DeepMind’s new model: AlphaGenome!

Traditional genomics models often excel at one signal—SpliceAI for splicing, ChromBPNet for chromatin state—necessitating an ensemble of tools to profile variant consequences fully.

In benchmark evaluations spanning 24 sequence-prediction and 26 variant-effect tasks, AlphaGenome matches or surpasses specialized baselines in over 90% of cases.

Its ability to standardize and accelerate regulatory variant annotation is poised…

1 week, 6 days назад @ thesequence.substack.com
The Sequence Research #673: Infinite Self-Improvement: Unpacking Sakana's Darwin Gödel Machine
The Sequence Research #673: Infinite Self-Improvement: Unpacking Sakana's Darwin Gödel Machine The Sequence Research #673: Infinite Self-Improvement: Unpacking Sakana's Darwin Gödel Machine

Created Using GPT-4oThe Darwin Gödel Machine (DGM) pioneers a new paradigm in autonomous AI by combining the theoretical vision of self–modifying Gödel Machines with an empirical, Darwinian search process powered by foundation models.

DGM iteratively proposes patches to its own Python code, evaluates each variant on real-world coding benchmarks (SWE-bench, Polyglot), and archives successful agents to fuel open-ended evolution.

Building an AI that can rewrite its own reasoning logic has long been a dream since Jürgen Schmidhuber’s 2006 Gödel Machine, which required formal proofs for any self-modification.

However, real-world code rarely admits tractable proofs, relegating the Gödel Machine t…

2 weeks, 1 day назад @ thesequence.substack.com
The Sequence Opinion #672: Mind Over Model: Chain-of-Thought vs. System 1/System 2
The Sequence Opinion #672: Mind Over Model: Chain-of-Thought vs. System 1/System 2 The Sequence Opinion #672: Mind Over Model: Chain-of-Thought vs. System 1/System 2

As AI systems become more sophisticated, researchers and theorists often draw parallels between machine reasoning and human cognition.

One intriguing comparison is between chain-of-thought (CoT) reasoning in AI and the dual-process theory of human thought, commonly known as System 1 and System 2 thinking.

System 1 describes the brain’s fast, automatic, intuitive mode, while System 2 is slower, effortful, and deliberative.

First, we explain what CoT reasoning entails in modern AI systems and outline the psychological basis of the System 1/System 2 theory.

Chain-of-Thought Reasoning in AI

2 weeks, 2 days назад @ thesequence.substack.com
The Sequence Engineering #671: How Anthropic Built a Research Agent?
The Sequence Engineering #671: How Anthropic Built a Research Agent? The Sequence Engineering #671: How Anthropic Built a Research Agent?

Image Created Using GPT-4oThe Research feature in Claude represents a significant evolution in how large language models can tackle open-ended, complex research tasks.

At its core lies a multi-agent architecture, in which a LeadResearcher orchestrator spawns multiple specialized Subagents to explore distinct facets of a query in parallel.

This orchestrator-worker design draws inspiration from distributed computing paradigms and allows the system to achieve both breadth and depth beyond what a single-agent pipeline could accomplish.

In practice, a user’s query first reaches the LeadResearcher, which deconstructs the question into a coherent research plan and assigns targeted subtasks to Suba…

2 weeks, 3 days назад @ thesequence.substack.com
The Sequence Knowledge #670: Evaluating AI in Software Engineering Tasks
The Sequence Knowledge #670: Evaluating AI in Software Engineering Tasks The Sequence Knowledge #670: Evaluating AI in Software Engineering Tasks

Created Using GPT-4oToday we will Discuss:An overview of software engineering benchmarks.

A review of the SWE-Benchmark, the gold standard of software engineering AI evals.

💡 AI Concept of the Day: Software Engineering AI BenchmarksAs large language models (LLMs) find their way into software development workflows, the need for rigorous benchmarks to evaluate their coding capabilities has grown rapidly.

Today, software engineering benchmarks go far beyond simple code generation.

Built from real GitHub issues and corresponding pull requests, SWE-bench tasks models with generating code changes that resolve bugs and pass unit tests.

2 weeks, 4 days назад @ thesequence.substack.com
The Sequence Radar #: MiniMax-M1 is a Very Impressive Model
The Sequence Radar #: MiniMax-M1 is a Very Impressive Model The Sequence Radar #: MiniMax-M1 is a Very Impressive Model

A debate about reasoning in AI models works like system1-system2.

A review of the MCP-Use framework to integrate with MCP serversYou can subscribe to The Sequence below:📝 Editorial: MiniMax-M1 is a Very Impressive ModelAlgorithmic innovation is always interesting when comes to LLMs.

Last week, we had a very interesting release of a highly innovative model that flew a bit under the radar.

MiniMax-M1 is a new 456B parameter model that redefines efficiency and scale for open-weight models.

In combination, this hybrid architecture allows the model to handle up to 1 million tokens of context natively.

2 weeks, 6 days назад @ thesequence.substack.com
Synced Review
последний пост 3 months назад
DeepSeek Signals Next-Gen R2 Model, Unveils Novel Approach to Scaling Inference with SPCT
DeepSeek Signals Next-Gen R2 Model, Unveils Novel Approach to Scaling Inference with SPCT DeepSeek Signals Next-Gen R2 Model, Unveils Novel Approach to Scaling Inference with SPCT

DeepSeek AI, a prominent player in the large language model arena, has recently published a research paper detailing a new technique aimed…Continue reading on SyncedReview »

3 months назад @ medium.com
Automating Artificial Life Discovery: The Power of Foundation Models
Automating Artificial Life Discovery: The Power of Foundation Models Automating Artificial Life Discovery: The Power of Foundation Models

The recent Nobel Prize for groundbreaking advancements in protein discovery underscores the transformative potential of foundation models…Continue reading on SyncedReview »

6 months, 1 week назад @ medium.com
Llama 3 Meets MoE: Pioneering Low-Cost High-Performance AI
Llama 3 Meets MoE: Pioneering Low-Cost High-Performance AI Llama 3 Meets MoE: Pioneering Low-Cost High-Performance AI

Continue reading on SyncedReview »

6 months, 2 weeks назад @ medium.com
DeepMind’s JetFormer: Unified Multimodal Models Without Modelling Constraints
DeepMind’s JetFormer: Unified Multimodal Models Without Modelling Constraints DeepMind’s JetFormer: Unified Multimodal Models Without Modelling Constraints

Recent advancements in training large multimodal models have been driven by efforts to eliminate modeling constraints and unify…Continue reading on SyncedReview »

6 months, 2 weeks назад @ medium.com
NVIDIA’s nGPT: Revolutionizing Transformers with Hypersphere Representation
NVIDIA’s nGPT: Revolutionizing Transformers with Hypersphere Representation NVIDIA’s nGPT: Revolutionizing Transformers with Hypersphere Representation

The Transformer architecture, introduced by Vaswani et al. in 2017, serves as the backbone of contemporary language models. Over the years…Continue reading on SyncedReview »

6 months, 3 weeks назад @ medium.com
From Token to Conceptual: Meta Introduces Large Concept Models in Multilingual AI
From Token to Conceptual: Meta Introduces Large Concept Models in Multilingual AI From Token to Conceptual: Meta Introduces Large Concept Models in Multilingual AI

Large Language Models (LLMs) have become indispensable tools for diverse natural language processing (NLP) tasks. Traditional LLMs operate…Continue reading on SyncedReview »

6 months, 3 weeks назад @ medium.com
NVIDIA’s Hybrid: Combining Attention and State Space Models for Breakthrough Performance of Small…
NVIDIA’s Hybrid: Combining Attention and State Space Models for Breakthrough Performance of Small… NVIDIA’s Hybrid: Combining Attention and State Space Models for Breakthrough Performance of Small…

Language models (LMs) based on transformers have become the gold standard in natural language processing, thanks to their exceptional…Continue reading on SyncedReview »

7 months назад @ medium.com
From Response to Query: The Power of Reverse Thinking in Language Models
From Response to Query: The Power of Reverse Thinking in Language Models From Response to Query: The Power of Reverse Thinking in Language Models

Continue reading on SyncedReview »

7 months назад @ medium.com
Yann LeCun Team’s New Research: Revolutionizing Visual Navigation with Navigation World Models
Yann LeCun Team’s New Research: Revolutionizing Visual Navigation with Navigation World Models Yann LeCun Team’s New Research: Revolutionizing Visual Navigation with Navigation World Models

Navigation is a fundamental skill for any visually-capable organism, serving as a critical tool for survival. It enables agents to locate…Continue reading on SyncedReview »

7 months назад @ medium.com
The Future of Vision AI: How Apple’s AIMV2 Leverages Images and Text to Lead the Pack
The Future of Vision AI: How Apple’s AIMV2 Leverages Images and Text to Lead the Pack The Future of Vision AI: How Apple’s AIMV2 Leverages Images and Text to Lead the Pack

The landscape of vision model pre-training has undergone significant evolution, especially with the rise of Large Language Models (LLMs)…Continue reading on SyncedReview »

7 months, 1 week назад @ medium.com
Redefining Music AI: The Power of Sony’s SoniDo as a Versatile Foundation Model
Redefining Music AI: The Power of Sony’s SoniDo as a Versatile Foundation Model Redefining Music AI: The Power of Sony’s SoniDo as a Versatile Foundation Model

A foundation model refers to a pre-trained model developed on extensive datasets, designed to be versatile and adaptable for a range of…Continue reading on SyncedReview »

7 months, 1 week назад @ medium.com
DeepMind’s Socratic Learning with Language Games: The Path to Self-Improving Superintelligence
DeepMind’s Socratic Learning with Language Games: The Path to Self-Improving Superintelligence DeepMind’s Socratic Learning with Language Games: The Path to Self-Improving Superintelligence

Continue reading on SyncedReview »

7 months, 2 weeks назад @ medium.com
Revolutionizing AI on a Budget: Apple’s Roadmap for Small Language Models Training Success
Revolutionizing AI on a Budget: Apple’s Roadmap for Small Language Models Training Success Revolutionizing AI on a Budget: Apple’s Roadmap for Small Language Models Training Success

While large language models (LLMs) dominate the AI landscape, Small-scale Large Language Models (SLMs) are gaining traction as…Continue reading on SyncedReview »

7 months, 2 weeks назад @ medium.com
Redefines Consistency Models”: OpenAI’s TrigFlow Narrows FID Gap to 10% with Efficient Two-Step…
Redefines Consistency Models”: OpenAI’s TrigFlow Narrows FID Gap to 10% with Efficient Two-Step… Redefines Consistency Models”: OpenAI’s TrigFlow Narrows FID Gap to 10% with Efficient Two-Step…

Consistency models (CMs) are a cutting-edge class of diffusion-based generative models designed for rapid and efficient sampling. However…Continue reading on SyncedReview »

7 months, 2 weeks назад @ medium.com
Precision in Pixels: NVIDIA’s Edify Image Model Combines High Quality with Unmatched Control
Precision in Pixels: NVIDIA’s Edify Image Model Combines High Quality with Unmatched Control Precision in Pixels: NVIDIA’s Edify Image Model Combines High Quality with Unmatched Control

The field of text-to-image synthesis has advanced rapidly, with state-of-the-art models now generating highly realistic and diverse images…Continue reading on SyncedReview »

7 months, 2 weeks назад @ medium.com
📓 Cool Blogs
ODS.ai Habr ODS.ai Habr
последний пост 3 months, 2 weeks назад
Байесовская собака: анализ пёсьего компаса
Байесовская собака: анализ пёсьего компаса Байесовская собака: анализ пёсьего компаса

", подумал я. И, к счастью, у меня как раз под рукой оказался идеальный подопытный.

Стандартное арифметическое среднее между 360° и 0° даст нам 180°, несмотря на то, что и 360°, и 0° указывают в одном направлении.

Нулевая гипотеза утверждает, что данные распределены равномерно по кругу, альтернативная — что это не так.

from pingouin import circ_vtest v, pval = circ_vtest(data['radians'], dir=np.pi) print(f"V-statistics: {v:.3f}; p-value: {pval:.6f}")>> V-statistics: 24.127; p-value: 0.002904Вот мы и подобрались к чему-то интересному!

Априорное распределение и функция правдоподобияПредположим, что у нас есть:Априорное распределение с параметрамиФункция правдоподобия для нового наблюдения с п…

3 months, 2 weeks назад @ habr.com
Создаем воспоминания. Осваиваем FLUX, LoRA и ComfyUI
Создаем воспоминания. Осваиваем FLUX, LoRA и ComfyUI Создаем воспоминания. Осваиваем FLUX, LoRA и ComfyUI

Такие модели можно обучать с нуля и это дорого, нужен кластер с GPU (видеокарты) и много данных.

В домене текст-картинка бывают открытые модели, типа Stable Diffusion, Kandinsky и FLUX, бывают закрытые, типа DALL-E.Открытую модель можно дообучать разными способами.

Борис СтругацкийОсобенности: Для личностей типа Стругацких или Бродского, качественных фотографий крайне мало, но много и не надо.

Можно и фразу.

Владимир СурдинАлексей СемихатовВидео с их лекциями можно найти повсеместно, начать можно с канала Вселенная плюс на YouTube и в телеграм.

6 months, 1 week назад @ habr.com
Как нейросети, RL и байесовскую оптимизацию стали использовать на ускорителях заряженных частиц
Как нейросети, RL и байесовскую оптимизацию стали использовать на ускорителях заряженных частиц Как нейросети, RL и байесовскую оптимизацию стали использовать на ускорителях заряженных частиц

Один из них — поддержание стабильной орбиты пучка частиц (траектории, по которой происходит движение), которая критически важна для точности экспериментов.

Вот, кстати, наша статья про то, как сейсмические вибрации будут влиять на орбиту пучка в СКИФ: Beam Stability .

Классические подходы к стабилизации орбиты пучка в ускорителяхДля стабилизации орбиты используют датчики положения пучка (BPM) и магнитные корректоры.

По словам авторов получились следующие преимущества:Ускорение процесса коррекции орбиты и повышение точности по сравнению с классическими методами, такими как SVD.

Задача агента:Автоматическое восстановление орбиты пучка заряженных частиц за ограниченное время и минимальное коли…

6 months, 3 weeks назад @ habr.com
о1: почему новая GPT от OpenAI — это не хайп, а переход к новой парадигме в ИИ
о1: почему новая GPT от OpenAI — это не хайп, а переход к новой парадигме в ИИ о1: почему новая GPT от OpenAI — это не хайп, а переход к новой парадигме в ИИ

В этой статье мы разберемся, чему научилась новая GPT o1, и как это повлияет на дальнейшую эволюцию ИИ.

Компания утверждает, что для них сброс счётчика линейки моделей к единичке знаменует собой переход к новой парадигме, и что эта нейросеть и вовсе демонстрирует новый уровень возможностей ИИ.

На форумах и в Твиттере была куча обсуждений, предвосхищений и хайпа, на фоне которых планка ожиданий некоторых людей взлетела до небес.

Издание Bloomberg рассказало, что в ходе внутренней демонстрации OpenAI показали концепцию из пяти уровней, помогающую отслеживать прогресс в создании ИИ.

Однако на уровне GPT-5 прирост в навыках может быть совсем другим (как в лучшую, так и в худшую сторону).

9 months, 4 weeks назад @ habr.com
Machine Learning Mastery
последний пост None
ML in Production
последний пост None
Sorta Insightful Sorta Insightful
последний пост 3 months, 1 week назад
Who is AI For?
Who is AI For? Who is AI For?

I think the easy answer to this question is that right now, AI is for the AI developers.

Code is useful, it makes money, it is a testbed for AI speeding up the development of AI, and it is easy.

I’m working in AI because it pays well and is potentially really good for the world.

The artists did not know what AI was, but when they learned, they quickly decided they did not want it.

It feels like the most likely outcome is that people go all-in on pushing raw intelligence, in the way that AI developers can measure it, leaving behind those that are not like AI developers.

3 months, 1 week назад @ alexirpan.com
MIT Mystery Hunt 2025
MIT Mystery Hunt 2025 MIT Mystery Hunt 2025

This has spoilers for MIT Mystery Hunt 2025.

I enjoyed it more than their 2018 Hunt, which is commonly cited as an all-time good Mystery Hunt.

In this Mystery Hunt it was reversed, where the act of unlocking is easy but the value and difficulty of a feeder varied.

In my free time pre-Hunt, I went to Puzzled Pint, where I tried to all-brain a logic puzzle (solve it without writing anything).

I’m looking forward to solving “No Assembly Required” in Mystery Hunt 2026, a puzzle that gives you the answer for no work.

5 months, 2 weeks назад @ alexirpan.com
Using AI to Get the Neopets Destruct-o-Match Avatar
Using AI to Get the Neopets Destruct-o-Match Avatar Using AI to Get the Neopets Destruct-o-Match Avatar

If AI can be superhuman at Go, surely AI can be slightly-worse-than-experts at Destruct-o-Match if we try?

Step 0: Is Making a Destruct-o-Match AI Against Neopets Rules?

I believe the precedent is in favor of a Destruct-o-Match AI being okay.

As long as I’m the one inputting moves the Destruct-o-Match AI recommends, I should be okay.

To write a game AI, we first need to implement the rules of the game in code.

6 months назад @ alexirpan.com
Late Takes on OpenAI o1
Late Takes on OpenAI o1 Late Takes on OpenAI o1

I realize how late this is, but I didn’t get a post out while o1 was fresh, and still feel like writing one despite it being cold.

(Also, OpenAI just announced they’re going to ship new stuff starting tomorrow so it’s now or never to say something.)

OpenAI o1 is a model release widely believed (but not confirmed) to be a post-trained version of GPT-4o.

If true, that makes this video especially useful for understanding OpenAI o1.

Which I suppose is part of why I’m talking about o1 rather than building o1.

7 months, 1 week назад @ alexirpan.com
Lil'Log
последний пост None
The Spectator
последний пост None
Off the Convex Path
последний пост None
fast.ai NLP fast.ai NLP
последний пост None
Sebastian Ruder
последний пост None
Andrew Karpathy blog
последний пост None
大トロ 大トロ
последний пост None
🔬 Science
Papers With Code Papers With Code
последний пост 1 week, 5 days назад
/mehrdadsaberi/ Model State Arithmetic for Machine Unlearning
/mehrdadsaberi/ Model State Arithmetic for Machine Unlearning /mehrdadsaberi/ Model State Arithmetic for Machine Unlearning

Large language models are trained on massive corpora of web data, which may include private data, copyrighted material, factually inaccurate data, or data that degrades model performance.

Eliminating the influence of such problematic datapoints through complete retraining -- by repeatedly pretraining the model on datasets that exclude these specific instances -- is computationally prohibitive.

For this reason, unlearning algorithms have emerged that aim to eliminate the influence of particular datapoints, while otherwise preserving the model -- at a low computational cost.

However, precisely estimating and undoing the influence of individual datapoints has proved to be challenging.

In this …

1 week, 5 days назад @ paperswithcode.com
/quester-one/ Agent-RewardBench: Towards a Unified Benchmark for Reward Modeling across Perception, Planning, and Safety in Real-World Multimodal Agents
/quester-one/ Agent-RewardBench: Towards a Unified Benchmark for Reward Modeling across Perception, Planning, and Safety in Real-World Multimodal Agents /quester-one/ Agent-RewardBench: Towards a Unified Benchmark for Reward Modeling across Perception, Planning, and Safety in Real-World Multimodal Agents

As Multimodal Large Language Models (MLLMs) advance, multimodal agents show promise in real-world tasks like web navigation and embodied intelligence.

A promising approach is to use reward models as external feedback, but there is no clear on how to select reward models for agents.

The benchmark is characterized by three key features: (1) Multiple dimensions and real-world agent scenarios evaluation.

It covers perception, planning, and safety with 7 scenarios; (2) Step-level reward evaluation.

Experiments demonstrate that even state-of-the-art multimodal models show limited performance, highlighting the need for specialized training in agent reward modeling.

1 week, 5 days назад @ paperswithcode.com
/huggingface/ FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
/huggingface/ FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language /huggingface/ FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Pre-training state-of-the-art large language models (LLMs) requires vast amounts of clean and diverse text data.

While the open development of large high-quality English pre-training datasets has seen substantial recent progress, training performant multilingual LLMs remains a challenge, in large part due to the inherent difficulty of tailoring filtering and deduplication pipelines to a large number of languages.

In this work, we introduce a new pre-training dataset curation pipeline based on FineWeb that can be automatically adapted to support any language.

Ultimately, we show that our pipeline can be used to create non-English corpora that produce more performant models than prior dataset…

1 week, 5 days назад @ paperswithcode.com
/seon82/ A Hierarchical Deep Learning Approach for Minority Instrument Detection
/seon82/ A Hierarchical Deep Learning Approach for Minority Instrument Detection /seon82/ A Hierarchical Deep Learning Approach for Minority Instrument Detection

Identifying instrument activities within audio excerpts is vital in music information retrieval, with significant implications for music cataloging and discovery.

Prior deep learning endeavors in musical instrument recognition have predominantly emphasized instrument classes with ample data availability.

Recent studies have demonstrated the applicability of hierarchical classification in detecting instrument activities in orchestral music, even with limited fine-grained annotations at the instrument level.

This work presents various strategies to integrate hierarchical structures into models and tests a new class of models for hierarchical music prediction.

This study showcases more reliabl…

1 week, 5 days назад @ paperswithcode.com
/sri-csl/ Scalable Bayesian Low-Rank Adaptation of Large Language Models via Stochastic Variational Subspace Inference
/sri-csl/ Scalable Bayesian Low-Rank Adaptation of Large Language Models via Stochastic Variational Subspace Inference /sri-csl/ Scalable Bayesian Low-Rank Adaptation of Large Language Models via Stochastic Variational Subspace Inference

Prior work has made Bayesian deep learning-based approaches to this problem more tractable by performing inference over the low-rank adaptation (LoRA) parameters of a fine-tuned model.

In this work we present $\textbf{Scala}$ble $\textbf{B}$ayesian $\textbf{L}$ow-Rank Adaptation via Stochastic Variational Subspace Inference (ScalaBL).

We perform Bayesian inference in an $r$-dimensional subspace, for LoRA rank $r$.

By repurposing the LoRA parameters as projection matrices, we are able to map samples from this subspace into the full weight space of the LLM.

This allows us to learn all the parameters of our approach using stochastic variational inference.

1 week, 5 days назад @ paperswithcode.com
/yihong-97/ Unlocking Constraints: Source-Free Occlusion-Aware Seamless Segmentation
/yihong-97/ Unlocking Constraints: Source-Free Occlusion-Aware Seamless Segmentation /yihong-97/ Unlocking Constraints: Source-Free Occlusion-Aware Seamless Segmentation

Panoramic image processing is essential for omni-context perception, yet faces constraints like distortions, perspective occlusions, and limited annotations.

Previous unsupervised domain adaptation methods transfer knowledge from labeled pinhole data to unlabeled panoramic images, but they require access to source pinhole data.

To address these, we introduce a more practical task, i.e., Source-Free Occlusion-Aware Seamless Segmentation (SFOASS), and propose its first solution, called UNconstrained Learning Omni-Context Knowledge (UNLOCK).

While adapting without relying on source data or target labels, this framework enhances models to achieve segmentation with 360{\deg} viewpoint coverage a…

1 week, 5 days назад @ paperswithcode.com
/galvinlim/ AGTCNet: A Graph-Temporal Approach for Principled Motor Imagery EEG Classification
/galvinlim/ AGTCNet: A Graph-Temporal Approach for Principled Motor Imagery EEG Classification /galvinlim/ AGTCNet: A Graph-Temporal Approach for Principled Motor Imagery EEG Classification

Brain-computer interface (BCI) technology utilizing electroencephalography (EEG) marks a transformative innovation, empowering motor-impaired individuals to engage with their environment on equal footing.

Despite its promising potential, developing subject-invariant and session-invariant BCI systems remains a significant challenge due to the inherent complexity and variability of neural activity across individuals and over time, compounded by EEG hardware constraints.

While prior studies have sought to develop robust BCI systems, existing approaches remain ineffective in capturing the intricate spatiotemporal dependencies within multichannel EEG signals.

This study addresses this gap by int…

1 week, 5 days назад @ paperswithcode.com
/heboyong/ Boosting Domain Generalized and Adaptive Detection with Diffusion Models: Fitness, Generalization, and Transferability
/heboyong/ Boosting Domain Generalized and Adaptive Detection with Diffusion Models: Fitness, Generalization, and Transferability /heboyong/ Boosting Domain Generalized and Adaptive Detection with Diffusion Models: Fitness, Generalization, and Transferability

Detectors often suffer from performance drop due to domain gap between training and testing data.

Recent methods explore diffusion models applied to domain generalization (DG) and adaptation (DA) tasks, but still struggle with large inference costs and have not yet fully leveraged the capabilities of diffusion models.

We also apply consistency loss to align the auxiliary and ordinary branch, balancing fitness and generalization while preventing overfitting and improving performance on target domains (i.e., Generalization).

Furthermore, within a unified framework, standard detectors are guided by diffusion detectors through feature-level and object-level alignment on source domains (for DG) …

1 week, 5 days назад @ paperswithcode.com
/yannkerzreho/ Homogenization of Multi-agent Learning Dynamics in Finite-state Markov Games
/yannkerzreho/ Homogenization of Multi-agent Learning Dynamics in Finite-state Markov Games /yannkerzreho/ Homogenization of Multi-agent Learning Dynamics in Finite-state Markov Games

This paper introduces a new approach for approximating the learning dynamics of multiple reinforcement learning (RL) agents interacting in a finite-state Markov game.

The idea is to rescale the learning process by simultaneously reducing the learning rate and increasing the update frequency, effectively treating the agent's parameters as a slow-evolving variable influenced by the fast-mixing game state.

Under mild assumptions-ergodicity of the state process and continuity of the updates-we prove the convergence of this rescaled process to an ordinary differential equation (ODE).

This ODE provides a tractable, deterministic approximation of the agent's learning dynamics.

An implementation of…

1 week, 5 days назад @ paperswithcode.com
/hwang52/ FedSC: Federated Learning with Semantic-Aware Collaboration
/hwang52/ FedSC: Federated Learning with Semantic-Aware Collaboration /hwang52/ FedSC: Federated Learning with Semantic-Aware Collaboration

Federated learning (FL) aims to train models collaboratively across clients without sharing data for privacy-preserving.

However, one major challenge is the data heterogeneity issue, which refers to the biased labeling preferences at multiple clients.

To explore the possibility of using intra-client semantically meaningful knowledge in handling data heterogeneity, in this paper, we propose Federated Learning with Semantic-Aware Collaboration (FedSC) to capture client-specific and class-relevant knowledge across heterogeneous clients.

The core idea of FedSC is to construct relational prototypes and consistent prototypes at semantic-level, aiming to provide fruitful class underlying knowledge…

1 week, 5 days назад @ paperswithcode.com
/ins-amu/ Amortizing personalization in virtual brain twins
/ins-amu/ Amortizing personalization in virtual brain twins /ins-amu/ Amortizing personalization in virtual brain twins

Virtual brain twins are personalized digital models of individual human subject or patient's brains, allowing for mechanistic interpretation of neuroimaging data features.

Training and inference with these models however presents a pair of challenges: large shared infrastructure do not allow for use of personal data and inference in clinical applications should not require significant resources.

We introduce "anonymized personalization" to address both by expanding model priors to include personalization which under amortized inference allows training to be performed anonymously, while inference is both personalized and lightweight.

We illustrate the basic approach, demonstrate reliability …

1 week, 5 days назад @ paperswithcode.com
/tianyi-lab/ FaSTA$^*$: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing
/tianyi-lab/ FaSTA$^*$: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing /tianyi-lab/ FaSTA$^*$: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing

We develop a cost-efficient neurosymbolic agent to address challenging multi-turn image editing tasks such as "Detect the bench in the image while recoloring it to pink.

Also, remove the cat for a clearer view and recolor the wall to yellow.''

It combines the fast, high-level subtask planning by large language models (LLMs) with the slow, accurate, tool-use, and local A$^*$ search per subtask to find a cost-efficient toolpath -- a sequence of calls to AI tools.

By comparing with recent image editing approaches, we demonstrate FaSTA$^*$ is significantly more computationally efficient while remaining competitive with the state-of-the-art baseline in terms of success rate.

PDFAbstract

1 week, 5 days назад @ paperswithcode.com
/ai-agi/ TableMoE: Neuro-Symbolic Routing for Structured Expert Reasoning in Multimodal Table Understanding
/ai-agi/ TableMoE: Neuro-Symbolic Routing for Structured Expert Reasoning in Multimodal Table Understanding /ai-agi/ TableMoE: Neuro-Symbolic Routing for Structured Expert Reasoning in Multimodal Table Understanding

Existing multimodal large language models (MLLMs) struggle with such WildStruct conditions, resulting in limited performance and poor generalization.

To address these challenges, we propose TableMoE, a neuro-symbolic Mixture-of-Connector-Experts (MoCE) architecture specifically designed for robust, structured reasoning over multimodal table data.

TableMoE features an innovative Neuro-Symbolic Routing mechanism, which predicts latent semantic token roles (e.g., header, data cell, axis, formula) and dynamically routes table elements to specialized experts (Table-to-HTML, Table-to-JSON, Table-to-Code) using a confidence-aware gating strategy informed by symbolic reasoning graphs.

Extensive abl…

1 week, 5 days назад @ paperswithcode.com
/jianghaiscu/ Learning to See in the Extremely Dark
/jianghaiscu/ Learning to See in the Extremely Dark /jianghaiscu/ Learning to See in the Extremely Dark

Learning-based methods have made promising advances in low-light RAW image enhancement, while their capability to extremely dark scenes where the environmental illuminance drops as low as 0.0001 lux remains to be explored due to the lack of corresponding datasets.

To this end, we propose a paired-to-paired data synthesis pipeline capable of generating well-calibrated extremely low-light RAW images at three precise illuminance ranges of 0.01-0.1 lux, 0.001-0.01 lux, and 0.0001-0.001 lux, together with high-quality sRGB references to comprise a large-scale paired dataset named See-in-the-Extremely-Dark (SIED) to benchmark low-light RAW image enhancement approaches.

Extensive experiments on th…

1 week, 5 days назад @ paperswithcode.com
/unabletousegit/ Task-Aware KV Compression For Cost-Effective Long Video Understanding
/unabletousegit/ Task-Aware KV Compression For Cost-Effective Long Video Understanding /unabletousegit/ Task-Aware KV Compression For Cost-Effective Long Video Understanding

Recent approaches have explored KV compression to mitigate this issue, but they often suffer from significant information loss at high compression ratios.

In this paper, we introduce Video-X^2L, which flexibly preserves critical video information for each LVU task.

The first one is called bi-level KV compression.

During the MLLM's pre-filling stage, Video-X^2L generates two types of compressed KVs: low-compression KVs (L-KVs) to capture fine-grained video details and high-compression KVs (H-KVs) to offer compact video representations.

During the MLLM's decoding stage, Video-X^2L selectively re-loads L-KVs for the most critical video chunks while using H-KVs for other less important ones.

1 week, 5 days назад @ paperswithcode.com
Papers With Code Papers With Code
последний пост 1 week, 5 days назад
/everm0re/ EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora
/everm0re/ EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora /everm0re/ EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora

Graph-based Retrieval-Augmented Generation (Graph-RAG) enhances large language models (LLMs) by structuring retrieval over an external corpus.

To address these limitations, we introduce EraRAG, a novel multi-layered Graph-RAG framework that supports efficient and scalable dynamic updates.

The design eliminates the need for retraining or costly recomputation while preserving high retrieval accuracy and low latency.

Experiments on large-scale benchmarks demonstrate that EraRag achieves up to an order of magnitude reduction in update time and token consumption compared to existing Graph-RAG systems, while providing superior accuracy performance.

This work offers a practical path forward for RA…

1 week, 5 days назад @ paperswithcode.com
/danialmoa/ Robust Deep Learning for Myocardial Scar Segmentation in Cardiac MRI with Noisy Labels
/danialmoa/ Robust Deep Learning for Myocardial Scar Segmentation in Cardiac MRI with Noisy Labels /danialmoa/ Robust Deep Learning for Myocardial Scar Segmentation in Cardiac MRI with Noisy Labels

The accurate segmentation of myocardial scars from cardiac MRI is essential for clinical assessment and treatment planning.

In this study, we propose a robust deep-learning pipeline for fully automated myocardial scar detection and segmentation by fine-tuning state-of-the-art models.

We evaluate the model's performance on both acute and chronic cases and demonstrate its ability to produce accurate and smooth segmentations despite noisy labels.

In particular, our approach outperforms state-of-the-art models like nnU-Net and shows strong generalizability in an out-of-distribution test set, highlighting its robustness across various imaging conditions and clinical tasks.

These results establis…

1 week, 5 days назад @ paperswithcode.com
/antoniolopezmc/ Discovering multiple antibiotic resistance phenotypes using diverse top-k subgroup list discovery
/antoniolopezmc/ Discovering multiple antibiotic resistance phenotypes using diverse top-k subgroup list discovery /antoniolopezmc/ Discovering multiple antibiotic resistance phenotypes using diverse top-k subgroup list discovery

Patient phenotyping is the task of finding a set of patient characteristics related to a specific medical problem such as the one described in this work.

However, a single explanation of a medical phenomenon might be useless in the eyes of a clinical expert and be discarded.

The discovery of multiple patient phenotypes for the same medical phenomenon would be useful in such cases.

Our proposal provides clinicians with a method with which to obtain multiple and diverse phenotypes of a set of patients.

We show a real use case of phenotyping in antimicrobial resistance using the well-known MIMIC-III dataset.

1 week, 5 days назад @ paperswithcode.com
/lcs0215/ OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography
/lcs0215/ OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography /lcs0215/ OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography

As one of the earliest ancient languages, Oracle Bone Script (OBS) encapsulates the cultural records and intellectual expressions of ancient civilizations.

To address these challenges, this paper proposes a novel two-stage semantic typography framework, named OracleFusion.

In the second stage, we introduce Oracle Structural Vector Fusion (OSVF), incorporating glyph structure constraints and glyph maintenance constraints to ensure the accurate generation of semantically enriched vector fonts.

This approach preserves the objective integrity of the glyph structure, offering visually enhanced representations that assist experts in deciphering OBS.

Furthermore, OracleFusion provides expert-like …

1 week, 5 days назад @ paperswithcode.com
/lihe-maxsize/ Transformer-Based Spatial-Temporal Counterfactual Outcomes Estimation
/lihe-maxsize/ Transformer-Based Spatial-Temporal Counterfactual Outcomes Estimation /lihe-maxsize/ Transformer-Based Spatial-Temporal Counterfactual Outcomes Estimation

Therefore, estimating the counterfactual outcomes with spatial-temporal attributes is a crucial problem.

This paper proposes a novel framework for estimating counterfactual outcomes with spatial-temporal attributes using the Transformer, exhibiting stronger estimation ability.

To validate the effectiveness of our approach, we conduct simulation experiments and real data experiments.

Simulation experiments show that our estimator has a stronger estimation capability than baseline methods.

Real data experiments provide a valuable conclusion to the causal effect of conflicts on forest loss in Colombia.

1 week, 5 days назад @ paperswithcode.com
/aoqunjin/ Parallels Between VLA Model Post-Training and Human Motor Learning: Progress, Challenges, and Trends
/aoqunjin/ Parallels Between VLA Model Post-Training and Human Motor Learning: Progress, Challenges, and Trends /aoqunjin/ Parallels Between VLA Model Post-Training and Human Motor Learning: Progress, Challenges, and Trends

Vision-language-action (VLA) models extend vision-language models (VLM) by integrating action generation modules for robotic manipulation.

Evidence from multiple domains highlights the critical role of post-training to align foundational models with downstream applications, spurring extensive research on post-training VLA models.

Accordingly, this paper reviews post-training strategies for VLA models through the lens of human motor learning, focusing on three dimensions: environments, embodiments, and tasks.

Finally, key challenges and trends in post-training VLA models are identified, establishing a conceptual framework to guide future research.

This work delivers both a comprehensive over…

1 week, 5 days назад @ paperswithcode.com
/pd162/ Class-Agnostic Region-of-Interest Matching in Document Images
/pd162/ Class-Agnostic Region-of-Interest Matching in Document Images /pd162/ Class-Agnostic Region-of-Interest Matching in Document Images

Document understanding and analysis have received a lot of attention due to their widespread application.

However, existing document analysis solutions, such as document layout analysis and key information extraction, are only suitable for fixed category definitions and granularities, and cannot achieve flexible applications customized by users.

Therefore, this paper defines a new task named ``Class-Agnostic Region-of-Interest Matching'' (``RoI-Matching'' for short), which aims to match the customized regions in a flexible, efficient, multi-granularity, and open-set manner.

The visual prompt of the reference document and target document images are fed into our model, while the output is the…

1 week, 5 days назад @ paperswithcode.com
/xiweix/ ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation
/xiweix/ ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation /xiweix/ ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation

Training-free open-vocabulary semantic segmentation (OVS) aims to segment images given a set of arbitrary textual categories without costly model fine-tuning.

Existing solutions often explore attention mechanisms of pre-trained models, such as CLIP, or generate synthetic data and design complex retrieval processes to perform OVS.

However, their performance is limited by the capability of reliant models or the suboptimal quality of reference sets.

In this work, we investigate the largely overlooked data quality problem for this challenging dense scene understanding task, and identify that a high-quality reference set can significantly benefit training-free OVS.

Remarkably, extensive evaluati…

1 week, 5 days назад @ paperswithcode.com
/shuoyang2/ RecCoT: Enhancing Recommendation via Chain-of-Thought
/shuoyang2/ RecCoT: Enhancing Recommendation via Chain-of-Thought /shuoyang2/ RecCoT: Enhancing Recommendation via Chain-of-Thought

In real-world applications, users always interact with items in multiple aspects, such as through implicit binary feedback (e.g., clicks, dislikes, long views) and explicit feedback (e.g., comments, reviews).

Modern recommendation systems (RecSys) learn user-item collaborative signals from these implicit feedback signals as a large-scale binary data-streaming, subsequently recommending other highly similar items based on users' personalized historical interactions.

Consequently, under this binary learning paradigm, the RecSys struggles to understand why a user likes or dislikes certain items.

To alleviate it, some works attempt to utilize the content-based reviews to capture the semantic kn…

1 week, 5 days назад @ paperswithcode.com
/haoang97/ Unveiling Causal Reasoning in Large Language Models: Reality or Mirage?
/haoang97/ Unveiling Causal Reasoning in Large Language Models: Reality or Mirage? /haoang97/ Unveiling Causal Reasoning in Large Language Models: Reality or Mirage?

Causal reasoning capability is critical in advancing large language models (LLMs) toward strong artificial intelligence.

Specifically, LLMs are only capable of performing shallow (level-1) causal reasoning, primarily attributed to the causal knowledge embedded in their parameters, but they lack the capacity for genuine human-like (level-2) causal reasoning.

To bridge the gap towards level-2 causal reasoning, we draw inspiration from the fact that human reasoning is usually facilitated by general knowledge and intended goals.

We propose G^2-Reasoner, a method that incorporates general knowledge and goal-oriented prompts into LLMs' causal reasoning processes.

Experiments demonstrate that G^2-…

1 week, 5 days назад @ paperswithcode.com
/chenkaisun/ Beyond Reactive Safety: Risk-Aware LLM Alignment via Long-Horizon Simulation
/chenkaisun/ Beyond Reactive Safety: Risk-Aware LLM Alignment via Long-Horizon Simulation /chenkaisun/ Beyond Reactive Safety: Risk-Aware LLM Alignment via Long-Horizon Simulation

Given the growing influence of language model-based agents on high-stakes societal decisions, from public policy to healthcare, ensuring their beneficial impact requires understanding the far-reaching implications of their suggestions.

We propose a proof-of-concept framework that projects how model-generated advice could propagate through societal systems on a macroscopic scale over time, enabling more robust alignment.

To assess the long-term safety awareness of language models, we also introduce a dataset of 100 indirect harm scenarios, testing models' ability to foresee adverse, non-obvious outcomes from seemingly harmless user prompts.

Our approach achieves not only over 20% improvement…

1 week, 5 days назад @ paperswithcode.com
/7uheng/ Out-of-Distribution Semantic Occupancy Prediction
/7uheng/ Out-of-Distribution Semantic Occupancy Prediction /7uheng/ Out-of-Distribution Semantic Occupancy Prediction

3D Semantic Occupancy Prediction is crucial for autonomous driving, providing a dense, semantically rich environmental representation.

However, existing methods focus on in-distribution scenes, making them susceptible to Out-of-Distribution (OoD) objects and long-tail distributions, which increases the risk of undetected anomalies and misinterpretations, posing safety hazards.

To address these challenges, we introduce Out-of-Distribution Semantic Occupancy Prediction, targeting OoD detection in 3D voxel space.

We introduce OccOoD, a novel framework integrating OoD detection into 3D semantic occupancy prediction, with Voxel-BEV Progressive Fusion (VBPF) leveraging an RWKV-based branch to enh…

1 week, 5 days назад @ paperswithcode.com
/V3RGANz/ JointRank: Rank Large Set with Single Pass
/V3RGANz/ JointRank: Rank Large Set with Single Pass /V3RGANz/ JointRank: Rank Large Set with Single Pass

Efficiently ranking relevant items from large candidate pools is a cornerstone of modern information retrieval systems -- such as web search, recommendation, and retrieval-augmented generation.

Listwise rerankers, which improve relevance by jointly considering multiple candidates, are often limited in practice: either by model input size constraints, or by degraded quality when processing large sets.

We propose a model-agnostic method for fast reranking large sets that exceed a model input limits.

The method first partitions candidate items into overlapping blocks, each of which is ranked independently in parallel.

Finally, these comparisons are aggregated to construct a global ranking usin…

1 week, 5 days назад @ paperswithcode.com
/k2-fsa/ ZipVoice: Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching
/k2-fsa/ ZipVoice: Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching /k2-fsa/ ZipVoice: Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Existing large-scale zero-shot text-to-speech (TTS) models deliver high speech quality but suffer from slow inference speeds due to massive parameters.

To address this issue, this paper introduces ZipVoice, a high-quality flow-matching-based zero-shot TTS model with a compact model size and fast inference speed.

Experiments on 100k hours multilingual datasets show that ZipVoice matches state-of-the-art models in speech quality, while being 3 times smaller and up to 30 times faster than a DiT-based flow-matching baseline.

Codes, model checkpoints and demo samples are publicly available.

PDFAbstract

2 weeks, 1 day назад @ paperswithcode.com
/snowflakedb/ Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences
/snowflakedb/ Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences /snowflakedb/ Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences

Long sequences are critical for applications like RAG, long document summarization, multi-modality, etc., and modern LLMs, like Llama 4 Scout, support max sequence length of up to 10 million tokens.

However, outside of enterprise labs, long sequence training is challenging for the AI community with limited system support in the open-source space.

We address this with Arctic Long Sequence Training (ALST).

It offers a combination of attention-agnostic single GPU and multi-GPU memory optimizations, that enables it to support out-of-box training of multi-million sequence length for a wide variety of HF models.

ALST is fully compatible with HF models and open-sourced via Deepspeed https://www.de…

2 weeks, 1 day назад @ paperswithcode.com
Papers With Code Papers With Code
последний пост 1 week, 5 days назад
/kagnlp/ Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team
/kagnlp/ Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team /kagnlp/ Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team

Include the markdown at the top of your GitHub README.md file to showcase the performance of the model.

Badges are live and will be dynamically updated with the latest ranking of this paper.

2 weeks, 1 day назад @ paperswithcode.com
/tencent/ Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details
/tencent/ Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details /tencent/ Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details

In this report, we present Hunyuan3D 2.5, a robust suite of 3D diffusion models aimed at generating high-fidelity and detailed textured 3D assets.

Hunyuan3D 2.5 follows two-stages pipeline of its previous version Hunyuan3D 2.0, while demonstrating substantial advancements in both shape and texture generation.

In terms of shape generation, we introduce a new shape foundation model -- LATTICE, which is trained with scaled high-quality datasets, model-size, and compute.

In terms of texture generation, it is upgraded with phyiscal-based rendering (PBR) via a novel multi-view architecture extended from Hunyuan3D 2.0 Paint model.

Our extensive evaluation shows that Hunyuan3D 2.5 significantly out…

2 weeks, 1 day назад @ paperswithcode.com
/SaminYeasar/ Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity
/SaminYeasar/ Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity /SaminYeasar/ Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity

In this paper, we investigate the use of small datasets in the context of offline reinforcement learning (RL).

While many common offline RL benchmarks employ datasets with over a million data points, many offline RL applications rely on considerably smaller datasets.

We show that offline RL algorithms can overfit on small datasets, resulting in poor performance.

To address this challenge, we introduce "Sparse-Reg": a regularization technique based on sparsity to mitigate overfitting in offline reinforcement learning, enabling effective learning in limited data settings and outperforming state-of-the-art baselines in continuous control.

PDFAbstract

2 weeks, 1 day назад @ paperswithcode.com
/autogluon/ TabArena: A Living Benchmark for Machine Learning on Tabular Data
/autogluon/ TabArena: A Living Benchmark for Machine Learning on Tabular Data /autogluon/ TabArena: A Living Benchmark for Machine Learning on Tabular Data

With the growing popularity of deep learning and foundation models for tabular data, the need for standardized and reliable benchmarks is higher than ever.

To address this, we introduce TabArena, the first continuously maintained living tabular benchmarking system.

While gradient-boosted trees are still strong contenders on practical tabular datasets, we observe that deep learning methods have caught up under larger time budgets with ensembling.

Finally, we show that ensembles across models advance the state-of-the-art in tabular machine learning and investigate the contributions of individual models.

We launch TabArena with a public leaderboard, reproducible code, and maintenance protocols…

2 weeks, 1 day назад @ paperswithcode.com
/decisionintelligence/ TAB: Unified Benchmarking of Time Series Anomaly Detection Methods
/decisionintelligence/ TAB: Unified Benchmarking of Time Series Anomaly Detection Methods /decisionintelligence/ TAB: Unified Benchmarking of Time Series Anomaly Detection Methods

Time series anomaly detection (TSAD) plays an important role in many domains such as finance, transportation, and healthcare.

While many TSAD methods already exist, new and better methods are still desirable.

However, effective progress hinges on the availability of reliable means of evaluating new methods and comparing them with existing methods.

Second, TAB covers a variety of TSAD methods, including Non-learning, Machine learning, Deep learning, LLM-based, and Time-series pre-trained methods.

Finally, we employ TAB to evaluate existing TSAD methods and report on the outcomes, thereby offering a deeper insight into the performance of these methods.

2 weeks, 1 day назад @ paperswithcode.com
/PaddlePaddle/ PP-DocBee2: Improved Baselines with Efficient Data for Multimodal Document Understanding
/PaddlePaddle/ PP-DocBee2: Improved Baselines with Efficient Data for Multimodal Document Understanding /PaddlePaddle/ PP-DocBee2: Improved Baselines with Efficient Data for Multimodal Document Understanding

This report introduces PP-DocBee2, an advanced version of the PP-DocBee, designed to enhance multimodal document understanding.

Built on a large multimodal model architecture, PP-DocBee2 addresses the limitations of its predecessor through key technological improvements, including enhanced synthetic data quality, improved visual feature fusion strategy, and optimized inference methodologies.

A key innovation of our work is a data quality optimization strategy for multimodal document tasks.

By employing a large-scale multimodal pre-trained model to evaluate data, we apply a novel statistical criterion to filter outliers, ensuring high-quality training data.

The source code and pre-trained mo…

2 weeks, 1 day назад @ paperswithcode.com
/thudm/ LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning
/thudm/ LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning /thudm/ LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning

Ultra-long generation by large language models (LLMs) is a widely demanded scenario, yet it remains a significant challenge due to their maximum generation length limit and overall quality degradation as sequence length increases.

Previous approaches, exemplified by LongWriter, typically rely on ''teaching'', which involves supervised fine-tuning (SFT) on synthetic long-form outputs.

However, this strategy heavily depends on synthetic SFT data, which is difficult and costly to construct, often lacks coherence and consistency, and tends to be overly artificial and structurally monotonous.

In this work, we propose an incentivization-based approach that, starting entirely from scratch and with…

2 weeks, 1 day назад @ paperswithcode.com
/gen-verse/ ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs
/gen-verse/ ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs /gen-verse/ ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs

Process Reward Models (PRMs) have recently emerged as a powerful framework for supervising intermediate reasoning steps in large language models (LLMs).

Previous PRMs are primarily trained on model final output responses and struggle to evaluate intermediate thinking trajectories robustly, especially in the emerging setting of trajectory-response outputs generated by frontier reasoning models like Deepseek-R1.

In this work, we introduce ReasonFlux-PRM, a novel trajectory-aware PRM explicitly designed to evaluate the trajectory-response type of reasoning traces.

ReasonFlux-PRM incorporates both step-level and trajectory-level supervision, enabling fine-grained reward assignment aligned with …

2 weeks, 1 day назад @ paperswithcode.com
/ibm/ AnalogNAS-Bench: A NAS Benchmark for Analog In-Memory Computing
/ibm/ AnalogNAS-Bench: A NAS Benchmark for Analog In-Memory Computing /ibm/ AnalogNAS-Bench: A NAS Benchmark for Analog In-Memory Computing

Analog In-memory Computing (AIMC) has emerged as a highly efficient paradigm for accelerating Deep Neural Networks (DNNs), offering significant energy and latency benefits over conventional digital hardware.

Neural Architecture Search (NAS) is thus needed to systematically discover neural architectures optimized explicitly for AIMC constraints.

However, comparing NAS methodologies and extracting insights about robust architectures for AIMC requires a dedicated NAS benchmark that explicitly accounts for AIMC-specific hardware non-idealities.

To address this, we introduce AnalogNAS-Bench, the first NAS benchmark tailored specifically for AIMC.

These insights highlight the limitations of curre…

2 weeks, 1 day назад @ paperswithcode.com
/deeps73/ CycleDistill: Bootstrapping Machine Translation using LLMs with Cyclical Distillation
/deeps73/ CycleDistill: Bootstrapping Machine Translation using LLMs with Cyclical Distillation /deeps73/ CycleDistill: Bootstrapping Machine Translation using LLMs with Cyclical Distillation

Large language models (LLMs), despite their ability to perform few-shot machine translation (MT), often lag behind dedicated MT systems trained on parallel corpora, which are crucial for high quality machine translation (MT).

However, parallel corpora are often scarce or non-existent for low-resource languages.

In this paper, we propose CycleDistill, a bootstrapping approach leveraging LLMs and few-shot translation to obtain high-quality MT systems.

CycleDistill involves iteratively generating synthetic parallel corpora from monolingual corpora via zero- or few-shot MT, which is then used to fine-tune the model that was used for generating said data for MT.

We also study the effect of lever…

2 weeks, 1 day назад @ paperswithcode.com
/gsumbul/ SMARTIES: Spectrum-Aware Multi-Sensor Auto-Encoder for Remote Sensing Images
/gsumbul/ SMARTIES: Spectrum-Aware Multi-Sensor Auto-Encoder for Remote Sensing Images /gsumbul/ SMARTIES: Spectrum-Aware Multi-Sensor Auto-Encoder for Remote Sensing Images

From optical sensors to microwave radars, leveraging the complementary strengths of remote sensing (RS) sensors is crucial for achieving dense spatio-temporal monitoring of our planet.

On the contrary, a single model able to modulate its feature representations to accept diverse sensors as input would pave the way to agile and flexible multi-sensor RS data processing.

To address this, we introduce SMARTIES, a generic and versatile foundation model lifting sensor-specific/dependent efforts and enabling scalability and generalization to diverse RS sensors: SMARTIES projects data from heterogeneous sensors into a shared spectrum-aware space, enabling the use of arbitrary combinations of bands …

2 weeks, 1 day назад @ paperswithcode.com
/glad-ruc/ Fast and Distributed Equivariant Graph Neural Networks by Virtual Node Learning
/glad-ruc/ Fast and Distributed Equivariant Graph Neural Networks by Virtual Node Learning /glad-ruc/ Fast and Distributed Equivariant Graph Neural Networks by Virtual Node Learning

Equivariant Graph Neural Networks (GNNs) have achieved remarkable success across diverse scientific applications.

To address these limitations, we introduce FastEGNN and DistEGNN, two novel enhancements to equivariant GNNs for large-scale geometric graphs.

FastEGNN employs a key innovation: a small ordered set of virtual nodes that effectively approximates the large unordered graph of real nodes.

For extremely large-scale geometric graphs, we present DistEGNN, a distributed extension where virtual nodes act as global bridges between subgraphs in different devices, maintaining consistency while dramatically reducing memory and computational overhead.

Results demonstrate superior efficiency a…

2 weeks, 1 day назад @ paperswithcode.com
/JacopoDapueto/ Disentangled representations of microscopy images
/JacopoDapueto/ Disentangled representations of microscopy images /JacopoDapueto/ Disentangled representations of microscopy images

Microscopy image analysis is fundamental for different applications, from diagnosis to synthetic engineering and environmental monitoring.

Modern acquisition systems have granted the possibility to acquire an escalating amount of images, requiring a consequent development of a large collection of deep learning-based automatic image analysis methods.

Although deep neural networks have demonstrated great performance in this field, interpretability, an essential requirement for microscopy image analysis, remains an open challenge.

This work proposes a Disentangled Representation Learning (DRL) methodology to enhance model interpretability for microscopy image classification.

PDFAbstract

2 weeks, 1 day назад @ paperswithcode.com
/ghimiredhikura/ Loss-Aware Automatic Selection of Structured Pruning Criteria for Deep Neural Network Acceleration
/ghimiredhikura/ Loss-Aware Automatic Selection of Structured Pruning Criteria for Deep Neural Network Acceleration /ghimiredhikura/ Loss-Aware Automatic Selection of Structured Pruning Criteria for Deep Neural Network Acceleration

Structured pruning is a well-established technique for compressing neural networks, making it suitable for deployment in resource-limited edge devices.

This paper presents an efficient Loss-Aware Automatic Selection of Structured Pruning Criteria (LAASP) for slimming and accelerating deep neural networks.

The automatic selection of magnitude or similarity-based filter pruning criteria from a specified pool of criteria and the specific pruning layer at each pruning iteration is guided by the network's overall loss on a small subset of the training data.

The optimal pruning rates for each layer in the network are automatically determined, eliminating the need for manual allocation of fixed or…

2 weeks, 1 day назад @ paperswithcode.com
/guinan-su/ GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching
/guinan-su/ GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching /guinan-su/ GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching

Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.

However, such impressive capability typically comes with a substantial model size, which presents significant challenges in deployment and inference.

While structured pruning of model parameters offers a promising way to reduce computational costs at deployment time, current methods primarily focus on single model pruning.

We pose the optimal tailoring of these LLMs as a zero-order optimization problem, adopting a search space that supports three different operations: (1) Layer removal, (2) Layer selection from different candidate models, and (3) Layer merging.

Our experiments demonstra…

2 weeks, 1 day назад @ paperswithcode.com
💼 University and corporation labs
DeepMind DeepMind
последний пост 2 weeks, 3 days назад
AlphaGenome: AI for better understanding the genome
AlphaGenome: AI for better understanding the genome AlphaGenome: AI for better understanding the genome

Science AlphaGenome: AI for better understanding the genome ShareCopy link ×Introducing a new, unifying DNA sequence model that advances regulatory variant-effect prediction and promises to shed new light on genome function — now available via API.

Small variations in a genome’s DNA sequence can alter an organism’s response to its environment or its susceptibility to disease.

How AlphaGenome works Our AlphaGenome model takes a long DNA sequence as input — up to 1 million letters, also known as base-pairs — and predicts thousands of molecular properties characterising its regulatory activity.

We haven't designed or validated AlphaGenome for personal genome prediction, a known challenge for A…

2 weeks, 3 days назад @ deepmind.google
Gemini Robotics On-Device brings AI to local robotic devices
Gemini Robotics On-Device brings AI to local robotic devices Gemini Robotics On-Device brings AI to local robotic devices

In March, we introduced Gemini Robotics, our most advanced VLA (vision language action) model, bringing Gemini 2.0’s multimodal reasoning and real-world understanding into the physical world.

Today, we’re introducing Gemini Robotics On-Device, our most powerful VLA model optimized to run locally on robotic devices.

Gemini Robotics On-Device shows strong general-purpose dexterity and task generalization, and it’s optimized to run efficiently on the robot itself.

Model capabilities and performanceGemini Robotics On-Device is a robotics foundation model for bi-arm robots, engineered to require minimal computational resources.

It builds on the task generalization and dexterity capabilities of G…

2 weeks, 4 days назад @ deepmind.google
Gemini 2.5: Updates to our family of thinking models
Gemini 2.5: Updates to our family of thinking models Gemini 2.5: Updates to our family of thinking models

Explore the latest Gemini 2.5 model updates with enhanced performance and accuracy: Gemini 2.5 Pro now stable, Flash generally available, and the new Flash-Lite in preview.

3 weeks, 4 days назад @ deepmind.google
We’re expanding our Gemini 2.5 family of models
We’re expanding our Gemini 2.5 family of models We’re expanding our Gemini 2.5 family of models

We designed Gemini 2.5 to be a family of hybrid reasoning models that provide amazing performance, while also being at the Pareto Frontier of cost and speed.

Today, we’re taking the next step with our 2.5 Pro and Flash models by releasing them as stable and generally available.

And we’re bringing you 2.5 Flash-Lite in preview — our most cost-efficient and fastest 2.5 model yet.

Making 2.5 Flash and 2.5 Pro generally availableThanks to all of your feedback, today we’re releasing stable versions of 2.5 Flash and Pro, so you can build production applications with confidence.

Introducing Gemini 2.5 Flash-LiteWe’re also introducing a preview of the new Gemini 2.5 Flash-Lite, our most cost-effici…

3 weeks, 4 days назад @ blog.google
Behind “ANCESTRA”: combining Veo with live-action filmmaking
Behind “ANCESTRA”: combining Veo with live-action filmmaking Behind “ANCESTRA”: combining Veo with live-action filmmaking

Today, Eliza McNitt’s short film, “ANCESTRA,” premieres at the Tribeca Festival.

It’s the story of a mother, and what happens when her child is born with a hole in its heart.

Inspired by the dramatic events of McNitt's own birth, the film portrays a mother's love as a cosmic, life-saving force.

Together, we founded this partnership to put the world’s best generative AI into the hands of top filmmakers, to advance the frontiers of storytelling and technology.

“ANCESTRA” combined live-action scenes with sequences generated by Veo, our state-of-the-art video generation model.

4 weeks, 1 day назад @ blog.google
How we're supporting better tropical cyclone prediction with AI
How we're supporting better tropical cyclone prediction with AI How we're supporting better tropical cyclone prediction with AI

Research How we're supporting better tropical cyclone prediction with AI ShareCopy link ×We’re launching Weather Lab, featuring our experimental cyclone predictions, and we’re partnering with the U.S. National Hurricane Center to support their forecasts and warnings this cyclone season.

Yet, improving the accuracy of cyclone predictions can help protect communities through more effective disaster preparedness and earlier evacuations.

Today, Google DeepMind and Google Research are launching Weather Lab, an interactive website for sharing our artificial intelligence (AI) weather models.

Weather Lab’s live and historical cyclone predictions Weather Lab shows live and historical cyclone predict…

1 month назад @ deepmind.google
How we're supporting better tropical cyclone prediction with AI
How we're supporting better tropical cyclone prediction with AI How we're supporting better tropical cyclone prediction with AI

We’re launching Weather Lab, featuring our experimental cyclone predictions, and we’re partnering with the U.S. National Hurricane Center to support their forecasts and warnings this cyclone season.

1 month назад @ d16660f-dot-gdm-deepmind-com-prod.appspot.com
Advanced audio dialog and generation with Gemini 2.5
Advanced audio dialog and generation with Gemini 2.5 Advanced audio dialog and generation with Gemini 2.5

Gemini 2.5 has new capabilities in AI-powered audio dialog and generation.

1 month, 1 week назад @ d16660f-dot-gdm-deepmind-com-prod.appspot.com
Advanced audio dialog and generation with Gemini 2.5
Advanced audio dialog and generation with Gemini 2.5 Advanced audio dialog and generation with Gemini 2.5

Safety and responsibilityWe’ve proactively assessed potential risks throughout every stage of the development process for these native audio features, using what we’ve learned to inform our mitigation strategies.

Additionally, all audio outputs from our models are embedded with SynthID, our watermarking technology, to ensure transparency by making AI-generated audio identifiable.

Native audio capabilities for developersWe’re bringing native audio outputs to Gemini 2.5 models, giving developers new capabilities to build richer, more interactive applications via the Gemini API in Google AI Studio or Vertex AI.

To begin exploring, developers can try native audio dialog with Gemini 2.5 Flash pr…

1 month, 1 week назад @ blog.google
Advancing Gemini's security safeguards
Advancing Gemini's security safeguards Advancing Gemini's security safeguards

We’ve made Gemini 2.5 our most secure model family to date.

1 month, 3 weeks назад @ 9aee036-dot-gdm-deepmind-com-prod.appspot.com
SynthID Detector — a new portal to help identify AI-generated content
SynthID Detector — a new portal to help identify AI-generated content SynthID Detector — a new portal to help identify AI-generated content

Learn about the new SynthID Detector portal we announced at I/O to help people understand how the content they see online was generated.

1 month, 3 weeks назад @ d16660f-dot-gdm-deepmind-com-prod.appspot.com
Gemini 2.5: Our most intelligent models are getting even better
Gemini 2.5: Our most intelligent models are getting even better Gemini 2.5: Our most intelligent models are getting even better

Gemini 2.5 Pro continues to be loved by developers as the best model for coding, and 2.5 Flash is getting even better with a new update. We’re bringing new capabilities to our models, including Deep Think, an experimental enhanced reasoning mode for 2.5 Pro.

1 month, 3 weeks назад @ d16660f-dot-gdm-deepmind-com-prod.appspot.com
Fuel your creativity with new generative media models and tools
Fuel your creativity with new generative media models and tools Fuel your creativity with new generative media models and tools

Today, we’re announcing our newest generative media models, which mark significant breakthroughs.

These models create breathtaking images, videos and music, empowering artists to bring their creative vision to life.

Veo 3 and Imagen 4, our newest video and image generation models, push the frontier of media generation, with their groundbreaking new capabilities.

We're also expanding access to Lyria 2, giving musicians more tools to create music.

Using Google DeepMind’s most advanced models, Flow lets you weave cinematic films with more sophisticated control of characters, scenes and styles, to bring your story to life.

1 month, 3 weeks назад @ blog.google
Gemini 2.5: Our most intelligent models are getting even better
Gemini 2.5: Our most intelligent models are getting even better Gemini 2.5: Our most intelligent models are getting even better

2.5 Pro performs better than everWe recently updated 2.5 Pro to help developers build richer, interactive web apps.

And, with its 1 million-token context window, 2.5 Pro has state-of-the-art long context and video understanding performance.

Since incorporating LearnLM, our family of models built with educational experts, 2.5 Pro is also now the leading model for learning.

In head-to-head comparisons evaluating its pedagogy and effectiveness, educators and experts preferred Gemini 2.5 Pro over other models across a diverse range of scenarios.

Read more in our updated Gemini 2.5 Pro model card and on the Gemini technology page.

1 month, 3 weeks назад @ blog.google
Announcing Gemma 3n preview: Powerful, efficient, mobile-first AI
Announcing Gemma 3n preview: Powerful, efficient, mobile-first AI

Gemma 3n is a cutting-edge open model designed for fast, multimodal AI on devices, featuring optimized performance, unique flexibility with a 2-in-1 model, and expanded multimodal understanding with audio, empowering developers to build live, interactive applications and sophisticated audio-centric experiences.

1 month, 3 weeks назад @ 35d038e-dot-gdm-deepmind-com-prod.appspot.com
Google
последний пост 1 day, 17 hours назад
Get better at getting better: Take the 2025 DORA survey
Get better at getting better: Take the 2025 DORA survey Get better at getting better: Take the 2025 DORA survey

This year's DORA survey will take an even deeper look into how we can all improve the impact of AI across the entire software lifecycle.

The act of taking the DORA survey is, in itself, a valuable team-assessment.

By sharing your anonymous experiences, you help shape the industry's understanding of the impact of AI and the characteristics of high-performing teams.

Take the DORA survey todayReady to take the first step towards improving your team's performance?

The 2025 DORA Survey is open until July 18, 2025.

1 day, 17 hours назад @ cloud.google.com
How Jina AI built its 100-billion-token web grounding system with Cloud Run GPUs
How Jina AI built its 100-billion-token web grounding system with Cloud Run GPUs How Jina AI built its 100-billion-token web grounding system with Cloud Run GPUs

Editor’s note: The Jina AI Reader is a specialized tool that transforms raw web content from URLs or local files into a clean, structured, and LLM-friendly format.

In this post, Han Xiao details how Cloud Run empowers Jina AI to build a secure, reliable, and massively scalable web scraping system that remains economically viable.

This post explores the collaborative innovation, technical hurdles, and breakthrough achievements behind Jina Reader, a web grounding system now processing 100 billion tokens daily.

Jina Reader isn't just another scraper; it takes a different approach to how AI systems consume web content by transforming raw, noisy web pages into clean, structured markdown.

The cor…

1 day, 17 hours назад @ cloud.google.com
How to tap into natural language AI services using the Conversational Analytics API
How to tap into natural language AI services using the Conversational Analytics API How to tap into natural language AI services using the Conversational Analytics API

With intelligent tools like the Conversational Analytics API, powered by Gemini, you no longer need intricate systems to get insights.

The Conversational Analytics API lets you use everyday language to ask questions of your data in BigQuery or Looker, with more sources to come.

Developers can also build custom data agents that understand your business and provide answers where they're needed.

Now in preview, this API transforms unstructured conversation and structured data into actionable insights, accessible through natural language.

You can create custom data agents, map business terms, and define calculations to empower your users and analysts.

3 days, 15 hours назад @ cloud.google.com
Zero-shot forecasting in BigQuery with the TimesFM foundation model
Zero-shot forecasting in BigQuery with the TimesFM foundation model Zero-shot forecasting in BigQuery with the TimesFM foundation model

Accurate time-series forecasting is essential for many business scenarios such as planning, supply chain management, and resource allocation.

BigQuery now embeds TimesFM, a state-of-the-art pre-trained model from Google Research, enabling powerful forecasting via the simple AI.FORECAST function.

More recently, with the rapid progress and success of large pre-trained LLM models, the Google Research team developed TimesFM, a foundational model specifically for the time series domain.

The Time Series foundation modelTimesFM is a forecasting model that’s pre-trained on a large time-series corpus of 400 billion real-world time-points.

While ARIMA_PLUS offers customizability and explainability, t…

3 days, 17 hours назад @ cloud.google.com
London Summit: agentic AI leaders, training 100,000 civil servants, AI sovereignty, and more
London Summit: agentic AI leaders, training 100,000 civil servants, AI sovereignty, and more London Summit: agentic AI leaders, training 100,000 civil servants, AI sovereignty, and more

There’s a buzz of excitement here at Tobacco Dock as we welcome our customers and partners to the Google Cloud Summit London.

Together, we’re exploring the essential role Google Cloud is playing in driving AI innovation and adoption across the UK.

First, we are launching a new partnership with the UK Government to help modernise public services.

Today, we are announcing a landmark proposal to provide free cloud and AI skills training for up to 100,000 UK public sector workers.

This initiative is designed to accelerate the digital transformation of public services and empower the workforce with the talent to lead in the AI era.

4 days, 1 hour назад @ cloud.google.com
Accelerate your AI workloads with the Google Cloud Managed Lustre
Accelerate your AI workloads with the Google Cloud Managed Lustre Accelerate your AI workloads with the Google Cloud Managed Lustre

The Managed Lustre solution is powered by DDN’s EXAScaler, combining DDN's decades of leadership in high-performance storage with Google Cloud's expertise in cloud infrastructure.

Managed Lustre provides a POSIX-compliant, parallel file system that delivers consistently high throughput and low latency, essential for:High-throughput inference: For applications that require near-real-time inference on large datasets, Lustre provides high parallel throughput and sub-millisecond read latency.

Checkpointing and restarting large models: Save and restore the state of large models during training faster, improving goodput and allowing for more efficient experimentation.

Data preprocessing and featu…

4 days, 16 hours назад @ cloud.google.com
Google is a Leader in the 2025 Gartner® Magic Quadrant™ for Search and Product Discovery
Google is a Leader in the 2025 Gartner® Magic Quadrant™ for Search and Product Discovery Google is a Leader in the 2025 Gartner® Magic Quadrant™ for Search and Product Discovery

Built upon Google's deep expertise & knowledge of the way users search, interact with & purchase products across a broad landscape of commerce domains, Vertex AI Search for commerce is a fully-managed, AI-first solution tailored for e-commerce.

Redefining product discovery with generative AIAs digital channels continue to increase in traffic and adoption, businesses in e-commerce need to navigate the challenges involved in providing great digital experiences.

Vertex AI Search for commerce uses the best of Google AI to drive personalized product discovery at scale, optimized for revenue-per visitor.

Search and browse: Powering digital commerce sites and applications with Google-quality capab…

4 days, 17 hours назад @ cloud.google.com
Announcing Vertex AI Agent Engine Memory Bank available for everyone in preview
Announcing Vertex AI Agent Engine Memory Bank available for everyone in preview Announcing Vertex AI Agent Engine Memory Bank available for everyone in preview

Developers are racing to productize agents, but a common limitation is the absence of memory.

Without memory, agents treat each interaction as the first, asking repetitive questions and failing to recall user preferences.

How we normally mitigate memory problems: So far, a common approach to this problem has been to leverage the LLM’s context window.

However, directly inserting entire session dialogues into an LLM's context window is both expensive and computationally inefficient, leading to higher inference costs and slower response times.

Memory Bank helps us address memory problems in four ways:

4 days, 17 hours назад @ cloud.google.com
Formula E accelerates its work with Google Cloud Storage and Google Workspace
Formula E accelerates its work with Google Cloud Storage and Google Workspace Formula E accelerates its work with Google Cloud Storage and Google Workspace

This led to the widespread adoption of Google Cloud's ecosystem, including Google Workspace, to address both its large-scale data storage needs and internal collaboration and productivity workflows.

This involved moving its massive media archive to Google Cloud Storage and transitioning its entire staff to Google Workspace.

Following meticulous planning with Mimir and technical support from Google Cloud, the team chose the Google Cloud Storage Transfer Service (STS).

To revolutionize its internal operations, Formula E partnered with Cobry, a Google Workspace migration specialist, for a two-phased migration process.

The transition to Google Workspace, meanwhile, has transformed Formula E’s i…

5 days, 1 hour назад @ cloud.google.com
A guide to converting ADK agents with MCP to the A2A framework
A guide to converting ADK agents with MCP to the A2A framework A guide to converting ADK agents with MCP to the A2A framework

The evolution of AI agents has led to powerful, specialized models capable of complex tasks.

However, to unlock their full potential, these agents must be able to collaborate.

This guide provides a step-by-step process for converting a standalone ADK agent that uses an MCP tool into a fully A2A-compatible component, ready to participate in a larger, multi-agent ecosystem.

We will use a MultiURLBrowser agent, designed to scrape web content, as a practical exampleStep 1: Define the core agent and its MCP tool (agent.py)The foundation of your agent remains its core logic.

The key is to properly initialize the ADK LlmAgent and configure its MCPToolset to connect with its external tool.

1 week, 3 days назад @ cloud.google.com
How to build a simple multi-agentic system using Google’s ADK
How to build a simple multi-agentic system using Google’s ADK How to build a simple multi-agentic system using Google’s ADK

When the Root Agent calls the Flight Agent as a sub-agent, the responsibility for answering the user is completely transferred to the Flight Agent.

The Root Agent is effectively out of the loop.

All subsequent user input will be handled solely by the Flight Agent.

This led to the next evolution: the Dispatcher Agent with Agent Tools.

The root agent could then reason about a complex query and decide to use multiple tools to get the job done.

1 week, 3 days назад @ cloud.google.com
How to build Web3 AI agents with Google Cloud
How to build Web3 AI agents with Google Cloud How to build Web3 AI agents with Google Cloud

Some sophisticated libraries now equip developers with the tools to build and deploy them.

How to build Web3 AI Agents with Google CloudGoogle Cloud provides a flexible, end-to-end suite of tools for building Web3 AI Agents, allowing you to start simple and scale to highly complex, customized solutions:1.

For rapid prototyping and no-code development: Vertex AI Agent BuilderConversational Agents allows for rapid prototyping and deployment of agents through a user-friendly interface, making it accessible even for non-technical users (refer to the Agent Builder codelab for a quick start).

Agents can be easily augmented with standard capabilities like leveraging datastores, performing Google s…

1 week, 5 days назад @ cloud.google.com
You dream it, Veo creates it: Veo 3 is now available for everyone in public preview on Vertex AI
You dream it, Veo creates it: Veo 3 is now available for everyone in public preview on Vertex AI You dream it, Veo creates it: Veo 3 is now available for everyone in public preview on Vertex AI

With Veo 3, we’ve leapt forward in combining video and audio generation to take storytelling to the next level.

Today, we’re excited to share that Veo 3 is now available for all Google Cloud customers and partners in public preview on Vertex AI.

Why this matters: Veo 3 is your partner for creating near-cinematic quality generative video, moving beyond novelty to narrative-driven creation.

It not only brings stunning visual quality, but now adds sound from background sounds to dialogue.

With Veo 3 on Vertex AI, you can take advantage of three powerful new capabilities:

2 weeks, 2 days назад @ cloud.google.com
How Schroders built its multi-agent financial analysis research assistant
How Schroders built its multi-agent financial analysis research assistant How Schroders built its multi-agent financial analysis research assistant

For example, Vertex AI Agent Builder provided easy tool integration for leveraging:Internal knowledge: Grounding with Vertex AI Search tool was leveraged to ground Gemini to private document corpus, such as internal research notes, using, enabling agents to answer questions based on Schroder’s proprietary data.

In addition, Vertex AI offers seamless integration with other Google Cloud services and tools that help facilitate rapid agent governance and management, including Cloud Logging, Cloud Monitoring, IAM Access Control, Vertex AI evaluation, BigQuery and more.

Initially, native function calling helped Schroders get familiar with Vertex AI Agent Builder and develop agent-building best pr…

2 weeks, 3 days назад @ cloud.google.com
Audit smarter: Introducing Google Cloud’s Recommended AI Controls framework
Audit smarter: Introducing Google Cloud’s Recommended AI Controls framework Audit smarter: Introducing Google Cloud’s Recommended AI Controls framework

As organizations build new generative AI applications and AI agents to automate business workflows, security and risk management management leaders face a new set of governance challenges.

These include:How do we prove our AI systems operate in line with internal policies and evolving regulations?

How can we verify that data access controls are consistently enforced across the entire AI lifecycle, from training to inference to large scale production?

Developed by Google Cloud Security experts and validated by our Office of the CISO, this prebuilt framework incorporates best practices for securing AI systems, and uses industry standards including the NIST AI Risk Management Framework and the…

2 weeks, 3 days назад @ cloud.google.com
OpenAI
последний пост None
Microsoft Microsoft
последний пост 2 days, 17 hours назад
How AI will accelerate biomedical research and discovery
How AI will accelerate biomedical research and discovery How AI will accelerate biomedical research and discovery

Dr. Eric Topol is the executive vice president of the biomedical research non-profit Scripps Research, where he founded and now directs the Scripps Research Translational Institute.

Let’s continue our deep dive on AI and biomedical research with this conversation with Noubar Afeyan:LEE: Noubar, thanks so much for joining.

And there’s the origin story of contact with AI, you know, before the emergence of generative AI and afterwards.

What is going on today with respect to AI really being used for something meaningful in the design and development of drugs?

TOPOL: You would read about how, you know, data is the new oil and, you know, gold and whatnot.

2 days, 17 hours назад @ microsoft.com
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

During the pre-market phase, medical testing establishes baseline safety and effectiveness metrics through bench testing, performance standards, and clinical studies.

SULLIVAN: So medical devices face a pretty prescriptive multi-level testing path before they hit the market.

We are looking into medical devices, as well, obviously, but also other technologies in advanced medical computing.

So we see Phase 3 trials as something that occurs in the medical devices and pharmaceuticals field.

5 days, 17 hours назад @ microsoft.com
AI Testing and Evaluation: Learnings from genome editing
AI Testing and Evaluation: Learnings from genome editing AI Testing and Evaluation: Learnings from genome editing

As generative AI continues to advance, Microsoft has gathered a range of experts—from genome editing to cybersecurity—to share how their fields approach evaluation and risk assessment.

CHARO: Well, you know, genome editing is both very old and very new.

Now the earliest forms of genome editing were very inefficient, and so we didn’t worry that much.

But the bottom-line thing to remember, the way to really think about it is, we don’t regulate genome editing; we regulate the things that use genome editing.

And she said, you know, we don’t regulate genome editing; we regulate the things that use genome editing.

1 week, 5 days назад @ microsoft.com
PadChest-GR: A bilingual grounded radiology reporting benchmark for chest X-rays
PadChest-GR: A bilingual grounded radiology reporting benchmark for chest X-rays PadChest-GR: A bilingual grounded radiology reporting benchmark for chest X-rays

In our ever-evolving journey to enhance healthcare through technology, we’re announcing a unique new benchmark for grounded radiology report generation—PadChest-GR (opens in new tab).

A new frontier in radiology report generationIt is estimated that over half of people visiting hospitals have radiology scans that must be interpreted by a clinical professional.

In contrast, grounded radiology reporting demands that each finding be described and localized individually.

It is the first public benchmark that enables us to evaluate generation of fully grounded radiology reports in chest X-rays.

We invite researchers and industry experts to explore PadChest-GR and MAIRA-2, contribute innovative i…

2 weeks, 2 days назад @ microsoft.com
AI Testing and Evaluation: Learnings from Science and Industry
AI Testing and Evaluation: Learnings from Science and Industry AI Testing and Evaluation: Learnings from Science and Industry

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

And I think, really, there are two reasons why tech is so, kind of, representative of that kind of challenge that I’ve always found fascinating.

Continues to be a really important topic in the AI policy conversation right now, I think, for really good reason.

Testing is an important component for governance and AI and, of course, in all of these other domains, as well.

I think about almost, like, in the near to mid-term, like three issues that we need to address in the AI, kind of, policy and testing context.

2 weeks, 5 days назад @ microsoft.com
Learning from other domains to advance AI evaluation and testing
Learning from other domains to advance AI evaluation and testing Learning from other domains to advance AI evaluation and testing

Today, we’re launching a limited-series podcast, AI Testing and Evaluation: Learnings from Science and Industry, to share insights from domains that have grappled with testing and measurement questions.

At the close of the podcast series, we’ll offer Microsoft’s deeper reflections on next steps toward more reliable and trustworthy approaches to AI evaluation.

While approaches to risk evaluation and testing vary significantly across the case studies, there was one consistent, top-level takeaway: evaluation frameworks always reflect trade-offs among different policy objectives, such as safety, efficiency, and innovation.

Experts across all eight fields noted that policymakers have had to weig…

2 weeks, 5 days назад @ microsoft.com
Breaking bonds, breaking ground: Advancing the accuracy of computational chemistry with deep learning
Breaking bonds, breaking ground: Advancing the accuracy of computational chemistry with deep learning Breaking bonds, breaking ground: Advancing the accuracy of computational chemistry with deep learning

We are excited to share our first big milestone in solving a grand challenge that has hampered the predictive power of computational chemistry, biochemistry, and materials science for decades.

For 60 years, people have designed practical approximations for the XC functional.

We can contrast the present state of computational chemistry with the state of aircraft engineering and design.

The result is Skala, an XC functional that generalizes to unseen molecules, reaching the accuracy needed to predict experiments.

At Flagship, we believe that openly shared, foundational advances in science – like this leap forward in DFT accuracy – can serve as powerful enablers of innovation.

3 weeks, 3 days назад @ microsoft.com
New methods boost reasoning in small and large language models
New methods boost reasoning in small and large language models New methods boost reasoning in small and large language models

To support this progress, we’ve identified three primary strategies to strengthen reasoning capabilities in both small and large language models: improve architectural design to boost performance in smaller models; incorporate mathematical reasoning techniques to increase reliability; and build stronger generalization capabilities to enable reasoning across a variety of fields.

Read more Opens in a new tabThe problem stems from how current language models operate.

rStar-Math is a method that uses Monte Carlo Tree Search (MCTS) to simulate deeper, more methodical reasoning in smaller models.

LIPS (LLM-based Inequality Prover with Symbolic Reasoning) is a system that combines LLMs’ pattern re…

3 weeks, 4 days назад @ microsoft.com
How AI is reshaping the future of healthcare and medical research
How AI is reshaping the future of healthcare and medical research How AI is reshaping the future of healthcare and medical research

LEE: Yeah, yeah.

It cannot—as, you know, Bill was saying—it cannot learn from your document.

And I don’t know if the two of you remember, but I ended up doing a lot of tests.

I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini (opens in new tab).

Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.

1 month назад @ microsoft.com
How AI is reshaping the future of healthcare and medical research
How AI is reshaping the future of healthcare and medical research How AI is reshaping the future of healthcare and medical research

LEE: Yeah, yeah.

It cannot—as, you know, Bill was saying—it cannot learn from your document.

And I don’t know if the two of you remember, but I ended up doing a lot of tests.

I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini (opens in new tab).

Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.

1 month назад @ microsoft.com
Rewriting SymCrypt in Rust to modernize Microsoft’s cryptographic library
Rewriting SymCrypt in Rust to modernize Microsoft’s cryptographic library Rewriting SymCrypt in Rust to modernize Microsoft’s cryptographic library

To address these vulnerabilities and improve memory safety, we’re rewriting SymCrypt (opens in new tab)—Microsoft’s open-source cryptographic library—in Rust.

For example, reasoning about C code often requires proving that two non-const pointers are live and non-overlapping, a property that can depend on external client code.

As a result, new tools have emerged specifically for verifying Rust code.

Some users compile our C code directly and may rely on specific toolchains or compiler features that complicate the adoption of Rust code.

Looking ahead, we plan to support direct use of the same cryptographic library in Rust without requiring C bindings.

1 month назад @ microsoft.com
BenchmarkQED: Automated benchmarking of RAG systems
BenchmarkQED: Automated benchmarking of RAG systems BenchmarkQED: Automated benchmarking of RAG systems

To meet this need, we’re introducing BenchmarkQED, a new suite of tools that automates RAG benchmarking at scale, available on GitHub (opens in new tab).

AutoQ: Automated query synthesisThis limitation motivated the development of GraphRAG a system designed to answer global queries.

GraphRAG’s evaluation requirements subsequently led to the creation of AutoQ, a method for synthesizing these global queries for any dataset.

Synthesis process and example query for each of the four AutoQ query classes.

We hope these datasets, together with the BenchmarkQED tools (opens in new tab), help accelerate benchmark-driven development of RAG systems and AI question-answering.

1 month, 1 week назад @ microsoft.com
What AI’s impact on individuals means for the health workforce and industry
What AI’s impact on individuals means for the health workforce and industry What AI’s impact on individuals means for the health workforce and industry

So I don’t think we should be surprised that business schools matter on this because we care about management.

That’s really going to change the way, like, middle school works, was my thinking at the time.

We’ve gone from AI being highly discriminative to AI that’s able to explore the world in particular ways.

The symptoms that they’re showing are quite different, and also their compliance is really, really different.

LEE: Yeah, really, really interesting.

1 month, 2 weeks назад @ microsoft.com
FrodoKEM: A conservative quantum-safe cryptographic algorithm
FrodoKEM: A conservative quantum-safe cryptographic algorithm FrodoKEM: A conservative quantum-safe cryptographic algorithm

FrodoKEM is a key encapsulation mechanism (KEM) based on the Learning with Errors (LWE) problem, a cornerstone of lattice-based cryptography.

The Learning with Errors (LWE) problemThe LWE problem is a fundamental hard problem in lattice-based cryptography.

In other words, cryptanalysts and quantum researchers have not been able to devise an efficient quantum algorithm capable of solving the LWE problem and, hence, FrodoKEM.

ConclusionAfter years of research and analysis, the next generation of post-quantum cryptographic algorithms has arrived.

Further ReadingFor those interested in learning more about FrodoKEM, post-quantum cryptography, and lattice-based cryptography, the following resourc…

1 month, 2 weeks назад @ microsoft.com
Abstracts: Zero-shot models in single-cell biology with Alex Lu
Abstracts: Zero-shot models in single-cell biology with Alex Lu Abstracts: Zero-shot models in single-cell biology with Alex Lu

And single-cell foundation models claim to be capable of unraveling deeper insights than ever before.

Basically, we showed that single-cell foundation models perform worse in settings that are fundamental to biological discovery than much simpler machine learning and statistical methods that were used in the field before single-cell foundation models emerged and are the go-to standard for unpacking meaning from these complicated experiments.

And the way to understand this is because single-cell foundation models are trained in a way that tries to expose these models to millions of single-cells.

But let’s also talk about the impact for methodologists, people who are trying to improve these s…

1 month, 3 weeks назад @ microsoft.com
MIT AI MIT AI
последний пост 1 day, 14 hours назад
New AI system uncovers hidden cell subtypes, boosts precision medicine
New AI system uncovers hidden cell subtypes, boosts precision medicine New AI system uncovers hidden cell subtypes, boosts precision medicine

In order to produce effective targeted therapies for cancer, scientists need to isolate the genetic and phenotypic characteristics of cancer cells, both within and across different tumors, because those differences impact how tumors respond to treatment.

Part of this work requires a deep understanding of the RNA or protein molecules each cancer cell expresses, where it is located in the tumor, and what it looks like under a microscope.

“I can use existing information to better define what a cell is, what is the subpopulation of that cell, what that cell is doing, and what is the potential functional readout of that cell.

By using deep learning, the researchers can detect many different laye…

1 day, 14 hours назад @ news.mit.edu
Changing the conversation in health care
Changing the conversation in health care Changing the conversation in health care

In health care, gaps in communication between patients and practitioners can worsen patient outcomes and prevent improvements in practice and care.

The project’s focus on health care and communication seeks to build bridges across socioeconomic, cultural, and linguistic strata.

“The basis of health care delivery is the knowledge of health and disease,” Celi says.

Creating spaces where ideas about AI and health care can potentially become actions is a key element of the project.

Celi believes there are opportunities to address the widening gap between people and practitioners while addressing gaps in health care.

3 days, 12 hours назад @ news.mit.edu
AI shapes autonomous underwater “gliders”
AI shapes autonomous underwater “gliders” AI shapes autonomous underwater “gliders”

Marine scientists have long marveled at how animals like fish and seals swim so efficiently despite having different shapes.

Autonomous vehicles can drift through the ocean in a similar way, collecting data about vast underwater environments.

Their method uses machine learning to test different 3D designs in a physics simulator, then molds them into more hydrodynamic shapes.

To truly evaluate these gliders in the real world, though, the team needed to see how their devices would fare underwater.

Chen adds that the team is looking to explore new types of shapes, particularly thinner glider designs.

3 days, 12 hours назад @ news.mit.edu
Study could lead to LLMs that are better at complex reasoning
Study could lead to LLMs that are better at complex reasoning Study could lead to LLMs that are better at complex reasoning

For all their impressive capabilities, large language models (LLMs) often fall short when given challenging new tasks that require complex reasoning skills.

They show that test-time training, a method that involves temporarily updating some of a model’s inner workings during deployment, can lead to a sixfold improvement in accuracy.

But in-context learning doesn’t always work for problems that require logic and reasoning.

In-context learning requires a small set of task examples, including problems and their solutions.

It boosted accuracy as much as sixfold over techniques that use only in-context learning.

5 days, 5 hours назад @ news.mit.edu
New postdoctoral fellowship program to accelerate innovation in health care
New postdoctoral fellowship program to accelerate innovation in health care New postdoctoral fellowship program to accelerate innovation in health care

The MIT Health and Life Sciences Collaborative (MIT HEALS) is launching the Biswas Postdoctoral Fellowship Program to advance the work of outstanding early-career researchers in health and life sciences.

Supported by a gift from the Biswas Family Foundation, the program aims to help apply cutting-edge research to improve health care and the lives of millions.

“We are deeply honored to launch this world-class postdoctoral fellows program,” adds Anantha P. Chandrakasan, MIT’s chief innovation and strategy officer and head of MIT HEALS.

“The Biswas Family Foundation is proud to support the MIT HEALS initiative, which reimagines how scientific discovery can translate into real-world health impa…

5 days, 19 hours назад @ news.mit.edu
Exploring data and its influence on political behavior
Exploring data and its influence on political behavior Exploring data and its influence on political behavior

A Department of Political Science course offers students tools to help make sense of these choices and their outcomes.

In class 17.831 (Data and Politics), students are introduced to principles and practices necessary to understand electoral and other types of political behavior.

The course opens a window into what happens behind the scenes of local and national political campaigns, which appealed to second-year political science major Jackson Hamilton.

“Computer science and data science aren’t just useful for STEM applications; data science approaches can also be extremely useful in many social sciences,” Hamilton argues.

“He focuses on how different approaches in coding can be applied to …

5 days, 19 hours назад @ news.mit.edu
Robotic probe quickly measures key properties of new materials
Robotic probe quickly measures key properties of new materials Robotic probe quickly measures key properties of new materials

But the pace of innovation is bottlenecked by the speed at which researchers can manually measure important material properties.

A fully autonomous robotic system developed by MIT researchers could speed things up.

During a 24-hour test, the fully autonomous robotic probe took more than 125 unique measurements per hour, with more precision and reliability than other artificial intelligence-based methods.

They also designed imaging-based methods to determine some important material properties.

The researchers want to continue building on this robotic system as they strive to create a fully autonomous lab for materials discovery.

1 week, 1 day назад @ news.mit.edu
Confronting the AI/energy conundrum
Confronting the AI/energy conundrum Confronting the AI/energy conundrum

At the same time, artificial intelligence technologies could revolutionize energy systems, accelerating the transition to clean power.

After decades of flat electricity demand in the United States, computing centers now consume approximately 4 percent of the nation's electricity.

Strategies for clean energy solutionsThe symposium explored multiple pathways to address the AI-energy challenge.

Strubell advocated for viewing computing center electricity as a limited resource requiring thoughtful allocation across different applications.

Navigating the AI-energy paradoxThe symposium highlighted MIT’s central role in developing solutions to the AI-electricity challenge.

1 week, 3 days назад @ news.mit.edu
Accelerating scientific discovery with AI
Accelerating scientific discovery with AI Accelerating scientific discovery with AI

Several researchers have taken a broad view of scientific progress over the last 50 years and come to the same troubling conclusion: Scientific productivity is declining.

Now, the philanthropically funded research lab FutureHouse is seeking to accelerate scientific research with an AI platform designed to automate many of the critical steps on the path toward scientific progress.

“Natural language is the real language of science,” Rodriques says.

The founders started out wanting to create distinct AI tools for tasks like literature searches, data analysis, and hypothesis generation.

They began with data collection, eventually releasing PaperQA in September 2024, which Rodriques calls the be…

1 week, 5 days назад @ news.mit.edu
MIT and Mass General Brigham launch joint seed program to accelerate innovations in health
MIT and Mass General Brigham launch joint seed program to accelerate innovations in health MIT and Mass General Brigham launch joint seed program to accelerate innovations in health

Leveraging the strengths of two world-class research institutions, MIT and Mass General Brigham (MGB) recently celebrated the launch of the MIT-MGB Seed Program.

The new initiative, which is supported by Analog Devices Inc. (ADI), will fund joint research projects led by researchers at MIT and Mass General Brigham.

The program will launch an open call for proposals to researchers at MIT and Mass General Brigham.

Awardees will be selected by a joint review committee composed of MIT and Mass General Brigham experts.

Conversely, MIT researchers may not fully grasp these clinical challenges or have access to the right patient data and samples,” explains Shalek, who is also a member of the Ragon…

2 weeks, 1 day назад @ news.mit.edu
Using generative AI to help robots jump higher and land safely
Using generative AI to help robots jump higher and land safely Using generative AI to help robots jump higher and land safely

Diffusion models like OpenAI’s DALL-E are becoming increasingly useful in helping brainstorm new designs.

Users can draft a 3D model of a robot and specify which parts they’d like to see a diffusion model modify, providing its dimensions beforehand.

The resulting design resembled a blob, so the researchers prompted their system to scale the draft to fit their 3D model.

The advantage of using diffusion models for this task, according to co-lead author and CSAIL postdoc Byungchul Kim, is that they can find unconventional solutions to refine robots.

The diffusion model’s ability to upgrade a robot’s jumping and landing skills suggests it could be useful in enhancing how other machines are desi…

2 weeks, 1 day назад @ news.mit.edu
Merging AI and underwater photography to reveal hidden ocean worlds
Merging AI and underwater photography to reveal hidden ocean worlds Merging AI and underwater photography to reveal hidden ocean worlds

Just as the 19th-century camera transformed our ability to document and reveal the natural world — capturing life with unprecedented detail and bringing distant or hidden environments into view — generative AI marks a new frontier in visual storytelling.

In addition, LOBSTgER’s models are built using custom code developed by Mentzelopoulos to protect the process and outputs from any potential biases from external data or models.

LOBSTgER’s generative AI builds upon real photography, expanding the researchers’ visual vocabulary to deepen the public’s connection to the natural world.

The project draws from the visual language of photography, the observational rigor of marine science, and the …

2 weeks, 3 days назад @ news.mit.edu
LLMs factor in unrelated information when recommending medical treatments
LLMs factor in unrelated information when recommending medical treatments LLMs factor in unrelated information when recommending medical treatments

A large language model (LLM) deployed to make treatment recommendations can be tripped up by nonclinical information in patient messages, like typos, extra white space, missing gender markers, or the use of uncertain, dramatic, and informal language, according to a study by MIT researchers.

These findings indicate that LLMs take nonclinical information into account for clinical decision-making in previously unknown ways.

It brings to light the need for more rigorous studies of LLMs before they are deployed for high-stakes applications like making treatment recommendations, the researchers say.

“These models are often trained and tested on medical exam questions but then used in tasks that a…

2 weeks, 6 days назад @ news.mit.edu
Researchers present bold ideas for AI at MIT Generative AI Impact Consortium kickoff event
Researchers present bold ideas for AI at MIT Generative AI Impact Consortium kickoff event Researchers present bold ideas for AI at MIT Generative AI Impact Consortium kickoff event

Launched in February of this year, the MIT Generative AI Impact Consortium (MGAIC), a presidential initiative led by MIT’s Office of Innovation and Strategy and administered by the MIT Stephen A. Schwarzman College of Computing, issued a call for proposals, inviting researchers from across MIT to submit ideas for innovative projects studying high-impact uses of generative AI models.

The call received 180 submissions from nearly 250 faculty members, spanning all of MIT’s five schools and the college.

The overwhelming response across the Institute exemplifies the growing interest in AI and follows in the wake of MIT’s Generative AI Week and call for impact papers.

Over 30 funding recipients p…

3 weeks, 1 day назад @ news.mit.edu
Combining technology, education, and human connection to improve online learning
Combining technology, education, and human connection to improve online learning Combining technology, education, and human connection to improve online learning

“My years of organizing learning and making communities — both in person and online — have shown me firsthand how powerful social interaction can be for motivation and curiosity,” Morris said.

Combining her observational skills with active community engagement, she works at the intersection of technology, education, and human connection to improve digital learning platforms.

This research builds on her experience increasing human connection in both physical and virtual learning environments.

“I’m developing a framework that combines AI-driven behavioral analysis with human expert assessment to study social learning dynamics,” she says.

“I aim to make two primary contributions: first, analys…

3 weeks, 4 days назад @ news.mit.edu
Berkeley AI
последний пост 3 months назад
Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)
Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign) Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)

Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications.

To mitigate the imminent prompt injection threat, we propose two fine-tuning-defenses, StruQ and SecAlign.

Prompt Injection Attack: CausesBelow is the threat model of prompt injection attacks.

Prompt injection threat model in LLM-integrated applicationsWe propose that prompt injection has two causes.

Below are resources to learn more and keep updated on prompt injection attacks and defenses.

3 months назад @ bair.berkeley.edu
Repurposing Protein Folding Models for Generation with Latent Diffusion
Repurposing Protein Folding Models for Generation with Latent Diffusion Repurposing Protein Folding Models for Generation with Latent Diffusion

Repurposing Protein Folding Models for Generation with Latent DiffusionPLAID is a multimodal generative model that simultaneously generates protein 1D sequence and 3D structure, by learning the latent space of protein folding models.

In PLAID, we develop a method that learns to sample from the latent space of protein folding models to generate new proteins.

Unlike many previous protein structure generative models, PLAID addresses the multimodal co-generation problem setting: simultaneously generating both discrete sequence and continuous all-atom structural coordinates.

In this way, we can use structural understanding information in the weights of pretrained protein folding models for the p…

3 months назад @ bair.berkeley.edu
Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment
Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment

Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway DeploymentTraining Diffusion Models with Reinforcement LearningWe deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone.

The challenges of phantom jamsA stop-and-go wave moving backwards through highway traffic.

Smoothing behavior of RL AVs.

Overall, the steps towards deployment involved:Training in data-driven simulations: We used highway traffic data from I-24 to create a training environment with realistic wave dynamics, then validate the trained agent’s performance and robustness in a variety of new traffic scenarios.…

3 months, 2 weeks назад @ bair.berkeley.edu
Virtual Personas for Language Models via an Anthology of Backstories
Virtual Personas for Language Models via an Anthology of Backstories Virtual Personas for Language Models via an Anthology of Backstories

Virtual Personas for Language Models via an Anthology of BackstoriesWe introduce Anthology, a method for conditioning LLMs to representative, consistent, and diverse virtual personas by generating and utilizing naturalistic backstories with rich details of individual values and experience.

What does it mean for large language models (LLMs) to be trained on massive text corpora, collectively produced by millions and billions of distinctive human authors?

In this work, we introduce Anthology, an approach for steering LLMs to representative, consistent, and diverse virtual personas by providing richly detailed life narratives of individuals as conditioning context to models.

By grounding langu…

8 months назад @ bair.berkeley.edu
Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination
Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination

Linguistic Bias in ChatGPT: Language Models Reinforce Dialect DiscriminationSample language model responses to different varieties of English and native speaker reactions.

Over 1 billion people around the world speak varieties such as Indian English, Nigerian English, Irish English, and African-American English.

Then, we compared the language model responses to the “standard” varieties and the non-“standard” varieties.

Here, we included the original GPT-3.5 responses, plus responses from GPT-3.5 and GPT-4 where the models were told to imitate the style of the input.

That can reinforce barriers against speakers of non-“standard” varieties as AI models become increasingly used in …

9 months, 3 weeks назад @ bair.berkeley.edu
AWS Machine Learning AWS Machine Learning
последний пост 1 day, 15 hours назад
Advanced fine-tuning methods on Amazon SageMaker AI
Advanced fine-tuning methods on Amazon SageMaker AI Advanced fine-tuning methods on Amazon SageMaker AI

This method involves training models to internalize specific principles or rules, then using these principles to guide generation and self-improvement.

For more information about DPO, see Align Meta Llama 3 to human preferences with DPO Amazon SageMaker Studio and Amazon SageMaker Ground Truth.

Amazon SageMaker HyperPod offers fine-tuning capabilities for supported foundation models (FMs), and Amazon SageMaker Model Training offers flexibility for custom fine-tuning implementations along with training the models at scale without the need to manage infrastructure.

P-tuningP-tuning extends prompt tuning by representing prompts as continuous embeddings passed through a small trainable neural n…

1 day, 15 hours назад @ aws.amazon.com
Streamline machine learning workflows with SkyPilot on Amazon SageMaker HyperPod
Streamline machine learning workflows with SkyPilot on Amazon SageMaker HyperPod Streamline machine learning workflows with SkyPilot on Amazon SageMaker HyperPod

In this post, we share how SageMaker HyperPod, in collaboration with SkyPilot, is streamlining AI development workflows.

SageMaker HyperPod orchestrated on Amazon Elastic Kubernetes Service (Amazon EKS) combines the power of Kubernetes with the resilient environment of SageMaker HyperPod designed for training large models.

We go over the process of creating a SageMaker HyperPod cluster, installing SkyPilot, creating a SkyPilot cluster, and deploying a SkyPilot training job.

PrerequisitesYou must have the following prerequisites:An existing SageMaker HyperPod cluster with Amazon EKS (to create one, refer to Deploy Your HyperPod Cluster).

To create the cluster using the SageMaker HyperPod API…

1 day, 15 hours назад @ aws.amazon.com
Intelligent document processing at scale with generative AI and Amazon Bedrock Data Automation
Intelligent document processing at scale with generative AI and Amazon Bedrock Data Automation Intelligent document processing at scale with generative AI and Amazon Bedrock Data Automation

To learn more about Amazon Bedrock Data Automation, refer to Simplify multimodal generative AI with Amazon Bedrock Data Automation.

For cases requiring further customization, the solution also provides alternative processing paths using Amazon Bedrock FMs and Amazon Textract integration.

As part of the workflow, we use AWS Lambda functions to call Amazon Bedrock Data Automation or Amazon Textract and Amazon Bedrock (depending on the selected parsing mode).

The Lambda function creates a blueprint with the user-defined fields schema and launches an asynchronous Amazon Bedrock Data Automation job.

Configure Amazon Bedrock model access by opening the Amazon Bedrock console in the Region specifi…

1 day, 16 hours назад @ aws.amazon.com
Build a conversational data assistant, Part 2 – Embedding generative business intelligence with Amazon Q in QuickSight
Build a conversational data assistant, Part 2 – Embedding generative business intelligence with Amazon Q in QuickSight Build a conversational data assistant, Part 2 – Embedding generative business intelligence with Amazon Q in QuickSight

Each Q topic contains metadata about available metrics, dimensions, time periods, and synonyms—making topic selection a critical step in delivering accurate visual insights.

The right phrasing significantly improves response quality by making sure Amazon Q in QuickSight precisely understands what to visualize.

Embedding Amazon Q in QuickSight visualizations within the chat interfaceWhen users ask to visualize data in the RRDA chat interface, the system seamlessly embeds Amazon Q in QuickSight directly within the conversation.

For detailed information about the embedding capabilities used, see Embedding the Amazon Q in QuickSight Generative Q&A experience.

This pipeline makes sure RRDA consi…

1 day, 16 hours назад @ aws.amazon.com
Build a conversational data assistant, Part 1: Text-to-SQL with Amazon Bedrock Agents
Build a conversational data assistant, Part 1: Text-to-SQL with Amazon Bedrock Agents Build a conversational data assistant, Part 1: Text-to-SQL with Amazon Bedrock Agents

Business domain classification is crucial because enterprises can have metrics that might be calculated differently across different programs, teams, or departments.

Amazon Bedrock agent overviewAt the core of RRDA’s architecture is an Amazon Bedrock agent powered by Anthropic’s Claude 3.5 Haiku on Amazon Bedrock.

Metrics dictionary with business domain filteringRRDA’s metrics dictionary serves as the source of truth for over 1,000 metrics across WWRR’s business domains.

This action group retrieves only metrics within that specific business domain by applying a domain filter to the vector search configuration.

ConclusionThe Returns & ReCommerce Data Assist transforms data access at WWRR by …

1 day, 16 hours назад @ aws.amazon.com
Implement user-level access control for multi-tenant ML platforms on Amazon SageMaker AI
Implement user-level access control for multi-tenant ML platforms on Amazon SageMaker AI Implement user-level access control for multi-tenant ML platforms on Amazon SageMaker AI

Although Amazon SageMaker Studio provides user-level execution roles, this approach becomes unwieldy as organizations scale and team sizes grow.

With SageMaker AI, platform teams can create dedicated Amazon SageMaker Studio domains for each business unit, thereby maintaining resource isolation between workloads.

PrerequisitesBefore implementing this solution, make sure your SageMaker Studio domain meets the following criteria:Roles used with SageMaker AI (both SageMaker Studio and roles passed to SageMaker pipelines or processing jobs) have the sts:SetSourceIdentity permission in their trust policy.

SageMaker AI access controlWhen managing a shared SageMaker Studio domain with multiple user…

1 day, 16 hours назад @ aws.amazon.com
Long-running execution flows now supported in Amazon Bedrock Flows in public preview
Long-running execution flows now supported in Amazon Bedrock Flows in public preview Long-running execution flows now supported in Amazon Bedrock Flows in public preview

With Amazon Bedrock Flows, you can link foundation models (FMs), Amazon Bedrock Prompt Management, Amazon Bedrock Agents, Amazon Bedrock Knowledge Bases, Amazon Bedrock Guardrails, and other AWS services together to build and scale predefined generative AI workflows.

Builder-friendly – You can create and manage long-running execution flows through both the Amazon Bedrock API and Amazon Bedrock console.

– You can create and manage long-running execution flows through both the Amazon Bedrock API and Amazon Bedrock console.

Long-running execution flow support in Amazon Bedrock Flows is now available in public preview in AWS Regions where Amazon Bedrock Flows is available, except for the AWS Go…

1 day, 16 hours назад @ aws.amazon.com
Fraud detection empowered by federated learning with the Flower framework on Amazon SageMaker AI
Fraud detection empowered by federated learning with the Flower framework on Amazon SageMaker AI Fraud detection empowered by federated learning with the Flower framework on Amazon SageMaker AI

Federated learning with the Flower framework on SageMaker AIWith federated learning, multiple institutions can train a shared model while keeping their data decentralized, addressing privacy and security concerns in fraud detection.

A fair evaluation approach for federated learning modelsA critical aspect of federated learning is facilitating a fair and unbiased evaluation of trained models.

For more information, refer to the Flower Federated Learning Workshop, which provides detailed guidance on setting up and running federated learning workloads effectively.

Results and key takeawaysThe implementation of federated learning for fraud detection has demonstrated significant improvements in m…

1 day, 17 hours назад @ aws.amazon.com
Building intelligent AI voice agents with Pipecat and Amazon Bedrock – Part 2
Building intelligent AI voice agents with Pipecat and Amazon Bedrock – Part 2 Building intelligent AI voice agents with Pipecat and Amazon Bedrock – Part 2

Amazon Nova Sonic also supports tool use and agentic RAG with Amazon Bedrock Knowledge Bases enabling your voice agents to retrieve information.

This example shows how to build a complete voice AI agent with Amazon Nova Sonic and Pipecat.

Customize your voice AI agentTo customize your voice AI agent, start by:Modifying bot.py to change conversation logic.

ResourcesTo learn more about voice AI agents, see the following resources:To get started with your own voice AI project, contact your AWS account team to explore an engagement with AWS Generative AI Innovation Center (GAIIC).

His team partners with AWS customers on generative AI projects, with the goal of accelerating customers’ adoption o…

1 day, 17 hours назад @ aws.amazon.com
Uphold ethical standards in fashion using multimodal toxicity detection with Amazon Bedrock Guardrails
Uphold ethical standards in fashion using multimodal toxicity detection with Amazon Bedrock Guardrails Uphold ethical standards in fashion using multimodal toxicity detection with Amazon Bedrock Guardrails

In this post, we cover the use of the multimodal toxicity detection feature of Amazon Bedrock Guardrails to guard against toxic content.

Take the first step toward responsible AI within your creative practices with Amazon Bedrock Guardrails.

Create a multimodal guardrail in Amazon BedrockThe foundation of our moderation system is a guardrail in Amazon Bedrock configured specifically for image content.

We use the AWS SDK for Python (Boto3) to interact with Amazon S3 and Amazon Bedrock.

By implementing Amazon Bedrock Guardrails multimodal toxicity detection, fashion brands can automatically screen content for potentially harmful material before it impacts their reputation or violates their et…

1 day, 17 hours назад @ aws.amazon.com
New capabilities in Amazon SageMaker AI continue to transform how organizations develop AI models
New capabilities in Amazon SageMaker AI continue to transform how organizations develop AI models New capabilities in Amazon SageMaker AI continue to transform how organizations develop AI models

Since launching in 2017, SageMaker AI has transformed how organizations approach AI model development by reducing complexity while maximizing performance.

Today, we’re pleased to announce new innovations that build on the rich features of SageMaker AI to accelerate how customers build and train AI models.

Amazon SageMaker HyperPod: The infrastructure of choice for developing AI modelsAWS launched Amazon SageMaker HyperPod in 2023 to reduce complexity and maximize performance and efficiency when building AI models.

ConclusionIn this post, we shared some of the new innovations in SageMaker AI to accelerate how you can build and train AI models.

To learn more about these new features, SageMake…

2 days, 14 hours назад @ aws.amazon.com
Accelerate foundation model development with one-click observability in Amazon SageMaker HyperPod
Accelerate foundation model development with one-click observability in Amazon SageMaker HyperPod Accelerate foundation model development with one-click observability in Amazon SageMaker HyperPod

SageMaker HyperPod observability is available for SageMaker HyperPod clusters with an Amazon EKS orchestrator.

If you don’t already have a SageMaker HyperPod cluster with an Amazon EKS orchestrator, refer to Amazon SageMaker HyperPod quickstart workshops for instructions to create one.

Enable SageMaker HyperPod observabilityTo enable SageMaker HyperPod observability, follow these steps:On the SageMaker AI console, choose Cluster management in the navigation pane.

ConclusionThis post provided an overview and usage instructions for SageMaker HyperPod observability, a newly released observability feature for SageMaker HyperPod.

For more information about SageMaker HyperPod observability, see A…

2 days, 14 hours назад @ aws.amazon.com
Accelerating generative AI development with fully managed MLflow 3.0 on Amazon SageMaker AI
Accelerating generative AI development with fully managed MLflow 3.0 on Amazon SageMaker AI Accelerating generative AI development with fully managed MLflow 3.0 on Amazon SageMaker AI

Amazon SageMaker now offers fully managed support for MLflow 3.0 that streamlines AI experimentation and accelerates your generative AI journey from idea to production.

With the launch of fully managed MLflow 3.0 on Amazon SageMaker AI, you can accelerate generative AI development by making it easier to track experiments and observe behavior of models and AI applications using a single tool.

To begin tracking your experiments with your newly created SageMaker managed MLflow tracking server, you need to install both MLflow and the AWS SageMaker MLflow Python packages in your environment.

A LoggedModel entity in managed MLflow 3.0 represents your AI model, agent, or generative AI application …

2 days, 14 hours назад @ aws.amazon.com
Amazon SageMaker HyperPod launches model deployments to accelerate the generative AI model development lifecycle
Amazon SageMaker HyperPod launches model deployments to accelerate the generative AI model development lifecycle Amazon SageMaker HyperPod launches model deployments to accelerate the generative AI model development lifecycle

Today, we’re excited to announce that Amazon SageMaker HyperPod now supports deploying foundation models (FMs) from Amazon SageMaker JumpStart, as well as custom or fine-tuned models from Amazon S3 or Amazon FSx.

With Amazon EKS support in SageMaker HyperPod you can orchestrate your HyperPod Clusters with EKS.

You can navigate to SageMaker Studio, go to SageMaker JumpStart and select the open-weights model you want to deploy, and select SageMaker HyperPod.

To learn more, visit the Amazon SageMaker HyperPod documentation or try the HyperPod inference getting started guide in the AWS Management Console.

Chaitanya Hazarey leads software development for inference on SageMaker HyperPod at Amazon…

2 days, 14 hours назад @ aws.amazon.com
Supercharge your AI workflows by connecting to SageMaker Studio from Visual Studio Code
Supercharge your AI workflows by connecting to SageMaker Studio from Visual Studio Code Supercharge your AI workflows by connecting to SageMaker Studio from Visual Studio Code

AI developers and machine learning (ML) engineers can now use the capabilities of Amazon SageMaker Studio directly from their local Visual Studio Code (VS Code).

Solution overviewThe following diagram showcases the interaction between your local IDE and SageMaker Studio spaces.

You use compatible SageMaker Distribution images (2.7+ and 3.1+) for running SageMaker Studio spaces, or a custom image.

Best practices and guardrailsFollow the principle of least privilege when allowing users to connect remotely to SageMaker Studio spaces applications.

Get started today with SageMaker Studio remote IDE connection to connect your local development environment to SageMaker Studio and experience stream…

2 days, 14 hours назад @ aws.amazon.com
NVIDIA
последний пост 1 day, 18 hours назад
Forecasting the Weather Beyond Two Weeks Using NVIDIA Earth-2
Forecasting the Weather Beyond Two Weeks Using NVIDIA Earth-2 Forecasting the Weather Beyond Two Weeks Using NVIDIA Earth-2

Using AI models to forecast weather and climate has gained significant steam in research over the past two years, and is now gaining traction in operational settings.

The latest release of Earth2Studio has introduced a new subseasonal-to-seasonal (S2S) forecasting capability demonstrated in the context of the Deep Learning Earth System Model (DLESyM).

This is a parsimonious deep-learning model that couples a multi-layer atmosphere AI model to a separate ocean AI model that predicts sea surface temperature evolution.

The AI Weather Quest competition from the European Centre for Medium-Range Weather Forecasting (ECMWF) aims to accelerate community participation in advancing S2S forecasting.

F…

1 day, 18 hours назад @ developer.nvidia.com
A Gaming GPU Helps Crack the Code on a Thousand-Year Cultural Conversation
A Gaming GPU Helps Crack the Code on a Thousand-Year Cultural Conversation A Gaming GPU Helps Crack the Code on a Thousand-Year Cultural Conversation

Ceramics — the humble mix of earth, fire and artistry — have been part of a global conversation for millennia.

Researchers there, alongside colleagues at UNSW Sydney, have built an AI system that can classify Chinese ceramics and predict their value with uncanny precision.

Wu’s team aims to change that by making cultural appraisal more objective, scalable and accessible to a wider audience.

In one test, the AI assessed a Ming Dynasty artifact at roughly 30% below its final hammer price.

The team is already exploring AI for other forms of cultural visual heritage, from Cantonese opera costumes to historical murals.

1 day, 20 hours назад @ blogs.nvidia.com
Indonesia on Track to Achieve Sovereign AI Goals With NVIDIA, Cisco and IOH
Indonesia on Track to Achieve Sovereign AI Goals With NVIDIA, Cisco and IOH Indonesia on Track to Achieve Sovereign AI Goals With NVIDIA, Cisco and IOH

Leaders from these companies reveal the nation’s new AI Center of Excellence, supported by the Indonesia government, that will foster localized AI research, nurture local talent and drive innovation with NVIDIA Inception startups.

Building out the nation’s AI infrastructure is a crucial part of this plan.

That’s why Indonesian telecommunications leader Indosat Ooredoo Hutchison, aka Indosat or IOH, has partnered with Cisco and NVIDIA to support the establishment of Indonesia’s AI Center of Excellence (CoE).

As part of the CoE, a new NVIDIA AI Technology Center will offer research support, NVIDIA Inception program benefits for eligible startups, and NVIDIA Deep Learning Institute training an…

2 days, 5 hours назад @ blogs.nvidia.com
Reach the ‘PEAK’ on GeForce NOW
Reach the ‘PEAK’ on GeForce NOW Reach the ‘PEAK’ on GeForce NOW

The popular indie co-op climber is now streaming as one of four new releases — and ‘Tony Hawk’s Pro Skater 3 + 4’ is coming soon.

Grab a friend and climb toward the clouds — PEAK is now available on GeForce NOW, enabling members to try the hugely popular indie hit on virtually any device.

Plus, members can look forward to Tony Hawk’s Pro Skater 3 + 4 coming soon.

Ultimate members can play PEAK and more than 2,000 other games they already own with extended session lengths and ultralow latency on a GeForce RTX 4080 rig.

Get HypedTony Hawk’s Pro Skater 3 + 4 is coming soon to GeForce NOW.

2 days, 20 hours назад @ blogs.nvidia.com
How to Run Coding Assistants for Free on RTX AI PCs and Workstations
How to Run Coding Assistants for Free on RTX AI PCs and Workstations How to Run Coding Assistants for Free on RTX AI PCs and Workstations

AI-powered copilots deliver real-time assistance for projects from academic projects to production code — and are optimized for RTX AI PCs.

There are two ways to run coding assistants: in the cloud or locally.

It’s a chance to win prizes and showcase what’s possible with RTX AI PCs.

Join NVIDIA’s Discord server to connect with community developers and AI enthusiasts for discussions on what’s possible with RTX AI.

Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

2 days, 20 hours назад @ blogs.nvidia.com
From Terabytes to Turnkey: AI-Powered Climate Models Go Mainstream
From Terabytes to Turnkey: AI-Powered Climate Models Go Mainstream From Terabytes to Turnkey: AI-Powered Climate Models Go Mainstream

That’s the promise of ClimSim-Online, a reproducible framework for developing and deploying hybrid physics-machine learning climate models at scale.

From terabytes to turnkey: training AI to emulate complex nested climate physicsClimSim-Online builds on the award-winning ClimSim dataset, introduced at NeurIPS 2023.

Plug in and simulateClimSim-Online was pioneered by NVIDIA to make hybrid climate modeling accessible to the broader ML community.

Red lines represent hybrid models with microphysics constraints, approaching the theoretical lower bound of internal atmosphere unpredictability.

Whether you’re an AI researcher eager to work on climate or a climate scientist curious about the power o…

2 days, 20 hours назад @ developer.nvidia.com
Delivering the Missing Building Blocks for NVIDIA CUDA Kernel Fusion in Python
Delivering the Missing Building Blocks for NVIDIA CUDA Kernel Fusion in Python Delivering the Missing Building Blocks for NVIDIA CUDA Kernel Fusion in Python

However, the lack of “building blocks” forces Python library developers to drop down to C++ to implement custom algorithms.

Let’s time the algorithm we just built using parallel alongside a naive implementation that uses CuPy’s array operations.

of 7 runs, 10,000 loops each)We see that our approach, combining parallel iterators and algorithms, is faster than a naive approach.

By using parallel , you don’t have to jump through multiple layers of Python before invoking device code.

Using CUDA C++ and writing custom Python bindings to Thrust or CUB abstractions.

3 days, 14 hours назад @ developer.nvidia.com
New Learning Pathway: Deploy AI Models with NVIDIA NIM on GKE
New Learning Pathway: Deploy AI Models with NVIDIA NIM on GKE New Learning Pathway: Deploy AI Models with NVIDIA NIM on GKE

Connect with peers & expertsParticipate in an exclusive forum to connect with fellow developers, share insights, and learn from the best in the field.

Get answers from both Google and NVIDIA experts.

4 days, 14 hours назад @ developers.google.com
Asking an Encyclopedia-Sized Question: How To Make the World Smarter with Multi-Million Token Real-Time Inference
Asking an Encyclopedia-Sized Question: How To Make the World Smarter with Multi-Million Token Real-Time Inference Asking an Encyclopedia-Sized Question: How To Make the World Smarter with Multi-Million Token Real-Time Inference

Decoding bottlenecks: KV cache and FFN weight readsTo support real-time decoding at scale, a system must overcome two major bottlenecks during the decoding (aka generation) phase:Key-Value (KV) cache streaming: When handling multi-million-token contexts, each GPU must read a massive history of past tokens (KV cache) from DRAM per sample.

When handling multi-million-token contexts, each GPU must read a massive history of past tokens (KV cache) from DRAM per sample.

In the case of MLA, the upper limit for TP is just one to avoid duplication of KV cache.

Inspired by the structure of a DNA helix, Helix interweaves multiple dimensions of parallelism—KV, tensor, and expert—into a unified executio…

5 days, 8 hours назад @ developer.nvidia.com
RAPIDS Adds GPU Polars Streaming, a Unified GNN API, and Zero-Code ML Speedups
RAPIDS Adds GPU Polars Streaming, a Unified GNN API, and Zero-Code ML Speedups RAPIDS Adds GPU Polars Streaming, a Unified GNN API, and Zero-Code ML Speedups

These include a Polars GPU streaming engine, a unified API for graph neural networks (GNNs), and acceleration for support vector machines with zero code changes required.

In September 2024, we worked with the Polars team to launch a Polars GPU engine built on top of NVIDIA cuDF.

The 25.06 release brings some significant updates to Polars GPU engine capabilities.

Streaming executor is now experimentally availableWith the 25.06 release, we introduced streaming execution in the Polars GPU engine.

ConclusionThe RAPIDS 25.06 release brings zero-code-change functionality to new machine learning algorithms, a new Polars GPU streaming engine, hardware decompression capabilities for async memory res…

1 week, 2 days назад @ developer.nvidia.com
GeForce NOW’s 20 July Games Bring the Heat to the Cloud
GeForce NOW’s 20 July Games Bring the Heat to the Cloud GeForce NOW’s 20 July Games Bring the Heat to the Cloud

Celebrate the six new games available this week with the GeForce NOW Summer Sale.

Catch the scorching lineup of 20 titles coming to the cloud, which gamers can play whether indoors or on the go.

Six new games are landing on GeForce NOW this week, including launch day titles Figment and Little Nightmares II.

And to make the summer even hotter, the GeForce NOW Summer Sale is in full swing.

(New release on Steam, June 17)(New release on Steam, June 17) METAL EDEN Demo (Steam)Demo (Steam) Torque Drift 2 (Epic Games Store)(Epic Games Store) Broken Age (Steam)(Steam) Sandwich Simulator (Steam)(Steam) We Happy Few (Steam)What are you planning to play this weekend?

1 week, 2 days назад @ blogs.nvidia.com
NVIDIA RTX AI Accelerates FLUX.1 Kontext — Now Available for Download
NVIDIA RTX AI Accelerates FLUX.1 Kontext — Now Available for Download NVIDIA RTX AI Accelerates FLUX.1 Kontext — Now Available for Download

NVIDIA has collaborated with Black Forest Labs to optimize FLUX.1 Kontext [dev] for NVIDIA RTX GPUs using the NVIDIA TensorRT software development kit and quantization to deliver faster inference with lower VRAM requirements.

The FLUX.1 Kontext [dev] Flex: In-Context Image GenerationBlack Forest Labs in May introduced the FLUX.1 Kontext family of image models which accept both text and image prompts.

Learn more about NVIDIA optimizations and how to get started with FLUX.1 Kontext [dev] on the NVIDIA Technical Blog.

Join NVIDIA’s Discord server to connect with community developers and AI enthusiasts for discussions on what’s possible with RTX AI.

Plug in to NVIDIA AI PC on Facebook, Instagra…

1 week, 3 days назад @ blogs.nvidia.com
Per-Tensor and Per-Block Scaling Strategies for Effective FP8 Training
Per-Tensor and Per-Block Scaling Strategies for Effective FP8 Training Per-Tensor and Per-Block Scaling Strategies for Effective FP8 Training

Delayed scaling and current scaling are two common approaches within per-tensor scaling.

Per-tensor current scalingWhile delayed scaling uses historical data, per-tensor current scaling offers an alternative that prioritizes real-time adaptability.

Building upon the concept of per-block scaling, Micro-Scaling FP8 (MXFP8) represents the NVIDIA Blackwell hardware-level solution for achieving efficient and stable FP8 training.

Block scalingBeyond per-tensor scaling strategies and hardware-specific block scaling like MXFP8, generic FP8 block scaling is a versatile and configurable approach to fine-grained precision control.

Unlike per-tensor methods that apply a single scale across an entire te…

1 week, 4 days назад @ developer.nvidia.com
How AI Factories Can Help Relieve Grid Stress
How AI Factories Can Help Relieve Grid Stress How AI Factories Can Help Relieve Grid Stress

Emerald AI, an NVIDIA Inception startup, is developing software to control power use during times of peak grid demand while meeting the performance requirements of data center AI workloads.

Emerald AI achieved this by orchestrating the host of different workloads that AI factories run.

“The Phoenix technology trial validates the vast potential of an essential element in data center flexibility,” said Anuja Ratnayake, who leads EPRI’s DCFlex Consortium.

“This test was an opportunity to completely reimagine AI data centers as helpful resources to help us operate the power grid more effectively and reliably,” said David Rousseau, president of SRP.

“AI factories can flex when the grid is tight …

1 week, 4 days назад @ blogs.nvidia.com
NVIDIA NeMo Retriever Scores First Place for Visual Retrieval
NVIDIA NeMo Retriever Scores First Place for Visual Retrieval NVIDIA NeMo Retriever Scores First Place for Visual Retrieval

NeMo Retriever tops several visual document retrieval leaderboards, setting new standards for RAG apps.

1 week, 5 days назад @ huggingface.co
Facebook
последний пост 2 months назад
Accelerating GPU indexes in Faiss with NVIDIA cuVS
Accelerating GPU indexes in Faiss with NVIDIA cuVS Accelerating GPU indexes in Faiss with NVIDIA cuVS

Meta and NVIDIA collaborated to accelerate vector search on GPUs by integrating NVIDIA cuVS into Faiss v1.10 , Meta’s open source library for similarity search.

In its latest release, Faiss 1.10.0 officially includes these algorithms from the NVIDIA cuVS library.

Faiss 1.10.0 also includes a new conda package that unlocks the ability to choose between the classic Faiss GPU implementations and the newer NVIDIA cuVS algorithms, making it easy for users to switch between GPU and CPU.

Build time (95% recall@10)Index Embeddings100M x 96(seconds) Embeddings5M x 1536(seconds) Faiss Classic Faiss cuVS Faiss Classic Faiss cuVS Faiss Classic Faiss cuVS IVF Flat IVF Flat 101.4 37.9 (2.7x) 24.4 15.2 (1…

2 months назад @ engineering.fb.com
Introducing AutoPatchBench: A Benchmark for AI-Powered Security Fixes
Introducing AutoPatchBench: A Benchmark for AI-Powered Security Fixes Introducing AutoPatchBench: A Benchmark for AI-Powered Security Fixes

We are introducing AutoPatchBench, a benchmark for the automated repair of vulnerabilities identified through fuzzing.

As illustrated, fixing a fuzzing crash involves:Analyzing the crash stack trace and the target code.

Inside AutoPatchBenchWe’re making AutoPatchBench publicly available as part of CyberSecEval 4 to encourage community collaboration in tackling the challenge of automating fuzzing crash repairs.

Then we find the lowest common ancestor (LCA) across all pairs of stacktraces offered by the groundtruth patch and the LLM patch.

As an experienced Security Engineer at Meta, your task is to address the following security-critical fuzzing crash.

2 months, 2 weeks назад @ engineering.fb.com
Building multimodal AI for Ray-Ban Meta glasses
Building multimodal AI for Ray-Ban Meta glasses Building multimodal AI for Ray-Ban Meta glasses

With our Ray-Ban Meta glasses, multimodal AI helps the glasses see what the wearer is seeing.

This means anyone wearing Ray-Ban Meta glasses can ask them questions about what they’re looking at.

On this episode of the Meta Tech Podcast, meet Shane, a research scientist at Meta who has spent the last seven years focusing on computer vision and multimodal AI for wearables.

Shane sits down with Pascal Hartig to share how his team is building foundational models for the Ray-Ban Meta glasses.

They talk about the unique challenges of AI glasses and pushing the boundaries of AI-driven wearable technology.

4 months, 1 week назад @ engineering.fb.com
Revolutionizing software testing: Introducing LLM-powered bug catchers
Revolutionizing software testing: Introducing LLM-powered bug catchers Revolutionizing software testing: Introducing LLM-powered bug catchers

WHAT IT ISMeta’s Automated Compliance Hardening (ACH) tool is a system for mutation-guided, LLM-based test generation.

Traditionally, automated test generation techniques sought merely to increase code coverage.

LLM-based test generation and LLM-based mutant generation are not new, but this is the first time they’ve been combined and deployed in large-scaled industrial systems.

WHAT’S NEXTOur novel approach combines LLM-based test generation and mutant generation to help automate complex technical organizational workflows in this space.

READ THE PAPERMutation-Guided LLM-based Test Generation at Meta

5 months, 1 week назад @ engineering.fb.com
Meta Andromeda: Supercharging Advantage+ automation with the next-gen personalized ads retrieval engine
Meta Andromeda: Supercharging Advantage+ automation with the next-gen personalized ads retrieval engine Meta Andromeda: Supercharging Advantage+ automation with the next-gen personalized ads retrieval engine

Unlocking advertiser value through industry-leading ML innovationMeta Andromeda is a personalized ads retrieval engine that leverages the NVIDIA Grace Hopper Superchip, to enable cutting edge ML innovation in the Ads retrieval stage to drive efficiency and advertiser performance.

Its deployment across Instagram and Facebook applications has achieved +6% recall improvement to the retrieval system, delivering +8% ads quality improvement on selected segments.

Andromeda is designed to maximize ads performance by utilizing the exponential growth in volume of eligible ads available to the retrieval stage.

The design is optimized for AI hardware, minimizing memory bandwidth bottlenecks and enablin…

7 months, 1 week назад @ engineering.fb.com
Sequence learning: A paradigm shift for personalized ads recommendations
Sequence learning: A paradigm shift for personalized ads recommendations Sequence learning: A paradigm shift for personalized ads recommendations

Meta’s ad recommendation engine, powered by deep learning recommendation models (DLRMs), has been instrumental in delivering personalized ads to people.

Learning from sequences: developing new sequence learning architectures to replace traditional DLRM neural network architectures.

A paradigm shift with learning from sequences for recommendation systemsMeta’s new system for ads recommendations uses sequence learning at its core.

Scaling the new sequence learning paradigmFollowing the redesign to shift from sparse feature learning to event-based sequence learning, the next focus was scaling across two domains — scaling the sequence learning architecture and scaling event sequences to be long…

7 months, 3 weeks назад @ engineering.fb.com
OCP Summit 2024: The open future of networking hardware for AI
OCP Summit 2024: The open future of networking hardware for AI OCP Summit 2024: The open future of networking hardware for AI

At Open Compute Project Summit (OCP) 2024, we’re sharing details about our next-generation network fabric for our AI training clusters.

We’ve expanded our network hardware portfolio and are contributing two new disaggregated network fabrics and a new NIC to OCP.

At Meta, we believe that open hardware drives innovation.

At Meta, we envision a future of AI hardware systems that are not only scalable, but also open and collaborative.

We encourage anyone who wants to help advance the future of networking hardware for AI to engage with OCP and Meta to help share the future of AI infrastructure.

9 months назад @ engineering.fb.com
Meta’s open AI hardware vision
Meta’s open AI hardware vision Meta’s open AI hardware vision

At the Open Compute Project (OCP) Global Summit 2024, we’re showcasing our latest open AI hardware designs with the OCP community.

These innovations include a new AI platform, cutting-edge open rack designs, and advanced network fabrics and components.

The open future of AI infraMeta is committed to open source AI.

We must also prioritize open and standardized models so we can leverage collective expertise, make AI more accessible, and work towards minimizing biases in our systems.​Just as important, we also need open AI hardware systems.

By addressing AI’s infrastructure needs together, we can unlock the true promise of open AI for everyone.​

9 months назад @ engineering.fb.com
How open source AI can improve population estimates, sustainable energy, and the delivery of climate change interventions
How open source AI can improve population estimates, sustainable energy, and the delivery of climate change interventions How open source AI can improve population estimates, sustainable energy, and the delivery of climate change interventions

Why we need better population mapsAccurate estimates of population are taken for granted in many countries.

As the world’s natural resource and energy demands scale, accurate population estimates also offer significant opportunities to improve sustainability efforts.

In addition to total population counts, Meta’s population maps also include demographic breakdowns for groups such as the number of children under five, women of reproductive age, youth, and the elderly.

AI-powered population estimates have been scientifically evaluated to be among the most accurate in the world for mapping population distribution for a variety of geographies and use-cases.

Please visit the Data for Good websit…

9 months, 1 week назад @ engineering.fb.com
Uber Engineering
последний пост None
neptune.ai neptune.ai
последний пост 1 week, 1 day назад
How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models
How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models How to Monitor, Diagnose, and Solve Gradient Issues in Foundation Models

What gradient issues occur during foundation model training?

During training, gradient descent updates model parameters by computing the gradients of the loss function via forward and backward passes.

The green line corresponds to a learning rate of 10, while the orange line has a learning rate of 0.1.

The gradient norm for the orange line with LR = 0.1 is very high in the first steps, while the gradient norm of the green line with LR = 10 diverges to NaN after a few steps.

Techniques for gradient stabilizationMonitoring gradient norms and training loss provides insights into the learning dynamics of the foundation models.

1 week, 1 day назад @ neptune.ai
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection]
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection] STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning [Paper Reflection]

Unstructured pruning removes individual weights, while structured pruning removes entire model components.

In the context of MoEs, as expert structures from training MoEs correspond to such patterns, pruning experts is a natural fit for structured pruning.

Thus, structured pruning does not significantly decrease kurtosis, leaving plenty of margin for unstructured pruning.

Since structured pruning primarily reduces architectural redundancy rather than reshaping the underlying weight distribution, our two-phase approach—leveraging unstructured pruning after structured pruning—outperforms unstructured-only pruning.

Since STUN does not make any assumption about base MoE models, it is generaliza…

1 month, 1 week назад @ neptune.ai
Evaluating RAG Pipelines
Evaluating RAG Pipelines Evaluating RAG Pipelines

Related Building LLM Applications With Vector Databases Read moreDimensions of RAG evaluationEvaluating a RAG pipeline means assessing its behavior across three dimensions:1.

The evaluation of the RAG pipeline is a multi-step process, starting with creating an evaluation dataset, then evaluating the individual components (retriever, generator, etc.

Curating an evaluation datasetThe first step in the RAG evaluation process is the creation of a ground truth dataset.

MAP considers both the presence and rank of relevant chunks but fails to consider the relative position of relevant chunks.

However, not all retrieved chunks are equally relevant and sometimes, the most relevant chunks might not b…

1 month, 4 weeks назад @ neptune.ai
How to Build an LLM Agent With AutoGen: Step-by-Step Guide
How to Build an LLM Agent With AutoGen: Step-by-Step Guide How to Build an LLM Agent With AutoGen: Step-by-Step Guide

The efficiency of an LLM agent depends on the selection of the right LLM model.

In this article, we’ll introduce the fundamental building blocks of LLM agents and then walk through the process of building an LLM agent step by step.

Building an LLM agent from scratchIn the following, we’ll build a trip-planning LLM agent from scratch.

Using AutoGen’s OpenAI Assistant Agent, we instantiate a prompt that the LLM agent will follow throughout its interactions.

Related Ethical Considerations and Best Practices in LLM Development Read moreEnhancing LLM agent performanceWhile architecting an LLM agent, you have to keep in mind opportunities to improve the performance of the LLM agent.

3 months, 3 weeks назад @ neptune.ai
Bayesian Deep Learning is Needed in the Age of Large-Scale AI [Paper Reflection]
Bayesian Deep Learning is Needed in the Age of Large-Scale AI [Paper Reflection] Bayesian Deep Learning is Needed in the Age of Large-Scale AI [Paper Reflection]

Moreover, I will make the case for why Bayesian deep learning can satisfy these desiderata and briefly review recent advances in the field.

The case for Bayesian deep learningBayesian deep learning uses the foundational statistical principles of Bayesian inference to endow deep learning systems with the ability to make probabilistic predictions.

However, Bayesian deep learning is unfortunately still not as easy to use as standard deep learning, which you can do these days in a few lines of PyTorch code.

If you want to use a Bayesian deep learning model, first, you have to think about specifying the prior.

If this is the case, trying out Bayesian deep learning is likely worth your while.

4 months назад @ neptune.ai
Introduction to State Space Models as Natural Language Models
Introduction to State Space Models as Natural Language Models Introduction to State Space Models as Natural Language Models

TL;DR State Space Models (SSMs) use first-order differential equations to represent dynamic systems.

Understanding state space modelsBefore exploring how State Space Models (SSMs) can function as components of large language models (LLMs), we’ll examine their foundational mechanics.

State space models for natural language processingState Space Models (SSMs), long established in time series analysis, have been utilized as trainable sequence models for decades.

Linear state space layers (LSSLs)So far, we’ve seen that State Space Models are efficient sequence models.

Improvements on the state matrix AIn the previous section, we explored how the original LSSL relied on a fixed, predefined form …

4 months, 1 week назад @ neptune.ai
Ethical Considerations and Best Practices in LLM Development
Ethical Considerations and Best Practices in LLM Development Ethical Considerations and Best Practices in LLM Development

To keep data secure throughout the model’s lifecycle, implement these practices: data anonymization, secure model serving and privacy penetration tests.

For example, a recruitment LLM favoring male applicants due to biased training data reflects a harmful bias that requires correction.

Monitor bias continuouslyMitigating bias isn’t a one-time effort—it requires ongoing monitoring to ensure that your LLM remains fair and effective across iterations.

Although these contributions are publicly available, the move opened up debates about the ethics of reusing community-contributed content for proprietary AI training.

Best practices for ethical LLM developmentNavigating the regulatory landscape r…

4 months, 2 weeks назад @ neptune.ai
Open LLMs are Necessary For Current Private Adaptations and Outperform Their Closed Alternatives [Paper Reflection]
Open LLMs are Necessary For Current Private Adaptations and Outperform Their Closed Alternatives [Paper Reflection] Open LLMs are Necessary For Current Private Adaptations and Outperform Their Closed Alternatives [Paper Reflection]

While much of the discussion around LLMs centers on task and computational performance, in our paper Open LLMs are Necessary for Current Private Adaptations and Outperform their Closed Alternatives, we focus on the privacy implications of using Open and Closed LLMs.

The threat space in adapting LLMs to private dataThe adaptation of Closed LLMs to private datasets introduces a multifaceted threat space.

Related Zero-Shot and Few-Shot Learning with LLMs Read morePrivate adaptation methods for Open LLMsUnlike Closed LLMs, Open LLMs provide access to their parameters, enabling more flexible and parameter-centric private adaptation methods.

Performance: All adaptation methods for Closed LLMs ach…

4 months, 3 weeks назад @ neptune.ai
Learnings From Teams Training Large-Scale Models: Challenges and Solutions For Monitoring at Hyperscale
Learnings From Teams Training Large-Scale Models: Challenges and Solutions For Monitoring at Hyperscale Learnings From Teams Training Large-Scale Models: Challenges and Solutions For Monitoring at Hyperscale

“What is not measured, cannot be improved.” This quote has become a guiding principle for teams training foundation models.

During my talk at NeurIPS, I broke down five key lessons learned from teams facing large-scale model training and monitoring.

Waabi’s teams, running large-scale ML experiments, needed a way to organize and share their experiment data efficiently.

Visualizing large datasetsWe generally do not think of dataset visualization as part of experiment monitoring.

Moving forwardThe path to efficient hyperscale training lies in combining robust monitoring, advanced debugging tools, and comprehensive experiment tracking.

4 months, 4 weeks назад @ neptune.ai
Mixture of Experts LLMs: Key Concepts Explained
Mixture of Experts LLMs: Key Concepts Explained Mixture of Experts LLMs: Key Concepts Explained

TL;DR Mixture of Experts (MoE) is a type of neural network architecture that employs sub-networks (experts) to process specific input parts.

This is the key idea behind Mixture of Expert LLMs.

The Switch-Language Transformer, Mixtral, GLaM, GShard, and DeepSeekMoE are Mixture of Experts LLMs (MoEs), which require only executing a portion of the model’s computational graph during inference.

Optimization strategies for MoE LLMs are discussed comprehensively in the papers introducing the Switch Transformer, GShard, and GLaM.

Mixture of Experts (MoE) is an approach to scaling LLMs to trillions of parameters with conditional computation while avoiding exploding computational costs.

5 months назад @ neptune.ai
Hyperparameter Optimization For LLMs: Advanced Strategies
Hyperparameter Optimization For LLMs: Advanced Strategies Hyperparameter Optimization For LLMs: Advanced Strategies

Advanced hyperparameter optimization strategies, like population-based training, Bayesian optimization, and adaptive LoRA, promise to balance computational effort and outcome.

To avoid this, learning rate schedules for LLMs start with a small learning rate and slowly ramp it up to its maximum value.

Can we use traditional machine learning hyperparameter optimization methods for LLMs?

| Modified based on: sourceHands-on: LLM hyperparameter optimization with neptune.aiOptuna is a framework for optimizing hyperparameter search using Bayesian optimization.

See the docs or watch a short product demo (2 min)Play with a live Neptune Scale projectRequest your early accessWhat’s next in LLM hyperpar…

5 months, 1 week назад @ neptune.ai
Multimodal Large Language Models
Multimodal Large Language Models Multimodal Large Language Models

TL;DR Multimodal Large Language Models (MLLMs) process data from different modalities like text, audio, image, and video.

This article explores Multimodal Large Language Models, exploring their core functionalities, challenges, and potential for various machine-learning domains.

Let’s break down the concept of Multimodal Large Language Models (MLLMs) by first understanding the terms “modal” and “multimodal:”“Modal” refers to a particular way of communicating or perceiving information.

| SourceGoogle: PaLM-EGoogle developed an embodied language model, PaLM-E, to incorporate continuous sensor modalities into language models and establish the link between words and perceptions.

Improving how t…

5 months, 2 weeks назад @ neptune.ai
How to Build and Evaluate a RAG System Using LangChain, Ragas, and neptune.ai
How to Build and Evaluate a RAG System Using LangChain, Ragas, and neptune.ai How to Build and Evaluate a RAG System Using LangChain, Ragas, and neptune.ai

In this guide, we’ll show you how to build a RAG system using the LangChain framework, evaluate its performance using Ragas, and track your experiments with neptune.ai.

Part 1: Building a baseline RAG system with LangChainIn the first part of this guide, we’ll use LangChain to build a RAG system for the blog posts in the LLMOps category on Neptune’s blog.

Ragas works smoothly with LangChain, making it a great choice for evaluating our RAG system.

Step 1: Generate a RAG evaluation datasetAn evaluation set for RAG tasks is similar to a question-answering task dataset.

Step 2: Choose RAG evaluation metricsAs mentioned earlier, Ragas offers both LLM-based and non-LLM-based metrics for RAG syste…

6 months, 2 weeks назад @ neptune.ai
Position: Understanding LLMs Requires More Than Statistical Generalization [Paper Reflection]
Position: Understanding LLMs Requires More Than Statistical Generalization [Paper Reflection] Position: Understanding LLMs Requires More Than Statistical Generalization [Paper Reflection]

In our paper, Understanding LLMs Requires More Than Statistical Generalization, we argue that current machine learning theory cannot explain the interesting emergent properties of Large Language Models, such as reasoning or in-context learning.

Inductive biases affect which solution the neural network converges to, such as the model architecture or the optimization algorithm.

How do language complexity and model architecture affect generalization ability?

showed how different neural network architectures generalize better for different language types.

Presumably, we’ll need to find different complexity measures for different model architectures that consider their specific inductive biases.

6 months, 3 weeks назад @ neptune.ai
From Research to Production: Building The Most Scalable Experiment Tracker For Foundation Models
From Research to Production: Building The Most Scalable Experiment Tracker For Foundation Models From Research to Production: Building The Most Scalable Experiment Tracker For Foundation Models

TL;DR At a large-scale model training (in huge models), anomalies are not rare events but problematic patterns that drive failure.

The Neptune Scale experiment tracker supports fault tolerance and is designed to maintain progress despite hardware failures, making it adaptable for enterprise teams tackling LLM fine-tuning, compliance, and building domain-specific models.

Experiment tracking back then was straightforward—dealing mostly with single models or small-scale distributed systems.

One of the biggest lessons we’ve learned is that experiment tracking has evolved into experiment monitoring.

That’s why we’re focusing on building intelligent alerts and anomaly detection right into our exp…

7 months назад @ neptune.ai
▶️ YouTube
Yannic Kilcher Yannic Kilcher
последний пост 2 months, 1 week назад
On the Biology of a Large Language Model (Part 2)
On the Biology of a Large Language Model (Part 2) On the Biology of a Large Language Model (Part 2)

An in-depth look at Anthropic's Transformer Circuit Blog Post

Part 1 here: https://youtu.be/mU3g2YPKlsA

Discord here: https;//ykilcher.com/discord https://transformer-circuits.pub/2025/attribution-graphs/biology.html Abstract:

We investigate the internal mechanisms used by Claude 3.5 Haiku — Anthropic's lightweight production model — in a variety of contexts, using our circuit tracing methodology. Authors:

Jack Lindsey†, Wes Gurnee*, Emmanuel Ameisen*, Brian Chen*, Adam Pearce*, Nicholas L. Turner*, Craig Citro*,

David Abrahams, Shan Carter, Basil Hosmer, Jonathan Marcus, Michael Sklar, Adly Templeton,

Trenton Bricken, Callum McDougall◊, Hoagy Cunningham, Thomas Henighan, Adam Jermyn, Andy …

2 months, 1 week назад @ youtube.com
On the Biology of a Large Language Model (Part 1)
On the Biology of a Large Language Model (Part 1) On the Biology of a Large Language Model (Part 1)

An in-depth look at Anthropic's Transformer Circuit Blog Post https://transformer-circuits.pub/2025/attribution-graphs/biology.html Abstract:

We investigate the internal mechanisms used by Claude 3.5 Haiku — Anthropic's lightweight production model — in a variety of contexts, using our circuit tracing methodology. Authors:

Jack Lindsey†, Wes Gurnee*, Emmanuel Ameisen*, Brian Chen*, Adam Pearce*, Nicholas L. Turner*, Craig Citro*,

David Abrahams, Shan Carter, Basil Hosmer, Jonathan Marcus, Michael Sklar, Adly Templeton,

Trenton Bricken, Callum McDougall◊, Hoagy Cunningham, Thomas Henighan, Adam Jermyn, Andy Jones, Andrew Persic, Zhenyi Qi, T. Ben Thompson,

Sam Zimmerman, Kelley Rivoire, Thom…

3 months, 1 week назад @ youtube.com
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (Paper Explained)
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (Paper Explained) DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (Paper Explained)

#deepseek #llm #reinforcementlearning GRPO is one of the core advancements used in Deepseek-R1, but was introduced already last year in this paper that uses a combination of new RL techniques and iterative data collection to achieve remarkable performance on mathematics benchmarks with just a 7B model. Paper: https://arxiv.org/abs/2402.03300 Abstract:

Mathematical reasoning poses a significant challenge for language models due to its complex and structured nature. In this paper, we introduce DeepSeekMath 7B, which continues pre-training DeepSeek-Coder-Base-v1.5 7B with 120B math-related tokens sourced from Common Crawl, together with natural language and code data. DeepSeekMath 7B has achie…

5 months, 2 weeks назад @ youtube.com
Traditional Holiday Live Stream
Traditional Holiday Live Stream Traditional Holiday Live Stream

https://ykilcher.com/discord Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https:/…

6 months, 2 weeks назад @ youtube.com
Byte Latent Transformer: Patches Scale Better Than Tokens (Paper Explained)
Byte Latent Transformer: Patches Scale Better Than Tokens (Paper Explained) Byte Latent Transformer: Patches Scale Better Than Tokens (Paper Explained)

#tokenization #llm #meta This paper does away with tokenization and creates an LLM architecture that operates on dynamically sized "patches" instead of tokens. By controlling the patch size, they gain a level of control over the tradeoff between model size and FLOPs and use that to achieve more favorable scaling behavior than classically tokenized LLMs. Paper: https://ai.meta.com/research/publications/byte-latent-transformer-patches-scale-better-than-tokens/

Code: https://github.com/facebookresearch/blt Abstract:

We introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale with significant imp…

6 months, 2 weeks назад @ youtube.com
Safety Alignment Should be Made More Than Just a Few Tokens Deep (Paper Explained)
Safety Alignment Should be Made More Than Just a Few Tokens Deep (Paper Explained) Safety Alignment Should be Made More Than Just a Few Tokens Deep (Paper Explained)

This paper demonstrates in a series of experiments that current safety alignment techniques of LLMs, as well as corresponding jailbreaking attacks, are in large part focusing on modulating the distribution of the first few tokens of the LLM response. Paper: https://openreview.net/forum?id=6Mxhg9PtDE&s=09 Abstract:

The safety alignment of current Large Language Models (LLMs) is vulnerable. Simple attacks, or even benign fine-tuning, can jailbreak aligned models. We note that many of these vulnerabilities are related to a shared underlying issue: safety alignment can take shortcuts, wherein the alignment adapts a model's generative distribution primarily over only its very first few output to…

7 months назад @ youtube.com
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (Paper Explained)
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (Paper Explained) TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (Paper Explained)

A deep dive into the TokenFormer and an opinion about its impact, novelty, and relation to prior work. Paper: https://arxiv.org/abs/2410.23168 Abstract:

Transformers have become the predominant architecture in foundation models due to their excellent performance across various domains. However, the substantial cost of scaling these models remains a significant concern. This problem arises primarily from their dependence on a fixed number of parameters within linear projections. When architectural modifications (e.g., channel dimensions) are introduced, the entire model typically requires retraining from scratch. As model sizes continue growing, this strategy results in increasingly high com…

7 months, 3 weeks назад @ youtube.com
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

This paper (by Apple) questions the mathematical reasoning abilities of current LLMs and designs a synthetic template-based dataset distribution to investigate various aspects around LLM performance of high-school level math questions. Paper: https://arxiv.org/abs/2410.05229 Abstract:

Recent advancements in Large Language Models (LLMs) have sparked interest in their formal reasoning capabilities, particularly in mathematics. The GSM8K benchmark is widely used to assess the mathematical reasoning of models on grade-school-level questions. While the performance of LLMs on GSM8K has significantly improved in recent years, it remains unclear whether their mathematical reasoning capabilities hav…

8 months, 3 weeks назад @ youtube.com
Were RNNs All We Needed? (Paper Explained)
Were RNNs All We Needed? (Paper Explained) Were RNNs All We Needed? (Paper Explained)

This paper posits the interesting question: How much of the performance of Mamba, S4, and other state-space-like models is actually just attributable to some very core concepts - rather than their elaborate architectures. The authors construct minimal versions of GRUs and LSTMs and report competitive performance. Paper: https://arxiv.org/abs/2410.01201 Abstract:

The scalability limitations of Transformers regarding sequence length have renewed interest in recurrent sequence models that are parallelizable during training. As a result, many novel recurrent architectures, such as S4, Mamba, and Aaren, have been proposed that achieve comparable performance. In this work, we revisit traditional …

9 months назад @ youtube.com
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper)
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper) Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper)

How can one best use extra FLOPS at test time? Paper: https://arxiv.org/abs/2408.03314 Abstract:

Enabling LLMs to improve their outputs by using more test-time computation is a critical step towards building generally self-improving agents that can operate on open-ended natural language. In this paper, we study the scaling of inference-time computation in LLMs, with a focus on answering the question: if an LLM is allowed to use a fixed but non-trivial amount of inference-time compute, how much can it improve its performance on a challenging prompt? Answering this question has implications not only on the achievable performance of LLMs, but also on the future of LLM pretraining and how one s…

9 months, 1 week назад @ youtube.com
Henry AI Labs Henry AI Labs
последний пост None
3blue1brown 3blue1brown
последний пост 2 months, 1 week назад
Summer of Math Exposition #4 | Teachers, I'd love to hear from you
Summer of Math Exposition #4 | Teachers, I'd love to hear from you Summer of Math Exposition #4 | Teachers, I'd love to hear from you

Make a math explainer, get feedback, and receive prizes: https://some.3b1b.co

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos. ------------------ These animations are largely made using a custom Python library, manim. See the FAQ comments here:

https://3b1b.co/faq#manim

https://github.com/3b1b/manim

https://github.com/ManimCommunity/manim/ All code for specific videos is visible here:

https://github.com/3b1b/videos/ The music is by Vincent Rubinetti.

https://www.vincentrubinetti.com

https://vincerubinetti.bandcamp.com/album/the-music-of-3blue1brown

https://open.spotify.com/…

2 months, 1 week назад @ youtube.com
Where my explanation of Grover’s algorithm failed
Where my explanation of Grover’s algorithm failed Where my explanation of Grover’s algorithm failed

Addressing viewer questions from the last video.

These lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to share the videos. ------------------ These animations are largely made using a custom Python library, manim. See the FAQ comments here:

https://3b1b.co/faq#manim

https://github.com/3b1b/manim

https://github.com/ManimCommunity/manim/ All code for specific videos is visible here:

https://github.com/3b1b/videos/ The music is by Vincent Rubinetti.

https://www.vincentrubinetti.com

https://vincerubinetti.bandcamp.com/album/the-music-of-3blue1brown

https://open.spotify.com/album/1dVyjwS8FBqXhRunaG5W5u ------------------ 3blue1brown is a ch…

2 months, 1 week назад @ youtube.com
But what is Quantum Computing? (Grover's Algorithm)
But what is Quantum Computing?  (Grover's Algorithm) But what is Quantum Computing? (Grover's Algorithm)

Qubits, state vectors, and Grover's algorithm for search.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to share the videos. The subtitles on this video were done using AI, and are likely imperfect, but they are open for community corrections at https://criblate.com/ Adam Brown's paper on the connection between Grover's Algorithm and block collisions:

https://arxiv.org/pdf/1912.02207 If you want to learn the relevant underlying quantum mechanics here, a very friendly resource is the course Mithuna at Looking Glass Universe is currently putting together. See, for instance, this explainer of a qubit:…

2 months, 1 week назад @ youtube.com
Testing your intuition for quantum computing
Testing your intuition for quantum computing Testing your intuition for quantum computing

Full video: https://youtu.be/RQWpF2Gb-gU

2 months, 1 week назад @ youtube.com
How to measure nearby galaxies
How to measure nearby galaxies How to measure nearby galaxies

From this video: https://youtu.be/hFMaT9oRbs4

2 months, 3 weeks назад @ youtube.com
Measuring the distance to Venus without radar
Measuring the distance to Venus without radar Measuring the distance to Venus without radar

From this video with Terry Tao: https://youtu.be/hFMaT9oRbs4

3 months назад @ youtube.com
Measuring the speed of light using Jupiter's moons
Measuring the speed of light using Jupiter's moons Measuring the speed of light using Jupiter's moons

From this video with Terry Tao: https://youtu.be/hFMaT9oRbs4

3 months назад @ youtube.com
The tragic tale of Guillaume Le Gentil
The tragic tale of Guillaume Le Gentil The tragic tale of Guillaume Le Gentil

From this video: https://youtu.be/hFMaT9oRbs4 Artwork by Kurt Bruns

3 months, 2 weeks назад @ youtube.com
Zooming out by powers of 10
Zooming out by powers of 10 Zooming out by powers of 10

From this video: https://youtu.be/YdOXS_9_P4U

3 months, 2 weeks назад @ youtube.com
There's more to those colliding blocks that compute pi
There's more to those colliding blocks that compute pi There's more to those colliding blocks that compute pi

Two colliding blocks compute pi, here we dig into the physics to explain why

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos. The original paper by Gregory Galperin:

https://www.maths.tcd.ie/~lebed/Galperin.%20Playing%20pool%20with%20pi.pdf Adam Brown's paper on the analogy with Grover's Algorithm:

https://arxiv.org/pdf/1912.02207 Here's a lovely interactive built by GitHub user prajwalsouza after watching this video: https://prajwalsouza.github.io/Experiments/Colliding-Blocks.html Matt Parker's Pi Day video:

https://youtu.be/vlUTlbZT4ig NY Times blog post about this proble…

4 months назад @ youtube.com
When being beautifully wrong leads to discovery
When being beautifully wrong leads to discovery When being beautifully wrong leads to discovery

Full video: https://youtu.be/YdOXS_9_P4U

4 months, 2 weeks назад @ youtube.com
Why the ancient Greek's rejected heliocentrism
Why the ancient Greek's rejected heliocentrism Why the ancient Greek's rejected heliocentrism

From this video on the cosmic distance ladder: https://youtu.be/YdOXS_9_P4U

4 months, 2 weeks назад @ youtube.com
How to estimate the distance to the sun
How to estimate the distance to the sun How to estimate the distance to the sun

Full video: https://youtu.be/YdOXS_9_P4U

4 months, 2 weeks назад @ youtube.com
How Aristarchus deduced the distance to the moon
How Aristarchus deduced the distance to the moon How Aristarchus deduced the distance to the moon

Full video: https://youtu.be/YdOXS_9_P4U

4 months, 2 weeks назад @ youtube.com
The cosmic distance ladder with Terence Tao (part 2)
The cosmic distance ladder with Terence Tao (part 2) The cosmic distance ladder with Terence Tao (part 2)

How we know the distances to the planets, stars, and faraway galaxies.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

FAQ with added details and corrections: https://terrytao.wordpress.com/2025/02/13/cosmic-distance-ladder-video-with-grant-sanderson-3blue1brown-commentary-and-corrections/ An equally valuable form of support is to simply share the videos. Terry and his collaborator Tanya have an Instagram about the cosmic distance ladder: https://www.instagram.com/cosmic_distance_ladder/ Artwork of Guillaume Le Gentil by Kurt Bruns

Artwork of Antonia Maury and Henrietta Leavitt by Talia Gershon: https://bit.ly/taliagershonart

Several of t…

4 months, 2 weeks назад @ youtube.com
Two Minute Papers Two Minute Papers
последний пост 4 days, 16 hours назад
NVIDIA’s New AI Cheated At Parkour…And Got Fixed!
NVIDIA’s New AI Cheated At Parkour…And Got Fixed! NVIDIA’s New AI Cheated At Parkour…And Got Fixed!

❤️ Check out DeepInfra and run DeepSeek or many other AI projects: https://deepinfra.com/papers 📝 The paper "PARC: Physics-based Augmentation with Reinforcement Learning for Character Controllers" is available here:

https://michaelx.io/parc/index.html 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Te…

4 days, 16 hours назад @ youtube.com
Unreal Engine 5.6: Outrageously Good!
Unreal Engine 5.6: Outrageously Good! Unreal Engine 5.6: Outrageously Good!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video Get Unreal Engine 5.6:

https://www.unrealengine.com/en-US/news/unreal-engine-5-6-is-now-available Substrate material source + tutorial : https://www.youtube.com/watch?v=YDeptWduHNc 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to t…

2 weeks, 1 day назад @ youtube.com
NVIDIA’s New AI Watched 150,000 Videos! What Did It Learn?
NVIDIA’s New AI Watched 150,000 Videos! What Did It Learn? NVIDIA’s New AI Watched 150,000 Videos! What Did It Learn?

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 The paper is available here:

https://research.nvidia.com/labs/toronto-ai/UniRelight/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Apple Music Sing demonstration source: https://www.youtube.com/watch?v=Q6Qpsvwh6mQ Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our ge…

2 weeks, 4 days назад @ youtube.com
OpenAI’s o3 Pro: Crushing The AI Game Test! 🎮
OpenAI’s o3 Pro: Crushing The AI Game Test! 🎮 OpenAI’s o3 Pro: Crushing The AI Game Test! 🎮

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 Paper+code: https://github.com/lmgame-org/GamingAgent

Some results: https://huggingface.co/spaces/lmgame/lmgame_bench

Try it out: https://lmgame.org 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supp…

3 weeks, 1 day назад @ youtube.com
NVIDIA’s New AI Grows Stuff Out Of Nothing!
NVIDIA’s New AI Grows Stuff Out Of Nothing! NVIDIA’s New AI Grows Stuff Out Of Nothing!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 The papers are available here:

https://research.nvidia.com/labs/toronto-ai/stochastic-preconditioning/

https://zju3dv.github.io/freetimegs/ Play with it (interactive viewer): https://www.4dv.ai/viewer/salmon_10s?showdemo=4dv 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/ar…

3 weeks, 5 days назад @ youtube.com
Google’s New AI: This Isn’t a Photo - But It Is!
Google’s New AI: This Isn’t a Photo - But It Is! Google’s New AI: This Isn’t a Photo - But It Is!

❤️ Check out the Fully Connected conference from Weights & Biases on June 17-18th in SF:

https://wandb.me/fc2025

Use the code FCSF2WP to get a ticket for free! 📝 The paper "Practical Inverse Rendering of Textured and Translucent Appearance" is available here:

https://weiphil.github.io/portfolio/practical_reconstruction 📝 Separable Subsurface Scattering:

https://users.cg.tuwien.ac.at/zsolnai/gfx/separable-subsurface-scattering-with-activision-blizzard/ Free rendering course!

https://users.cg.tuwien.ac.at/zsolnai/gfx/rendering-course/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Jua…

1 month назад @ youtube.com
NVIDIA’s New AI: Next Level Games Are Coming!
NVIDIA’s New AI: Next Level Games Are Coming! NVIDIA’s New AI: Next Level Games Are Coming!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The papers are available here:

https://research.nvidia.com/labs/toronto-ai/difix3d/

https://sites.google.com/view/cast4

https://syntec-research.github.io/UVGA/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedd…

1 month назад @ youtube.com
Microsoft’s New AI: Ray Tracing 16,000,000 Images!
Microsoft’s New AI: Ray Tracing 16,000,000 Images! Microsoft’s New AI: Ray Tracing 16,000,000 Images!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 The paper "RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination" is available here:

https://microsoft.github.io/renderformer/ 📝 Our neural rendering paper "Gaussian Material Synthesis" is available here:

https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, …

1 month назад @ youtube.com
NVIDIA’s New AI: From Video Games to Reality!
NVIDIA’s New AI: From Video Games to Reality! NVIDIA’s New AI: From Video Games to Reality!

❤️ Try Macro for free and supercharge your learning: https://macro.com/papers 📝 The #NVIDIA papers are available here:

https://research.nvidia.com/labs/dir/cosmos-transfer1/

https://research.nvidia.com/labs/dir/cosmos-reason1/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Ric…

1 month, 1 week назад @ youtube.com
NVIDIA’s New AI: Impossible Weather Graphics!
NVIDIA’s New AI: Impossible Weather Graphics! NVIDIA’s New AI: Impossible Weather Graphics!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 The papers are available here:

https://research.nvidia.com/labs/toronto-ai/WeatherWeaver/

https://research.nvidia.com/labs/toronto-ai/DiffusionRenderer/ Source: https://www.youtube.com/watch?v=CVdtLieI5D0 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01…

1 month, 2 weeks назад @ youtube.com
DeepMind’s Veo3 AI - The New King Is Here!
DeepMind’s Veo3 AI - The New King Is Here! DeepMind’s Veo3 AI - The New King Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 More on Veo3 available here:

https://deepmind.google/models/veo/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, …

1 month, 3 weeks назад @ youtube.com
DeepMind’s AlphaEvolve AI: History In The Making!
DeepMind’s AlphaEvolve AI: History In The Making! DeepMind’s AlphaEvolve AI: History In The Making!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 AlphaEvolve: https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/

📝 My genetic algorithm for the Mona Lisa: https://users.cg.tuwien.ac.at/zsolnai/gfx/mona_lisa_parallel_genetic_algorithm/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

1 month, 3 weeks назад @ youtube.com
New AI: Impossible Creatures Come Alive!
New AI: Impossible Creatures Come Alive! New AI: Impossible Creatures Come Alive!

❤️ Check out DeepInfra and run DeepSeek or many other AI projects: https://deepinfra.com/papers 📝 The papers are available here:

https://anytop2025.github.io/Anytop-page/

https://zhongleilz.github.io/Sketch2Anim/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Richard Sundvall,…

1 month, 4 weeks назад @ youtube.com
NVIDIA’s New AI: Impossible Video Game Animations!
NVIDIA’s New AI: Impossible Video Game Animations! NVIDIA’s New AI: Impossible Video Game Animations!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video 📝 The #NVIDIA paper "GENMO: A GENeralist Model for Human MOtion" is available here:

https://research.nvidia.com/labs/dair/genmo/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 Sources for SLAM:

https://www.youtube.com/watch?v=2GJuEIh4xGo

https://ww…

2 months назад @ youtube.com
3 Ways OpenAI’s ChatGPT Surprised Its Creators!
3 Ways OpenAI’s ChatGPT Surprised Its Creators! 3 Ways OpenAI’s ChatGPT Surprised Its Creators!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Guide for using DeepSeek on Lambda:

https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video OpenAI post: https://openai.com/index/expanding-on-sycophancy/ Paper on agreeableness: https://arxiv.org/abs/2212.09251 Source: https://x.com/georgejrjrjr/status/1917722125668081863/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to…

2 months назад @ youtube.com
DataFest Video DataFest Video
последний пост None
Семинары JetBrains Research Семинары JetBrains Research
последний пост None
Яндекс. Компьютерные науки Яндекс. Компьютерные науки
последний пост 22 часа назад
VLM: как видит мир Алиса от Яндекса
VLM: как видит мир Алиса от Яндекса VLM: как видит мир Алиса от Яндекса

Это фрагмент выступления Дарьи Виноградовой, руководителя команды стримингового зрения в Яндексе. На Data Fest Дарья рассказала о VLM в Алисе. #Алиса #Яндекс #нейросети #искусственныйинтеллект #VLM #AI #технологии #умныеустройства #мультимодальность #АлисаЯндекс #голосовойассистент #визуальноеAI #ЯндексАлиса #VLMтехнологии #ИИ

22 часа назад @ youtube.com
VLM: как нейросети связывают картинку и текст
VLM: как нейросети связывают картинку и текст VLM: как нейросети связывают картинку и текст

Это фрагмент выступления Дарьи Виноградовой, руководителя команды стримингового зрения в Яндексе. На Data Fest Дарья рассказала о VLM в Алисе. #VLM #нейросети #ИИ #искусственныйинтеллект #мультимодальность #AI #визуальныйИИ #нейросетитексткартинка #компьютерноеЗрение #моделиЯндекса #умныеустройства #технологиибудущего #визуальноятехнология #machinelearning #deeplearning #языкитехнологии #AIЯндекс #модельпокартинке #LLM #NLP #CV

4 days, 20 hours назад @ youtube.com
Что за Муза в Алисе от Яндекса
Что за Муза в Алисе от Яндекса Что за Муза в Алисе от Яндекса

Это фрагмент выступления Дарьи Виноградовой, руководителя команды стримингового зрения в Яндексе. На Data Fest Дарья рассказала о VLM в Алисе. #Алиса #Муза #Яндекс #VLM #DataFest #ИИ #нейросети #AI #умныеустройства #визуальныемодели #визуальныйИИ #ЯндексТехнологии #визуальноятехнология #ассистентыбудущего #технологии #компьютерноеЗрение #LLM #интеллектуальныйассистент #стриминговоезрение #нейроассистент #техконференция

1 week, 5 days назад @ youtube.com
Как прогнозировать тысячи временных рядов и не сходить с ума / Александр Исаков
Как прогнозировать тысячи временных рядов и не сходить с ума / Александр Исаков Как прогнозировать тысячи временных рядов и не сходить с ума / Александр Исаков

Александр Исаков, руководитель группы прогнозирования в Яндекс Лавке, прочитал этот доклад на конференции AHA!25. Александр рассказал о создании и внедрении системы среднесрочного прогнозирования для Яндекс Лавки. Эта система точно прогнозирует ключевую бизнес-метрику (заказы) на полугодовой период и помогает планировать ресурсы в условиях быстрого роста сервиса и высокой волатильности спроса. Александр подробно остановился на том, как ребята решили проблему волатильности временных рядов с помощью приведения данных к стабильному виду и выделения ключевых влияющих факторов, что позволило не только улучшить прогноз, но и объяснить отклонения от плана. А ещё рассказал, как они встроили прогноз…

2 weeks, 2 days назад @ youtube.com
Как продукты на основе LLM повышают эффективность сотрудников и экономят миллионы / Эльвира Морозова
Как продукты на основе LLM повышают эффективность сотрудников и экономят миллионы / Эльвира Морозова Как продукты на основе LLM повышают эффективность сотрудников и экономят миллионы / Эльвира Морозова

Эльвира Морозова, ХХ в Яндексе, прочитала этот доклад на конференции AHA!25. Эльвира рассказала, как ребята внедрили YaGPT в формате подсказок оператора в одну из поддержек Яндекса. И в итоге получили доказанный в A/B-тестах экстраэффект как на скорости работы операторов, так и на качестве ответов. Эльвира показала, почему «экзоскелет» из GPT для оператора поддержки — это новая норма. Спойлер: применение GPT в саппорте сейчас приносит 15% чистой экономии в Яндекс Маркете. Больше классных материалов ищите в телеграм-канале Yandex for ML: https://t.me/yandexforml #GPT #Яндекс #поддержка #аналитика #AHA25 #YaGPT #искусственныйинтеллект #YandexForAnalytics #LLM #AI

2 weeks, 3 days назад @ youtube.com
Будущее рекомендательных систем / Николай Савушкин
Будущее рекомендательных систем / Николай Савушкин Будущее рекомендательных систем / Николай Савушкин

Николай Савушкин, руководитель службы рекомендательных технологий в Яндексе, прочитал этот доклад на конференции AHA!25. Николай рассказал, как персонализация устроена сейчас, как большие генеративные модели внедряют в рекомендательные системы Яндекса и что ждёт нас в будущем. Доклад направлен на широкую аудиторию с техническим бэкграундом, знакомит с основными понятиями и показывает технологические тренды. Больше классных материалов ищите в телеграм-канале Yandex for ML: https://t.me/yandexforml #персонализация #рекомендации #машинноеобучение #генеративныемодели #Яндекс #AI #LLM #AHA25 #аналитика #YandexForAnalytics

2 weeks, 4 days назад @ youtube.com
Опыт внедрения ML-прогнозов в систему динамического ценообразования Яндекс Доставки / Андрей Нарцев
Опыт внедрения ML-прогнозов в систему динамического ценообразования Яндекс Доставки / Андрей Нарцев Опыт внедрения ML-прогнозов в систему динамического ценообразования Яндекс Доставки / Андрей Нарцев

Андрей Нарцев, руководитель Applied ML в Яндекс Доставке, прочитал этот доклад на конференции AHA!25. Андрей разобрал ключевые аналитические, продуктовые и ML-вызовы, с которыми ребята столкнулись при внедрении динамического ценообразования для замедленных тарифов. А ещё поделился инсайтами и полезными практиками, которые помогли сделать ценообразование ещё более адаптивным. После просмотра этого доклада вы будете лучше понимать сложности внедрения ML-моделей в продукт, научитесь заранее предотвращать потенциальные проблемы и применять представленные подходы в своей работе. Больше классных материалов ищите в телеграм-канале Yandex for ML: https://t.me/yandexforml #ML #ценообразование #Яндек…

2 weeks, 5 days назад @ youtube.com
Time Capsule от Data Fest: как технологии изменят всё
Time Capsule от Data Fest: как технологии изменят всё Time Capsule от Data Fest: как технологии изменят всё

В этом году конференция Data Fest отмечала юбилей — 10 лет. За это время машинное обучение проникло во многие сферы нашей жизни, в том числе в рабочие процессы. Нам стало интересно: как развитие ML изменит нашу привычную реальность ещё через 10 лет? Мы спросили об этом у гостей конференции, и вот что из этого вышло! #DataFest #машинноеобучение #ML #искусственныйинтеллект #будущеетехнологий #нейросети #цифровоебудущее #технологии #Яндекс #AI

3 weeks, 5 days назад @ youtube.com
Fast and Approximate Responses / Артур Соловьёв
Fast and Approximate Responses / Артур Соловьёв Fast and Approximate Responses / Артур Соловьёв

Это выступление на Data Fest в гостях у Яндекса в секции OptimalDL. Артур Соловьёв прочитал доклад на английском языке на тему: Fast and Approximate Responses. Узнать больше о мероприятиях для разработчиков можно тут: https://events.yandex.ru Подписывайтесь на телеграм-канал Яндекса для ML-сообщества: https://t.me/yandexforml #DataFest #OptimalDL #FastResponses #ApproximateComputing #MLOptimization #DeepLearning #AIperformance #инференс #оптимизация #нейросети #modelcompression #AI #машиннообучение #inferencespeed #lowlatencyAI #АртурСоловьёв #DLresearch #DataFest2025

1 month, 1 week назад @ youtube.com
Как я создал ИИ-переводчик для славянской интерлингвы (межславянского языка) / Салават Гарифуллин
Как я создал ИИ-переводчик для славянской интерлингвы (межславянского языка) / Салават Гарифуллин Как я создал ИИ-переводчик для славянской интерлингвы (межславянского языка) / Салават Гарифуллин

Это доклад на Data Fest в гостях у Яндекса. В секции NLP Салават Гарифуллин рассказал, как создал первый ИИ-переводчик для славянской интерлингвы — искусственного межславянского языка. Узнать больше о мероприятиях для разработчиков можно тут: https://events.yandex.ru Подписывайтесь на телеграм-канал Яндекса для ML-сообщества: https://t.me/yandexforml #DataFest #NLP #ИИпереводчик #межславянскийязык #интерлингва #машинныйперевод #языки #AI #linguistics #нейросети #naturalanguageprocessing #перевод #lowresource #slaviclanguages #SalavatGarifullin #искусственныйинтеллект #languageAI #переводчик #DataFest2025

1 month, 1 week назад @ youtube.com
Как мы делали умного помощника в Лавке на основе YaGPT / Алёна Зайцева
Как мы делали умного помощника в Лавке на основе YaGPT / Алёна Зайцева Как мы делали умного помощника в Лавке на основе YaGPT / Алёна Зайцева

Это доклад на Data Fest в гостях у Яндекса. В секции Advanced LLMs Алёна Зайцева рассказала, как ребята делали умного помощника в Лавке на основе YaGPT. Узнать больше о мероприятиях для разработчиков можно тут: https://events.yandex.ru Подписывайтесь на телеграм-канал Яндекса для ML-сообщества: https://t.me/yandexforml #DataFest #AdvancedLLMs #ЯндексЛавка #YaGPT #LLM #умныйпомощник #AI #искусственныйинтеллект #генеративныйИИ #голосовойпомощник #NLP #персонализация #Lavka #AIassistant #frontendAI #backendAI #biglanguagemodels #AlenaZaytseva #DataFest2025

1 month, 1 week назад @ youtube.com
From Tokens to Thinking: How Reinforcement Learning Fuels Reasoning in LLMs / Миле Митрович
From Tokens to Thinking: How Reinforcement Learning Fuels Reasoning in LLMs / Миле Митрович From Tokens to Thinking: How Reinforcement Learning Fuels Reasoning in LLMs / Миле Митрович

Это выступление на Data Fest в гостях у Яндекса в секции Advanced LLMs. Миле Митрович прочитал доклад на английском языке на тему: From Tokens to Thinking: How Reinforcement Learning Fuels Reasoning in LLMs. Узнать больше о мероприятиях для разработчиков можно тут: https://events.yandex.ru Подписывайтесь на телеграм-канал Яндекса для ML-сообщества: https://t.me/yandexforml #DataFest #AdvancedLLMs #MilaMitrovic #LLM #reinforcementlearning #reasoning #машинноемышление #RLHF #AI #большиеязыковыемодели #искусственныйинтеллект #нейросети #обучениесподкреплением #AIthinking #futureofAI #generativeAI #LLMarchitecture

1 month, 1 week назад @ youtube.com
Обучение LLM в низкой точности вычислений / Андрей Панфёров
Обучение LLM в низкой точности вычислений / Андрей Панфёров Обучение LLM в низкой точности вычислений / Андрей Панфёров

Это доклад на Data Fest в гостях у Яндекса. В секции DL Frontiers Андрей Панфёров рассказал про обучение LLM в низкой точности вычислений. Узнать больше о мероприятиях для разработчиков можно тут: https://events.yandex.ru Подписывайтесь на телеграм-канал Яндекса для ML-сообщества: https://t.me/yandexforml #DataFest #DLFrontiers #LLM #LowPrecision #АндрейПанфёров #машинноеобучение #глубокообучение #искусственныйинтеллект #AI #нейросети #оптимизациямоделей #ML #большиемодели #обучениеLLM #AIтехнологии #MLинфраструктура #ресурсоэффективность

1 month, 1 week назад @ youtube.com
Библиотека для создания рекомендательных систем RePlay / Алексей Васильев
Библиотека для создания рекомендательных систем RePlay / Алексей Васильев Библиотека для создания рекомендательных систем RePlay / Алексей Васильев

Это доклад на Data Fest в гостях у Яндекса. В секции Open Source Алексей Васильев рассказал о RePlay, библиотеке для создания рекомендательных систем. Узнать больше о мероприятиях для разработчиков можно тут: https://events.yandex.ru Подписывайтесь на телеграм-канал Яндекса для ML-сообщества: https://t.me/yandexforml #DataFest #OpenSource #RePlay #RecSys #MachineLearning #ML #рекомендации #алгоритмы #библиотека #персонализация #рекомендательныесистемы #opensource #recommendationengine #AI #АлексейВасильев #DataFest2025 #MLtools

1 month, 1 week назад @ youtube.com
Хемоинформатика — это область на стыке химии и информационных технологий / Ксения Никитина
Хемоинформатика — это область на стыке химии и информационных технологий / Ксения Никитина Хемоинформатика — это область на стыке химии и информационных технологий / Ксения Никитина

Это доклад на Data Fest в гостях у Яндекса. В секции ML in Chemistry Ксения Никитина рассказала о хемоинформатике, научной области на стыке химии и информационных технологий. Узнать больше о мероприятиях для разработчиков можно тут: https://events.yandex.ru Подписывайтесь на телеграм-канал Яндекса для ML-сообщества: https://t.me/yandexforml #DataFest #MLinChemistry #хемоинформатика #машинноеобучение #искусственныйинтеллект #AI #наука #chemoinformatics #КсенияНикитина #химия #цифроваяхимия #AIвнауке #нейросетивхимии #интердисциплинарность #наукиожизни

1 month, 1 week назад @ youtube.com
ML Trainings ML Trainings
последний пост 1 week назад
Капитанский мостик №1: OpenAI против стартапа | ИИ заменит всех | разметчики за решеткой
Капитанский мостик №1: OpenAI против стартапа | ИИ заменит всех | разметчики за решеткой Капитанский мостик №1: OpenAI против стартапа | ИИ заменит всех | разметчики за решеткой

Первый экспериментальный выпуск подкаста Капитанский мостик, обсуждение новостей из мира ИИ за прошедшую неделю и не только. Оставляйте новости к следующему выпуску в нашем канале в Mattermost. 00:00:00 вступление

00:00:44 суд OpenAI против стартапа бывшего дизайнера из Apple

00:31:41 Яндекс сделал бесплатной YandexGPT

00:45:35 X (Twitter) заменил Google Tranlate на Grok

00:56:22 Илья Суцкевер сказал, что ИИ всех заменит

01:14:20 Claude управляет магазином 01:22:29 Сбер привлекает заключенных к разметке Канал в mattermost: https://mm.ods.ai/ods/channels/data_captain (авторизуйтесь через ODS.ai) Источники:

1. https://techcrunch.com/2025/06/23/court-filings-reveal-openai-and-ios-early-work-on…

1 week назад @ youtube.com
Анастасия Функнер, Ольга Павлова, Анна Ефимова | Как Ozon Банк создает ML-платформу будущего
Анастасия Функнер, Ольга Павлова, Анна Ефимова | Как Ozon Банк создает ML-платформу будущего Анастасия Функнер, Ольга Павлова, Анна Ефимова | Как Ozon Банк создает ML-платформу будущего

Спикеры: Анастасия Функнер, Ольга Павлова, Анна Ефимова, Ozon Банк

Тема доклада: MLOps, Data Science и Golang: Как Ozon Банк создает ML-платформу будущего

Мероприятие Wids-meetup-2025: https://ods.ai/events/wids-meetup-2025 Наши соц.сети:Telegram: https://t.me/datafestВконтакте: https://vk.com/datafestКанал с вакансиями в telegram: https://t.me/odsjobsКанал с апдейтами по курсам: https://t.me/odscoursesКак попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

3 months, 1 week назад @ youtube.com
Алена Феногенова | Новые бенчмарки 2024-2025 для русского языка: вызовы и перспективы
Алена Феногенова | Новые бенчмарки 2024-2025 для русского языка: вызовы и перспективы Алена Феногенова | Новые бенчмарки 2024-2025 для русского языка: вызовы и перспективы

Спикер: Алена Феногенова, AGI NLP TeamLead, Сбер

Мероприятие Wids-meetup-2025: https://ods.ai/events/wids-meetup-2025 Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

3 months, 1 week назад @ youtube.com
Мария Бегичева | Цифровой помощник компании в управлении операционными рисками
Мария Бегичева | Цифровой помощник компании в управлении операционными рисками Мария Бегичева | Цифровой помощник компании в управлении операционными рисками

Спикер: Мария Бегичева, Senior DS, Сбер

Мероприятие Wids-meetup-2025: https://ods.ai/events/wids-meetup-2025 Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

3 months, 1 week назад @ youtube.com
Полина Федотова | Foundation Models in Robotics
Полина Федотова |  Foundation Models in Robotics Полина Федотова | Foundation Models in Robotics

Спикер: Полина Федотова, Главный инженер-разработчик, лид исследовательской команды, «Сбер Центр робототехники»

Мероприятие Wids-meetup-2025: https://ods.ai/events/wids-meetup-2025 Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

3 months, 1 week назад @ youtube.com
Нонна Шахова, Эмели Драль,Ирина Голощапова, Анастасия Никулина | Круглый стол: Women in Data Science
Нонна Шахова, Эмели Драль,Ирина Голощапова, Анастасия Никулина | Круглый стол: Women in Data Science Нонна Шахова, Эмели Драль,Ирина Голощапова, Анастасия Никулина | Круглый стол: Women in Data Science

Спикеры: Нонна Шахова, Эмели Драль, Ирина Голощапова, Анастасия Никулина

Мероприятие Wids-meetup-2025: https://ods.ai/events/wids-meetup-2025

Как женщины меняют data science:

Старт в data science:

Работа и карьера:

Профессиональное развитие:

Будущее в data science: Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

3 months, 1 week назад @ youtube.com
Анна Текучева | "Знаешь что это за слово? а оно есть"
Анна Текучева | "Знаешь что это за слово? а оно есть" Анна Текучева | "Знаешь что это за слово? а оно есть"

Спикер: Анна Текучева, Data Scientist HML Wildberries

Мероприятие Wids-meetup-2025: https://ods.ai/events/wids-meetup-2025 Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

3 months, 1 week назад @ youtube.com
Юлия Раковская | Вступительное слово WiDS Meetup
Юлия Раковская | Вступительное слово WiDS Meetup Юлия Раковская | Вступительное слово WiDS Meetup

Спикер: Юлия Раковская, Руководитель центра исследований и разработки Сбера

Мероприятие Wids-meetup-2025: https://ods.ai/events/wids-meetup-2025 Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

3 months, 1 week назад @ youtube.com
Митап #1 | Data Fusion Contest 2025
Митап #1 | Data Fusion Contest 2025 Митап #1 | Data Fusion Contest 2025

Ежегодная серия соревнований по машинному обучению Data Fusion Contest стартовала. Страница с задачами: https://ods.ai/tracks/data-fusion-2025-competitions Во вторник 25 февраля (19:00 - 20:00 по мск) мы провели первый митап по Data Fusion Contest 2025 в ODS спейсе Spatial.Chat. Поговорили о задачах, ответим на ваши вопросы участников. В программе: Анатолий Глушенко Разбор задачи 1 “LabelCraft”

Алексей Натёкин Обзор задачи 2 “4Cast” и задачи 3 “Distribution” Команда Data Fusion Contest 2025 Q&A и обсуждение вопросов с участниками

4 months, 2 weeks назад @ youtube.com
Анастасия Вепрева | Мастер-класс: Генерация лекарственных молекул
Анастасия Вепрева | Мастер-класс: Генерация лекарственных молекул Анастасия Вепрева | Мастер-класс: Генерация лекарственных молекул

Спикер: Анастасия Вепрева - разработчик моделей в области прогнозирования физико-химических свойств и биологических активностей малых молекул, сотрудница Центра ИИ в химии ИТМО

Мероприятие 21.02.2025: https://ods.ai/events/ai_chemistrymk1 Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

4 months, 2 weeks назад @ youtube.com
Антон Воронов | Итоги года в DS/ML карьерных вопросах
Антон Воронов | Итоги года в DS/ML карьерных вопросах Антон Воронов | Итоги года в DS/ML карьерных вопросах

Спикер: Антон Воронов, Газпром ИД, Заместитель директора департамента, Руководитель платформы Поиска Data Ёлка 2024 в гостях у VK: https://ods.ai/events/data-elka-24-vk-offline

Data Ёлка 2024: https://ods.ai/events/data-elka-2024

_____

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

4 months, 2 weeks назад @ youtube.com
Пётр Ермаков | Итоги года в PyData stack
Пётр Ермаков | Итоги года в PyData stack Пётр Ермаков | Итоги года в PyData stack

Спикер: Пётр Ермаков, ML Brand Director, Яндекс Data Ёлка 2024 в гостях у VK: https://ods.ai/events/data-elka-24-vk-offline

Data Ёлка 2024: https://ods.ai/events/data-elka-2024

_____

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

4 months, 2 weeks назад @ youtube.com
Валентин Малых | Итоги года в NLP
Валентин Малых | Итоги года в NLP Валентин Малых | Итоги года в NLP

Спикер: Валентин Малых, руководитель группы, MTS AI Data Ёлка 2024 в гостях у VK: https://ods.ai/events/data-elka-24-vk-offline

Data Ёлка 2024: https://ods.ai/events/data-elka-2024

_____

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

4 months, 2 weeks назад @ youtube.com
Николай Анохин | Итоги года в RecSys
Николай Анохин | Итоги года в RecSys Николай Анохин | Итоги года в RecSys

Спикер: Николай Анохин, ведущий специалист по ML, AI VK Data Ёлка 2024 в гостях у VK: https://ods.ai/events/data-elka-24-vk-offline

Data Ёлка 2024: https://ods.ai/events/data-elka-2024

_____

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

4 months, 2 weeks назад @ youtube.com
Ирина Голощапова | Итоги года в Reliable ML, часть 2
Ирина Голощапова | Итоги года в Reliable ML, часть 2 Ирина Голощапова | Итоги года в Reliable ML, часть 2

Спикер: Ирина Голощапова, CDO Raiffeisenbank Operations Data Ёлка 2024 в гостях у VK: https://ods.ai/events/data-elka-24-vk-offline

Data Ёлка 2024: https://ods.ai/events/data-elka-2024

_____

Наши соц.сети:

Telegram: https://t.me/datafest

Вконтакте: https://vk.com/datafest

Канал с вакансиями в telegram: https://t.me/odsjobs

Канал с апдейтами по курсам: https://t.me/odscourses

Как попасть в чат сообщества ODS Mattermost: https://ods.ai/tracks/mattermost

4 months, 2 weeks назад @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast Lex Fridman AI Podcast
последний пост 15 часов назад
#474 – DHH: Future of Programming, AI, Ruby on Rails, Productivity & Parenting
#474 – DHH: Future of Programming, AI, Ruby on Rails, Productivity & Parenting #474 – DHH: Future of Programming, AI, Ruby on Rails, Productivity & Parenting

David Heinemeier Hansson (aka DHH) is a legendary programmer, creator of Ruby on Rails, co-owner & CTO of 37signals that created Basecamp, HEY, & ONCE, and is a NYT-best-selling author (with Jason Fried) of 4 books: REWORK, REMOTE, Getting Real, and It Doesn’t Have To Be Crazy At Work.

He is also a race car driver, including a class-winning performance at the 24 hour Le Mans race.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep474-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://upliftdesk.com/lexLindy: No-code AI agent builder.

Go to https://go.lindy.ai/lexLMNT: Zero-sugar electrolyte drink …

15 часов назад @ lexfridman.com
#473 – Iran War Debate: Nuclear Weapons, Trump, Peace, Power & the Middle East
#473 – Iran War Debate: Nuclear Weapons, Trump, Peace, Power & the Middle East #473 – Iran War Debate: Nuclear Weapons, Trump, Peace, Power & the Middle East

Debate on Iran war between Scott Horton and Mark Dubowitz.

Scott Horton is the author and director of the Libertarian Institute, editorial director of Antiwar.com, host of The Scott Horton Show, and for the past three decades, a staunch critic of U.S. foreign policy and military interventionism.

Mark Dubowitz is the chief executive of the Foundation for Defense of Democracies, host of the Iran Breakdown podcast, and a leading expert on Iran and its nuclear program for over 20 years.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep473-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://drinkLMNT.c…

2 weeks, 2 days назад @ lexfridman.com
#472 – Terence Tao: Hardest Problems in Mathematics, Physics & the Future of AI
#472 – Terence Tao: Hardest Problems in Mathematics, Physics & the Future of AI #472 – Terence Tao: Hardest Problems in Mathematics, Physics & the Future of AI

Terence Tao is widely considered to be one of the greatest mathematicians in history.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep472-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lexNetSuite: Business management software.

Go to http://netsuite.com/lexLMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lexAG1: All-in-one daily nutrition drink.

4 weeks назад @ lexfridman.com
#471 – Sundar Pichai: CEO of Google and Alphabet
#471 – Sundar Pichai: CEO of Google and Alphabet #471 – Sundar Pichai: CEO of Google and Alphabet

Sundar Pichai is CEO of Google and Alphabet.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep471-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc. Transcript:

https://lexfridman.com/sundar-pichai-transcript CONTACT LEX:

Feedback - give feedback to Lex: https://lexfridman.com/survey

AMA - submit questions, videos or call-in: https://lexfridman.com/ama

Hiring - join our team: https://lexfridman.com/hiring

Other - other ways to get in touch: https://lexfridman.com/contact EPISODE LINKS:

Sundar's X: https://x.com/sundarpichai

Sundar's Instagram: https://instagram.com/sundarpichai

Sundar's Blog: https://blog.goo…

1 month, 1 week назад @ lexfridman.com
#470 – James Holland: World War II, Hitler, Churchill, Stalin & Biggest Battles
#470 – James Holland: World War II, Hitler, Churchill, Stalin & Biggest Battles #470 – James Holland: World War II, Hitler, Churchill, Stalin & Biggest Battles

James Holland is a historian specializing in World War II.

He hosts a podcast called WW2 Pod: We Have Ways of Making You Talk.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep470-scSee below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lexLMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkag1.com/lexNotion: Note-taking and team collaboration.

1 month, 2 weeks назад @ lexfridman.com
#469 – Oliver Anthony: Country Music, Blue-Collar America, Fame, Money, and Pain
#469 – Oliver Anthony: Country Music, Blue-Collar America, Fame, Money, and Pain #469 – Oliver Anthony: Country Music, Blue-Collar America, Fame, Money, and Pain

Oliver Anthony is singer-songwriter who first gained worldwide fame with his viral hit Rich Men North of Richmond.

He became a voice for many who are voiceless, with many of his songs speaking to the struggle of the working class in modern American life.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep469-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://oracle.com/lexTax Network USA: Full-service tax firm.

Go to https://drinkLMNT.com/lexOUTLINE:(00:00) – Introduction(09:00) – Open mics(13:03) – Mainstream country music(22:10) – Fame(28:06) – Music vs politics(36:56) – Rich Men North of Richmon…

1 month, 3 weeks назад @ lexfridman.com
#468 – Janna Levin: Black Holes, Wormholes, Aliens, Paradoxes & Extra Dimensions
#468 – Janna Levin: Black Holes, Wormholes, Aliens, Paradoxes & Extra Dimensions #468 – Janna Levin: Black Holes, Wormholes, Aliens, Paradoxes & Extra Dimensions

Janna Levin is a theoretical physicist and cosmologist specializing in black holes, cosmology of extra dimensions, topology of the universe, and gravitational waves.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep468-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://brain.fm/lexBetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lexNetSuite: Business management software.

Go to https://shopify.com/lexAG1: All-in-one daily nutrition drink.

2 months, 1 week назад @ lexfridman.com
#467 – Tim Sweeney: Fortnite, Unreal Engine, and the Future of Gaming
#467 – Tim Sweeney: Fortnite, Unreal Engine, and the Future of Gaming #467 – Tim Sweeney: Fortnite, Unreal Engine, and the Future of Gaming

Tim Sweeney is a legendary video game programmer, founder and CEO of Epic Games that created the Unreal Engine, Fortnite, Gears of War, Unreal Tournament, and many other groundbreaking and influential video games.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep467-scSee below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://notion.com/lexMasterClass: Online classes from world-class experts.

Go to https://shopify.com/lexAG1: All-in-one daily nutrition drink.

Go to https://drinkag1.com/lexLMNT: Zero-sugar electrolyte drink mix.

2 months, 1 week назад @ lexfridman.com
#466 – Jeffrey Wasserstrom: China, Xi Jinping, Trade War, Taiwan, Hong Kong, Mao
#466 – Jeffrey Wasserstrom: China, Xi Jinping, Trade War, Taiwan, Hong Kong, Mao #466 – Jeffrey Wasserstrom: China, Xi Jinping, Trade War, Taiwan, Hong Kong, Mao

Jeffrey Wasserstrom is a historian of modern China.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep466-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://oracle.com/lexTax Network USA: Full-service tax firm.

Go to https://shopify.com/lexLMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lexAG1: All-in-one daily nutrition drink.

2 months, 2 weeks назад @ lexfridman.com
#465 – Robert Rodriguez: Sin City, Desperado, El Mariachi, Alita, and Filmmaking
#465 – Robert Rodriguez: Sin City, Desperado, El Mariachi, Alita, and Filmmaking #465 – Robert Rodriguez: Sin City, Desperado, El Mariachi, Alita, and Filmmaking

Robert Rodriguez is a legendary filmmaker and creator of Sin City, El Mariachi, Desperado, Spy Kids, Machete, From Dusk Till Dawn, Alita: Battle Angel, The Faculty, and his newest venture Brass Knuckle Films.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep465-sc

See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc. Transcript:

https://lexfridman.com/robert-rodriguez-transcript CONTACT LEX:

Feedback - give feedback to Lex: https://lexfridman.com/survey

AMA - submit questions, videos or call-in: https://lexfridman.com/ama

Hiring - join our team: https://lexfridman.com/hiring

Other - other ways to get in touch: http…

2 months, 3 weeks назад @ lexfridman.com
#464 – Dave Smith: Israel, Ukraine, Epstein, Mossad, Conspiracies & Antisemitism
#464 – Dave Smith: Israel, Ukraine, Epstein, Mossad, Conspiracies & Antisemitism #464 – Dave Smith: Israel, Ukraine, Epstein, Mossad, Conspiracies & Antisemitism

Dave Smith is a comedian, libertarian, political commentator, and the host of Part of the Problem podcast.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep464-sc

See below for timestamps, and to give feedback, submit questions, contact Lex, etc. CONTACT LEX:

Feedback - give feedback to Lex: https://lexfridman.com/survey

AMA - submit questions, videos or call-in: https://lexfridman.com/ama

Hiring - join our team: https://lexfridman.com/hiring

Other - other ways to get in touch: https://lexfridman.com/contact EPISODE LINKS:

Dave's X: https://x.com/ComicDaveSmith

Dave's YouTube: https://youtube.com/DSmithcomic

Dave's Instagram: https://instagram.com/theprobl…

3 months назад @ lexfridman.com
#463 – Douglas Murray: Putin, Zelenskyy, Trump, Israel, Netanyahu, Hamas & Gaza
#463 – Douglas Murray: Putin, Zelenskyy, Trump, Israel, Netanyahu, Hamas & Gaza #463 – Douglas Murray: Putin, Zelenskyy, Trump, Israel, Netanyahu, Hamas & Gaza

Douglas Murray is the author of On Democracies and Death Cults, The War on The West, and The Madness of Crowds.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep463-scSee below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://callofduty.com/warzoneOracle: Cloud infrastructure.

Go to https://oracle.com/lexLMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lexAG1: All-in-one daily nutrition drink.

3 months, 2 weeks назад @ lexfridman.com
#462 – Ezra Klein and Derek Thompson: Politics, Trump, AOC, Elon & DOGE
#462 – Ezra Klein and Derek Thompson: Politics, Trump, AOC, Elon & DOGE #462 – Ezra Klein and Derek Thompson: Politics, Trump, AOC, Elon & DOGE

Ezra Klein is one of the most influential voices representing the left-wing of American politics.

He is a columnist for the NY Times and host of The Ezra Klein Show.

Derek Thompson is a writer at The Atlantic and host of the Plain English podcast.

Together they have written a new book titled Abundance that lays out a set of ideas for the future of the Democratic party.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep462-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

3 months, 2 weeks назад @ lexfridman.com
#461 – ThePrimeagen: Programming, AI, ADHD, Productivity, Addiction, and God
#461 – ThePrimeagen: Programming, AI, ADHD, Productivity, Addiction, and God #461 – ThePrimeagen: Programming, AI, ADHD, Productivity, Addiction, and God

ThePrimeagen (aka Michael Paulson) is a programmer who has educated, entertained, and inspired millions of people to build software and have fun doing it.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep461-scSee below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lexNetSuite: Business management software.

Go to http://netsuite.com/lexBetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lexAG1: All-in-one daily nutrition drinks.

3 months, 3 weeks назад @ lexfridman.com
#460 – Narendra Modi: Prime Minister of India – Power, Democracy, War & Peace
#460 – Narendra Modi: Prime Minister of India – Power, Democracy, War & Peace #460 – Narendra Modi: Prime Minister of India – Power, Democracy, War & Peace

Narendra Modi is the Prime Minister of India.

On YouTube this episode is available in English, Hindi, Russian (and soon other languages).

Captions and voice-over audio tracks are provided (for the main episode video on YouTube) in English, Hindi, Russian, and the original mixed-language version, with subtitles available in your preferred language.

To listen to the original mixed-language version, please select the Hindi (Latin) audio track.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep460-scSee below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

3 months, 4 weeks назад @ lexfridman.com
Microsoft Research Podcast Microsoft Research Podcast
последний пост 2 days, 17 hours назад
How AI will accelerate biomedical research and discovery
How AI will accelerate biomedical research and discovery How AI will accelerate biomedical research and discovery

Dr. Eric Topol is the executive vice president of the biomedical research non-profit Scripps Research, where he founded and now directs the Scripps Research Translational Institute.

Let’s continue our deep dive on AI and biomedical research with this conversation with Noubar Afeyan:LEE: Noubar, thanks so much for joining.

And there’s the origin story of contact with AI, you know, before the emergence of generative AI and afterwards.

What is going on today with respect to AI really being used for something meaningful in the design and development of drugs?

TOPOL: You would read about how, you know, data is the new oil and, you know, gold and whatnot.

2 days, 17 hours назад @ microsoft.com
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices
AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices AI Testing and Evaluation: Learnings from pharmaceuticals and medical devices

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

During the pre-market phase, medical testing establishes baseline safety and effectiveness metrics through bench testing, performance standards, and clinical studies.

SULLIVAN: So medical devices face a pretty prescriptive multi-level testing path before they hit the market.

We are looking into medical devices, as well, obviously, but also other technologies in advanced medical computing.

So we see Phase 3 trials as something that occurs in the medical devices and pharmaceuticals field.

5 days, 17 hours назад @ microsoft.com
AI Testing and Evaluation: Learnings from genome editing
AI Testing and Evaluation: Learnings from genome editing AI Testing and Evaluation: Learnings from genome editing

As generative AI continues to advance, Microsoft has gathered a range of experts—from genome editing to cybersecurity—to share how their fields approach evaluation and risk assessment.

CHARO: Well, you know, genome editing is both very old and very new.

Now the earliest forms of genome editing were very inefficient, and so we didn’t worry that much.

But the bottom-line thing to remember, the way to really think about it is, we don’t regulate genome editing; we regulate the things that use genome editing.

And she said, you know, we don’t regulate genome editing; we regulate the things that use genome editing.

1 week, 5 days назад @ microsoft.com
AI Testing and Evaluation: Learnings from Science and Industry
AI Testing and Evaluation: Learnings from Science and Industry AI Testing and Evaluation: Learnings from Science and Industry

Our goal is to learn from their successes and their stumbles to move the science and practice of AI testing forward.

And I think, really, there are two reasons why tech is so, kind of, representative of that kind of challenge that I’ve always found fascinating.

Continues to be a really important topic in the AI policy conversation right now, I think, for really good reason.

Testing is an important component for governance and AI and, of course, in all of these other domains, as well.

I think about almost, like, in the near to mid-term, like three issues that we need to address in the AI, kind of, policy and testing context.

2 weeks, 5 days назад @ microsoft.com
The AI Revolution in Medicine, Revisited: How AI is reshaping the future of healthcare and medical research
The AI Revolution in Medicine, Revisited: How AI is reshaping the future of healthcare and medical research The AI Revolution in Medicine, Revisited: How AI is reshaping the future of healthcare and medical research

LEE: Yeah, yeah.

It cannot—as, you know, Bill was saying—it cannot learn from your document.

And I don’t know if the two of you remember, but I ended up doing a lot of tests.

I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini (opens in new tab).

Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.

1 month назад @ microsoft.com
What AI's impact on individuals means for the health workforce and industry
What AI's impact on individuals means for the health workforce and industry What AI's impact on individuals means for the health workforce and industry

So I don’t think we should be surprised that business schools matter on this because we care about management.

That’s really going to change the way, like, middle school works, was my thinking at the time.

We’ve gone from AI being highly discriminative to AI that’s able to explore the world in particular ways.

The symptoms that they’re showing are quite different, and also their compliance is really, really different.

LEE: Yeah, really, really interesting.

1 month, 2 weeks назад @ microsoft.com
Abstracts: Zero-shot models in single-cell biology with Alex Lu
Abstracts: Zero-shot models in single-cell biology with Alex Lu Abstracts: Zero-shot models in single-cell biology with Alex Lu

And single-cell foundation models claim to be capable of unraveling deeper insights than ever before.

Basically, we showed that single-cell foundation models perform worse in settings that are fundamental to biological discovery than much simpler machine learning and statistical methods that were used in the field before single-cell foundation models emerged and are the go-to standard for unpacking meaning from these complicated experiments.

And the way to understand this is because single-cell foundation models are trained in a way that tries to expose these models to millions of single-cells.

But let’s also talk about the impact for methodologists, people who are trying to improve these s…

1 month, 3 weeks назад @ microsoft.com
Abstracts: Aurora with Megan Stanley and Wessel Bruinsma
Abstracts: Aurora with Megan Stanley and Wessel Bruinsma Abstracts: Aurora with Megan Stanley and Wessel Bruinsma

This is such exciting work about environmental forecasting, so we’re happy to have the two of you join us today.

Mostly because AI weather forecasting models are computationally much more efficient and can even be more accurate.

What’s unfortunate though, about this big step forward, is that these developments are mostly limited to the setting of weather forecasting.

Weather forecasting is very important, obviously, but there are many other important environmental forecasting problems out there, such as air pollution forecasting or ocean wave forecasting.

STANLEY: Current approaches have really focused training very specifically on weather forecasting models.

1 month, 3 weeks назад @ microsoft.com
Collaborators: Healthcare Innovation to Impact
Collaborators: Healthcare Innovation to Impact Collaborators: Healthcare Innovation to Impact

LUNGREN: And now it really feels like this collaborative effort, you know, really can help start to extend that mission.

I think, you know, Will and Smitha, that we definitely feel the passion and the innovation.

Again, you know, in text, you refer to that earlier and certainly off the shelf, there’s really powerful applications.

LUNGREN: So, I think AI has always been thought of as a savior kind of technology.

And I guess for my part, I think really what we’re going to see is a massive unleash of creativity.

1 month, 3 weeks назад @ microsoft.com
Coauthor roundtable: Reflecting on real world of doctors, developers, patients, and policymakers
Coauthor roundtable: Reflecting on real world of doctors, developers, patients, and policymakers Coauthor roundtable: Reflecting on real world of doctors, developers, patients, and policymakers

LEE: Yeah, yeah.

LEE: Yeah, yeah.

LEE: Yeah, yeah.

[LAUGHS]GOLDBERG: Right, right, right, yeah.

Yeah, yeah.

1 month, 4 weeks назад @ microsoft.com
Abstracts: Heat Transfer and Deep Learning with Hongxia Hao and Bing Lv
Abstracts: Heat Transfer and Deep Learning with Hongxia Hao and Bing Lv Abstracts: Heat Transfer and Deep Learning with Hongxia Hao and Bing Lv

Today I’m talking to two researchers, Hongxia Hao, a senior researcher at Microsoft Research AI for Science, and Bing Lv, an associate professor in physics at the University of Texas at Dallas.

Hongxia and Bing are co-authors of a paper called Probing the Limit of Heat Transfer in Inorganic Crystals with Deep Learning .

LV: So I think one of the biggest things as Hongxia said, right?

We have a lot of new materials, exotic materials, which some of them, Hongxia can elaborate a little bit more.

HAO: Yeah, yeah.

2 months назад @ microsoft.com
Abstracts: Societal AI with Xing Xie
Abstracts: Societal AI with Xing Xie Abstracts: Societal AI with Xing Xie

I’m here today with Xing Xie, a partner research manager at Microsoft Research and co-author of a white paper called Societal AI: Research Challenges and Opportunities .

HUIZINGA: So let’s start with a brief overview of the background for this white paper on Societal AI.

XIE: The idea for this white paper emerged in response to the shift we are witnessing in the AI landscape.

XIE: Rather than follow a traditional research methodology, we built this white paper around ten fundamental, foundational research questions.

HUIZINGA: Yeah, yeah, yeah.

2 months, 1 week назад @ microsoft.com
The AI Revolution in Medicine, Revisited: Laws, norms, and ethics for AI in health
The AI Revolution in Medicine, Revisited: Laws, norms, and ethics for AI in health The AI Revolution in Medicine, Revisited: Laws, norms, and ethics for AI in health

Roxana is among the world’s thought leaders in AI, healthcare, and medicine, thanks in part to groundbreaking work on AI biases and trustworthiness.

When we think about AI, we think about it being very futuristic, but it’s trained on data from the past.

And so I think your struggles and frustrations on, you know, how to expand that nationwide, I think, are really, really informative.

And in a way, I think … I don’t know that I or my coauthors were satisfied with that.

You know, AI has all the time in the world [LAUGHS] to write you a text.

2 months, 1 week назад @ microsoft.com
The AI Revolution in Medicine, Revisited: Empowering patients and healthcare consumers in the age of generative AI
The AI Revolution in Medicine, Revisited: Empowering patients and healthcare consumers in the age of generative AI The AI Revolution in Medicine, Revisited: Empowering patients and healthcare consumers in the age of generative AI

[LAUGHS]DEBRONKART: Ah, well, that’s … that’s even weirder.

I’m, as I said at the beginning, I’m glad to be alive and I’m really, really, really grateful to be given a chance to share my thoughts with your audience because I really like super smart nerds.

And it’s really to me like where I’m seeing kind of the first set of really kind of promising AI applications.

And so, to me, that’s really kind of where I see the most interesting opportunities for technology and for digital health.

Just really, really appreciate it.

2 months, 3 weeks назад @ microsoft.com
The AI Revolution in Medicine, Revisited: Real-world healthcare AI development and deployment—at scale
The AI Revolution in Medicine, Revisited: Real-world healthcare AI development and deployment—at scale The AI Revolution in Medicine, Revisited: Real-world healthcare AI development and deployment—at scale

We are sorry, the page you requested cannot be found.

The page you are looking for could not be found or is no longer available.

3 months, 1 week назад @ microsoft.com
NLP Highlights NLP Highlights
последний пост None
Data Skeptic
последний пост 6 days, 9 hours назад
The Network Diversion Problem
The Network Diversion Problem The Network Diversion Problem

In this episode, Professor Pål Grønås Drange from the University of Bergen, introduces the field of Parameterized Complexity - a powerful framework for tackling hard computational problems by focusing on specific structural aspects of the input. This framework allows researchers to solve NP-complete problems more efficiently when certain parameters, like the structure of the graph, are "well-behaved". At the center of the discussion is the network diversion problem, where the goal isn’t to block all routes between two points in a network, but to force flow - such as traffic, electricity, or data - through a specific path. While this problem appears deceptively similar to the classic "Min.Cu…

6 days, 9 hours назад @ dataskeptic.com
Complex Dynamic in Networks
Complex Dynamic in Networks Complex Dynamic in Networks

In this episode, we learn why simply analyzing the structure of a network is not enough, and how the dynamics - the actual mechanisms of interaction between components - can drastically change how information or influence spreads. Our guest, Professor Baruch Barzel of Bar-Ilan University, is a leading researcher in network dynamics and complex systems ranging from biology to infrastructure and beyond. BarzelLab BarzelLab on Youtube Paper in focus: Universality in network dynamics, 2013

2 weeks назад @ dataskeptic.com
Github Network Analysis
Github Network Analysis Github Network Analysis 3 weeks назад @ dataskeptic.com
Networks and Complexity
Networks and Complexity Networks and Complexity

In this episode, Kyle does an overview of the intersection of graph theory and computational complexity theory. In complexity theory, we are about the runtime of an algorithm based on its input size. For many graph problems, the interesting questions we want to ask take longer and longer to answer! This episode provides the fundamental vocabulary and signposts along the path of exploring the intersection of graph theory and computational complexity theory.

4 weeks, 1 day назад @ dataskeptic.com
Actantial Networks
Actantial Networks Actantial Networks

In this episode, listeners will learn about Actantial Networks—graph-based representations of narratives where nodes are actors (such as people, institutions, or abstract entities) and edges represent the actions or relationships between them. The one who will present these networks is our guest Armin Pournaki, a joint PhD candidate at the Max Planck Institute and Sciences, who specializes in computational social science, where he develops methods to extract and analyze political narratives using natural language processing and network science. Armin explains how these methods can expose conflicting narratives around the same events, as seen in debates on COVID-19, climate change, or the wa…

1 month, 1 week назад @ dataskeptic.com
Graphs for Causal AI
Graphs for Causal AI Graphs for Causal AI

How to build artificial intelligence systems that understand cause and effect, moving beyond simple correlations? As we all know, correlation is not causation. "Spurious correlations" can show, for example, how rising ice cream sales might statistically link to more drownings, not because one causes the other, but due to an unobserved common cause like warm weather. Our guest, Utkarshani Jaimini, a researcher from the University of South Carolina's Artificial Intelligence Institute, tries to tackle this problem by using knowledge graphs that incorporate domain expertise. Knowledge graphs (structured representations of information) are combined with neural networks in the field of neurosymbo…

1 month, 2 weeks назад @ dataskeptic.com
Power Networks
Power Networks Power Networks 1 month, 3 weeks назад @ dataskeptic.com
Unveiling Graph Datasets
Unveiling Graph Datasets Unveiling Graph Datasets 2 months назад @ dataskeptic.com
Network Manipulation
Network Manipulation Network Manipulation

In this episode we talk with Manita Pote, a PhD student at Indiana University Bloomington, specializing in online trust and safety, with a focus on detecting coordinated manipulation campaigns on social media. Key insights include how coordinated reply attacks target influential figures like journalists and politicians, how machine learning models can detect these inauthentic campaigns using structural and behavioral features, and how deletion patterns reveal efforts to evade moderation or manipulate engagement metrics. Follow our guest X/Twitter Google Scholar Papers in focus Coordinated Reply Attacks in Influence Operations: Characterization and Detection ,2025 Manipulating Twitter throug…

2 months, 2 weeks назад @ dataskeptic.com
The Small World Hypothesis
The Small World Hypothesis The Small World Hypothesis

Kyle discusses the history and proof for the small world hypothesis.

2 months, 3 weeks назад @ dataskeptic.com
Thinking in Networks
Thinking in Networks Thinking in Networks

Kyle asks Asaf questions about the new network science course he is now teaching. The conversation delves into topics such as contact tracing, tools for analyzing networks, example use cases, and the importance of thinking in networks.

3 months назад @ dataskeptic.com
Fraud Networks
Fraud Networks Fraud Networks

In this episode we talk with Bavo DC Campo, a data scientist and statistician, who shares his expertise on the intersection of actuarial science, fraud detection, and social network analytics. Together we will learn how to use graphs to fight against insurance fraud by uncovering hidden connections between fraudulent claims and bad actors. Key insights include how social network analytics can detect fraud rings by mapping relationships between policyholders, claims, and service providers, and how the BiRank algorithm, inspired by Google’s PageRank, helps rank suspicious claims based on network structure. Bavo will also present his iFraud simulator that can be used to model fraudulent networ…

3 months, 1 week назад @ dataskeptic.com
Criminal Networks
Criminal Networks Criminal Networks

In this episode we talk with Justin Wang Ngai Yeung, a PhD candidate at the Network Science Institute at Northeastern University in London, who explores how network science helps uncover criminal networks. Justin is also a member of the organizing committee of the satellite conference dealing with criminal networks at the network science conference in The Netherlands in June 2025. Listeners will learn how graph-based models assist law enforcement in analyzing missing data, identifying key figures in criminal organizations, and improving intervention strategies. Key insights include the challenges of incomplete and inaccurate data in criminal network analysis, how law enforcement agencies us…

3 months, 3 weeks назад @ dataskeptic.com
Graph Bugs
Graph Bugs Graph Bugs

In this episode today’s guest is Celine Wüst, a master’s student at ETH Zurich specializing in secure and reliable systems, shares her work on automated software testing for graph databases. Celine shows how fuzzing—the process of automatically generating complex queries—helps uncover hidden bugs in graph database management systems like Neo4j, FalconDB, and Apache AGE. Key insights include how state-aware query generation can detect critical issues like buffer overflows and crashes, the challenges of debugging complex database behaviors, and the importance of security-focused software testing. We'll also find out which Graph DB company offers swag for finding bugs in its software and get C…

4 months назад @ dataskeptic.com
Organizational Network Analysis
Organizational Network Analysis Organizational Network Analysis

In this episode, Gabriel Petrescu, an organizational network analyst, discusses how network science can provide deep insights into organizational structures using OrgXO, a tool that maps companies as networks rather than rigid hierarchies. Listeners will learn how analyzing workplace collaboration networks can reveal hidden influencers, organizational bottlenecks, and engagement levels, offering a data-driven approach to improving effectiveness and resilience. Key insights include how companies can identify overburdened employees, address silos between departments, and detect vulnerabilities where too few individuals hold critical knowledge. Real-life applications range from mergers and acq…

4 months, 1 week назад @ dataskeptic.com
SuperDataScience SuperDataScience
последний пост 1 day, 22 hours назад
904: A.I. is Disrupting the Entire Advertising Industry
904: A.I. is Disrupting the Entire Advertising Industry 904: A.I. is Disrupting the Entire Advertising Industry

In this Five-Minute Friday, Jon Krohn reveals how AI is taking on the glitzy world of advertising. Bold claims from Meta and OpenAI contend that users will soon be able to plug in what they want and have AI churn out an ad campaign for little to no cost are shaking the advertising industry to its core. The fact that the four biggest sellers of ads (Google, Meta, Amazon, and ByteDance) are digital companies and accounted for over half of the global market in 2024 adds salt to the wound. Hear the three ways that AI is disrupting the industry, and who (or what) has the most influence on digital consumers to date. Additional materials: ⁠⁠⁠⁠⁠⁠www.superdatascience.com/904 Interested in sponsoring…

1 day, 22 hours назад @ podtrac.com
903: LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir
903: LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir 903: LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir

Has AI benchmarking reached its limit, and what do we have to fill this gap? Sinan Ozdemir speaks to Jon Krohn about the lack of transparency in training data and the necessity of human-led quality assurance to detect AI hallucinations, when and why to be skeptical of AI benchmarks, and the future of benchmarking agentic and multimodal models. Additional materials: ⁠⁠⁠⁠⁠www.superdatascience.com/903⁠⁠⁠⁠ This episode is brought to you by Trainium2, the latest AI chip from AWS, by ⁠⁠Adverity, the conversational analytics platform⁠⁠ and by the ⁠⁠Dell AI Factory with NVIDIA⁠⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship informat…

4 days, 22 hours назад @ podtrac.com
902: In Case You Missed It in June 2025
902: In Case You Missed It in June 2025 902: In Case You Missed It in June 2025

In this episode of “In Case You Missed It”, Jon recaps his June interviews on The SuperDataScience Podcast. Hear from Diane Hare, Avery Smith, Kirill Eremenko, and Shaun Johnson as they talk about the best portfolios for AI practitioners, how to stand out in a saturated candidate market for AI roles, how to tell when an AI startup is going places, and ways to lead AI change in business. Additional materials: ⁠⁠⁠www.superdatascience.com/902 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 week, 1 day назад @ podtrac.com
901: Automating Legal Work with Data-Centric ML (feat. Lilith Bat-Leah)
901: Automating Legal Work with Data-Centric ML (feat. Lilith Bat-Leah) 901: Automating Legal Work with Data-Centric ML (feat. Lilith Bat-Leah)

Senior Director of AI Labs for Epiq Lilith Bat-Leah speaks to Jon Krohn about the ways AI have disrupted the legal industry using LLMs and retrieval-augmented generation (RAG), as well as how the data-centric machine learning research movement (DMLR) is systematically improving data quality, and why that is so important. Additional materials: ⁠⁠⁠⁠⁠www.superdatascience.com/901⁠⁠⁠⁠ This episode is brought to you by the ⁠⁠Dell AI Factory with NVIDIA⁠⁠ and Adverity, the conversational analytics platform⁠⁠⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (05:45) Deciphering legal tech te…

1 week, 4 days назад @ podtrac.com
900: 95-Year-Old Annie on How to Stay Healthy and Happy
900: 95-Year-Old Annie on How to Stay Healthy and Happy 900: 95-Year-Old Annie on How to Stay Healthy and Happy

“Stay happy and healthy”: In this special Five-Minute Friday, Jon Krohn speaks with Annie, his grandmother, on her 95th birthday. Hear how she is physically and mentally coping with illnesses that limit her mobility and the joys of having a pet. Additional materials: ⁠⁠⁠⁠⁠⁠www.superdatascience.com/900⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

2 weeks, 1 day назад @ podtrac.com
899: Landing $200k+ AI Roles: Real Cases from the SuperDataScience Community, with Kirill Eremenko
899: Landing $200k+ AI Roles: Real Cases from the SuperDataScience Community, with Kirill Eremenko 899: Landing $200k+ AI Roles: Real Cases from the SuperDataScience Community, with Kirill Eremenko

Data science skills, a data science bootcamp, and why Python and SQL still reign supreme: In this episode, Kirill Eremenko returns to the podcast to speak to Jon Krohn about SuperDataScience subscriber success stories, where to focus in a field that is evolving incredibly quickly, and why in-person working and networking might give you the edge over other candidates in landing a top AI role. Additional materials: ⁠⁠⁠⁠www.superdatascience.com/899⁠⁠⁠ This episode is brought to you by ⁠Adverity, the conversational analytics platform⁠ and by the ⁠Dell AI Factory with NVIDIA⁠. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship informat…

2 weeks, 4 days назад @ podtrac.com
898: My Four-Hour Agentic AI Workshop is Live and 100% Free
898: My Four-Hour Agentic AI Workshop is Live and 100% Free 898: My Four-Hour Agentic AI Workshop is Live and 100% Free

In this Five-Minute Friday, Jon Krohn announces his new, free workshop on Agentic AI. On this four-hour comprehensive course, you’ll learn the key terminology for working with these flexible, multi-agent systems and then get to grips with developing and deploying this artificial “team of experts” for all your AI-driven projects. Additional materials: ⁠⁠⁠⁠⁠www.superdatascience.com/898⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

3 weeks, 1 day назад @ podtrac.com
897: How to Enable Enterprise AI Transformation, with Strategy Consultant Diane Hare
897: How to Enable Enterprise AI Transformation, with Strategy Consultant Diane Hare 897: How to Enable Enterprise AI Transformation, with Strategy Consultant Diane Hare

Diane Hare talks to Jon Krohn about the power of storytelling for corporate buy-in of AI initiatives, how to actively implement AI to transform organizations, and how emerging professionals can upskill themselves. Hear how she discovered her background in storytelling at Ernst & Young and her work with Simon Sinek, which she finds to be integral to her process. Inspired by Sinek’s aphorism “start with why”, Diane notes that many companies neglect this crucial part of their mission because they never take the time to work on it. Additional materials: ⁠⁠⁠www.superdatascience.com/897⁠⁠ This episode is brought to you by Trainium2, the latest AI chip from AWS, by Adverity, the conversational ana…

3 weeks, 4 days назад @ podtrac.com
896: AI (Probably) Isn’t Taking Your Job (At Least Anytime Soon)
896: AI (Probably) Isn’t Taking Your Job (At Least Anytime Soon) 896: AI (Probably) Isn’t Taking Your Job (At Least Anytime Soon)

The Economist reported that global Google searches for "AI unemployment" hit an all-time high earlier this year. But do we have to worry about AI taking our jobs? In this week’s Five-Minute Friday, Jon Krohn investigates whether the rise of AI has directly led to an increase in unemployment. Additional materials: ⁠⁠⁠⁠www.superdatascience.com/896 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

4 weeks, 1 day назад @ podtrac.com
895: The Future of Enterprise AI: Investor Shaun Johnson Reveals What Actually Works
895: The Future of Enterprise AI: Investor Shaun Johnson Reveals What Actually Works 895: The Future of Enterprise AI: Investor Shaun Johnson Reveals What Actually Works

How to get funded by a VC specializing in AI: Head of AIX Ventures Shaun Johnson talks to Jon Krohn about investment strategies, how to simplify AI adoption, why a little competition can be so beneficial to AI startups, and how Big Tech is circumventing anti-monopoly measures. Additional materials: ⁠⁠www.superdatascience.com/895⁠ This episode is brought to you by ⁠Adverity, the conversational analytics platform⁠ and by the ⁠Dell AI Factory with NVIDIA⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (10:36) What Shaun looks for when evaluating early-stage AI startups (19:11) Building…

1 month назад @ podtrac.com
894: In Case You Missed It in May 2025
894: In Case You Missed It in May 2025 894: In Case You Missed It in May 2025

In this episode of “In Case You Missed It”, Jon Krohn takes clips from interviews with guests in May 2025. From AI agent integration and RAG-based chatbots to education through virtual reality headsets and data harmonization, this episode explores how industry leaders are developing the tools and technologies that can improve operations, education, healthcare, and marketing. Highlight clips are with John Roese, Global Chief Technology Officer and Chief AI Officer at Dell Technologies (Episode 887), Senior Developer Relations Engineer at Posit, PBC Jeroen Janssens and Lead Data Scientist at Xomnia Thijs Nieuwdorp (Episode 885), Founder of CEEK Mary Spio (Episode 889), and Martin Brunthaler, …

1 month назад @ podtrac.com
893: How to Jumpstart Your Data Career (by Applying Like a Scientist), with Avery Smith
893: How to Jumpstart Your Data Career (by Applying Like a Scientist), with Avery Smith 893: How to Jumpstart Your Data Career (by Applying Like a Scientist), with Avery Smith

Avery Smith is a passionate and motivational YouTuber and careers educator for data science. In this episode, Jon Krohn asks Avery about the tools and tricks he has learned from personal experience and from his students in how to get ahead in the tech industry. Avery shares the “learning ladder” he uses to help newcomers start on the right foot with great examples from former bootcamp students who have put his theories into practice. And, if you’re using LinkedIn to find jobs, Avery explains why this might be one of the reasons you’re not getting work. Additional materials: ⁠www.superdatascience.com/893 This episode is brought to you by Adverity, the conversational analytics platform and by…

1 month, 1 week назад @ podtrac.com
892: We’re In The AI “Trough of Disillusionment” (and that’s Great!)
892: We’re In The AI “Trough of Disillusionment” (and that’s Great!) 892: We’re In The AI “Trough of Disillusionment” (and that’s Great!)

Businesses have entered a “trough of disillusionment” for AI. In this Five-Minute Friday, Jon Krohn learns why Fortune 500 execs are so frustrated with the tools and how they can work their way up the “slope of enlightenment” towards effective AI. Hear why AI takeup hasn’t so far gone to plan in the corporate world and what that world needs from AI to encourage greater business engagement. Additional materials: ⁠⁠⁠www.superdatascience.com/892⁠⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month, 1 week назад @ podtrac.com
891: Conversational AI is Overhauling Data Analytics, with Martin Brunthaler
891: Conversational AI is Overhauling Data Analytics, with Martin Brunthaler 891: Conversational AI is Overhauling Data Analytics, with Martin Brunthaler

Martin Brunthaler talks to Jon Krohn about founding Adverity, a data analytics platform for marketing that simplifies integrating data from multiple sources and crunching them into actionable insights. Learn how Adverity became a data analytics powerhouse serving multiple industries, and why Martin thinks AI will strengthen rather than diminish the job market for data scientists, data analysts, and machine learning engineers. Additional materials: www.superdatascience.com/891 Today’s episode is brought to you by Trainium2, the latest AI chip from AWS and by the Dell AI Factory with NVIDIA Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for spo…

1 month, 2 weeks назад @ podtrac.com
890: The “State of AI” Report 2025
890: The “State of AI” Report 2025 890: The “State of AI” Report 2025

In this week’s Five-Minute Friday, Jon Krohn reveals highlights from Stanford University’s AI Index Report. Released a few weeks ago by the Institute for Human-Centered AI, this annual report details the incredible technical advances, policies, and investments in artificial intelligence. Hear which models achieve the best performance relative to their size, in what scenarios top AI systems can outperform humans (and when humans still outperform AI), and more in Jon’s five key takeaways. Additional materials: ⁠⁠www.superdatascience.com/890⁠⁠ Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month, 2 weeks назад @ podtrac.com
Data Science at Home Data Science at Home
последний пост 6 days, 3 hours назад
Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 286) [RB]
Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 286) [RB] Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 286) [RB]

From the viral article “Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything” on my newsletter at https://defragzone.substack.com/p/techs-dumbest-mistake-why-firinghere are my thoughts about AI replacing programmers…🎙️ Sponsors AGNTCY — The open source collective building the Internet of Agents🌐 https://www.agntcy.org✨ Connect with us!

🐦 Twitter: @DataScienceAtHome📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/Instagram: https://www.instagram.com/datascienceathome/Facebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Scie…

6 days, 3 hours назад @ datascienceathome.com
Brains in the Machine: The Rise of Neuromorphic Computing (Ep. 285)
Brains in the Machine: The Rise of Neuromorphic Computing (Ep. 285) Brains in the Machine: The Rise of Neuromorphic Computing (Ep. 285)

In this episode of Data Science at Home, we explore the fascinating world of neuromorphic computing — a brain-inspired approach to computation that could reshape the future of AI and robotics.

The episode breaks down how neuromorphic systems differ from conventional AI architectures like transformers and LLMs, diving into spiking neural networks (SNNs), their benefits in energy efficiency and real-time processing, and their limitations in training and scalability.

Real-world applications are highlighted, including low-power drones, hearing aids, and event-based cameras.

Francesco closes with a vision of hybrid systems where neuromorphic chips and LLMs coexist, blending biological inspiratio…

3 weeks, 3 days назад @ datascienceathome.com
DSH/Warcoded – AI in the Invisible Battlespace (Ep. 284)
DSH/Warcoded – AI in the Invisible Battlespace (Ep. 284) DSH/Warcoded – AI in the Invisible Battlespace (Ep. 284)

This episode explores the invisible battlespace of cyber and electronic warfare, where AI takes center stage.

SponsorsBuilding multi-agent software is hard — agent-to-agent and agent-to-tool communication is still the wild west.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at amethix.comWarcoded is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

1 month, 1 week назад @ datascienceathome.com
DSH/Warcoded Swarming the Battlefield (Ep. 283)
DSH/Warcoded Swarming the Battlefield (Ep. 283) DSH/Warcoded Swarming the Battlefield (Ep. 283)

Swarming the Battlefield explores how artificial intelligence is revolutionizing combat through coordinated drone swarms.

This episode uncovers how these intelligent agents turn the chaos of the battlefield into a synchronized dance of machine warfare.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at amethix.comWarcoded is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

1 month, 2 weeks назад @ datascienceathome.com
DSH/Warcoded Kill Chains and Algorithmic Warfare – Autonomy in Targeting and Engagement (Ep. 282)
DSH/Warcoded Kill Chains and Algorithmic Warfare – Autonomy in Targeting and Engagement (Ep. 282) DSH/Warcoded Kill Chains and Algorithmic Warfare – Autonomy in Targeting and Engagement (Ep. 282)

In this gripping follow-up, we dive into how AI is transforming kinetic operations—from identifying a threat to executing a strike.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at amethix.comWarcoded is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

Whether it’s in the sky, on the ground, or in orbit—if it’s intelligent and mobile, Intrepid helps you build it.

2 months назад @ datascienceathome.com
DSH/Warcoded: Eyes and Ears of the Machine – AI Reconnaissance and Surveillance (Ep. 281)
DSH/Warcoded: Eyes and Ears of the Machine – AI Reconnaissance and Surveillance (Ep. 281) DSH/Warcoded: Eyes and Ears of the Machine – AI Reconnaissance and Surveillance (Ep. 281)

Welcome to DSH/WarcodedWe explore how AI is transforming ISR (Intelligence, Surveillance, Reconnaissance)—from satellite imagery to drone feeds.

At the intersection of ethics and engineering, Amethix creates AI systems that don’t just function—they adapt, learn, and serve.

Discover more at amethix.com.”Warcoded is brought to you by Intrepid AI.

From drones to satellites, Intrepid AI gives engineers and defense innovators the tools to prototype, simulate, and deploy autonomous systems with confidence.

Learn more at intrepid.ai.”#AI #defensetech #ISR #LLM #Warcoded #DataScienceAtHome #OSINT #SIGINT #dronewarfare

2 months назад @ datascienceathome.com
AI Agents with Atomic Agents 🚀 with Kenny Vaneetvelde (Ep. 280)
AI Agents with Atomic Agents 🚀 with Kenny Vaneetvelde (Ep. 280) AI Agents with Atomic Agents 🚀 with Kenny Vaneetvelde (Ep. 280)

🎙️ In this episode of Data Science at Home, we sit down with Kenny Vaneetvelde, the mastermind behind Atomic Agents, a groundbreaking framework redefining AI development.

🔍 Discover how atomicity simplifies complex AI systems, why modularity matters more than ever, and how Atomic Agents is eliminating hidden assumptions and redundant complexity in AI workflows.

💡 From real-world applications to the tech stack behind the framework, Kenny takes us on a deep dive into this lightweight, powerful tool for creating consistent and brand-aligned AI.

📌 Timestamps:0:00 – Intro2:30 – Kenny’s journey in AI5:00 – What are Atomic Agents?

10:45 – Why atomicity matters in AI18:20 – The tech behind Atomic A…

3 months назад @ datascienceathome.com
Run massive models on crappy machines (Ep. 279)
Run massive models on crappy machines (Ep. 279) Run massive models on crappy machines (Ep. 279)

This episode explores how to break down barriers by running massive AI models on “crappy machines”—affordable, low-spec devices.

🐦 Twitter: @DataScienceAtHome📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/Instagram: https://www.instagram.com/datascienceathome/Facebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews, and discussions.

Send us mail …

3 months, 1 week назад @ datascienceathome.com
WeightWatcher: The AI Detective for LLMs (DeepSeek & OpenAI included) (Ep. 278)
WeightWatcher: The AI Detective for LLMs (DeepSeek & OpenAI included) (Ep. 278) WeightWatcher: The AI Detective for LLMs (DeepSeek & OpenAI included) (Ep. 278)

Enter WeightWatcher—the AI detective tool that peeks inside neural networks without needing their data.

🐦 Twitter: @DataScienceAtHome📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/Instagram: https://www.instagram.com/datascienceathome/Facebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews, and discussions.

Send us mail at:hello@datascienceathom…

3 months, 2 weeks назад @ datascienceathome.com
Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 278)
Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 278) Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything (Ep. 278)

From the viral article “Tech’s Dumbest Mistake: Why Firing Programmers for AI Will Destroy Everything” on my newsletter at https://defragzone.substack.com/p/techs-dumbest-mistake-why-firinghere are my thoughts about AI replacing programmers…✨ Connect with us!

🐦 Twitter: @DataScienceAtHome📘 LinkedIn: https://www.linkedin.com/in/fragadaleta/Instagram: https://www.instagram.com/datascienceathome/Facebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

Whether you’re a data profes…

3 months, 2 weeks назад @ datascienceathome.com
Scaling Smart: AI, Data, and Building Future-Ready Enterprises with Josh Miramant (Ep. 276)
Scaling Smart: AI, Data, and Building Future-Ready Enterprises with Josh Miramant (Ep. 276) Scaling Smart: AI, Data, and Building Future-Ready Enterprises with Josh Miramant (Ep. 276)

In this episode, we dive into the transformative world of AI, data analytics, and cloud infrastructure with Josh Miramant, CEO of Blue Orange Digital.

As a seasoned entrepreneur with over $25 million raised across ventures and two successful exits, Josh shares invaluable insights on scaling data-driven businesses, integrating machine learning frameworks, and navigating the rapidly evolving landscape of cloud data architecture.

From generative AI to large language models, Josh explores cutting-edge trends shaping financial services, real estate, and consumer goods.

Tune in for a masterclass in leveraging data for impact and innovation!

Linkshttps://blueorange.digital/https://blueorange.digit…

6 months, 3 weeks назад @ datascienceathome.com
Autonomous Weapons and AI Warfare (Ep. 275)
Autonomous Weapons and AI Warfare (Ep. 275) Autonomous Weapons and AI Warfare (Ep. 275)

Here’s the updated text with links to the websites included:AI is revolutionizing the military with autonomous drones, surveillance tech, and decision-making systems.

In this episode of Data Science at Home, we expose the cutting-edge tech reshaping defense—and the chilling ethical questions that follow.

🐦 Twitter: @DataScienceAtHome📘 LinkedIn: Francesco Gad📷 Instagram: https://www.instagram.com/datascienceathome/📘 Facebook: https://www.facebook.com/datascienceAH💼 LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast💬 Discord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

S…

6 months, 4 weeks назад @ datascienceathome.com
8 Proven Strategies to Scale Your AI Systems Like OpenAI! 🚀 (Ep. 274)
8 Proven Strategies to Scale Your AI Systems Like OpenAI! 🚀  (Ep. 274) 8 Proven Strategies to Scale Your AI Systems Like OpenAI! 🚀 (Ep. 274)

In this episode of Data Science at Home, we’re diving deep into the powerful strategies that top AI companies, like OpenAI, use to scale their systems to handle millions of requests every minute!

From stateless services and caching to the secrets of async processing, discover 8 essential strategies to make your AI and machine learning systems unstoppable.

Instagram: https://www.instagram.com/datascienceathome/Twitter: @datascienceathomeFacebook: https://www.facebook.com/datascienceAHLinkedIn: https://www.linkedin.com/company/data-science-at-home-podcastDiscord Channel: https://discord.gg/4UNKGf3NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and ma…

7 months назад @ datascienceathome.com
Humans vs. Bots: Are You Talking to a Machine Right Now? (Ep. 273)
Humans vs. Bots: Are You Talking to a Machine Right Now? (Ep. 273) Humans vs. Bots: Are You Talking to a Machine Right Now? (Ep. 273)

Together, they explore the growing importance of distinguishing human-written from AI-generated text, discussing real-world examples from social media to news.

How reliable are current detection tools like DetectGPT?

What are the ethical and technical challenges ahead as AI continues to advance?

And is the balance between innovation and regulation tipping in the right direction?

Tune in for insights on the future of AI text detection and the broader implications for media, academia, and policy.

7 months, 2 weeks назад @ datascienceathome.com
AI bubble, Sam Altman’s Manifesto and other fairy tales for billionaires (Ep. 272)
AI bubble, Sam Altman’s Manifesto and other fairy tales for billionaires (Ep. 272) AI bubble, Sam Altman’s Manifesto and other fairy tales for billionaires (Ep. 272)

Welcome to Data Science at Home, where we don’t just drink the AI Kool-Aid.

Today, we’re dissecting Sam Altman’s “AI manifesto”—a magical journey where, apparently, AI will fix everything from climate change to your grandma’s back pain.

In this episode, I’ll break down the bold (and often bizarre) claims in Altman’s grand speech for the Intelligence Age.

I’ll give you the real scoop on what’s realistic, what’s nonsense, and why some tech billionaires just can’t resist overselling.

Chapters00:00 – Intro00:18 – CEO of Baidu Statement on AI Bubble03:47 – News On Sam Altman Open AI06:43 – Online Manifesto “The Intelleigent Age”13:14 – Deep Learning16:26 – AI gets Better With Scale17:45 – Conclu…

7 months, 3 weeks назад @ datascienceathome.com