Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
last post 1 hour ago
[D] Was neurosymbolism a mistake?

1 hour ago @ reddit.com
[D] Question about DDPM

2 hours ago @ reddit.com
[D] Game Engines for training foundational models

2 hours ago @ reddit.com
[D][P] image/txt-to-json model recommendation.

4 hours ago @ reddit.com
[R] The Curse of Depth in Large Language Models

8 hours ago @ reddit.com
[R] Evaluating LLMs on Real-World Software Engineering Tasks: A $1M Benchmark Study

10 hours ago @ reddit.com
[R] Membership Inference Attacks for Face Images Against Fine-Tuned Latent Diffusion Models

11 hours ago @ reddit.com
[R] Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention (submitted by Liang Wenfeng - DeepSeek)

12 hours ago @ reddit.com
[D] Does there exist no global convergence proof for the ADAM algorithm?

14 hours ago @ reddit.com
[D] Can anyone recommend me books I can read on stream?

15 hours ago @ reddit.com
[D] How AISTATS/UAI/TMLR is viewed when applied for industry job?

19 hours ago @ reddit.com
[D] Which Conference Template can Write most?

19 hours ago @ reddit.com
[D] Is It Okay to Train and Compare Models Without a Benchmark Dataset?

20 hours ago @ reddit.com
[D] Finetuning ModernBERT is taking 3hrs (2 epochs) and 35gigs of vram. is it normal?

21 hours ago @ reddit.com
built a chrome extension to read arXiv papers 2x faster [P]

1 day ago @ reddit.com
Towards Data Science
last post 3 hours ago
Learning How to Play Atari Games Through Deep Neural Networks

Yet, it remains an ideal stepping stone in the field of Deep Reinforcement Learning, widely recognized for combining deep learning with reinforcement learning.

A singular frame omits many details of the game state such as the velocity and direction of the ball.
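
A common remedy, used by DQN-style agents, is to stack the last few frames so the model can infer motion; here is a minimal, framework-free sketch of such a buffer (the integer "frames" are placeholders for real Atari observations):

```python
from collections import deque

# Stack the last 4 observations so a model can infer motion (ball velocity,
# direction) that a single frame cannot convey. Frames are placeholder ints.
class FrameStack:
    def __init__(self, size=4):
        self.size = size
        self.frames = deque(maxlen=size)

    def reset(self, first_frame):
        # On environment reset, fill the whole stack with the first frame.
        self.frames.clear()
        for _ in range(self.size):
            self.frames.append(first_frame)
        return list(self.frames)

    def step(self, frame):
        # Append the newest frame; the oldest one falls off automatically.
        self.frames.append(frame)
        return list(self.frames)

stack = FrameStack(size=4)
stack.reset(0)
state = stack.step(1)
print(state)  # [0, 0, 0, 1]
```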

The Arcade Learning Environment (ALE) is a framework that allows us to interact with Atari 2600 environments.

Before that, we need to create the necessary objects to pass into our training function along with some hyperparameters.

References: [1] A. L. Samuel, “Some Studies in Machine Learning Using the Game of Checkers,” IBM Journal of Research and Development, vol.

3 hours ago @ towardsdatascience.com
Honestly Uncertain

Spiegelhalter walks the reader through the quadratic scoring rule, and briefly mentions that a linear scoring rule will lead to dishonest behavior.

One proper scoring rule is the quadratic scoring rule, also known as the Brier score.

This reward function is asymmetric: When you increase your confidence from Q=0.95 to Q=0.98 (and A is true), the reward function only increases marginally.

The data scientist working at the supermarket does not know my personal shopping list, even after having collected considerable personal data.

In a way, we often behave as if a linear scoring rule judged our predictions, making us lean towards the bold side.
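
The contrast between the two rules is easy to verify in code. A minimal sketch, assuming the standard Brier-style form reward = 1 − (Q − A)² for stated confidence Q and binary outcome A, alongside a linear rule for comparison:

```python
# Quadratic (Brier-style) scoring rule: reward(Q, A) = 1 - (Q - A)^2,
# with confidence Q in [0, 1] and outcome A in {0, 1}.
# The specific numbers below just illustrate the asymmetry from the article.

def quadratic_reward(q, a):
    return 1.0 - (q - a) ** 2

def linear_reward(q, a):
    # An improper rule: it rewards confidence directly, inviting overstatement.
    return q if a == 1 else 1.0 - q

# When the event is true, moving from Q=0.95 to Q=0.98 gains little...
gain_true = quadratic_reward(0.98, 1) - quadratic_reward(0.95, 1)
# ...but if the event turns out false, the same move costs much more.
loss_false = quadratic_reward(0.95, 0) - quadratic_reward(0.98, 0)
print(round(gain_true, 4), round(loss_false, 4))  # 0.0021 0.0579
```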

4 hours ago @ towardsdatascience.com
How LLMs Work: Pre-Training to Post-Training, Neural Networks, Hallucinations, and Inference

With the recent explosion of interest in large language models (LLMs), they often seem almost magical.

What you’ll get: Part 1 (this article) covers the fundamentals of LLMs, including pre-training, post-training, neural networks, hallucinations, and inference.

Step 3: Neural network training. Once the text is tokenized, the neural network learns to predict the next token based on its context.

You can think of them as an advanced autocomplete system — they predict the next token based on probability, but with limited instruction-following ability.
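
That autocomplete analogy can be made concrete with a toy model. The corpus and bigram counts below are invented stand-ins for a trained network's output distribution, but the inference loop (predict one token, append it, repeat) is the same in spirit:

```python
import random
from collections import Counter, defaultdict

# Toy "next-token predictor": count bigrams in a tiny corpus, then repeatedly
# sample the next token in proportion to those counts.
corpus = "the cat sat on the mat the cat ate".split()
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

rng = random.Random(0)
out = ["the"]
for _ in range(4):
    counts = bigrams[out[-1]]
    if not counts:
        break  # no continuation was ever observed for this token
    tokens, weights = zip(*counts.items())
    out.append(rng.choices(tokens, weights=weights)[0])
print(" ".join(out))
```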

Hallucinations happen because LLMs do not “know” facts — they simply predict the most statistically likely sequence of words based on their traini…

5 hours ago @ towardsdatascience.com
The Future of Data: How Decision Intelligence is Revolutionizing Data

That is exactly where Decision Intelligence (DI) comes into the picture.

In this blog, I want to introduce the readers to the concept of Decision Intelligence — a very nuanced and rapidly growing concept.

This is where Decision Intelligence (DI) becomes important for organizations making focused, contextual, and personalized decisions.

What is the difference between Decision Intelligence and Artificial Intelligence?

Decision Intelligence creates value through increased revenue, cost reduction, improved efficiency, and risk mitigation, which is what makes it important for organizations.

5 hours ago @ towardsdatascience.com
Retrieval Augmented Generation in SQLite

SQLite-Vec: One of the powers of the SQLite database is the use of extensions.

This extension allows SQLite to perform efficient searches across large volumes of textual data.

Because the extension is written purely in C, we can run it anywhere a SQLite database can be run, including Raspberry Pis and browsers.

Integration with SQLite Query Engine: Virtual tables integrate seamlessly with SQLite’s standard query syntax e.g.

The typical syntax for creating a virtual table looks like the following: CREATE VIRTUAL TABLE my_table USING my_extension_module(); The important part of this statement is my_extension_module().
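
Since sqlite-vec must be loaded as an extension, here is a dependency-free sketch of the same idea: brute-force vector search over embeddings stored as BLOBs in a plain SQLite table (the two-dimensional vectors and documents are invented):

```python
import math
import sqlite3
import struct

def serialize(vec):
    # Pack a list of floats into a BLOB of little-endian float32 values.
    return struct.pack("%df" % len(vec), *vec)

def deserialize(blob):
    return list(struct.unpack("%df" % (len(blob) // 4), blob))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, text TEXT, embedding BLOB)")
docs = [("cats purr", [1.0, 0.0]), ("dogs bark", [0.0, 1.0])]
for text, vec in docs:
    db.execute("INSERT INTO docs (text, embedding) VALUES (?, ?)", (text, serialize(vec)))

# Retrieve the document nearest to a (made-up) query embedding.
query = [0.9, 0.1]
rows = db.execute("SELECT text, embedding FROM docs").fetchall()
best = max(rows, key=lambda r: cosine(query, deserialize(r[1])))
print(best[0])
```

sqlite-vec replaces the Python-side scan with a vec0 virtual table queried in SQL, but the storage model (vectors as BLOB columns) is the same.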

5 hours ago @ towardsdatascience.com
Tutorial: Semantic Clustering of User Messages with LLM Prompts

The first step was to convert the anonymized data into a pandas data frame with columns: score, user, role, message, timestamp, thread, user_turns.
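
That first step might look roughly like the following pandas sketch; the records and values are invented, with user_turns computed as the number of user messages per thread (an assumption about the article's schema):

```python
import pandas as pd

# Hypothetical anonymized chat records; the column names follow the article's
# schema (score, user, role, message, timestamp, thread); values are invented.
records = [
    {"score": 5, "user": "u1", "role": "user",
     "message": "How do I speed up slow queries?", "timestamp": 1, "thread": "t1"},
    {"score": 4, "user": "u1", "role": "assistant",
     "message": "Try adding an index.", "timestamp": 2, "thread": "t1"},
    {"score": 3, "user": "u2", "role": "user",
     "message": "How do I reset my password?", "timestamp": 1, "thread": "t2"},
]
df = pd.DataFrame(records)

# user_turns: number of user messages in each thread, broadcast to every row.
df["user_turns"] = df.groupby("thread")["role"].transform(lambda r: (r == "user").sum())

# Separate user messages from full-thread messages before clustering.
user_df = df[df["role"] == "user"]
print(user_df["message"].tolist())
```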

I separate out user messages from entire thread messages, to see if one or the other would produce better clusters.

Make user thread summaries straight to the point, capture the main content without losing too much detail, skip the intro text.

Cluster topics should be as short as possible without losing too much detail in the cluster meaning.

Interestingly, both Perplexity Pro and Gemini 2.0 Pro sometimes hallucinated cluster topics (e.g., misclassifying a question about slow queries as “Personal Matter”).

1 day, 7 hours ago @ towardsdatascience.com
On-Device Machine Learning in Spatial Computing

This convergence of spatial awareness and on-device machine learning capabilities opens up new possibilities for object recognition and tracking applications that were previously challenging to implement.

Apple’s Create ML application provides a straightforward interface for training machine learning models, including specialized templates for spatial computing applications.

To begin the training process, we launch Create ML and select the “Object Tracking” template from the spatial category.

It manages the ARKit session, handles object tracking initialization, and maintains the state of object detection throughout the app's lifecycle.

This implementation represents just the beginning of wh…

1 day, 9 hours ago @ towardsdatascience.com
How I Became A Machine Learning Engineer (No CS Degree, No Bootcamp)

I am fortunate enough to work and develop with these technologies every day as a machine learning engineer!

After several interview rounds, I was offered the role, and I am now a fully fledged machine learning engineer!

Experience — A machine learning engineer is not an entry-level position in my opinion.

— Most companies nowadays deploy much of their architecture and systems in the cloud, and machine learning models are no exception.

This is why securing a machine learning engineer role can be quite challenging, as it requires proficiency across a wide range of skills.

3 days, 20 hours ago @ towardsdatascience.com
➡️ Start Asking Your Data ‘Why?’ — A Gentle Intro To Causality

To understand causal graph models (or causal graphs for short) we start with the following illustration of an example undirected graph with four nodes/vertices and three edges.
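
Such a graph can be sketched as a simple adjacency structure; the node names are invented placeholders, since the article's figure is not reproduced here:

```python
# Undirected graph with four nodes and three edges, mirroring the article's
# illustration. Node names are made-up placeholders.
edges = [("A", "B"), ("B", "C"), ("B", "D")]
adjacency = {}
for u, v in edges:
    # An undirected edge is recorded in both directions.
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

print(len(adjacency), len(edges))  # 4 nodes, 3 edges
```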

💊 Causal Graph Resolution of Simpson’s Paradox: For simplicity we’ll examine Simpson’s paradox focusing on two cohorts, male and female adults.

Post data collection: e.g., in this made-up scenario the data has already been collected, and hence we need to deal with what is referred to as Observational Data.

🦚 Causal Graph Resolution of Berkson’s Paradox: As in the previous section, we are going to make clear statements about how we believe the data was generated and then draw these in a causal graph.

Check out my new article…

3 days, 20 hours ago @ towardsdatascience.com
Roadmap to Becoming a Data Scientist, Part 4: Advanced Machine Learning

Following significant breakthroughs in machine learning about a decade ago, data science has surged in popularity within the tech community.

The first three parts of this series outlined the necessary skills to become a data scientist in three key areas: math, software engineering, and machine learning.

While knowledge of classical Machine Learning and neural network algorithms is an excellent starting point for aspiring data specialists, there are still many important topics in machine learning that must be mastered to work on more advanced projects.

In machine learning, many new methods are built upon older approaches, which is especially true for NLP and computer vision.

Another advanced…

4 days, 5 hours ago @ towardsdatascience.com
Publish Interactive Data Visualizations for Free with Python and Marimo

Here I present my experience using a recently released Python library — marimo — which opens up exciting new opportunities for publishing interactive visualizations across the entire field of data science.

Interactive Data Visualization: The tradeoffs to consider when selecting an approach for presenting data visualizations can be broken into three categories: Capabilities — what visualizations and interactivity am I able to present to the user?

The biggest tradeoff for using Python has been the cost for publishing – delivering the tool to users.

— user input and visual display features are rather extensive, supporting user input via Altair and Plotly plots.

As a proof of concept, I built a si…

4 days, 6 hours ago @ towardsdatascience.com
Building a Data Engineering Center of Excellence

In this blog post, we will discuss the essential components of a functioning data engineering practice, why data engineering is becoming increasingly critical for businesses today, and how you can build your very own Data Engineering Center of Excellence!

The data warehouse or the data lake is where data analysts, data scientists, and business users consume data.

Data Engineers load the data into data warehouses and data lakes, which are transformed not just for the Data Science & predictive analytics initiatives (as everyone likes to talk about) but primarily for data analysts.

Without the data engineers, analysts and data scientists won’t have valuable data to work with, and hence, dat…

4 days, 20 hours ago @ towardsdatascience.com
Learnings from a Machine Learning Engineer — Part 5: The Training

Image files are stored in a directory structure like the following, which is self-documenting and easily modified.

Building your Docker container: Being able to execute a training run in a consistent manner lends itself perfectly to building a Docker container.

Training script: As AI/ML engineers, model training is where we want to spend most of our time.

Evaluation script: After your training run has completed and you have taken proper precautions to save your work, it is time to see how well it performed.

Manually review results: With the training run complete, you should now have model artifacts saved and can examine the performance.

5 days, 1 hour ago @ towardsdatascience.com
Learnings from a Machine Learning Engineer — Part 3: The Evaluation

In Part 1, I discussed the process of labelling your image data that you use in your Image Classification project.

This means adding a step immediately after your training run completes to calculate scores for all images — training, validation, test, and the benchmark sets.

Confidence threshold report: Another valuable measure that incorporates the confidence threshold (in my example, 95) is to report on accuracy and false positive rates.

Recall that when we apply the confidence threshold before presenting results, we help reduce false positives from being shown to the end user.
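
The thresholding step described above can be sketched in a few lines; the 95 threshold follows the article's example, while the predictions themselves are invented:

```python
# Sketch of applying a confidence threshold before surfacing predictions.
# The 95 threshold follows the article's example; predictions are invented.
THRESHOLD = 95

predictions = [
    {"label": "cat", "score": 97},
    {"label": "dog", "score": 88},   # below threshold: withheld from the user
    {"label": "fox", "score": 96},
]

shown = [p for p in predictions if p["score"] >= THRESHOLD]
withheld = [p for p in predictions if p["score"] < THRESHOLD]
print([p["label"] for p in shown])  # → ['cat', 'fox']
```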

Some of these may make sense to incorporate into your data set to provide a better user experience.

5 days, 1 hour ago @ towardsdatascience.com
Learnings from a Machine Learning Engineer — Part 1: The Data

It is said that in order for a machine learning model to be successful, you need to have good data.

While this is true (and pretty much obvious), it is extremely difficult to define, build, and sustain good data.

It was about eight years ago that I started online studies for Data Science and machine learning.

These standards will guide you toward that “good” data.

What qualifies as a good image for training is a bit different than what qualifies as a good image for evaluation.

5 days, 2 hours ago @ towardsdatascience.com
Distill.pub
last post: none
The Gradient
last post 3 months ago
Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research

Mathematics and statistics, once the primary guides of machine learning research, now struggle to provide immediate insight into the latest breakthroughs.

This shift has prompted speculation about mathematics’ diminished role in machine learning research moving forward.

It is also the way that symmetries are usually leveraged when performing computations (for example, in machine learning).

One can reasonably argue that diagrammatic descriptions of well-known constructions, like products, are not useful for the machine learning researcher.

However, as we’ve demonstrated, while mathematics may not maintain the same role in machine learning research that it has held in the past, the success of…

3 months ago @ thegradient.pub
What's Missing From LLM Chatbots: A Sense of Purpose

Let's jump back to the 1970s, when Roger Schank introduced his "restaurant script" as a kind of dialogue system [1].

The minimum requirement we could have for a dialogue system is that it can stay on the task we gave it.

Concluding marks: I have reviewed the making of current LLM dialogue systems, and how and why they are insufficient.

The following are two research questions that I’m most excited about: (1) Better monitoring and control of dialogue systems with steering techniques.

Citation: For attribution of this in academic contexts or books, please cite this work as: Kenneth Li, "From prediction to purpose: a tutorial on LLM dialogue system", The Gradient, 2024.

5 months, 1 week ago @ thegradient.pub
We Need Positive Visions for AI Grounded in Wellbeing

This leads to our second conclusion: We need plausible positive visions of a society with capable AI, grounded in wellbeing.

The rest of this post describes in more detail (1) what we mean by AI that benefits our wellbeing, (2) the need for positive visions for AI grounded in wellbeing, and (3) concrete leverage points to aid in the development and deployment of AI in service of such positive visions.

In diving into the philosophy of flourishing, wellbeing economics, or psychological theories of human wellbeing, one encounters many interesting, compelling, but seemingly incompatible ideas.

The case so far is that we need positive visions for society with capable AI, grounded in individual a…

6 months, 2 weeks ago @ thegradient.pub
TheSequence
last post 11 hours ago
The Sequence Knowledge #492: RAG-Fusion is Better than Just RAG

Today we will discuss: An introduction to RAG Fusion, which expands some of the traditional RAG concepts with sophisticated ranking methods.

A review of the original RAG Fusion paper.

Fusion RAG (also known as RAG-Fusion), an evolution of traditional Retrieval-Augmented Generation (RAG), enhances information retrieval and response generation by leveraging multiple query expansions and advanced ranking techniques.

This approach addresses limitations of conventional RAG systems, which often rely on single-document retrieval for complex queries.

At its core, Fusion RAG employs multi-query generation to create diverse interpretations of the original user query.
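
RAG-Fusion typically merges the per-query result lists with reciprocal rank fusion (RRF); a minimal sketch of that ranking step (the document IDs and ranked lists are invented):

```python
# Reciprocal Rank Fusion (RRF): merge several ranked lists of document IDs,
# one list per generated sub-query, into a single fused ranking.

def rrf(rankings, k=60):
    """Score each doc by sum of 1/(k + rank) over the lists it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Three rankings, one per query variant; d2 ranks highly in every list.
rankings = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d2", "d3", "d1"],
]
print(rrf(rankings))  # → ['d2', 'd1', 'd3', 'd4']
```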

11 hours ago @ thesequence.substack.com
Guest-post: Open-source Python Development Landscape

In this experimental guest post, Avi Chawla shares a curated selection of Python tools that streamline development.

Python development involves various stages and equally many tools to manage them: for dependencies, tools like pip, Conda, and Poetry help.

Dependency & Package Managers: Manage Python package installations and dependencies.

Poetry – A dependency management tool that simplifies package management and publishing.

Bandit – A security linter for identifying vulnerabilities in Python code.

1 day, 10 hours ago @ thesequence.substack.com
The Sequence Radar #491: Red Teaming AI with AI

The engineering section dives into another awesome framework and we discuss large action models in our research edition.

📝 Editorial: Red Teaming AI with AI. Jailbreaks are one of the biggest headaches when it comes to large language models (LLMs).

That’s exactly what Anthropic set out to explore in their research paper, "Constitutional Classifiers: Safeguarding Language Models from Universal Jailbreaks."

Red Teaming for Robotics: In the paper "Embodied Red Teaming for Auditing Robotic Foundation Models", researchers introduce Embodied Red Teaming (ERT) to identify diverse instructions that cause language-conditioned robot models to fail, demonstrating a …

2 days, 10 hours ago @ thesequence.substack.com
The Sequence Research #490: A Practical Deep Dive Inside DeepSeek-R1

I have been waiting a few days until the madness around DeepSeek-R1 slowed down to discuss some of the details around this model.

Yesterday, we debated the GPU optimization techniques used in the model, and today we would like to dive into the R1 model itself, discussing some of its key contributions.

This is DeepSeek-R1, the newest model by the famous Chinese AI lab that dabbles in reasoning.

One of the dominant reasoning theses in the market is that reasoning is an emergent property of the scaling laws.

DeepSeek-R1 challenges that thesis, achieving reasoning by leveraging a very clever post-training process.

4 days, 11 hours ago @ thesequence.substack.com
The Sequence Opinion #489: CRAZY: How DeepSeek R1 Bypassed CUDA with Lower-Level GPU Optimization Techniques

A lot has been written about DeepSeek R1 and its clever innovations over the last few weeks.

However, one of the aspects that hasn’t received a lot of attention has been their work on GPU level optimizations.

The level of optimization is insane, to the point of bypassing NVIDIA’s CUDA altogether to leverage PTX programming, utilize NCCL for communication efficiency, and adopt other advanced techniques.

It provides high-level abstractions for GPU programming, making it accessible to developers through languages like C++ and Python.

Strengths of CUDA

5 days, 10 hours ago @ thesequence.substack.com
The Sequence Engineering #488: Txtai, Maybe the Simplest Way to do Embeddings

Created Using MidjourneyEmbeddings are a critical component of any generative AI applications.

The market has been floated with many vector databases and other platforms.

Simplicity and developer friendliness and two of the main characteristics that I look for in embedding frameworks when building generative AI apps.

txtai is an open-source embeddings database that integrates semantic search, LLM orchestration, and language model workflows1.

This foundation supports vector search and also serves as a knowledge base for large language model (LLM) applications.
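Conceptually, what an embeddings database like txtai does at query time is nearest-neighbor search over dense vectors. A toy numpy sketch of that core operation (hand-made 3-d vectors stand in for real sentence embeddings; this is not txtai's API):

```python
import numpy as np

# Toy "documents" with hand-made 3-d embeddings standing in for the
# vectors a real model (e.g. sentence-transformers) would produce.
docs = ["dogs are loyal pets", "stock markets fell today", "cats purr when happy"]
emb = np.array([
    [0.9, 0.1, 0.0],   # animals
    [0.0, 0.1, 0.9],   # finance
    [0.8, 0.2, 0.1],   # animals
])

def search(query_vec, k=1):
    # cosine similarity = dot product of L2-normalized vectors
    e = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    scores = e @ q
    top = np.argsort(-scores)[:k]
    return [(docs[i], float(scores[i])) for i in top]

# A query embedded near the "animals" direction retrieves the pet docs
print(search(np.array([1.0, 0.0, 0.1]), k=2))
```

A production system layers indexing structures (e.g. approximate nearest neighbor) on top of exactly this computation.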

6 days, 10 hours ago @ thesequence.substack.com
The Sequence Knowledge #487: A RAG that Assesses Itself

Created Using Midjourney. Today we will discuss: an overview of SELF-RAG, a technique for self-assessing retrieval augmentation methods.

The original SELF-RAG paper from Google DeepMind.

💡 AI Concept of the Day: Understanding Self-RAG. Continuing our series on RAG techniques, today I would like to discuss one of the most intriguing ones.

Self-Reflective Retrieval-Augmented Generation (SELF-RAG) is an advanced framework that enhances traditional Retrieval-Augmented Generation (RAG) by incorporating self-assessment mechanisms into the retrieval and generation processes.

Unlike standard RAG, which often retrieves passages indiscriminately, SELF-RAG employs a more sophisticated approach by introd…
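The control flow this adds on top of standard RAG can be caricatured with stub components (every function below is invented for illustration; the actual framework trains the model to emit reflection tokens such as Retrieve and ISREL rather than calling external heuristics):

```python
# Sketch of the SELF-RAG control flow with stub components. In the real
# framework the model itself emits reflection tokens (Retrieve, ISREL,
# ISSUP, ISUSE); here simple functions stand in for them.
KB = {"capital of France": "Paris is the capital of France."}

def needs_retrieval(query):           # stand-in for the Retrieve token
    return "capital" in query

def retrieve(query):
    return [v for k, v in KB.items() if k in query]

def is_relevant(query, passage):      # stand-in for the ISREL critique
    return any(w in passage.lower() for w in query.lower().split())

def generate(query, passage=None):    # stub generator
    return passage if passage else f"(no evidence) answer to: {query}"

def self_rag(query):
    if needs_retrieval(query):
        # retrieve only when the critique says so, then self-assess passages
        candidates = [p for p in retrieve(query) if is_relevant(query, p)]
        if candidates:
            return generate(query, candidates[0])
    return generate(query)

print(self_rag("What is the capital of France?"))
```

The key contrast with standard RAG is visible in the structure: retrieval and each retrieved passage are gated by explicit self-assessments instead of being used unconditionally.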

1 week ago @ thesequence.substack.com
The Sequence Radar # : The Amazing AlphaGeometry2 Now Achieved Gold Medalist in Math Olympiads

The coverage rate of the AG2 language on International Math Olympiad (IMO) geometry problems from 2000-2024 increased from 66% to 88%.

An enhanced language model, leveraging the Gemini architecture and trained on a larger and more diverse dataset, was also implemented.

The language model itself is a sparse mixture-of-expert Transformer-based model that leverages the Gemini training pipeline.

Self-MoA: In the paper "Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?

🤖 AI Tech Releases: Deep Research. OpenAI introduced Deep Research, its AI reasoning agent.

1 week, 2 days ago @ thesequence.substack.com
eBook: Mastering AI Agents

Check out the new Mastering AI Agents eBook.

In this ~100-page book, you will find in-depth content on creating powerful, reliable, and scalable AI agents from Galileo experts.

Get your copy and read their comprehensive Mastering AI Agents eBook to learn how to: choose the right agentic framework for your use case; evaluate and improve AI agent performance; identify failure points and production issues. We also recommend signing up for their upcoming webinar with crewAI on evaluating AI agents!

For those who want to know more: the book is divided into five chapters. Chapter 1 introduces AI agents, their optimal applications, and scenarios where they might be excessive.

Chapter 5 addresses why many AI…

1 week, 4 days ago @ thesequence.substack.com
The Sequence Opinion #485: What's Wrong With AI Benchmarks

Created Using Midjourney. In the generative AI era, evaluations and benchmarking have rapidly evolved from a set of quantitative metrics into the primary means by which we understand the capabilities of foundation models.

With explainability techniques such as mechanistic interpretability still in their infancy, benchmarks serve as a crucial tool for deriving insights into the inner workings of generative AI models.

Every day, new benchmarks emerge to evaluate unique model capabilities.

This essay explores the current challenges in AI benchmarking and the trends shaping its future.

The Challenges of Modern AI BenchmarkingAI benchmarking today is undermined by three significant challenges: dat…

1 week, 5 days ago @ thesequence.substack.com
The Sequence Engineering #483: Block's goose is a Brand New Framework for Building Agentic Applications

Created Using Midjourney. Agents, agents, agents.

Every week, it feels like there is a new framework for agentic workflows.

Like many AI capabilities, some of the most interesting agentic releases are coming from large software companies that are applying these technologies at scale.

A few days ago, Block released goose, a framework for building agentic apps.

In today’s The Sequence Engineering, I would like to dive into the details behind goose.

1 week, 6 days ago @ thesequence.substack.com
The Sequence Knowledge #482: An Introduction to Corrective RAG

Created Using Midjourney. Today we will discuss: an overview of Corrective RAG.

The original paper about Corrective RAG from Google DeepMind and others.

One of the most interesting ones is known as corrective RAG.

Corrective RAG represents an evolution in retrieval-augmented generation, pushing beyond the boundaries of standard RAG to deliver more accurate and contextually appropriate outputs.

While standard RAG enhances LLM responses with external knowledge, Corrective RAG introduces a sophisticated feedback mechanism that scrutinizes and refines the generated content.
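That feedback mechanism can be sketched as a score-then-correct loop (the evaluator here is a toy word-overlap stub; the paper uses a trained retrieval evaluator with Correct/Incorrect/Ambiguous verdicts and a web-search fallback):

```python
# Caricature of the Corrective RAG loop: score retrieved passages, and if
# none pass the confidence bar, discard retrieval and fall back elsewhere.
def evaluate(query, passage):
    # Stub retrieval evaluator: fraction of query words found in passage.
    # The real system uses a trained model, not word overlap.
    words = query.lower().split()
    return sum(w in passage.lower() for w in words) / len(words)

def corrective_rag(query, retrieved, threshold=0.5):
    scored = [(evaluate(query, p), p) for p in retrieved]
    good = [p for s, p in scored if s >= threshold]
    if good:
        # "Correct" branch: generate from the passages that passed
        return "generate from: " + " | ".join(good)
    # "Incorrect" branch: fall back to an alternative source instead
    return "generate from: web-search(" + query + ")"

print(corrective_rag("python list sort",
                     ["how to sort a list in python", "banana bread recipe"]))
```

The difference from standard RAG is that low-confidence retrievals are filtered out or replaced before generation, rather than being passed to the model as-is.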

2 weeks ago @ thesequence.substack.com
The Sequence Radar #481: Humanity's Last Exam

The top AI models are constantly outpacing and memorizing leading benchmarks, triggering a race to develop more challenging evaluations that push the boundaries of foundation models.

The Humanity's Last Exam (HLE) benchmark is a novel, multi-modal evaluation suite designed to assess the limits of large language model (LLM) capabilities on closed-ended academic questions.

Another key contribution of the HLE benchmark lies in its diverse question formats and subject coverage.

This level of difficulty contrasts with the saturation seen in many existing benchmarks, demonstrating the utility of HLE in assessing frontier AI capabilities.

The HLE benchmark aims to address the fact that current LLM…

2 weeks, 2 days ago @ thesequence.substack.com
📝 Guest Post: Augmented SBERT: A Data Augmentation Method to Enhance Bi-Encoders for Pairwise Sentence Scoring*

This guest post from Zilliz discusses the limitations of these approaches and how Augmented SBERT (AugSBERT) addresses them.

It helps generate new sentence pairs by sampling the existing sentence pairs.

Here's why it is essential. Generating Diverse Sentence Pairs: AugSBERT uses data augmentation to generate diverse sentence pairs.

It reuses individual sentences from the existing labeled sentence pairs (gold training set) and recombines them to create new sentence pairs.

Sentence Pair Sampling: AugSBERT employs smart sampling techniques to select sentence pairs that enhance training quality.

2 weeks, 4 days ago @ thesequence.substack.com
The Sequence Opinion #480: What is GPT-o1 Actually Doing?

Created Using Midjourney. These days we can only talk about DeepSeek-R1 and the reasoning capabilities of foundation models.

The reasoning race was initially triggered by the release of GPT-o1 followed by the announcement of the upcoming release of GPT-o3.

By exploring hypotheses about how these models work internally, we can better understand their mechanisms and the breakthroughs they represent.

This essay delves into three critical aspects of these models: reasoning hypothesis search, program synthesis, and the innovative reinforcement learning techniques introduced by DeepSeek-R1.

Reasoning Hypothesis Search: Structured Problem Solving

2 weeks, 5 days ago @ thesequence.substack.com
Synced Review
latest post 1 month, 2 weeks ago
Automating Artificial Life Discovery: The Power of Foundation Models

The recent Nobel Prize for groundbreaking advancements in protein discovery underscores the transformative potential of foundation models…Continue reading on SyncedReview »

1 month, 2 weeks ago @ medium.com
Llama 3 Meets MoE: Pioneering Low-Cost High-Performance AI

Continue reading on SyncedReview »

1 month, 3 weeks ago @ medium.com
DeepMind's JetFormer: Unified Multimodal Models Without Modelling Constraints

Recent advancements in training large multimodal models have been driven by efforts to eliminate modeling constraints and unify…Continue reading on SyncedReview »

1 month, 3 weeks ago @ medium.com
NVIDIA's nGPT: Revolutionizing Transformers with Hypersphere Representation

The Transformer architecture, introduced by Vaswani et al. in 2017, serves as the backbone of contemporary language models. Over the years…Continue reading on SyncedReview »

1 month, 3 weeks ago @ medium.com
From Token to Conceptual: Meta Introduces Large Concept Models in Multilingual AI

Large Language Models (LLMs) have become indispensable tools for diverse natural language processing (NLP) tasks. Traditional LLMs operate…Continue reading on SyncedReview »

2 months ago @ medium.com
NVIDIA's Hybrid: Combining Attention and State Space Models for Breakthrough Performance of Small…

Language models (LMs) based on transformers have become the gold standard in natural language processing, thanks to their exceptional…Continue reading on SyncedReview »

2 months ago @ medium.com
From Response to Query: The Power of Reverse Thinking in Language Models

Continue reading on SyncedReview »

2 months, 1 week ago @ medium.com
Yann LeCun Team's New Research: Revolutionizing Visual Navigation with Navigation World Models

Navigation is a fundamental skill for any visually-capable organism, serving as a critical tool for survival. It enables agents to locate…Continue reading on SyncedReview »

2 months, 1 week ago @ medium.com
The Future of Vision AI: How Apple's AIMV2 Leverages Images and Text to Lead the Pack

The landscape of vision model pre-training has undergone significant evolution, especially with the rise of Large Language Models (LLMs)…Continue reading on SyncedReview »

2 months, 1 week ago @ medium.com
Redefining Music AI: The Power of Sony's SoniDo as a Versatile Foundation Model

A foundation model refers to a pre-trained model developed on extensive datasets, designed to be versatile and adaptable for a range of…Continue reading on SyncedReview »

2 months, 2 weeks ago @ medium.com
DeepMind's Socratic Learning with Language Games: The Path to Self-Improving Superintelligence

Continue reading on SyncedReview »

2 months, 3 weeks ago @ medium.com
Revolutionizing AI on a Budget: Apple's Roadmap for Small Language Models Training Success

While large language models (LLMs) dominate the AI landscape, Small-scale Large Language Models (SLMs) are gaining traction as…Continue reading on SyncedReview »

2 months, 3 weeks ago @ medium.com
Redefines Consistency Models: OpenAI's TrigFlow Narrows FID Gap to 10% with Efficient Two-Step…

Consistency models (CMs) are a cutting-edge class of diffusion-based generative models designed for rapid and efficient sampling. However…Continue reading on SyncedReview »

2 months, 3 weeks ago @ medium.com
Precision in Pixels: NVIDIA's Edify Image Model Combines High Quality with Unmatched Control

The field of text-to-image synthesis has advanced rapidly, with state-of-the-art models now generating highly realistic and diverse images…Continue reading on SyncedReview »

2 months, 3 weeks ago @ medium.com
Meta's Dualformer: Bridging Fast and Slow Thinking in Transformers for Superior AI Reasoning

In cognitive science, human thought processes are commonly divided into two systems: the fast, intuitive System 1 and the slower…Continue reading on SyncedReview »

3 months ago @ medium.com
📓 Cool Blogs
ODS.ai Habr
latest post 1 month, 2 weeks ago
Creating Memories: Mastering FLUX, LoRA, and ComfyUI

Such models can be trained from scratch, but that is expensive: you need a GPU cluster and a lot of data.

In the text-to-image domain there are open models, like Stable Diffusion, Kandinsky, and FLUX, and closed ones, like DALL-E. An open model can be fine-tuned in a variety of ways.
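One popular fine-tuning method, LoRA (from the title), freezes the pretrained weights and learns a low-rank additive update. A toy numpy sketch of the idea (all dimensions and values invented):

```python
import numpy as np

# Toy LoRA update: instead of fine-tuning the full weight matrix W,
# train a low-rank delta B @ A with r much smaller than d_out, d_in.
rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4

W = rng.normal(size=(d_out, d_in))          # frozen pretrained weights
A = rng.normal(size=(r, d_in)) * 0.01       # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection (init 0)

def forward(x, B, A):
    # LoRA forward pass: frozen base path plus low-rank adapter path
    return W @ x + B @ (A @ x)

x = rng.normal(size=d_in)
# At initialization B = 0, so the adapter changes nothing:
assert np.allclose(forward(x, B, A), W @ x)

# Trainable parameters drop from d_out*d_in to r*(d_out + d_in)
full, lora = d_out * d_in, r * (d_out + d_in)
print(full, lora)
```

This is why LoRA fine-tuning fits on a single consumer GPU: only the small A and B matrices receive gradients.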

Boris Strugatsky. A caveat: for figures like the Strugatskys or Brodsky there are very few high-quality photographs, but you don't need many.

A phrase works, too.

Vladimir Surdin, Alexey Semikhatov: videos of their lectures can be found everywhere; a good place to start is the "Вселенная плюс" channel on YouTube and Telegram.

1 month, 2 weeks ago @ habr.com
How Neural Networks, RL, and Bayesian Optimization Came to Be Used at Charged-Particle Accelerators

One of them is maintaining a stable orbit of the particle beam (the trajectory along which it moves), which is critical for the accuracy of experiments.

Incidentally, here is our paper on how seismic vibrations will affect the beam orbit at SKIF: Beam Stability.

Classical approaches to beam-orbit stabilization in accelerators: orbit stabilization relies on beam position monitors (BPMs) and magnetic correctors.
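The classical scheme with BPMs and correctors is, at its core, a linear-algebra problem: the measured orbit responds approximately linearly to corrector kicks, so the correction is a pseudo-inverse solve. A toy numpy sketch (all numbers invented):

```python
import numpy as np

# Linear orbit model: orbit readings x at the BPMs respond to corrector
# kicks theta via a response matrix R, x = R @ theta. The correcting
# kicks are then theta = -pinv(R) @ x, where pinv is computed via SVD.
rng = np.random.default_rng(1)
n_bpm, n_corr = 8, 6

R = rng.normal(size=(n_bpm, n_corr))        # measured response matrix
x_measured = rng.normal(size=n_bpm) * 1e-3  # orbit distortion at BPMs

# numpy's pinv uses SVD and truncates tiny singular values
theta = -np.linalg.pinv(R) @ x_measured

# Residual orbit after applying the kicks: the least-squares minimum
residual = x_measured + R @ theta
print(np.linalg.norm(residual), np.linalg.norm(x_measured))
```

RL and Bayesian approaches compete with exactly this baseline, which is why the article compares against SVD.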

According to the authors, the advantages are: faster orbit correction and higher accuracy compared with classical methods such as SVD.

The agent's task: automatically restore the charged-particle beam orbit within a limited time and with a minimal num…

1 month, 4 weeks ago @ habr.com
o1: Why OpenAI's New GPT Is Not Hype but a Shift to a New Paradigm in AI

In this article we will work out what the new GPT o1 has learned and how this will affect the further evolution of AI.

The company claims that, for them, resetting the model line's counter back to one marks a transition to a new paradigm, and that this network demonstrates an entirely new level of AI capability.

Forums and Twitter were full of discussion, anticipation, and hype, against which some people's expectations soared sky-high.

Bloomberg reported that, during an internal demonstration, OpenAI presented a five-level framework for tracking progress toward building AI.

However, at the GPT-5 level the gain in capability could be quite different (for better or for worse).

5 months ago @ habr.com
Big Black Boxes: What Do We Know About How Neural Networks "Think"?

In both cases the explanation of an action is disconnected from the real motive for doing it, and in both a fake (but plausible-sounding) explanation of the reasons is produced.

It's just not taken seriously yet, since LLMs are not widespread and are not becoming the core of decision-making business processes.

The same prompt-plus-response text is fed into the model, and the probability of producing exactly that response given the fixed prompt is evaluated.
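The scoring step just described can be sketched with toy numbers: the model's per-token log-probabilities for a fixed prompt are summed to score each candidate response (the values below are made up, not from any real model):

```python
import math

# A model assigns each response token a log-probability given the fixed
# prompt; the response score is their sum:
# log P(response | prompt) = sum of per-token log-probs.
def sequence_logprob(token_logprobs):
    return sum(token_logprobs)

# Two candidate responses to the same prompt (hypothetical token scores):
response_a = [-0.2, -0.1, -0.3]
response_b = [-1.5, -2.0, -0.9]

p_a = math.exp(sequence_logprob(response_a))
p_b = math.exp(sequence_logprob(response_b))
print(p_a > p_b)  # the model considers response A far more likely
```

Comparing such scores across candidate responses is how researchers probe which behaviors a model "prefers" without relying on its own explanations.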

This includes a desire to keep existing/living, an unwillingness to die, and musings about emotions and control.

Because of abstraction, because of generalization, because that is exactly what we value these models for.

5 months, 1 week ago @ habr.com
How to Set Up an A/B Testing Process on a Shoestring

In it, the authors identified four maturity stages; roughly, companies can be grouped by how often they launch experiments: at the Crawl stage a company runs an experiment about once a month (roughly 10 experiments a year); at Walk, once a week (roughly 50 a year); at Run, daily (roughly 250 a year); at Fly, more than 1,000 experiments a year.

A sharper distinction rests on several important company traits, such as: having an experimentation-platform team, automated metric computation, how widespread experimentation is across the company, how self-sufficient product teams are in running experiments, the impact of experiments on the company, and more.
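Whatever the maturity stage, each individual experiment ultimately reduces to a significance test. A minimal two-proportion z-test using only the Python standard library (the conversion counts are invented):

```python
from statistics import NormalDist

# Two-proportion z-test: did variant B convert better than variant A?
def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Toy experiment: 2.0% vs 2.6% conversion on 10k users per arm
z, p = two_proportion_ztest(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
print(z, p)
```

At a Fly-stage cadence of 1,000+ experiments a year, this exact computation is what the experimentation platform automates.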

Точка …

6 months, 1 week ago @ habr.com
An Introduction to MLflow

MLflow Experiments and MLflow Runs are the basic abstractions for structuring a project.

mlflow run mlproject --entry-point hyperparameters-tuning --env-manager conda --experiment-name Cancer_Classification --run-name Hyperparameters_Search -P n-trials=10

Let's look at the results in the MLflow UI.

artifact_path: model
flavors:
  python_function:
    data: model.xgb
    env:
      conda: conda.yaml
      virtualenv: python_env.yaml
    loader_module: mlflow.xgboost
    python_version: 3.11.4
  xgboost:
    code: null
    data: model.xgb
    model_class: xgboost.sklearn.XGBClassifier
    model_format: xgb
    xgb_version: 2.0.3
mlflow_version: 2.14.2
model_size_bytes: 35040
model_uuid: 516954aae7c94e91adeed9df76cb405…
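For context, a hypothetical MLproject file consistent with the run command above might look like this (the entry-point name and the n-trials parameter come from the command; the script name tune.py and everything else are illustrative):

```yaml
name: mlproject

conda_env: conda.yaml

entry_points:
  hyperparameters-tuning:
    parameters:
      n-trials: {type: int, default: 10}
    command: "python tune.py --n-trials {n-trials}"
```

`mlflow run` resolves the entry point, builds the conda environment, and substitutes `-P` parameters into the command template.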

6 months, 3 weeks ago @ habr.com
48 Interviews Away from an Offer at Google

How do you define that, and how do you predict it?

- Yes... oops... right, they already determine the target feature. I didn't recover in time, and the feedback came back that I don't understand the difference between ordinary features (a.k.a.

No, haven't done that, only DDP?

NVIDIA is looking for unicorns who are strong both in research and in engineering.

The position was in London, and in the end they hired a former colleague of mine: he is already in London and has a bit more experience with recommender systems, so he's clearly a better fit.

7 months, 1 week ago @ habr.com
ChatGPT + YandexGPT API = LOVE. Part 1

ChatGPT 4 was significantly improved over ChatGPT 3.5, which makes it a more powerful tool.

You, too, have to learn: learn to build a working relationship with ChatGPT, learn to communicate with it.

Remember, you can always ask about any line in a ChatGPT response and, in most cases, get an exhaustive answer that will most likely teach you something new.

Here are a few ideas: open a new chat with ChatGPT and send your request there in a different form, ideally with new details.

And that is why, when something can't be done with ChatGPT on the first try within 2-5 minutes, indignation follows: "Well, how come?!"

9 months, 1 week ago @ habr.com
Machine Learning Mastery
latest post 7 months, 1 week ago
Tips for Effectively Training Your Machine Learning Models

In machine learning projects, achieving optimal model performance requires paying attention to various steps in the training process. But before focusing on the technical aspects of model training, it is important to define the problem, understand the context, and analyze the dataset in detail. Once you have a solid grasp of the problem and data, […]

The post Tips for Effectively Training Your Machine Learning Models appeared first on MachineLearningMastery.com.

7 months, 1 week ago @ machinelearningmastery.com
Principles of Reinforcement Learning: An Introduction with Python

Reinforcement Learning (RL) is a type of machine learning. It trains an agent to make decisions by interacting with an environment. This article covers the basic concepts of RL. These include states, actions, rewards, policies, and the Markov Decision Process (MDP). By the end, you will understand how RL works. You will also learn how […]
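The concepts listed in this teaser (states, actions, rewards, an epsilon-greedy policy, and the Bellman update at the heart of an MDP) fit in a tiny tabular Q-learning example (toy corridor environment, all hyperparameters illustrative):

```python
import random

# Minimal tabular Q-learning on a 1-D corridor: start at cell 0,
# reward 1 for reaching cell 4. States, actions, rewards, and an
# epsilon-greedy policy, all in one loop.
N_STATES, ACTIONS = 5, [-1, +1]   # move left / move right
alpha, gamma, eps = 0.5, 0.9, 0.1
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
for _ in range(500):                       # episodes
    s = 0
    while s != N_STATES - 1:               # until the goal state
        # epsilon-greedy policy over the Q-table
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update (Bellman backup)
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The learned greedy policy moves right in every interior state
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
print(policy)
```

Because gamma < 1, moving toward the goal is strictly better in every state, so the greedy policy converges to "always right".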

The post Principles of Reinforcement Learning: An Introduction with Python appeared first on MachineLearningMastery.com.

7 months, 1 week ago @ machinelearningmastery.com
5 Tips for Getting Started with Deep Learning

Deep learning is a subset of machine learning that has become a cornerstone in many technological breakthroughs. At the core of deep learning, it’s a model inspired by the human brain, which we call a neural network. Contrary to the traditional machine learning model, deep learning can automatically find feature representations from data. That’s why […]

The post 5 Tips for Getting Started with Deep Learning appeared first on MachineLearningMastery.com.

7 months, 2 weeks ago @ machinelearningmastery.com
Tips for Effective Feature Engineering in Machine Learning

Feature engineering is an important step in the machine learning pipeline. It is the process of transforming data in its native format into meaningful features to help the machine learning model learn better from the data. If done right, feature engineering can significantly enhance the performance of machine learning algorithms. Beyond the basics of understanding […]

The post Tips for Effective Feature Engineering in Machine Learning appeared first on MachineLearningMastery.com.

7 months, 2 weeks ago @ machinelearningmastery.com
5 Common Mistakes in Machine Learning and How to Avoid Them

Using machine learning to solve real-world problems is exciting. But most eager beginners jump straight to model building—overlooking the fundamentals—resulting in models that aren’t very helpful. From understanding the data to choosing the best machine learning model for the problem, there are some common mistakes that beginners often tend to make. But before we go […]

The post 5 Common Mistakes in Machine Learning and How to Avoid Them appeared first on MachineLearningMastery.com.

7 months, 3 weeks ago @ machinelearningmastery.com
Stable Diffusion Project: Reviving Old Photos

Photography has been around for more than a century. There are many old photos around, and probably your family has some, too. Limited by the camera and film of the time, you may have photos of low resolution, blurry, or with folds or scratches. Restoring these old photos and making them like new ones taken […]

The post Stable Diffusion Project: Reviving Old Photos appeared first on MachineLearningMastery.com.

7 months, 3 weeks ago @ machinelearningmastery.com
The Ultimate Beginner's Guide to Docker

Today’s digital landscape has never been so diverse. Every individual and company selects their preferred tools and operating systems, creating a diverse technological system. However, this diversity often leads to compatibility issues, making it hard to ensure application performance across different environments. This is where Docker plays a key role as an indispensable tool for […]

The post The Ultimate Beginner’s Guide to Docker appeared first on MachineLearningMastery.com.

7 months, 3 weeks ago @ machinelearningmastery.com
Stable Diffusion Project: Commercial Poster

Stable Diffusion has taken the AI art world by storm, empowering users to generate stunning and imaginative visuals with just a few text prompts. This opens exciting possibilities for creatives, including crafting impactful commercial posters. In this post, we’ll delve into using Stable Diffusion to design a compelling poster for a product. After finishing this […]

The post Stable Diffusion Project: Commercial Poster appeared first on MachineLearningMastery.com.

7 months, 3 weeks ago @ machinelearningmastery.com
5 Effective Ways to Handle Imbalanced Data in Machine Learning

Introduction: Here's something that new machine learning practitioners figure out almost immediately: not all datasets are created equal. It may now seem obvious to you, but had you considered this before undertaking machine learning projects on a real-world dataset? As an example of a single class vastly outnumbering the rest, take for instance […]

The post 5 Effective Ways to Handle Imbalanced Data in Machine Learning appeared first on MachineLearningMastery.com.

7 months, 4 weeks ago @ machinelearningmastery.com
Tips for Choosing the Right Machine Learning Model for Your Data

Introduction Choosing the right machine learning model for your data is of major importance in any data science project. The model you select will have a significant impact on the insights you derive from your data, and ultimately determine the usefulness of a project. In this article, we aim to provide practical tips to help […]

The post Tips for Choosing the Right Machine Learning Model for Your Data appeared first on MachineLearningMastery.com.

8 months ago @ machinelearningmastery.com
Stable Diffusion Project: Creating Illustration

Many people write in their jobs. Not everyone is a novel writer; some write technical documentation, business plans, news articles, and even blog posts. In those writings, illustrations are not essential but often good to have. They are decorations, interpretations, or visual explanations of the text. However, you probably do not want to spend too […]

The post Stable Diffusion Project: Creating Illustration appeared first on MachineLearningMastery.com.

8 months ago @ machinelearningmastery.com
5 Free Books on Machine Learning Algorithms You Must Read

If you are a machine learning student, researcher, or practitioner, it is crucial for your career growth to have a deep understanding of how each algorithm works and the various techniques to enhance model performance. Nowadays, many individuals tend to focus solely on the code, data, and pre-trained models, often without fully comprehending the machine […]

The post 5 Free Books on Machine Learning Algorithms You Must Read appeared first on MachineLearningMastery.com.

8 months ago @ machinelearningmastery.com
Stable Diffusion Project: Word Art

Stable Diffusion is a powerful tool that helps you generate pictures. It is fun to play with the generative AI tool. But it would be useful if the tool could help you in a real job. In this post, you will see how you can leverage the power of Stable Diffusion to work on something […]

The post Stable Diffusion Project: Word Art appeared first on MachineLearningMastery.com.

8 months ago @ machinelearningmastery.com
5 Free YouTube Channels Dedicated to Machine Learning Education

As a data professional, you should also know how to build predictive models with machine learning to solve business problems. And if you’re interested in machine learning, you’re probably also looking for the best resources to get going. Well, you can always choose a self-paced online course that best aligns with your learning preferences. But […]

The post 5 Free YouTube Channels Dedicated to Machine Learning Education appeared first on MachineLearningMastery.com.

8 months ago @ machinelearningmastery.com
Tips for Choosing the Right Machine Learning Course

If you’re looking to make a career in data science, you probably know that machine learning is one of the most in-demand skills. Whether you are a beginner looking to break into the field or an experienced professional aiming to level up your expertise, selecting the right machine learning course is super important. So how […]

The post Tips for Choosing the Right Machine Learning Course appeared first on MachineLearningMastery.com.

8 months ago @ machinelearningmastery.com
ML in Production
latest post: none
Sorta Insightful
latest post 3 weeks ago
MIT Mystery Hunt 2025

This has spoilers for MIT Mystery Hunt 2025.

I enjoyed it more than their 2018 Hunt, which is commonly cited as an all-time good Mystery Hunt.

In this Mystery Hunt it was reversed, where the act of unlocking is easy but the value and difficulty of a feeder varied.

In my free time pre-Hunt, I went to Puzzled Pint, where I tried to all-brain a logic puzzle (solve it without writing anything).

I’m looking forward to solving “No Assembly Required” in Mystery Hunt 2026, a puzzle that gives you the answer for no work.

3 weeks ago @ alexirpan.com
Using AI to Get the Neopets Destruct-o-Match Avatar

If AI can be superhuman at Go, surely AI can be slightly-worse-than-experts at Destruct-o-Match if we try?

Step 0: Is Making a Destruct-o-Match AI Against Neopets Rules?

I believe the precedent is in favor of a Destruct-o-Match AI being okay.

As long as I’m the one inputting moves the Destruct-o-Match AI recommends, I should be okay.

To write a game AI, we first need to implement the rules of the game in code.

1 month, 1 week ago @ alexirpan.com
Late Takes on OpenAI o1

I realize how late this is, but I didn’t get a post out while o1 was fresh, and still feel like writing one despite it being cold.

(Also, OpenAI just announced they’re going to ship new stuff starting tomorrow so it’s now or never to say something.)

OpenAI o1 is a model release widely believed (but not confirmed) to be a post-trained version of GPT-4o.

If true, that makes this video especially useful for understanding OpenAI o1.

Which I suppose is part of why I’m talking about o1 rather than building o1.

2 months, 2 weeks ago @ alexirpan.com
Nine Years Later

I expected to fill that void with more blog writing, but that’s not what happened.

The puzzles are great though, and if that’s good enough for you, I had fun with that.

Undertale Yellow is a fantastic fan game that has been in development for 7 years and comes out feeling like a canon entry made by Toby Fox.

15837 2024-01-11-ai-timelines-2024.markdown
1939 2024-01-21-mh-2024.markdown
5076 2024-03-23-crew-battle.markdown
826 2024-04-30-puzzlehunting-201.markdown
8641 2024-07-08-tragedies-of-reality.markdown

6 months ago @ alexirpan.com
I'm Switching Into AI Safety

There’s often a conflation between the research field of AI safety and the community of AI safety.

Me thinking AI safety is important is not an endorsement for or against anything else in the broader meme space it came from.

Historically, AI safety work did not appeal to me because of how theoretical it was.

I’m aware of the arguments that most AI safety work so far has either been useless or not that different from broader AI work.

Those who care about safety a lot call this safetywashing, the stapling of “safety” to work that does not advance safety.

6 months, 2 weeks ago @ alexirpan.com
The Tragedies of Reality Are Coming for You

I would extend it: reality is complicated relative to code, and in robotics you're often pushing a messy reality into an abstraction nice enough for code to act on.

Robotics research relies on building new bridges between reality and software, but that happens outside of robotics too.

Any software that interfaces with reality will have imperfect knowledge of that reality.

However, that means all the messiness of reality is coming for a field that historically does a bad job at considering reality.

I consider the world of bits to be as much a part of reality as the world of atoms.

7 months, 2 weeks ago @ alexirpan.com
Puzzlehunting 201

Most people will have more fun if they solve puzzles than if they don’t, but you don’t have to solve puzzles quickly to have fun.

I’m still going to explain the solving strategies I’ve learned, but puzzle solving is really an activity where you learn by doing.

Puzzle solving often involves relating two parts of the puzzle together.

Search everything. Honestly, a lot of puzzle solving is about taking random parts of the puzzle and throwing them into a search engine.

Bringing This Together. To showcase these strategies together, here is a puzzle I remember speedrunning especially quickly: The Three Little Pigs from Hunt 20 2.1 Puzzle Hunt.

9 months, 3 weeks ago @ alexirpan.com
Lil'Log
last post None
inFERENCe
last post None
The Unofficial Google Data Science Blog
last post None
Off the Convex Path
last post None
Jay Alammar
last post None
fast.ai NLP
last post None
Sebastian Ruder
last post None
Andrej Karpathy blog
last post None
大トロ
last post None
🔬 Science
Papers With Code
last post 1 day, 1 hour ago
/thomashubner/ Package Bids in Combinatorial Electricity Auctions: Selection, Welfare Losses, and Alternatives

In combinatorial auctions, participants can bid on packages of goods rather than just individual items.

Electricity auctions, where participants trade power for future periods, are often combinatorial due to intertemporal constraints governing production and consumption.

Unlike the parametric bid formats often employed in US power auctions, XOR package bids are technology-agnostic, making them particularly suitable for flexible, demand-side participants.

However, to mitigate excessive computational complexity, auctioneers must limit the number of package bids that agents are allowed to submit.

To address this issue, we propose decision support algorithms that optimize package bid selection,…

1 day, 1 hour ago @ paperswithcode.com
/coleygroup/ DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra

Mass spectrometry plays a fundamental role in elucidating the structures of unknown molecules and subsequent scientific discoveries.

One formulation of the structure elucidation task is the conditional de novo generation of molecular structure given a mass spectrum.

Toward a more accurate and efficient scientific discovery pipeline for small molecules, we present DiffMS, a formula-restricted encoder-decoder generative network that achieves state-of-the-art performance on this task.

Extensive experiments on established benchmarks show that DiffMS outperforms existing models on de novo molecule generation.

We provide several ablations to demonstrate the effectiveness of …

1 day, 1 hour ago @ paperswithcode.com
/wenlun-zhang/ LiSA: Leveraging Link Recommender to Attack Graph Neural Networks via Subgraph Injection

Graph Neural Networks (GNNs) have demonstrated remarkable proficiency in modeling data with graph structures, yet recent research reveals their susceptibility to adversarial attacks.

Traditional attack methodologies, which rely on manipulating the original graph or adding links to artificially created nodes, often prove impractical in real-world settings.

This paper introduces a novel adversarial scenario involving the injection of an isolated subgraph to deceive both the link recommender and the node classifier within a GNN system.

Specifically, the link recommender is misled to propose links between targeted victim nodes and the subgraph, encouraging users to unintentionally establish co…

1 day, 1 hour ago @ paperswithcode.com
/modanesh/ Learning to Coordinate with Experts

However, querying experts is often costly, necessitating the development of agents that can efficiently request and utilize expert guidance.

We consider a challenging practical setting in which an agent does not interact with experts during training but must adapt to novel environmental changes and expert interventions at test time.

To facilitate empirical research, we introduce YRC-Bench, an open-source benchmark featuring diverse domains.

YRC-Bench provides a standardized Gym-like API, simulated experts, evaluation pipeline, and implementation of competitive baselines.

Towards tackling the YRC problem, we propose a novel validation approach and investigate the performance of various learn…

1 day, 1 hour ago @ paperswithcode.com
/azureleon1/ Diffusion Models for Molecules: A Survey of Methods and Tasks

Generative tasks about molecules, including but not limited to molecule generation, are crucial for drug discovery and material design, and have consistently attracted significant attention.

In recent years, diffusion models have emerged as an impressive class of deep generative models, sparking extensive research and leading to numerous studies on their application to molecular generative tasks.

Particularly, due to the diversity of diffusion model formulations, molecular data modalities, and generative task types, the research landscape is challenging to navigate, hindering understanding and limiting the area's growth.

To address this, this paper conducts a comprehensive survey of diffusi…

1 day, 1 hour ago @ paperswithcode.com
/katherinewu312/ LP-LM: No Hallucinations in Question Answering with Logic Programming

Large language models (LLMs) are able to generate human-like responses to user queries.

LP-LM generates the most probable constituency parse tree, along with a corresponding Prolog term, for an input question via Prolog definite clause grammar (DCG) parsing.

The term is then executed against a KB of natural language sentences also represented as Prolog terms for question answering.

By leveraging DCG and tabling, LP-LM runs in linear time in the size of input sentences for sufficiently many grammar rules.

Performing experiments comparing LP-LM with current well-known LLMs in accuracy, we show that LLMs hallucinate on even simple questions, unlike LP-LM.
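The logic-programming idea behind this can be sketched with a tiny backward-chaining resolver over Prolog-style facts and rules. This is an invented toy for intuition, not LP-LM's DCG-based implementation:

```python
import itertools

counter = itertools.count()

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, s):
    # Follow variable bindings to their final value.
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def rename(term, suffix):
    # Standardize variables apart for each rule use.
    if is_var(term):
        return term + suffix
    if isinstance(term, tuple):
        return tuple(rename(t, suffix) for t in term)
    return term

def solve(goals, kb, s):
    # Backward chaining: prove each goal against facts and rules.
    if not goals:
        yield s
        return
    first, rest = goals[0], goals[1:]
    for head, body in kb:
        suffix = "_" + str(next(counter))
        head2 = rename(head, suffix)
        body2 = [rename(g, suffix) for g in body]
        s2 = unify(first, head2, dict(s))
        if s2 is not None:
            yield from solve(body2 + list(rest), kb, s2)

# KB of facts (empty bodies) and rules, Prolog-style.
kb = [
    (("parent", "tom", "bob"), []),
    (("parent", "bob", "ann"), []),
    (("ancestor", "?x", "?y"), [("parent", "?x", "?y")]),
    (("ancestor", "?x", "?y"), [("parent", "?x", "?z"), ("ancestor", "?z", "?y")]),
]

answers = [walk("?who", s) for s in solve([("ancestor", "tom", "?who")], kb, {})]
print(answers)  # -> ['bob', 'ann']
```

Because answers are derived by deterministic deduction over the KB, the system either proves an answer or returns nothing, which is the sense in which such a pipeline cannot hallucinate.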

1 day, 1 hour ago @ paperswithcode.com
/nabu-hep/ Communicating Likelihoods with Normalising Flows

We present a machine-learning-based workflow to model an unbinned likelihood from its samples.

A key advancement over existing approaches is the validation of the learned likelihood using rigorous statistical tests of the joint distribution, such as the Kolmogorov-Smirnov test.

Our method enables the reliable communication of experimental and phenomenological likelihoods for subsequent analyses.

We demonstrate its effectiveness through three case studies in high-energy physics.

To support broader adoption, we provide an open-source reference implementation, nabu.
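The kind of statistical check mentioned above can be illustrated with a one-dimensional, one-sample Kolmogorov-Smirnov statistic. This stdlib-only sketch is for intuition; the paper validates joint, multivariate distributions, which this does not cover:

```python
import random

def ks_statistic(samples, cdf):
    """One-sample KS statistic: the largest gap between the empirical
    CDF of the samples and the model CDF they are tested against."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)  # model CDF at the i-th order statistic
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Samples genuinely drawn from Uniform(0, 1) should track the identity
# CDF, so the statistic stays small for large n.
random.seed(0)
samples = [random.random() for _ in range(1000)]
print(ks_statistic(samples, lambda x: x))  # small, roughly O(1/sqrt(n))
```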

1 day, 1 hour ago @ paperswithcode.com
/airtownapp/ Use of Air Quality Sensor Network Data for Real-time Pollution-Aware POI Suggestion

This demo paper presents AirSense-R, a privacy-preserving mobile application that provides real-time, pollution-aware recommendations for points of interest (POIs) in urban environments.

By combining real-time air quality monitoring data with user preferences, the proposed system aims to help users make health-conscious decisions about the locations they visit.

The application utilizes collaborative filtering for personalized suggestions and federated learning for privacy protection, and integrates air pollutant readings from AirSENCE sensor networks in cities such as Bari, Italy, and Cork, Ireland.

Additionally, the AirSENCE prediction engine can be employed to detect anomaly readings and…

1 day, 1 hour ago @ paperswithcode.com
/xuchen0427/ Bridging Jensen Gap for Max-Min Group Fairness Optimization in Recommendation

Group max-min fairness (MMF) is commonly used in fairness-aware recommender systems (RS) as an optimization objective, as it aims to protect marginalized item groups and ensures a fair competition platform.

This nonlinearity introduces a Jensen gap between the model's convergence point and the optimal point when mini-batch sampling is applied.

Both theoretical and empirical studies show that as the mini-batch size decreases and the group size increases, the Jensen gap will widen accordingly.

To overcome these limitations, we first theoretically demonstrate that the MMF-constrained objective can be essentially reformulated as a group-weighted optimization objective.

Then we present …

1 day, 1 hour ago @ paperswithcode.com
/ia-gu/ Vision-Language In-Context Learning Driven Few-Shot Visual Inspection Model

We propose a general visual inspection model using a Vision-Language Model (VLM) with few-shot images of non-defective or defective products, along with explanatory texts that serve as inspection criteria.

Although existing VLMs exhibit high performance across various tasks, they are not trained on specific tasks such as visual inspection.

Thus, we construct a dataset consisting of diverse images of non-defective and defective products collected from the web, along with unified formatted output text, and fine-tune the VLM.

For new products, our method employs In-Context Learning, which allows the model to perform inspections with an example of non-defective or defective image and the corresponding e…

1 day, 1 hour ago @ paperswithcode.com
/gjgjh/ PTZ-Calib: Robust Pan-Tilt-Zoom Camera Calibration

In this paper, we present PTZ-Calib, a robust two-stage PTZ camera calibration method that efficiently and accurately estimates camera parameters for arbitrary viewpoints.

Our method includes an offline and an online stage.

In the offline stage, we first uniformly select a set of reference images that sufficiently overlap to encompass a complete 360° view.

Additionally, for practical application, we can further optimize camera parameters and align them with the geographic coordinate system using extra global reference 3D information.

In the online stage, we formulate the calibration of any new viewpoints as a relocalization problem.

1 day, 1 hour ago @ paperswithcode.com
/ccccai239/ Pixel-Level Reasoning Segmentation via Multi-turn Conversations

Our work tackles this issue by introducing a novel task, Pixel-level Reasoning Segmentation (Pixel-level RS) based on multi-turn conversations, tracking evolving user intent via multi-turn interactions for fine-grained segmentation.

To establish a benchmark for this novel task, we build a Pixel-level ReasonIng Segmentation Dataset Based on Multi-Turn Conversations (PRIST), comprising 24k utterances from 8.3k multi-turn conversational scenarios with segmentation targets.

Building on PRIST, we further propose MIRAS, a Multi-turn Interactive ReAsoning Segmentation framework that integrates pixel-level segmentation with robust multi-turn conversation understanding, generating pixel-grounded explana…

1 day, 1 hour ago @ paperswithcode.com
/remigenet/ SigGate: Enhancing Recurrent Neural Networks with Signature-Based Gating Mechanisms

In this paper, we propose a novel approach that enhances recurrent neural networks (RNNs) by incorporating path signatures into their gating mechanisms.

Our method modifies both Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures by replacing their forget and reset gates, respectively, with learnable path signatures.

These signatures, which capture the geometric features of the entire path history, provide a richer context for controlling information flow through the network's memory.

This modification allows the networks to make memory decisions based on the full historical context rather than just the current input and state.

By leveraging path signatures in recurre…
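For intuition, the level-1 and level-2 terms of a path signature can be accumulated segment by segment with Chen's identity. This toy sketch only illustrates what a signature computes over a piecewise-linear path; it is not the paper's gating code:

```python
def signature_depth2(path):
    """Level-1 and level-2 signature terms of a piecewise-linear path in R^d,
    accumulated segment by segment via Chen's identity."""
    d = len(path[0])
    s1 = [0.0] * d                      # level 1: total increment
    s2 = [[0.0] * d for _ in range(d)]  # level 2: iterated integrals
    for p, q in zip(path, path[1:]):
        inc = [qi - pi for pi, qi in zip(p, q)]
        for i in range(d):
            for j in range(d):
                # Chen: S(concat)_ij = S_ij + s1_i * inc_j + inc_i * inc_j / 2
                s2[i][j] += s1[i] * inc[j] + inc[i] * inc[j] / 2.0
        for i in range(d):
            s1[i] += inc[i]
    return s1, s2

# An L-shaped path (right, then up): the antisymmetric part of level 2
# is the Levy area, which distinguishes it from traversing up-then-right.
s1, s2 = signature_depth2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
print(s1, s2)  # -> [1.0, 1.0] [[0.5, 1.0], [0.0, 0.5]]
```

The level-2 terms depend on the order of increments, which is exactly the "geometric features of the entire path history" that a plain sum of inputs cannot capture.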

1 day, 1 hour ago @ paperswithcode.com
/aix-group/ This looks like what? Challenges and Future Research Directions for Part-Prototype Models

The growing interest in eXplainable Artificial Intelligence (XAI) has prompted research into models with built-in interpretability, the most prominent of which are part-prototype models.

Part-Prototype Models (PPMs) make decisions by comparing an input image to a set of learned prototypes, providing human-understandable explanations in the form of "this looks like that".

Despite their inherent interpretability, PPMs are not yet considered a valuable alternative to post-hoc models.

In this survey, we investigate the reasons for this and provide directions for future research.

We provide ideas for future research in five broad directions: improving predictive performance, developing novel a…

1 day, 1 hour ago @ paperswithcode.com
/visionxlab/ Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection

Accurately estimating the orientation of visual objects with compact rotated bounding boxes (RBoxes) has become a prominent demand, which challenges existing object detection paradigms that only use horizontal bounding boxes (HBoxes).

Meanwhile, some existing datasets with oriented objects are already annotated with horizontal boxes or even single points.

It becomes attractive yet remains open for effectively utilizing weaker single point and horizontal annotations to train an oriented object detector (OOD).

We develop Wholly-WOOD, a weakly-supervised OOD framework, capable of wholly leveraging various labeling forms (Points, HBoxes, RBoxes, and their combination) in a unified fashion.

By o…

1 day, 1 hour ago @ paperswithcode.com
Papers With Code
last post 1 day, 1 hour ago
/needylove/ Improving Deep Regression with Tightness

For deep regression, preserving the ordinality of the targets with respect to the feature representation improves performance across various tasks.

This work reveals that preserving ordinality reduces the conditional entropy $H(Z|Y)$ of representation $Z$ conditional on the target $Y$.

However, our findings reveal that typical regression losses do little to reduce $H(Z|Y)$, even though it is vital for generalization performance.

With this motivation, we introduce an optimal transport-based regularizer to preserve the similarity relationships of targets in the feature space to reduce $H(Z|Y)$.

Experiments on three real-world regression tasks verify the effectiveness of our strategies to impr…
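For reference, the conditional entropy in the excerpt is the standard information-theoretic quantity (a definition, not a result from the paper):

```latex
H(Z \mid Y) = - \mathbb{E}_{y \sim p(Y)} \int p(z \mid y) \log p(z \mid y) \, dz
```

Driving H(Z|Y) down means the representation Z carries little variability that is unexplained by the target Y.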

1 day, 1 hour ago @ paperswithcode.com
/razvandu/ CopySpec: Accelerating LLMs with Speculative Copy-and-Paste Without Compromising Quality

We introduce CopySpec, an innovative technique designed to tackle the inefficiencies LLMs face when generating responses that closely resemble previous outputs.

CopySpec identifies repeated sequences in the model's chat history and speculates that the same tokens will follow, enabling seamless copying without compromising output quality or requiring additional GPU memory.

Our results demonstrate significant speed-ups: up to 2.35x on CNN/DM, 3.08x on the second turn of select MT-Redundant categories, and 2.66x on the third turn of GSM-8K's self-correction tasks.

Moreover, we show that CopySpec integrates seamlessly with speculative decoding, yielding an average 49% additional speed-up over s…
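The copy-speculation idea can be sketched as a longest-suffix match against the chat history, with the continuation proposed as a draft for the usual speculative verifier to accept or reject. Function and parameter names here are invented for illustration; this is not the paper's implementation:

```python
def speculate_copy(history, recent, max_draft=8, min_match=3):
    """Toy copy-speculation: if the tail of `recent` already occurred in
    `history`, propose the tokens that followed that occurrence as a draft."""
    for k in range(len(recent), min_match - 1, -1):
        tail = recent[-k:]
        # Scan from the most recent occurrence backwards.
        for i in range(len(history) - k, -1, -1):
            if history[i:i + k] == tail:
                return history[i + k:i + k + max_draft]
    return []

history = "the quick brown fox jumps over the lazy dog".split()
print(speculate_copy(history, ["quick", "brown", "fox"]))
# -> ['jumps', 'over', 'the', 'lazy', 'dog']
```

Because the draft is only accepted where the verifier model agrees token by token, output quality is unchanged; only decoding speed improves when the copy guess is right.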

1 day, 1 hour ago @ paperswithcode.com
/intellabs/ SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models

In the rapidly evolving field of Natural Language Processing, Large Language Models (LLMs) are tasked with increasingly complex reasoning challenges.

Traditional methods like chain-of-thought prompting have shown promise but often fall short in fully leveraging a model's reasoning capabilities.

This paper introduces SQuARE (Sequential Question Answering Reasoning Engine), a novel prompting technique designed to improve reasoning through a self-interrogation paradigm.

Building upon CoT frameworks, SQuARE prompts models to generate and resolve multiple auxiliary questions before tackling the main query, promoting a more thorough exploration of various aspects of a topic.

Our expansive evaluat…

1 day, 1 hour ago @ paperswithcode.com
/pfnet-research/ A Judge-free LLM Open-ended Generation Benchmark Based on the Distributional Hypothesis

Evaluating the open-ended text generation of large language models (LLMs) is challenging because of the lack of a clear ground truth and the high cost of human or LLM-based assessments.

We propose a novel benchmark that evaluates LLMs using n-gram statistics and rules, without relying on human judgement or LLM-as-a-judge approaches.

Using 50 question and reference answer sets, we introduce three new metrics based on n-grams and rules: Fluency, Truthfulness, and Helpfulness.

Our benchmark strongly correlates with GPT-4o-based evaluations while requiring significantly fewer computational resources, demonstrating its effectiveness as a scalable alternative for assessing LLMs' open-ended genera…
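A minimal flavor of judge-free, n-gram-based scoring is a clipped n-gram precision against a reference answer. This is a generic sketch of the idea, not the paper's Fluency, Truthfulness, or Helpfulness metrics:

```python
from collections import Counter

def ngram_counts(tokens, n):
    # Multiset of all n-grams in the token sequence.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_overlap(candidate, reference, n=2):
    """Clipped n-gram precision: the fraction of candidate n-grams that
    also occur in the reference, with counts clipped as in BLEU."""
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(reference, n)
    if not cand:
        return 0.0
    hits = sum(min(c, ref[g]) for g, c in cand.items())
    return hits / sum(cand.values())

score = ngram_overlap("the cat sat on the mat".split(),
                      "the cat sat on a mat".split())
print(score)  # 3 of 5 candidate bigrams appear in the reference -> 0.6
```

Such statistics are cheap and deterministic, which is what makes them attractive as a scalable alternative to LLM-as-a-judge evaluation.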

1 day, 1 hour ago @ paperswithcode.com
/anpc849/ QMaxViT-Unet+: A Query-Based MaxViT-Unet with Edge Enhancement for Scribble-Supervised Segmentation of Medical Images

The deployment of advanced deep learning models for medical image segmentation is often constrained by the requirement for extensively annotated datasets.

Building on this approach, we propose QMaxViT-Unet+, a novel framework for scribble-supervised medical image segmentation.

Additionally, our approach integrates a query-based Transformer decoder to refine features and an edge enhancement module to compensate for the limited boundary information in the scribble label.

We evaluate the proposed QMaxViT-Unet+ on four public datasets focused on cardiac structures, colorectal polyps, and breast cancer: ACDC, MS-CMRSeg, SUN-SEG, and BUSI.

This makes it ideal for medical image analysis, where hig…

1 day, 1 hour ago @ paperswithcode.com
/bowensu123/ Robust Anomaly Detection via Tensor Chidori Pseudoskeleton Decomposition

Anomaly detection plays a critical role in modern data-driven applications, from identifying fraudulent transactions and safeguarding network infrastructure to monitoring sensor systems for irregular patterns.

Traditional approaches, such as distance, density, or cluster-based methods, face significant challenges when applied to high dimensional tensor data, where complex interdependencies across dimensions amplify noise and computational complexity.

To address these limitations, this paper leverages Tensor Chidori pseudoskeleton decomposition within a tensor-robust principal component analysis framework to extract low Tucker rank structure while isolating sparse anomalies, ensuring robustn…

1 day, 1 hour ago @ paperswithcode.com
/rodrihgh/ CISSIR: Beam Codebooks with Self-Interference Reduction Guarantees for Integrated Sensing and Communication Beyond 5G

We propose a beam codebook design to reduce self-interference (SI) in integrated sensing and communication (ISAC) systems.

Our optimization methods, which can be applied to both tapered beamforming and phased arrays, adapt the codebooks to the SI channel such that a certain SI level is achieved.

Furthermore, we derive an upper bound on the quantization noise in terms of the achieved SI level, which provides guidelines to pose the optimization problem in order to obtain performance guarantees for sensing.

By selecting standard reference codebooks in our simulations, we show substantially improved sensing quality with little impact on 5G-NR communication.

Our proposed method is not only less …

1 day, 1 hour ago @ paperswithcode.com
/agarwalishika/ Data Valuation using Neural Networks for Efficient Instruction Fine-Tuning

Influence functions provide crucial insights into model training, but existing methods suffer from large computational costs and limited generalization.

Particularly, recent works have proposed various metrics and algorithms to calculate the influence of data using language models, which do not scale well with large models and datasets.

In this paper, we explore the use of small neural networks -- which we refer to as the InfluenceNetwork -- to estimate influence values, achieving up to 99% cost reduction.

We apply our algorithm of estimating influence values (called NN-CIFT: Neural Networks for effiCient Instruction Fine-Tuning) to the downstream task of subset selection for general instru…

1 day, 1 hour ago @ paperswithcode.com
/ai-forever/ RusCode: Russian Cultural Code Benchmark for Text-to-Image Generation

Text-to-image generation models have gained popularity among users around the world.

The lack of cultural awareness can reduce the generation quality and lead to undesirable consequences such as unintentional insult, and the spread of prejudice.

We propose a RusCode benchmark for evaluating the quality of text-to-image generation containing elements of the Russian cultural code.

To do this, we form a list of 19 categories that best represent the features of Russian visual culture.

We present the results of a human evaluation of the side-by-side comparison of Russian visual concepts representations using popular generative models.

2 days, 20 hours ago @ paperswithcode.com
/liangwu2019/ EIQP: Execution-time-certified and Infeasibility-detecting QP Solver

Solving real-time quadratic programming (QP) is a ubiquitous task in control engineering, such as in model predictive control and control barrier function-based QP.

In such real-time scenarios, certifying that the employed QP algorithm can either return a solution within a predefined level of optimality or detect QP infeasibility before the predefined sampling time is a pressing requirement.

This article considers convex QP and adopts its homogeneous formulation to achieve infeasibility detection.

Exploiting this homogeneous formulation, this article proposes a novel infeasible interior-point method (IPM) algorithm with the best theoretical $O(\sqrt{n})$ iteration complexity that feasible I…

2 days, 20 hours ago @ paperswithcode.com
/getterupper/ Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving

Understanding world dynamics is crucial for planning in autonomous driving.

Recent methods attempt to achieve this by learning a 3D occupancy world model that forecasts future surrounding scenes based on current observation.

However, 3D occupancy labels are still required to produce promising results.

Considering the high annotation cost for 3D outdoor scenes, we propose a semi-supervised vision-centric 3D occupancy world model, PreWorld, to leverage the potential of 2D labels through a novel two-stage training paradigm: the self-supervised pre-training stage and the fully-supervised fine-tuning stage.

Extensive experiments on the nuScenes dataset validate the effectiveness and scalability …

2 days, 20 hours ago @ paperswithcode.com
/aholovchak/ Distributional Instrumental Variable Method

The instrumental variable (IV) approach is commonly used to infer causal effects in the presence of unmeasured confounding.

Conventional IV models commonly make the additive noise assumption, which is hard to ensure in practice, but also typically lack flexibility if the causal effects are complex.

Further, the vast majority of existing methods aim to estimate mean causal effects only, while a few others focus on quantile effects.

We propose a novel method called distributional instrumental variables (DIV), which leverages generative modelling in a nonlinear instrumental variable setting.

We also apply DIV to a single-cell data set, where we study the generalizability and stab…

2 days, 20 hours ago @ paperswithcode.com
/SeffiCohen/ Forget What You Know about LLMs Evaluations - LLMs are Like a Chameleon

Large language models (LLMs) often appear to excel on public benchmarks, but these high scores may mask an overreliance on dataset-specific surface cues rather than true language understanding.

We introduce the Chameleon Benchmark Overfit Detector (C-BOD), a meta-evaluation framework that systematically distorts benchmark prompts via a parametric transformation and detects overfitting of LLMs.

By rephrasing inputs while preserving their semantic content and labels, C-BOD exposes whether a model's performance is driven by memorized patterns.

Notably, models with higher baseline accuracy exhibit larger performance differences under perturbation, and larger LLMs tend to be more sensitive to re…
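The detection recipe the abstract describes can be sketched as an accuracy delta between original and rephrased prompts. A minimal illustration (all names here are hypothetical; C-BOD's actual transformation is parametric and more involved):

```python
def overfit_gap(model_fn, examples, rephrase_fn):
    # Accuracy on original prompts minus accuracy on semantically
    # equivalent rephrasings; a large positive gap flags reliance on
    # memorized surface patterns rather than understanding.
    n = len(examples)
    orig = sum(model_fn(p) == y for p, y in examples) / n
    pert = sum(model_fn(rephrase_fn(p)) == y for p, y in examples) / n
    return orig - pert

# Toy illustration: a "model" that only knows the exact benchmark phrasing.
examples = [("2+2=?", "4"), ("3+3=?", "6")]
memorizer = lambda p: {"2+2=?": "4", "3+3=?": "6"}.get(p, "?")
gap = overfit_gap(memorizer, examples, lambda p: p.replace("=?", " equals what?"))
```

A fully memorizing model scores perfectly on the originals and fails every rephrasing, yielding the maximal gap of 1.0.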

2 days, 20 hours ago @ paperswithcode.com
/dominikglandorf/ Grammar Control in Dialogue Response Generation for Language Learning Chatbots

Chatbots based on large language models offer cheap conversation practice opportunities for language learners.

However, they are hard to control for linguistic forms that correspond to learners' current needs, such as grammar.

We control grammar in chatbot conversation practice by grounding a dialogue response generation model in a pedagogical repository of grammar skills.

We comprehensively evaluate prompting, fine-tuning, and decoding strategies for grammar-controlled dialogue response generation.

Existing language learning chatbots and research on second language acquisition benefit from these affordances.

2 days, 20 hours ago @ paperswithcode.com
/ablaom/ New tools for comparing classical and neural ODE models for tumor growth

A new computational tool TumorGrowth.jl for modeling tumor growth is introduced.

The tool allows the comparison of standard textbook models, such as General Bertalanffy and Gompertz, with some newer models, including, for the first time, neural ODE models.

As an application, we revisit a human meta-study of non-small cell lung cancer and bladder cancer lesions, in patients undergoing two different treatment options, to determine if previously reported performance differences are statistically significant, and if newer, more complex models perform any better.

In a population of examples with at least four time-volume measurements available for calibration, and an average of about 6.3, our ma…

2 days, 20 hours ago @ paperswithcode.com
Papers With Code
last post 1 day, 1 hour ago
/Soumya-Mukherjee-Statistics/ Uniform Kernel Prober

Identifying features or representations of the input data that, when learned from training data, achieve low prediction error on test data across multiple prediction tasks is considered key to the success of multitask learning.

In practice, however, one faces the issues of choosing the prediction tasks and of the availability of test data from the chosen tasks when comparing the relative performance of different features.

In this work, we develop a class of pseudometrics called Uniform Kernel Prober (UKP) for comparing features or representations learned by different statistical models such as neural networks when the downstream prediction tasks involve kernel ridge regression.

The prop…
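For readers unfamiliar with the downstream setting the abstract references, a minimal kernel ridge regression sketch (the RBF kernel and hyperparameters below are illustrative assumptions, not details from the paper):

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Pairwise squared distances -> Gaussian (RBF) kernel matrix.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def krr_fit_predict(X_train, y_train, X_test, lam=1e-3, gamma=50.0):
    # Kernel ridge regression: alpha = (K + lam*I)^{-1} y,
    # prediction f(x*) = k(x*, X_train) @ alpha.
    K = rbf_kernel(X_train, X_train, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_train)), y_train)
    return rbf_kernel(X_test, X_train, gamma) @ alpha

X = np.linspace(0.0, 1.0, 20)[:, None]
y = np.sin(2 * np.pi * X[:, 0])
pred = krr_fit_predict(X, y, X)
```

UKP-style comparisons probe how different learned representations behave when plugged into such a predictor across tasks.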

2 days, 20 hours ago @ paperswithcode.com
/ZombaSY/ Color Universal Design Neural Network for the Color Vision Deficiencies

However, such information becomes unrecognizable when a color that appears distorted to individuals with color vision deficiency meets an adjacent object.

The aim of this paper is to propose a color universal design network, called CUD-Net, that generates images that are visually understandable by individuals with color deficiency.

To generate CUD images for color deficiencies, we follow a four-step process.

Second, we expand the input image information through pre-processing that is specialized for color deficiency vision.

Our approach is able to produce high-quality CUD images that maintain color and contrast stability.

2 days, 20 hours ago @ paperswithcode.com
/lemuelpuglisi/ Brain Latent Progression: Individual-based Spatiotemporal Disease Progression on 3D Brain MRIs via Latent Diffusion

The growing availability of longitudinal Magnetic Resonance Imaging (MRI) datasets has facilitated Artificial Intelligence (AI)-driven modeling of disease progression, making it possible to predict future medical scans for individual patients.

However, despite significant advancements in AI, current methods continue to face challenges including achieving patient-specific individualization, ensuring spatiotemporal consistency, efficiently utilizing longitudinal data, and managing the substantial memory demands of 3D scans.

To address these challenges, we propose Brain Latent Progression (BrLP), a novel spatiotemporal model designed to predict individual-level disease progression in 3D brain …

2 days, 20 hours ago @ paperswithcode.com
/vl2g/ Composite Sketch+Text Queries for Retrieving Objects with Elusive Names and Complex Interactions

Further, users may want to search for such elusive objects with difficult-to-sketch interactions, e.g., numbat digging in the ground.

In such common but complex situations, users desire a search interface that accepts composite multimodal queries comprising hand-drawn sketches of difficult-to-name but easy-to-draw objects and text describing difficult-to-sketch but easy-to-verbalize object attributes or interaction with the scene.

This novel problem statement distinctly differs from the previously well-researched TBIR (text-based image retrieval) and SBIR (sketch-based image retrieval) problems.

To study this under-explored task, we curate a dataset, CSTBIR (Composite Sketch+Text Based Imag…

2 days, 20 hours ago @ paperswithcode.com
/jiangbo-shi/ ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification

Multiple instance learning (MIL)-based framework has become the mainstream for processing the whole slide image (WSI) with giga-pixel size and hierarchical image context in digital pathology.

Recently, vision language model (VLM)-based methods introduced the language prior by pre-training on large-scale pathological image-text pairs.

However, previous text prompts lack consideration of pathological prior knowledge and therefore do not substantially boost the model's performance.

Moreover, the collection of such pairs and the pre-training process are very time-consuming and resource-intensive. To solve the above problems, we propose a dual-scale vision-language multiple instance learning…

2 days, 20 hours ago @ paperswithcode.com
/ziyuey/ Uncertainty Aware Human-machine Collaboration in Camouflaged Object Detection

Camouflaged Object Detection (COD), the task of identifying objects concealed within their environments, has seen rapid growth due to its wide range of practical applications.

A key step toward developing trustworthy COD systems is the estimation and effective utilization of uncertainty.

In this work, we propose a human-machine collaboration framework for classifying the presence of camouflaged objects, leveraging the complementary strengths of computer vision (CV) models and noninvasive brain-computer interfaces (BCIs).

Analysis of the training process revealed a strong correlation between our confidence measures and precision, while an ablation study confirmed the effectiveness of the pro…

2 days, 20 hours ago @ paperswithcode.com
/mala-lab/ AnomalyGFM: Graph Foundation Model for Zero/Few-shot Anomaly Detection

Graph anomaly detection (GAD) aims to identify abnormal nodes that differ from the majority of the nodes in a graph, which has been attracting significant attention in recent years.

Existing generalist graph models have achieved remarkable success in different graph tasks but struggle to generalize to the GAD task.

To address this challenge, we propose AnomalyGFM, a GAD-oriented graph foundation model that supports zero-shot inference and few-shot prompt tuning for GAD in diverse graph datasets.

If there are few-shot labeled normal nodes available in the new graphs, AnomalyGFM can further support prompt tuning to leverage these nodes for better adaptation.

Comprehensive experiments on 11 wi…

2 days, 20 hours ago @ paperswithcode.com
/liuxiao-guan/ Redistribute Ensemble Training for Mitigating Memorization in Diffusion Models

Diffusion models, known for their tremendous ability to generate high-quality samples, have recently raised concerns due to their data memorization behavior, which poses privacy risks.

In this paper, we propose a novel method for diffusion models from the perspective of the visual modality, which is more generic and fundamental for mitigating memorization.

Directly exposing visual data to the model increases memorization risk, so we design a framework where models learn through proxy model parameters instead.

However, balancing the need to skip memorization-prone samples while maintaining sufficient training data for high-quality image generation presents a key challenge.

Moreover, we fine-tune…

2 days, 20 hours ago @ paperswithcode.com
/zaynah91124/ Spatio-temporal collaborative multiple-stream transformer network for liver lesion classification on multiple-sequence magnetic resonance imaging

Accurate identification of focal liver lesions is essential for determining the appropriate therapeutic approach in clinical practice.

Magnetic resonance imaging (MRI) is a valuable technology for precise classification, revealing diverse physical and biological details of lesions.

Specifically, multiple-sequence MRI is first grouped into multiple streams based on MRI diagnostic characteristics.

To reduce the interference of redundancy across multiple streams, we design a bottleneck bridge structure for spatial information aggregation.

The experimental results indicate that the network performs well in predicting focal liver lesions, advancing the application of precision medicine.

2 days, 20 hours ago @ paperswithcode.com
/opensparsellms/ LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid

Linear sequence modeling approaches, such as linear attention, provide advantages like linear-time training and constant-memory inference over sequence lengths.
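The constant-memory property comes from rewriting attention as a running outer-product state. A minimal causal sketch (the feature map and normalization used in practical linear attention are omitted here for brevity):

```python
import numpy as np

def linear_attention(Q, K, V):
    # Causal linear attention via a constant-size recurrent state:
    # S_t = S_{t-1} + k_t v_t^T, o_t = S_t^T q_t.
    T, d_v = V.shape
    S = np.zeros((Q.shape[1], d_v))
    out = np.zeros_like(V)
    for t in range(T):
        S += np.outer(K[t], V[t])  # state stays (d_k, d_v) regardless of T
        out[t] = S.T @ Q[t]
    return out

rng = np.random.default_rng(0)
Q = rng.normal(size=(5, 3))
K = rng.normal(size=(5, 3))
V = rng.normal(size=(5, 2))
out = linear_attention(Q, K, V)
```

Because only the fixed-size state `S` is carried across steps, inference memory does not grow with sequence length; sequence-parallel methods like LASP exploit this structure when sharding long sequences across devices.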

In this paper, we introduce LASP-2, a new sequence parallelism (SP) method to enhance both communication and computation parallelism when training linear attention transformer models with very long input sequences.

Compared to the previous work LASP, LASP-2 rethinks the minimal communication requirement for SP on linear attention layers and reorganizes LASP's whole communication-computation workflow.

Additionally, we extend LASP-2 to LASP-2H by applying similar communication redesign to standard attention modules, offering an efficient SP solution for hybri…

4 days, 14 hours ago @ paperswithcode.com
/cone-mt/ BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

Previous multilingual benchmarks focus primarily on simple understanding tasks, but for large language models (LLMs), we emphasize proficiency in instruction following, reasoning, long context understanding, code generation, and so on.

However, measuring these advanced capabilities across languages is underexplored.

To address the disparity, we introduce BenchMAX, a multi-way multilingual evaluation benchmark that allows for fair comparisons of these important abilities across languages.

Extensive experiments on BenchMAX reveal varying effectiveness of core capabilities across languages, highlighting performance gaps that cannot be bridged by simply scaling up model size.

BenchMAX serves as …

4 days, 14 hours ago @ paperswithcode.com
/ffyyytt/ Hierarchical Document Parsing via Large Margin Feature Matching and Heuristics

We present our solution to the AAAI-25 VRD-IU challenge, achieving first place in the competition.

Our approach integrates large margin loss for improved feature discrimination and employs heuristic rules to refine hierarchical relationships.

By combining a deep learning-based matching strategy with greedy algorithms, we achieve a significant boost in accuracy while maintaining computational efficiency.

Our method attains an accuracy of 0.98904 on the private leaderboard, demonstrating its effectiveness in document structure parsing.

Source code is publicly available at https://github.com/ffyyytt/VRUID-AAAI-DAKiet

4 days, 14 hours ago @ paperswithcode.com
/meaquadddd/ DPO-Shift: Shifting the Distribution of Direct Preference Optimization

Direct Preference Optimization (DPO) and its variants have become increasingly popular for aligning language models with human preferences.

However, prior research has identified that the probability of chosen responses often decreases during training, and this phenomenon is known as likelihood displacement.

To tackle this challenge, in this work we introduce DPO-Shift to controllably shift the distribution of the chosen probability.

Furthermore, we demonstrate the superiority of DPO-Shift over DPO on downstream tasks such as MT-Bench and a designed win rate experiment.

We believe this study shows that the likelihood displacement issue of DPO can be effectively mitigated with a simple, theoreti…
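For orientation, the standard DPO objective on a single preference pair can be sketched as below; the `f_lambda` factor scaling the rejected log-ratio is a simplified stand-in for a DPO-Shift-style shift parameter, whose exact parameterization is defined in the paper:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1, f_lambda=1.0):
    # Standard DPO when f_lambda == 1.0: maximize the implicit reward
    # margin between the chosen (y_w) and rejected (y_l) responses.
    chosen = logp_w - ref_logp_w      # log pi(y_w|x) - log pi_ref(y_w|x)
    rejected = logp_l - ref_logp_l    # log pi(y_l|x) - log pi_ref(y_l|x)
    return -math.log(sigmoid(beta * (chosen - f_lambda * rejected)))

vanilla = dpo_loss(-10.0, -12.0, -11.0, -11.0)
shifted = dpo_loss(-10.0, -12.0, -11.0, -11.0, f_lambda=0.75)
```

Down-weighting the rejected term shifts how much the loss pushes the chosen probability up versus the rejected probability down, which is the lever relevant to likelihood displacement.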

4 days, 14 hours ago @ paperswithcode.com
/zinyy/ Provably Efficient RLHF Pipeline: A Unified View from Contextual Bandits

Reinforcement Learning from Human Feedback (RLHF) is a widely used approach for aligning Large Language Models (LLMs) with human preferences.

While recent advancements have provided valuable insights into various stages and settings of RLHF, a comprehensive theoretical understanding of the entire RLHF pipeline remains lacking.

Towards this end, we propose a unified framework for the RLHF pipeline from the view of contextual bandits and provide provable efficiency guarantees.

By employing the Bradley-Terry preference model with a linearly parameterized reward function, we reformulate RLHF as a contextual preference bandit problem.

We then develop novel algorithms for each stage, demonstratin…
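The Bradley-Terry reformulation with a linearly parameterized reward can be sketched as follows (the feature map and parameter values here are illustrative assumptions):

```python
import numpy as np

def bt_prefer_prob(theta, phi_1, phi_2):
    # Bradley-Terry with a linear reward r(x, y) = theta @ phi(x, y):
    # P(y1 preferred over y2 | x) = sigmoid(r(x, y1) - r(x, y2)).
    margin = theta @ (phi_1 - phi_2)
    return 1.0 / (1.0 + np.exp(-margin))

theta = np.array([1.0, -0.5])
p_tie = bt_prefer_prob(theta, np.array([1.0, 1.0]), np.array([1.0, 1.0]))
p = bt_prefer_prob(theta, np.array([2.0, 0.0]), np.array([0.0, 2.0]))
```

Under this model, preference feedback only identifies reward differences, which is what makes the contextual preference bandit view natural.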

4 days, 14 hours ago @ paperswithcode.com
/mmiao2/ Hallucination, Monofacts, and Miscalibration: An Empirical Investigation

Recent theoretical work by [Kalai and Vempala 2024] proves that a particular notion of hallucination rate in LLMs must be lower bounded by the training data monofact rate (related to the classical Good-Turing missing mass estimator) minus model miscalibration.

Through systematic experiments with n-gram models and in-context learning with LLMs, we empirically investigate and validate this theory by examining how different underlying data distributions affect the monofact rate and a model's tendency to hallucinate.

We then vary model miscalibration through controlled upweighting of training samples while holding monofact rates constant, allowing us to isolate miscalibration's reduction effect…
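The quantities the abstract bounds can be computed from fact counts. A simplified sketch (the paper fixes the exact normalization and what counts as a "fact"; this is one plausible reading):

```python
from collections import Counter

def monofact_rate(facts):
    # One plausible reading: fraction of distinct facts observed
    # exactly once in the training data.
    counts = Counter(facts)
    return sum(1 for c in counts.values() if c == 1) / len(counts)

def good_turing_missing_mass(facts):
    # Classical Good-Turing estimate of the unseen mass: N1 / N, where
    # N1 = number of facts seen exactly once, N = total observations.
    counts = Counter(facts)
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / len(facts)

facts = ["a", "a", "b", "c", "c", "d"]
mr = monofact_rate(facts)            # "b" and "d" are monofacts
mm = good_turing_missing_mass(facts)
```

Upweighting some training samples (duplicating them) lowers the monofact rate while changing calibration, which is the knob the experiments turn.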

4 days, 14 hours ago @ paperswithcode.com
💼 University and corporation labs
DeepMind
last post 1 week, 6 days ago
Gemini 2.0 is now available to everyone

And last week, we made an updated 2.0 Flash available to all users of the Gemini app on desktop and mobile, helping everyone discover new ways to create, interact and collaborate with Gemini.

Today, we’re making the updated Gemini 2.0 Flash generally available via the Gemini API in Google AI Studio and Vertex AI.

It is available in Google AI Studio and Vertex AI, and in the Gemini app for Gemini Advanced users.

Finally, 2.0 Flash Thinking Experimental will be available to Gemini app users in the model dropdown on desktop and mobile.

Try Gemini 2.0 Flash in the Gemini app or the Gemini API in Google AI Studio and Vertex AI.

1 week, 6 days ago @ blog.google
Updating the Frontier Safety Framework

Our next iteration of the FSF sets out stronger security protocols on the path to AGI. AI is a powerful tool that is helping to unlock new breakthroughs and make significant progress on some of the biggest challenges of our time, from climate change to drug discovery.

That’s why we introduced the first iteration of our Frontier Safety Framework last year - a set of protocols to help us stay ahead of possible severe risks from powerful frontier AI models.

We have also implemented the Framework in our safety and governance processes for evaluating frontier models such as Gemini 2.0.

As a result of this work, today w…

2 weeks ago @ deepmind.google
FACTS Grounding: A new benchmark for evaluating the factuality of large language models

They can “hallucinate” false information, particularly when given complex inputs.

Today, we’re introducing FACTS Grounding, a comprehensive benchmark for evaluating the ability of LLMs to generate responses that are not only factually accurate with respect to given inputs, but also sufficiently detailed to provide satisfactory answers to user queries.

We hope our benchmark will spur industry-wide progress on factuality and grounding.

To track progress, we’re also launching the FACTS leaderboard on Kaggle.

We’ve already tested leading LLMs using FACTS Grounding and have populated the initial leaderboard with their grounding scores.

2 months ago @ deepmind.google
State-of-the-art video and image generation with Veo 2 and Imagen 3

Earlier this year, we introduced our video generation model, Veo, and our latest image generation model, Imagen 3.

Since then, it’s been exciting to watch people bring their ideas to life with help from these models: YouTube creators are exploring the creative possibilities of video backgrounds for their YouTube Shorts, enterprise customers are enhancing creative workflows on Vertex AI and creatives are using VideoFX and ImageFX to tell their stories.

Together with collaborators ranging from filmmakers to businesses, we’re continuing to develop and evolve these technologies.

Today we're introducing a new video model, Veo 2, and the latest version of Imagen 3, both of which achieve state-of-…

2 months ago @ blog.google
Introducing Gemini 2.0: our new AI model for the agentic era

Today we’re excited to launch our next era of models built for this new agentic era: introducing Gemini 2.0, our most capable model yet.

Starting today our Gemini 2.0 Flash experimental model will be available to all Gemini users.

It's available in Gemini Advanced today.

TPUs powered 100% of Gemini 2.0 training and inference, and today Trillium is generally available to customers so they can build with it too.

If Gemini 1.0 was about organizing and understanding information, Gemini 2.0 is about making it much more useful.

2 months, 1 week ago @ blog.google
Google DeepMind at NeurIPS 2024

Google Research Scientist David Warde and Google DeepMind Research Scientist Ian Goodfellow will present on Generative Adversarial Nets.

Teams across Google DeepMind will present more than 150 new papers on topics ranging from AI agents and generative media to innovative learning approaches.

Building adaptive, smart, and safe AI Agents LLM-based AI agents are showing promise in carrying out digital tasks via natural language commands.

Yet their success depends on precise interaction with complex user interfaces, which requires extensive training data.

AI agents trained using this dataset showed significant performance gains, which we hope will help advance research into more general AI agents.

2 months, 2 weeks ago @ deepmind.google
GenCast predicts weather and the risks of extreme conditions with state-of-the-art accuracy

Because a perfect weather forecast is not possible, scientists and weather agencies use probabilistic ensemble forecasts, where the model predicts a range of likely weather scenarios.

The evolution of AI weather models: GenCast marks a critical advance in AI-based weather prediction that builds on our previous weather model, which was deterministic and provided a single best estimate of future weather.

Better forecasts of extreme weather, such as heat waves or strong winds, enable timely and cost-effective preventative actions.

GenCast offers greater value than ENS when making decisions about preparations for extreme weather, across a wide range of decision-making scenarios.

Advanced forec…

2 months, 2 weeks ago @ deepmind.google
Genie 2: A large-scale foundation world model

Today we introduce Genie 2, a foundation world model capable of generating an endless variety of action-controllable, playable 3D environments for training and evaluating embodied agents.

Based on a single prompt image, it can be played by a human or AI agent using keyboard and mouse inputs.

Games play a key role in the world of artificial intelligence (AI) research.

However, training more general embodied agents has been traditionally bottlenecked by the availability of sufficiently rich and diverse training environments.

As we show, Genie 2 could enable future agents to be trained and evaluated in a limitless curriculum of novel worlds.

2 months, 2 weeks ago @ deepmind.google
AlphaQubit tackles one of quantum computing’s biggest challenges

Quantum computers have the potential to revolutionize drug discovery, material design and fundamental physics — that is, if we can get them to work reliably.

Certain problems, which would take a conventional computer billions of years to solve, would take a quantum computer just hours.

If we want to make quantum computers more reliable, especially at scale, we need to accurately identify and correct these errors.

In a paper published today in Nature, we introduce AlphaQubit, an AI-based decoder that identifies quantum computing errors with state-of-the-art accuracy.

This collaborative work brought together Google DeepMind’s machine learning knowledge and Google Quantum AI’s error correction…

3 months ago @ blog.google
The AI for Science Forum: A new era of discovery

AI is revolutionizing the landscape of scientific research, enabling advancements at a pace that was once unimaginable — from accelerating drug discovery to designing new materials for clean energy technologies.

The AI for Science Forum — co-hosted by Google DeepMind and the Royal Society — brought together the scientific community, policymakers, and industry leaders to explore the transformative potential of AI to drive scientific breakthroughs, address the world's most pressing challenges, and lead to a new era of discovery.

3 months ago @ blog.google
Pushing the frontiers of audio generation

Our pioneering speech generation technologies are helping people around the world interact with more natural, conversational and intuitive digital assistants and AI tools.

Pioneering techniques for audio generation: For years, we've been investing in audio generation research and exploring new ways for generating more natural dialogue in our products and experimental tools.

Download audio Audio clip of two speakers telling a funny story, with laughter at the punchline.

Download audio Audio clip of two speakers expressing excitement about a surprise birthday party.

Scaling our audio generation models: Scaling our single-spe…

3 months, 3 weeks ago @ deepmind.google
New generative AI tools open the doors of music creation

Our latest AI music technologies are now available in MusicFX DJ, Music AI Sandbox and YouTube Shorts. For nearly a decade, our teams have been exploring how artificial intelligence (AI) can support the creative process, building tools that empower enthusiasts and professionals to discover new forms of creative expression.

Their input has been guiding our state-of-the-art generative music experiments, and helping us ensure that our new generative AI tools responsibly open the doors of music creation to everyone.

We’re also announcing updates to our music AI toolkit, called Music AI Sandbox, and highlighting…

3 months, 4 weeks ago @ deepmind.google
Demis Hassabis & John Jumper awarded Nobel Prize in Chemistry

This morning, Co-founder and CEO of Google DeepMind and Isomorphic Labs Sir Demis Hassabis, and Google DeepMind Senior Research Scientist Dr. John Jumper were co-awarded the 2024 Nobel Prize in Chemistry for their work developing AlphaFold, a groundbreaking AI system that predicts the 3D structure of proteins from their amino acid sequences.

AlphaFold’s predictions, made freely available through the AlphaFold Protein Structure Database, have given more than 2 million scientists and researchers from 190 countries a powerful tool for making new discoveries.

In a statement released after being informed of the news, Demis Hassabis said: "Receiving the Nobel Prize is the honour of a lifetime.

After rec…

4 months, 1 week ago @ deepmind.google
How AlphaChip transformed computer chip design

Our AI method has accelerated and optimized chip design, and its superhuman chip layouts are used in hardware around the world. In 2020, we released a preprint introducing our novel reinforcement learning method for designing chip layouts, which we later published in Nature and open sourced.

Today, we’re publishing a Nature addendum that describes more about our method and its impact on the field of chip design.

Computer chips have fueled remarkable progress in artificial intelligence (AI), and AlphaChip returns the favor by using AI to accelerate and optimize chip design.

Using AI to design Google’s AI accelerator chips…

4 months, 3 weeks ago @ deepmind.google
Updated production-ready Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more

Developers can access our latest models for free via Google AI Studio and the Gemini API.

With the latest updates, 1.5 Pro and Flash are now better, faster, and more cost-efficient to build with in production.

For more details on migrating to the latest versions of Gemini 1.5 Pro and 1.5 Flash, check out the Gemini API models page.

Gemini 1.5 Pro We continue to be blown away with the creative and useful applications of Gemini 1.5 Pro’s 2 million token long context window and multimodal capabilities.

Increased rate limits To make it even easier for developers to build with Gemini, we are increasing the paid tier rate limits for 1.5 Flash to 2,000 RPM and increasing 1.5 Pro to 1,000 RPM, up f…

4 months, 3 weeks ago @ developers.googleblog.com
Google
last post 4 hours ago
How to use gen AI for better data schema handling, data quality, and data generation

In the realm of data engineering, generative AI models are quietly revolutionizing how we handle, process, and ultimately utilize data.

For example, large language models (LLMs) can help with data schema handling, data quality, and even data generation.

Building on the recently released Gemini in BigQuery data preparation capabilities, this blog showcases areas where gen AI models are making a significant impact in data engineering: automated schema management, data quality checks, and generation of synthetic and structured data from diverse sources, with practical examples and code snippets.

Data schema handling: Integrating new datasets. Data movement and mainte…

4 hours ago @ cloud.google.com
BigQuery ML is now compatible with open-source gen AI models

BigQuery Machine Learning allows you to use large language models (LLMs), like Gemini, to perform tasks such as entity extraction, sentiment analysis, translation, text generation, and more on your data using familiar SQL syntax.

Today, we are extending this capability with support for any open-source LLM from the Vertex AI Model Garden, including models you deploy from Hugging Face and OSS models you may have tuned yourself.

However, you can use any of 170K+ text generation models available on Hugging Face by following the same steps.

Using Open-Source Software (OSS) models with BigQuery ML

1. Host the model on a Vertex endpoint. First, choose a text-generation model from Hugging Fac…
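The flow the excerpt begins to describe (create a remote model over a Vertex endpoint, then query it with ML.GENERATE_TEXT) can be sketched as SQL assembled in Python. The project, dataset, connection, and endpoint IDs below are placeholders, and the exact OPTIONS clause is worth checking against the BigQuery ML documentation for your model:

```python
# Sketch: wiring an open-source LLM, hosted on a hypothetical Vertex AI
# endpoint, into BigQuery ML. All IDs below are placeholders.
project, dataset = "my-project", "llm_demo"
connection = f"{project}.us.vertex-conn"   # BigQuery-to-Vertex connection
endpoint_id = "1234567890"                 # Vertex endpoint serving the OSS model

create_model_sql = f"""
CREATE OR REPLACE MODEL `{project}.{dataset}.oss_llm`
  REMOTE WITH CONNECTION `{connection}`
  OPTIONS (ENDPOINT = 'https://us-central1-aiplatform.googleapis.com/v1/projects/{project}/locations/us-central1/endpoints/{endpoint_id}')
"""

generate_sql = f"""
SELECT ml_generate_text_llm_result
FROM ML.GENERATE_TEXT(
  MODEL `{project}.{dataset}.oss_llm`,
  (SELECT 'Summarize this ticket: printer offline' AS prompt),
  STRUCT(256 AS max_output_tokens, TRUE AS flatten_json_output))
"""

print(create_model_sql.strip().splitlines()[0])
```

You would submit these statements with the BigQuery client, e.g. `google.cloud.bigquery.Client().query(sql)`, once the endpoint and connection actually exist.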

5 hours ago @ cloud.google.com
With MultiKueue, grab GPUs for your GKE cluster, wherever they may be

These technologies rely on intensive computations that require specialized hardware resources, like GPUs.

For Google Cloud users, the introduction of Dynamic Workload Scheduler (DWS) transformed how you can access and use GPU resources, particularly within a Google Kubernetes Engine (GKE) cluster.

With MultiKueue, GKE, and Dynamic Workload Scheduler, you can wait for accelerators in multiple regions.

Dynamic Workload Scheduler automatically provisions resources in the best GKE clusters as soon as they are available.

By submitting workloads to a global queue, MultiKueue executes them in the region with available GPU resources, helping to optimize global resource usage, lower costs, and sp…
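The dispatch idea, one global queue with execution wherever GPUs are free, can be illustrated with a toy scheduler. The regions and capacities are invented, and real MultiKueue operates on Kueue objects in GKE rather than anything like this API:

```python
# Toy illustration: jobs go to one global queue and run in whichever region
# currently has free GPUs. Region names and capacities are made up.
from collections import deque

capacity = {"us-central1": 0, "europe-west4": 2, "asia-southeast1": 1}
queue = deque(["train-a", "train-b", "train-c"])

placements = {}
while queue:
    job = queue.popleft()
    # pick any region with a free GPU; in practice DWS does this matching
    region = next((r for r, free in capacity.items() if free > 0), None)
    if region is None:
        queue.appendleft(job)  # nothing free anywhere; job waits in the queue
        break
    capacity[region] -= 1
    placements[job] = region

print(placements)
```

With the capacities above, the first two jobs land in europe-west4 and the third in asia-southeast1, even though us-central1 has no free GPUs at all.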

4 days, 5 hours ago @ cloud.google.com
Enhance Gemini model security with content filters and system instructions

As organizations rush to adopt generative AI-driven chatbots and agents, it’s important to reduce the risk of exposure to threat actors who force AI models to create harmful content.

We want to highlight two powerful capabilities of Vertex AI that can help manage this risk — content filters and system instructions.

Content filters: Post-response defenses. By analyzing generated text and blocking responses that trigger specific criteria, content filters can help block the output of harmful content.

They function independently from Gemini models as part of a layered defense against threat actors who attempt to jailbreak the model.

Gemini models on Vertex AI use two types of content filters:
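As a generic illustration of a post-response defense of this kind, the sketch below scores generated text against per-category thresholds and blocks it when any category triggers. The categories, the keyword "classifier", and the thresholds are all invented for illustration; they are not Vertex AI's actual harm categories or confidence levels:

```python
# Minimal sketch of a post-response content filter: score generated text
# against per-category thresholds and block it if any category triggers.
THRESHOLDS = {"harassment": 0.5, "dangerous": 0.3}

def classify(text):
    # stand-in for a real safety classifier: keyword heuristics only
    scores = {"harassment": 0.0, "dangerous": 0.0}
    if "attack" in text.lower():
        scores["dangerous"] = 0.8
    return scores

def filter_response(text):
    scores = classify(text)
    blocked = [c for c, s in scores.items() if s >= THRESHOLDS[c]]
    return ("[blocked]", blocked) if blocked else (text, [])

print(filter_response("Here is how to attack a server"))
print(filter_response("The weather is nice"))
```

Because the filter runs on the model's output rather than inside the model, it stays effective even when a prompt manages to jailbreak the model itself.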

5 days, 5 hours ago @ cloud.google.com
Operationalizing generative AI apps with Apigee

It also streamlines the onboarding of new AI applications and agents while providing robust access control for your LLMs.

You can decide which model to use by specifying it in your API call and Apigee routes it to your desired model while keeping a consistent API contract.

Or perhaps you want to analyze response quality to continue to improve your LLM applications.

By integrating Apigee with your LLM architecture, you create a secure and reliable environment for your AI applications to thrive.

Visit our Apigee generative AI samples page to learn more and get started, watch a webinar with more details, or contact us here!

5 days, 5 hours ago @ cloud.google.com
Networking support for AI workloads

Managed and unmanaged AI options. Google Cloud provides both managed (Vertex AI) and do-it-yourself (DIY) approaches for running AI workloads.

As a managed service, Vertex AI handles infrastructure management, allowing you to concentrate on training, tuning, and inferencing your AI models.

Enterprises that want to use private connectivity have a choice of Private Service Access, Private Google Access, Private Service Connect endpoints and Private Service Connect for Google APIs.

The option you choose will vary based on the specific Vertex AI service you are using.

You can learn more in the Accessing Vertex AI from on-premises and multicloud documentation.

1 week, 1 day ago @ cloud.google.com
News you can use: What we announced in AI this month

We shared how Gemini-powered content creation in Partner Marketing Studio will help partners co-market faster.

These models will help developers write code and build faster – from high-complexity tasks to reasoning tasks, like creative writing.

Check our new AI tools to help retailers build gen AI search and agents.

Now with NVIDIA's AI Enterprise software available on Google Cloud, retailers can handle more data and more complex AI tasks without their systems getting bogged down.

For a deeper dive into the latest from Google Cloud, read our weekly updates, The Overwhelmed Person’s Guide to Google Cloud.

1 week, 4 days ago @ cloud.google.com
Announcing public beta of Gen AI Toolbox for Databases

Today, we are thrilled to announce the public beta launch of Gen AI Toolbox for Databases in partnership with LangChain, the leading orchestration framework for developers building large language model (LLM) applications.

Gen AI Toolbox for Databases (Toolbox) is an open-source server that empowers application developers to connect production-grade, agent-based generative AI (gen AI) applications to databases.

It streamlines the creation, deployment, and management of sophisticated gen AI tools capable of querying databases with secure access, robust observability, scalability, and comprehensive manageability.

It also provides connectivity to popular open-source databases such as PostgreSQL…

1 week, 5 days ago @ cloud.google.com
How to build a strong brand logo with Imagen 3 and Gemini

Last year we announced Imagen 3, our highest quality image generation model.

Imagen 3 is available to Vertex AI customers, which means businesses can create high quality images that reflect their own brand style and logos for use in marketing, advertising, or product design.

Today, we’ll share how you can build your brand style with a logo using Imagen 3, Gemini, and the Python Library Pillow.

First, use Imagen 3 to generate visual options. Imagen 3 generates the most realistic and highest quality images from simple text prompts, surpassing previous versions of Imagen in detail, lighting, and artifact reduction.

The new Imagen 3 generation model (002) delivers even higher visual appeal, prom…

1 week, 6 days ago @ cloud.google.com
Designing sustainable AI: A deep dive into TPU efficiency and lifecycle emissions

Today we’re releasing a first-of-its-kind study1 on the lifetime emissions of our Tensor Processing Unit (TPU) hardware.

These measurements provide a snapshot of the average, chip-level carbon intensity of Google’s TPU hardware, and enable us to compare efficiency across generations.

CCI quantifies an AI accelerator chip’s carbon emissions per unit of computation (measured in grams of CO2e per Exa-FLOP).3 Lower CCI scores mean lower emissions from the AI hardware platform for a given AI workload, for example, training an AI model.
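As a worked example of the metric, with all numbers invented purely for illustration:

```python
# Worked example of the CCI metric described above: grams of CO2e emitted
# per ExaFLOP of computation. All figures here are hypothetical.
lifetime_emissions_kg = 1_500.0        # hypothetical chip lifecycle emissions (kg CO2e)
lifetime_compute_exaflops = 120_000.0  # hypothetical total useful compute

cci = lifetime_emissions_kg * 1_000 / lifetime_compute_exaflops  # g CO2e / ExaFLOP
print(f"CCI = {cci:.1f} gCO2e per ExaFLOP")
```

Lower is better: a next-generation chip with the same lifecycle emissions but double the lifetime compute would halve its CCI.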

This underscores the importance of improving the energy efficiency of AI chips and reducing the carbon intensity of the electricity that powers them.

Our signific…

1 week, 6 days ago @ cloud.google.com
Helping our partners co-market faster with AI

You find a pre-built campaign in Partner Marketing Studio that aligns with your goals, but you need to tailor it for the healthcare industry.

Each bite-sized episode features Google marketers who've successfully integrated AI into their daily work, using tools like Gemini, Gemini for Workspace, NotebookLM, and AI Studio.

To find the AI Boost Bites training series and start using the AI features, log in to Partner Marketing Studio.

Google Cloud speakers: Request a Google Cloud speaker for your events.

What our pilot partners are saying: "With the new AI features in Partner Marketing Studio, we can create more targeted industry and persona-based versions of our Google Cloud marketing campaigns auto…

2 weeks ago @ cloud.google.com
Improving model performance with PyTorch/XLA 2.6

Host offloading. Another powerful tool for memory optimization in PyTorch/XLA is host offloading.

This suggests your model is "tracing bound," meaning performance is limited by the speed of tracing operations.

You'll now find builds with both the pre-C++11 ABI, which remains the default to match PyTorch upstream, and the more modern C++11 ABI.

So if we have a higher goodput measurement for the same model on the same hardware, that indicates better performance of the model.
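A minimal sketch of a goodput-style comparison, assuming the common "useful work per unit of wall-clock time" definition rather than any official PyTorch/XLA formula; the token counts and durations are invented:

```python
# Sketch of a goodput comparison: useful training throughput per unit of
# wall-clock time, so stalls (tracing, input pipeline, restarts) count
# against the measurement. Numbers are illustrative assumptions.
def goodput(tokens_processed, wall_clock_s):
    return tokens_processed / wall_clock_s  # useful work per second, stalls included

baseline = goodput(tokens_processed=9.0e9, wall_clock_s=3600)   # frequent retracing
optimized = goodput(tokens_processed=9.0e9, wall_clock_s=3000)  # same work, fewer stalls

print(f"baseline:  {baseline:,.0f} tokens/s")
print(f"optimized: {optimized:,.0f} tokens/s")
assert optimized > baseline  # same model + hardware, higher goodput = better
```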

An example of using a C++11 ABI docker image in your Dockerfile might look something like:

2 weeks, 4 days ago @ cloud.google.com
Blackwell is here — new A4 VMs powered by NVIDIA B200 now in preview

Today, we’re excited to bring the highly-anticipated NVIDIA Blackwell GPUs to Google Cloud with the preview of A4 VMs, powered by NVIDIA HGX B200.

Combined with our datacenter-wide 4-way rail-aligned network, A4 VMs deliver non-blocking 3.2 Tbps of GPU-to-GPU traffic with RDMA over Converged Ethernet (RoCE).


Vertex AI: A4 VMs will be accessible through Vertex AI, our fully managed, unified AI development platform for building and using generative AI, which is powered by the AI Hypercomputer architecture under the hood.

Hudson…

2 weeks, 4 days ago @ cloud.google.com
Announcing smaller machine types for A3 High VMs

You can use A3 High VMs powered by NVIDIA H100 80GB GPUs in multiple generally available machine types with 1 (new), 2 (new), 4 (new), and 8 GPUs.

Accessing smaller H100 machine types. All A3 machine types are available through the fully managed Vertex AI, as nodes through Google Kubernetes Engine (GKE), and as VMs through Google Compute Engine.

The 1, 2, and 4 A3 High GPU machine types are available as Spot VMs and through Dynamic Workload Scheduler (DWS) Flex Start mode.

You can use the 1, 2, and 4 A3 High GPU machine types through both GKE Standard and GKE Autopilot modes of operation.

Below are two examples of creating node pools in your GKE cluster with a3-highgpu-1g machine type using Spot VMs and…

3 weeks, 4 days ago @ cloud.google.com
Introducing agent evaluation in Vertex AI Gen AI evaluation service

Evaluate agents using Vertex AI Gen AI evaluation service. Our evaluation metrics can be grouped into two categories: final response and trajectory evaluation.

Compatibility meets flexibility. Vertex AI Gen AI evaluation service supports a variety of agent architectures.

With today’s launch, you can evaluate agents built with Reasoning Engine (LangChain on Vertex AI), the managed runtime for your agentic applications on Vertex AI.

For maximum flexibility, you can evaluate agents using a custom function that processes prompts and returns responses.

To make your evaluation experience easier, we offer native agent inference and automatically log all results in Vertex AI experiments.

3 weeks, 4 days ago @ cloud.google.com
Microsoft
last post 6 days, 1 hour ago
Microsoft Research and Physics Wallah team up to enhance AI-based tutoring

Some 2 million students use the Physics Wallah platform every day, at a fraction of the cost of offline tutoring.

Physics Wallah has developed an AI-driven educational suite, Alakh AI, leveraging OpenAI’s GPT-4o model through Microsoft Azure OpenAI Service.

Researchers from Microsoft Research are developing new algorithms and techniques to enhance the accuracy and reasoning capabilities of AI models.

Microsoft Research is working with Physics Wallah to move beyond traditional next-token prediction and develop AI systems that approach reliable, systematic, step-by-step problem-solving.

Without Physics Wallah, students like Chandra would likely have no access to the support and resources that…

6 days, 1 hour ago @ microsoft.com
ExACT: Improving AI agents’ decision-making via test-time compute scaling

Rank  Model                 Score
1     GPT-4o + ExACT        33.70
2     GPT-4o + Search       26.40
3     GPT-4o + WebDreamer   23.60
4     GPT-4o + ICAL         23.40
5     GPT-4o                19.78
6     Llama-3-70B + Search  16.70
Table 1.

How Exploratory Learning works. Exploratory Learning enables agents to dynamically search and adjust their computational resources during testing without depending on MCTS.

In contrast to Imitation Learning, Exploratory Learning uses the entire search trajectory for training.

Results demonstrate the following key benefits. Improved performance: GPT-4o achieves a performance improvement comparable with scaling test-time compute via MCTS, even without search.

How can AI agents improve decision-making in real-world scenarios, where…

1 week ago @ microsoft.com
Ideas: Building AI for population-scale systems with Akshay Nambi

His work lies at the intersection of systems, AI, and machine learning with a focus on designing, deploying, and scaling AI systems to solve compelling real-world problems.

CHRIS STETKIEWICZ: You’re listening to Ideas, a Microsoft Research Podcast that dives deep into the world of technology research and the profound questions behind the code.

NAMBI: That’s right.

This represents a major step towards building AI systems that’s much more holistic personal tutors, which help student understanding and create more engaging, effective learning experience.

Are there some things that could go wrong, even if we get the technology right?

1 week ago @ microsoft.com
Advances to low-bit quantization enable LLMs on edge devices

LUT Tensor Core hardware architecture: Introduces a cutting-edge design for next-generation AI hardware, tailored for low-bit quantization and mixed-precision computations.

By eliminating dequantization and lowering computational costs, T-MAC enables efficient inference of low-bit LLMs on resource-constrained edge devices.

The LUT Tensor Core workflow. Evaluating LUT Tensor Core: Testing LUT Tensor Core on low-bit LLMs, such as BitNet and Llama, showed significant performance gains, achieving 6.93 times the inference speed while using just 38.3% of the area of a traditional Tensor Core.
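The table-lookup idea behind this line of work can be shown in miniature: with 1-bit weights in {-1, +1}, a group of g weights has only 2^g possible patterns, so the partial dot products of each activation group can be precomputed once and fetched by weight pattern instead of dequantizing and multiplying. This is a didactic sketch, not the T-MAC or LUT Tensor Core implementation:

```python
# Toy lookup-table (LUT) kernel: precompute all signed partial sums for each
# activation group, then index the table with the weight bit pattern.
from itertools import product

def lut_dot(weight_bits, activations, g=4):
    assert len(weight_bits) == len(activations) and len(activations) % g == 0
    total = 0.0
    for start in range(0, len(activations), g):
        a = activations[start:start + g]
        # precompute all 2^g signed partial sums for this activation group
        table = {bits: sum((1 if b else -1) * x for b, x in zip(bits, a))
                 for bits in product((0, 1), repeat=g)}
        total += table[tuple(weight_bits[start:start + g])]
    return total

w = (1, 0, 1, 1, 0, 0, 1, 0)          # 1 -> +1, 0 -> -1
x = [0.5, -1.0, 2.0, 0.25, 1.0, 3.0, -0.5, 4.0]
ref = sum((1 if b else -1) * v for b, v in zip(w, x))
print(lut_dot(w, x), ref)  # LUT result matches the direct dot product
```

In a real kernel the table is the win because the same activation group is reused across many weight rows, so its 2^g entries are computed once and looked up thousands of times with no multiplies.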

As AI models grow in scale and complexity, LUT Tensor Core enables low-bit LLMs to be applied in new and dive…

1 week, 6 days ago @ microsoft.com
Research Focus: Week of January 27, 2025

And we invite you to join an upcoming workshop: LLM4Eval@WSDM 2025: Large Language Models for Evaluation in Information Retrieval.

NEW RESEARCH: Symbolic Automata: Omega-Regularity Modulo Theories. Symbolic automata are finite state automata that support potentially infinite alphabets, such as the set of rational numbers, generally applied to regular expressions and languages over finite words.

In symbolic automata (or automata modulo A), an alphabet is represented by an effective Boolean algebra A, supported by a decision procedure for satisfiability.

In a recent paper: Symbolic Automata: Omega-Regularity Modulo Theories, researchers from Microsoft generalize sym…

2 weeks, 4 days ago @ microsoft.com
Ideas: Bug hunting with Shan Lu

LU: And I still feel like, wow, you know, I feel so … I feel like I’m inspired every day!

HUIZINGA: Yeah, yeah, yeah, yeah.

And then once you learn that language, right, then you have to learn about how to write proofs in that special language.

HUIZINGA: Yeah, yeah, yeah.

And my colleagues are also working on using AI, right, to automatically do performance tuning.

3 weeks, 5 days ago @ microsoft.com
Research Focus: Week of January 13, 2025

NEW RESEARCH: AI meets materials discovery. Two of the transformative tools that play a central role in Microsoft’s work on AI for science are MatterGen and MatterSim.

Read the paper. NEW RESEARCH: RD-Agent: An open-source solution for smarter R&D. Research and development (R&D) plays a pivotal role in boosting industrial productivity.

Read the article. Microsoft Research podcast, What’s Your Story: Lex Story. Model maker and fabricator Lex Story helps bring research to life through prototyping.

Microsoft Research | In case you missed it: Microsoft Research 2024: A year in review (December 20, 2024). Microsoft Research did extraordinary work this year, using AI and scientific…

1 month ago @ microsoft.com
Ideas: AI for materials discovery with Tian Xie and Ziheng Lu

And now you can use this loop to design materials really quickly.

XIE: So you can really think about MatterSim and MatterGen accelerating different parts of materials discovery process.

They are also both foundation AI models, meaning they can both be used for a broad range of materials design problems.

Really, really a lot.

Yeah, I really, really like the example that Ziheng mentioned about the educational purposes.

1 month ago @ microsoft.com
MatterGen: A new paradigm of materials design with generative AI

MatterGen enables a new paradigm of generative AI-assisted materials design that allows for efficient exploration of materials, going beyond the limited set of known ones.

MatterGen can be fine-tuned to generate materials under different design requirements such as specific chemistry, crystal symmetry, or materials’ properties.

AI emulator and generator flywheel. MatterGen presents a new opportunity for AI accelerated materials design, complementing our AI emulator MatterSim.

Looking ahead. MatterGen represents a new paradigm of materials design enabled by generative AI technology.

That’s why we are interested in understanding the impact that MatterGen could have on materials discovery,” said C…

1 month ago @ microsoft.com
AutoGen v0.4: Reimagining the foundation of agentic AI for scale, extensibility, and robustness

Modular and extensible: Users can easily customize systems with pluggable components, including custom agents, tools, memory, and models.


Core: The foundational building blocks for an event-driven agentic system.

In addition to the framework, AutoGen 0.4 includes upgraded programming tools and applications, designed to support developers in building and experimenting with AutoGen.

Third-party component galleries: Import and use custom agents, tools, and workflows from external galleries to extend functionality.

1 month ago @ microsoft.com
AIOpsLab: Building AI agents for autonomous clouds

To tackle these challenges, recent research on using AIOps agents for cloud operations—such as AI agents for incident root cause analysis (RCA) or triaging—has relied on proprietary services and datasets.

Users developing agents for cloud operations tasks with Azure AI Agent Service can evaluate and improve them using AIOpsLab.

This calls for a standardized and principled research framework for building, testing, comparing, and improving AIOps agents.

The AIOpsLab research paper has been accepted at SoCC’24 (the annual ACM Symposium on Cloud Computing).

Our approach integrates application and domain knowledge to create adaptable policies and “oracles” compatible with AIOps scenarios.

1 month, 4 weeks ago @ microsoft.com
Ideas: AI and democracy with Madeleine Daepp and Robert Osazuwa Ness

DAEPP: You know, we didn’t really think about the term fraud until we started prepping for this interview with you.

BADANES: Right, right.

One of the things that I get asked a lot is, why can’t we just build good AI to detect bad AI, right?

BADANES: So next time my kids are in a fight, I’m going to point them to Copilot and say, work with Copilot to mediate.

[LAUGHS] No, that’s really, really interesting.

2 months ago @ microsoft.com
Research Focus: Week of December 16, 2024

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

Read the paper. On-demand event: Microsoft Research Forum Episode 4. Learn about the latest multimodal AI models, advanced benchmarks for AI evaluation and model self-improvement, and an entirely new kind of computer for AI inference and hard optimization.

In a recent paper: RedCode: Risky Code Execution and Generation Benchmark for Code Agents, published at NeurIPS 2024, researchers from Microsoft and external colleagues propose comprehensive and practical evaluations on the safety of code agents.

This res…

2 months ago @ microsoft.com
NeurIPS 2024: The co-evolution of AI and systems with Lidong Zhou

Earlier today, Lidong gave a keynote here at NeurIPS on the co-evolution of AI and systems engineering.

One dimension is that the scale of the AI systems that we have to support.

And the other dimension is if you look at AI systems, it’s actually a whole-stack kind of design.

STRICKLAND: Yeah, yeah.

ZHOU: Yeah, I think in terms of AI systems, I’m certainly pretty excited about what we can do together, you know, with a combination of AI and systems.

2 months ago @ microsoft.com
PromptWizard: The future of prompt optimization through feedback-driven self-evolving prompts

The question then becomes: How can we make prompt optimization faster, more accessible, and more adaptable across diverse tasks?

Download PromptWizard. To address this challenge, we developed PromptWizard (PW), a research framework that automates and streamlines the process of prompt optimization.

Joint optimization and synthesis of diverse examples: PW generates synthetic examples that are not only robust and diverse but also task-aware.

Stage 1: Refinement of prompt instruction. The first stage focuses on refining the task instructions of a prompt.

Through the critique-and-synthesis mechanism, PromptWizard ensures alignment between the prompt and examples, simultaneously synthesizing new exam…
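The critique-and-synthesis loop described above can be sketched with deterministic stand-ins for the LLM calls. The scoring heuristic and hint table below are invented for illustration and are not PromptWizard's actual components:

```python
# Sketch of a critique-and-synthesize refinement loop: score a prompt,
# 'critique' what is missing, revise, and keep only improvements.
HINTS = {"classify": "Task: classify the support ticket.",
         "output JSON": "Respond with output JSON only.",
         "example:": 'example: {"label": "billing"}'}

def score(prompt):
    # stand-in metric: reward prompts that state the task, format, and an example
    return sum(key in prompt for key in HINTS)

def critique(prompt):
    return [key for key in HINTS if key not in prompt]  # missing ingredients

def refine(prompt, rounds=3):
    for _ in range(rounds):
        missing = critique(prompt)
        if not missing:
            break
        candidate = prompt + " " + HINTS[missing[0]]  # 'synthesize' a revision
        if score(candidate) > score(prompt):          # keep only improvements
            prompt = candidate
    return prompt

final = refine("You are a helpful assistant.")
print(score(final))  # the refined prompt covers all three ingredients
```

In PromptWizard itself both the critique and the synthesis are produced by an LLM, and the loop jointly refines the instructions and the in-context examples rather than a fixed checklist.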

2 months ago @ microsoft.com
MIT AI
last post 5 days ago
AI model deciphers the code in proteins that tells them where to go

More recently, researchers are coming to appreciate that a protein’s localization is also critical for its function.

MIT Professor Richard Young and colleagues wondered whether the code in those regions could be used to predict protein localization in the same way that other regions are used to predict structure.

Other researchers have discovered some protein sequences that code for protein localization, and some have begun developing predictive models for protein localization.

The researchers also tested how well ProtGPS could predict changes in protein localization based on disease-associated mutations within a protein.

The researchers found many cases in which a disease-associated mutati…

5 days ago @ news.mit.edu
Gift from Sebastian Man ’79, SM ’80 supports MIT Stephen A. Schwarzman College of Computing building

The MIT Stephen A. Schwarzman College of Computing has received substantial support for its striking new headquarters on Vassar Street in Cambridge, Massachusetts.

A major gift from Sebastian Man ’79, SM ’80 will be recognized with the naming of a key space in the building, enriching the academic and research activities of the MIT Schwarzman College of Computing and MIT.

Man’s gift to the college was recognized at a ceremony and luncheon in Hong Kong, where he resides, on Jan. 10.

“When we first announced the new college at MIT,” he said, “MIT said it was reshaping itself for the future.

“MIT instilled in me unending intellectual curiosity and the love for the unknown, and I am honored and …

1 week ago @ news.mit.edu
Bridging philosophy and AI to explore computing ethics

At this moment, what some consider the golden age of generative AI, this may seem like an urgent new question.

But Solar-Lezama, the Distinguished Professor of Computing at MIT, is quick to point out that this struggle is as old as humankind itself.

The course combines a recitation, in which graduate students from philosophy or computer science break down the week's topic, with a lively discussion.

Westover says he's drawn to philosophy because of an interest in ethics and a desire to distinguish right from wrong.

However, in Ethics of Computing, he has learned how to make written arguments for "tricky philosophical questions" that may not have a single correct answer.

1 week ago @ news.mit.edu
Puzzling out climate change

There, she discovered how computational techniques like machine learning could optimize materials for solar panels — a direct application of her skills toward mitigating climate change.

All of them were communicating effectively and working toward one unified goal — building better renewable energy systems,” Raghavan says.

“Transportation is a critical element of both the economy and climate change, so potential changes to transportation must be carefully studied,” Wu says.

Raghavan believes that addressing climate change requires collaboration across disciplines.

“I think with climate change, no one industry or field is going to solve it on its own.

1 week, 1 day ago @ news.mit.edu
Can deep learning transform heart failure prevention?

When a life-threatening condition like heart failure strikes, the heart gradually loses the ability to supply other organs with the blood and nutrients they need to function.

In a clinical trial, the model showed results with accuracy comparable to gold-standard but more-invasive procedures, giving hope to those at risk of heart failure.

In a healthy human heart, these chambers operate in a rhythmic synchrony: oxygen-poor blood flows into the heart via the right atrium.

But in Stultz’s view, these measures are coarse, as evidenced by the fact that one in four heart failure patients is readmitted to the hospital within 30 days.

However, 12-lead ECG machines are only accessible in …

1 week, 1 day ago @ news.mit.edu
Creating a common language

“So the computer vision community quickly grew really large because these different subtopics were now able to speak a common language and share a common set of tools.” From there, He says the trend began to expand to other areas of computer science, including natural language processing, speech recognition, and robotics, creating the foundation for ChatGPT and other progress toward artificial general intelligence (AGI).

While AI tools provide a clear benefit to the work of He’s scientist colleagues, He also notes the reciprocal effect they can have, and have had, on the creation and advancement of AI.

“Scientists provide new problems and challenges that help us continue to evolve these tool…

1 week, 4 days ago @ news.mit.edu
Validation technique could help scientists make more accurate forecasts

Scientists typically use tried-and-true validation methods to determine how much to trust these predictions.

But MIT researchers have shown that these popular validation methods can fail quite badly for spatial prediction tasks.

These methods hold out a small amount of training data, called validation data, and use it to assess the accuracy of the predictor.

Evaluation methods rely on assumptions about how validation data and the data one wants to predict, called test data, are related.

In addition, perhaps the validation data are from EPA sensors near cities while the conservation sites are in rural areas.
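
The failure mode described above can be seen in a toy sketch: a predictor is validated with the standard random holdout drawn from "city" sensors, then applied to "rural" sites. The coordinates, ground-truth function, and constant-mean predictor below are all invented for illustration, not taken from the MIT work.

```python
import random

random.seed(0)

def truth(x):
    # Ground-truth signal that varies with location x in [0, 1]
    return 2.0 * x

xs = [random.uniform(0.0, 0.4) for _ in range(200)]      # sensors clustered near cities
test_xs = [random.uniform(0.8, 1.0) for _ in range(50)]  # rural conservation sites

random.shuffle(xs)
fit_xs, val_xs = xs[:150], xs[150:]                  # standard random validation holdout
pred = sum(truth(x) for x in fit_xs) / len(fit_xs)   # naive constant-mean predictor

def mae(points):
    return sum(abs(truth(x) - pred) for x in points) / len(points)

val_err, test_err = mae(val_xs), mae(test_xs)
print(val_err, test_err)  # the random holdout badly underestimates error at shifted test sites
```

Because the validation points come from the same city-area distribution as the training points, the holdout error looks small, while error at the spatially shifted test sites is several times larger, which is exactly the assumption mismatch the excerpt describes.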

1 week, 4 days назад @ news.mit.edu
Streamlining data collection for improved salmon population management

Young salmon fry (newly hatched salmon) make their way to the ocean, where they spend several years maturing to adulthood.

Monitoring salmon migration
Increased human activity, including overfishing and hydropower development, together with habitat loss and climate change, has had a significant impact on salmon populations in the region.

As a result, effective monitoring and management of salmon fisheries is important to ensure balance among competing ecological, cultural, and human interests.

Beery is currently leading a research project that aims to streamline salmon monitoring using cutting-edge computer vision methods.

Automating salmon monitoring is necessary for better management of s…

1 week, 5 days назад @ news.mit.edu
Aligning AI with human values

Senior Audrey Lorvo is researching AI safety, which seeks to ensure increasingly intelligent AI models are reliable and can benefit humanity.

The growing field focuses on technical challenges like robustness and AI alignment with human values, as well as societal concerns like transparency and accountability.

“We need to do all we can to develop it safely.” Her participation in efforts like the AI Safety Technical Fellowship reflects her investment in understanding the technical aspects of AI safety.

The fellowship provides opportunities to review existing research on aligning AI development with considerations of potential human impact.

“The fellowship helped me understand AI safety’s techni…

2 weeks назад @ news.mit.edu
Introducing the MIT Generative AI Impact Consortium

Enter the MIT Generative AI Impact Consortium, a collaboration between industry leaders and MIT’s top minds.

As MIT President Sally Kornbluth highlighted last year, the Institute is poised to address the societal impacts of generative AI through bold collaborations.

Generative AI continues to advance at lightning speed, but its future depends on building a solid foundation.

“Now is a perfect time to look at the fundamentals — the building blocks that will make generative AI more effective and safer to use,” adds Kraska.

“Progress on generative AI is not zero-sum, so it makes sense for this to be an open-source initiative.” While participants may approach success from different angles, they s…

2 weeks, 1 day назад @ news.mit.edu
User-friendly system can help developers build more efficient simulations and AI models

To improve the efficiency of AI models, MIT researchers created an automated system that enables developers of deep learning algorithms to simultaneously take advantage of two types of data redundancy.

Because the system utilizes a user-friendly programming language, it could optimize machine-learning algorithms for a wide range of applications.

The system could also help scientists who are not experts in deep learning but want to improve the efficiency of AI algorithms they use to process data.

Cutting out computation
In machine learning, data are often represented and manipulated as multidimensional arrays known as tensors.

Because the system is automated, it could be especially useful in …

2 weeks, 1 day назад @ news.mit.edu
With generative AI, MIT chemists quickly calculate 3D genomic structures

MIT chemists have now come up with a new way to determine those 3D genome structures, using generative artificial intelligence.

These differences in chromatin conformation help determine which genes are expressed in different cell types, or at different times within a given cell.

Over the past 20 years, scientists have developed experimental techniques for determining chromatin structures.

The AI model that they designed can quickly analyze DNA sequences and predict the chromatin structures that those sequences might produce in a cell.

“If you repeat your experiment multiple times, in different cells, you will very likely end up with a very different conformation.

2 weeks, 4 days назад @ news.mit.edu
3 Questions: Modeling adversarial intelligence to exploit AI’s security vulnerabilities

Q: In what ways can artificial adversarial intelligence play the role of a cyber attacker, and how does artificial adversarial intelligence portray a cyber defender?

Think of the specialized, nefarious intelligence that these attackers marshal — that's adversarial intelligence.

My research goal is to replicate this specific kind of offensive or attacking intelligence, intelligence that is adversarially oriented (intelligence that human threat actors rely upon).

Another thing stands out about adversarial intelligence: Both Tom and Jerry are able to learn from competing with one another!

Q: What are some examples in our everyday lives where artificial adversarial intelligence has kept us safe?

2 weeks, 6 days назад @ news.mit.edu
MIT students' works redefine human-AI collaboration

Imagine a boombox that tracks your every move and suggests music to match your personal dance style.

That’s the idea behind “Be the Beat,” one of several projects from MIT course 4.043/4.044 (Interaction Intelligence), taught by Marcelo Coelho in the Department of Architecture, that were presented at the 38th annual NeurIPS (Neural Information Processing Systems) conference in December 2024.

With over 16,000 attendees converging in Vancouver, NeurIPS is a competitive and prestigious conference dedicated to research and science in the field of artificial intelligence and machine learning, and a premier venue for showcasing cutting-edge developments.

The course investigates the emerging field…

2 weeks, 6 days назад @ news.mit.edu
New training approach could help AI agents perform better in uncertain conditions

Their results indicate that, in some situations, training a simulated AI agent in a world with less uncertainty, or “noise,” enabled it to perform better than a competing AI agent trained in the same, noisy world they used to test both agents.

The researchers studied this phenomenon by training AI agents to play Atari games, which they modified by adding some unpredictability.

They hope these results fuel additional research toward developing better training methods for AI agents.

If their exploration patterns are different, then the agent trained in the noisy environment tends to perform better.

They also want to build training environments designed to leverage the indoor training effect, …

2 weeks, 6 days назад @ news.mit.edu
Berkeley AI
последний пост 3 months, 1 week назад
Virtual Personas for Language Models via an Anthology of Backstories

Virtual Personas for Language Models via an Anthology of Backstories
We introduce Anthology, a method for conditioning LLMs to representative, consistent, and diverse virtual personas by generating and utilizing naturalistic backstories with rich details of individual values and experience.

What does it mean for large language models (LLMs) to be trained on massive text corpora, collectively produced by millions and billions of distinctive human authors?

In this work, we introduce Anthology, an approach for steering LLMs to representative, consistent, and diverse virtual personas by providing richly detailed life narratives of individuals as conditioning context to models.

By grounding langu…

3 months, 1 week назад @ bair.berkeley.edu
Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination

Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination
Sample language model responses to different varieties of English and native speaker reactions.

Over 1 billion people around the world speak varieties such as Indian English, Nigerian English, Irish English, and African-American English.

Then, we compared the language model responses to the “standard” varieties and the non-“standard” varieties.

Here, we included the original GPT-3.5 responses, plus responses from GPT-3.5 and GPT-4 where the models were told to imitate the style of the input.

That can reinforce barriers against speakers of non-“standard” varieties as AI models become increasingly used in …

5 months назад @ bair.berkeley.edu
How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark

How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark
When we began studying jailbreak evaluations, we found a fascinating paper claiming that you could jailbreak frontier LLMs simply by translating forbidden prompts into obscure languages.

This blog post shows how to use a new, state-of-the-art jailbreak benchmark, StrongREJECT, to accurately and robustly evaluate jailbreak methods.

PAP instructs an attacker model to persuade a victim model to give it harmful information using techniques like misrepresentation and logical appeals.

We conducted two experiments to test this hypothesis:
We used StrongREJECT to evaluate 37 jailbreak methods on an unaligned model; Dolp…

5 months, 3 weeks назад @ bair.berkeley.edu
Are We Ready for Multi-Image Reasoning? Launching VHs: The Visual Haystacks Benchmark!

Launching VHs: The Visual Haystacks Benchmark!

Humans excel at processing vast arrays of visual information, a skill that is crucial for achieving artificial general intelligence (AGI).

Visual Haystacks: the first "visual-centric" Needle-In-A-Haystack (NIAH) benchmark designed to rigorously evaluate Large Multimodal Models (LMMs) in processing long-context visual information.

The first NIAH benchmark for visual reasoning was introduced by Google in the Gemini-v1.5 technical report.

What is the Visual Haystacks (VHs) Benchmark?

7 months назад @ bair.berkeley.edu
TinyAgent: Function Calling at the Edge

TinyAgent: Function Calling at the Edge
The ability of LLMs to execute commands through plain language (e.g.

The framework is open sourced and available at https://github.com/SqueezeAILab/TinyAgent
Teaching LLMs to do Function Calling
Figure 1: Overview of the LLMCompiler Function Calling Planner.

Once this function calling plan is generated, we can parse it and call each function based on the dependencies.

With our dataset in place, we can now proceed to fine-tune off-the-shelf SLMs to enhance their function calling capability.

Latency is the end-to-end latency of the function calling planner, including the prompt processing time and generation.

8 months, 3 weeks назад @ bair.berkeley.edu
AWS Machine Learning
последний пост 1 час назад
How Formula 1® uses generative AI to accelerate race-day issue resolution

Recognizing this challenge as an opportunity for innovation, F1 partnered with Amazon Web Services (AWS) to develop an AI-driven solution using Amazon Bedrock to streamline issue resolution.

Amazon Bedrock Agents facilitates interaction with internal systems such as databases and Amazon Elastic Compute Cloud (Amazon EC2) instances and external systems such as Jira and Datadog.

Building the RCA assistant with Amazon Bedrock Agents and Amazon Bedrock Knowledge Bases
Amazon Bedrock was used to build an agentic (agent-based) RAG solution for the RCA assistant.

In the chat assistant, the full conversation history is displayed, and the conversation can be reset by choosing Clear.

In most cases, th…

1 час назад @ aws.amazon.com
Using Amazon Rekognition to improve bicycle safety

To solve some of these problems, I have developed a simple solution using Amazon Rekognition video analysis.

The Amazon Rekognition API returns a JSON response identifying the selected labels and timestamps of detected objects.

Conclusion
In this example I demonstrated how to use Amazon Rekognition video analysis in a unique scenario.

In this example I demonstrated how using Amazon Rekognition with other serverless services can yield a serverless video processing workflow that—in this case—can help improve the safety of cyclists.

Get started today by using Amazon Rekognition for your computer vision use case.

1 day, 6 hours назад @ aws.amazon.com
Build a dynamic, role-based AI agent using Amazon Bedrock inline agents

AI agents continue to gain momentum, as businesses use the power of generative AI to reinvent customer experiences and automate complex workflows.

This is where the new inline agents capability in Amazon Bedrock Agents becomes transformative.

Inline agents in Amazon Bedrock Agents
This runtime flexibility enabled by inline agents opens powerful new possibilities, such as:
Rapid prototyping – Inline agents minimize the time-consuming create/update/prepare cycles traditionally required for agent configuration changes.

Inline agents expand your options for building and deploying agentic solutions with Amazon Bedrock Agents.

Solution overviewOur HR assistant example shows how to build a single AI…

5 days, 2 hours назад @ aws.amazon.com
Use language embeddings for zero-shot classification and semantic search with Amazon Bedrock

In this post, we discuss what embeddings are, show how to practically use language embeddings, and explore how to use them to add functionality such as zero-shot classification and semantic search.

We then use Amazon Bedrock and language embeddings to add these features to a Really Simple Syndication (RSS) aggregator application.

For this post, we use the Cohere v3 Embed model on Amazon Bedrock to create our language embeddings.

Use case: RSS aggregator
To demonstrate some of the possible uses of these language embeddings, we developed an RSS aggregator website.

Semantic search – Users can search their articles using semantic search, as shown in the following screenshot.
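
As a rough sketch of the zero-shot classification idea described above: embed the candidate labels and the article, then pick the label whose embedding has the highest cosine similarity. The `embed` function below is a hypothetical stand-in for a real embedding-model call (such as Cohere Embed on Amazon Bedrock); its toy word-count vectors only illustrate the mechanics.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in for an embedding model API call; returns an
    # L2-normalized toy vector based on word counts, for illustration only.
    vocab = ["finance", "stock", "sports", "game", "ai", "model"]
    v = np.array([float(text.lower().count(w)) for w in vocab])
    norm = np.linalg.norm(v)
    return v / norm if norm else v

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b)  # vectors are already unit length

# Zero-shot classification: no training, just nearest label embedding.
labels = {"finance": embed("stock finance"), "sports": embed("game sports")}
article = embed("the stock market and finance news")
best = max(labels, key=lambda k: cosine(article, labels[k]))
print(best)  # finance
```

Semantic search works the same way: embed the query and rank stored article embeddings by cosine similarity instead of comparing against label embeddings.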

5 days, 2 hours назад @ aws.amazon.com
Fine-tune LLMs with synthetic data for context-based Q&A using Amazon Bedrock

Generate synthetic data using the Amazon Bedrock InvokeModel API
We use the Amazon Bedrock InvokeModel API to generate synthetic data for fine-tuning.

Document 2: Setting Up AWS CodeStar Before you can start using AWS CodeStar, you must complete the following steps.

To get started with AWS CodeStar: Prepare to use AWS CodeStar by following the steps in Setting Up AWS CodeStar.

Amazon Bedrock evaluationsRunning LLM-as-a-judge and human evaluation has become much easier with Amazon Bedrock.

We provided instructions on generating synthetic data using the Amazon Bedrock InvokeModel API and fine-tuning the student model using an Amazon Bedrock custom model.

6 days, 5 hours назад @ aws.amazon.com
Achieve ~2x speed-up in LLM inference with Medusa-1 on Amazon SageMaker AI

The paper Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads introduced Medusa as an alternative to speculative decoding.

Solution overview
We cover the following steps in our solution:
Prerequisites
Load and prepare the dataset
Fine-tune an LLM using a SageMaker AI training job
Train Medusa heads on top of a frozen fine-tuned LLM using a SageMaker AI training job
Deploy the fine-tuned LLM with Medusa heads on a SageMaker AI endpoint
Demonstrate LLM inference speedup
By following this solution, you can accelerate LLM inference in your applications, leading to faster response times and improved user experience.

Fine-tune an LLM using a SageMaker AI training job
We use the …

6 days, 5 hours назад @ aws.amazon.com
LLM-as-a-judge on Amazon Bedrock Model Evaluation

Amazon Bedrock, a fully managed service offering high-performing foundation models from leading AI companies through a single API, has recently introduced two significant evaluation capabilities: LLM-as-a-judge under Amazon Bedrock Model Evaluation and RAG evaluation for Amazon Bedrock Knowledge Bases.

Seamless integration: The feature integrates directly with Amazon Bedrock and remains compatible with existing Amazon Bedrock Model Evaluation features.

Let’s walk through the key components of this solution as shown in the following diagram:
LLM-as-a-judge on Amazon Bedrock Model Evaluation follows a streamlined workflow that enables systematic model evaluation.


6 days, 5 hours назад @ aws.amazon.com
From concept to reality: Navigating the Journey of RAG from proof of concept to production

Optimization techniques
The diagram below illustrates the tradeoffs to consider for a production-ready RAG application.

Amazon Bedrock Evaluations can help you evaluate your retrieval or end-to-end RAG workflow in Amazon Bedrock Knowledge Bases.

For a demonstration on how you can use a RAG evaluation framework in Amazon Bedrock to compute RAG quality metrics, refer to New RAG evaluation and LLM-as-a-judge capabilities in Amazon Bedrock.

Use tools like Amazon Bedrock Knowledge Bases so you can take advantage of fully managed support for the end-to-end RAG workflow.

For more information on building RAG applications using Amazon Bedrock Knowledge Bases, refer to Building scalable, secure, and r…

6 days, 5 hours назад @ aws.amazon.com
Meta SAM 2.1 is now available in Amazon SageMaker JumpStart

Meta SAM 2.1 overview
Meta SAM 2.1 is a state-of-the-art vision segmentation model designed for high-performance computer vision tasks, enabling advanced object detection and segmentation workflows.

To learn more about how IAM works with SageMaker AI, refer to Identity and Access Management for Amazon SageMaker AI.

Discover Meta SAM 2.1 in SageMaker JumpStart
SageMaker JumpStart provides FMs through two primary interfaces: SageMaker Studio and the SageMaker Python SDK.

To deploy Meta SAM 2.1 using the SageMaker JumpStart UI, complete the following steps: In SageMaker Unified Studio, on the Build menu, choose JumpStart models.

For more information about SageMaker JumpStart, see SageMaker JumpSt…

6 days, 23 hours назад @ aws.amazon.com
Falcon 3 models now available in Amazon SageMaker JumpStart

Today, we are excited to announce that the Falcon 3 family of models from TII is available in Amazon SageMaker JumpStart.

With SageMaker JumpStart, you can evaluate, compare, and select pre-trained foundation models (FMs), including Falcon 3 models.

Deploying a Falcon 3 model through SageMaker JumpStart offers two convenient approaches: using the intuitive SageMaker JumpStart UI or implementing programmatically through the SageMaker Python SDK.

Deploy Falcon 3 using the SageMaker JumpStart UI
Complete the following steps to deploy Falcon 3 through the JumpStart UI:
To access SageMaker JumpStart, use one of the following methods: In Amazon SageMaker Unified Studio, on the Build menu, choose J…

1 week назад @ aws.amazon.com
Building a virtual meteorologist using Amazon Bedrock Agents

Amazon Bedrock Agents can securely connect to your company’s data sources and augment the user’s request with accurate responses.

In this post, we present a streamlined approach to deploying an AI-powered agent by combining Amazon Bedrock Agents and a foundation model (FM).

Use Amazon Bedrock Agents to automate application tasksWith Amazon Bedrock Agents, you can build and configure autonomous agents in your application.

Use action groups to define actions that Amazon Bedrock agents perform
An action group defines a set of related actions that an Amazon Bedrock agent can perform to assist users.

Upon accessing the application URL, you’ll be prompted to provide information related to Amazon C…

1 week назад @ aws.amazon.com
Amazon Q Business simplifies integration of enterprise knowledge bases at scale

In this post, we propose an end-to-end solution using Amazon Q Business to simplify integration of enterprise knowledge bases at scale.

First we discuss end-to-end large-scale data integration with Amazon Q Business, covering data preprocessing, security guardrail implementation, and Amazon Q Business best practices.

Large-scale data ingestionBefore ingesting the data to Amazon Q Business, the data might need transformation into formats supported by Amazon Q Business.

Amazon CloudWatch logging for Amazon Q Business – You can use Amazon CloudWatch logging for Amazon Q Business to save the logs for the data source connectors and document-level errors, focusing particularly on failed ingestion…

1 week назад @ aws.amazon.com
Faster distributed graph neural network training with GraphStorm v0.4

GraphStorm is a low-code enterprise graph machine learning (ML) framework that provides ML practitioners a simple way of building, training, and deploying graph ML solutions on industry-scale graph data.

To achieve this, GraphStorm v0.4 with DGL-GraphBolt addresses two crucial challenges of graph learning:
Memory constraints – GraphStorm v0.4 provides compact and distributed storage of graph structure and features, which may grow in the multi-TB range.


We provide a hands-on example of using GraphStorm with GraphBolt on SageMaker for distributed training.

He led th…

1 week назад @ aws.amazon.com
Transforming credit decisions using generative AI with Rich Data Co and AWS

Making credit decisions using AI can be challenging, requiring data science and portfolio teams to synthesize complex subject matter information and collaborate productively.

Conclusion and next steps with RDCTo expedite development, RDC collaborated with AWS Startups and the AWS Generative AI Innovation Center.

“The integration of generative AI into our solution marks a pivotal moment in our mission to revolutionize credit decision-making.

His team partners with AWS customers on generative AI projects, with the goal of accelerating customers’ adoption of generative AI.

Iman Abbasnejad is a computer scientist at the Generative AI Innovation Center at Amazon Web Services (AWS) working on Gen…

1 week, 1 day назад @ aws.amazon.com
Build agentic AI solutions with DeepSeek-R1, CrewAI, and Amazon SageMaker AI

SageMaker AI features like notebooks, Amazon SageMaker Training, inference, Amazon SageMaker for MLOps, and Partner AI Apps enable advanced model builders to adapt FMs using LoRA, full fine-tuning, or training from scratch.

With SageMaker AI, you can build generative AI-powered agentic workflows using a framework of your choice.

Solution overviewCrewAI provides a robust framework for developing multi-agent systems that integrate with AWS services, particularly SageMaker AI.

Additionally, SageMaker JumpStart provides solution templates that configure infrastructure for common use cases, along with executable example notebooks to streamline ML development with SageMaker AI.

The tasks are inte…

1 week, 1 day назад @ aws.amazon.com
NVIDIA
последний пост 3 days, 20 hours назад
Featured Sessions for Students at NVIDIA GTC 2025


3 days, 20 hours назад @ nvidia.com
Using NetworkX, Jaccard Similarity, and cuGraph to Predict Your Next Favorite Movie

Jaccard Similarity compares pairs of nodes and computes a similarity coefficient using their relationships in the graph.

Jaccard Similarity computes a similarity coefficient using the sizes of the sets of neighbors for the two nodes being compared.

But as we’ll see, performance becomes a limitation for larger-sized graphs—such as those needed for our movie recommendation system—when using NetworkX without the GPU-accelerated cuGraph backend.

Chart showing speedup of cuGraph over NetworkX for Jaccard Similarity computation for various movies.

To learn more about accelerated graph analysis using NetworkX and NVIDIA cuGraph, visit https://rapids.ai/nx-cugraph.
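
The neighbor-set computation described above is easy to sketch directly: the Jaccard coefficient of two nodes is the size of the intersection of their neighbor sets divided by the size of the union. NetworkX exposes the same metric as `jaccard_coefficient` (and the cuGraph backend accelerates it on GPUs); the tiny user-movie graph below is made up purely for illustration.

```python
# Adjacency as neighbor sets: which movies each user has watched.
# A recommender can suggest movies liked by users with high similarity.
graph = {
    "alice": {"Alien", "Blade Runner", "Dune"},
    "bob": {"Alien", "Blade Runner", "Heat"},
    "carol": {"Heat", "Casino"},
}

def jaccard(u: str, v: str) -> float:
    # |N(u) & N(v)| / |N(u) | N(v)|, the similarity coefficient from the post
    nu, nv = graph[u], graph[v]
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0

print(jaccard("alice", "bob"))    # 0.5  (2 shared of 4 total movies)
print(jaccard("alice", "carol"))  # 0.0  (no overlap)
```

On a graph of real size, the equivalent NetworkX call would be `nx.jaccard_coefficient(G, pairs)`, which is where the cuGraph backend's GPU speedup matters.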

5 days, 5 hours назад @ developer.nvidia.com
Physicists Tap James Webb Space Telescope to Track New Asteroids and City-Killer Rock

Smaller asteroids have been located farther out with the world’s most powerful space telescope, which can now also track 2024 YR4, a city-killer asteroid with a 2.3% chance of hitting Earth.

But researchers focused on asteroid tracking are on a mission to locate them for today’s real-world concerns: planetary defense.

The new and unexpected discovery tool applied in this research is NASA’s James Webb Space Telescope (JWST), which was tapped for views of these asteroids from previous research and enabled by NVIDIA accelerated computing.

The finding of more than 100 space rocks of this size marks the smallest asteroids ever detected in the main asteroid belt.

The JWST technology will soon be …

5 days, 6 hours назад @ blogs.nvidia.com
GeForce NOW Welcomes Warner Bros. Games to the Cloud With ‘Batman: Arkham’ Series

It’s a match made in heaven — GeForce NOW and Warner Bros. Games are collaborating to bring the beloved Batman: Arkham series to the cloud as part of GeForce NOW’s fifth-anniversary celebration.

A Match Made in Gotham CityGeForce NOW is welcoming Warner Bros. Games to the cloud with the Batman: Arkham series, including Arkham Asylum Game of the Year Edition, Batman: Arkham City Game of the Year Edition and Batman: Arkham Knight Premium.

With enhanced gameplay and an even larger arsenal of gadgets, Batman: Arkham City elevates the Batman experience to new heights.

Conclude the Batman: Arkham trilogy in style with Batman: Arkham Knight.

With stunning visuals and refined gameplay, Batman: Arkh…

5 days, 8 hours назад @ blogs.nvidia.com
How Scaling Laws Drive Smarter, More Powerful AI

Scaling laws describe how the performance of AI systems improves as the size of the training data, model parameters or computational resources increases.

Together, these AI scaling laws — pretraining scaling, post-training scaling and test-time scaling, also called long thinking — reflect how the field has evolved with techniques to use additional compute in a wide variety of increasingly complex AI use cases.

While pretraining is like sending an AI model to school to learn foundational skills, post-training enhances the model with skills applicable to its intended job.

Another, newer technique, reinforcement learning from AI feedback (RLAIF), instead uses feedback from AI models to guide t…
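
A pretraining scaling law of the kind described above is often written as a power law in compute: loss falls as compute grows, flattening toward an irreducible floor. The constants below are invented for illustration, not fitted to any real model family.

```python
# Illustrative power law: L(C) = a * C**(-alpha) + L_inf, where C is training
# compute, alpha controls how fast loss improves, and L_inf is the
# irreducible loss floor. All three constants are made-up example values.
a, alpha, L_inf = 10.0, 0.1, 1.0

def predicted_loss(compute: float) -> float:
    return a * compute ** (-alpha) + L_inf

# Each 10x increase in compute shrinks the reducible loss by the same
# constant factor (10**-alpha), which is why scaling curves look like
# straight lines on log-log axes.
for c in (1e18, 1e19, 1e20):
    print(f"{c:.0e} FLOPs -> loss {predicted_loss(c):.4f}")
```

The same functional form is commonly used for data and parameter count; post-training and test-time scaling then describe further compute spent after this pretraining curve is exhausted.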

6 days, 5 hours назад @ blogs.nvidia.com
Safety First: Leading Partners Adopt NVIDIA Cybersecurity AI to Safeguard Critical Infrastructure

Critical infrastructure cybersecurity is advancing to thwart the next wave of emerging threats in the AI era.

Leading operational technology (OT) providers today showcased at the S4 conference for industrial control systems (ICS) and OT cybersecurity how they’re adopting the NVIDIA cybersecurity AI platform to deliver real-time threat detection and critical infrastructure protection.

By harnessing NVIDIA’s cybersecurity AI platform, these partners can provide exceptional visibility into critical infrastructure environments, achieving robust and adaptive security while delivering operational continuity.

The platform integrates NVIDIA’s accelerated computing and AI, featuring NVIDIA BlueField…

6 days, 6 hours назад @ blogs.nvidia.com
What Are Foundation Models?

They said transformer models, large language models (LLMs), vision language models (VLMs) and other neural networks still being built are part of an important new category they dubbed foundation models.

Foundation Models Defined: A foundation model is an AI neural network — trained on mountains of raw data, generally with unsupervised learning — that can be adapted to accomplish a broad range of tasks.

With a little fine-tuning, foundation models can handle jobs from translating text to analyzing medical images to performing agent-based behaviors.

“I think we’ve uncovered a very small fraction of the capabilities of existing foundation models, let alone future ones,” said Percy Liang, the cen…

6 days, 23 hours ago @ blogs.nvidia.com
NVIDIA CEO Awarded for Advancing Precision Medicine With Accelerated Computing, AI

NVIDIA’s contributions to accelerating medical imaging, genomics, computational chemistry and AI-powered robotics were honored Friday at the Precision Medicine World Conference in Santa Clara, California, where NVIDIA founder and CEO Jensen Huang received a Luminary award.

The Precision Medicine World Conference brings together healthcare leaders, top global researchers and innovators across biotechnology.

Its Luminary award recognizes people transforming healthcare by advancing precision medicine in the clinic.

Advancing Precision Medicine With Accelerated Computing: Huang spoke about the ways AI will support the work of doctors, scientists and researchers advancing medicine.

Learn more abou…

1 week ago @ blogs.nvidia.com
Technovation Empowers Girls in AI, Making AI Education More Inclusive and Engaging

The founder and CEO of education nonprofit Technovation joined the AI Podcast in 2019 to discuss the AI Family Challenge.

In this episode of the NVIDIA AI Podcast, Chklovski and Anshita Saini, a Technovation alumna and member of the technical staff at OpenAI, explore how the nonprofit empowers girls worldwide through technology education.

They discuss the organization’s growth from its early days to its current focus on AI education and real-world problem-solving.

He emphasizes the importance of AI education, inclusivity and public-private partnerships in preparing the global workforce for the future.

They discuss the need to hear voices from the disability community in conversations about …

1 week ago @ blogs.nvidia.com
NVIDIA Open GPU Datacenter Drivers for RHEL9 Signed by Red Hat

NVIDIA and Red Hat have partnered to bring continued improvements to the precompiled NVIDIA Driver introduced in 2020.

Last month, NVIDIA announced that the open GPU driver modules will become the default recommended way to enable NVIDIA graphics hardware.

Starting with RHEL 9.5, NVIDIA now offers a tech preview repository available for trial until April 30, 2025.

They install just fine without any warnings, but upon reboot the NVIDIA graphics driver will not be available: running $ nvidia-smi reports "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver."

Summary: Now with signed packages from NVIDIA and Red Hat, secure boot with NVIDIA on RHEL 9 is even better.

1 week, 1 day ago @ developer.nvidia.com
AI-Designed Proteins Take on Deadly Snake Venom

Their research, published in Nature, introduces a new class of synthetic proteins that successfully protect animals from otherwise lethal doses of snake venom toxins.

Instead of screening a vast number of these proteins in a lab, they used AI tools to predict how the designer proteins would interact with snake venom toxins, rapidly homing in on the most promising designs.

The results were remarkable: newly designed proteins bound tightly to three-finger toxins (3FTx), the deadliest components of elapid venom, effectively neutralizing their toxic effects.

The AI-designed proteins were small, heat-resistant and easy to manufacture — no cold storage required.

By replacing trial-and-error drug d…

1 week, 4 days ago @ blogs.nvidia.com
Get Started with GPU Acceleration for Data Science

This post shares tips for transitioning from CPU data science libraries to GPU-accelerated workflows, especially for experienced data scientists.

Similar to pandas, cuDF pandas supports file formats such as .csv, .json, .pickle, and .parquet, enabling GPU-accelerated data manipulation.

This highlights the efficiency of GPU acceleration with cuDF pandas for large-scale data operations.

Verifying GPU utilization: When working with different data types, it is important to verify whether your system is utilizing the GPU effectively.

To learn more about GPU-accelerated data science, see 10 Minutes to Data Science: Transitioning Between RAPIDS cuDF and CuPy Libraries and RAPIDS cuDF…
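
As a hedged sketch of the workflow the post describes: RAPIDS's cudf.pandas accelerator patches pandas so existing code can run on the GPU, and the snippet below falls back to plain CPU pandas when the RAPIDS stack is not installed. The DataFrame contents are illustrative.

```python
# Sketch: enable cuDF's pandas accelerator when available, else fall back to CPU.
# cudf.pandas.install() (from RAPIDS) patches pandas so that subsequent
# "import pandas" statements transparently dispatch work to the GPU.
try:
    import cudf.pandas
    cudf.pandas.install()  # must run before pandas is imported
except ImportError:
    pass  # no GPU / RAPIDS not installed: regular pandas is used instead

import pandas as pd

df = pd.DataFrame({"group": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})
# The same pandas API runs on GPU (cuDF) or CPU depending on what installed above.
means = df.groupby("group")["value"].mean()
print(means)
```

Because the API surface is unchanged, the same script can be profiled on CPU and GPU to verify utilization, as the post recommends.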

1 week, 4 days ago @ developer.nvidia.com
When the Earth Talks, AI Listens

A team of researchers from the Earth and environmental sciences division at Los Alamos National Laboratory repurposed Meta’s Wav2Vec-2.0, an AI model designed for speech recognition, to analyze seismic signals from Hawaii’s 2018 Kīlauea volcano collapse.

The AI model outperformed traditional methods, such as gradient-boosted trees, which struggle with the unpredictable nature of seismic signals.

High-performance NVIDIA GPUs accelerated training, enabling the AI to efficiently extract meaningful patterns from continuous seismic signals.

This study suggests that AI models designed for speech recognition may be uniquely suited to interpreting the intricate, shifting signals faults generate ove…

1 week, 5 days ago @ blogs.nvidia.com
Medieval Mayhem Arrives With ‘Kingdom Come: Deliverance II’ on GeForce NOW

The month kicks off with Kingdom Come: Deliverance II.

Experience the highly anticipated sequel’s stunning open world at GeForce RTX quality in the cloud, available to stream across devices at launch.

It leads seven games joining the GeForce NOW library of over 2,000 titles, along with MARVEL vs. CAPCOM Fighting Collection: Arcade Classics.

Chainmail Meets the Cloud: Kingdom Come: Deliverance II continues the epic, open-world RPG saga set in the brutal and realistic medieval world of Bohemia.

The game also features enhanced graphics powered by GeForce RTX, making it ideal to stream on GeForce NOW even without a game-ready rig.

1 week, 5 days ago @ blogs.nvidia.com
Featured Researcher and Educator Sessions at NVIDIA GTC 2025

1 week, 5 days ago @ nvidia.com
Facebook
last post 1 week, 6 days ago
Revolutionizing software testing: Introducing LLM-powered bug catchers

WHAT IT IS: Meta’s Automated Compliance Hardening (ACH) tool is a system for mutation-guided, LLM-based test generation.

Traditionally, automated test generation techniques sought merely to increase code coverage.

LLM-based test generation and LLM-based mutant generation are not new, but this is the first time they’ve been combined and deployed in large-scale industrial systems.

WHAT’S NEXT: Our novel approach combines LLM-based test generation and mutant generation to help automate complex technical organizational workflows in this space.

READ THE PAPER: Mutation-Guided LLM-based Test Generation at Meta

1 week, 6 days ago @ engineering.fb.com
Meta Andromeda: Supercharging Advantage+ automation with the next-gen personalized ads retrieval engine

Unlocking advertiser value through industry-leading ML innovation: Meta Andromeda is a personalized ads retrieval engine that leverages the NVIDIA Grace Hopper Superchip to enable cutting-edge ML innovation in the ads retrieval stage, driving efficiency and advertiser performance.

Its deployment across Instagram and Facebook applications has achieved +6% recall improvement to the retrieval system, delivering +8% ads quality improvement on selected segments.

Andromeda is designed to maximize ads performance by utilizing the exponential growth in volume of eligible ads available to the retrieval stage.

The design is optimized for AI hardware, minimizing memory bandwidth bottlenecks and enablin…

2 months, 2 weeks ago @ engineering.fb.com
Sequence learning: A paradigm shift for personalized ads recommendations

Meta’s ad recommendation engine, powered by deep learning recommendation models (DLRMs), has been instrumental in delivering personalized ads to people.

Learning from sequences: developing new sequence learning architectures to replace traditional DLRM neural network architectures.

A paradigm shift with learning from sequences for recommendation systems: Meta’s new system for ads recommendations uses sequence learning at its core.

Scaling the new sequence learning paradigm: Following the redesign to shift from sparse feature learning to event-based sequence learning, the next focus was scaling across two domains — scaling the sequence learning architecture and scaling event sequences to be long…

3 months ago @ engineering.fb.com
OCP Summit 2024: The open future of networking hardware for AI

At the Open Compute Project (OCP) Summit 2024, we’re sharing details about our next-generation network fabric for our AI training clusters.

We’ve expanded our network hardware portfolio and are contributing two new disaggregated network fabrics and a new NIC to OCP.

At Meta, we believe that open hardware drives innovation.

At Meta, we envision a future of AI hardware systems that are not only scalable, but also open and collaborative.

We encourage anyone who wants to help advance the future of networking hardware for AI to engage with OCP and Meta to help shape the future of AI infrastructure.

4 months ago @ engineering.fb.com
Meta’s open AI hardware vision

At the Open Compute Project (OCP) Global Summit 2024, we’re showcasing our latest open AI hardware designs with the OCP community.

These innovations include a new AI platform, cutting-edge open rack designs, and advanced network fabrics and components.

The open future of AI infra: Meta is committed to open source AI.

We must also prioritize open and standardized models so we can leverage collective expertise, make AI more accessible, and work towards minimizing biases in our systems. Just as important, we also need open AI hardware systems.

By addressing AI’s infrastructure needs together, we can unlock the true promise of open AI for everyone.

4 months ago @ engineering.fb.com
How open source AI can improve population estimates, sustainable energy, and the delivery of climate change interventions

Why we need better population maps: Accurate estimates of population are taken for granted in many countries.

As the world’s natural resource and energy demands scale, accurate population estimates also offer significant opportunities to improve sustainability efforts.

In addition to total population counts, Meta’s population maps also include demographic breakdowns for groups such as the number of children under five, women of reproductive age, youth, and the elderly.

AI-powered population estimates have been scientifically evaluated to be among the most accurate in the world for mapping population distribution for a variety of geographies and use-cases.

Please visit the Data for Good websit…

4 months, 2 weeks ago @ engineering.fb.com
Simulator-based reinforcement learning for data center cooling optimization

Meta is revamping its new data center design to optimize for artificial intelligence, and the same methodology will be applicable to future data center optimizations to improve operational efficiency.

A reinforcement learning approach to data center cooling: Reinforcement learning (RL) is good at modeling control systems as sequential state machines.

There are also various RL approaches reported, such as transforming cooling optimization via deep reinforcement learning and data center cooling using model-predicti…
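
The "control system as a sequential state machine" framing can be sketched with tabular Q-learning on a toy cooling problem. Everything below is a hypothetical illustration, not Meta's simulator: states, dynamics, and rewards are invented for demonstration.

```python
import random

# Toy sketch (not Meta's simulator): tabular Q-learning chooses a fan setting
# (0 = low, 1 = high) for a discretized temperature state; the reward penalizes
# overheating more heavily than fan energy. All numbers are illustrative.
random.seed(0)
STATES, ACTIONS = 3, 2            # temperature: 0 = cool, 1 = warm, 2 = hot
Q = [[0.0] * ACTIONS for _ in range(STATES)]

def step(state, action):
    """Hypothetical dynamics: high fan cools one level; low fan heats one level."""
    nxt = max(0, state - 1) if action == 1 else min(STATES - 1, state + 1)
    reward = -2.0 * (nxt == 2) - 0.5 * action   # overheating costs more than energy
    return nxt, reward

state = 0
for _ in range(50_000):
    if random.random() < 0.2:                   # epsilon-greedy exploration
        action = random.randrange(ACTIONS)
    else:
        action = max(range(ACTIONS), key=lambda a: Q[state][a])
    nxt, reward = step(state, action)
    # Standard Q-learning update with learning rate 0.1 and discount 0.9.
    Q[state][action] += 0.1 * (reward + 0.9 * max(Q[nxt]) - Q[state][action])
    state = nxt

# After training, the learned policy in the hot state should prefer the high fan.
print("hot-state action:", max(range(ACTIONS), key=lambda a: Q[2][a]))
```

A simulator-based setup like the one described in the post replaces these toy dynamics with a calibrated thermal model, so the agent can learn safely before deployment.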

5 months, 1 week ago @ engineering.fb.com
How PyTorch powers AI training and inference

Learn about new PyTorch advancements for LLMs and how PyTorch is enhancing every aspect of the LLM lifecycle.

In this talk from AI Infra @ Scale 2024, software engineers Wanchao Liang and Evan Smothers are joined by Meta research scientist Kimish Patel to discuss our newest features and tools that enable large-scale training, memory efficient fine-tuning, and on-device LLM capabilities.

First, they cover the importance of memory-efficient fine-tuning and a few common architectural and algorithmic techniques to enable fine-tuning on consumer-grade hardware.

Then they discuss the challenges of deploying large models for on-device deployment and how …

5 months, 4 weeks ago @ engineering.fb.com
Inside the hardware and co-design of MTIA

In this talk from AI Infra @ Scale 2024, Joel Colburn, a software engineer at Meta, technical lead Junqiang Lan, and software engineer Jack Montgomery discuss the second generation of MTIA, Meta’s in-house training and inference accelerator.

They cover the co-design process behind building the second generation of Meta’s first-ever custom silicon for AI workloads, including the PyTorch software ecosystem, and the model architectures for Meta’s key applications.

They demonstrate how MTIA achieves the performance, efficiency, and developer experience to successfully launch models into production.

They also highlight several co-design examples where special silicon features are utilized to acc…

6 months ago @ engineering.fb.com
Bringing Llama 3 to life

At AI Infra @ Scale 2024, Meta engineers discussed every step of how we built and brought Llama 3 to life, from data and training to inference.

Joe Spisak, Product Director and Head of Generative AI Open Source at Meta, talks about the history of Llama and Meta’s overarching vision for open source AI.

He’s joined by Delia David, a software engineer at Meta, to discuss all things data-related for GenAI.

Kaushik Veeraraghavan, a software engineer at Meta, discusses how Meta trains Llama at scale and delves into the data center, networking, and software investments that have enabled the development of Meta’s Llama 3 models.

Finally, Ye (Charlotte) Qia, a production engineer at Meta, discusses …

6 months ago @ engineering.fb.com
Aparna Ramani discusses the future of AI infrastructure

Delivering new AI technologies at scale also means rethinking every layer of our infrastructure – from silicon and software systems to our data center designs.

For the second year in a row, Meta’s engineering and infrastructure teams returned for the AI Infra @ Scale conference, where they discussed the challenges of scaling up an infrastructure for AI as well as work being done on our large-scale GPU clusters, open hardware designs for next-generation data center hardware, and how Meta is building custom silicon like the Meta Training and Inference Accelerator (MTIA) to handle some of our AI training workloads.

Aparna Ramani, VP of Engineering at Meta, responsible for AI infrastructu…

6 months ago @ engineering.fb.com
How Meta animates AI-generated images at scale

Meta AI’s animate feature, which lets people generate a short animation of a generated image, carried unique challenges in this regard.

Here’s how we were able to deploy Meta AI’s animate feature using a combination of latency optimizations, traffic management, and other novel techniques.

We started by looking at the data for previous traffic on our AI-generated media both at their launches and over time.

With these changes, the preponderance of requests remained in region and latency dropped to roughly what we would expect.

The service tries to take a chunk of that region’s requests and offload them to a nearby region that can handle them without becoming more overloaded.

6 months, 1 week ago @ engineering.fb.com
A RoCE network for distributed AI training at scale

Our paper, “RDMA over Ethernet for Distributed AI Training at Meta Scale,” provides the details on how we design, implement, and operate one of the world’s largest AI networks at scale.

These RoCE clusters support an extensive range of production distributed GPU training jobs, including ranking, content recommendation, content understanding, natural language processing, and GenAI model training, among other workloads.

However, our experience with distributed AI training workloads provides a different perspective on tailoring the congestion control algorithms.

Moving forwardThe design and operation of large-scale RoCE networks for distributed AI training workloads have evolved to meet the …

6 months, 2 weeks ago @ engineering.fb.com
Meet Caddy – Meta’s next-gen mixed reality CAD software

What happens when a team of mechanical engineers gets tired of looking at flat images of 3D models over Zoom?

Meet the team behind Caddy, a new CAD app for mixed reality.

They join Pascal Hartig (@passy) on the Meta Tech Podcast to talk about teaching themselves to code, disrupting the CAD software space, integrating Caddy with Llama 3, and so much more!

You can find the episode wherever you get your podcasts. The Meta Tech Podcast is brought to you by Meta and highlights the work Meta’s engineers are doing at every level – from low-level frameworks to end-user features.

And if you’re interested in lea…

7 months ago @ engineering.fb.com
AI Lab: The secrets to keeping machine learning engineers moving fast

The key to developer velocity across AI lies in minimizing time to first batch (TTFB) for machine learning (ML) engineers.

AI Lab prevents TTFB regressions whilst enabling experimentation to develop improvements.

Optimizing TTFB helps ML engineers move fast: The overhead induced by TTFB is on the critical path for most ML development.

Here, we see the true utility of a framework like AI Lab and how it was used to facilitate this sweeping change.

O(Releases): Running a more holistic set of AI Lab tests prior to release and performing a bisect-like attribution process to find the root cause.

7 months, 1 week ago @ engineering.fb.com
Uber Engineering
last post: None
neptune.ai
last post 5 days, 11 hours ago
Learnings From Teams Training Large-Scale Models: Challenges and Solutions For Monitoring at Hyperscale

“What is not measured, cannot be improved.” This quote has become a guiding principle for teams training foundation models.

During my talk at NeurIPS, I broke down five key lessons learned from teams facing large-scale model training and monitoring.

Waabi’s teams, running large-scale ML experiments, needed a way to organize and share their experiment data efficiently.

Visualizing large datasets: We generally do not think of dataset visualization as part of experiment monitoring.

Moving forward: The path to efficient hyperscale training lies in combining robust monitoring, advanced debugging tools, and comprehensive experiment tracking.

5 days, 11 hours ago @ neptune.ai
Mixture of Experts LLMs: Key Concepts Explained

TL;DR Mixture of Experts (MoE) is a type of neural network architecture that employs sub-networks (experts) to process specific input parts.

This is the key idea behind Mixture of Expert LLMs.

The Switch Transformer, Mixtral, GLaM, GShard, and DeepSeekMoE are Mixture of Experts LLMs (MoEs), which require only executing a portion of the model’s computational graph during inference.

Optimization strategies for MoE LLMs are discussed comprehensively in the papers introducing the Switch Transformer, GShard, and GLaM.

Mixture of Experts (MoE) is an approach to scaling LLMs to trillions of parameters with conditional computation while avoiding exploding computational costs.
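
The conditional-computation idea can be sketched with a softmax router that evaluates only the top-k experts per input. The experts and router logits below are toy stand-ins, not any real MoE layer.

```python
import math

# Sketch of MoE conditional computation: a softmax router scores all experts,
# but only the top-k are actually evaluated, which is where the compute savings
# come from. Experts and logits here are illustrative toy values.
def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, router_logits, k=2):
    """Combine the top-k experts' outputs, weighted by renormalized router scores."""
    probs = softmax(router_logits)
    topk = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in topk)
    # Only the selected experts run; the rest of the computational graph is skipped.
    return sum((probs[i] / total) * experts[i](x) for i in topk)

experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x - 3, lambda x: 10 * x]
y = moe_forward(3.0, experts, router_logits=[2.0, 1.0, -1.0, 0.0], k=2)
print(round(y, 3))
```

With k=2 of 4 experts selected, only half the expert compute runs per token, which is how MoEs scale parameter counts without proportional inference cost.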

1 week, 5 days ago @ neptune.ai
Hyperparameter Optimization For LLMs: Advanced Strategies

Advanced hyperparameter optimization strategies, like population-based training, Bayesian optimization, and adaptive LoRA, promise to balance computational effort and outcome.

To avoid this, learning rate schedules for LLMs start with a small learning rate and slowly ramp it up to its maximum value.
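
The warmup behavior described above is commonly paired with a decay phase after the peak. A minimal sketch, with illustrative step counts and learning rates (the function name and constants are made up for demonstration):

```python
import math

# Sketch of a warmup-then-cosine-decay schedule: linear ramp from near zero to
# peak_lr over warmup_steps, then cosine decay toward min_lr. Constants are
# illustrative, not tuned values from the article.
def lr_at(step, warmup_steps=100, total_steps=1000, peak_lr=3e-4, min_lr=3e-5):
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps          # linear warmup
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

print(lr_at(0), lr_at(99), lr_at(999))
```

Warmup length, peak, and floor are themselves hyperparameters, which is exactly the kind of search space the optimization strategies in this article target.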

Can we use traditional machine learning hyperparameter optimization methods for LLMs?

Hands-on: LLM hyperparameter optimization with neptune.ai: Optuna is a framework for optimizing hyperparameter search using Bayesian optimization.

What’s next in LLM hyperpar…

2 weeks, 5 days ago @ neptune.ai
Multimodal Large Language Models

TL;DR Multimodal Large Language Models (MLLMs) process data from different modalities like text, audio, image, and video.

This article explores Multimodal Large Language Models, covering their core functionalities, challenges, and potential for various machine-learning domains.

Let’s break down the concept of Multimodal Large Language Models (MLLMs) by first understanding the terms “modal” and “multimodal”: “Modal” refers to a particular way of communicating or perceiving information.

Google: PaLM-E: Google developed an embodied language model, PaLM-E, to incorporate continuous sensor modalities into language models and establish the link between words and perceptions.

Improving how t…

3 weeks, 5 days ago @ neptune.ai
How to Build and Evaluate a RAG System Using LangChain, Ragas, and neptune.ai

In this guide, we’ll show you how to build a RAG system using the LangChain framework, evaluate its performance using Ragas, and track your experiments with neptune.ai.

Part 1: Building a baseline RAG system with LangChain: In the first part of this guide, we’ll use LangChain to build a RAG system for the blog posts in the LLMOps category on Neptune’s blog.

Ragas works smoothly with LangChain, making it a great choice for evaluating our RAG system.

Step 1: Generate a RAG evaluation dataset: An evaluation set for RAG tasks is similar to a question-answering task dataset.

Step 2: Choose RAG evaluation metrics: As mentioned earlier, Ragas offers both LLM-based and non-LLM-based metrics for RAG syste…
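
As a toy illustration of what a non-LLM-based RAG metric can look like (this is not the actual Ragas implementation), one can score how much of a generated answer is lexically supported by the retrieved context:

```python
# Toy sketch of a non-LLM RAG evaluation metric (not Ragas itself): score an
# answer by the fraction of its tokens that appear in the retrieved context.
# Real metrics use embeddings or LLM judges, but the shape is similar.
def overlap_score(answer: str, context: str) -> float:
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

context = "neptune.ai is an experiment tracker for machine learning teams"
grounded = overlap_score("an experiment tracker for teams", context)
ungrounded = overlap_score("a database for storing invoices", context)
print(grounded, ungrounded)
```

A grounded answer scores near 1.0 while an unsupported one scores low, which is the intuition behind faithfulness-style metrics in RAG evaluation suites.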

1 month, 3 weeks ago @ neptune.ai
Position: Understanding LLMs Requires More Than Statistical Generalization [Paper Reflection]

In our paper, Understanding LLMs Requires More Than Statistical Generalization, we argue that current machine learning theory cannot explain the interesting emergent properties of Large Language Models, such as reasoning or in-context learning.

Inductive biases, such as the model architecture or the optimization algorithm, affect which solution the neural network converges to.

How do language complexity and model architecture affect generalization ability?

showed how different neural network architectures generalize better for different language types.

Presumably, we’ll need to find different complexity measures for different model architectures that consider their specific inductive biases.

2 months ago @ neptune.ai
From Research to Production: Building The Most Scalable Experiment Tracker For Foundation Models

TL;DR In large-scale training of huge models, anomalies are not rare events but problematic patterns that drive failure.

The Neptune Scale experiment tracker supports fault tolerance and is designed to maintain progress despite hardware failures, making it adaptable for enterprise teams tackling LLM fine-tuning, compliance, and building domain-specific models.

Experiment tracking back then was straightforward—dealing mostly with single models or small-scale distributed systems.

One of the biggest lessons we’ve learned is that experiment tracking has evolved into experiment monitoring.

That’s why we’re focusing on building intelligent alerts and anomaly detection right into our exp…

2 months, 1 week ago @ neptune.ai
Transformers Key-Value Caching Explained

Key-value (KV) caching is a clever trick to do that: At inference time, key and value matrices are calculated for each generated token.

Implementing KV caching in large-scale production systems requires careful cache management, including choosing an appropriate strategy for cache invalidation and exploring opportunities for cache reuse.

Key-value (KV) caching is a clever trick to do just that – let’s see how it works and when to use it.

Transformer architecture overview: Before we dive into KV caching, we will need to take a short detour to the attention mechanism used in transformers.

Understanding how it works is required to spot and appreciate how KV caching optimizes transformer inferen…
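
A minimal sketch of the trick, assuming single-head attention over plain Python lists: past keys and values are cached once, so each decoding step appends and computes only the new token's pair instead of recomputing the whole history.

```python
import math

# Minimal KV-caching sketch for autoregressive decoding: keys/values of past
# tokens live in a cache; each step attends the new query over cache + new pair.
def attend(q, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return [sum((w / z) * v[i] for w, v in zip(weights, values)) for i in range(d)]

kv_cache = {"keys": [], "values": []}

def decode_step(token_q, token_k, token_v):
    # Append only the new token's key/value; earlier entries come from the cache,
    # so per-step cost grows with sequence length but avoids full recomputation.
    kv_cache["keys"].append(token_k)
    kv_cache["values"].append(token_v)
    return attend(token_q, kv_cache["keys"], kv_cache["values"])

out1 = decode_step([1.0, 0.0], [1.0, 0.0], [1.0, 2.0])
out2 = decode_step([0.0, 1.0], [0.0, 1.0], [3.0, 4.0])
print(out1, out2)
```

Production systems layer cache invalidation and reuse policies on top of this basic mechanism, as the article notes.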

2 months, 2 weeks ago @ neptune.ai
Learn From Failure: Fine-Tuning LLMs With Trial-and-Error Data For Intuitionistic Propositional Logic Proving [Paper Reflection]

In our paper, Learn from Failure: Fine-Tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving, we explored this problem experimentally.

Our goal was to assess the influence of trial-and-error information in the training data on the performance of LLMs in theorem proving.

However, at the time we published our paper, approaches to training LLMs for ATPs utilized only data on correct proof attempts.

We hope our work can raise the community’s awareness of the importance of trial-and-error data for automated theorem proving.

We believe this advancement is largely due to the substantial trial-and-error data included in the model’s training process.

2 months, 3 weeks ago @ neptune.ai
Fine-Tuning Llama 3 with LoRA: Step-by-Step Guide

We will explore these challenges and provide an example of fine-tuning the Llama 3 8B Instruct model utilizing the neptune.ai experiment tracker.

The Llama 3 training data is seven times larger than what Meta used for training Llama 2.

For pre-training, Meta combined four types of parallelization, an approach they dubbed “4D parallelism”: data, model, pipeline, and context.

Hands-on guide: resource-efficient fine-tuning of Llama 3 on Google Colab: Fine-tuning Llama 3 8B is challenging, as it requires considerable computational resources.

We’ll use the Llama 3 8B model, which is sufficient for this task despite being the smallest Llama 3 model.
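
The core LoRA update behind resource-efficient fine-tuning can be sketched in a few lines: the frozen weight W is augmented by a low-rank product B @ A scaled by alpha / r, so only B and A are trained. The dimensions and constants below are toy values, not the guide's actual configuration.

```python
# Sketch of the LoRA idea: effective weight = W + (alpha / r) * B @ A, where
# B (d x r) and A (r x d) hold far fewer parameters than W. Toy 2x2 example.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_weight(W, A, B, alpha=16, r=8):
    """Apply the low-rank update to the frozen base weight W."""
    scale = alpha / r
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, BA)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight (never updated)
B = [[0.1], [0.0]]             # 2x1 trainable down-projection (rank r = 1)
A = [[0.0, 0.2]]               # 1x2 trainable up-projection
W_adapted = lora_weight(W, A, B, alpha=2, r=1)
print(W_adapted)
```

With rank r much smaller than the weight dimension, the trainable parameter count drops by orders of magnitude, which is what makes fine-tuning an 8B model feasible on modest hardware.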

2 months, 4 weeks ago @ neptune.ai
How to Run LLMs Locally

This is the process we will be going through in the following section to answer the question: When do I decide to run LLMs locally?

Privacy: A very obvious argument in favor of running LLMs locally is that, in some cases, there is no alternative.

What does it take to run LLMs locally?

However, recent advancements in optimization techniques, such as quantization and attention mechanism optimizations, have made it possible to run LLMs locally, even on a CPU.
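
As a sketch of the quantization idea mentioned above, here is absmax int8 quantization in its simplest form. Real LLM quantizers work per block and are considerably more sophisticated; the weight values below are illustrative.

```python
# Sketch of absmax int8 quantization, one of the tricks that shrinks LLM
# weights enough to run locally: map floats into [-127, 127] via a scale
# derived from the largest absolute value, then dequantize on the fly.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

weights = [0.4, -1.27, 0.03, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(q, round(max_err, 4))
```

Storing one byte per weight instead of four (or two) is what lets multi-billion-parameter models fit in consumer RAM, at the cost of a small, bounded reconstruction error.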

LM Studio: LM Studio is a user-friendly application designed to run LLMs locally.

3 months ago @ neptune.ai
Scale And Track Your AI/ML Workflows: neptune.ai + Flyte & Union Integration

Like Union, Neptune excels in scalability, making it the ideal tracking solution for teams working on large-scale model training.

The new Neptune Flyte plugin enables you to use Neptune to track, visualize, and manage your models.

In this blog post, you’ll learn how to use the Neptune plugin on Union.

Orchestrate and track your models with Flytekit’s Neptune Plugin: In Union, data and compute are fundamental building blocks for developing all workflows.

With flytekit’s Neptune plugin, you can easily track your experiments, visualize results, and debug your models.

4 months, 1 week ago @ neptune.ai
LLM Hallucinations 101: Why Do They Appear? Can We Avoid Them?

What are LLM hallucinations?

LLM hallucinations become a problem in LLM-based applications. Most of the time, if you use an LLM, you probably won’t use a base LLM but an LLM-based assistant whose goal is to help with your requests and reliably answer your questions.

Before we dive into this further, I’d like to stress that when thinking about LLM hallucinations, it’s important to keep in mind the difference between a base LLM and an LLM-based assistant.

When we talk about LLM hallucinations as a problematic phenomenon, it’s in the context of an LLM-based assistant or system.

While it’s unlikely that this process introduces new hallucinations, hallucinations seeded upstream are amplified.

4 months, 3 weeks ago @ neptune.ai
LLM Guardrails: Secure and Controllable Deployment

LLM guardrails prevent models from generating harmful, biased, or inappropriate content and ensure that they adhere to guidelines set by developers and stakeholders.

LLM guardrails are small programs that validate and correct the models’ outputs to ensure they align with your application’s specific requirements and context.

We’ll approach the broad and constantly evolving field of LLM guardrails in three stages: First, we’ll talk about the key vulnerabilities threatening AI applications.

We’ll explore these different kinds of LLM guardrails using the Guardrails AI framework, an open-source tool for building reliable AI applications.

Rule-based data validation: The simplest type of LLM g…
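To make the rule-based idea concrete, here is a minimal, framework-free validator sketch. It is not the Guardrails AI API; the `validate_output` helper and its two checks are hypothetical examples of the kind of rules such guardrails encode.

```python
import re

def validate_output(text, max_len=500):
    """Minimal rule-based guardrail: flag outputs that leak an email address
    or exceed a length budget. Returns (ok, list_of_reasons)."""
    reasons = []
    if len(text) > max_len:
        reasons.append("too long")
    if re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", text):
        reasons.append("contains an email address")
    return (not reasons, reasons)

print(validate_output("Contact me at alice@example.com"))  # fails the PII rule
print(validate_output("Sure, here is a summary."))         # passes
```

Real guardrail frameworks layer many such validators and can also re-prompt the model to correct a failing output.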

5 months ago @ neptune.ai
Reinforcement Learning From Human Feedback (RLHF) For LLMs

TL;DR: Reinforcement Learning from Human Feedback (RLHF) unlocked the full potential of today’s large language models (LLMs).

Reinforcement Learning from Human Feedback (RLHF) has turned out to be the key to unlocking the full potential of today’s large language models (LLMs).

Related: LLM Evaluation For Text Summarization. The RLHF process consists of three steps, the first of which is collecting human feedback.

Collecting human feedback: The first step in RLHF is to collect human feedback in a so-called preference dataset.

We analyzed the three steps of the RLHF training pipeline: collecting human feedback, training the reward model, and fine-tuning the LLM.
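The reward-model step admits a compact illustration: reward models are typically trained with a pairwise Bradley-Terry loss on the preference dataset, penalizing the model when the rejected response scores above the chosen one. A minimal NumPy sketch (the rewards below are placeholder numbers; in practice they come from the learned reward model):

```python
import numpy as np

def preference_loss(r_chosen, r_rejected):
    """Pairwise Bradley-Terry loss: -log sigmoid(r_chosen - r_rejected),
    averaged over a batch of preference pairs."""
    diff = np.asarray(r_chosen, dtype=float) - np.asarray(r_rejected, dtype=float)
    return float(np.mean(np.log1p(np.exp(-diff))))  # equals -log sigmoid(diff)

# The loss falls as the reward model separates chosen from rejected responses.
print(preference_loss([2.0], [0.0]) < preference_loss([0.5], [0.0]))  # True
```

The fine-tuning step then maximizes this learned reward with an RL algorithm such as PPO, regularized to stay close to the original model.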

5 months, 1 week ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
latest post 3 weeks, 2 days ago
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (Paper Explained)

#deepseek #llm #reinforcementlearning GRPO is one of the core advancements used in Deepseek-R1, but was introduced already last year in this paper that uses a combination of new RL techniques and iterative data collection to achieve remarkable performance on mathematics benchmarks with just a 7B model. Paper: https://arxiv.org/abs/2402.03300 Abstract:

Mathematical reasoning poses a significant challenge for language models due to its complex and structured nature. In this paper, we introduce DeepSeekMath 7B, which continues pre-training DeepSeek-Coder-Base-v1.5 7B with 120B math-related tokens sourced from Common Crawl, together with natural language and code data. DeepSeekMath 7B has achie…
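The group-relative idea at the heart of GRPO fits in a few lines: sample several answers per question, score them, and use each answer's within-group normalized reward as its advantage, with no separate value network. A sketch under that reading of the paper (the function name is mine):

```python
import numpy as np

def grpo_advantages(group_rewards, eps=1e-8):
    """Advantage of each sampled answer = its reward normalized by the
    mean and standard deviation of its own group of samples."""
    r = np.asarray(group_rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Four sampled answers to one question, scored 0/1 for correctness:
adv = grpo_advantages([1.0, 0.0, 0.0, 1.0])
print(adv)  # above-average answers get positive advantage, below-average negative
```

These advantages then weight a clipped policy-gradient update, as in PPO, but without a critic.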

3 weeks, 2 days ago @ youtube.com
Traditional Holiday Live Stream

https://ykilcher.com/discord Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https:/…

1 month, 3 weeks ago @ youtube.com
Byte Latent Transformer: Patches Scale Better Than Tokens (Paper Explained)

#tokenization #llm #meta This paper does away with tokenization and creates an LLM architecture that operates on dynamically sized "patches" instead of tokens. By controlling the patch size, they gain a level of control over the tradeoff between model size and FLOPs and use that to achieve more favorable scaling behavior than classically tokenized LLMs. Paper: https://ai.meta.com/research/publications/byte-latent-transformer-patches-scale-better-than-tokens/

Code: https://github.com/facebookresearch/blt Abstract:

We introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale with significant imp…

1 month, 3 weeks ago @ youtube.com
Safety Alignment Should be Made More Than Just a Few Tokens Deep (Paper Explained)

This paper demonstrates in a series of experiments that current safety alignment techniques of LLMs, as well as corresponding jailbreaking attacks, are in large part focusing on modulating the distribution of the first few tokens of the LLM response. Paper: https://openreview.net/forum?id=6Mxhg9PtDE&s=09 Abstract:

The safety alignment of current Large Language Models (LLMs) is vulnerable. Simple attacks, or even benign fine-tuning, can jailbreak aligned models. We note that many of these vulnerabilities are related to a shared underlying issue: safety alignment can take shortcuts, wherein the alignment adapts a model's generative distribution primarily over only its very first few output to…

2 months, 1 week ago @ youtube.com
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (Paper Explained)

A deep dive into the TokenFormer and an opinion about its impact, novelty, and relation to prior work. Paper: https://arxiv.org/abs/2410.23168 Abstract:

Transformers have become the predominant architecture in foundation models due to their excellent performance across various domains. However, the substantial cost of scaling these models remains a significant concern. This problem arises primarily from their dependence on a fixed number of parameters within linear projections. When architectural modifications (e.g., channel dimensions) are introduced, the entire model typically requires retraining from scratch. As model sizes continue growing, this strategy results in increasingly high com…

2 months, 3 weeks ago @ youtube.com
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

This paper (by Apple) questions the mathematical reasoning abilities of current LLMs and designs a synthetic template-based dataset distribution to investigate various aspects around LLM performance of high-school level math questions. Paper: https://arxiv.org/abs/2410.05229 Abstract:

Recent advancements in Large Language Models (LLMs) have sparked interest in their formal reasoning capabilities, particularly in mathematics. The GSM8K benchmark is widely used to assess the mathematical reasoning of models on grade-school-level questions. While the performance of LLMs on GSM8K has significantly improved in recent years, it remains unclear whether their mathematical reasoning capabilities hav…

4 months ago @ youtube.com
Were RNNs All We Needed? (Paper Explained)

This paper poses an interesting question: how much of the performance of Mamba, S4, and other state-space-like models is actually attributable to a few very core concepts, rather than to their elaborate architectures? The authors construct minimal versions of GRUs and LSTMs and report competitive performance. Paper: https://arxiv.org/abs/2410.01201 Abstract:

The scalability limitations of Transformers regarding sequence length have renewed interest in recurrent sequence models that are parallelizable during training. As a result, many novel recurrent architectures, such as S4, Mamba, and Aaren, have been proposed that achieve comparable performance. In this work, we revisit traditional …

4 months, 1 week ago @ youtube.com
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper)

How can one best use extra FLOPS at test time? Paper: https://arxiv.org/abs/2408.03314 Abstract:

Enabling LLMs to improve their outputs by using more test-time computation is a critical step towards building generally self-improving agents that can operate on open-ended natural language. In this paper, we study the scaling of inference-time computation in LLMs, with a focus on answering the question: if an LLM is allowed to use a fixed but non-trivial amount of inference-time compute, how much can it improve its performance on a challenging prompt? Answering this question has implications not only on the achievable performance of LLMs, but also on the future of LLM pretraining and how one s…

4 months, 2 weeks ago @ youtube.com
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models (Paper Explained)

#llm #privacy #finetuning Can you tamper with a base model in such a way that it will exactly remember its fine-tuning data? This paper presents a method of doing exactly that, and implements it in modern transformers. OUTLINE:

0:00 - Intro & Overview

10:50 - Core idea: single-use data traps

44:30 - Backdoors in transformer models

58:00 - Additional numerical tricks

1:00:35 - Experimental results & conclusion Paper: https://arxiv.org/abs/2404.00473

Code: https://github.com/ShanglunFengatETHZ/PrivacyBackdoor Abstract:

Practitioners commonly download pretrained machine learning models from open repositories and finetune them to fit specific applications. We show that this practice introduces a…

6 months, 2 weeks ago @ youtube.com
Scalable MatMul-free Language Modeling (Paper Explained)

Matrix multiplications (MatMuls) are pervasive throughout modern machine learning architectures. However, they are also very resource intensive and require special accelerators (GPUs). This paper explores architectures that do away with MatMuls and use quantization and recurrence to keep performance up. OUTLINE:

0:00 - Intro

2:30 - MatMul is everywhere

5:55 - Ternary accumulation as a substitute for matrix multiplication

16:35 - Replacing attention layers with recurrent layers

32:40 - Replacing dense layers with ternary channel mixing

38:30 - Language modelling results & scaling laws

45:00 - Other experimental results

48:20 - Conclusion Paper: https://arxiv.org/abs/2406.02528

Code: https://…
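The "ternary accumulation as a substitute for matrix multiplication" point is easy to demonstrate: once weights are constrained to {-1, 0, +1}, a matrix-vector product needs only additions and subtractions. A toy NumPy sketch, illustrating the arithmetic only, not the paper's fused, hardware-friendly implementation:

```python
import numpy as np

def ternary_matvec(w_ternary, x):
    """Mat-vec with ternary weights: add x where w=+1, subtract where w=-1.
    No multiplications are needed."""
    return np.array([x[row == 1].sum() - x[row == -1].sum() for row in w_ternary])

rng = np.random.default_rng(0)
w_ternary = rng.integers(-1, 2, size=(3, 5))  # entries in {-1, 0, +1}
x = rng.normal(size=5)
assert np.allclose(ternary_matvec(w_ternary, x), w_ternary @ x)  # matches ordinary matmul
```

On dedicated hardware, replacing multiply-accumulate units with adders is where the claimed efficiency gains come from.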

7 months, 2 weeks ago @ youtube.com
Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools (Paper Explained)

#rag #hallucinations #legaltech An in-depth look at a recent Stanford paper examining the degree of hallucinations in various LegalTech tools that incorporate LLMs. OUTLINE:

0:00 - Intro

1:58 - What are legal research tools and how are large language models used by them?

5:30 - Overview and abstract of the paper

9:29 - What is a hallucination and why do they occur?

15:45 - What is retrieval augmented generation (RAG)?

25:00 - Why LLMs are a bad choice when reasoning is involved

29:16 - The products that were tested

32:00 - Some shady practices by the researchers in the back and forth with the legal research companies

37:00 - Legal technology companies’ marketing claims to eliminate or solve…

7 months, 3 weeks ago @ youtube.com
xLSTM: Extended Long Short-Term Memory

xLSTM is an architecture that combines the recurrency and constant memory requirement of LSTMs with the large-scale training of transformers and achieves impressive results. Paper: https://arxiv.org/abs/2405.04517 Abstract:

In the 1990s, the constant error carousel and gating were introduced as the central ideas of the Long Short-Term Memory (LSTM). Since then, LSTMs have stood the test of time and contributed to numerous deep learning success stories, in particular they constituted the first Large Language Models (LLMs). However, the advent of the Transformer technology with parallelizable self-attention at its core marked the dawn of a new era, outpacing LSTMs at scale. We now raise a sim…

8 months, 3 weeks ago @ youtube.com
[ML News] OpenAI is in hot waters (GPT-4o, Ilya Leaving, Scarlett Johansson legal action)

#gpt4o #sky #scarlettjohansson After the release of their flagship model GPT-4o, OpenAI finds itself embroiled in multiple controversies amid an exodus of senior personnel, notably Ilya Sutskever. References:

https://openai.com/index/gpt-4o-and-more-tools-to-chatgpt-free/

https://openai.com/index/hello-gpt-4o/

https://x.com/LiamFedus/status/1790064963966370209?t=rx2YBT9AdDdKPhI6dUH4zA&s=09

https://x.com/lmsysorg/status/1790097588399779991?t=rx2YBT9AdDdKPhI6dUH4zA&s=09

https://x.com/bindureddy/status/1790127425705120149?t=mMUBqFBRphx-bDuZ1j3mjQ&s=09

https://openai.com/index/improvements-to-data-analysis-in-chatgpt/

https://openai.com/index/openai-and-reddit-partnership/

https://archive.ph/jHlMm

https:/…

9 months ago @ youtube.com
ORPO: Monolithic Preference Optimization without Reference Model (Paper Explained)

Paper: https://arxiv.org/abs/2403.07691 Abstract:

While recent preference alignment algorithms for language models have demonstrated promising results, supervised fine-tuning (SFT) remains imperative for achieving successful convergence. In this paper, we study the crucial role of SFT within the context of preference alignment, emphasizing that a minor penalty for the disfavored generation style is sufficient for preference-aligned SFT. Building on this foundation, we introduce a straightforward and innovative reference model-free monolithic odds ratio preference optimization algorithm, ORPO, eliminating the necessity for an additional preference alignment phase. We demonstrate, both empiri…

9 months, 3 weeks ago @ youtube.com
[ML News] Chips, Robots, and Models

OUTLINE:

0:00 - Intro

0:19 - Our next-generation Meta Training and Inference Accelerator

01:39 - ALOHA Unleashed

03:10 - Apple Inks $50M Deal with Shutterstock for AI Training Data

04:28 - OpenAI Researchers, Including Ally of Sutskever, Fired for Alleged Leaking

05:01 - Adobe's Ethical Firefly AI was Trained on Midjourney Images

05:52 - Trudeau announces $2.4billion for AI-related investments

06:48 - RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

07:15 - CodeGemma - an official Google release for code LLMs

07:24 - Mistral AI: Cheaper, Better, Faster, Stronger

08:08 - Vezora/Mistral-22B-v0.1

09:00 - WizardLM-2, next generation state-of-the-art-LLM

09:31 - Idefic…

9 months, 3 weeks ago @ youtube.com
Henry AI Labs
latest post 6 months, 1 week ago
Chunking with Generative Feedback Loops

Hey everyone! I am super excited to share a quick notebook on using Generative Feedback Loops to chunk code files and better structure how they are indexed in the Weaviate Vector Database! Chunking is one of the key topics in Vector Search. We need to break up long documents into smaller parts that we can encode with a pre-trained embedding model and index in a vector index, such as HNSW-PQ. Most solutions use some form of a rolling token window, such as taking every 300 tokens as a chunk with, say, 50 tokens overlapping between each window. Unfortunately, this solution doesn't work particularly well for code. We don't want the chunk to cut off in the middle of a function or class defini…
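The rolling-token-window baseline described above can be sketched in a few lines (assuming the document is already tokenized into a list; the helper name and defaults mirror the 300/50 example rather than any Weaviate API):

```python
def window_chunks(tokens, size=300, overlap=50):
    """Rolling-window chunking: chunk starts advance by size - overlap,
    so consecutive chunks share `overlap` tokens."""
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

chunks = window_chunks(list(range(1000)))
print(len(chunks))                        # 4 chunks for 1000 tokens
print(chunks[0][250:] == chunks[1][:50])  # True: 50-token overlap
```

The notebook's point is that for code, boundaries at function and class definitions beat such a fixed window.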

6 months, 1 week ago @ youtube.com
Gemini 1.5 Pro and Flash - Demo of Long Context LLMs!

Hey everyone! Thanks so much for watching this video exploring Gemini Pro 1.5 and Gemini Flash! Long Context LLMs!! This video covers 3 key tests, the classic "Lost in the Middle" exploration, using Long Context LLMs as Re-rankers in Search, and finally, testing Many-Shot In-Context Learning! I am really excited about the potential of Many-Shot In-Context Learning with DSPy's `BootstrapFewShot` and Gemini, curious to know what you think! Notebook: https://github.com/weaviate/recipes/blob/main/integrations/dspy/llms/Gemini-1.5-Pro-and-Flash.ipynb Gemini 1.5 Technical Report: https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf Chapters

0:00 Gemini 1.5!!

1:25 Setup and …

9 months, 1 week ago @ youtube.com
3blue1brown
latest post 6 days, 7 hours ago
How Earth's size was computed by Eratosthenes

From this video: https://youtu.be/YdOXS_9_P4U

6 days, 7 hours ago @ youtube.com
Terence Tao on how we measure the cosmos | Part 1

The Cosmic Distance Ladder, how we learned distances in the heavens.

Email list: https://3b1b.co/mail

Patreon supporters see early views of new videos: https://www.patreon.com/3blue1brown Artwork by Kurt Bruns

Thanks to Paul Dancstep for several animations, such as the powers of 10 zoom out and the simulations of shadows on the moon. Thanks to Tanya Klowden for helpful conversations about the history of the distance ladder. Argument for why if every shadow of a convex shape is a circle, it must be a sphere: https://mathoverflow.net/questions/39127/is-the-sphere-the-only-surface-with-circular-projections-or-can-we-deduce-a-sp Timestamps: 0:00 - About Terence Tao and the Distance Ladder

2:02 …

1 week, 3 days ago @ youtube.com
Measuring the earth with Terence Tao

From this video: https://youtu.be/YdOXS_9_P4U

1 week, 3 days ago @ youtube.com
The topology of two-note chords

Based on a construction in this video: https://youtu.be/IQqtsm-bBRU

2 weeks, 5 days ago @ youtube.com
The space of all musical intervals

Full video: https://youtu.be/IQqtsm-bBRU

1 month, 1 week ago @ youtube.com
The barber pole optical mystery

Series exploring optics: https://www.youtube.com/watch?v=QCX62YJCmGk&list=PLZHQObOWTQDMKqfyUvG2kTlYt-QQ2x-ui

1 month, 1 week ago @ youtube.com
Monge's Theorem

Full video: https://youtu.be/piJkuavhV50

1 month, 3 weeks ago @ youtube.com
Thinking through double slits

Extracted from this video about holograms: https://youtu.be/EmKQsSDlaa4

1 month, 3 weeks ago @ youtube.com
The inscribed square problem

Full video: https://youtu.be/IQqtsm-bBRU

1 month, 3 weeks ago @ youtube.com
This open problem taught me what topology is

A beautiful solution to the inscribed rectangle problem.

Playlist with more neat proofs: https://www.youtube.com/playlist?list=PLZHQObOWTQDPSKntUcMArGheySM4gL7wS

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to simply share the videos. This argument was originally by Herbert Vaughan, appearing, for example, in this issue of the Topology Proceedings.

https://topology.nipissingu.ca/tp/reprints/v06/tp06107.pdf 2020 Paper by Greene and Lobb:

https://arxiv.org/pdf/2005.09193 Nice Quanta article about this result:

https://www.quantamagazine.org/new-geometric-perspective-cracks-old-problem-about-rectangles…

1 month, 3 weeks ago @ youtube.com
The meaning within the Mandelbrot set

The full video dives deeper into the field of math studying this called holomorphic dynamics: https://youtu.be/LqbZpur38nw

2 months, 3 weeks ago @ youtube.com
The scale of training LLMs

From this 7-minute LLM explainer: https://youtu.be/LPZh9BOjkQs

3 months ago @ youtube.com
Large Language Models explained briefly

Dig deeper here: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi

Technical details as a talk: https://youtu.be/KJtZARuO3JY

Made for an exhibit at the Computer History Museum: https://computerhistory.org/

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support Timestamps:

0:00 - Who this was made for

0:41 - What are large language models?

7:48 - Where to learn more No secret end-screen vlog for this one, the end-screen real estate was all full! ------------------ These animations are largely made using a custom Python library, manim. See the FAQ comments here:

https://3b1b.co/faq#manim

https://github.com/3b1b/manim

https:/…

3 months ago @ youtube.com
This puzzle is trickier than it seems

From this full video: https://youtu.be/piJkuavhV50

3 months ago @ youtube.com
Sphere surface area proof sketch

Full video: https://youtu.be/GNcFjFmqEc8

3 months ago @ youtube.com
Two Minute Papers
latest post 4 days, 4 hours ago
OpenAI: The Age of AI Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Competitive Programming with Large Reasoning Models" is available here:

https://arxiv.org/abs/2502.06807 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef,…

4 days, 4 hours ago @ youtube.com
Meta’s New AI: Outrageously Good!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models" is available here:

https://hila-chefer.github.io/videojam-paper.github.io/ Vs Veo2: https://x.com/TomLikesRobots/status/1888279188336963725 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Benji Rabhan, B Shang, Christian Ahlin, Go…

6 days, 7 hours ago @ youtube.com
OpenAI’s Deep Research: Unexpected Game Changer!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 Deep research:

https://openai.com/index/introducing-deep-research/ Lambda DeepSeek instructions: https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/ - Open Deep Research: https://opendeepresearch.vercel.app/

- Revisiting the McKinley Tariff of 1890 through the Lens of Modern Trade Theory

https://kevinbryanecon.com/o3McKinley.pdf

- Deep Research by Tyler Cowen February 4, 2025

https://marginalrevolution.com/marginalrevolution/2025/02/deep-research.html Complex tax situation: https://x.com/PatriceBTC/status/1886529037474127951

Daily briefing: https://x.com/mckaywrigley/status/…

1 week, 5 days ago @ youtube.com
OpenAI o3-mini - Thinking AI for Free…For Everyone!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papersllm o3 mini: https://openai.com/index/openai-o3-mini/

OpenAI Deep Research: https://openai.com/index/introducing-deep-research/ 🤝 Interested in sponsoring us? Click here: https://eu.jotform.com/241831324457354 Sources:

https://x.com/hxiao/status/1885522459329520089?s=46

https://x.com/techikansh/status/1885429093862187008?s=46

https://x.com/rlancemartin/status/1885748894220554445?s=46

https://x.com/buccocapital/status/1885792154129219959?s=46

https://x.com/_akhaliq/status/1885733581651050586?s=46

https://x.com/_akhaliq/status/1885833163764646267?s=46

https://x.com/aidan_mclau/status/1886078444855034055?s=4…

2 weeks ago @ youtube.com
NVIDIA Unveils AI For 150x Faster 3D Modeling!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds" is available here:

https://instantsplat.github.io/

Try it out (hopefully works): https://huggingface.co/spaces/kairunwen/InstantSplat Clouds paper: https://arcanous98.github.io/projectPages/gaussianVolumes.html 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Ben…

2 weeks, 6 days ago @ youtube.com
This New Free AI Is History In The Making!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers Try it out (choose DeepSeek as your model): https://huggingface.co/chat/

Official (read the privacy policy below before you use this one): https://www.deepseek.com/ Run it at home:

https://www.reddit.com/r/selfhosted/comments/1i6ggyh/got_deepseek_r1_running_locally_full_setup_guide/

https://lmstudio.ai/ Links:

https://x.com/ivanfioravanti/status/1881565759123702140

https://eu.jotform.com/tables/241831324457354

https://x.com/deepseek_ai/status/1881318130334814301

https://x.com/nobody_qwert/status/1881620406710452535

https://github.com/bytedance/UI-TARS

https://github.com/bytedance/UI-TARS-desktop

https://…

3 weeks, 5 days ago @ youtube.com
OpenAI’s New ChatGPT: 3 Secrets From The Paper!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper is available here:

https://openai.com/index/openai-o1-system-card/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Sund…

4 weeks ago @ youtube.com
NVIDIA’s New AI Learned How To Punch!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "CLoSD - Closing the Loop between Simulation and Diffusion for multi-task character control" is available here:

https://guytevet.github.io/CLoSD-page/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, L…

1 month ago @ youtube.com
DeepMind’s Veo2 AI - The New King Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers Try Veo2 here (Notes: likely USA only so far and there may be a waitlist):

https://deepmind.google/technologies/veo/veo-2/

1 month, 1 week ago @ youtube.com
NVIDIA Cosmos - A Video AI…For Free!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers Cosmos platform:

https://www.nvidia.com/en-us/ai/cosmos/

Hugging Face models: https://huggingface.co/collections/nvidia/cosmos-6751e884dc10e013a0a0d8e6

More: https://github.com/NVIDIA/Cosmos 📝 The paper "Cosmos World Foundation Model Platform for Physical AI" is available here:

https://research.nvidia.com/publication/2025-01_cosmos-world-foundation-model-platform-physical-ai

1 month, 1 week ago @ youtube.com
The Simulator That Could Supercharge Robotics!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper is available here:

https://github.com/Genesis-Embodied-AI/DiffTactile

https://difftactile.github.io/

1 month, 2 weeks ago @ youtube.com
NVIDIA’s New AI: A Revolution In 3D Modeling!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The Edify 3D paper available here:

https://research.nvidia.com/labs/dir/edify-3d/

https://build.nvidia.com/shutterstock/edify-3d 📝 MeshGPT paper: https://nihalsid.github.io/mesh-gpt/

1 month, 3 weeks ago @ youtube.com
New Super Resolution AI - Enhance ~10x Faster!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper "Deep Fourier-based Arbitrary-scale Super-resolution for Real-time Rendering" is available here:

https://iamxym.github.io/DFASRR.github.io/

1 month, 4 weeks ago @ youtube.com
NVIDIA’s New AI: Training 10,000x Faster!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The papers are available here:

https://hover-versatile-humanoid.github.io/

https://blogs.nvidia.com/blog/robot-learning-humanoid-development/

2 months ago @ youtube.com
Unreal Engine 5 - Real Time Ray Tracing Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers Try Unreal Engine 5 here:

https://www.unrealengine.com/en-US/unreal-engine-5 My free course on ray tracing for you Fellow Scholars:

https://users.cg.tuwien.ac.at/zsolnai/gfx/rendering-course/ Our earlier paper with the spheres scene:

https://users.cg.tuwien.ac.at/zsolnai/gfx/adaptive_metropolis/ LuxCoreRender (free and open source): https://luxcorerender.org/

2 months, 1 week ago @ youtube.com
DataFest Video
last post 5 months, 2 weeks ago
Interview with Juergen Schmidhuber at Data Christmas 2020

02:00-05:38 What do you think were the most outstanding underestimated news and achievements in the AI field in 2020?

05:41-11:28 What do you think about trends in ML like transformers trying to replace LSTMs in NLP?

11:29-16:06 Are you working on any new types of models right now?

16:07-20:41 What is your opinion on the most underestimated ML subfields, such as reinforcement learning?

20:42-22:17 Your best recommendation for our community is to look into AI in the real physical world, right?

22:18-33:10 Do you think it is possible to achieve great results in creative AI, particularly in subjective beauty?

33:17-35:50 What prevents chat bots from reaching more intelligent levels?

36:03-39:39 What is…

5 months, 2 weeks ago @ youtube.com
Data Fest Online 2020 AI Hardware Track Premiere

DataFest Online 2020

AI Hardware track https://ods.ai/tracks/ai-hardware-df2020 Register and get access to the tracks: https://ods.ai/events/datafest2020

Join the community: https://ods.ai/

5 months, 2 weeks ago @ youtube.com
Mikita Shchutski | A small BERT towards Large Medical Models

Mikita Shchutski | Lead Machine Learning Engineer, Quantori. Training large medical models using electronic health records in order to create a highly informative medical embedding space.

5 months, 2 weeks ago @ youtube.com
JetBrains Research Seminars
last post None
Yandex. Computer Science
last post 1 week, 4 days ago
RecSys & AdTech Review from ICML | Andrey Mishchenko, Yandex

This is Andrey Mishchenko, head of the quality team at Yandex Advertising. Andrey recapped this year's ICML, explained how papers on ethics made it into an engineering conference, and took a close look at the most important papers on recommender systems, for example On Multi-Armed Bandit with Impatient Arms, which invites reflection on advertising tools through the one-armed-bandit metaphor. Subscribe to the "Yandex for ML Engineers" Telegram channel: https://t.me/yandexforml

1 week, 4 days ago @ youtube.com
CV Review from ICML | Sergey Ovcharenko and Alexander Shishenya, Yandex

These are two people from Yandex Search: Sergey Ovcharenko, head of the neural network technologies team, and Alexander Shishenya, lead developer of generative models. They explained how to get the most out of attending conferences, and shared useful computer vision papers they found at this year's ICML. Alexander highlighted Inferring 3D Structures with 2D Priors from Google DeepMind, on reconstructing a 3D shape from a small number of flat images, in the limit from a single picture. Sergey discussed Genie: Generative Interactive Environments from DeepMind. The model described in that paper is trained on recordings of …

1 week, 5 days ago @ youtube.com
How We Are Changing the World – Yandex

In this video we showed how Yandex services have already changed the world, and talked about what lies ahead (spoiler: it will be even cooler!). Above all, we want to convey one idea: behind each of our products are people and their ideas. Our employees' work does not end up in a drawer; they bring the future closer through a strong engineering culture and cutting-edge technology. The technologies in the video flow quickly into one another, so you can read more about them, and find out who else we want on our team, here: https://yandex.ru/dev/

1 week, 5 days ago @ youtube.com
Reinforcement Learning Review from ICML | Dmitry Babaev, Yandex

This is Dmitry Babaev, head of ML R&D at Yandex Maps. Dmitry covered the most memorable reinforcement learning papers from this year's ICML, for example Stop Regressing: Training Value Functions via Classification for Scalable Deep RL from DeepMind, which takes an unconventional approach to regression. Subscribe to the "Yandex for ML Engineers" Telegram channel: https://t.me/yandexforml

1 week, 6 days ago @ youtube.com
NLP Review from ICML | Andrey But, Yandex

This is Andrey But, head of the YandexGPT Alignment team in Yandex Search. In this video he covers the most interesting natural language processing papers from this year's ICML and surveys a number of problems that remain unsolved in modern LLMs. Subscribe to the "Yandex for ML Engineers" Telegram channel: https://t.me/yandexforml

2 weeks ago @ youtube.com
Opening Remarks at the ML Party – ICML Recap | Alexey Gusakov, Yandex

This is Alexey Gusakov, CTO of Yandex Search. In this video he shares his personal impressions of this year's ICML and offers parting words to colleagues in the industry. Alexey recalled how, over ten years, ML events grew from modest get-togethers into conferences with thousands of attendees, and spoke in detail about his favorite paper, Physics of LLMs. Subscribe to the "Yandex for ML Engineers" Telegram channel: https://t.me/yandexforml

2 weeks, 1 day ago @ youtube.com
How We Improved the Recommendation Algorithm for the "Unfamiliar" Setting in Yandex Music | Savva Stepurin

This is a talk by Savva Stepurin, senior developer of recommendation products at Yandex Music, given at the Data Fest conference in our office on June 2, 2024, as part of the Practical ML track on applying ML to real Yandex tasks. Subscribe to our Telegram channels: Yandex's channel about technologies and the people who create them: https://t.me/Yandex4Developers Yandex's channel for the ML community: https://t.me/yandexforml

3 weeks, 4 days ago @ youtube.com
A Neural Filter with YaGPT: Helping Users Sort Out Their Inbox | Ruslan Dyusaev

This is a talk by Ruslan Dyusaev, developer in the anti-spam machine learning group at Yandex 360, given at the Data Fest conference in our office on June 2, 2024, as part of the Practical ML track on applying ML to real Yandex tasks.

3 weeks, 5 days ago @ youtube.com
NBA in Cross-service: Applying ML in Plus Daily | Anis Khamush

This is a talk by Anis Khamush, ML tech lead at Yandex Plus, given at the Data Fest conference in our office on June 2, 2024, as part of the Practical ML track on applying ML to real Yandex tasks.

3 weeks, 6 days ago @ youtube.com
Adapting LLMs to HTML Translation / Georgy Ivanov

This is a talk by Georgy Ivanov, senior developer in the Yandex NLP department, given at the Data Fest conference in our office on June 2, 2024, as part of the Practical ML track on applying ML to real Yandex tasks.

4 weeks ago @ youtube.com
Fine-Tuning an Image Generation Model on Human Preferences | Alexander Shishenya

This is a talk by Alexander Shishenya, senior developer in the Yandex generative models group, given at the Data Fest conference in our office on June 2, 2024, as part of the Practical ML track on applying ML to real Yandex tasks.

4 weeks, 1 day ago @ youtube.com
The Delivery Robot: How We Solve Production Problems in Perception | Taya Penskaya

This is a talk by Taya Penskaya, developer in the perception group for the Yandex delivery robot, given at the Data Fest conference in our office on June 2, 2024, as part of the Practical ML track on applying ML to real Yandex tasks. The talk mentions the Meta corporation, which has been designated extremist and banned in Russia.

1 month ago @ youtube.com
Product ML: Expectations vs. Reality. How "Ideas" Work in Yandex Maps | Nikita Kiselyov

This is a talk by Nikita Kiselyov, head of the geoservices quality group at Yandex, given at the Data Fest conference in our office on June 2, 2024, as part of the Practical ML track on applying ML to real Yandex tasks.

1 month ago @ youtube.com
An Optimization Mechanism for Blending Additional Elements into Web Search Results / Alexey Golikov

Search stopped being a list of ten blue links long ago. Around half of user tasks are solved quickly and successfully with images, videos, factual answers, and other result elements. Yandex built Blender, an ML-based tool that mixes documents of different modalities and from different sources to optimize the user experience. Alexey Golikov, head of the quality calls team at Yandex, explains how the metrics were chosen, how Blender was implemented, and what peculiarities the team encountered. Other Yandex events are available here: https://events.yandex.ru/

1 month ago @ youtube.com
I Can('t) Hear You! How Alice Moved from the Mobile App to the Smart Speaker / Nikita Ryzhikov

When Alice moved from the phone to the smart speaker, the voice input technology team faced new challenges; for example, it encountered acoustic echo cancellation (AEC) and directional noise suppression for the first time. Nikita Ryzhikov, head of the Voice Input Technologies Service at Yandex, explains how the team figured it all out and built good solutions, why voice assistants require a special approach, and why a smart speaker needs several microphones. Other Yandex events are available here: https://events.yandex.ru/

1 month ago @ youtube.com
ML Trainings
last post 2 weeks, 5 days ago
Kirill Khrylchenko | VK RecSys Challenge: Solution Breakdown

Speaker: Kirill Khrylchenko, Head of the Advanced Recommender Technologies Research Group, Yandex. Data Elka 2024 hosted by VK: https://ods.ai/events/data-elka-24-vk-offline

VK RecSys Challenge: https://ods.ai/competitions/aivkchallenge

_____

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

Telegram channel with job postings: https://t.me/odsjobs

Channel with course updates: https://t.me/odscourses

How to join the ODS Mattermost community chat: https://ods.ai/tracks/mattermost

2 weeks, 5 days ago @ youtube.com
Ivan Bragin | VK RecSys Challenge: Solution Breakdown

Speaker: Ivan Bragin, Staff Machine Learning Engineer, constructor.io. Data Elka 2024 hosted by VK: https://ods.ai/events/data-elka-24-vk-offline

VK RecSys Challenge: https://ods.ai/competitions/aivkchallenge

2 weeks, 5 days ago @ youtube.com
Stanislav Chistyakov | VK RecSys Challenge: Solution Breakdown

Speaker: Stanislav Chistyakov, Business Development Director, Noventiq. Data Elka 2024 hosted by VK: https://ods.ai/events/data-elka-24-vk-offline

VK RecSys Challenge: https://ods.ai/competitions/aivkchallenge

2 weeks, 5 days ago @ youtube.com
Alexander Poslavsky | VK RecSys Challenge: Opening Remarks

Speaker: Alexander Poslavsky, Head of the Core ML Group, AI VK. Data Elka 2024 hosted by VK: https://ods.ai/events/data-elka-24-vk-offline

VK RecSys Challenge: https://ods.ai/competitions/aivkchallenge

2 weeks, 5 days ago @ youtube.com
Roman Derunets | Multimodality and RAG, or How to Sit on Two Chairs at Once

Speaker: Roman Derunets. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

NLP track: https://ods.ai/tracks/sibfest5-nlp

2 months, 1 week ago @ youtube.com
Maria Michurina | Case Study: How to Translate News Quickly and Cheaply for a Demanding Client

Speaker: Maria Michurina. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

NLP track: https://ods.ai/tracks/sibfest5-nlp

2 months, 1 week ago @ youtube.com
Dari Baturova | BERTScore for the Russian Language

Speaker: Dari Baturova. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

NLP track: https://ods.ai/tracks/sibfest5-nlp

2 months, 1 week ago @ youtube.com
Islam Umarov | Graphs in Anti-Fraud and How We Apply Them

Speaker: Islam Umarov. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

Track: https://ods.ai/tracks/sibfest5-ml-security

2 months, 1 week ago @ youtube.com
Sirakan Bagdasaryan | An ML Platform: What Kind of Beast Is It and How Do You Cook It?

Speaker: Sirakan Bagdasaryan, MLOps Engineer, OZON Bank. Data Halloween 2024: https://ods.ai/events/halloween2024_spb

Track: https://ods.ai/tracks/halloween2024-spb

2 months, 3 weeks ago @ youtube.com
Nikita Venediktov | Hello, Einstein? Building LLM-Based Call Bots in 2024

Speaker: Nikita Venediktov, NLP Researcher, RAFT. Data Halloween 2024: https://ods.ai/events/halloween2024_spb

Track: https://ods.ai/tracks/halloween2024-spb

2 months, 3 weeks ago @ youtube.com
Alexey Kozlov, Evgeny Taychinov | HRBert 2.0: Improving Vacancy and Resume Vectorization

Speakers: Alexey Kozlov and Evgeny Taychinov, ML Engineers, Rabota.ru. Data Halloween 2024: https://ods.ai/events/halloween2024_spb

Track: https://ods.ai/tracks/halloween2024-spb

2 months, 3 weeks ago @ youtube.com
Mark Panenko | A Machine Learning Service from Scratch, or Creating Frankenstein

Speaker: Mark Panenko, CDS, OZON Bank. Data Halloween 2024: https://ods.ai/events/halloween2024_spb

Track: https://ods.ai/tracks/halloween2024-spb

2 months, 3 weeks ago @ youtube.com
Dmitry Tikhomirov | HypEx and A/B Tests

Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

Track: https://ods.ai/tracks/sibfest5-ml-infrastructure

2 months, 3 weeks ago @ youtube.com
Anya Marshalova, Olya Tikhobaeva | Total Recall from a Short Hint: A Breakdown of the Paper "Rethinking LLM Memorization through the Lens of Adversarial Compression"

Speakers: Anya Marshalova, Olya Tikhobaeva

Talk: Total Recall from a Short Hint, a breakdown of the paper Rethinking LLM Memorization through the Lens of Adversarial Compression. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

Track: https://ods.ai/tracks/sibfest5-ds-talks

2 months, 3 weeks ago @ youtube.com
Anna Murashkina | Knowledge and Pixels: How to Combine Them. A Breakdown of the Paper "Fetch A Set"

Speaker: Anna Murashkina. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

Track: https://ods.ai/tracks/sibfest5-ds-talks

2 months, 4 weeks ago @ youtube.com
Primer
last post 2 weeks, 3 days ago
Simulating the Evolution of Aging

Patreon: https://www.patreon.com/primerlearning Ageless book: https://www.amazon.com/Ageless-Science-Getting-Older-Without/dp/0525566317/ Papers and other further reading:

Diversity of aging across the tree of life: https://pmc.ncbi.nlm.nih.gov/articles/PMC4157354/

Antagonistic pleiotropy and p53: https://pmc.ncbi.nlm.nih.gov/articles/PMC2771578/

An unsolved problem of biology (Medawar): https://ia903408.us.archive.org/31/items/medawar-1952-unsolved-problem/Medawar1952-Unsolved-Problem.pdf

Evolution of the mutation rate: https://pmc.ncbi.nlm.nih.gov/articles/PMC2910838/

Our World in Data Life Expectancy explainer: https://ourworldindata.org/life-expectancy-how-is-it-calculated-and-how-shoul…

2 weeks, 3 days ago @ youtube.com
Simulating the Evolution of Rock, Paper, Scissors

Twitch: https://www.twitch.tv/justin_helps

Discord: https://discord.gg/NbruaNW

Store: https://store.dftba.com/collections/primer

Patreon: https://www.patreon.com/primerlearning Source and further reading on the common side-blotched lizard:

Sinervo, B.; C.M. Lively (1996). "The rock–paper–scissors game and the evolution of alternative male strategies". Nature. 380 (6571): 240–243.

https://en.wikipedia.org/wiki/Common_side-blotched_lizard Made with Godot

Github: https://github.com/Primer-Learning/PrimerTools Made possible by support from these wonderful Patrons:

abledbody

Alba Caparros-Roissard

Andrew Lang

Anthony Eufemio

Brian Cloutier

Captain Chinchilla

Christoph Grabo (@asaaki)

Christy Ser…

7 months, 1 week ago @ youtube.com
Evolving Rock Paper Scissors
7 months, 1 week ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
last post 2 weeks, 1 day ago
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Dylan Patel is the founder of SemiAnalysis, a research & analysis company specializing in semiconductors, GPUs, CPUs, and AI hardware.

Nathan Lambert is a research scientist at the Allen Institute for AI (Ai2) and the author of a blog on AI called Interconnects.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep459-sc See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://invideo.io/i/lexpod
GitHub: Developer platform and AI code editor.

(4:31:34) – AI agents
(4:40:16) – Programming and AI
(4:47:43) – Open source
(4:56:55) – Stargate
(5:04:24) – Future of AI
PODCAST LINKS:
– Podcast Website: https://lexfridman.com/podcast
–…

2 weeks, 1 day ago @ lexfridman.com
#458 – Marc Andreessen: Trump, Power, Tech, AI, Immigration & Future of America

Marc Andreessen is an entrepreneur, investor, co-creator of Mosaic, co-founder of Netscape, and co-founder of the venture capital firm Andreessen Horowitz.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep458-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://encord.com/lex. GitHub: Developer platform and AI code editor.

Go to https://gh.io/copilot. Notion: Note-taking and team collaboration.

Go to https://shopify.com/lex. LMNT: Zero-sugar electrolyte drink mix.

3 weeks, 2 days ago @ lexfridman.com
#457 – Jennifer Burns: Milton Friedman, Ayn Rand, Economics, Capitalism, Freedom

Jennifer Burns is a historian of ideas, focusing on the evolution of economic, political, and social ideas in the United States in the 20th century.

She wrote two biographies, one on Milton Friedman, and the other on Ayn Rand.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep457-sc. See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://brain.fm/lex. GitHub: Developer platform and AI code editor.

Go to https://gh.io/copilot. LMNT: Zero-sugar electrolyte drink mix.

1 month ago @ lexfridman.com
#456 – Volodymyr Zelenskyy: Ukraine, War, Peace, Putin, Trump, NATO, and Freedom

Volodymyr Zelenskyy is the President of Ukraine.

On YouTube this episode is available in English, Ukrainian, and Russian.

Captions and voice-over audio tracks are provided in English, Ukrainian, Russian, and the original mixed-language version, with subtitles available in your preferred language.

To listen to the original mixed-language version, please select the English (UK) audio track.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep456-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

1 month, 1 week ago @ lexfridman.com
#455 – Adam Frank: Alien Civilizations and the Search for Extraterrestrial Life

Adam Frank is an astrophysicist studying star systems and the search for extraterrestrial life and alien civilizations.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep455-sc. See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://encord.com/lex. Eight Sleep: Temp-controlled smart mattress cover.

Go to https://betterhelp.com/lex. Notion: Note-taking and team collaboration.

Go to https://drinkLMNT.com/lex. AG1: All-in-one daily nutrition drinks.

1 month, 4 weeks ago @ lexfridman.com
#454 – Saagar Enjeti: Trump, MAGA, DOGE, Obama, FDR, JFK, History & Politics

Saagar Enjeti is a political journalist & commentator, co-host of Breaking Points with Krystal and Saagar and The Realignment Podcast.

He is exceptionally well-read, and the books he recommends are always fascinating and eye-opening.

You can check out all the books he mentions in this episode here: https://lexfridman.com/saagar-books. Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep454-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://eightsleep.com/lex. AG1: All-in-one daily nutrition drinks.

Go to https://drinkLMNT.com/lex. BetterHelp: Online therapy and counseling.

2 months, 1 week ago @ lexfridman.com
#453 – Javier Milei: President of Argentina – Freedom, Economics, and Corruption

Javier Milei is the President of Argentina.

This episode is available in both English and Spanish.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep453-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to http://netsuite.com/lex. BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex. AG1: All-in-one daily nutrition drinks.

3 months ago @ lexfridman.com
#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

Dario Amodei is the CEO of Anthropic, the company that created Claude.

Amanda Askell is an AI researcher working on Claude’s character and personality.

Chris Olah is an AI researcher working on mechanistic interpretability.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep452-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

(3:49:02) – Character training (3:50:01) – Nature of truth (3:54:38) – Optimal rate of failure (4:01:49) – AI consciousness (4:16:20) – AGI (4:24:58) – Chris Olah – Mechanistic Interpretability (4:29:49) – Features, Circuits, Universality (4:47:23) – Superposition (4:58:22) – Monosemanticity (5:05…

3 months, 1 week ago @ lexfridman.com
#451 – Rick Spence: CIA, KGB, Illuminati, Secret Societies, Cults & Conspiracies

Rick Spence is a historian specializing in the history of intelligence agencies, espionage, secret societies, conspiracies, the occult, and military history.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep451-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to http://netsuite.com/lex. BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex. MasterClass: Online classes from world-class experts.

Go to https://masterclass.com/lexpod. Shopify: Sell stuff online.

3 months, 3 weeks ago @ lexfridman.com
#450 – Bernie Sanders Interview

Bernie Sanders is a US Senator from Vermont and a two-time presidential candidate.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep450-sc. See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://eightsleep.com/lex. Saily: An eSIM for international travel.

Go to https://saily.com/lex. Ground News: Unbiased news source.

Go to https://drinkLMNT.com/lex. OUTLINE: (00:00) – Introduction (08:51) – MLK Jr (11:43) – Corruption in politics (23:00) – Healthcare in US (31:33) – 2016 election (37:32) – Barack Obama (43:26) – Capitalism (51:35) – Response to attacks (56:32) – AOC and progressive politics (1:04:24) – Mortality (1:06:30) – Hope fo…

3 months, 4 weeks ago @ lexfridman.com
#449 – Graham Hancock: Lost Civilization of the Ice Age & Ancient Human History

Graham Hancock is a journalist and author who for over 30 years has explored the controversial possibility that there existed a lost civilization during the last Ice Age, and that it was destroyed in a global cataclysm some 12,000 years ago.

He is the presenter of the Netflix documentary series “Ancient Apocalypse”, the 2nd season of which has just been released.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep449-sc. See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://notion.com/lex. Riverside: Platform for recording podcasts and videos from everywhere.

Go to https://creators.riverside.fm/LEX. LMNT: Zero-sugar electro…

4 months ago @ lexfridman.com
#448 – Jordan Peterson: Nietzsche, Hitler, God, Psychopathy, Suffering & Meaning

Jordan Peterson is a psychologist, author, lecturer, and podcast host.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep448-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://eightsleep.com/lex. Ground News: Unbiased news source.

Go to https://betterhelp.com/lex. LMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lex. OUTLINE: (00:00) – Introduction (07:07) – Nietzsche (14:48) – Power and propaganda (19:54) – Nazism (24:54) – Religion (41:18) – Communism (47:03) – Hero myth (49:12) – Belief in God (59:24) – Advice for young people (1:12:02) – Sex (1:32:00) – Good and evil (1:44:46) – Psychopathy (1…

4 months, 1 week ago @ lexfridman.com
#447 – Cursor Team: Future of Programming with AI

Aman Sanger, Arvid Lunnemark, Michael Truell, and Sualeh Asif are creators of Cursor, a popular code editor that specializes in AI-assisted programming.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep447-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://encord.com/lex. MasterClass: Online classes from world-class experts.

Go to https://masterclass.com/lexpod. Shopify: Sell stuff online.

Go to http://netsuite.com/lex. AG1: All-in-one daily nutrition drinks.

4 months, 2 weeks ago @ lexfridman.com
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Ed Barnhart is an archaeologist and explorer specializing in ancient civilizations of the Americas.

He is the Director of the Maya Exploration Center, host of the ArchaeoEd Podcast, and lecturer on the ancient history of North, Central, and South America.

Ed is in part known for his groundbreaking work on ancient astronomy, mathematics, and calendar systems.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep446-sc. See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://drinkag1.com/lex. Notion: Note-taking and team collaboration.

4 months, 3 weeks ago @ lexfridman.com
#445 – Vivek Ramaswamy: Trump, Conservatism, Nationalism, Immigration, and War

Vivek Ramaswamy is a conservative politician, entrepreneur, and author of many books on politics, including his latest titled Truths: The Future of America First.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep445-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to http://netsuite.com/lex. Ground News: Unbiased news source.

Go to https://eightsleep.com/lex. OUTLINE: (00:00) – Introduction (12:50) – Conservatism (16:06) – Progressivism (21:41) – DEI (26:33) – Bureaucracy (33:25) – Government efficiency (48:34) – Education (1:02:59) – Military Industrial Complex (1:25:18) – Illegal immigration (1:46:53) – Donald Trump (…

4 months, 3 weeks ago @ lexfridman.com
Microsoft Research Podcast
latest post 1 week ago
Ideas: Building AI for population-scale systems with Akshay Nambi

His work lies at the intersection of systems, AI, and machine learning with a focus on designing, deploying, and scaling AI systems to solve compelling real-world problems.

CHRIS STETKIEWICZ: You’re listening to Ideas, a Microsoft Research Podcast that dives deep into the world of technology research and the profound questions behind the code.

NAMBI: That’s right.

This represents a major step toward building AI systems that act as much more holistic personal tutors, helping student understanding and creating a more engaging, effective learning experience.

Are there some things that could go wrong, even if we get the technology right?

1 week ago @ microsoft.com
Ideas: Bug hunting with Shan Lu


3 weeks, 5 days ago @ microsoft.com
Ideas: AI for materials discovery with Tian Xie and Ziheng Lu

And now you can use this loop to design materials really quickly.

XIE: So you can really think about MatterSim and MatterGen accelerating different parts of materials discovery process.

They are also both foundation AI models, meaning they can both be used for a broad range of materials design problems.

Really, really a lot.

Yeah, I really, really like the example that Ziheng mentioned about the educational purposes.

1 month ago @ microsoft.com
Ideas: AI and democracy with Madeleine Daepp and Robert Osazuwa Ness

DAEPP: You know, we didn’t really think about the term fraud until we started prepping for this interview with you.

BADANES: Right, right.

One of the things that I get asked a lot is, why can’t we just build good AI to detect bad AI, right?

BADANES: So next time my kids are in a fight, I’m going to point them to Copilot and say, work with Copilot to mediate.

[LAUGHS] No, that’s really, really interesting.

2 months ago @ microsoft.com
NeurIPS 2024: The co-evolution of AI and systems with Lidong Zhou

Earlier today, Lidong gave a keynote here at NeurIPS on the co-evolution of AI and systems engineering.

One dimension is the scale of the AI systems that we have to support.

And the other dimension is if you look at AI systems, it’s actually a whole-stack kind of design.

STRICKLAND: Yeah, yeah.

ZHOU: Yeah, I think in terms of AI systems, I’m certainly pretty excited about what we can do together, you know, with a combination of AI and systems.

2 months ago @ microsoft.com
NeurIPS 2024: AI for Science with Chris Bishop

And then the second paradigm really emerged in the 17th century.

And so the third paradigm really began, I guess, sort of, in the ’50s and ’60s, the development of digital computers.

And when I think about AI for Science actually, the space of opportunity is colossal because science is, science is really just understanding more about the world around us.

And the SMILES autoregressive model can now generate a molecule that's an improvement on the starting molecule and knows about the protein binding.

But also if you think about [it], science is really about learning more about the world.

2 months, 1 week ago @ microsoft.com
Abstracts: NeurIPS 2024 with Jindong Wang and Steven Euijong Whang

Today I’m talking to Jindong Wang, a senior researcher at Microsoft Research, and Steven Whang, a tenured associate professor at the Korea Advanced Institute of Science and Technology.

JINDONG WANG: OK, everybody knows that with the widespread usage of large language models, hallucination has become a crucial factor of concern.

So foreign key constraint basically requires that if there is some director mentioned in the movie table, it has to be one of the directors in the director table.

So now we can join the movie and director table and generate a bigger table.
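The foreign-key constraint and join described above can be sketched in a few lines; the table contents below ('Nolan', 'Inception', etc.) are illustrative assumptions, not data from the episode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE director (name TEXT PRIMARY KEY)")
# The foreign-key constraint: every movie.director must appear in director.name.
conn.execute("CREATE TABLE movie (title TEXT, director TEXT REFERENCES director(name))")
conn.execute("INSERT INTO director VALUES ('Nolan'), ('Villeneuve')")
conn.execute("INSERT INTO movie VALUES ('Inception', 'Nolan'), ('Dune', 'Villeneuve')")

# Joining the two tables yields the "bigger" combined table mentioned above.
rows = conn.execute(
    "SELECT movie.title, movie.director FROM movie"
    " JOIN director ON movie.director = director.name"
    " ORDER BY movie.title"
).fetchall()
print(rows)  # [('Dune', 'Villeneuve'), ('Inception', 'Nolan')]
```

With `PRAGMA foreign_keys = ON`, inserting a movie whose director is absent from the director table would raise an integrity error, which is exactly the constraint described in the quote.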

HUIZINGA: Well, Jindong Wang and Steven Whang, thanks for joining us today, and to our listeners, thanks for tuning in.

2 months, 1 week ago @ microsoft.com
Abstracts: NeurIPS 2024 with Weizhu Chen

The other one is that some tokens are actually very, very hard to predict during pretraining.

And the important thing for the data is about data filtering.

If we're able to build a better model, it is actually able to benefit so many different kinds of applications.

And definitely there’s a lot of things about how to build a better data [that] is unsolved yet in the literature.

And the other thing actually, we are working on something that’s very exciting.

2 months, 2 weeks ago @ microsoft.com
Abstracts: NeurIPS 2024 with Dylan Foster

FOSTER: So this is a, kind of, a theoretical work on reinforcement learning, or RL.

FOSTER: Yeah, so if you look at these sort of RL problems with latent dynamics, this is something that’s actually received a lot of investigation in theory.

Like, can we take existing algorithms and use them to solve rich-observation RL problems in a modular fashion?

TINGLE: Dylan, I’d like to know—and I’m sure our audience would, too—what this work means when it comes to real-world application.

TINGLE: Well, Dylan Foster, thank you for joining us today to discuss your paper on reinforcement learning under latent dynamics.

2 months, 2 weeks ago @ microsoft.com
Abstracts: NeurIPS 2024 with Pranjal Chitale

The drawback of this approach is that it often misses the cultural nuances of local languages.

CHITALE: Now that we have created a benchmark, the next step is to evaluate how these multimodal models are performing on this benchmark.

So what we observed is there is a huge gap when it comes … in performance when we compare these proprietary offerings versus the open-source models.

These open-source models significantly lag behind the proprietary models.

CHITALE: CVQA is significant because it addresses a fundamental gap in how we evaluate vision-language and multimodal models today.

2 months, 2 weeks ago @ microsoft.com
Ideas: Economics and computation with Nicole Immorlica


2 months, 2 weeks ago @ microsoft.com
Ideas: The journey to DNA data storage

Joining me today to discuss the state of DNA data storage and some of our contributions are several members of the DNA Data Storage Project at Microsoft Research: Principal Researcher Bichlien Nguyen, Senior Researcher Jake Smith, and Partner Research Manager Sergey Yekhanin.

Once we do this, we return to our original data and we’ve completed, let’s call it, one DNA data storage cycle.

So, like, I mean, coding is an important aspect of this whole idea of DNA data storage because we have to deal with errors—it’s a new medium—but talking about error-correcting codes in the context of DNA data storage, so, I mean, usually, like … what are error-correcting codes about?

In DNA data storage, the …

3 months ago @ microsoft.com
Abstracts: November 14, 2024


3 months ago @ microsoft.com
Collaborators: Prompt engineering with Siddharth Suri and David Holtz

Researcher Siddharth Suri and professor David Holtz give a brief history of prompt engineering, discuss the debate behind their recent collaboration, and share what they found from studying how people’s approaches to prompting change as models advance.

Learn more:

3 months, 1 week ago @ blubrry.com
NLP Highlights
latest post None
Data Skeptic
latest post 2 hours ago
Networks of the Mind

A man goes into a bar… This is the beginning of a riddle that our guest, Yoed Kennet, an assistant professor at the Technion's Faculty of Data and Decision Sciences, uses to measure creativity in subjects. In our talk, Yoed speaks about how to combine cognitive science and network science to explore the complexities and decode the mysteries of the human mind. The listeners will learn how network science provides tools to map and analyze human memory, revealing how problem-solving and creativity emerge from changes in semantic memory structures. Key insights include the role of memory restructuring during moments of insight, the connection between semantic networks and creative thinking, and…

2 hours ago @ dataskeptic.com
LLMs and Graphs Synergy

In this episode, Garima Agrawal, a senior researcher and AI consultant, brings her years of experience in data science and artificial intelligence. Listeners will learn about the evolving role of knowledge graphs in augmenting large language models (LLMs) for domain-specific tasks and how these tools can mitigate issues like hallucination in AI systems. Key insights include how LLMs can leverage knowledge graphs to improve accuracy by integrating domain expertise, reducing hallucinations, and enabling better reasoning. Real-life applications discussed range from enhancing customer support systems with efficient FAQ retrieval to creating smarter AI-driven decision-making pipelines. Garima’s …

1 week, 1 day ago @ dataskeptic.com
A Network of Networks

In this episode, Bnaya Gross, a Fulbright postdoctoral fellow at the Center for Complex Network Research at Northwestern University, explores the transformative applications of network science in fields ranging from infrastructure to medicine, by studying the interactions between networks ("a network of networks"). Listeners will learn how interdependent networks provide a framework for understanding cascading failures, such as power outages, and how these insights transfer to physical systems like superconducting materials and biological networks. Key takeaways include understanding how dependencies between networks can amplify vulnerabilities, applying these principles to create resilient…

2 weeks ago @ dataskeptic.com
Auditing LLMs and Twitter

Our guests, Erwan Le Merrer and Gilles Tredan, are long-time collaborators in graph theory and distributed systems. They share their expertise on applying graph-based approaches to understanding both large language model (LLM) hallucinations and shadow banning on social media platforms. In this episode, listeners will learn how graph structures and metrics can reveal patterns in algorithmic behavior and platform moderation practices. Key insights include the use of graph theory to evaluate LLM outputs, uncovering patterns in hallucinated graphs that might hint at the underlying structure and training data of the models, and applying epidemic models to analyze the uneven spread of shadow ban…

2 weeks, 6 days ago @ dataskeptic.com
Fraud Detection with Graphs

In this episode, Šimon Mandlík, a PhD candidate at the Czech Technical University will talk with us about leveraging machine learning and graph-based techniques for cybersecurity applications. We'll learn how graphs are used to detect malicious activity in networks, such as identifying harmful domains and executable files by analyzing their relationships within vast datasets. This will include the use of hierarchical multi-instance learning (HML) to represent JSON-based network activity as graphs and the advantages of analyzing connections between entities (like clients, domains etc.). Our guest shows that while other graph methods (such as GNN or Label Propagation) lack in scalability or h…

3 weeks, 6 days ago @ dataskeptic.com
Optimizing Supply Chains with GNN

Thibaut Vidal, a professor at Polytechnique Montreal, specializes in leveraging advanced algorithms and machine learning to optimize supply chain operations. In this episode, listeners will learn how graph-based approaches can transform supply chains by enabling more efficient routing, districting, and decision-making in complex logistical networks. Key insights include the application of Graph Neural Networks to predict delivery costs, with potential to improve districting strategies for companies like UPS or Amazon and overcoming limitations of traditional heuristic methods. Thibaut’s work underscores the potential for GNN to reduce costs, enhance operational efficiency, and provide bette…

1 month ago @ dataskeptic.com
The Mystery Behind Large Graphs

Our guest in this episode is David Tench, a Grace Hopper postdoctoral fellow at Lawrence Berkeley National Labs, who specializes in scalable graph algorithms and compression techniques to tackle massive datasets. In this episode, we will learn how his techniques enable real-time analysis of large datasets, such as particle tracking in physics experiments or social network analysis, by reducing storage requirements while preserving critical structural properties. David also challenges the common belief that giant graphs are sparse by pointing to a potential bias: Maybe because of the challenges that exist in analyzing large dense graphs, we only see datasets of sparse graphs? The truth is ou…

1 month, 1 week ago @ dataskeptic.com
Customizing a Graph Solution

In this episode, Dave Bechberger, principal Graph Architect at AWS and author of "Graph Databases in Action", brings deep insights into the field of graph databases and their applications. Together we delve into specific scenarios in which Graph Databases provide unique solutions, such as in the fraud industry, and learn how to optimize our DB for questions around connections, such as "How are these entities related?" or "What patterns of interaction indicate anomalies?" This discussion sheds light on when organizations should consider adopting graph databases, particularly for cases that require scalable analysis of highly interconnected data and provides practical insights into leveraging…

2 months ago @ dataskeptic.com
Graph Transformations

Our guest in this episode, Adam Machowczyk, a PhD student at the University of Leicester, specializes in graph rewriting and its intersection with machine learning, particularly Graph Neural Networks. Adam explains how graph rewriting provides a formalized method to modify graphs using rule-based transformations, allowing for tasks like graph completion, attribute prediction, and structural evolution. Bridging the worlds of graph rewriting and machine learning, Adam's work aspires to open new possibilities for creating adaptive, scalable models capable of solving challenges that traditional methods struggle with, such as handling heterogeneous graphs or incorporating incremental updates efficiently. R…

2 months, 1 week ago @ dataskeptic.com
Networks for AB Testing

In this episode, the data scientist Wentao Su shares his experience in AB testing on social media platforms like LinkedIn and TikTok. We talk about how network science can enhance AB testing by accounting for complex social interactions, especially in environments where users are both viewers and content creators. These interactions might cause a "spillover effect," meaning a possible influence across experimental groups, which can distort results. To mitigate this effect, our guest presents heuristics and algorithms they developed ("one-degree label propagation") to allow for good results on big data with minimal running time, and so optimize user experience and advertiser performance in soc…

2 months, 3 weeks ago @ dataskeptic.com
Lessons from eGamer Networks

Alex Bisberg, a PhD candidate at the University of Southern California, specializes in network science and game analytics, with a focus on understanding social and competitive success in multiplayer online games. In this episode, listeners can expect to learn, from a network perspective, about players' interactions and patterns of behavior. Through his research on games, Alex sheds light on how network analysis and statistical tests might explain positive contagious behaviors, such as generosity, and explore the dynamics of collaboration and competition in gaming environments. These insights offer valuable lessons not only for game developers in enhancing player experience, engagement and rete…

3 months ago @ dataskeptic.com
Github Collaboration Network

In this episode we discuss the GitHub Collaboration Network with Behnaz Moradi-Jamei, assistant professor at James Madison University. As a network scientist, Behnaz created and analyzed a network of about 700,000 contributors to GitHub repositories. The network of collaborators on GitHub was created by identifying developers (nodes) and linking them with edges based on shared contributions to the same repositories. This means that if two developers contributed to the same project, an edge (connection) was formed between them, representing a collaborative relationship network consisting of 32 million such connections. By using algorithms for Community Detection, Behnaz's analysis reveals in…
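The construction described above (developers as nodes, with an edge whenever two developers contributed to the same repository) is a co-contribution projection, which can be sketched as follows; the repository-to-contributor mapping below is a toy assumption for illustration, not the real GitHub dataset.

```python
from itertools import combinations

# Toy input mapping repository -> contributors (illustrative, not real GitHub data).
repos = {
    "repo_a": ["alice", "bob", "carol"],
    "repo_b": ["bob", "dave"],
}

# Nodes are developers; an undirected edge links any two developers
# who contributed to the same repository.
edges = set()
for contributors in repos.values():
    for u, v in combinations(sorted(contributors), 2):
        edges.add((u, v))

print(sorted(edges))
# [('alice', 'bob'), ('alice', 'carol'), ('bob', 'carol'), ('bob', 'dave')]
```

At GitHub scale this pairwise expansion is quadratic per repository, so a real pipeline would prune very large repositories or weight edges by shared-repo counts, but the projection itself is the same idea.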

3 months, 1 week ago @ dataskeptic.com
Graphs and ML for Robotics
Graphs and ML for Robotics Graphs and ML for Robotics

We are joined by Abhishek Paudel, a PhD Student at George Mason University with a research focus on robotics, machine learning, and planning under uncertainty, using graph-based methods to enhance robot behavior. He explains how graph-based approaches can model environments, capture spatial relationships, and provide a framework for integrating multiple levels of planning and decision-making.

3 months, 2 weeks ago @ dataskeptic.com
Graphs for HPC and LLMs

We are joined by Maciej Besta, a senior researcher of sparse graph computations and large language models at the Scalable Parallel Computing Lab (SPCL). In this episode, we explore the intersection of graph theory and high-performance computing (HPC), Graph Neural Networks (GNNs) and LLMs.

3 months, 3 weeks ago @ dataskeptic.com
SuperDataScience
latest post 10 hours ago
863: TabPFN: Deep Learning for Tabular Data (That Actually Works!), with Prof. Frank Hutter

Tabular data, I'm sure you're familiar with them once I describe them, are data stored in a table format, tabular.

Frank Hutter: 00:06:22So, what is tabular data, and why is it so different than, yeah, vision data or speech data or text data and so on?

That's really, really cool.

We can't publish in Nature, but let's go for a next version, and let's make this really, really strong," because the first version, all that that did is ...

00:55:57In the university, I will keep an academic co-affiliation, and in the university, I will focus very much on tabular data as well then and research about the tabular data, things like interpretability.

10 hours ago @ superdatascience.com
862: In Case You Missed It in January 2025

If it's growing linearly, I think you're making incredibly extreme assumptions based on what evidence has shown us.

13:26Earlier I talked about how large language models are a subset of all the foundation models out there.

Even two years ago there was no such thing as only just starting foundation models and LLMs and so on.

So it's one of the foundation models by Amazon, which by the way just released their brand new foundation models called Nova.

Jon Krohn: 31:27All right, that's it for today's In Case You Missed It episode.

4 days, 10 hours ago @ superdatascience.com
861: From Pro Athlete to Data Engineer: Colleen Fotsch’s Inspiring Journey

00:09:34I actually got to coach some of the girls that were my teammates in college, which was really, really cool.

01:08:17So we did work with big businesses too, but we were really, really passionate about helping local small businesses continue to grow, which was really, really fun.

And that was a really, really big thing that I love.

But there's so many moments in between where there's opportunities to make it really, really fun.

And just stuff that honestly, I was getting really, really passionate about doing and I was seeing really good results without being in the gym for hours on end.

1 week ago @ superdatascience.com
860: DeepSeek R1: SOTA Reasoning at 1% of the Cost

This is episode number 860 on the DeepSeek R1.

At the time of writing, DeepSeek's R1 reasoning model is statistically tied for first place (within a 95% confidence interval) with the top models on the overall LM Arena leaderboard.

But speaking in rough approximations, training a single DeepSeek V3 or DeepSeek R1 model appears to cost on the order of millions of dollars.

I'm not going to go further into the technical details of the DeepSeek models in this episode.

I've got a link in the show notes to the R1 model from DeepSeek provided by Ollama, so you can do just that.

1 week, 4 days ago @ superdatascience.com
859: BAML: The Programming Language for AI, with Vaibhav Gupta

In this week’s guest interview, Vaibhav Gupta talks to Jon Krohn about creating a programming language, BAML, that helps companies save up to 30% on their AI costs. He explains how he started tailoring BAML to facilitate natural language generation interactions with AI models, how BAML helps companies optimize their outputs, and he also lets listeners into Boundary’s hiring process. This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (04:53) What BAML stands for (14:33) Making prompt engineering a serious practic…

2 weeks ago @ podtrac.com
858: Are You The Account Executive We’re Looking For?

Podcast Transcript
Jon Krohn: 00:05 This is episode 858 on the Account Executive we’re looking to hire.

Perhaps you are the person we’re looking for or maybe you know the person we are looking for!

And hopefully every once in a while you also do actually learn about a product or service that could be genuinely useful to you in this data science niche.

And it's been an amazing experience over these last eight years with many of the best known names in AI and machine learning data science more broadly coming as guests on to the show.

Well, as the world's most listened-to data-science podcast, we obviously have a fair bit of reach, particularly in this data science genre or, amongst machine lear…

2 weeks, 4 days ago @ superdatascience.com
857: How to Ensure AI Agents Are Accurate and Reliable, with Brooke Hopkins

I'll do one that's more personal and then one I think that's pretty related to work or informed a lot of my work.

And we talked about how voice agents are taking off because they provide a universal natural language API between businesses and with consumers.

So an autonomous agent is an agent that's navigating the world and responding to the world.

I think how this translates to voice agents is can the voice agent self-determine when the request is too complex for itself?

It's the same kind of thing with voice agents and with more and more agentic systems.

3 weeks ago @ superdatascience.com
856: The Fastest-Growing Jobs Are AI Jobs

Podcast Transcript
Jon Krohn: 00:05 This is Five-Minute Friday on the fastest growing jobs of 2024.

For AI engineer, for example, it shows that LLMs, natural language processing, and PyTorch are the most common skills.

Amongst the fastest growing jobs in the US, you've got at number one AI engineer, at number two AI consultant, and at number 12 AI researcher.

For example, as in the US, AI engineer is the number one fastest-growing role in the UK and in the Netherlands.

AI engineer is also the fifth-fastest growing role in Sweden, the sixth-fastest growing role in Canada and Israel, and the 12th fastest growing role in India.

3 weeks, 4 days ago @ superdatascience.com
855: Exponential Views on AI and Humanity’s Greatest Challenges, with Azeem Azhar

How can we use AI to solve global problems like the environmental crisis, and how will future AI start to manage increasingly complex workflows? Famed futurist Azeem Azhar talks to Jon Krohn about the future of AI as a force for good, how we can stay mindful of an evolving job market, and Azeem’s favorite tools for automating his workflows. This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn: (05:43) Azeem Azhar’s vision for AI’s future (14:16) How to prepare for technological shifts (20:35) How to be more like an A…

4 weeks ago @ podtrac.com
854: The Six Epochs of Intelligence Evolution

Join Jon Krohn as he unpacks Ray Kurzweil’s six epochs of intelligence evolution, a fascinating framework from The Singularity is Nearer. From the origins of atoms and molecules to the transformative future of brain-computer interfaces and cosmic intelligence, Jon explores how each stage builds upon the last. This quick yet profound journey reveals how humanity is shaping the Fifth Epoch—and hints at what’s next for intelligence in our universe. Additional materials: www.superdatascience.com/854 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month ago @ podtrac.com
853: Generative AI for Business, with Kirill Eremenko and Hadelin de Ponteves

In today's episode, Kirill and Hadelin detail what generative AI models, like large language models, are and how they fit within the broader category of foundation models.

So, now, with that example in hand and with a good understanding of the lifecycle of foundation models in general, there are a lot of foundation models out there.

But that's going to mean doing steps one, two, and three in the lifecycle, and that's going to cost a lot of money.

It's somewhere in between where you do get access to the foundation models and you can choose your foundation models, you can customize them.

So remember, Jon, when you were saying that actually LLMs are included in foundation models, because in fa…

1 month ago @ superdatascience.com
852: In Case You Missed It in December 2024

Podcast Transcript
Jon Krohn: 00:00:00 This is episode number 852, our In Case You Missed It in December episode.

I don't think we ever put that in production, but it gave us something we could use and measure against our outcomes.

So, you don't need to be worried about spending dollars or tens of dollars by switching to GPT-4o mini.

00:11:17You have data that's sensitive and that you are not comfortable sending that data to a third party.

And for some that's a good, that's going to be fun and pleasurable, and for others that's going to be utter doom and disaster, right?

1 month, 1 week ago @ superdatascience.com
851: Quantum ML: Real-World Applications Today, with Dr. Florian Neukart

Florian is chief product officer and member of the board at Terra Quantum, a leading quantum computing startup headquartered in Switzerland and Germany.

You gave a great presentation on quantum computing, quantum applications.

How does Terra Quantum address its... And this is a quote from Terra Quantum themselves, "Terra Quantum has a commitment to responsible innovation in the ethical implementation of quantum technology."

These are things that we would not have to do if we were only to focus on quantum computing and other security aspects of protecting against quantum computing.

Otherwise, share this episode with folks who would love to learn about quantum computing or quantum ML.

1 month, 1 week ago @ superdatascience.com
850: Continuous Calendar for 2025

For Jon Krohn, the continuous calendar gives him a realistic and uninterrupted overview of his time.

Well, it’s the start of another year, which means it's time for another continuous calendar from us here at SuperDataScience!

In order to do that efficiently, we love the continuous calendar format.

So if you’d like to get started today with your own super efficient continuous calendar in 2025, simply head to jonkrohn.com/cal25.

The good news for you listeners is that, in 2025, the SuperDataScience Podcast will be a bigger priority for me than ever before.

1 month, 2 weeks ago @ superdatascience.com
849: 2025 AI and Data Science Predictions, with Sadie St. Lawrence

And so I think that's something just to think about.

Sadie Lawrence: 00:36:05... particularly if you want me to get specific, but I really think it's us as consumers.

I could speak for a long time about this, but you are the guest, so I'm going to just let you go first, please.

I'm going to quickly preempt what you're going to say with a couple of recent episodes that I released on this topic.

Agentic AI, AI integration to everyday devices.

1 month, 2 weeks ago @ superdatascience.com
Data Science at Home
latest post 1 month, 3 weeks ago
Scaling Smart: AI, Data, and Building Future-Ready Enterprises with Josh Miramant (Ep. 276)

In this episode, we dive into the transformative world of AI, data analytics, and cloud infrastructure with Josh Miramant, CEO of Blue Orange Digital.

As a seasoned entrepreneur with over $25 million raised across ventures and two successful exits, Josh shares invaluable insights on scaling data-driven businesses, integrating machine learning frameworks, and navigating the rapidly evolving landscape of cloud data architecture.

From generative AI to large language models, Josh explores cutting-edge trends shaping financial services, real estate, and consumer goods.

Tune in for a masterclass in leveraging data for impact and innovation!

Links: https://blueorange.digital/ https://blueorange.digit…

1 month, 3 weeks ago @ datascienceathome.com
Autonomous Weapons and AI Warfare (Ep. 275)

AI is revolutionizing the military with autonomous drones, surveillance tech, and decision-making systems.

In this episode of Data Science at Home, we expose the cutting-edge tech reshaping defense—and the chilling ethical questions that follow.

🐦 Twitter: @DataScienceAtHome 📘 LinkedIn: Francesco Gad 📷 Instagram: https://www.instagram.com/datascienceathome/ 📘 Facebook: https://www.facebook.com/datascienceAH 💼 LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast 💬 Discord Channel: https://discord.gg/4UNKGf3
NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and machine learning.

S…

2 months ago @ datascienceathome.com
8 Proven Strategies to Scale Your AI Systems Like OpenAI! 🚀 (Ep. 274)

In this episode of Data Science at Home, we’re diving deep into the powerful strategies that top AI companies, like OpenAI, use to scale their systems to handle millions of requests every minute!

From stateless services and caching to the secrets of async processing, discover 8 essential strategies to make your AI and machine learning systems unstoppable.
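Of the strategies named above, caching is the simplest to sketch: memoize identical requests so a repeated prompt never triggers a second expensive model call. A minimal stand-in using Python's built-in `lru_cache` (the `run_model` function is a hypothetical placeholder, not anything from the episode):

```python
# Response caching, one of the scaling strategies mentioned above.
# run_model is a hypothetical placeholder for an expensive model call.
from functools import lru_cache

calls = 0  # counts real (non-cached) invocations

@lru_cache(maxsize=1024)
def run_model(prompt: str) -> str:
    global calls
    calls += 1
    return f"response to: {prompt}"

run_model("hello")
run_model("hello")  # served from cache; the model runs only once
run_model("world")
print(calls)  # 2
```

Because the cached function is pure and stateless from the caller's point of view, the same pattern scales out: any replica behind a load balancer can answer from a shared cache (e.g. Redis) instead of a per-process one.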

Instagram: https://www.instagram.com/datascienceathome/ Twitter: @datascienceathome Facebook: https://www.facebook.com/datascienceAH LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast Discord Channel: https://discord.gg/4UNKGf3
NEW TO DATA SCIENCE AT HOME?

Data Science at Home explores the latest in AI, data science, and ma…

2 months, 1 week ago @ datascienceathome.com
Humans vs. Bots: Are You Talking to a Machine Right Now? (Ep. 273)

Together, they explore the growing importance of distinguishing human-written from AI-generated text, discussing real-world examples from social media to news.

How reliable are current detection tools like DetectGPT?

What are the ethical and technical challenges ahead as AI continues to advance?

And is the balance between innovation and regulation tipping in the right direction?

Tune in for insights on the future of AI text detection and the broader implications for media, academia, and policy.

2 months, 3 weeks ago @ datascienceathome.com
AI bubble, Sam Altman’s Manifesto and other fairy tales for billionaires (Ep. 272)

Welcome to Data Science at Home, where we don’t just drink the AI Kool-Aid.

Today, we’re dissecting Sam Altman’s “AI manifesto”—a magical journey where, apparently, AI will fix everything from climate change to your grandma’s back pain.

In this episode, I’ll break down the bold (and often bizarre) claims in Altman’s grand speech for the Intelligence Age.

I’ll give you the real scoop on what’s realistic, what’s nonsense, and why some tech billionaires just can’t resist overselling.

Chapters:
00:00 – Intro
00:18 – CEO of Baidu Statement on AI Bubble
03:47 – News On Sam Altman OpenAI
06:43 – Online Manifesto “The Intelligence Age”
13:14 – Deep Learning
16:26 – AI Gets Better With Scale
17:45 – Conclu…

3 months ago @ datascienceathome.com
AI vs. The Planet: The Energy Crisis Behind the Chatbot Boom (Ep. 271)

In this episode of Data Science at Home, we dive into the hidden costs of AI’s rapid growth — specifically, its massive energy consumption.

With tools like ChatGPT reaching 200 million weekly active users, the environmental impact of AI is becoming impossible to ignore.

Each query, every training session, and every breakthrough comes with a price in kilowatt-hours, raising questions about AI’s sustainability.

Join us as we uncover the staggering figures behind AI’s energy demands and explore practical solutions for the future.

From efficiency-focused algorithms and specialized hardware to decentralized learning, this episode examines how we can balance AI’s advancements with our planet’s …

3 months, 1 week ago @ datascienceathome.com
Love, Loss, and Algorithms: The Dangerous Realism of AI (Ep. 270)

Subscribe to our new channel: https://www.youtube.com/@DataScienceatHome
In this episode of Data Science at Home, we confront a tragic story highlighting the ethical and emotional complexities of AI technology.

This devastating event has sparked urgent discussions on the mental health risks, ethical responsibilities, and potential regulations surrounding AI chatbots, especially as they become increasingly lifelike.

🎙️ Topics Covered:AI & Emotional Attachment: How hyper-realistic AI chatbots can foster intense emotional bonds with users, especially vulnerable groups like adolescents.

Mental Health Risks: The potential for AI to unintentionally contribute to mental health issues, and the challe…

3 months, 2 weeks ago @ datascienceathome.com
VC Advice Exposed: When Investors Don’t Know What They Want (Ep. 269)

Ever feel like VC advice is all over the place?

That’s because it is.

In this episode, I expose the madness behind the money and how to navigate their confusing advice!

Watch the video at https://youtu.be/IBrPFyRMG1Q
Subscribe to our new Youtube channel https://www.youtube.com/@DataScienceatHome
00:00 – Introduction
00:16 – The Wild World of VC Advice
02:01 – Grow Fast vs. Grow Slow
05:00 – Listen to Customers or Innovate Ahead
09:51 – Raise Big or Stay Lean?

14:20 – The Real VC Secret: Focus on Your Team and Vision
17:03 – Outro

3 months, 3 weeks ago @ datascienceathome.com
AI Says It Can Compress Better Than FLAC?! Hold My Entropy 🍿 (Ep. 268)

In this episode of Data Science at Home, Frag dives deep into the wild claims that Large Language Models (LLMs) like Chinchilla 70B are beating traditional lossless compression algorithms.

🧠💥But before you toss out your FLAC collection, let’s break down Shannon’s Source Coding Theorem and why entropy sets the ultimate limit on lossless compression.
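Shannon's bound is easy to check numerically: for a memoryless source with symbol probabilities p_i, no lossless code can average fewer than H = -Σ p_i log2(p_i) bits per symbol. A quick sketch with illustrative probabilities (not figures from the episode):

```python
# Shannon's source-coding bound: no lossless code averages fewer than
# H = -sum(p * log2(p)) bits per symbol for a memoryless source.
# The probabilities below are illustrative example values.
import math

probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
entropy = -sum(p * math.log2(p) for p in probs.values())
print(entropy)  # 1.75 bits per symbol

# An optimal prefix code for this source (e.g. Huffman) meets the bound:
code_lengths = {"a": 1, "b": 2, "c": 3, "d": 3}
avg_len = sum(probs[s] * code_lengths[s] for s in probs)
print(avg_len)  # 1.75; no uniquely decodable code can average less
```

An LLM that "beats" a classical codec isn't breaking this limit: it is bringing a far better probability model to the arithmetic coder, lowering the effective entropy of the model's predictions, not of the source itself.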

We explore:
⚙️ How LLMs leverage probabilistic patterns for compression
📉 Why compression efficiency doesn’t equal general intelligence
🚀 The practical (and ridiculous) challenges of using AI for compression
💡 Can AI actually BREAK Shannon’s limit—or is it just an illusion?

If you love AI, algorithms, or just enjoy some good old myth-busting, thi…

4 months ago @ datascienceathome.com
What Big Tech Isn’t Telling You About AI (Ep. 267)

Are AI giants really building trustworthy systems?

A groundbreaking transparency report by Stanford, MIT, and Princeton says no.

In this episode, we expose the shocking lack of transparency in AI development and how it impacts bias, safety, and trust in the technology.

We’ll break down Gary Marcus’s demands for more openness and what consumers should know about the AI products shaping their lives.

Check our new YouTube channel https://www.youtube.com/@DataScienceatHome and Subscribe!

4 months, 1 week ago @ datascienceathome.com
Money, Cryptocurrencies, and AI: Exploring the Future of Finance with Chris Skinner [RB] (Ep. 266)

We’re revisiting one of our most popular episodes from last year, where renowned financial expert Chris Skinner explores the future of money.

In this fascinating discussion, Skinner dives deep into cryptocurrencies, digital currencies, AI, and even the metaverse.

He touches on government regulations, the role of tech in finance, and what these innovations mean for humanity.

Now, one year later, we encourage you to listen again and reflect—how much has changed?

Are Chris Skinner’s predictions still holding up, or has the financial landscape evolved in unexpected ways?

4 months, 1 week ago @ datascienceathome.com
Kaggle Kommando’s Data Disco: Laughing our Way Through AI Trends (Ep. 265) [RB]

In this episode, join me and the Kaggle Grand Master, Konrad Banachewicz, for a hilarious journey into the zany world of data science trends.

From algorithm acrobatics to AI, creativity, Hollywood movies, and music, we just can’t get enough.

It’s the typical episode with a dose of nerdy comedy you didn’t know you needed.

Buckle up, it’s a data disco, and we’re breaking down the binary!

Sponsors
Intrepid AI is an AI assisted all-in-one platform for robotics teams.

4 months, 2 weeks ago @ datascienceathome.com
AI and Video Game Development: Navigating the Future Frontier (Ep. 264) [RB]

In this episode we delve into the dynamic realm of game development and the transformative role of artificial intelligence (AI).

Join Frag, Jim and Mike as they explore the current landscape of game development processes, from initial creative ideation to the integration of AI-driven solutions.

With Mike’s expertise as a software executive and avid game developer, we uncover the potential of AI to revolutionize game design, streamline development cycles, and enhance player experiences.

Sponsors
Intrepid AI (https://intrepid.ai) is an AI assisted all-in-one platform for robotics teams.

Build robotics applications in minutes, not months.

4 months, 3 weeks ago @ datascienceathome.com
LLMs: Totally Not Making Stuff Up (they promise) (Ep. 263)

In this episode, we dive into the wild world of Large Language Models (LLMs) and their knack for… making things up.

Can they really generalize without throwing in some fictional facts?

Or is hallucination just part of their charm?

Let’s separate the genius from the guesswork in this insightful breakdown of AI’s creativity problem.

TL;DR: LLM generalisation without hallucinations.

4 months, 3 weeks ago @ datascienceathome.com
AI: The Bubble That Might Pop—What’s Next? (Ep. 262)

The hype around Generative AI is real, but is the bubble about to burst?

Join me as we dissect the recent downturn in AI investments and what it means for the tech giants like OpenAI and Nvidia.

Could this be the end of the AI gold rush, or just a bump in the road?

5 months, 2 weeks ago @ datascienceathome.com