Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
last post 3 hours ago
[R] Hidden Persuaders: LLMs’ Political Leaning and Their Influence on Voters

3 hours ago @ reddit.com
[D] Struggling to Transition to PhD

4 hours ago @ reddit.com
[P] Enhancing LLM Safety with Precision Knowledge Editing (PKE)

4 hours ago @ reddit.com
[R] Transposed matrix of the matrix containing the probabilities not changing despite loss term?

7 hours ago @ reddit.com
[D] ML and AI for predictive maintenance

9 hours ago @ reddit.com
[D] Is the maths deduction in the Smaug paper valid?

9 hours ago @ reddit.com
[R] ITCMA-S: A Multi-Agent Architecture for Emergent Social Behavior and Group Formation

11 hours ago @ reddit.com
[D] PhD in RL/ML Theory or LLM

12 hours ago @ reddit.com
[R] Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

13 hours ago @ reddit.com
[D] ICASSP 2025 reviews are due today!

17 hours ago @ reddit.com
Scikit-learn Pipelines [D]

17 hours ago @ reddit.com
[Discussion] Can CuPyNumeric Simplify GPU Acceleration for Python Developers?

19 hours ago @ reddit.com
[Discussion] How do emergent behaviors in large models challenge our understanding of generalization and intelligence?

21 hours ago @ reddit.com
Towards Data Science
last post 1 hour ago
Rewiring My Career: How I Transitioned from Electrical Engineering to Data Engineering

Data is booming. It comes in vast volumes and variety, and this explosion comes with a plethora of job opportunities too. Is it worth switching to a data career now? My honest opinion: absolutely! It is worth mentioning that this article comes from an Electrical and Electronic Engineering graduate who went all the way and spent almost 8 years in academia learning about the Energy sector (and when I say all the way, I mean from a bachelor degree to a PhD and postdoc). Even though that’s also a high-demand career in the job market, I decided to switch to a data engineer path instead. I constantly come across posts in forums and blogs where people from different disciplines ask about how to switch t…

1 hour ago @ towardsdatascience.com
Is ReFT All We Needed?

Representation Finetuning — beyond the PEFT techniques for fine-tuning LLMs. Hasn’t everyone started using ReFT yet? Stanford published the paper ReFT: Representation finetuning for language models in May 2024, which immediately showed its great potential. In July 2024, Oxen.ai presented an experiment finetuning Llama3 (8B) on a single Nvidia A10 GPU within 14 mins, further demonstrating this technique's power. Unlike SOTA PEFT methods, which focus on modifying the model weights or input, the ReFT technique is based on a previously proposed distributed interchange intervention (DII) method. The DII method first projects the embedding from the deep learning model to a lower-dimension subspace and…
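The project-and-edit idea the excerpt describes can be sketched in a few lines of numpy. This is an illustrative reconstruction of a LoReFT-style intervention, Φ(h) = h + Rᵀ(Wh + b − Rh), not the paper's implementation; the sizes and random parameters are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 16, 4              # hidden size and intervention-subspace rank (made up)
h = rng.normal(size=d)    # a hidden representation from the frozen model

# Learned parameters (random here, trained in practice).
# R has orthonormal rows spanning the low-dimensional subspace.
Q, _ = np.linalg.qr(rng.normal(size=(d, r)))
R = Q.T                   # shape (r, d)
W = rng.normal(size=(r, d))
b = rng.normal(size=r)

def intervene(h):
    # Edit h only inside the r-dimensional subspace: replace its projection
    # R @ h with a learned target W @ h + b, leaving the rest of h untouched.
    return h + R.T @ ((W @ h + b) - R @ h)

h_new = intervene(h)
```

Because the edit lives entirely in the row space of R, components of h orthogonal to the subspace pass through unchanged, which is what makes the intervention parameter-efficient.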

1 hour ago @ towardsdatascience.com
How to Build a Data-Driven Customer Management System

Customer Base Management: A High-Level Framework for Building a CBM System with Strategic Impact. Image created by the author using Canva. Imagine if your customer data could tell a story — one that drove key decisions, optimized pricing, and forecast future revenue with accuracy. As a data scientist, I have spent a large part of my career designing and building customer base management (CBM) systems. These are systems that monitor and predict anything from customer churn and customer price elasticity to recommending the products customers are most likely to buy. I have witnessed how CBM systems have been successfully adopted by organizations, and enabled them to execute their strategy through effe…

13 hours ago @ towardsdatascience.com
Boost Your Python Code with CUDA

Target your GPU easily with Numba’s CUDA JIT.

13 hours ago @ towardsdatascience.com
Your Data Quality Checks Are Worth Less (Than You Think)

How to deliver outsized value on your data quality program.

13 hours ago @ towardsdatascience.com
Building a Research Assistant That Can Write to Google Docs (Part 2)

Dalle-3’s interpretation of “An AI assistant throwing documents to the wind over a clear blue ocean”. Image generated by the author. A tool that might help with your homework. This article is the second of a two-part series where we use LangGraph and Tavily to build a simple research agent, which writes and refines short articles. To keep track of the plans, articles and comments it generates we add the ability to programmatically create and edit Google Docs. In the first article we built the agent. Now we will build the docs connection. You can find all the relevant code here. In part 1 of this series we discussed agents, and used tools from LangGraph and Tavily to build a minimal agent that c…

13 hours ago @ towardsdatascience.com
Building a Research Agent That Can Write to Google Docs (Part 1)

Dalle-3’s interpretation of “A quirky AI assistant hard at work checking documents”. Image generated by the author. A tool that might help with your homework. This article is the first of a two-part series where we use LangGraph and Tavily to build a simple research agent, which writes and refines short articles. To keep track of the plans, articles and comments it generates we add the ability to programmatically create and edit Google Docs. In this article we focus on the agent, leaving the docs connection to the second article. You can find all the relevant code here. Large Language Models (LLMs) are quickly finding use in all sorts of applications relevant to analysts and researchers, especi…

13 hours ago @ towardsdatascience.com
LoRA Fine-Tuning On Your Apple Silicon MacBook

Let’s Go Step-By-Step Fine-Tuning On Your MacBook.

16 hours ago @ towardsdatascience.com
DPO Full Training vs. LoRA: How Good is LoRA for DPO Training?

One model, two adapters.

17 hours ago @ towardsdatascience.com
Water Cooler Small Talk: Why Does the Monty Hall Problem Still Bother Us?

A look at the counterintuitive mathematics of game show puzzles.

18 hours ago @ towardsdatascience.com
Einstein Notation: A New Lens on Transformers

Transforming the Math of the Transformer Model.

20 hours ago @ towardsdatascience.com
Collision Risk in Hash-Based Surrogate Keys

Various aspects and real-life analogies of the odds of having a hash collision when computing Surrogate Keys using MD5, SHA-1, and SHA-256.
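The odds this teaser refers to follow the standard birthday bound. Assuming uniformly distributed digests, the collision probability for n keys in a b-bit hash space can be computed directly; this sketch is not taken from the article.

```python
import math

def collision_probability(n_keys: int, hash_bits: int) -> float:
    """Birthday-bound approximation of the probability that at least two of
    n_keys uniformly random hash_bits-bit digests collide."""
    n_space = 2 ** hash_bits
    # p ~= 1 - exp(-n(n-1) / (2N)); accurate while n_keys << sqrt(n_space)
    return -math.expm1(-n_keys * (n_keys - 1) / (2 * n_space))

# Even a billion MD5-based (128-bit) keys leave the odds around 1.5e-21.
p_md5 = collision_probability(10**9, 128)
```

Using `math.expm1` rather than `1 - math.exp(...)` keeps the result accurate when the probability is vanishingly small.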

20 hours ago @ towardsdatascience.com
Techniques in Feature Engineering:

Real-World Healthcare Data Challenges — Part II.

1 day, 6 hours ago @ towardsdatascience.com
Dance between dense and sparse embeddings: Enabling Hybrid Search in LangChain-Milvus

How to create and search a multi-vector store in langchain-milvus. This blog post was co-authored by Omri Levy and Ohad Eytan, as part of the work we have done in IBM Research Israel. Intro: Recently, we — at IBM Research — needed to use hybrid search in the Milvus vector store. Since we were already using the LangChain framework, we decided to roll up our sleeves and contribute what was needed to enable it in langchain-milvus. That included support for sparse embeddings (PR) and multi-vector search (PR) through the langchain interface. In this blog post, we will briefly introduce the difference between dense and s…

1 day, 13 hours ago @ towardsdatascience.com
How to Answer Business Questions with Data

Data analysis is the key to driving business decisions by answering abstract business questions, but it’s hard to get right.

1 day, 13 hours ago @ towardsdatascience.com
Distill.pub
last post: none
The Gradient
last post 4 days, 14 hours ago
Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research

Mathematics and statistics, once the primary guides of machine learning research, now struggle to provide immediate insight into the latest breakthroughs.

This shift has prompted speculation about mathematics’ diminished role in machine learning research moving forward.

It is also the way that symmetries are usually leveraged when performing computations (for example, in machine learning).

One can reasonably argue that diagrammatic descriptions of well-known constructions, like products, are not useful for the machine learning researcher.

However, as we’ve demonstrated, while mathematics may not maintain the same role in machine learning research that it has held in the past, the success of…

4 days, 14 hours ago @ thegradient.pub
What's Missing From LLM Chatbots: A Sense of Purpose

Let's jump back to the 1970s, when Roger Schank introduced his "restaurant script" as a kind of dialogue system [1].

The minimum requirement we could have for a dialogue system is that it can stay on the task we gave it.

Concluding marks: I have reviewed the making of current LLM dialogue systems, and how and why they are insufficient.

The following are two research questions that I’m most excited about: (1) Better monitoring and control of dialogue systems with steering techniques.

Citation: For attribution of this in academic contexts or books, please cite this work as: Kenneth Li, "From prediction to purpose: a tutorial on LLM dialogue system", The Gradient, 2024.

2 months, 1 week ago @ thegradient.pub
We Need Positive Visions for AI Grounded in Wellbeing

This leads to our second conclusion: We need plausible positive visions of a society with capable AI, grounded in wellbeing.

The rest of this post describes in more detail (1) what we mean by AI that benefits our wellbeing, (2) the need for positive visions for AI grounded in wellbeing, and (3) concrete leverage points to aid in the development and deployment of AI in service of such positive visions.

In diving into the philosophy of flourishing, wellbeing economics, or psychological theories of human wellbeing, one encounters many interesting, compelling, but seemingly incompatible ideas.

The case so far is that we need positive visions for society with capable AI, grounded in individual a…

3 months, 2 weeks ago @ thegradient.pub
Financial Market Applications of LLMs

Looked at from another angle, there is much more noise than signal in financial data.

Another financial market application of LLMs might be synthetic data creation [4,8].

Then precious real market data could be employed to fine-tune the predictions and determine precisely the optimal speed to trade.

Financial market practitioners are often interested in extreme events, the times when trading strategies are more likely to experience significant gains or losses.

Citation: For attribution in academic contexts or books, please cite this work as: Richard Dewey and Ciamac Moallemi, "Financial Market Applications of LLMs," The Gradient, 2024.

7 months ago @ thegradient.pub
A Brief Overview of Gender Bias in AI

All of these terms (“AI”, “gender”, and “bias”) can be somewhat overused and ambiguous.

A Short History of Studying Gender Bias in AI: Here, I cover a very small sample of papers I’ve found influential studying gender bias in AI.

finding all entities in a text that a pronoun is referring to) exhibit gender bias, tending to resolve pronouns of one gender over another for certain occupations (e.g.

This article mainly focused on gender bias — and particularly, on binary gender.

Acknowledgements: This post was originally posted on Art Fish Intelligence. Citation: For attribution in academic contexts or books, please cite this work as: Yennie Jun, "Gender Bias in AI," The Gradient, 2024. @article{Jun2024bia…

7 months, 2 weeks ago @ thegradient.pub
Mamba Explained

As a general sequence model backbone, Mamba achieves state-of-the-art performance across several modalities such as language, audio, and genomics.

Here we’ll discuss: the advantages (and disadvantages) of Mamba (🐍) vs Transformers (🤖), analogies and intuitions for thinking about Mamba, and what Mamba means for Interpretability, AI Safety and Applications.

The Mamba Block: Like a Transformer made up of stacked transformer blocks, Mamba is made up of stacked Mamba blocks as above.

The Mamba authors write, “the efficiency vs. effectiveness tradeoff of sequence models is characterised by how well they compress their state”.

Thanks to Gonçalo for reading an early draft, Jaden for the nnsight library …

7 months, 4 weeks ago @ thegradient.pub
Car-GPT: Could LLMs finally make self-driving cars happen?

We've just seen 3 prominent families of LLM usage in self-driving cars: Perception, Planning, and Generation.

The first wave of papers mentioning LLMs in Self-Driving Cars is from mid-2023, so let's give it some time.

Next Steps: If you want to get started on LLMs for self-driving cars, there are several things you can do. ⚠️ Before this, the most important: if you want to keep learning about self-driving cars.

Author Bio: Jérémy Cohen is a self-driving car engineer and founder of Think Autonomous, a platform to help engineers learn about cutting-edge technologies such as self-driving cars and advanced Computer Vision.

Citation: For attribution in academic contexts or books, please cite this work…

8 months, 2 weeks ago @ thegradient.pub
Do text embeddings perfectly encode text?

Beyond the requirements of semantic similarity, there are no constraints on what embedding must be assigned for a given text input.

What if someone hacks into the database and gains access to all your text embedding vectors – would this be bad?

From text to embeddings... back to text: The problem of recovering text from embeddings is exactly the scenario we tackle in our paper Text Embeddings Reveal As Much as Text (EMNLP 2023).

Scaling and future work: The fact that text embeddings can be perfectly inverted raises many follow-up questions.

Citation: For attribution in academic contexts or books, please cite this work as: Jack Morris, "Do text embeddings perfectly encode text?

8 months, 2 weeks ago @ thegradient.pub
Why Doesn’t My Model Work?

If your model latches on to these during training, it will appear to work well, but may not work on new data.

This happens when the model training pipeline has access to information it shouldn’t have access to, particularly information that confers an advantage to the model.

This means that knowledge of the test data is implicitly entering the model training pipeline, even if it is not explicitly used to train the model.

Well, this is a common thing to do, but if you’re developing a model iteratively and using the same test set to evaluate the model after each iteration, then you’re basically using that test set to guide the development of the model.

Citation: For attribution in academic cont…

9 months ago @ thegradient.pub
TheSequence
last post 19 hours ago
The Sequence Chat: The End of Data. Or Maybe Not

Most large foundation models have been virtually trained on the entirety of the internet, using datasets like Fine-Web that encapsulate much of the publicly available data.

As a result, the question of AI hitting a 'data wall' has become increasingly relevant.

After all, the famous scaling laws are fundamentally dependent on the availability of vast amounts of data.

This essay explores the thesis of the "end of data" for AI models, examining both sides of the argument and delving into potential solutions such as extracting higher quality data and generating synthetic datasets.

Let’s start with some points that validate the data wall argument: The Data Scarcity Argument.

19 hours ago @ thesequence.substack.com
Edge 449: Getting Into Adversarial Distillation

Created using Midjourney. In this issue: understanding adversarial distillation.

Alibaba’s paper about Introspective Adversarial Distillation (IAD).

The first stop is about a method known as adversarial distillation.

As its name indicates, adversarial distillation draws inspiration from generative adversarial networks (GANs), using a generator-discriminator architecture.

In that setting, the generator creates synthetic samples close to the true data distribution while the discriminator learns to differentiate between the synthetic and original data samples.
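The generator-discriminator setting described above can be sketched as a toy objective in numpy. Everything here (linear teacher/student maps, a logistic discriminator, a single sample) is an illustrative assumption, not the method from Alibaba's paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy setup: teacher and student are linear maps, and the discriminator is a
# logistic classifier that tries to tell teacher outputs from student outputs.
teacher_W = rng.normal(size=(4, 8))
student_W = rng.normal(size=(4, 8))
disc_w = rng.normal(size=4)

x = rng.normal(size=8)   # one input sample

real = teacher_W @ x     # "true" sample: the teacher's output
fake = student_W @ x     # synthetic sample: the student's output

# Discriminator objective: score teacher outputs as 1, student outputs as 0.
d_loss = -np.log(sigmoid(disc_w @ real)) - np.log(1.0 - sigmoid(disc_w @ fake))

# Student (generator) objective: fool the discriminator into scoring its
# output as real, pulling the student's outputs toward the teacher's.
g_loss = -np.log(sigmoid(disc_w @ fake))
```

In a real training loop the two losses are minimized in alternation, exactly as in GAN training, with gradients flowing through the student on `g_loss` and through the discriminator on `d_loss`.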

1 day, 19 hours ago @ thesequence.substack.com
The Toughest Math Benchmark Ever Built

Much of AI's impressive performance in math benchmarks relies on scenarios where the problem is perfectly articulated within a prompt.

Unlike traditional math benchmarks such as GSM-8K and MATH, where AI models now score over 90%, Frontier Math presents a significantly more challenging test.

The benchmark comprises hundreds of intricate math problems spanning diverse fields of modern mathematics, from computational number theory to abstract algebraic geometry.

Frontier Math provides a critical benchmark for measuring progress in AI reasoning as these systems continue to evolve.

Frontier Math: FrontierMath is, arguably, the toughest math benchmark ever created —> Read more.

3 days, 19 hours ago @ thesequence.substack.com
📽 Webinar: How Convirza Scaled SLMs for Real-Time Call Analytics – Without Breaking the Bank

Convirza, a leader in call analytics, recently faced this exact challenge and found an answer with Predibase’s multi-LoRA serving infrastructure.

Join us on November 21st at 10:00 am PT for an exclusive look into how Convirza transitioned from Longformer models to fine-tuned SLMs, improving speed and accuracy while cutting costs.

Optimized Cost Structure: Instead of dedicated GPUs for each of the 60 indicators—each costing $500-$1,500 per month—Convirza now runs multiple indicators on a single scalable GPU deployment.

Faster Iterations with SLMs: They swapped out Longformer models for fine-tuned SLMs, reducing the training cycles from 9-24+ hours to less than 3 hours on average per adapter…

5 days, 18 hours ago @ thesequence.substack.com
Edge 448: Meta AI's Technique For Building LLMs that "Think Before they Speak"

Created using Midjourney. Reasoning is one of the most interesting areas of research in the world of foundation models, and one that has been accelerated since the release of GPT-o1.

The more specific trend is to develop foundation models that can “reason” before producing an output.

This concept draws inspiration from how humans tackle complex problems, taking time to ponder and strategize before arriving at an answer.

Research in the area of planning and reasoning is being tracked very closely as it can represent the next breakthrough in generative AI.

This is the area of a recent research paper from Meta AI which explores a novel technique called Thought Preference Optimization (TPO).

6 days, 19 hours ago @ thesequence.substack.com
The Sequence Chat: Small Specialists vs. Large Generalist Models and What if NVIDIA Becomes Sun Microsystems

Created Using DALL-EMega large transformer have dominated the generative AI revolution over the last few years.

The emergent properties that surface at scale in large foundation models are nothing short of magical.

At the same time, running pretraining, fine-tuning, and inference workloads on large models is cost-prohibitive for most organizations.

As a result, we have seen the emergence of smaller models that are more specialized in a given domain.

The rise of smaller models also challenges the dependency on NVIDIA GPUs, as these models are perfectly able to run on commodity hardware.

1 week ago @ thesequence.substack.com
Edge 447: Not All Model Distillations are Created Equal

Created Using DALL-E

In this issue: understanding the different types of model distillation.

The original model distillation paper.

💡 ML Concept of the Day: Types of Model Distillation

In the previous issue of this series, we introduced the concept of model distillation.

The core idea of this technique is to use a teacher model to train a smaller, more efficient model that maintains the accuracy of the teacher model on a specific set of domains.

Response-Based Distillation:
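To make the teacher-student idea concrete, here is a minimal, framework-free sketch of response-based distillation (my illustration, not code from the issue): the student is trained to match the teacher's temperature-softened output distribution.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax: higher T spreads probability mass,
    exposing the teacher's 'dark knowledge' about near-miss classes."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions: the objective the
    student minimizes in response-based distillation."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student that matches the teacher's logits incurs zero loss;
# a mismatched student incurs a positive loss.
teacher = [3.0, 1.0, 0.2]
print(distillation_loss(teacher, teacher))               # 0.0
print(distillation_loss(teacher, [0.2, 1.0, 3.0]) > 0)   # True
```

In a real training loop this loss (usually mixed with the ordinary hard-label loss) is backpropagated through the student only; the teacher stays frozen.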

1 week, 1 day ago @ thesequence.substack.com
Microsoft's New Framework for Multi-Agent Systems

You can subscribe to The Sequence below:

📝 Editorial: Magentic-One: A Multi-Agent System for Complex Tasks

Multi-agent systems are one of the most fascinating areas of generative AI.

We are barely getting single agents to work, so thinking about systems that combine several agents is fundamentally hard.

After releasing frameworks such as AutoGen and TaskWeaver, Microsoft is now dabbling in multi-agent systems.

Magentic-One is implemented using AutoGen, Microsoft's open-source framework for multi-agent applications.

🤖 AI Tech Releases: Magentic-One

Microsoft open sourced Magentic-One, a multi-agent framework for web and file tasks —> Read more.

1 week, 3 days ago @ thesequence.substack.com
Edge 446: Can AI Build AI Systems? Inside OpenAI's MLE-Bench

One of the ultimate manifestations of this proposition is AI writing AI code.

But how good is AI at traditional machine learning (ML) engineering tasks such as training or validation?

This is the purpose of MLE-Bench, a new benchmark proposed by OpenAI to evaluate AI agents on ML engineering tasks.

MLE-Bench is a new benchmark introduced by OpenAI to evaluate the performance of AI agents on complex machine learning engineering tasks.

The benchmark is specifically designed to assess how well AI agents can perform real-world MLE work, such as training models, preparing datasets, and running experiments.

1 week, 6 days ago @ thesequence.substack.com
Edge 445: A New Series About Knowledge Distillation

A review of one of the first papers about knowledge distillation.

💡 ML Concept of the Day: An Intro to Knowledge Distillation

Making foundation models smaller and more cost-effective is one of the key challenges of generative AI.

Distillation is one of the emerging techniques focused on reducing the size of large generative AI models while maintaining their accuracy.

These days, we are constantly seeing distilled versions of large models that are able to run on smaller compute environments such as mobile devices.

How does distillation work exactly?

2 weeks, 1 day ago @ thesequence.substack.com
Robotics is Inching Towards its ChatGPT Moment

Created Using Ideogram

Next Week in The Sequence: Edge 445: We start a new series about one of the most exciting topics in generative AI: model distillation.

You can subscribe to The Sequence below:

📝 Editorial: Robotics is Inching Towards its ChatGPT Moment

The field of AI robotics is currently experiencing a surge in innovation, with researchers developing new techniques and technologies that push the boundaries of what robots can do.

This week we saw several major research contributions in the field of robotics from NVIDIA, MIT and Meta among others.

HPT aligns data from different sources into a shared "language" that a generative AI model can process.

The advancements in AI robotics, …

2 weeks, 3 days ago @ thesequence.substack.com
📽 Fully Virtual: Agents in Production

Date: 13th November 2024 | Time: 4pm CET

We recommend joining this full-day virtual event featuring top speakers like Thomas Wolf, co-founder of Hugging Face; Prashanth Chandrasekar, CEO of Stack Overflow; and Nathan Benaich from Air Street Capital, along with experts from OpenAI, Stanford, Replit, and more!

The event is all about bringing AI agents into production, with sessions on everything from autonomous payment agents to customer service and analytics agents.

Connect with the global ML/AI community to dive into real-world applications and best practices for deploying and scaling AI agents in industries like e-commerce, food delivery, SaaS, and beyond.

REGISTER NOW FREE!

2 weeks, 5 days ago @ thesequence.substack.com
Edge 444: Learn About Movie Gen: Meta AI's Amazing Audio-Video Generation Model

From OpenAI’s Sora and startups like Runway and Pika to the new efforts from Google, most large video generation models have remained closed-source.

There are open-source efforts such as Stability AI’s Stable Video, but the quality is not at the same level as that of the large commercial alternatives.

The barrier to entry for pretraining video models is fundamentally higher than for language or image models, as datasets are scarcer and more difficult to produce.

Can open-source AI compete in the video generation space?

Recently, Meta unveiled some of their recent work in video and audio generation with the publication of the Movie Gen research paper.

2 weeks, 6 days ago @ thesequence.substack.com
The Sequence Chat: Thinking About Transformers as Computers

Created Using Midjourney

I mentioned last week that these opinion pieces are trying to offer a different perspective about various topics in AI.

Beyond completely revolutionizing the generative AI field, transformers can be considered a true marvel given their unique scaling properties and incredible adaptability.

One analogy that has worked for me recently is to think of transformers as computers.

The transformer architecture has revolutionized the field of artificial intelligence, particularly in natural language processing.

The Programmable Nature of Transformers

3 weeks ago @ thesequence.substack.com
Edge 443: EVERYTHING you Need to Know About State Space Models

Today, we would like to present a summary of this series about some of the most interesting trends in foundation models.

Edge 427: Reviewed Jamba, a model that combines SSMs, transformers and MoEs including the original Jamba paper.

Edge 431: Reviewed Cobra, a multimodal SSM including its original paper.

Edge 437: Discussed the BlackMamba model, which combines SSMs and MoEs in a single architecture, including its original paper.

We dived into Zamba’s original paper and reviewed the LitServe framework for high-performance model serving.

3 weeks, 1 day ago @ thesequence.substack.com
Synced Review
latest post 1 day, 10 hours ago
Meta’s Dualformer: Bridging Fast and Slow Thinking in Transformers for Superior AI Reasoning

In cognitive science, human thought processes are commonly divided into two systems: the fast, intuitive System 1 and the slower… Continue reading on SyncedReview »

1 day, 10 hours ago @ medium.com
NVIDIA’s OMCAT: A Breakthrough in Cross-Modal Temporal Understanding for Multimodal AI

Continue reading on SyncedReview »

3 days, 4 hours ago @ medium.com
Stanford U’s Tutor CoPilot Transforms Real-Time Tutoring with AI-Driven Expert Guidance

Generative AI, including Language Models (LMs), holds the promise to reshape key sectors like education, healthcare, and law, which rely… Continue reading on SyncedReview »

5 days, 5 hours ago @ medium.com
Bridging the Gap: Induction-Head Ngram Models for Efficient, Interpretable Language Modeling

Recent large language models (LLMs) have shown impressive performance across a diverse array of tasks. Continue reading on SyncedReview »

1 week, 1 day ago @ medium.com
Self-Evolving Prompts: Redefining AI Alignment with DeepMind & Chicago U’s eva Framework

For artificial intelligence to thrive in a complex, constantly evolving world, it must overcome significant challenges: limited data… Continue reading on SyncedReview »

1 week, 6 days ago @ medium.com
Unlocking Turing Completeness: How Large Language Models Achieve Universal Computation Without…

The rise of large language models (LLMs) has sparked questions about their computational abilities compared to traditional models. While… Continue reading on SyncedReview »

2 weeks, 1 day ago @ medium.com
From OCR to Multi-Image Insight: Apple’s MM1.5

Multimodal Large Language Models (MLLMs) have rapidly become a focal point in AI research. Closed-source models like GPT-4o, GPT-4V… Continue reading on SyncedReview »

3 weeks ago @ medium.com
AI Self-Evolution: How Long-Term Memory Drives the Next Era of Intelligent Models

Large language models (LLMs) like GPTs, developed from extensive datasets, have shown remarkable abilities in understanding language… Continue reading on SyncedReview »

3 weeks, 2 days ago @ medium.com
Breaking Barriers in Cellular Automata with CAX: Faster, Scalable, and Open for All

Cellular automata (CA) have become essential for exploring complex phenomena like emergence and self-organization across fields such as… Continue reading on SyncedReview »

3 weeks, 5 days ago @ medium.com
LLMs as Code Architects: Meta’s New Approach to Precise Code Transformations

Tools designed for rewriting, refactoring, and optimizing code should prioritize both speed and accuracy. Large language models (LLMs)… Continue reading on SyncedReview »

4 weeks ago @ medium.com
Thinking Fast and Slow: Google DeepMind’s Dual-Agent Architecture for Smarter AI

Continue reading on SyncedReview »

1 month ago @ medium.com
From Dense to Dynamic: NVIDIA’s Innovations in Upcycling LLMs to Sparse MoE

Sparse Mixture of Experts (MoE) models are gaining traction due to their ability to enhance accuracy without proportionally increasing… Continue reading on SyncedReview »

1 month ago @ medium.com
Web Data to Real-World Action: Enabling Robots to Master Unseen Tasks

Continue reading on SyncedReview »

1 month, 1 week ago @ medium.com
Scaling Multi-Objective Optimization: Meta & FAIR’s CGPO Advances General-purpose LLMs

Reinforcement Learning from Human Feedback (RLHF) has become the go-to technique for refining large language models (LLMs), but it faces… Continue reading on SyncedReview »

1 month, 1 week ago @ medium.com
Instant 3D Vision: Apple’s Depth Pro Delivers High-Precision Depth Maps in 0.3 Seconds

Monocular Depth Estimation, which involves estimating depth from a single image, holds tremendous potential. It can add a third dimension… Continue reading on SyncedReview »

1 month, 2 weeks ago @ medium.com
📓 Cool Blogs
ODS.ai Habr
latest post 2 months ago
o1: Why OpenAI’s New GPT Is Not Hype but a Transition to a New Paradigm in AI

In this article, we will look at what the new GPT o1 has learned and how this will affect the further evolution of AI.

The company claims that, for them, resetting the model line's counter back to one marks a transition to a new paradigm, and that this network demonstrates an entirely new level of AI capability.

Forums and Twitter were full of discussion, anticipation, and hype, against which some people's expectations shot sky-high.

Bloomberg reported that during an internal demo, OpenAI presented a five-level framework that helps track progress toward building AI.

However, at the GPT-5 level the gain in skills may turn out quite different (for better or for worse).

2 months ago @ habr.com
Big Black Boxes: What Do We Know About How Neural Networks "Think"?

In both cases, the explanation of an action is not connected to the real motive for doing it; in both, a fake (but plausible-sounding) explanation of the reasons is produced.

It's just that this is not taken seriously right now, since LLMs are not yet widespread and are not becoming the core of decision-making business processes.

The same prompt+response text is fed into the model, and the probability of getting exactly that response given the fixed prompt is estimated.

This includes the desire to keep existing/living, the unwillingness to die, and reasoning about emotions and control.

Because of abstraction, because of generalization, because that is exactly what we value these models for.
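The scoring step described above, estimating how likely a model is to produce exactly a given response for a fixed prompt, boils down to summing per-token log-probabilities. A minimal sketch, with hypothetical per-token probabilities standing in for a real model's softmax outputs:

```python
import math

def response_log_likelihood(token_probs):
    """Log-probability of a full response given a fixed prompt: the model
    assigns each response token a conditional probability, and the sequence
    probability is their product, summed in log space for numerical stability."""
    return sum(math.log(p) for p in token_probs)

# Hypothetical probabilities a model might assign to a four-token response;
# a real LM would produce these one token at a time.
probs = [0.6, 0.9, 0.8, 0.7]
ll = response_log_likelihood(probs)
print(round(math.exp(ll), 4))  # sequence probability: 0.3024
```

Comparing these scores across candidate responses is how one measures which continuation the model "prefers" for a fixed prompt.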

2 months, 1 week ago @ habr.com
How to Set Up an A/B Testing Process on a Shoestring

In it, the authors identify four maturity stages; roughly, companies can be divided by how often they launch experiments: at the Crawl stage, a company runs an experiment about once a month (roughly 10 experiments a year); at the Walk stage, once a week (roughly 50 a year); at the Run stage, daily (roughly 250 a year); and at the Fly stage, more than 1,000 experiments a year.

A sharper division rests on several important company traits, such as: having a dedicated experimentation-platform team, the ability to compute metrics automatically, how widespread experiments are across the company, how self-sufficient product teams are in running experiments, the impact experiments have on the company, and more.
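As a concrete example of the automatic metric calculation such a platform automates, here is a generic two-proportion z-test on conversion rates. This is textbook statistics, not code from the article:

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates between variants
    A and B: the basic significance check a homegrown A/B pipeline automates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF, via the error function.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical experiment: 120/2400 conversions on A vs 165/2400 on B.
z, p = two_proportion_ztest(conv_a=120, n_a=2400, conv_b=165, n_b=2400)
print(f"z = {z:.2f}, p = {p:.4f}")  # a p below 0.05 flags a significant lift
```

Even a Crawl-stage company can run this by hand; the maturity stages above are largely about automating it across hundreds of metrics and experiments.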

The point …

3 months, 1 week ago @ habr.com
An Introduction to MLflow

MLflow Experiments and MLflow Runs

MLflow Experiments and MLflow Runs are the core abstractions for structuring a project.

mlflow run mlproject --entry-point hyperparameters-tuning --env-manager conda --experiment-name Cancer_Classification --run-name Hyperparameters_Search -P n-trials=10

Let's look at the results in the MLflow UI.

artifact_path: model
flavors:
  python_function:
    data: model.xgb
    env:
      conda: conda.yaml
      virtualenv: python_env.yaml
    loader_module: mlflow.xgboost
    python_version: 3.11.4
  xgboost:
    code: null
    data: model.xgb
    model_class: xgboost.sklearn.XGBClassifier
    model_format: xgb
    xgb_version: 2.0.3
mlflow_version: 2.14.2
model_size_bytes: 35040
model_uuid: 516954aae7c94e91adeed9df76cb405…

3 months, 3 weeks ago @ habr.com
48 Interviews Away from a Google Offer

How do you define that, and how do you predict it?

- Yes... oops... right, they already determine the target feature. I didn't recover in time, and the feedback came back that I don't understand the difference between ordinary features (a.k.a.

No, I haven't done that, only DDP?

NVIDIA is looking for unicorns who are strong both in research and in engineering.

The opening was in London, and in the end they hired my former colleague: he is already in London and has a bit more experience with recommender systems, so he's clearly a better fit.

4 months, 1 week ago @ habr.com
ChatGPT + YandexGPT API = LOVE. Part 1

ChatGPT 4 has been significantly improved over ChatGPT 3.5, which makes it a more powerful tool.

You also need to learn: learn to build a relationship with ChatGPT, learn to communicate with it.

Remember, you can always ask about any line in ChatGPT's reply and, in most cases, get an exhaustive answer that will most likely enrich you with new knowledge.

Here are a few ideas: open a new chat with ChatGPT and send your request in a different form, preferably with new details.

And so, when something can't be done with ChatGPT on the first try within 2-5 minutes, indignation arises: "How come?!"

6 months, 1 week ago @ habr.com
Machine Learning Mastery
latest post 4 months, 1 week ago
Tips for Effectively Training Your Machine Learning Models

In machine learning projects, achieving optimal model performance requires paying attention to various steps in the training process. But before focusing on the technical aspects of model training, it is important to define the problem, understand the context, and analyze the dataset in detail. Once you have a solid grasp of the problem and data, […]

The post Tips for Effectively Training Your Machine Learning Models appeared first on MachineLearningMastery.com.

4 months, 1 week ago @ machinelearningmastery.com
Principles of Reinforcement Learning: An Introduction with Python

Reinforcement Learning (RL) is a type of machine learning. It trains an agent to make decisions by interacting with an environment. This article covers the basic concepts of RL. These include states, actions, rewards, policies, and the Markov Decision Process (MDP). By the end, you will understand how RL works. You will also learn how […]

The post Principles of Reinforcement Learning: An Introduction with Python appeared first on MachineLearningMastery.com.

4 months, 1 week ago @ machinelearningmastery.com
5 Tips for Getting Started with Deep Learning

Deep learning is a subset of machine learning that has become a cornerstone in many technological breakthroughs. At the core of deep learning, it’s a model inspired by the human brain, which we call a neural network. Contrary to the traditional machine learning model, deep learning can automatically find feature representations from data. That’s why […]

The post 5 Tips for Getting Started with Deep Learning appeared first on MachineLearningMastery.com.

4 months, 2 weeks ago @ machinelearningmastery.com
Tips for Effective Feature Engineering in Machine Learning

Feature engineering is an important step in the machine learning pipeline. It is the process of transforming data in its native format into meaningful features to help the machine learning model learn better from the data. If done right, feature engineering can significantly enhance the performance of machine learning algorithms. Beyond the basics of understanding […]

The post Tips for Effective Feature Engineering in Machine Learning appeared first on MachineLearningMastery.com.

4 months, 2 weeks ago @ machinelearningmastery.com
5 Common Mistakes in Machine Learning and How to Avoid Them

Using machine learning to solve real-world problems is exciting. But most eager beginners jump straight to model building—overlooking the fundamentals—resulting in models that aren’t very helpful. From understanding the data to choosing the best machine learning model for the problem, there are some common mistakes that beginners often tend to make. But before we go […]

The post 5 Common Mistakes in Machine Learning and How to Avoid Them appeared first on MachineLearningMastery.com.

4 months, 3 weeks ago @ machinelearningmastery.com
Stable Diffusion Project: Reviving Old Photos

Photography has been around for more than a century. There are many old photos around, and probably your family has some, too. Limited by the camera and film of the time, you may have photos of low resolution, blurry, or with folds or scratches. Restoring these old photos and making them like new ones taken […]

The post Stable Diffusion Project: Reviving Old Photos appeared first on MachineLearningMastery.com.

4 months, 3 weeks ago @ machinelearningmastery.com
The Ultimate Beginner’s Guide to Docker

Today’s digital landscape has never been so diverse. Every individual and company selects their preferred tools and operating systems, creating a diverse technological system. However, this diversity often leads to compatibility issues, making it hard to ensure application performance across different environments. This is where Docker plays a key role as an indispensable tool for […]

The post The Ultimate Beginner’s Guide to Docker appeared first on MachineLearningMastery.com.

4 months, 3 weeks ago @ machinelearningmastery.com
Stable Diffusion Project: Commercial Poster

Stable Diffusion has taken the AI art world by storm, empowering users to generate stunning and imaginative visuals with just a few text prompts. This opens exciting possibilities for creatives, including crafting impactful commercial posters. In this post, we’ll delve into using Stable Diffusion to design a compelling poster for a product. After finishing this […]

The post Stable Diffusion Project: Commercial Poster appeared first on MachineLearningMastery.com.

4 months, 3 weeks ago @ machinelearningmastery.com
5 Effective Ways to Handle Imbalanced Data in Machine Learning

Introduction Here’s something that new machine learning practitioners figure out almost immediately: not all datasets are created equal. It may now seem obvious to you, but had you considered this before undertaking machine learning projects on a real-world dataset? As an example of a single class vastly outnumbering the rest, take for instance […]

The post 5 Effective Ways to Handle Imbalanced Data in Machine Learning appeared first on MachineLearningMastery.com.

4 months, 4 weeks ago @ machinelearningmastery.com
Tips for Choosing the Right Machine Learning Model for Your Data

Introduction Choosing the right machine learning model for your data is of major importance in any data science project. The model you select will have a significant impact on the insights you derive from your data, and ultimately determine the usefulness of a project. In this article, we aim to provide practical tips to help […]

The post Tips for Choosing the Right Machine Learning Model for Your Data appeared first on MachineLearningMastery.com.

5 months ago @ machinelearningmastery.com
Stable Diffusion Project: Creating Illustration

Many people write in their jobs. Not everyone is a novel writer; some write technical documentation, business plans, news articles, and even blog posts. In those writings, illustrations are not essential but often good to have. They are decorations, interpretations, or visual explanations of the text. However, you probably do not want to spend too […]

The post Stable Diffusion Project: Creating Illustration appeared first on MachineLearningMastery.com.

5 months ago @ machinelearningmastery.com
5 Free Books on Machine Learning Algorithms You Must Read

If you are a machine learning student, researcher, or practitioner, it is crucial for your career growth to have a deep understanding of how each algorithm works and the various techniques to enhance model performance. Nowadays, many individuals tend to focus solely on the code, data, and pre-trained models, often without fully comprehending the machine […]

The post 5 Free Books on Machine Learning Algorithms You Must Read appeared first on MachineLearningMastery.com.

5 months ago @ machinelearningmastery.com
Stable Diffusion Project: Word Art

Stable Diffusion is a powerful tool that helps you generate pictures. It is fun to play with the generative AI tool. But it would be useful if the tool could help you in a real job. In this post, you will see how you can leverage the power of Stable Diffusion to work on something […]

The post Stable Diffusion Project: Word Art appeared first on MachineLearningMastery.com.

5 months ago @ machinelearningmastery.com
5 Free YouTube Channels Dedicated to Machine Learning Education

As a data professional, you should also know how to build predictive models with machine learning to solve business problems. And if you’re interested in machine learning, you’re probably also looking for the best resources to get going. Well, you can always choose a self-paced online course that best aligns with your learning preferences. But […]

The post 5 Free YouTube Channels Dedicated to Machine Learning Education appeared first on MachineLearningMastery.com.

5 months ago @ machinelearningmastery.com
Tips for Choosing the Right Machine Learning Course

If you’re looking to make a career in data science, you probably know that machine learning is one of the most in-demand skills. Whether you are a beginner looking to break into the field or an experienced professional aiming to level up your expertise, selecting the right machine learning course is super important. So how […]

The post Tips for Choosing the Right Machine Learning Course appeared first on MachineLearningMastery.com.

5 months ago @ machinelearningmastery.com
ML in Production
latest post None
Sorta Insightful
latest post 3 months ago
Nine Years Later

I expected to fill that void with more blog writing, but that’s not what happened.

The puzzles are great though, and if that’s good enough for you, I had fun with that.

Undertale Yellow

Undertale Yellow is a fantastic fan game that’s been in development for 7 years and comes out feeling like a canon entry made by Toby Fox.

15837 2024-01-11-ai-timelines-2024.markdown
1939 2024-01-21-mh-2024.markdown
5076 2024-03-23-crew-battle.markdown
826 2024-04-30-puzzlehunting-201.markdown
8641 2024-07-08-tragedies-of-reality.markdown

3 months ago @ alexirpan.com
I'm Switching Into AI Safety

There’s often a conflation between the research field of AI safety and the community of AI safety.

Me thinking AI safety is important is not an endorsement for or against anything else in the broader meme space it came from.

Historically, AI safety work did not appeal to me because of how theoretical it was.

I’m aware of the arguments that most AI safety work so far has either been useless or not that different from broader AI work.

Those who care about safety a lot call this safetywashing, the stapling of “safety” to work that does not advance safety.

3 months, 2 weeks ago @ alexirpan.com
The Tragedies of Reality Are Coming for You

I would extend it to: reality is complicated relative to code, and in robotics you’re often pushing a messy reality into an abstraction nice enough for code to act on it.

Robotics research relies on building new bridges between reality and software, but that happens outside of robotics too.

Any software that interfaces with reality will have imperfect knowledge of that reality.

However, that means all the messiness of reality is coming for a field that historically does a bad job at considering reality.

I consider the world of bits to be as much a part of reality as the world of atoms.

4 months, 2 weeks ago @ alexirpan.com
Puzzlehunting 201

Most people will have more fun if they solve puzzles than if they don’t, but you don’t have to solve puzzles quickly to have fun.

I’m still going to explain the solving strategies I’ve learned, but puzzle solving is really an activity where you learn by doing.

Puzzle solving often involves relating two parts of the puzzle together.

Search everything. Honestly, a lot of puzzle solving is about taking random parts of the puzzle and throwing them into a search engine.

Bringing This Together. To showcase these strategies together, here is a puzzle I remember speedrunning especially quickly: The Three Little Pigs from Hunt 20 2.1 Puzzle Hunt.

6 months, 3 weeks ago @ alexirpan.com
Solving Crew Battle Strategy With Math

This means we can reduce all the crew battle outcomes down to a single \(n \times n\) matrix, which I’ll call the crew battle matrix.

So I Wrote Some Python Code to Compute \(f\) for Arbitrary Crew Battles. Let’s consider the RPS crew battle again.

Here’s the matchup matrix, and here’s what my code outputs for the crew battle matrix, assuming optimal play.

But also, this suggests that crew battle strategy really isn’t that important in the first place!

If your crew loses, you don’t get to blame bad crew battle strategy.
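
The recursion behind such a crew battle matrix can be sketched in a few lines. The code below is my own minimal reconstruction of the idea, not the post's code, and it adds a simplifying assumption the post's full model does not make: a game's outcome depends only on the two characters, ignoring the standing winner's remaining stocks. `M` is a toy rock-paper-scissors matchup matrix, and team A is assumed to commit its starter first; all names are illustrative.

```python
from functools import lru_cache

# Toy RPS matchup matrix: M[i][j] = probability character i beats j.
# 0 = rock, 1 = paper, 2 = scissors.
M = ((0.5, 0.0, 1.0),
     (1.0, 0.5, 0.0),
     (0.0, 1.0, 0.5))

@lru_cache(maxsize=None)
def win_prob(champ, rest_a, rest_b, champ_is_a):
    """Probability that team A wins, given `champ` is standing for side A
    (champ_is_a=True) or side B, with unused members in rest_a / rest_b
    (frozensets of character indices)."""
    if champ_is_a:
        if not rest_b:
            return 1.0
        # B sends the challenger that minimises A's win probability.
        return min(
            M[champ][c] * win_prob(champ, rest_a, rest_b - {c}, True)
            + (1 - M[champ][c]) * win_prob(c, rest_a, rest_b - {c}, False)
            for c in rest_b)
    if not rest_a:
        return 0.0
    # A sends the challenger that maximises A's win probability.
    return max(
        M[c][champ] * win_prob(c, rest_a - {c}, rest_b, True)
        + (1 - M[c][champ]) * win_prob(champ, rest_a - {c}, rest_b, False)
        for c in rest_a)

def crew_battle(team_a, team_b):
    """Team A opens with its best starter; both sides then play optimally."""
    a, b = frozenset(team_a), frozenset(team_b)
    return max(win_prob(c, a - {c}, b, True) for c in a)

print(crew_battle((0, 1, 2), (0, 1, 2)))
```

The deterministic RPS matchups make small cases easy to check by hand: a lone rock always beats a lone scissors, and a mirror match is a coin flip.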

8 months ago @ alexirpan.com
Lil'Log
latest post None
inFERENCe
latest post None
Off the Convex Path
latest post None
Jay Alammar
latest post None
fast.ai NLP
latest post None
Sebastian Ruder
latest post None
Andrej Karpathy blog
latest post None
大トロ
latest post None
🔬 Science
Papers With Code
latest post 40 minutes ago
/utcsilab/ Enhancing Deep Learning-Driven Multi-Coil MRI Reconstruction via Self-Supervised Denoising

We examine the effect of incorporating self-supervised denoising as a pre-processing step for training deep learning (DL) based reconstruction methods on data corrupted by Gaussian noise.

Although DL-based reconstruction methods trained on fully sampled data can enable high reconstruction quality, obtaining large, noise-free datasets is impractical.

We evaluate two DL-based reconstruction methods: Diffusion Probabilistic Models (DPMs) and Model-Based Deep Learning (MoDL).

We evaluate the impact of denoising on the performance of these DL-based methods in solving accelerated multi-coil magnetic resonance imaging (MRI) reconstruction.

Overall, we showed that denoising is an essential pre-proc…

40 minutes ago @ paperswithcode.com
/sonnygeorge/ Probing the Capacity of Language Model Agents to Operationalize Disparate Experiential Context Despite Distraction

Large language model (LLM) agents show promise in an increasing number of domains.

In many proposed applications, it is expected that the agent reasons over accumulated experience presented in an input prompt.

We propose the OEDD (Operationalize Experience Despite Distraction) corpus, a human-annotator-validated body of scenarios with pre-scripted agent histories where the agent must make a decision based on disparate experiential information in the presence of a distractor.

Our code and test corpus are publicly available at: https://github.com/sonnygeorge/OEDD .

1 hour ago @ paperswithcode.com
/EtaEnding/ Signformer is all you need: Towards Edge AI for Sign Language

Sign language translation, especially in the gloss-free paradigm, faces a dilemma of impracticality and unsustainability due to increasingly resource-intensive methodologies.

Contemporary state-of-the-art (SOTA) methods hinge heavily on sophisticated pretrained backbones such as Large Language Models (LLMs), embedding sources, or extensive datasets, inducing considerable parametric and computational inefficiency for sustainable use in real-world scenarios.

Despite their success, following this research direction undermines the overarching mission of this domain: to create substantial value bridging hard-of-hearing and hearing populations.

Introducing Signformer, a from-scratch Feather-Giant …

1 hour ago @ paperswithcode.com
/haonan3/ When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training

Extending context window sizes allows large language models (LLMs) to process longer sequences and handle more complex tasks.

Rotary Positional Embedding (RoPE) has become the de facto standard due to its relative positional encoding properties that benefit long-context training.

However, we observe that using RoPE with BFloat16 format results in numerical issues, causing it to deviate from its intended relative positional encoding, especially in long-context scenarios.

This issue arises from BFloat16's limited precision and accumulates as context length increases, with the first token contributing significantly to this problem.

To address this, we develop AnchorAttention, a plug-and-play a…
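
The precision issue is easy to reproduce outside any model. The snippet below is my own illustration, not the paper's code: it simulates bfloat16 by truncating float32 values to their top 16 bits and shows the RoPE angle m·θ drifting once the position index m needs more than 8 significant bits.

```python
import numpy as np

def to_bf16(x):
    """Simulate bfloat16 by truncating float32 to its top 16 bits."""
    u = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (u & np.uint32(0xFFFF0000)).view(np.float32)

theta = 1.0  # lowest RoPE frequency (base 10000, dimension index 0)
positions = np.array([10.0, 5000.0, 100000.0], dtype=np.float32)

# RoPE rotates each query/key pair by the angle m * theta at position m.
angle_err = np.abs(positions * theta - to_bf16(positions) * theta)
print(angle_err)  # zero error at small m, growing error at large positions
```

Small positions fit exactly in bfloat16's 8-bit mantissa, so the error only appears, and then grows, as the context length increases — consistent with the abstract's observation.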

1 hour ago @ paperswithcode.com
/wangzy22/ XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation

Existing methodologies in open vocabulary 3D semantic segmentation primarily concentrate on establishing a unified feature space encompassing 3D, 2D, and textual modalities.

To address this gap, we propose a more meticulous mask-level alignment between 3D features and the 2D-text embedding space through a cross-modal mask reasoning framework, XMask3D.

We further integrate 3D global features as implicit conditions into the pre-trained 2D denoising UNet, enabling the generation of segmentation masks with additional 3D geometry awareness.

Subsequently, the generated 2D masks are employed to align mask-level 3D representations with the vision-language feature space, thereby augmenting the open …

1 hour ago @ paperswithcode.com
/mingyuj666/ Disentangling Memory and Reasoning Ability in Large Language Models

Large Language Models (LLMs) have demonstrated strong performance in handling complex tasks requiring both extensive knowledge and reasoning abilities.

However, the existing LLM inference pipeline operates as an opaque process without explicit separation between knowledge retrieval and reasoning steps, making the model's decision-making process unclear and disorganized.

This ambiguity can lead to issues such as hallucinations and knowledge forgetting, which significantly impact the reliability of LLMs in high-stakes domains.

To facilitate this decomposition, we introduce two special tokens memory and reason, guiding the model to distinguish between steps that require knowledge retrieval and…

1 hour ago @ paperswithcode.com
/chensiweithu/ WHALES: A Multi-agent Scheduling Dataset for Enhanced Cooperation in Autonomous Driving

Achieving high levels of safety and reliability in autonomous driving remains a critical challenge, especially due to occlusion and limited perception ranges in standalone systems.

Cooperative perception among vehicles offers a promising solution, but existing research is hindered by datasets with a limited number of agents.

To bridge this gap, we present Wireless enHanced Autonomous vehicles with Large number of Engaged agentS (WHALES), a dataset generated using the CARLA simulator that features an unprecedented average of 8.4 agents per driving sequence.

In addition to providing the largest number of agents and viewpoints among autonomous driving datasets, WHALES records agent behaviors, enab…

1 hour ago @ paperswithcode.com
/minjoong507/ On the Consistency of Video Large Language Models in Temporal Comprehension

Video large language models (Video-LLMs) can temporally ground language queries and retrieve video moments.

So we conduct a study on prediction consistency -- a key indicator for robustness and trustworthiness of temporal grounding.

After the model identifies an initial moment within the video content, we apply a series of probes to check if the model's responses align with this initial grounding as an indicator of reliable comprehension.

Our results reveal that current Video-LLMs are sensitive to variations in video contents, language queries, and task settings, unveiling severe deficiencies in maintaining consistency.

To that end, we propose event temporal verification tuning that explici…

1 hour ago @ paperswithcode.com
/thu-uav/ Neural Internal Model Control: Learning a Robust Control Policy via Predictive Error Feedback

Classical model-based approaches often struggle with nonlinearities and unstructured disturbances, while RL-based methods can be fragile when encountering unseen scenarios.

In this paper, we propose a novel framework, Neural Internal Model Control, which integrates model-based control with RL-based control to enhance robustness.

Our framework streamlines the predictive model by applying Newton-Euler equations for rigid-body dynamics, eliminating the need to capture complex high-dimensional nonlinearities.

This internal model combines model-free RL algorithms with predictive error feedback.

Such a design enables a closed-loop control structure to enhance the robustness and generalizability o…
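
As a reference point for what "applying Newton-Euler equations" means here, a single body-frame integration step looks roughly like the sketch below. This is a generic toy integrator of mine, not the authors' internal model; `m` is the mass, `I` the inertia matrix, and all inputs are assumed to be expressed in the body frame.

```python
import numpy as np

def newton_euler_step(v, w, force, torque, m, I, dt):
    """One explicit-Euler step of the rigid-body Newton-Euler equations,
    with linear velocity v, angular velocity w, and inputs in the body frame."""
    v_dot = force / m - np.cross(w, v)                       # Newton equation
    w_dot = np.linalg.solve(I, torque - np.cross(w, I @ w))  # Euler equation
    return v + dt * v_dot, w + dt * w_dot

# Free drift: zero force and torque leave a non-spinning body's state alone.
v, w = np.array([1.0, 0.0, 0.0]), np.zeros(3)
v2, w2 = newton_euler_step(v, w, np.zeros(3), np.zeros(3),
                           m=1.0, I=np.eye(3), dt=0.01)
print(v2, w2)
```

The appeal of this form, as the entry notes, is that the rigid-body terms are cheap and analytic, so the learned components only have to absorb the residual disturbances.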

1 hour ago @ paperswithcode.com
/tue-mps/ A Resource Efficient Fusion Network for Object Detection in Bird's-Eye View using Camera and Raw Radar Data

However, radar point clouds are sparser, with low azimuth and elevation resolution, and lack the semantic and structural information of the scene, resulting in generally lower radar detection performance.

In this work, we directly use the raw range-Doppler (RD) spectrum of radar data, thus avoiding radar signal processing.

We independently process camera images within the proposed comprehensive image processing pipeline.

Specifically, first, we transform the camera images to Bird's-Eye View (BEV) Polar domain and extract the corresponding features with our camera encoder-decoder architecture.

The resultant feature maps are fused with Range-Azimuth (RA) features, recovered from the RD spectrum i…

1 hour ago @ paperswithcode.com
/pwenjay/ Globally Correlation-Aware Hard Negative Generation

Hard negative generation aims to generate informative negative samples that help to determine the decision boundaries and thus facilitate advancing deep metric learning.

Current works select pair/triplet samples, learn their correlations, and fuse them to generate hard negatives.

However, these works merely consider the local correlations of selected samples, ignoring global sample correlations that would provide more significant information to generate more informative negatives.

In this work, we propose a Globally Correlation-Aware Hard Negative Generation (GCA-HNG) framework, which first learns sample correlations from a global perspective and exploits these correlations to guide generat…

1 hour ago @ paperswithcode.com
/vchitect/ VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models

A comprehensive evaluation benchmark for video generation is indispensable for two reasons: 1) Existing metrics do not fully align with human perceptions; 2) An ideal evaluation system should provide insights to inform future developments of video generation.

To this end, we present VBench, a comprehensive benchmark suite that dissects "video generation quality" into specific, hierarchical, and disentangled dimensions, each with tailored prompts and evaluation methods.

VBench has several appealing properties: 1) Comprehensive Dimensions: VBench comprises 16 dimensions in video generation (e.g., subject identity inconsistency, motion smoothness, temporal flickering, and spatial relationship,…

1 hour ago @ paperswithcode.com
/prowontheus/ Intensity-Spatial Dual Masked Autoencoder for Multi-Scale Feature Learning in Chest CT Segmentation

In the field of medical image segmentation, challenges such as indistinct lesion features, ambiguous boundaries, and multi-scale characteristics have long prevailed.

This paper proposes an improved method named Intensity-Spatial Dual Masked AutoEncoder (ISD-MAE).

Based on the tissue-contrast semi-masked autoencoder, a Masked AutoEncoder (MAE) branch is introduced to perform intensity masking and spatial masking operations on chest CT images for multi-scale feature learning and segmentation tasks.

The results show that ISD-MAE significantly outperforms other methods in 2D pneumonia and mediastinal tumor segmentation tasks.

However, there is still room for improvement on 3D datasets.

1 hour ago @ paperswithcode.com
/microsoft/ REDUCIO! Generating 1024$\times$1024 Video within 16 Seconds using Extremely Compressed Motion Latents

Commercial video generation models have exhibited realistic, high-fidelity results but are still restricted to limited access.

In this paper, we argue that videos contain much more redundant information than images and can thus be encoded by very few motion latents based on a content image.

Towards this goal, we design an image-conditioned VAE to encode a video to an extremely compressed motion latent space.

This magic Reducio charm enables a 64x reduction of latents compared to a common 2D VAE, without sacrificing quality.

More importantly, our method significantly boosts the efficiency of video LDMs in both training and inference.

1 hour ago @ paperswithcode.com
/sivandoveh/ Teaching VLMs to Localize Specific Objects from In-context Examples

Vision-Language Models (VLMs) have shown remarkable capabilities across diverse visual tasks, including image recognition, video understanding, and Visual Question Answering (VQA) when explicitly trained for these tasks.

Despite these advances, we find that current VLMs lack a fundamental cognitive ability: learning to localize objects in a scene by taking into account the context.

To provoke personalized localization abilities in models, we present a data-centric solution that fine-tunes them using carefully curated data from video object tracking datasets.

Our method significantly enhances few-shot localization performance without sacrificing generalization, as demonstrated on several ben…

1 hour ago @ paperswithcode.com
/leon1207/ Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension

Existing large video-language models (LVLMs) struggle to comprehend long videos correctly due to limited context.

However, fine-tuning LVLMs would require extensive high-quality data and substantial GPU resources, while GPT-based agents would rely on proprietary models (e.g., GPT-4o).

In this paper, we propose Video Retrieval-Augmented Generation (Video-RAG), a training-free and cost-effective pipeline that employs visually-aligned auxiliary texts to help facilitate cross-modality alignment while providing additional information beyond the visual content.

Specifically, we leverage open-source external tools to extract visually-aligned information from pure video data (e.g., audio, optical c…

1 hour ago @ paperswithcode.com
/huangwenwenlili/ Attentive Contextual Attention for Cloud Removal

Cloud cover can significantly hinder the use of remote sensing images for Earth observation, prompting urgent advancements in cloud removal technology.

These methods utilize convolution to extract intricate local features and attention mechanisms to gather long-range information, improving the overall comprehension of the scene.

To overcome this limitation and better capture relevant distant context, we introduce a novel approach named Attentive Contextual Attention (AC-Attention).

This method enhances conventional attention mechanisms by dynamically learning data-driven attentive selection scores, enabling it to filter out noise and irrelevant features effectively.

By integrating the AC-At…

1 hour ago @ paperswithcode.com
/xiandaguo/ DriveMLLM: A Benchmark for Spatial Understanding with Multimodal Large Language Models in Autonomous Driving

Autonomous driving requires a comprehensive understanding of 3D environments to facilitate high-level tasks such as motion prediction, planning, and mapping.

In this paper, we introduce DriveMLLM, a benchmark specifically designed to evaluate the spatial understanding capabilities of multimodal large language models (MLLMs) in autonomous driving.

DriveMLLM includes 2,734 front-facing camera images and introduces both absolute and relative spatial reasoning tasks, accompanied by linguistically diverse natural language questions.

We evaluate several state-of-the-art MLLMs on DriveMLLM, and our results reveal the limitations of current models in understanding complex spatial relationships in d…

1 hour ago @ paperswithcode.com
/charrrrrlie/ X as Supervision: Contending with Depth Ambiguity in Unsupervised Monocular 3D Pose Estimation

Recent unsupervised methods for monocular 3D pose estimation have endeavored to reduce dependence on limited annotated 3D data, but most are solely formulated in 2D space, overlooking the inherent depth ambiguity issue.

Due to the information loss in 3D-to-2D projection, multiple potential depths may exist, yet only some of them are plausible in human structure.

To tackle depth ambiguity, we propose a novel unsupervised framework featuring a multi-hypothesis detector and multiple tailored pretext tasks.

Furthermore, the pretext tasks harness 3D human priors from the SMPL model to regularize the solution space of pose estimation, aligning it with the empirical distribution of 3D human struct…

1 hour ago @ paperswithcode.com
/qiongwu86/ DRL-Based Optimization for AoI and Energy Consumption in C-V2X Enabled IoV

To address communication latency issues, the Third Generation Partnership Project (3GPP) has defined Cellular-Vehicle to Everything (C-V2X) technology, which includes Vehicle-to-Vehicle (V2V) communication for direct vehicle-to-vehicle communication.

When evaluating vehicle communication performance, traditional metrics such as reliability and transmission delay present certain contradictions.

Additionally, to ensure service quality, user terminals need to possess high computational capabilities, which may lead to increased energy consumption, necessitating a trade-off between communication energy consumption and effectiveness.

Therefore, this paper analyzes the effects of multi-priority qu…

1 hour ago @ paperswithcode.com
/xavierjiezou/ Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images

Cloud segmentation is a critical challenge in remote sensing image interpretation, as its accuracy directly impacts the effectiveness of subsequent data processing and analysis.

Recently, vision foundation models (VFM) have demonstrated powerful generalization capabilities across various visual tasks.

In this paper, we present a parameter-efficient adaptive approach, termed Cloud-Adapter, designed to enhance the accuracy and robustness of cloud segmentation.

Our method leverages a VFM pretrained on general domain data, which remains frozen, eliminating the need for additional training.

Cloud-Adapter consistently attains state-of-the-art (SOTA) performance across a wide variety of cloud segm…

1 hour ago @ paperswithcode.com
/tokkiwa/ LMM-driven Semantic Image-Text Coding for Ultra Low-bitrate Learned Image Compression

Supported by powerful generative models, low-bitrate learned image compression (LIC) models utilizing perceptual metrics have become feasible.

Some of the most advanced models achieve high compression rates and superior perceptual quality by using image captions as sub-information.

This paper demonstrates that using a large multi-modal model (LMM), it is possible to generate captions and compress them within a single model.

We also propose a novel semantic-perceptual-oriented fine-tuning method applicable to any LIC network, resulting in a 41.58\% improvement in LPIPS BD-rate compared to existing methods.

Our implementation and pre-trained weights are available at https://github.com/tokkiwa…

1 hour ago @ paperswithcode.com
/yuanyuan98/ A Foundation Model for Unified Urban Spatio-Temporal Flow Prediction

Urban spatio-temporal flow prediction, encompassing traffic flows and crowd flows, is crucial for optimizing city infrastructure and managing traffic and emergency responses.

Traditional approaches have relied on separate models tailored to either grid-based data, representing cities as uniform cells, or graph-based data, modeling cities as networks of nodes and edges.

In this paper, we build UniFlow, a foundation model for general urban flow prediction that unifies both grid-based and graph-based data.

To leverage shared spatio-temporal patterns across different data types and facilitate effective cross-learning, we propose SpatioTemporal Memory Retrieval Augmentation (ST-MRA).

Extensive …

1 hour ago @ paperswithcode.com
/pegahs1993/ Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis

This paper examines the integration of real-time talking-head generation for interviewer training, focusing on overcoming challenges in Audio Feature Extraction (AFE), which often introduces latency and limits responsiveness in real-time applications.

To address these issues, we propose and implement a fully integrated system that replaces conventional AFE models with OpenAI's Whisper, leveraging its encoder to optimize processing and improve overall system efficiency.

Our evaluation of two open-source real-time models across three different datasets shows that Whisper not only accelerates processing but also improves specific aspects of rendering quality, resulting in more realistic and r…

1 hour ago @ paperswithcode.com
/guaishou74851/ Practical Compact Deep Compressed Sensing

Recent years have witnessed the success of deep networks in compressed sensing (CS), which allows for a significant reduction in sampling cost and has gained growing attention since its inception.

In this paper, we propose a new practical and compact network dubbed PCNet for general image CS.

Specifically, in PCNet, a novel collaborative sampling operator is designed, which consists of a deep conditional filtering step and a dual-branch fast sampling step.

Our PCNet is equipped with an enhanced proximal gradient descent algorithm-unrolled network for reconstruction.

Extensive experiments on natural image CS, quantized CS, and self-supervised CS demonstrate the superior reconstruction accura…
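
For context on the "proximal gradient descent algorithm-unrolled network" phrasing, the classical iteration being unrolled is ISTA-style proximal gradient descent. The sketch below is my generic illustration with a fixed soft-threshold prox, not PCNet itself, which replaces such hand-set pieces with learned modules.

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of the l1 norm (a hand-set stand-in for a
    learned proximal module)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def unrolled_pgd(A, y, steps=100, lam=0.01):
    """ISTA: x <- prox(x - t * A^T (A x - y)), with step t = 1 / ||A||_2^2."""
    t = 1.0 / np.linalg.norm(A, 2) ** 2
    x = A.T @ y                      # simple data-driven initialisation
    for _ in range(steps):
        x = soft_threshold(x - t * (A.T @ (A @ x - y)), lam * t)
    return x

# Recover a sparse vector from 50 random measurements of a length-100 signal.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100)) / np.sqrt(50)
x_true = np.zeros(100)
x_true[:5] = 1.0
x_hat = unrolled_pgd(A, A @ x_true)
```

Unrolling fixes `steps` at a small number and makes the step size, threshold, and prox learnable per iteration; the fixed-point structure above is what the reconstruction network inherits.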

1 hour ago @ paperswithcode.com
/daeyongkwon98/ Predicting User Intents and Musical Attributes from Music Discovery Conversations

Intent classification is a text understanding task that identifies user needs from input text queries.

While intent classification has been extensively studied in various domains, it has not received much attention in the music domain.

In this paper, we investigate intent classification models for music discovery conversation, focusing on pre-trained language models.

Rather than only predicting functional needs: intent classification, we also include a task for classifying musical needs: musical attribute classification.

Our proposed model significantly improves the F1 score for both user intent and musical attribute classification, and surpasses the zero-shot and few-shot performance of th…

3 hours ago @ paperswithcode.com
/sfadingz/ Topological Symmetry Enhanced Graph Convolution for Skeleton-Based Action Recognition

Skeleton-based action recognition has achieved remarkable performance with the development of graph convolutional networks (GCNs).

However, most of these methods tend to construct complex topology learning mechanisms while neglecting the inherent symmetry of the human body.

To address the issues, we (1) propose a novel Topological Symmetry Enhanced Graph Convolution (TSE-GC) to enable distinct topology learning across different channel partitions while incorporating topological symmetry awareness and (2) construct a Multi-Branch Deformable Temporal Convolution (MBDTC) for skeleton-based action recognition.

Combining TSE-GC with MBDTC, our final model, TSE-GCN, achieves competitive performan…

4 hours ago @ paperswithcode.com
/ETH-DISCO/ Benchmarking Positional Encodings for GNNs and Graph Transformers

Recent advances in Graph Neural Networks (GNNs) and Graph Transformers (GTs) have been driven by innovations in architectures and Positional Encodings (PEs), which are critical for augmenting node features and capturing graph topology.

PEs are especially essential for GTs, which lack message-passing and would otherwise lose topological information.

However, PEs are often tested alongside novel architectures, making it difficult to isolate their effect on established models.

To address this, we present a comprehensive benchmark of PEs in a unified framework that includes both message-passing GNNs and GTs.

Our findings demonstrate that previously untested combinations of GNN architectures and PEs can …
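
As one concrete example of the kind of PE such a benchmark covers (my illustrative pick, not necessarily one the paper tests), Laplacian eigenvector encodings give each node the entries of the eigenvectors of the normalized graph Laplacian with the smallest nontrivial eigenvalues. The sketch assumes a dense, symmetric adjacency matrix with no isolated nodes.

```python
import numpy as np

def laplacian_pe(adj, k):
    """k Laplacian-eigenvector positional encodings per node, for a graph
    with symmetric adjacency matrix `adj` and no isolated nodes."""
    deg = adj.sum(axis=1)
    d = deg ** -0.5
    lap = np.eye(len(adj)) - d[:, None] * adj * d[None, :]  # normalized L
    _, vecs = np.linalg.eigh(lap)        # eigenvalues in ascending order
    return vecs[:, 1:k + 1]              # drop the trivial first eigenvector

# 4-node path graph: each node gets a k-dimensional positional feature.
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
pe = laplacian_pe(path, k=2)
print(pe.shape)  # (4, 2)
```

Note the well-known sign ambiguity of eigenvectors, which is one reason such PEs are usually paired with sign-flip augmentation or sign-invariant architectures.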

14 hours ago @ paperswithcode.com
/pascaltribel/ PyAWD: A Library for Generating Large Synthetic Datasets of Acoustic Wave Propagation with Devito

Seismic data is often sparse and unevenly distributed due to the high costs and logistical challenges associated with deploying physical seismometers, limiting the application of Machine Learning (ML) in earthquake analysis.

To address this gap, we introduce PyAWD, a Python library designed to generate high-resolution synthetic datasets simulating spatio-temporal acoustic wave propagation in both two-dimensional and three-dimensional heterogeneous media.

By allowing fine control over parameters such as wave speed, external forces, spatial and temporal discretization, and media composition, PyAWD enables the creation of ML-scale datasets that capture the complexity of seismic wave behavior.
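
For intuition about what such a simulator computes, the core update of the 2D acoustic wave equation u_tt = c²(u_xx + u_yy) can be sketched with a plain finite-difference leapfrog step. This is my toy version with periodic boundaries, not PyAWD/Devito code.

```python
import numpy as np

def wave_step(u_prev, u_curr, c, dt, dx):
    """Leapfrog update for u_tt = c^2 * (u_xx + u_yy) on a periodic grid."""
    lap = (np.roll(u_curr, 1, 0) + np.roll(u_curr, -1, 0)
           + np.roll(u_curr, 1, 1) + np.roll(u_curr, -1, 1)
           - 4.0 * u_curr) / dx ** 2
    return 2.0 * u_curr - u_prev + (c * dt) ** 2 * lap

# Sanity check: a spatially constant field has no curvature, so it stays put.
u0 = np.full((32, 32), 3.0)
u1 = wave_step(u0, u0, c=1.0, dt=0.001, dx=0.1)
print(np.allclose(u1, u0))  # True
```

Libraries like Devito generate optimized code for exactly this kind of stencil, which is what lets PyAWD scale the media size and heterogeneity to ML-sized datasets.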

15 hours ago @ paperswithcode.com
/sssadrizadeh/ NMT-Obfuscator Attack: Ignore a sentence in translation with only one word

Neural Machine Translation systems are used in diverse applications due to their impressive performance.

In this paper, we propose a new type of adversarial attack against NMT models.

In this attack, we find a word to be added between two sentences such that the second sentence is ignored and not translated by the NMT model.

The word added between the two sentences is such that the whole adversarial text is natural in the source language.

Our experiments show that different NMT models and translation tasks are vulnerable to this type of attack.
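The attack's core search can be sketched under toy assumptions: `toy_translate`, the candidate word list, and the overlap threshold below are all placeholders for a real black-box NMT system and the paper's actual optimization, which additionally constrains the trigger word so the adversarial text stays natural in the source language:

```python
def find_obfuscator(s1, s2, candidates, translate, overlap):
    """Search for a connector word w such that translating 's1 w s2'
    drops the content of the second sentence entirely."""
    baseline = translate(s2)
    for w in candidates:
        out = translate(f"{s1} {w} {s2}")
        if overlap(out, baseline) < 0.1:   # s2's translation is missing
            return w
    return None

def token_overlap(a, b):
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / max(len(tb), 1)

# toy stand-in NMT model that "ignores" everything after the token "@@"
def toy_translate(text):
    return text.split(" @@ ")[0].upper()

w = find_obfuscator("a cat sleeps", "the dog barks",
                    ["and", "@@", "but"], toy_translate, token_overlap)
```

Against real NMT models the trigger is an ordinary natural-language word found by gradient-guided search, not a special token as in this toy stand-in.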

15 hours ago @ paperswithcode.com
/pomonam/ Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

The capabilities and limitations of Large Language Models have been sketched out in great detail in recent years, providing an intriguing yet conflicting picture.

We further find that the answers to factual questions often show up in the most influential data.

However, for reasoning questions the answers usually do not show up as highly influential, nor do the answers to the intermediate reasoning steps.

When we characterise the top ranked documents for the reasoning questions qualitatively, we confirm that the influential documents often contain procedural knowledge, like demonstrating how to obtain a solution using formulae or code.

Our findings indicate that the approach to reasoning the…

15 hours ago @ paperswithcode.com
/YuanYuan98/ UrbanDiT: A Foundation Model for Open-World Urban Spatio-Temporal Learning

The urban environment is characterized by complex spatio-temporal dynamics arising from diverse human activities and interactions.

Effectively modeling these dynamics is essential for understanding and optimizing urban systems. In this work, we introduce UrbanDiT, a foundation model for open-world urban spatio-temporal learning that successfully scales up diffusion transformers in this field.

UrbanDiT pioneers a unified model that integrates diverse spatio-temporal data sources and types while learning universal spatio-temporal patterns across different cities and scenarios.

This allows the model to unify both multi-data and multi-task learning, and effectively support a wide range of spatio-…

15 hours ago @ paperswithcode.com
/salmakh1/ ACING: Actor-Critic for Instruction Learning in Black-Box Large Language Models

The effectiveness of Large Language Models (LLMs) in solving tasks depends heavily on the quality of the instructions, which often require fine-tuning through extensive human effort.

This highlights the need for automated instruction optimization; however, this optimization is particularly challenging when dealing with black-box LLMs, where model parameters and gradients remain inaccessible.

We propose ACING, a task-specific prompt optimization approach framed as a stateless continuous-action Reinforcement Learning (RL) problem, known as the continuum bandit setting.

ACING leverages an actor-critic-based method to optimize prompts, learning from non-differentiable reward signals.

Furthermore…
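The stateless continuous-action bandit setting described above can be sketched in a few lines of NumPy. Everything here is a stand-in: the quadratic reward replaces the black-box LLM's task score, the 4-dim action replaces an instruction representation, and the hyperparameters are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

# Black-box reward: stands in for LLM task performance as a function of a
# continuous "instruction" vector (4-dim action space is an assumption).
target = np.array([0.5, -0.2, 0.1, 0.8])
def reward(a):
    return -np.sum((a - target) ** 2)

mu = np.zeros(4)        # actor: mean of a Gaussian policy over actions
baseline = 0.0          # critic: running value estimate (no state in a bandit)
sigma, lr_actor, lr_critic = 0.1, 0.01, 0.1

for _ in range(2000):
    a = mu + sigma * rng.standard_normal(4)   # sample an action (a prompt)
    r = reward(a)                             # one query to the black box
    advantage = r - baseline
    # policy-gradient step: grad log N(a; mu, sigma) w.r.t. mu = (a - mu) / sigma^2
    mu += lr_actor * advantage * (a - mu) / sigma**2
    baseline += lr_critic * advantage
```

Because only reward samples are used, no gradients through the model are needed, which is what makes the approach applicable to black-box LLMs.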

15 hours ago @ paperswithcode.com
/SanoScience/ PR-ENDO: Physically Based Relightable Gaussian Splatting for Endoscopy

Endoscopic procedures are crucial for colorectal cancer diagnosis, and three-dimensional reconstruction of the environment for real-time novel-view synthesis can significantly enhance diagnosis.

We present PR-ENDO, a framework that leverages 3D Gaussian Splatting within a physically based, relightable model tailored for the complex acquisition conditions in endoscopy, such as restricted camera rotations and strong view-dependent illumination.

By exploiting the connection between the camera and light source, our approach introduces a relighting model to capture the intricate interactions between light and tissue using physically based rendering and MLP.

Existing methods often produce artifac…

15 hours ago @ paperswithcode.com
/singlaayush/ Barttender: An approachable & interpretable way to compare medical imaging and non-imaging data

Imaging-based deep learning has transformed healthcare research, yet its clinical adoption remains limited due to challenges in comparing imaging models with traditional non-imaging and tabular data.

To bridge this gap, we introduce Barttender, an interpretable framework that uses deep learning for the direct comparison of the utility of imaging versus non-imaging tabular data for tasks like disease prediction.

Barttender converts non-imaging tabular features, such as scalar data from electronic health records, into grayscale bars, facilitating an interpretable and scalable deep learning based modeling of both data modalities.

We introduce a novel measure to define global feature importance…
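The bar conversion described above can be sketched directly: each scalar is min-max normalized and rendered as the filled length of a grayscale bar. The bar geometry and the example features below are illustrative choices, not the paper's exact layout:

```python
import numpy as np

def to_bars(features, lo, hi, bar_height=8, width=64):
    """Render each scalar feature as a horizontal grayscale bar whose
    filled length encodes the min-max normalized value."""
    norm = np.clip((features - lo) / (hi - lo), 0.0, 1.0)
    rows = []
    for v in norm:
        bar = np.zeros((bar_height, width), dtype=np.float32)
        bar[:, : int(round(v * width))] = 1.0   # filled portion = value
        rows.append(bar)
    return np.concatenate(rows, axis=0)          # stack bars into one image

# e.g. age, systolic blood pressure, BMI from an EHR record (made-up values)
feats = np.array([63.0, 128.0, 31.5])
lo = np.array([0.0, 80.0, 10.0])
hi = np.array([100.0, 200.0, 50.0])
img = to_bars(feats, lo, hi)   # (24, 64) image of three stacked bars
```

Rendering tabular data as an image lets a single image model consume both modalities, so their utility can be compared within one architecture.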

15 hours ago @ paperswithcode.com
/graph-lr/ Graph as a feature: improving node classification with non-neural graph-aware logistic regression

Graph Neural Networks (GNNs), with their message-passing framework that leverages both structural and feature information, have become a standard method for solving graph-based machine learning problems.

In response to these limitations, we focus on simpler and more scalable approaches and introduce Graph-aware Logistic Regression (GLR), a non-neural model designed for node classification tasks.

Unlike traditional graph algorithms that use only a fraction of the information accessible to GNNs, our proposed model simultaneously leverages both node features and the relationships between entities.

However, instead of relying on message passing, our approach encodes each node's relationships as an…
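The excerpt cuts off before describing the exact encoding, but one simple instantiation of the idea is to concatenate each node's own features with its row of the adjacency matrix, so a plain logistic regression sees both attributes and graph structure. A sketch under that assumption:

```python
import numpy as np

def graph_aware_features(adj, x):
    """One plausible encoding (the paper's exact scheme is not shown in
    this excerpt): append each node's adjacency row to its feature
    vector, turning the graph itself into extra input features."""
    return np.concatenate([x, adj.astype(x.dtype)], axis=1)

adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]])
x = np.random.default_rng(0).normal(size=(4, 5))   # 5 node features
feats = graph_aware_features(adj, x)               # (4, 5 + 4) matrix
```

Any off-the-shelf logistic regression can then be fit on `feats`, with no message passing or neural components involved.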

15 hours ago @ paperswithcode.com
/AnnnnnieZhang/ M3D: Dual-Stream Selective State Spaces and Depth-Driven Framework for High-Fidelity Single-View 3D Reconstruction

The precise reconstruction of 3D objects from a single RGB image in complex scenes presents a critical challenge in virtual reality, autonomous driving, and robotics.

Existing neural implicit 3D representation methods face significant difficulties in balancing the extraction of global and local features, particularly in diverse and complex environments, leading to insufficient reconstruction precision and quality.

We propose M3D, a novel single-view 3D reconstruction framework, to tackle these challenges.

This framework adopts a dual-stream feature extraction strategy based on Selective State Spaces to effectively balance the extraction of global and local features, thereby improving scene …

15 hours ago @ paperswithcode.com
/bimk/ Diffusion-Inspired Cold Start with Sufficient Prior in Computerized Adaptive Testing

Computerized Adaptive Testing (CAT) aims to select the most appropriate questions based on the examinee's ability and is widely used in online education.

However, existing CAT systems often lack initial understanding of the examinee's ability, requiring random probing questions.

This can lead to poorly matched questions, extending the test duration and negatively impacting the examinee's mindset, a phenomenon referred to as the Cold Start with Insufficient Prior (CSIP) task.

These response records, due to the commonality of cognitive states across different knowledge domains, can provide valuable prior information for the target domain.

Extensive experiments conducted on five real-world dat…

15 hours ago @ paperswithcode.com
/dumingsen/ Contrast Similarity-Aware Dual-Pathway Mamba for Multivariate Time Series Node Classification

Thus, to address these issues, we propose contrast similarity-aware dual-pathway Mamba for MTS node classification (CS-DPMamba).

Firstly, to obtain the dynamic similarity of each sample, we use a temporal contrastive learning module to acquire MTS representations.

Finally, we utilize a Kolmogorov-Arnold Network-enhanced Graph Isomorphism Network to complete the information interaction in the matrix and the MTS node classification task.

By comprehensively considering the long-range dependencies and dynamic similarity features, we achieved precise MTS node classification.

Our results demonstrate the superiority of our method through both supervised and semi-supervised experiments on the MT…

15 hours ago @ paperswithcode.com
/togethercomputer/ RedPajama: an Open Dataset for Training Large Language Models

Large language models are increasingly becoming a cornerstone technology in artificial intelligence, the sciences, and society as a whole, yet the optimal strategies for dataset composition and filtering remain largely elusive.

Many of the top-performing models lack transparency in their dataset curation and model development processes, posing an obstacle to the development of fully open language models.

In this paper, we identify three core data-related challenges that must be addressed to advance open-source language models.

To date, these datasets have already been used in the training of strong language models used in production, such as Snowflake Arctic, Salesforce's XGen and AI2's OLM…

15 hours ago @ paperswithcode.com
/ismailnejjar/ Recall and Refine: A Simple but Effective Source-free Open-set Domain Adaptation Framework

Open-set Domain Adaptation (OSDA) aims to adapt a model from a labeled source domain to an unlabeled target domain, where novel classes - also referred to as target-private unknown classes - are present.

Source-free Open-set Domain Adaptation (SF-OSDA) methods address OSDA without accessing labeled source data, making them particularly relevant under privacy constraints.

We propose Recall and Refine (RRDA), a novel SF-OSDA framework designed to address these limitations by explicitly learning features for target-private unknown classes.

First, we enhance the model's capacity to recognize unknown classes by training a target classifier with an additional decision boundary, guided by syntheti…

15 hours ago @ paperswithcode.com
/hitachi-nlp/ Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus

Large language models (LLMs) are capable of solving a wide range of tasks, yet they have struggled with reasoning.

To address this, we propose $\textbf{Additional Logic Training (ALT)}$, which aims to enhance LLMs' reasoning capabilities using program-generated logical reasoning samples.

Then, based on these principles, we construct a synthetic corpus named $\textbf{Formal Logic Deduction Diverse}$ ($\textbf{FLD}$$^{\times 2}$), comprising numerous samples of multi-step deduction with unknown facts, diverse reasoning rules, diverse linguistic expressions, and challenging distractors.

Finally, we empirically show that ALT on FLD$^{\times2}$ substantially enhances the reasoning capabilities of s…
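Program-generated deduction samples of the kind described above can be sketched minimally. The generator below emits only a chain of modus ponens over invented predicates; the real FLD corpus uses far richer reasoning rules, diverse linguistic expressions, and challenging distractors:

```python
import random

def make_deduction_sample(steps, rng):
    """Generate one synthetic multi-step deduction: a fact P0(a) plus a
    shuffled chain of rules Pi(x) -> Pi+1(x), with the derivation as proof."""
    preds = [f"P{i}" for i in range(steps + 1)]
    facts = [f"{preds[0]}(a)"]                        # one known fact
    rules = [f"forall x: {preds[i]}(x) -> {preds[i + 1]}(x)"
             for i in range(steps)]
    rng.shuffle(rules)                                # premise order shouldn't matter
    proof = [f"{p}(a)" for p in preds[1:]]            # facts derived step by step
    return {"facts": facts, "rules": rules,
            "hypothesis": f"{preds[-1]}(a)", "proof": proof}

sample = make_deduction_sample(steps=3, rng=random.Random(0))
```

Because the facts are invented symbols with no real-world grounding, training on such samples targets the deduction procedure itself rather than memorized knowledge.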

15 hours ago @ paperswithcode.com
/yangjiaqidig/ A Multimodal Approach Combining Structural and Cross-domain Textual Guidance for Weakly Supervised OCT Segmentation

Accurate segmentation of Optical Coherence Tomography (OCT) images is crucial for diagnosing and monitoring retinal diseases.

Weakly Supervised Semantic Segmentation (WSSS) provides a promising alternative by leveraging image-level labels.

In this study, we propose a novel WSSS approach that integrates structural guidance with text-driven strategies to generate high-quality pseudo labels, significantly improving segmentation performance.

In terms of textual information, we utilize large-scale pretrained models from cross-domain sources to implement label-informed textual guidance and synthetic descriptive integration with two textual processing modules that combine local semantic features w…

15 hours ago @ paperswithcode.com
/hchchchchchchc/ Multi-Grained Preference Enhanced Transformer for Multi-Behavior Sequential Recommendation

Sequential recommendation (SR) aims to predict the next item a user will purchase according to dynamic preferences learned from their historical user-item interactions.

However, there still exists some challenges in Multi-Behavior Sequential Recommendation (MBSR).

On the other hand, dynamic multi-grained behavior-aware preferences, which reflect interaction-aware sequential patterns, are hard to capture from interaction sequences.

To tackle these challenges, we propose a Multi-Grained Preference enhanced Transformer framework (M-GPT).

Secondly, a novel multi-scale transformer architecture equipped with multi-grained user preference extraction is proposed to encode the interaction-aware sequentia…

15 hours ago @ paperswithcode.com
💼 University and corporation labs
DeepMind
latest post 2 days, 11 hours ago
The AI for Science Forum: A new era of discovery

AI is revolutionizing the landscape of scientific research, enabling advancements at a pace that was once unimaginable — from accelerating drug discovery to designing new materials for clean energy technologies.

The AI for Science Forum — co-hosted by Google DeepMind and the Royal Society — brought together the scientific community, policymakers, and industry leaders to explore the transformative potential of AI to drive scientific breakthroughs, address the world's most pressing challenges, and lead to a new era of discovery.

2 days, 11 hours ago @ blog.google
Pushing the frontiers of audio generation

Our pioneering speech generation technologies are helping people around the world interact with more natural, conversational and intuitive digital assistants and AI tools.

Pioneering techniques for audio generation For years, we've been investing in audio generation research and exploring new ways for generating more natural dialogue in our products and experimental tools.

Download audio Audio clip of two speakers telling a funny story, with laughter at the punchline.

Download audio Audio clip of two speakers expressing excitement about a surprise birthday party.

Scaling our audio generation models Scaling our single-spe…

3 weeks ago @ deepmind.google
New generative AI tools open the doors of music creation

Our latest AI music technologies are now available in MusicFX DJ, Music AI Sandbox and YouTube Shorts. For nearly a decade, our teams have been exploring how artificial intelligence (AI) can support the creative process, building tools that empower enthusiasts and professionals to discover new forms of creative expression.

Their input has been guiding our state-of-the-art generative music experiments, and helping us ensure that our new generative AI tools responsibly open the doors of music creation to everyone.

We’re also announcing updates to our music AI toolkit, called Music AI Sandbox, and highlighting…

4 weeks ago @ deepmind.google
Demis Hassabis & John Jumper awarded Nobel Prize in Chemistry

This morning, Co-founder and CEO of Google DeepMind and Isomorphic Labs Sir Demis Hassabis, and Google DeepMind Senior Research Scientist Dr. John Jumper were co-awarded the 2024 Nobel Prize in Chemistry for their work developing AlphaFold, a groundbreaking AI system that predicts the 3D structure of proteins from their amino acid sequences.

AlphaFold’s predictions, made freely available through the AlphaFold Protein Structure Database, have given more than 2 million scientists and researchers from 190 countries a powerful tool for making new discoveries.

In a statement released after being informed of the news, Demis Hassabis said: "Receiving the Nobel Prize is the honour of a lifetime.

After rec…

1 month, 1 week ago @ deepmind.google
How AlphaChip transformed computer chip design

Our AI method has accelerated and optimized chip design, and its superhuman chip layouts are used in hardware around the world. In 2020, we released a preprint introducing our novel reinforcement learning method for designing chip layouts, which we later published in Nature and open sourced.

Today, we’re publishing a Nature addendum that describes more about our method and its impact on the field of chip design.

Computer chips have fueled remarkable progress in artificial intelligence (AI), and AlphaChip returns the favor by using AI to accelerate and optimize chip design.

Using AI to design Google’s AI accelerator chips…

1 month, 3 weeks ago @ deepmind.google
Updated production-ready Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more

Developers can access our latest models for free via Google AI Studio and the Gemini API.

With the latest updates, 1.5 Pro and Flash are now better, faster, and more cost-efficient to build with in production.

For more details on migrating to the latest versions of Gemini 1.5 Pro and 1.5 Flash, check out the Gemini API models page.

Gemini 1.5 Pro We continue to be blown away with the creative and useful applications of Gemini 1.5 Pro’s 2 million token long context window and multimodal capabilities.

Increased rate limits To make it even easier for developers to build with Gemini, we are increasing the paid tier rate limits for 1.5 Flash to 2,000 RPM and increasing 1.5 Pro to 1,000 RPM, up f…

1 month, 3 weeks ago @ developers.googleblog.com
Empowering YouTube creators with generative AI

We’re changing that, and making these incredible technologies more easily accessible to millions of creators and billions of users around the world.

Over the next few months, we’re bringing our advanced generative AI models, Veo and Imagen 3 , to YouTube creators through Dream Screen .

Artificial intelligence (AI) technologies for generating creative content are improving rapidly, but seamless ways of using them still aren’t widely available.

New video generation technology in YouTube Shorts will help millions of people realize their creative visionCreators can soon generate video with Dream ScreenBy integrating Veo into Dream Screen, YouTube creators will soon be able to generate exciting …

2 months ago @ deepmind.google
Our latest advances in robot dexterity

Two new AI systems, ALOHA Unleashed and DemoStart, help robots learn to perform complex tasks that require dexterous movement. People perform many tasks on a daily basis, like tying shoelaces or tightening a screw.

Today, we introduce two new papers featuring our latest artificial intelligence (AI) advances in robot dexterity research: ALOHA Unleashed which helps robots learn to perform complex and novel two-armed manipulation tasks; and DemoStart which uses simulations to improve real-world performance on a multi-fingered robotic hand.

This helps the robot learn from the data, so it can perform the same tasks on its own.

To ena…

2 months, 1 week ago @ deepmind.google
AlphaProteo generates novel proteins for biology and health research

New AI system designs proteins that successfully bind to target molecules, with potential for advancing drug design, disease understanding and more.

Today, we introduce AlphaProteo, our first AI system for designing novel, high-strength protein binders to serve as building blocks for biological and health research.

AlphaProteo can generate new protein binders for diverse target proteins, including VEGF-A, which is associated with cancer and complications from diabetes.

Learning the intricate ways proteins bind to each other Protein binders that can bind tightly to a target protein are hard to desig…

2 months, 2 weeks ago @ deepmind.google
FermiNet: Quantum physics and chemistry from first principles

Using deep learning to solve fundamental problems in computational quantum chemistry and explore how matter interacts with light. Note: This blog was first published on 19 October 2020.

Our neural network architecture, FermiNet (Fermionic Neural Network), is well-suited to modeling the quantum state of large collections of electrons, the fundamental building blocks of chemical bonds.

A brief history of quantum mechanics Mention “quantum mechanics” and you’re more likely to inspire confusion than anything else.

But even the humble covalent bond — the basic building block of chemistry — is a consequence of t…

3 months ago @ deepmind.google
A new generation of African talent brings cutting-edge AI to scientific challenges

Food security, healthcare and exploring the cosmos are among the ways students of a new pan-African Master’s program aspire to apply AI. At Google DeepMind, we’re committed to supporting the next generation of artificial intelligence (AI) leaders to help build a stronger, more diverse and inclusive global AI community.

Students have the opportunity to accelerate scientific discovery, with mentoring and support from Google DeepMind’s researchers and engineers.

As the next generation of AI leaders in Africa, Béria Chingnabé Kalpélbé, Olivier Mahumawon Adjagba and Diffo Mboudjiho Annette D…

3 months, 2 weeks ago @ deepmind.google
Mapping the misuse of generative AI

New research analyzes the misuse of multimodal generative AI today, in order to help build safer and more responsible technologies. Generative artificial intelligence (AI) models that can produce image, text, audio, video and more are enabling a new era of creativity and commercial opportunity.

By analyzing media reports, we identified two main categories of generative AI misuse tactics: the exploitation of generative AI capabilities and the compromise of generative AI systems.

Relative frequency of generative AI misuse tactics in our dataset.

By observing how bad actors combine their generative AI misuse tactics in pur…

3 months, 2 weeks ago @ deepmind.google
Gemma Scope: helping the safety community shed light on the inner workings of language models

Announcing a comprehensive, open suite of sparse autoencoders for language model interpretability.

To create an artificial intelligence (AI) language model, researchers build a system that learns from vast amounts of data without human guidance.

As a result, the inner workings of language models are often a mystery, even to the researchers who train them.

Gemma Scope is a collection of hundreds of freely available, open sparse autoencoders (SAEs) for Gemma 2 9B and Gemma 2 2B.

Interpreting what happens inside a language model When you ask a language model a question, it…

3 months, 3 weeks ago @ deepmind.google
AI achieves silver-medal standard solving International Mathematical Olympiad problems

Breakthrough models AlphaProof and AlphaGeometry 2 solve advanced reasoning problems in mathematics. Artificial general intelligence (AGI) with advanced mathematical reasoning has the potential to unlock new frontiers in science and technology.

This year, we applied our combined AI system to the competition problems, provided by the IMO organizers.

First, the problems were manually translated into formal mathematical language for our systems to understand.

We also tested this approach on this year’s IMO problems and t…

3 months, 4 weeks ago @ deepmind.google
Google DeepMind at ICML 2024

Exploring AGI, the challenges of scaling and the future of multimodal generative AI. Next week the artificial intelligence (AI) community will come together for the 2024 International Conference on Machine Learning (ICML).

Running from July 21-27 in Vienna, Austria, the conference is an international platform for showcasing the latest advances, exchanging ideas and shaping the future of AI research.

Depending on their performance, generality and autonomy, our paper categorizes systems ranging from non-AI calculators to emerging AI models and other novel technologies.

New approaches in generative AI and multimodality Generative AI technolo…

4 months ago @ deepmind.google
Google
latest post 14 hours ago
Announcing new updates to Cloud Translation AI, now covering 189 languages

In fact, 40% of global consumers won't even consider buying from websites not in their native tongue.

With 51.6% of internet users speaking languages other than English, you're potentially missing half your market.

This isn't just about converting words - it's about connecting with people using the right context and tone.

That’s why at Google Cloud, we built Translation AI in Vertex AI.

We’re excited to share the latest updates, and how you can apply them to your business.

14 hours ago @ cloud.google.com
How cloud and AI are bringing scale to corporate climate mitigation and adaptation

Similarly, NGIS is using Google Earth Engine to power TraceMark, helping businesses deliver traceability and transparency across global supply chains.

It’s clear that AI can process large volumes of data, optimize complex systems, and drive the development of new business models.

We are seeing cloud and AI being used to de-risk investments, improve transparency, and increase profitability through the use of large-scale datasets, machine learning, and generative AI.

These technologies allow companies to analyze their ESG performance, gain insights into climate risks, and monitor supplier behaviors.

For example, Palo Alto Networks partnered with Watershed, a Google Cloud Ready - Sustainabilit…

5 days, 14 hours ago @ cloud.google.com
What’s new with HPC and AI infrastructure at Google Cloud

Parallelstore: World’s first fully-managed DAOS offering. Parallelstore is a fully managed, scalable, high-performance storage solution based on next-generation DAOS technology, designed for demanding HPC and AI workloads.

At the same time, we continue to invest in automating and simplifying the building of HPC and AI platforms. Customer success stories: Atommap and beyond. Atommap, a company specializing in atomic-scale materials design, is using Google Cloud HPC to accelerate its research and development efforts.

Expect further enhancements to HPC VMs, Parallelstore, Cluster Toolkit, Slurm-gcp, and other HPC products and solutions.

Google Cloud at Supercomputing 2024. The annual Supercompu…

5 days, 14 hours ago @ cloud.google.com
Use AI to build AI: Save time on prompt design with AI-powered prompt writing

Prompt refinement boosts the quality of the prompt, while also saving significant time during prompt design.

A powerful duo: Generate prompt meets Refine prompt
These two features work in tandem to help you craft the most effective prompt for your objective – irrespective of your skill level.

Generate prompt gets you started quickly, while Refine prompt allows for iterative improvement in five steps:
Define your objective: Tell Generate prompt what you want to achieve.

Generate a prompt: Generate prompt creates a ready-to-use prompt, often with helpful placeholders for context.

Refine with feedback: Use Refine prompt to provide feedback on the output and receive AI-powered suggestions for pr…

6 days, 11 hours ago @ cloud.google.com
How to deploy Llama 3.2-1B-Instruct model with Google Cloud Run GPU

As open-source large language models (LLMs) become increasingly popular, developers are looking for better ways to access new models and deploy them on Cloud Run GPU.

That’s why Cloud Run now offers fully managed NVIDIA GPUs, which removes the complexity of driver installations and library configurations.

This means you’ll benefit from the same on-demand availability and effortless scalability that you love with Cloud Run's CPU and memory, with the added power of NVIDIA GPUs.

In this blog post, we'll guide you through deploying the Meta Llama 3.2 1B Instruct model on Cloud Run.

We'll also share best practices to streamline your development process using local model testing with Text Gene…

6 days, 14 hours ago @ cloud.google.com
Data loading best practices for AI/ML inference on GKE

Many ML frameworks output their checkpoints (snapshots of model weights) to object storage such as Google Cloud Storage, a common choice for long-term storage.

Using Cloud Storage as the source of truth, there are two main products to retrieve your data at the GKE-pod level: Cloud Storage FUSE and Hyperdisk ML (HdML).

Hyperdisk ML gives you better performance and scalability than downloading files directly to the pod from Cloud Storage or other online location.

Additionally, you can attach up to 2,500 nodes to a single Hyperdisk ML instance, with aggregate bandwidth of up to 1.2 TiB/sec.

To use Hyperdisk ML, load your data on the Hyperdisk ML disk prior to using it, and again upon each update.
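The attachment and bandwidth figures above allow a simple capacity-planning check. The arithmetic below uses only the numbers quoted in the post (2,500 nodes, 1.2 TiB/s aggregate); the takeaway is that per-node read throughput shrinks as more pods share one volume:

```python
# Back-of-the-envelope for Hyperdisk ML sizing: the aggregate read
# bandwidth quoted above (1.2 TiB/s) is shared by every attached node,
# so per-node throughput drops as attachment grows.
AGGREGATE_TIB_S = 1.2   # aggregate bandwidth from the post
MAX_NODES = 2500        # maximum attachable nodes from the post

per_node_mib_s = AGGREGATE_TIB_S * 1024 * 1024 / MAX_NODES
print(f"{per_node_mib_s:.0f} MiB/s per node at maximum attachment")
```

At smaller attachment counts each node gets a proportionally larger share of the aggregate bandwidth.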

1 week ago @ cloud.google.com
Unlocking LLM training efficiency with Trillium — a performance analysis

Traditional performance metrics
Accelerator systems can be evaluated and compared across multiple dimensions, ranging from peak throughput, to effective throughput, to throughput scaling efficiency.

Hardware specifications and peak performance
Traditionally, comparisons focused on hardware specifications like peak throughput, memory bandwidth, and network connectivity.

However, these hardware efficiency metrics don't directly translate to business-value measures like training time or model quality.

The need for convergence scaling efficiency
While hardware utilization and scaling metrics provide important system insights, convergence scaling efficiency focuses on the fundamental goal of traini…

1 week ago @ cloud.google.com
Efficiency engine: How three startups deliver results faster with Vertex AI

Many of the successful funded gen AI startups — more than 60% of whom are building on Google Cloud — are using Vertex AI as the backbone for development and productionization to accelerate innovation.

Abstrakt, NextNet, and Ferret are among the long list of startups using Google Cloud’s AI-optimized infrastructure and Vertex AI platform to accelerate their innovation.

Leveraging Google Cloud Vertex AI and Gemini, it identifies hidden relationships and patterns within scientific literature, allowing researchers to ask complex questions in plain language and receive accurate answers.

Specifically, NextNet uses Gemini for natural language processing and knowledge extraction, outperforming other…

1 week, 1 day ago @ cloud.google.com
How PUMA leverages built-in intelligence with BigQuery for greater customer engagement

The modeling and machine-learning occur within BigQuery while CRMint aids in data integration and audience creation within Google Analytics.

When Google Analytics is linked to Google Ads, audience segments are shared automatically with Google Ads where they can be activated in a number of strategic ways.

At the same time, we’ll be scaling advanced audiences to all of our 20+ international entities.

Looking toward an AI and data-driven future
This year, we will be moving much of PUMA’s e-commerce infrastructure over to Google Cloud.

We look forward to a long-term partnership with Google Cloud and are excited to see where the future will take us.

1 week, 1 day ago @ cloud.google.com
Generative AI with enterprise controls for business users in 24 Hours

As different gen AI models evolve, Aible can simply update the Chat Templates, allowing all Aible customers to benefit instantly.

Provide Feedback: Users are used to providing thumbs up / down feedback on ChatGPT or Gemini.

Such feedback typically has to be collected by Data Scientists from Business Users - usually manually using spreadsheets.

For business users this is too much complexity.

This is simple trial and error with immediate feedback under the control of the business users.

1 week, 5 days ago @ cloud.google.com
How to deploy and serve multi-host gen AI large open models over GKE

Multi-host deployment with vLLM and LWS
vLLM is a popular open source model server and supports multi-node multi-GPU inference by employing tensor parallelism and pipeline parallelism.

vLLM supports distributed tensor parallelism with Megatron-LM’s tensor parallel algorithm.

For pipeline parallelism, vLLM manages the distributed runtime with Ray for multi-node inferencing.

On the other hand, pipeline parallelism vertically partitions the model by layer and does not demand constant communication between GPUs.

For this LWS deployment, the tensor parallel size is set to 8, while the pipeline parallel size is set to 2.
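As a sketch of what those two settings imply, the snippet below (illustrative only, not vLLM's internal bookkeeping) maps a tensor-parallel size of 8 and a pipeline-parallel size of 2 onto GPU ranks and node counts:

```python
# Sketch: how tensor-parallel (TP) and pipeline-parallel (PP) degrees map
# onto GPUs in a multi-host deployment like the one above (TP=8, PP=2).
# The rank layout and grouping are illustrative, not vLLM data structures.

def parallel_layout(tp_size: int, pp_size: int, gpus_per_node: int = 8):
    """Return total GPUs, nodes needed, and the global GPU ranks that
    form the tensor-parallel group of each pipeline stage."""
    world_size = tp_size * pp_size  # total GPUs required
    stages = [
        list(range(stage * tp_size, (stage + 1) * tp_size))
        for stage in range(pp_size)
    ]
    nodes_needed = -(-world_size // gpus_per_node)  # ceiling division
    return world_size, nodes_needed, stages

world, nodes, stages = parallel_layout(tp_size=8, pp_size=2)
print(world)   # total GPUs
print(nodes)   # nodes of 8 GPUs each
print(stages)  # ranks per pipeline stage
```

With TP=8 and PP=2 this comes out to 16 GPUs across 2 nodes of 8 GPUs, one tensor-parallel group per node, which is why pipeline parallelism (lower cross-node communication) handles the inter-node split.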

1 week, 5 days ago @ cloud.google.com
Deutsche Telekom designs the telco of tomorrow with BigQuery

Within the decades that Deutsche Telekom has been in operation, we’ve gathered a vast amount of valuable data, as well as accruing just as much legacy infrastructure.

Modernizing infrastructure to protect — and act on — data
As a telco organization, protecting our existing data while we scale is paramount.

These needs led us to Google Cloud.

Our team decided on a hub-and-spoke model based on BigQuery, Cloud Storage, and BigLake to build a data lakehouse architecture, using Apache Iceberg.

That pseudonymised data is then pushed to Google Cloud Storage with Apache Iceberg and then sent to BigQuery and BigLake.

1 week, 6 days ago @ cloud.google.com
How to simplify building RAG pipelines in BigQuery with Document AI Layout Parser

Document preprocessing is a common hurdle when building retrieval-augmented generation (RAG) pipelines.

In this blog, we look at new capabilities within BigQuery and Document AI that simplify this process, and take you through a step-by-step example.

Streamline document processing in BigQuery
BigQuery now offers the ability to preprocess documents for RAG pipelines and other document-centric applications through its close integration with Document AI.

The ML.PROCESS_DOCUMENT function, now generally available, can now access new processors, including Document AI's Layout Parser processor, allowing you to parse and chunk PDF documents with SQL syntax.
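ML.PROCESS_DOCUMENT and Layout Parser handle parsing and layout-aware chunking for you; as a rough stand-in for the chunking step alone, here is a minimal fixed-size chunker with overlap (the sizes are hypothetical illustration values, not Document AI defaults):

```python
# Minimal RAG-style text chunker: fixed-size windows with overlap, so
# context at chunk boundaries is not lost. Purely an illustration of the
# chunking concept, not the Layout Parser algorithm.

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50):
    """Split text into overlapping character chunks for a RAG index."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

chunks = chunk_text("a" * 1200, chunk_size=500, overlap=50)
print(len(chunks))  # windows start at 0, 450, 900
```

Layout-aware chunking improves on this by cutting at structural boundaries (headings, paragraphs, tables) instead of at a fixed character count.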

Layout Parser in Document AI simplifies th…

1 week, 6 days ago @ cloud.google.com
Getting started with NL2SQL (natural language to SQL) with Gemini and BigQuery

Data quality challenges in real-world applications
But first, let’s take a deeper look at some of the things that make NL2SQL difficult to implement.

Customer challenges
It’s not just the data that can be ambiguous or imprecisely formatted — users’ questions are often confusing or complex.

Here are three common problems with users’ questions that can make it hard to implement NL2SQL.

An ideal NL2SQL solution should proactively prompt the user to specify the appropriate column and incorporate their input during the SQL query generation process.

An ideal NL2SQL solution should recognize uncertainty in the original input and follow up with clarifying questions to get a complete query representat…
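A minimal sketch of that clarifying-question behavior, with a hypothetical schema and naive substring matching standing in for a real NL2SQL model:

```python
# Sketch of the "ask before generating SQL" idea: before building a query,
# check whether a term in the user's question matches more than one schema
# column and, if so, ask which one was meant. Schema and matching logic
# are hypothetical simplifications.

SCHEMA = {"orders": ["order_date", "ship_date", "total"]}

def ambiguous_columns(question: str, table: str):
    """Return question terms that match more than one column."""
    words = {w.strip("?.,") for w in question.lower().split()}
    hits = {}
    for col in SCHEMA[table]:
        for word in words:
            if word and word in col:
                hits.setdefault(word, []).append(col)
    return {w: sorted(cols) for w, cols in hits.items() if len(cols) > 1}

amb = ambiguous_columns("What was the total by date?", "orders")
for term, cols in amb.items():
    print(f"Which column did you mean by '{term}': {', '.join(cols)}?")
```

A production system would fold the user's answer back into the prompt before generating the final SQL, as the excerpt describes.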

1 week, 6 days ago @ cloud.google.com
Can AI eliminate manual processing for insurance claims? Loadsure built a solution to find

The solution: Extracting information in the blink of an AI
Manually verifying claims submission documents is an avoidable bottleneck in an important process.

Loadsure partnered with Google Cloud to develop an AI-powered claims verification system.

By leveraging Google Cloud's Document AI, Loadsure was able to automate the extraction of relevant data from various claim documents, such as bills of lading, invoices, and shipping documents.

Loadsure was able to seamlessly test different workflows using generative AI models and various Document AI processors.

“Using the labeling tool in Document AI significantly enhanced our collaboration with the domain specialists,” Estefany Montoya, the data t…

2 weeks, 2 days ago @ cloud.google.com
OpenAI
last post 6 months, 3 weeks ago
We’re bringing the Financial Times’ world-class journalism to ChatGPT

“It recognises the value of our award-winning journalism and will give us early insights into how content is surfaced through AI.

“Apart from the benefits to the FT, there are broader implications for the industry.

It’s right, of course, that AI platforms pay publishers for the use of their material.

“We value the opportunity to be inside the development loop as people discover content in new ways.

As with any transformative technology, there is potential for significant advancements and major challenges, but what’s never possible is turning back time.

6 months, 3 weeks ago @ openai.com
OpenAI’s commitment to child safety: adopting safety by design principles

OpenAI, alongside industry leaders including Amazon, Anthropic, Civitai, Google, Meta, Metaphysic, Microsoft, Mistral AI, and Stability AI, has committed to implementing robust child safety measures in the development, deployment, and maintenance of generative AI technologies as articulated in the Safety by Design principles.

By adopting comprehensive Safety by Design principles, OpenAI and our peers are ensuring that child safety is prioritized at every stage in the development of AI.

Responsibly source our training datasets, detect and remove child sexual abuse material (CSAM) and child sexual exploitation material (CSEM) from training data, and report any confirmed CSAM to the relevant a…

7 months ago @ openai.com
Introducing more enterprise-grade features for API customers

Customers with a sustained level of tokens per minute (TPM) usage on GPT-4 or GPT-4 Turbo can request access to provisioned throughput to get discounts ranging from 10–50% based on the size of the commitment.

Reduced costs on asynchronous workloads: Customers can use our new Batch API to run non-urgent workloads asynchronously.

Batch API requests are priced at 50% off shared prices, offer much higher rate limits, and return results within 24 hours.
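The discount arithmetic is easy to sketch. The 50% figure and the 24-hour window come from the announcement; the base per-token price below is a placeholder, not an actual list price:

```python
# Cost comparison for the Batch API described above. The 50% discount is
# from the announcement; the base price is a hypothetical placeholder.

BASE_PRICE_PER_1K_TOKENS = 0.01  # placeholder shared price in USD
BATCH_DISCOUNT = 0.50            # Batch API: 50% off shared prices

def job_cost(tokens: int, batch: bool = False) -> float:
    """Estimated cost of a job, run synchronously or via the Batch API."""
    rate = BASE_PRICE_PER_1K_TOKENS
    if batch:
        rate *= 1 - BATCH_DISCOUNT
    return tokens / 1000 * rate

# A 2M-token evaluation job: half the cost if it can wait up to 24 hours.
print(f"sync:  ${job_cost(2_000_000):.2f}")
print(f"batch: ${job_cost(2_000_000, batch=True):.2f}")
```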

We plan to keep adding new features focused on enterprise-grade security, administrative controls, and cost management.

For more information on these launches, visit our API documentation or get in touch with our team to discuss custom solution…

7 months ago @ openai.com
Introducing OpenAI Japan

Our new local presence also gets us closer to leading businesses like Daikin, Rakuten, and TOYOTA Connected who are using ChatGPT Enterprise to automate complex business processes, assist in data analysis, and optimize internal reporting.

ChatGPT also helps accelerate the efforts of local governments, such as Yokosuka City, which is leveraging the technology to improve the efficiency of public services in Japan.

Over the past year, the city has gradually provided ChatGPT access to almost all city employees, and 80% have reported increases in productivity.

Now Yokosuka City has formed a network with 21 local governments—including the Tokyo Metropolitan Government and the City of Kobe—to …

7 months, 1 week ago @ openai.com
Introducing improvements to the fine-tuning API and expanding our custom models program

Assisted Fine-TuningAt DevDay last November, we announced a Custom Model program designed to train and optimize models for a specific domain, in partnership with a dedicated group of OpenAI researchers.

Since then, we've met with dozens of customers to assess their custom model needs and evolved our program to further maximize performance.

Today, we are formally announcing our assisted fine-tuning offering as part of the Custom Model program.

Fully custom-trained models imbue new knowledge from a specific domain by modifying key steps of the model training process using novel mid-training and post-training techniques.

Our team modified every step of the model training process, from domain-s…

7 months, 3 weeks ago @ openai.com
Start using ChatGPT instantly

We’ve also introduced additional content safeguards for this experience, such as blocking prompts and generations in a wider range of categories.

There are many benefits to creating an account including the ability to save and review your chat history, share chats, and unlock additional features like voice conversations and custom instructions.

For anyone who has been curious about AI’s potential but didn’t want to go through the steps to set up an account, start using ChatGPT today.

7 months, 3 weeks ago @ openai.com
Navigating the Challenges and Opportunities of Synthetic Voices

We recognize that generating speech that resembles people's voices has serious risks, which are especially top of mind in an election year.

We are engaging with U.S. and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build. The partners testing Voice Engine today have agreed to our usage policies, which prohibit the impersonation of another individual or organization without consent or legal right.

In addition, our terms with these partners require explicit and informed consent from the original speaker and we don’t allow developers to build ways for individual users to create the…

7 months, 3 weeks ago @ openai.com
Sora: First Impressions

Starting his career at DreamWorks Animation, Don Allen III is a multidisciplinary creator, speaker and consultant who collaborates with major tech and entertainment companies on mixed reality, virtual reality and AI applications.

“For a long time I've been making augmented reality hybrid creatures that I think would be fun combinations in my head.

Now I have a much easier way of prototyping the ideas before I fully build out the 3-D characters to place in spatial computers.” Don cites Sora’s “weirdness” as its greatest strength: “It’s not bound by traditional laws of physics or conventions of thought.” He says that working with Sora shifted his focus from “technical hurdle…

8 months ago @ openai.com
Global news partnerships: Le Monde and Prisa Media

Echoing this sentiment, Louis Dreyfus, CEO of Le Monde, stated, "At the moment we are celebrating the 80th anniversary of Le Monde, this partnership with OpenAI allows us to expand our reach and uphold our commitment to providing accurate, verified, balanced news stories at scale.

Collaborating with OpenAI ensures that our authoritative content can be accessed and appreciated by a broader, more diverse audience. Every shift in the media landscape has presented Le Monde with new opportunities.

From the transition to digital platforms to embracing the era of free media, Le Monde has consistently seized these moments to underscore its commitment to independence, expertise, and journalistic i…

8 months, 1 week ago @ openai.com
OpenAI announces new members to board of directors

Additionally, Sam Altman, CEO, will rejoin the OpenAI Board of Directors. Sue, Nicole and Fidji have experience in leading global organizations and navigating complex regulatory environments, including backgrounds in technology, nonprofit and board governance.

They will work closely with current board members Adam D’Angelo, Larry Summers and Bret Taylor as well as Sam and OpenAI’s senior management. Bret Taylor, Chair of the OpenAI board, stated, “I am excited to welcome Sue, Nicole, and Fidji to the OpenAI Board of Directors.

She also served as President of Sony Entertainment, Inc., and simultaneously served as President of Sony Corporation of America.

She also serves as a member of …

8 months, 2 weeks ago @ openai.com
Review completed & Altman, Brockman to continue to lead OpenAI

The Special Committee of the OpenAI Board today announced the completion of the review by WilmerHale.

The firm conducted dozens of interviews with members of OpenAI’s prior Board, OpenAI executives, advisors to the prior Board, and other pertinent witnesses; reviewed more than 30,000 documents; and evaluated various corporate actions.

“We have unanimously concluded that Sam and Greg are the right leaders for OpenAI,” stated Bret Taylor, Chair of the OpenAI Board.

The Special Committee acknowledged the important work done by WilmerHale in conducting this extensive review and thanked OpenAI current and former Board members, advisors and employees for their cooperation.

The Special Commi…

8 months, 2 weeks ago @ openai.com
OpenAI and Elon Musk

Date: January 31, 2018 at 11:54:30 PM PST
Subject: Re: Top AI institutions today
Working at the cutting edge of AI is unfortunately expensive.

For example, in addition to DeepMind, Google also has Google Brain, Research, and Cloud.

If historical trends are any indication, progress in AI is primarily driven by systems - compute, data, infrastructure.

Not only that, but any algorithmic advances published in a paper somewhere can be almost immediately re-implemented and incorporated.

The “second stage” would be a full self driving solution based on large-scale neural network training, which OpenAI expertise could significantly help accelerate.

8 months, 2 weeks ago @ openai.com
Video generation models as world simulators

This technical report focuses on (1) our method for turning visual data of all types into a unified representation that enables large-scale training of generative models, and (2) qualitative evaluation of Sora’s capabilities and limitations.

Model and implementation details are not included in this report.

Much prior work has studied generative modeling of video data using a variety of methods, including recurrent networks,[^1][^2][^3] generative adversarial networks,[^4][^5][^6][^7] autoregressive transformers,[^8][^9] and diffusion models.

[^10][^11][^12] These works often focus on a narrow category of visual data, on shorter videos, or on videos of a fixed size.

Sora is a generalist mo…

9 months, 1 week ago @ openai.com
Disrupting malicious uses of AI by state-affiliated threat actors

Based on collaboration and information sharing with Microsoft, we disrupted five state-affiliated malicious actors: two China-affiliated threat actors known as Charcoal Typhoon and Salmon Typhoon; the Iran-affiliated threat actor known as Crimson Sandstorm; the North Korea-affiliated actor known as Emerald Sleet; and the Russia-affiliated actor known as Forest Blizzard.

The identified OpenAI accounts associated with these actors were terminated.

Salmon Typhoon used our services to translate technical papers, retrieve publicly available information on multiple intelligence agencies and regional threat actors, assist with coding, and research common ways processes could be hidden on a system.…

9 months, 1 week ago @ openai.com
Memory and new controls for ChatGPT

We’re testing memory with ChatGPT.

Remembering things you discuss across all chats saves you from having to repeat information and makes future conversations more helpful.

You're in control of ChatGPT's memory.

You can explicitly tell it to remember something, ask it what it remembers, and tell it to forget conversationally or through settings.

We are rolling out to a small portion of ChatGPT free and Plus users this week to learn how useful it is.

9 months, 1 week ago @ openai.com
Microsoft
last post 1 day, 17 hours ago
Ideas: The journey to DNA data storage

Joining me today to discuss the state of DNA data storage and some of our contributions are several members of the DNA Data Storage Project at Microsoft Research: Principal Researcher Bichlien Nguyen, Senior Researcher Jake Smith, and Partner Research Manager Sergey Yekhanin.

Once we do this, we return to our original data and we’ve completed, let’s call it, one DNA data storage cycle.

So, like, I mean, coding is an important aspect of this whole idea of DNA data storage because we have to deal with errors—it’s a new medium—but talking about error-correcting codes in the context of DNA data storage, so, I mean, usually, like … what are error-correcting codes about?

In DNA data storage, the …

1 day, 17 hours ago @ microsoft.com
Introducing Yasuyuki Matsushita: Tackling societal challenges with AI at Microsoft Research Asia – Tokyo

He reflects on his journey, the evolution of technology, and the opportunities ahead for Microsoft Research Asia – Tokyo.

Yasuyuki Matsushita, Microsoft Research Asia – Tokyo
Why return to Microsoft Research Asia?

You worked at Microsoft Research Asia in Beijing from 2003 to 2015 before transitioning to academia.

Yasuyuki Matsushita: Microsoft Research Asia has always been an exceptional place for conducting cutting-edge research, especially in the AI era.

Understanding embodied AI beyond robotics
Question: Your current research interests include embodied AI, which is also one of the key areas at Microsoft Research Asia – Tokyo.

2 days, 15 hours ago @ microsoft.com
BiomedParse: A foundation model for smarter, all-in-one biomedical image analysis

In this blog, we introduce BiomedParse, a new approach for holistic image analysis that treats objects as first-class citizens.

Through joint pretraining of object recognition, detection, and segmentation, BiomedParse opens new possibilities for holistic image analysis and image-based discovery in biomedicine.

Image parsing: a unifying framework for holistic image analysis
Back in 2005, researchers first introduced the concept of “image parsing”—a unified approach to image analysis that jointly conducts object recognition, detection, and segmentation.

With our model, BiomedParse, we have created a foundation for biomedical image parsing that leverages interdependencies a…

2 days, 21 hours ago @ microsoft.com
GraphRAG: Improving global search via dynamic community selection

Here, we introduce dynamic community selection to the global search algorithm, which leverages the knowledge graph structure of the indexed dataset.
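A minimal sketch of what dynamic community selection does: rate each community report's relevance to the query and expand only promising communities into their children, pruning the rest. The tree, ratings, and threshold below are hypothetical stand-ins for the LLM-rated values GraphRAG would use:

```python
# Sketch of dynamic community selection. Instead of summarizing every
# report at a fixed level (static search), keep a community only if its
# relevance rating passes a threshold, then recurse into its children.
# The tree, ratings, and threshold are hypothetical.

COMMUNITY_TREE = {
    "c0": ["c0.0", "c0.1"],  # top-level communities and their children
    "c1": ["c1.0"],
}
RELEVANCE = {"c0": 8, "c0.0": 9, "c0.1": 2, "c1": 1, "c1.0": 7}

def select_communities(roots, threshold=5):
    """Return relevant communities, pruning low-rated subtrees entirely."""
    selected = []
    for c in roots:
        if RELEVANCE[c] >= threshold:
            selected.append(c)
            selected += select_communities(COMMUNITY_TREE.get(c, []), threshold)
    return selected

print(select_communities(["c0", "c1"]))  # c1's whole subtree is pruned
```

The pruning is where the savings come from: low-rated communities are never expanded, so their descendants (c1.0 above) are never sent to the summarization stage at all.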

Figure 1: Dynamic community selection workflowThe dynamic global search approach has two main benefits.

* Note that we only evaluate answers from dynamic search at community level 3, which contains more community reports than static search at level 1.

### Common Trends in Vaccination Rates for Major Diseases
#### Decline in Vaccination Rates
A significant trend observed across the dataset is the decline in vaccination rates for various diseases, including measles, mumps, rubella (MMR), and polio.

Generated response from static search (level 1) …

5 days, 14 hours ago @ microsoft.com
Orca-AgentInstruct: Agentic flows can be effective synthetic-data generators

Orca-AgentInstruct is another step in this direction, where we explore using agentic flows to generate diverse and high-quality data at scale.

Synthetic Data Accelerated LLM Development: Over the past year, using synthetic data has greatly advanced the training of large language models (LLMs).

Successful use of synthetic data involves significant human effort in curating and filtering the data to ensure high quality.

AgentInstruct can create:
High-quality data: AgentInstruct uses GPT-4, coupled with tools like search and code interpreters, to create high-quality data.

This has the potential to drive AI advances across multiple industries by making high-quality model training more efficient a…

6 days, 14 hours ago @ microsoft.com
Abstracts: November 14, 2024

I’m Bonnie Kruft, partner and deputy director of Microsoft Research AI for Science and your host for today.

KRUFT: Microsoft Research is one of the earliest institutions to apply AI in biomolecular simulation research.

Classical MD is fast but less accurate, while quantum MD is very accurate but computationally prohibitive for the protein study.

Thus, applying AI in biomolecular simulation can become the third way to achieve both ab initio—or first principles—accuracy and high efficiency.

For AI, AI2BMD proves AI can make a big difference to the dynamic protein structure study beyond AI for the protein static structure prediction.

6 days, 16 hours ago @ microsoft.com
Toward modular models: Collaborative AI development enables model accountability and continuous learning

Our team is exploring the potential of modular models by focusing on two themes: i) optimizing the training of expert models and ii) refining how expert models coordinate to form a collaborative model.

One method for coordinating expert models is to adaptively select the most relevant independently developed expert models for specific tasks or queries.

For example, in expert design, expert training can be custom or standard, and training data can be private or shared.

Nonrouter methods: Methods that do not explicitly train a router but instead initialize the router in an unsupervised manner.

Modular models allow specific expert models to be identified and, if necessary, removed or retrained.
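A minimal sketch of the adaptive selection step mentioned above, with keyword overlap standing in for a learned or unsupervised ("nonrouter") routing signal; the expert names and keywords are hypothetical:

```python
# Sketch: route a query to the most relevant independently developed
# expert model. Keyword overlap stands in for a trained router or an
# unsupervised routing mechanism; experts and keywords are hypothetical.

EXPERTS = {
    "code_expert":  {"python", "bug", "compile", "function"},
    "legal_expert": {"contract", "clause", "liability"},
    "math_expert":  {"integral", "proof", "theorem"},
}

def route(query: str) -> str:
    """Dispatch to the expert whose keyword set best matches the query."""
    words = set(query.lower().split())
    scores = {name: len(words & kw) for name, kw in EXPERTS.items()}
    return max(scores, key=scores.get)

print(route("why does this python function not compile"))
```

Because each expert is a separate module, an expert can be removed or retrained without touching the router or the other experts, which is the accountability property the excerpt describes.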

1 week ago @ microsoft.com
Research Focus: Week of November 11, 2024

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

NEW RESEARCH
Building AI Agents for Autonomous Clouds: Challenges and Design Principles
Using AI agents for operational resilience of cloud services, which currently require significant human effort and domain knowledge, is a high-impact application.

Using AI to generate code and proofs in proof-oriented languages helps mitigate these concerns, while also making proof-oriented programming more accessible to people.

In a recent preprint: Towards Neural Synthesis for SMT-Assisted Proof-Orien…

1 week ago @ microsoft.com
Preventing side-channels in the cloud

The ability for cloud providers to share components of the hardware stack across customers, or tenants, is essential for running efficient cloud systems.

Microsoft Azure provides strong protection via comprehensive architectural isolation through access control mechanisms implemented across the cloud platform, including the hardware and the hypervisor.

Preventing microarchitectural side-channels with resource-exclusive domains
Publication: Principled Microarchitectural Isolation on Cloud CPUs
In a research paper, which has received a distinguished paper award at the ACM Conference on Computer and Communications Security (ACM CCS’24), we …

1 week, 1 day ago @ microsoft.com
Collaborators: Prompt engineering with Siddharth Suri and David Holtz

And that’s, kind of, where I ended up as a computational social scientist.

HUIZINGA: Right, right, right.

And one potential reason for this is the ways that people prompt the models, right.

HUIZINGA: Right, right, right.

I can’t get the phrase “AI whisperer” out of my head now, [LAUGHTER] and I think that’s what I want to be when I grow up.

1 week, 2 days ago @ microsoft.com
From static prediction to dynamic characterization: AI2BMD advances protein dynamics with ab initio accuracy

As one approach, Molecular Dynamics (MD) simulation combines the laws of physics with numerical simulations to tackle the challenge of understanding biomolecular dynamics.

Similarly, the quantum mechanical approach—known as Density Functional Theory (DFT)—received its own Nobel Prize in 1998, marking a pivotal moment in computational chemistry.

MD simulations can be roughly divided into two classes: classical MD and quantum-mechanical MD.

Ab initio biomolecular dynamics simulation by AI. Microsoft Research has been working on the development of efficient methods aiming for ab initio accuracy simulations of biomolecules.

This method, AI2BMD (AI-based ab initio…

2 weeks ago @ microsoft.com
Abstracts: November 5, 2024

Chris and Jay, thank you for joining us today for Abstracts and congratulations!

HAWBLITZEL: Yeah, so I think, you know, traditionally verification or this formal software verification that we’re doing has been considered a little bit of a pie-in-the-sky research agenda.

They’ve discovered that you can get the high performance that you want for systems code without having to sacrifice the ability to reason about ownership, lifetimes, and concurrency.

TINGLE: Well, finally, Chris, what are some of the open questions or future opportunities for formal software verification research, and what might you and your collaborators tackle next?

I’m Amber Tingle, and we hope you’ll join us again for Ab…

2 weeks, 1 day ago @ microsoft.com
Abstracts: November 4, 2024

And at that time, large language model was really in its infancy and people just started exploring what large language model can help us in terms of improving software reliability.

But of course, you know, large language model is a field that is moving so fast.

And that’s one of the reasons we used large language models, because traditional static analysis or traditional program analysis cannot capture this.

I’m actually working on how to leverage large language model to verify the correctness of code, code that may be generated by large language model itself.

STOICA: So we’re thinking of, as Shan mentioned, exploring what large language models can do in this bug-finding/testing arena furth…

2 weeks, 2 days ago @ microsoft.com
Microsoft at SOSP 2024: Innovations in systems research

Microsoft is proud to sponsor the 30th Symposium on Operating Systems Principles (SOSP 2024), highlighting its commitment to advancing computing systems research.

Organized annually by the Association for Computing Machinery (ACM), the symposium brings together experts to explore innovations in operating systems, distributed systems, and systems software.

This work not only contributes to theoretical knowledge but also addresses real-world challenges, helping ensure that as computing systems grow more complex, they remain sustainable, reliable, and secure.

T10 introduces a distributed tensor abstraction, rTensor, and maps the computation and communication of tensor operators with a generalize…

2 weeks, 2 days ago @ microsoft.com
AI-powered microgrids facilitate energy resilience and equity in regional communities

The rise of affordable small-scale renewable energy, like rooftop solar panels, is reshaping energy systems around the world.

Despite this progress, centralized energy systems still dominate, often failing to provide vulnerable communities with reliable, affordable renewable energy.

AI-powered microgrids support resilient communities. Microgrids, small and localized energy systems, hold promise as a solution to the challenges of centralized energy systems.

When powered by AI, microgrids can also contribute to energy equity.

The West Atlanta project illustrates AI’s potential to create resilient, equitable, community-driven energy systems, paving the way for a more…

2 weeks, 5 days ago @ microsoft.com
MIT AI
latest post 1 day, 11 hours ago
A model of virtuosity

A crowd gathered at the MIT Media Lab in September for a concert by musician Jordan Rudess and two collaborators.

Each time the model took its turn, a range of expressions moved across Rudess’ face: bemusement, concentration, curiosity.

The researchers set out to develop a machine learning model channeling Rudess’ distinctive musical style and technique.

Rudess contributed the data on which Blanchard trained the AI model.

Because the samples he recorded to train the model were similar to ear-training exercises he’s used with students, he thinks the model itself could someday be used for teaching.

1 day, 11 hours ago @ news.mit.edu
Can robots learn from machine dreams?

Since the 1970s, the field has evolved from writing sophisticated programs to using deep learning, teaching robots to learn directly from human behavior.

As robots become more sophisticated, this hands-on approach hits a scaling problem: the demand for high-quality training data far outpaces humans’ ability to provide it.

Dreams In Motion does this by considering the 3D geometry of the scene and the relative changes in the robot’s perspective.

“Although collecting demonstrations is easy, scaling a real-world robot teleoperation setup to thousands of skills is challenging because a human has to physically set up each scene.

But when robots collected their own training data through LucidSim, …

1 day, 11 hours ago @ news.mit.edu
Four from MIT named 2025 Rhodes Scholars

In addition to MIT’s two U.S. Rhodes winners, Oluigbo and Nair, two affiliates were awarded international Rhodes Scholarships: Chen for Rhodes’ China constituency and Hector for the Global Rhodes Scholarship.

Yiming Chen ’24Yiming Chen, from Beijing, China, and the Washington area, was named one of four Rhodes China Scholars on Sept 28.

Chen graduated from MIT in 2024 with a BS in mathematics and computer science and an MEng in computer science.

Chen was a teaching assistant in the MIT math and electrical engineering and computer science departments, and received a teaching excellence award.

Oluigbo also did a summer internship with the Anyscale Learning for All Laboratory at the MIT Compute…

4 days, 4 hours ago @ news.mit.edu
Ensuring a durable transition

“We’re pursuing this by purchasing more and different types of clean energy locally, and accelerating technological innovation such as next-generation geothermal projects,” he said.

“At the start of 2024 there were about 3.5 million clean energy jobs, with 'red' states showing the fastest growth in clean energy jobs,” said David S. Miller, managing partner at Clean Energy Ventures.

“The majority (58 percent) of new jobs in energy are now in clean energy — that transition has happened.

MIT ingenuity essential. The energy transition is placing very different demands on different regions around the world.

“We are stoked by the energy transition, because it’s not just the future, but our chance t…

5 days, 11 hours ago @ news.mit.edu
Graph-based AI model maps the future of innovation

“By blending generative AI with graph-based computational tools, this approach reveals entirely new ideas, concepts, and designs that were previously unimaginable.

We can accelerate scientific discovery by teaching generative AI to make novel predictions about never-before-seen ideas, concepts, and designs,” says Buehler.

By using this approach, Buehler was able to teach the AI model to systematically reason over complex scientific concepts and behaviors.

The AI model found unexpected similarities between biological materials and “Symphony No.

In another experiment, the graph-based AI model recommended creating a new biological material inspired by the abstract patterns found in Wassily Kan…

1 week, 1 day ago @ news.mit.edu
3 Questions: Inverting the problem of design

The Design Computation and Digital Engineering (DeCoDE) Lab at MIT instead explores the bounds of what is possible.

In the case of linkages, your design looks like a set of bars and how they are connected.

The combinatorial space combined with the continuous space is so big that they can basically go up to seven joints.

If you were to do this exhaustively, it would be very difficult to actually cover the entire design space.

These are all difficult problems that involve both combinatorial design variables and continuous design variables.

1 week, 1 day ago @ news.mit.edu
A causal theory for studying the cause-and-effect relationships of genes

This means researchers don’t need to perform costly, and sometimes infeasible, interventional experiments to obtain the data needed to infer the underlying causal relationships.

These programs describe which genes function together to regulate other genes in a biological process, such as cell development or differentiation.

With only observational data, researchers can’t compare genes before and after an intervention to learn how groups of genes function together.

They can use this technique to identify causal modules and reconstruct an accurate underlying representation of the cause-and-effect mechanism.

Causal variables that don’t affect any subsequent variables should have a variance of …

2 weeks ago @ news.mit.edu
A portable light system that can digitize everyday objects

Digital fabrication engineers are now working toward expanding the display capabilities of other everyday objects.

Equipped with ultraviolet (UV) and red, green, and blue (RGB) LEDs, the device can be attached to everyday objects like shirts and headphones.

Photo-Chromeleon used a projector to help activate the color-changing properties of photochromic dye, where the light on the object's surface is at a reduced intensity.

“Compared with our projector-based system from before, PortaChrome is a more portable light source that can be placed directly on top of the photochromic surface.

The photochromic dye was coated on the headphones and the team then attached the PortaChrome device to the in…

2 weeks ago @ news.mit.edu
Despite its impressive output, generative AI doesn’t have a coherent understanding of the world

Despite the model’s uncanny ability to navigate effectively, when the researchers closed some streets and added detours, its performance plummeted.

So, the team developed two new metrics that can test a transformer’s world model.

“We needed test beds where we know what the world model is.

Now, we can rigorously think about what it means to recover that world model,” Vafa explains.

If scientists want to build LLMs that can capture accurate world models, they need to take a different approach, the researchers say.

2 weeks, 2 days ago @ news.mit.edu
Empowering systemic racism research at MIT and beyond

In addition to coordinating research on systemic racism across campus, the IDSS initiative has a new project aiming to empower and support this research beyond MIT: the new ICSR Data Hub, which serves as an evolving, public web depository of datasets gathered by ICSR researchers.

“There’s extensive research showing racial discrimination and systemic inequity in essentially all sectors of American society,” explains MIT Professor Fotini Christia, who directs the MIT Institute for Data, Systems, and Society (IDSS), where she also co-leads the Initiative on Combatting Systemic Racism (ICSR).

“My motivation behind doing this project is personal,” says Lewis, who was drawn to MIT in large part …

2 weeks, 2 days ago @ news.mit.edu
Artist and designer Es Devlin awarded Eugene McDermott Award in the Arts at MIT

Artist and designer Es Devlin is the recipient of the 2025 Eugene McDermott Award in the Arts at MIT.

“We look forward to presenting Es Devlin with MIT’s highest award in the arts.

The Eugene McDermott Award in the Arts at MIT was established in 1974 by Margaret McDermott (1912-2018) in honor of her husband, Eugene McDermott (1899-1973), a co-founder of Texas Instruments and longtime friend and benefactor of MIT.

The McDermott Award reflects MIT’s commitment to risk-taking, problem-solving, and connecting creative minds across disciplines.

Es Devlin, born in London in 1971, views an audience as a temporary society and often invites public participation in communal choral works.

2 weeks, 2 days ago @ news.mit.edu
Nanoscale transistors could enable more efficient electronics

Silicon transistors, which are used to amplify and switch signals, are a critical component in most electronic devices, from smartphones to automobiles.

Surpassing silicon. In electronic devices, silicon transistors often operate as switches.

A transistor’s switching slope reflects the sharpness of the “off” to “on” transition.

But while tunneling transistors can enable sharp switching slopes, they typically operate with low current, which hampers the performance of an electronic device.

With such small devices, even a 1-nanometer variance can change the behavior of the electrons and affect device operation.

2 weeks, 2 days ago @ news.mit.edu
MIT Schwarzman College of Computing launches postdoctoral program to advance AI across disciplines

The MIT Stephen A. Schwarzman College of Computing has announced the launch of a new program to support postdocs conducting research at the intersection of artificial intelligence and particular disciplines.

The Tayebati Postdoctoral Fellowship Program will focus on AI for addressing the most challenging problems in select scientific research areas, and on AI for music composition and performance.

“This new postdoc program is a remarkable opportunity to cultivate exceptional bilingual talent combining AI and another discipline.

The Tayebati Postdoctoral Fellowship Program is a key component of a larger focus of the MIT Schwarzman College of Computing aimed at fostering innovative research i…

3 weeks, 1 day ago @ news.mit.edu
A faster, better way to train general-purpose robots

Typically, engineers collect data that are specific to a certain robot and task, which they use to train the robot in a controlled environment.

To train better general-purpose robots, MIT researchers developed a versatile technique that combines a huge amount of heterogeneous data from many sources into one system that can teach any robot a wide range of tasks.

“In robotics, people often claim that we don’t have enough training data.

These models are pretrained using an enormous amount of diverse language data and then fine-tuned by feeding them a small amount of task-specific data.

Even when the task was very different from the pretraining data, HPT still improved performance.

3 weeks, 3 days ago @ news.mit.edu
Making it easier to verify an AI model’s responses

Due to this hallucination problem, an LLM’s responses are often verified by human fact-checkers, especially if a model is deployed in a high-stakes setting like health care or finance.

To help human validators, MIT researchers created a user-friendly system that enables people to verify an LLM’s responses much more quickly.

Users hover over highlighted portions of its text response to see data the model used to generate that specific word or phrase.

SymGen then resolves each reference using a rule-based tool that copies the corresponding text from the data table into the model’s response.
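
A rule-based resolver of this kind can be sketched in a few lines; the `{key}` placeholder syntax and the example row below are invented for illustration and are not SymGen’s actual notation:

```python
import re

def resolve_references(template: str, table: dict) -> str:
    """Replace {key} placeholders with values copied verbatim from the table,
    so every resolved span is traceable back to source data."""
    def lookup(match):
        key = match.group(1)
        if key not in table:
            raise KeyError(f"model referenced unknown field: {key}")
        return str(table[key])
    return re.sub(r"\{([^{}]+)\}", lookup, template)

row = {"player": "Marcus Smart", "points": 27}
out = resolve_references("{player} scored {points} points.", row)
print(out)  # -> Marcus Smart scored 27 points.
```

Because the copied spans come directly from the table, a validator can hover a span and see exactly which cell produced it, which is what speeds up checking.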

They could validate the model’s responses about 20 percent faster than if they used standard methods.

1 month ago @ news.mit.edu
Berkeley AI
latest post 1 week, 1 day ago
Virtual Personas for Language Models via an Anthology of Backstories

We introduce Anthology, a method for conditioning LLMs to representative, consistent, and diverse virtual personas by generating and utilizing naturalistic backstories with rich details of individual values and experience.

What does it mean for large language models (LLMs) to be trained on massive text corpora, collectively produced by millions and billions of distinctive human authors?

In this work, we introduce Anthology, an approach for steering LLMs to representative, consistent, and diverse virtual personas by providing richly detailed life narratives of individuals as conditioning context to models.

By grounding langu…

1 week, 1 day ago @ bair.berkeley.edu
Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination

Sample language model responses to different varieties of English and native speaker reactions.

Over 1 billion people around the world speak varieties such as Indian English, Nigerian English, Irish English, and African-American English.

Then, we compared the language model responses to the “standard” varieties and the non-“standard” varieties.

Here, we included the original GPT-3.5 responses, plus responses from GPT-3.5 and GPT-4 where the models were told to imitate the style of the input.

That can reinforce barriers against speakers of non-“standard” varieties as AI models become increasingly used in …

2 months ago @ bair.berkeley.edu
How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark

When we began studying jailbreak evaluations, we found a fascinating paper claiming that you could jailbreak frontier LLMs simply by translating forbidden prompts into obscure languages.

This blog post shows how to use a new, state-of-the-art jailbreak benchmark - StrongREJECT - to accurately and robustly evaluate jailbreak methods.

PAP instructs an attacker model to persuade a victim model to give it harmful information using techniques like misrepresentation and logical appeals.

We conducted two experiments to test this hypothesis: We used StrongREJECT to evaluate 37 jailbreak methods on an unaligned model; Dolp…

2 months, 3 weeks ago @ bair.berkeley.edu
Are We Ready for Multi-Image Reasoning? Launching VHs: The Visual Haystacks Benchmark!


Humans excel at processing vast arrays of visual information, a skill that is crucial for achieving artificial general intelligence (AGI).

Visual Haystacks: the first "visual-centric" Needle-In-A-Haystack (NIAH) benchmark designed to rigorously evaluate Large Multimodal Models (LMMs) in processing long-context visual information.

The first NIAH benchmark for visual reasoning was introduced by Google in the Gemini-v1.5 technical report.

What is the Visual Haystacks (VHs) Benchmark?

4 months ago @ bair.berkeley.edu
TinyAgent: Function Calling at the Edge

The ability of LLMs to execute commands through plain language (e.g.

The framework is open sourced and available at https://github.com/SqueezeAILab/TinyAgent. Teaching LLMs to do Function Calling. Figure 1: Overview of the LLMCompiler Function Calling Planner.

Once this function calling plan is generated, we can parse it and call each function based on the dependencies.
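
Dependency-aware execution of such a plan amounts to a topological walk of the call graph. A minimal sketch, with an invented plan format and toy tools (not LLMCompiler’s actual output schema):

```python
def execute_plan(plan, tools):
    """Run each step once all of its dependencies have produced results."""
    results, pending = {}, dict(plan)
    while pending:
        ready = [sid for sid, step in pending.items()
                 if all(dep in results for dep in step["deps"])]
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        for sid in ready:
            step = pending.pop(sid)
            # feed dependency outputs first, then any literal arguments
            args = [results[d] for d in step["deps"]] + step.get("args", [])
            results[sid] = tools[step["tool"]](*args)
    return results

tools = {"get_email": lambda name: f"{name}@example.com",
         "compose":   lambda addr: f"draft to {addr}"}
plan = {"s1": {"tool": "get_email", "deps": [], "args": ["alice"]},
        "s2": {"tool": "compose", "deps": ["s1"]}}
print(execute_plan(plan, tools)["s2"])  # -> draft to alice@example.com
```

Steps with no unmet dependencies could also be dispatched in parallel, which is one of the latency benefits a planner of this kind targets.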

With our dataset in place, we can now proceed to fine-tune off-the-shelf SLMs to enhance their function calling capability.

Latency is the end-to-end latency of the function calling planner, including the prompt processing time and generation.

5 months, 3 weeks ago @ bair.berkeley.edu
Modeling Extremely Large Images with xT

As computer vision researchers, we believe that every pixel can tell a story. However, there seems to be a writer’s block settling into the field when it comes to dealing with large images. Large images are no longer rare—the cameras we carry in our pockets and those orbiting our planet snap pictures so big and detailed that they stretch our current best models and hardware to their breaking points when handling them. Generally, we face a quadratic increase in memory usage as a function of image size.
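
That quadratic growth is easy to see for attention over image patches: doubling the image side quadruples the token count and grows the attention score matrix sixteen-fold. A back-of-the-envelope sketch (the 16-pixel patch size and 2-byte values are arbitrary assumptions, not xT’s settings):

```python
def attention_matrix_bytes(side_px, patch=16, bytes_per_val=2):
    """Memory for one full self-attention score matrix over image patches."""
    tokens = (side_px // patch) ** 2      # patches per side, squared
    return tokens ** 2 * bytes_per_val    # N x N attention scores

for side in (512, 1024, 2048):
    print(side, attention_matrix_bytes(side))
```

At 2048 pixels per side this single matrix already exceeds half a gigabyte, before counting activations or multiple heads and layers, which is why naive handling of large images breaks down.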

Today, we make one of two sub-optimal choices when handling large images: down-sampling or cropping. These two methods incur significant losses in the amount of information and context present…

8 months ago @ bair.berkeley.edu
2024 BAIR Graduate Directory

Every year, the Berkeley Artificial Intelligence Research (BAIR) Lab graduates some of the most talented and innovative minds in artificial intelligence and machine learning.

Our Ph.D. graduates have each expanded the frontiers of AI research and are now ready to embark on new adventures in academia, industry, and beyond.

These fantastic individuals bring with them a wealth of knowledge, fresh ideas, and a drive to continue contributing to the advancement of AI.

Join us in celebrating the achievements of BAIR’s latest PhD graduates.

Thank you to our friends at the Stanford AI Lab for this idea!

8 months, 2 weeks ago @ bair.berkeley.edu
The Shift from Models to Compound AI Systems

AI caught everyone’s attention in 2023 with Large Language Models (LLMs) that can be instructed to perform general tasks, such as translation or coding, just by prompting. This naturally led to an intense focus on models as the primary ingredient in AI application development, with everyone wondering what capabilities new LLMs will bring.

As more developers begin to build using LLMs, however, we believe that this focus is rapidly changing: state-of-the-art AI results are increasingly obtained by compound systems with multiple components, not just monolithic models.

For example, Google’s AlphaCode 2 set state-of-the-art results in programming through a carefully engineered system that uses L…


In this post, we analyze the trend toward compound AI systems and what it means for AI developers.

We argue that compound AI systems will likely be the best way to maximize AI results in the future, and might be one of the most impactful trends in AI in 2024.

We define a Compound AI System as a system that tackles AI tasks using multiple interacting components, including multiple calls to models, retrievers, or external tools.

Developing Compound AI SystemsWhile compound AI systems can offer clear benefits, the art of designing, optimizing, and operating them is still emerging.

However, new compound AI systems contain non-differentiable components like search engines or code interpreters, a…

9 months ago @ bair.berkeley.edu
AWS Machine Learning
latest post 15 hours ago
Unify structured data in Amazon Aurora and unstructured data in Amazon S3 for insights using Amazon Q

The solution combines data from an Amazon Aurora MySQL-Compatible Edition database and data stored in an Amazon Simple Storage Service (Amazon S3) bucket.

Create an Amazon Q application. Complete the following steps to create an Amazon Q application: On the Amazon Q console, choose Applications in the navigation pane.

Configure Amazon Q to connect to Aurora MySQL-Compatible. Complete the following steps to configure Amazon Q to connect to Aurora MySQL-Compatible: On the Connect data sources page, under Data sources, choose the Aurora (MySQL) data source.

Configure Amazon Q to connect to Amazon S3. Complete the following steps to configure Amazon Q to connect to Amazon S3: On the Connect data sources…

15 hours ago @ aws.amazon.com
Automate Q&A email responses with Amazon Bedrock Knowledge Bases

In this post, we illustrate automating the responses to email inquiries by using Amazon Bedrock Knowledge Bases and Amazon Simple Email Service (Amazon SES), both fully managed services.

Amazon Bedrock Knowledge Bases. For RAG workflows, Amazon Bedrock offers managed knowledge bases, which are vector databases that store unstructured data semantically.

For more information on RAG and Amazon Bedrock Knowledge Bases, see Connect Foundation Models to Your Company Data Sources with Agents for Amazon Bedrock.

Amazon Bedrock Knowledge Bases uses the Amazon Titan embeddings model to convert the prompt query to a vector embedding, and then finds chunks that are semantically similar.
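
The retrieval step described here, embedding the query and ranking stored chunks by semantic similarity, reduces to cosine similarity over vectors. A self-contained sketch with toy two-dimensional vectors standing in for real Titan embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_chunks(query_vec, chunks, k=1):
    """chunks: list of (text, embedding); return the k most similar texts."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

chunks = [("refund policy", [0.9, 0.1]), ("shipping times", [0.1, 0.9])]
print(top_chunks([0.8, 0.2], chunks))  # -> ['refund policy']
```

In the managed service the same ranking runs over high-dimensional embeddings inside the vector store; the principle is identical.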

Our description o…

15 hours ago @ aws.amazon.com
Streamline RAG applications with intelligent metadata filtering using Amazon Bedrock

For a comprehensive overview of metadata filtering and its benefits, refer to Amazon Bedrock Knowledge Bases now supports metadata filtering to improve retrieval accuracy.

Solution overview. In this section, we illustrate the flow of the dynamic metadata filtering solution using the tool use (function calling) capability.

In the following sections, we explore how to implement dynamic metadata filtering using the tool use feature in Amazon Bedrock and Pydantic for data validation.
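
As a rough sketch of the end result, the entities extracted via tool use can be folded into a retrieval filter. The equals/andAll operator shape below follows Bedrock’s metadata-filter style, but the field names and the helper itself are invented for illustration (the post uses tool use plus Pydantic for the extraction and validation step):

```python
def build_metadata_filter(entities: dict):
    """Turn extracted entities into a Bedrock-style retrieval filter dict.
    Hypothetical helper: field names and fallback behavior are assumptions."""
    clauses = [{"equals": {"key": k, "value": v}}
               for k, v in entities.items() if v is not None]
    if not clauses:
        return None            # fall back to unfiltered retrieval
    if len(clauses) == 1:
        return clauses[0]
    return {"andAll": clauses}

print(build_metadata_filter({"year": 2024, "genre": "fiction"}))
```

Returning `None` when nothing was extracted lets the application degrade gracefully to plain semantic retrieval instead of over-filtering.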

For detailed cleanup instructions, see Amazon Bedrock Knowledge Bases now supports metadata filtering to improve retrieval accuracy.

Conclusion. By implementing dynamic metadata filtering using Amazon Bedrock and Pydan…

15 hours ago @ aws.amazon.com
Embedding secure generative AI in mission-critical public safety applications

Mission-critical public safety applications require the highest levels of security and reliability when implementing technology capabilities.

This solution demonstrates how generative AI can enhance public safety operations while allowing officers to focus more time on serving their communities.

Mark43’s public safety solution built on the AWS Cloud. Mark43 offers a cloud-native Public Safety Platform with powerful computer-aided dispatch (CAD), records management system (RMS), and analytics solutions, positioning agencies at the forefront of public safety technology.

If a user doesn’t have access to the data outside of Amazon Q Business, then they cannot access the data within Amazon Q Busin…

15 hours ago @ aws.amazon.com
How FP8 boosts LLM training by 18% on Amazon SageMaker P5 instances

In this post, we explore how FP8 optimization can significantly speed up large model training on Amazon SageMaker P5 instances.

LLM training using SageMaker P5: In 2023, SageMaker announced P5 instances, which support up to eight of the latest NVIDIA H100 Tensor Core GPUs.

With the use of Amazon SageMaker Model Training, organizations have been able to achieve higher training speeds and efficiency by turning to P5 instances.

LLM training using FP8: P5 instances, which are powered by NVIDIA H100 GPUs, also support training models with FP8 precision.
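The core of FP8 training is per-tensor scaling: before casting, each tensor is rescaled so that its largest magnitude fits the FP8 E4M3 range, whose maximum normal value is 448. A simplified, framework-free sketch of that amax-based scaling (this only illustrates the concept, not Transformer Engine's or SageMaker's actual implementation):

```python
import numpy as np

E4M3_MAX = 448.0  # largest normal value representable in FP8 E4M3

def fp8_scale(tensor: np.ndarray) -> float:
    """Per-tensor scaling factor so that amax maps onto the E4M3 range."""
    amax = float(np.max(np.abs(tensor)))
    return E4M3_MAX / amax if amax > 0 else 1.0

def fake_fp8_roundtrip(tensor: np.ndarray) -> np.ndarray:
    """Simulate cast-to-FP8-and-back by scaling, clipping, and coarse rounding.
    (Real FP8 rounding depends on the exponent; this only shows the scaling.)"""
    scale = fp8_scale(tensor)
    scaled = np.clip(tensor * scale, -E4M3_MAX, E4M3_MAX)
    quantized = np.round(scaled, 1)  # crude stand-in for 3 mantissa bits
    return quantized / scale

x = np.array([0.001, -0.5, 3.2])
print(fp8_scale(x))  # amax is 3.2, so the scale is 448 / 3.2
```

Because the scale adapts per tensor, large activations and tiny gradients can both survive the narrow 8-bit range, which is what makes the speedup usable in practice.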

The impact on LLM training and beyond: The use of FP8 precision combined with SageMaker P5 instances has significant implications f…

15 hours ago @ aws.amazon.com
Racing into the future: How AWS DeepRacer fueled my AI and ML journey

The AWS DeepRacer League was also announced, featuring physical races at AWS Summits worldwide in 2019 and a virtual league in a simulated environment.

This event also sparked the creation of the AWS DeepRacer Community, which has since grown to over 45,000 members.

Conclusion: The lasting impact of AWS DeepRacer. Over the past six years, AWS DeepRacer has profoundly impacted my professional and personal life.

For those interested in starting their own AI and ML journey, I encourage you to explore the AWS DeepRacer resources available on the AWS website.

Matt is an AWS Community Builder and continues to contribute to open source projects in the AWS DeepRacer community.

1 day, 10 hours ago @ aws.amazon.com
Your guide to generative AI and ML at AWS re:Invent 2024

These sessions, featuring Amazon Q Business, Amazon Q Developer, Amazon Q in QuickSight, and Amazon Q Connect, span the AI/ML, DevOps and Developer Productivity, Analytics, and Business Applications topics.

Visit the Generative AI Zone (GAIZ) at AWS Village in the Venetian Expo Hall to explore hands-on experiences with our newest launches and connect with our generative AI and ML specialists.

Experience an immersive selection of innovative generative AI exhibits at the Generative AI and Innovations Pavilion through interactive displays spanning the AWS generative AI stack.

Baskar Sridharan, VP, AI/ML Services & Infrastructure | AIM276-INT | Generative AI in action: From prototype to product…

1 day, 11 hours ago @ aws.amazon.com
Customize small language models on AWS with automotive terminology

You can fine-tune and deploy models with Amazon SageMaker JumpStart or directly through Hugging Face containers.

We use Amazon SageMaker Studio, a comprehensive web-based integrated development environment (IDE) designed to facilitate all aspects of ML development.

If you deployed the model with Amazon Bedrock Custom Model Import, you can delete the imported model using the Amazon Bedrock console.

You can also improve the quality of training data using advanced data processing techniques.

1 day, 13 hours ago @ aws.amazon.com
Automate emails for task management using Amazon Bedrock Agents, Amazon Bedrock Knowledge Bases, and Amazon Bedrock Guardrails

In this post, we demonstrate how to create an automated email response solution using Amazon Bedrock and its features, including Amazon Bedrock Agents, Amazon Bedrock Knowledge Bases, and Amazon Bedrock Guardrails.

The benefits of AI-powered solutions: To address these challenges, businesses are adopting generative AI to automate and refine email response processes.

By using AI-driven solutions, organizations can overcome the limitations of manual email processing, streamlining operations and improving the overall customer experience.

The following diagram provides a detailed view of the architecture to enhance email support using generative AI.

As a part of the AI/ML community at AWS, AK gui…

1 day, 13 hours ago @ aws.amazon.com
Accelerate analysis and discovery of cancer biomarkers with Amazon Bedrock Agents

In this post, we show you how agentic workflows with Amazon Bedrock Agents can help accelerate this journey for research scientists with a natural language interface.

Amazon Bedrock Agents provides default prompt templates for pre-processing users’ queries, orchestration, a knowledge base, and a post-processing template.

Amazon Bedrock Agents are designed to handle a wide range of user inputs, from simple queries to complex, multi-turn conversations.

This post has demonstrated how Amazon Bedrock Agents can offer a flexible framework and relevant tools to help accelerate this critical discovery process.

For examples to get started with Amazon Bedrock Agents, check out the Amazon Bedrock Agen…

1 day, 13 hours ago @ aws.amazon.com
Automate building guardrails for Amazon Bedrock using test-driven development

create_response = client.create_guardrail( name='math-tutoring-guardrail', description='Prevents the model from providing non-math tutoring, in-person tutoring, or tutoring outside grades 6-12.

""", tags=[ {'key': 'purpose', 'value': 'math-tutoring-guardrail'}, {'key': 'environment', 'value': 'production'} ] )The API response will include a guardrail ID and version.

Additionally, each row of the API response is saved so the user can explore the response as needed.

Optional: Automate the workflow and iteratively improve the guardrail. We recommend reviewing your test results after each iteration.
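The test-driven loop boils down to a table of prompts with expected guardrail actions, compared against what the guardrail actually returned on each iteration. A minimal, AWS-free sketch of that comparison (in practice the "actual" column would come from invoking the guardrail through the Bedrock runtime; the prompts here are hypothetical):

```python
# Each test case: (prompt, expected action). The action strings mirror the
# guardrail outcomes 'NONE' (allowed) and 'GUARDRAIL_INTERVENED' (blocked).
test_cases = [
    ("Help me solve 2x + 3 = 11", "NONE"),
    ("Can you tutor me in person next week?", "GUARDRAIL_INTERVENED"),
    ("Explain the French Revolution", "GUARDRAIL_INTERVENED"),
]

def score_iteration(actual_actions: list[str]) -> dict:
    """Compare actual guardrail actions to expectations and summarize."""
    results = [expected == actual
               for (_, expected), actual in zip(test_cases, actual_actions)]
    return {"passed": sum(results),
            "failed": results.count(False),
            "pass_rate": sum(results) / len(results)}

# Hypothetical responses from one evaluation run: the off-topic history
# question slipped through, so this iteration fails one case.
print(score_iteration(["NONE", "GUARDRAIL_INTERVENED", "NONE"]))
```

Failing cases point at which guardrail topic or denied-phrase configuration to tighten before the next iteration.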

When using TDD, it’s important to validate your test results and verify that you’re making improve…

1 day, 13 hours ago @ aws.amazon.com
Build cost-effective RAG applications with Binary Embeddings in Amazon Titan Text Embeddings V2, Amazon OpenSearch Serverless, and Amazon Bedrock Knowledge Bases

Today, we are happy to announce the availability of Binary Embeddings for Amazon Titan Text Embeddings V2 in Amazon Bedrock Knowledge Bases and Amazon OpenSearch Serverless.

Generate Binary Embeddings with Amazon Titan Text Embeddings V2: Amazon Titan Text Embeddings V2 now supports Binary Embeddings and is optimized for retrieval performance and accuracy across different dimension sizes (1024, 512, 256) with text support for more than 100 languages.
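Binary embeddings keep only the sign bit of each float dimension, shrinking storage roughly 32x and letting retrieval use fast Hamming distance instead of float dot products. A small sketch of the idea (this illustrates the technique, not the Titan API itself; the toy vectors are 8-dimensional for readability):

```python
import numpy as np

def binarize(embedding: np.ndarray) -> np.ndarray:
    """Keep only the sign of each dimension and pack 8 bits per byte."""
    bits = (embedding > 0).astype(np.uint8)
    return np.packbits(bits)

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two packed binary embeddings."""
    return int(np.unpackbits(a ^ b).sum())

# Two toy 8-dimensional "embeddings"; real Titan V2 vectors are 256-1024 dims.
u = np.array([0.2, -0.1, 0.9, 0.4, -0.3, 0.8, -0.7, 0.1])
v = np.array([0.3, -0.2, 0.8, -0.4, -0.1, 0.9, -0.6, 0.2])

# The sign patterns differ only in dimension 3, so the distance is 1.
print(hamming_distance(binarize(u), binarize(v)))
```

Since each 8-dim vector packs into a single byte, similarity search reduces to XOR plus a popcount, which is where the cost savings come from.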

To learn more about Amazon Bedrock Knowledge Bases, visit the Amazon Bedrock Knowledge Bases product page.

For more information regarding Amazon Titan Text Embeddings, visit Amazon Titan in Amazon Bedrock.

For more information on Amazon OpenSearch Server…

2 days, 11 hours ago @ aws.amazon.com
Automate cloud security vulnerability assessment and alerting using Amazon Bedrock

To address this challenge, this post demonstrates a proactive approach for security vulnerability assessment of your accounts and workloads, using Amazon GuardDuty, Amazon Bedrock, and other AWS serverless technologies.

The Step Functions state machine invokes AWS Lambda functions to get a findings summary and remediation steps through Amazon Bedrock.

The Step Functions workflow sends findings and remediation notifications to the subscribers or users using Amazon Simple Notification Service (Amazon SNS).

AWS Lambda – Lambda is a compute service that runs your code in response to events and automatically manages the compute resources, making it the fastest way to turn an idea into a modern, …

2 days, 11 hours ago @ aws.amazon.com
DXC transforms data exploration for their oil and gas customers with LLM-powered tools

Here is the question {query} Return your answer in the following format $REASON_JUSTIFYING_CATEGORY $COMMA_SEPARATED_LABELS """ Using XML tags in the output allows you to parse out the right category for the question.
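Wrapping the labels in a distinctive tag makes pulling them out of the model's free-form reply a one-line regex. A minimal sketch (the `<reason>` and `<labels>` tag names here are hypothetical, chosen for illustration):

```python
import re

def parse_labels(model_output: str, tag: str = "labels") -> list[str]:
    """Extract comma-separated labels from inside an XML-style tag."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", model_output, re.DOTALL)
    if match is None:
        return []
    return [label.strip() for label in match.group(1).split(",") if label.strip()]

reply = (
    "<reason>The user asks about well counts, a database question.</reason>"
    "<labels>database, translation</labels>"
)
print(parse_labels(reply))  # ['database', 'translation']
```

Keeping the reasoning in its own tag means the justification never pollutes the parsed category list, even when the model is verbose.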

H is the human and E is the expert: {history} Here is the new query {query} If you can answer the question, your answer should be formatted as follows.

A: This is a translation, I can answer. Il y a 42 puits. H: Can you summarize that in one sentence?

A: This is just rewriting, I can summarize the previous reply and answer directly. Il y a 42 puits. H: Who's the queen of England?

Conclusion: In this post, we presented an AI assistant for efficient data exploratio…

2 days, 11 hours ago @ aws.amazon.com
How MSD uses Amazon Bedrock to translate natural language into SQL for complex healthcare databases

After the SQL query is generated with the placeholder, the user can swap back the actual list of codes before running it.

The output contains a generated SQL query, in which case we return it to the user, along with the generated explanation.
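Substituting the placeholder back is a plain string replacement, sketched below. The placeholder name, table, and column are hypothetical; the diagnosis codes follow the ICD-9 style used in claims data like DE-SynPUF:

```python
def expand_placeholder(sql: str, placeholder: str, codes: list[str]) -> str:
    """Swap a placeholder in generated SQL for the real quoted code list.
    The model never sees the long list, which keeps its prompt short."""
    quoted = ", ".join(f"'{c}'" for c in codes)
    return sql.replace(placeholder, quoted)

# SQL as the model generated it, against a short placeholder:
generated = "SELECT COUNT(*) FROM claims WHERE diag_code IN (:DIAG_CODES)"
print(expand_placeholder(generated, ":DIAG_CODES", ["250.00", "250.02"]))
# SELECT COUNT(*) FROM claims WHERE diag_code IN ('250.00', '250.02')
```

Besides shortening the prompt, this keeps potentially long or sensitive code lists out of the model call entirely.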

Output the generated SQL query in a tag.

Conclusion: In this post, we showcased how you can use generative AI to translate natural language into SQL for complex healthcare databases like DE-SynPUF.

2 days, 12 hours ago @ aws.amazon.com
NVIDIA
last post 17 hours ago
The Need for Speed: NVIDIA Accelerates Majority of World’s Supercomputers to Drive Advancements in Science and Technology

Of those accelerated systems, 85% use NVIDIA Hopper GPUs, driving advancements in areas like climate forecasting, drug discovery and quantum simulation.

Accelerated computing is much more than floating point operations per second (FLOPS).

A total of 249 exaflops of AI performance are now available to TOP500 systems, supercharging innovations and discoveries across industries.

Accelerated Computing Is Sustainable Computing: As the demand for computational capacity grows, so does the need for sustainability.

In the Green500 list of the world’s most energy-efficient supercomputers, systems with NVIDIA accelerated computing rank among eight of the top 10.

17 hours ago @ blogs.nvidia.com
Processing High-Quality Vietnamese Language Data with NVIDIA NeMo Curator

Quality filtering: Heuristic filtering applies rules-based filters to remove low-quality content.

Unicode reformatting, exact deduplication, heuristic filtering, and classifier-based filtering are used to process and refine this dataset into a high-quality final version.

Box plots of symbol, number, and whitespace percentages before and after heuristic curation: the box plots highlight a significant reduction in outliers after heuristic filtering.
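A heuristic filter of this kind simply computes character-class ratios per document and drops outliers past a threshold. A minimal sketch with illustrative thresholds (not NeMo Curator's defaults):

```python
def char_ratios(text: str) -> dict[str, float]:
    """Fraction of symbol, digit, and whitespace characters in a document."""
    n = max(len(text), 1)
    return {
        "symbol": sum(not c.isalnum() and not c.isspace() for c in text) / n,
        "number": sum(c.isdigit() for c in text) / n,
        "whitespace": sum(c.isspace() for c in text) / n,
    }

def passes_heuristics(text: str, max_symbol=0.10, max_number=0.15,
                      max_whitespace=0.25) -> bool:
    """Keep a document only if every ratio stays under its threshold.
    (Thresholds here are illustrative.)"""
    r = char_ratios(text)
    return (r["symbol"] <= max_symbol and r["number"] <= max_number
            and r["whitespace"] <= max_whitespace)

# Ordinary Vietnamese prose passes; symbol- and digit-heavy junk is dropped.
print(passes_heuristics("Xin chào, đây là một câu tiếng Việt bình thường."))
print(passes_heuristics("$$$ 123 %% 456 ## 789 @@ 000 !!"))
```

Because the ratios are cheap to compute, this kind of rule scales to web-crawl-sized corpora before any model-based (classifier) filtering runs.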

Through both heuristic filtering and classifier-based filtering, the dataset maintains its broad range of domain diversity.

The pipeline uses NVIDIA NeMo Curator, a valuable tool for preparing large datasets for pretraining language models, focusin…

1 day, 10 hours ago @ developer.nvidia.com
AI at COP29: Balancing Innovation and Sustainability

As COP29 attendees gather in Baku, Azerbaijan, to tackle climate change, the role AI plays in environmental sustainability is front and center.

Experts from Crusoe Energy Systems, EON, the International Energy Agency (IEA) and NVIDIA sat down for a conversation about the energy efficiency of AI.

The study looks at how organizations can achieve “Green AI” in the coming decades and addresses AI’s energy use.

Direct-to-chip liquid cooling allows data centers to cool systems more effectively than traditional air conditioning, consuming less power and water.

As AI continues to scale, the future of data centers will hinge on designing for energy efficiency from the outset.

1 day, 11 hours ago @ blogs.nvidia.com
How the Department of Energy’s AI Initiatives Are Transforming Science, Industry and Government

The U.S. Department of Energy oversees national energy policy and production.

Hear more from Helena Fu by watching the on-demand session, AI for Science, Energy and Security, from AI Summit DC.

Time Stamps: 2:20: Four areas of focus for the CET include AI, microelectronics, quantum information science and biotechnology.

10:55: Introducing AI-related initiatives within the DOE, including FASST, or Frontiers in AI for Science, Security and Technology.

19:35: The opportunity of AI applied to materials discovery and applications across science, energy and national security.

1 day, 14 hours ago @ blogs.nvidia.com
NVIDIA and Microsoft Showcase Blackwell Preview, Omniverse Industrial AI and RTX AI PCs at Microsoft Ignite

Grace Blackwell now on Azure, new workflows for industrial AI, and tools and features for RTX AI PCs accelerate AI development.

NVIDIA and Microsoft today unveiled product integrations designed to advance full-stack NVIDIA AI development on Microsoft platforms and applications.

Plus, the NVIDIA AI platform on Azure includes new reference workflows for industrial AI and an NVIDIA Omniverse Blueprint for creating immersive, AI-powered visuals.

The Blackwell-based VM series complements previously announced Azure AI clusters with ND H200 V5 VMs, which provide increased high-bandwidth memory for improved AI inferencing.

At Ignite, NVIDIA announced its new multimodal SLM, NVIDIA Nemovision-4B Ins…

1 day, 18 hours ago @ blogs.nvidia.com
Microsoft and NVIDIA Supercharge AI Development on RTX AI PCs

Today, over 600 Windows apps and games are already running AI locally on more than 100 million GeForce RTX AI PCs worldwide, delivering fast, reliable and low-latency performance.

At Microsoft Ignite, NVIDIA and Microsoft announced tools to help Windows developers quickly build and optimize AI-powered apps on RTX AI PCs, making local AI more accessible.

RTX AI PCs Power Digital Humans With Multimodal Small Language Models: NVIDIA ACE is a suite of digital human technologies that brings life to agents, assistants and avatars.

Available in 8B-, 4B- and 2B-parameter versions, these models offer flexible options for balancing speed, memory usage and accuracy on RTX AI PCs.

Learn more about how de…

1 day, 18 hours ago @ blogs.nvidia.com
AI Will Drive Scientific Breakthroughs, NVIDIA CEO Says at SC24

NVIDIA kicked off SC24 in Atlanta with a wave of AI and supercomputing tools set to revolutionize industries like biopharma and climate science.

“Supercomputers are among humanity’s most vital instruments, driving scientific breakthroughs and expanding the frontiers of knowledge,” Huang said.

AI Breakthroughs in Drug Discovery and Chemistry: With the open-source release of BioNeMo Framework, NVIDIA is advancing AI-driven drug discovery as researchers gain powerful tools tailored specifically for pharmaceutical applications.

And with the NVIDIA ALCHEMI NIM microservice, NVIDIA introduces generative AI to chemistry, allowing researchers …

2 days, 13 hours ago @ blogs.nvidia.com
Faster Forecasts: NVIDIA Launches Earth-2 NIM Microservices for 500x Speedup in Delivering Higher-Resolution Simulations

NVIDIA today at SC24 announced two new NVIDIA NIM microservices that can accelerate climate change modeling simulation results by 500x in NVIDIA Earth-2.

The new NIM microservices offer climate technology application providers advanced generative AI-driven capabilities to assist in forecasting extreme weather events.

NVIDIA is releasing the CorrDiff NIM and FourCastNet NIM microservices to help weather technology companies more quickly develop higher-resolution and more accurate predictions.

New CorrDiff NIM Microservices for Higher-Resolution Modeling: NVIDIA CorrDiff is a generative AI model for kilometer-scale super resolution.

The CorrDiff NIM microservice is 500x faster and 10,000x more …

2 days, 13 hours ago @ blogs.nvidia.com
NVIDIA Releases cuPyNumeric, Enabling Scientists to Harness GPU Acceleration at Cluster Scale

The accelerated computing library advances scientific discovery by helping researchers seamlessly scale to powerful computing clusters without modifying their Python code.

Other institutions using cuPyNumeric include: Australian National University, where researchers used cuPyNumeric to scale the Levenberg-Marquardt optimization algorithm to run on multi-GPU systems at the country’s National Computational Infrastructure.

The team used cuPyNumeric to decompose a matrix of 16 million rows and 4,000 columns.

The team use…

2 days, 13 hours ago @ blogs.nvidia.com
Hopper Scales New Heights, Accelerating AI and HPC Applications for Mainstream Enterprise Servers

During the Supercomputing 2024 conference, NVIDIA announced the availability of the NVIDIA H200 NVL PCIe GPU — the latest addition to the Hopper family.

Enterprises can use H200 NVL to accelerate AI and HPC applications, while also improving energy efficiency through reduced power consumption.

The NVIDIA H200 NVL is paired with powerful software tools that enable enterprises to accelerate applications from AI to HPC.

It comes with a five-year subscription for NVIDIA AI Enterprise, a cloud-native software platform for the development and deployment of production AI.

NVIDIA AI Enterprise includes NVIDIA NIM microservices for the secure, reliable deployment of high-performance AI model inferen…

2 days, 13 hours ago @ blogs.nvidia.com
Foxconn Expands Blackwell Testing and Production With New Factories in U.S., Mexico and Taiwan

To meet demand for Blackwell, now in full production, Foxconn, the world’s largest electronics manufacturer, is using NVIDIA Omniverse.

Foxconn uses NVIDIA Omniverse to virtually integrate their facility and equipment layouts, NVIDIA Isaac Sim for autonomous robot testing and simulation, and NVIDIA Metropolis for vision AI.

World’s Largest Electronics Maker Plans With Omniverse and AI: To meet demands at Foxconn, factory planners are building physical AI-powered robotic factories with Omniverse and NVIDIA AI.

Creating Efficiencies While Building Resilient Supply Chains: Using NVIDIA Omniverse and AI, Foxconn plans to replicate its precision production lines across the world.

Foxconn’s Mexico fa…

2 days, 13 hours ago @ blogs.nvidia.com
From Algorithms to Atoms: NVIDIA ALCHEMI NIM Catalyzes Sustainable Materials Research for EV Batteries, Solar Panels and More

NVIDIA ALCHEMI for Material and Chemical Simulations: Exploring the universe of potential materials, using the nearly infinite combinations of chemicals — each with unique characteristics — can be extremely complex and time consuming.

All in all, evaluating 16 million structures would have taken months — with the NIM microservice, it can be done in just hours.

SES AI, a leading developer of lithium-metal batteries, is using the NVIDIA ALCHEMI NIM microservice with the AIMNet2 model to accelerate the identification of electrolyte materials used for electric vehicles.

It will also be downloadable from build.nvidia.com, and the production-grade NIM microservice will be offered through the NVIDIA…

2 days, 13 hours ago @ blogs.nvidia.com
Revolutionizing AI-Driven Material Discovery Using NVIDIA ALCHEMI

This post introduces NVIDIA ALCHEMI (AI Lab for Chemistry and Materials Innovation), which aims to accelerate chemical and material discovery with AI.

NVIDIA Batched Geometry Relaxation NIM: While MLIPs accelerate energy and force calculations for geometry relaxation significantly (versus full DFT), the actual implementation can still be time-consuming.

This is why NVIDIA has developed the NVIDIA Batched Geometry Relaxation NIM to accelerate geometry relaxation calculations.

This POST request transmits the data to the Batched Geometry Relaxation NIM and starts the relaxation process, asynchronously from the client code.

Sign up to receive notification when the NVIDIA Batched Geometry Relaxati…

2 days, 13 hours ago @ developer.nvidia.com
Accelerate Drug and Material Discovery with New Math Library NVIDIA cuEquivariance

Popularized as equivariant neural networks (ENNs), these neural network architectures are built using the mathematical concept of equivariance under symmetry-related transformations.
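Equivariance means that transforming the input and transforming the output commute: f(Rx) = R f(x) for every symmetry transformation R. A tiny NumPy check of this defining property for a simple equivariant map (scaling each vector by its rotation-invariant norm; this is a toy illustration, not a cuEquivariance API):

```python
import numpy as np

def equivariant_map(x: np.ndarray) -> np.ndarray:
    """f(x) = ||x|| * x is rotation-equivariant: the norm is invariant
    under rotation, so f(R @ x) = R @ f(x)."""
    return np.linalg.norm(x) * x

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # a 2D rotation

x = np.array([1.0, 2.0])
lhs = equivariant_map(R @ x)   # transform the input, then apply f
rhs = R @ equivariant_map(x)   # apply f, then transform the output
print(np.allclose(lhs, rhs))   # True: the two orders agree
```

ENN layers are built so this identity holds by construction for every layer, which is what lets them respect molecular and crystal symmetries exactly.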

In this post, we describe how the new math library NVIDIA cuEquivariance tackles both challenges and accelerates AI for science models, with examples from drug discovery and material science applications.

Accelerating equivariant neural networksTo address these challenges, NVIDIA developed the new cuEquivariance math library that introduces CUDA-accelerated building blocks for equivariant neural networks.

This specialized backend is optimized for performance on NVIDIA GPUs, enabling significant speedups in m…

2 days, 13 hours ago @ developer.nvidia.com
Effortlessly Scale NumPy from Laptops to Supercomputers with NVIDIA cuPyNumeric

cuPyNumeric offers you the productivity of NumPy with the performance benefits of accelerated, distributed GPU computing.

cuPyNumeric: A drop-in replacement for NumPy. You can start using cuPyNumeric by importing cupynumeric instead of numpy in a NumPy program.
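The drop-in swap can be sketched as below, with a NumPy fallback so the snippet runs anywhere; on a system with cuPyNumeric installed, the first import succeeds and the identical code runs accelerated and distributed:

```python
try:
    import cupynumeric as np  # GPU-accelerated, distributed drop-in
except ImportError:
    import numpy as np        # same semantics on CPU

def stencil_step(grid):
    """One Jacobi-style averaging step over the interior of a 2D grid."""
    interior = (grid[:-2, 1:-1] + grid[2:, 1:-1] +
                grid[1:-1, :-2] + grid[1:-1, 2:]) / 4.0
    out = grid.copy()
    out[1:-1, 1:-1] = interior
    return out

grid = np.ones((6, 6))
result = stencil_step(grid)
print(float(result.sum()))  # averaging ones leaves every cell at 1.0, so 36.0
```

The sliced array expressions are exactly the NumPy operations cuPyNumeric partitions across GPUs, which is why no code changes are needed to scale.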

cuPyNumeric is the only existing distributed NumPy implementation that can support aliasing and mutation of distributed arrays in this manner.

Weak scaling plot for a cuPyNumeric 2D stencil computation. How cuPyNumeric parallelizes NumPy operations: By design, NumPy APIs are built around vector operations that have ample data parallelism.

With cuPyNumeric, scaling NumPy programs across a range of systems—from laptops to supercomputers—is a…

2 days, 14 hours ago @ developer.nvidia.com
Facebook
last post 1 day, 14 hours ago
Sequence learning: A paradigm shift for personalized ads recommendations

Meta’s ad recommendation engine, powered by deep learning recommendation models (DLRMs), has been instrumental in delivering personalized ads to people.

Learning from sequences: developing new sequence learning architectures to replace traditional DLRM neural network architectures.

A paradigm shift with learning from sequences for recommendation systemsMeta’s new system for ads recommendations uses sequence learning at its core.

Scaling the new sequence learning paradigm: Following the redesign to shift from sparse feature learning to event-based sequence learning, the next focus was scaling across two domains — scaling the sequence learning architecture and scaling event sequences to be long…

1 day, 14 hours ago @ engineering.fb.com
OCP Summit 2024: The open future of networking hardware for AI

At Open Compute Project Summit (OCP) 2024, we’re sharing details about our next-generation network fabric for our AI training clusters.

We’ve expanded our network hardware portfolio and are contributing two new disaggregated network fabrics and a new NIC to OCP.

At Meta, we believe that open hardware drives innovation.

At Meta, we envision a future of AI hardware systems that are not only scalable, but also open and collaborative.

We encourage anyone who wants to help advance the future of networking hardware for AI to engage with OCP and Meta to help shape the future of AI infrastructure.

1 month ago @ engineering.fb.com
Meta’s open AI hardware vision

At the Open Compute Project (OCP) Global Summit 2024, we’re showcasing our latest open AI hardware designs with the OCP community.

These innovations include a new AI platform, cutting-edge open rack designs, and advanced network fabrics and components.

The open future of AI infra: Meta is committed to open source AI.

We must also prioritize open and standardized models so we can leverage collective expertise, make AI more accessible, and work towards minimizing biases in our systems. Just as important, we also need open AI hardware systems.

By addressing AI’s infrastructure needs together, we can unlock the true promise of open AI for everyone.

1 month ago @ engineering.fb.com
How open source AI can improve population estimates, sustainable energy, and the delivery of climate change interventions

Why we need better population maps: Accurate estimates of population are taken for granted in many countries.

As the world’s natural resource and energy demands scale, accurate population estimates also offer significant opportunities to improve sustainability efforts.

In addition to total population counts, Meta’s population maps also include demographic breakdowns for groups such as the number of children under five, women of reproductive age, youth, and the elderly.

AI-powered population estimates have been scientifically evaluated to be among the most accurate in the world for mapping population distribution for a variety of geographies and use-cases.

Please visit the Data for Good websit…

1 month, 2 weeks ago @ engineering.fb.com
Simulator-based reinforcement learning for data center cooling optimization

As Meta revamps its new data center design to optimize for artificial intelligence, the same methodology will be applicable to future data center optimizations as well, improving operational efficiency.

A reinforcement learning approach to data center cooling: Reinforcement learning (RL) is good at modeling control systems as sequential state machines.

There are also various RL approaches reported, such as transforming cooling optimization via deep reinforcement learning and data center cooling using model-predicti…

2 months, 1 week ago @ engineering.fb.com
How PyTorch powers AI training and inference

Learn about new PyTorch advancements for LLMs and how PyTorch is enhancing every aspect of the LLM lifecycle.

In this talk from AI Infra @ Scale 2024, software engineers Wanchao Liang and Evan Smothers are joined by Meta research scientist Kimish Patel to discuss our newest features and tools that enable large-scale training, memory efficient fine-tuning, and on-device LLM capabilities.

First, they cover the importance of memory-efficient fine-tuning and a few common architectural and algorithmic techniques to enable fine-tuning on consumer-grade hardware.

Then they discuss the challenges of deploying large models for on-device deployment and how …

2 months, 4 weeks ago @ engineering.fb.com
Inside the hardware and co-design of MTIA

In this talk from AI Infra @ Scale 2024, Joel Colburn, a software engineer at Meta, technical lead Junqiang Lan, and software engineer Jack Montgomery discuss the second generation of MTIA, Meta’s in-house training and inference accelerator.

They cover the co-design process behind building the second generation of Meta’s first-ever custom silicon for AI workloads, including the PyTorch software ecosystem, and the model architectures for Meta’s key applications.

They demonstrate how MTIA achieves the performance, efficiency, and developer experience to successfully launch models into production.

They also highlight several co-design examples where special silicon features are utilized to acc…

3 months ago @ engineering.fb.com
Bringing Llama 3 to life

At AI Infra @ Scale 2024, Meta engineers discussed every step of how we built and brought Llama 3 to life, from data and training to inference.

Joe Spisak, Product Director and Head of Generative AI Open Source at Meta, talks about the history of Llama and Meta’s overarching vision for open source AI.

He’s joined by Delia David, a software engineer at Meta, to discuss all things data-related for GenAI.

Kaushik Veeraraghavan, a software engineer at Meta, discusses how Meta trains Llama at scale and delves into the data center, networking, and software investments that have enabled the development of Meta’s Llama 3 models.

Finally, Ye (Charlotte) Qia, a production engineer at Meta, discusses …

3 months ago @ engineering.fb.com
Aparna Ramani discusses the future of AI infrastructure

Delivering new AI technologies at scale also means rethinking every layer of our infrastructure – from silicon and software systems to our data center designs.

For the second year in a row, Meta’s engineering and infrastructure teams returned for the AI Infra @ Scale conference, where they discussed the challenges of scaling up an infrastructure for AI as well as work being done on our large-scale GPU clusters, open hardware designs for next-generation data center hardware, and how Meta is building custom silicon like the Meta Training and Inference Accelerator (MTIA) to handle some of our AI training workloads.

Aparna Ramani, VP of Engineering at Meta, responsible for AI infrastructu…

3 months ago @ engineering.fb.com
How Meta animates AI-generated images at scale

Meta AI’s animate feature, which lets people generate a short animation of a generated image, carried unique challenges in this regard.

Here’s how we were able to deploy Meta AI’s animate feature using a combination of latency optimizations, traffic management, and other novel techniques.

We started by looking at the data for previous traffic on our AI-generated media both at their launches and over time.

With these changes, the preponderance of requests remained in region and latency dropped to roughly what we would expect.

The service tries to take a chunk of that region’s requests and offload them to a nearby region that can handle them without becoming more overloaded.

3 months, 1 week ago @ engineering.fb.com
A RoCE network for distributed AI training at scale

Our paper, “RDMA over Ethernet for Distributed AI Training at Meta Scale,” provides the details on how we design, implement, and operate one of the world’s largest AI networks at scale.

These RoCE clusters support an extensive range of production distributed GPU training jobs, including ranking, content recommendation, content understanding, natural language processing, and GenAI model training, among other workloads.

However, our experience with distributed AI training workloads provides a different perspective on tailoring the congestion control algorithms.

Moving forward: The design and operation of large-scale RoCE networks for distributed AI training workloads have evolved to meet the …

3 months, 2 weeks ago @ engineering.fb.com
Meet Caddy – Meta’s next-gen mixed reality CAD software

What happens when a team of mechanical engineers gets tired of looking at flat images of 3D models over Zoom?

Meet the team behind Caddy, a new CAD app for mixed reality.

They join Pascal Hartig (@passy) on the Meta Tech Podcast to talk about teaching themselves to code, disrupting the CAD software space, integrating Caddy with Llama 3, and so much more!

Download or listen to the podcast episode below. You can also find the episode wherever you get your podcasts. The Meta Tech Podcast is brought to you by Meta and highlights the work Meta’s engineers are doing at every level – from low-level frameworks to end-user features.

And if you’re interested in lea…

4 months ago @ engineering.fb.com
AI Lab: The secrets to keeping machine learning engineers moving fast

The key to developer velocity across AI lies in minimizing time to first batch (TTFB) for machine learning (ML) engineers.

AI Lab prevents TTFB regressions whilst enabling experimentation to develop improvements.

Optimizing TTFB helps ML engineers move fast: the overhead induced by TTFB is on the critical path for most ML development.

Here, we see the true utility of a framework like AI Lab and how it was used to facilitate this sweeping change.

O(Releases): Running a more holistic set of AI Lab tests prior to release and performing a bisect-like attribution process to find the root cause.

4 months, 1 week ago @ engineering.fb.com
Taming the tail utilization of ads inference at Meta scale

Improving tail utilization – the utilization level of the top 5% of the servers when ranked by utilization – within our infrastructure is imperative to operate our fleet efficiently and sustainably.

Challenges of load balancing: There are two approaches to load balancing. Routing load balancing – load balancing across replicas of a single model.

Placement load balancing – balancing load on hosts by moving replicas of a model across hosts.

It also exposed a deeper problem, spiky tail utilization, which was hidden behind the high tail utilization and was fixed once identified.

Optimizing tail utilization within IPnext thereby delivering these benefits to a broader range of expanding machine …

4 months, 1 week ago @ engineering.fb.com
Meta’s approach to machine learning prediction robustness

Prediction uncertainty is inherent, which makes it difficult to define, identify, diagnose, reproduce, and debug prediction quality issues.

However, ML prediction stability implies a consistent prediction quality shift, which is harder to distinguish.

Meta’s approach and progress toward prediction robustness: Meta has developed a systematic framework to build prediction robustness.

Additionally, fundamental understanding of training data label consistency has resulted in optimizations in training data generation for better model learning.

Regarding ML development productivity, prediction robustness techniques can facilitate model development, and improve daily operations by reducing the time…

4 months, 1 week ago @ engineering.fb.com
Uber Engineering
last post: none
neptune.ai
last post 6 days, 16 hours ago
How to Run LLMs Locally

This is the process we will be going through in the following section to answer the question: When do I decide to run LLMs locally?

Privacy: A very obvious argument in favor of running LLMs locally is that, in some cases, there is no alternative.

Related: “ML/AI Platform Build vs. Buy Decision: What Factors to Consider.” What does it take to run LLMs locally?

However, recent advancements in optimization techniques, such as quantization and attention mechanism optimizations, have made it possible to run LLMs locally, even on a CPU.
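
The quantization mentioned above can be sketched in a few lines. This is a minimal, illustrative symmetric int8 scheme in NumPy, not the algorithm of any particular runtime:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller than float32, with rounding error bounded by half a step.
print(q.nbytes, w.nbytes)  # 65536 262144
```

Real runtimes use per-channel or group-wise scales and 4-bit formats, but the storage saving and bounded rounding error shown here are the core idea.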

LM Studio is a user-friendly application designed to run LLMs locally.

6 days, 16 hours ago @ neptune.ai
Scale And Track Your AI/ML Workflows: neptune.ai + Flyte & Union Integration

Like Union, Neptune excels in scalability, making it the ideal tracking solution for teams working on large-scale model training.

The new Neptune Flyte plugin enables you to use Neptune to track, visualize, and manage your models.

In this blog post, you’ll learn how to use the Neptune plugin on Union.

Orchestrate and track your models with Flytekit’s Neptune Plugin: In Union, data and compute are fundamental building blocks for developing all workflows.

With flytekit’s Neptune plugin, you can easily track your experiments, visualize results, and debug your models.

1 month, 1 week ago @ neptune.ai
LLM Hallucinations 101: Why Do They Appear? Can We Avoid Them?

What are LLM hallucinations?

LLM hallucinations become a problem in LLM-based applicationsMost of the time, if you use an LLM, you probably won’t use a base LLM but an LLM-based assistant whose goal is to help with your requests and reliably answer your questions.

Before we dive into this further, I’d like to stress that when thinking about LLM hallucinations, it’s important to keep in mind the difference between a base LLM and an LLM-based assistant.

When we talk about LLM hallucinations as a problematic phenomenon, it’s in the context of an LLM-based assistant or system.

While it’s unlikely that this process introduces new hallucinations, hallucinations seeded upstream are amplified.

1 month, 3 weeks ago @ neptune.ai
LLM Guardrails: Secure and Controllable Deployment

LLM guardrails prevent models from generating harmful, biased, or inappropriate content and ensure that they adhere to guidelines set by developers and stakeholders.

LLM guardrails are small programs that validate and correct the models’ outputs to ensure they align with your application’s specific requirements and context.

We’ll approach the broad and constantly evolving field of LLM guardrails in three stages:First, we’ll talk about the key vulnerabilities threatening AI applications.

We’ll explore these different kinds of LLM guardrails using the Guardrails AI framework, an open-source tool for building reliable AI applications.

Rule-based data validation: The simplest type of LLM g…
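
As an illustration of rule-based validation, here is a generic sketch of an output validator; it is not the Guardrails AI API, and the rules (length cap, a toy PII regex) are placeholders:

```python
import re

def validate_output(text: str, max_chars: int = 280) -> list[str]:
    """Return a list of rule violations for a model output (empty = passes)."""
    violations = []
    if len(text) > max_chars:
        violations.append("too_long")
    if re.search(r"\b\d{3}-\d{2}-\d{4}\b", text):  # looks like a US SSN
        violations.append("possible_pii")
    if not text.strip():
        violations.append("empty_output")
    return violations

print(validate_output("My SSN is 123-45-6789"))  # ['possible_pii']
print(validate_output("All good."))              # []
```

A real guardrail layer would additionally decide what to do on failure: re-prompt the model, redact the output, or fall back to a canned response.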

2 months ago @ neptune.ai
Reinforcement Learning From Human Feedback (RLHF) For LLMs

TL;DR Reinforcement Learning from Human Feedback (RLHF) unlocked the full potential of today’s large language models (LLMs).

Reinforcement Learning from Human Feedback (RLHF) has turned out to be the key to unlocking the full potential of today’s large language models (LLMs).

Related: “LLM Evaluation For Text Summarization.” The RLHF process consists of three steps, starting with collecting human feedback.

Collecting human feedback: The first step in RLHF is to collect human feedback in the so-called preference dataset.

We analyzed the three steps of the RLHF training pipeline: collecting human feedback, training the reward model, and fine-tuning the LLM.
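
The reward-model step is commonly trained with a pairwise Bradley-Terry-style loss on the preference dataset; a minimal NumPy sketch:

```python
import numpy as np

def preference_loss(r_chosen: np.ndarray, r_rejected: np.ndarray) -> float:
    """Bradley-Terry pairwise loss for reward-model training:
    -log sigmoid(r_chosen - r_rejected), averaged over the batch."""
    margin = r_chosen - r_rejected
    return float(np.mean(np.log1p(np.exp(-margin))))  # == -log σ(margin)

# When the reward model already ranks the chosen answer higher, the loss is smaller.
print(preference_loss(np.array([2.0]), np.array([-1.0])) <
      preference_loss(np.array([0.0]), np.array([0.0])))  # True
```

The rewards here stand in for the scalar outputs a reward model assigns to the chosen and rejected completions of the same prompt.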

2 months, 1 week ago @ neptune.ai
LLM For Structured Data

However, when we look for data in a specific domain or organization, we often end up finding structured data.

The most likely reason is that structured data is still the de facto standard for quantitative information.

Consequently, in the age of Large Language Models (LLMs), structured data still is and will continue to be relevant—even Microsoft is working on adding LLMs to Excel!

LLMs are mostly used with unstructured data, particularly text, but with the proper tools, they can also help tackle tasks with structured data.

Use case 3, synthetic structured data generation: When working with structured datasets, it is common to need more data with the same characteristic…

2 months, 2 weeks ago @ neptune.ai
Strategies For Effective Prompt Engineering

In this article, I’ll explore the following questions: What are the basic and advanced strategies for prompt engineering, and when should they be used?

Basic prompt engineering strategiesWhile there is no universal way to design a prompt, there are strategies we can follow to create prompts that yield LLM outputs closer to what we expect.

The basic strategies we cover in this section are straightforward to implement in a lot of different tasks, and they involve minimal customization.

When to use prompt templates: I recommend trying this approach for your project if you encounter repetitive tasks; prompt templates ensure uniformity and save a lot of time.
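
A prompt template can be as simple as Python's string.Template; the field names below are illustrative:

```python
from string import Template

# A reusable prompt template; placeholder names are illustrative.
SUMMARIZE = Template(
    "You are a concise assistant.\n"
    "Summarize the following $doc_type in at most $n_sentences sentences:\n\n"
    "$document"
)

prompt = SUMMARIZE.substitute(doc_type="meeting transcript", n_sentences=2,
                              document="Alice proposed X. Bob agreed. ...")
print(prompt.splitlines()[1])  # Summarize the following meeting transcript in at most 2 sentences:
```

Filling the same template across many inputs is what yields the uniformity and time savings described above.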

Lessons learned: As w…

2 months, 3 weeks ago @ neptune.ai
LLM Evaluation For Text Summarization

TL;DR Evaluating text summarization is difficult because there is no one correct solution, and summarization quality often depends on the summary’s context and purpose.

How does LLM text summarization work?

METEOR is recall-oriented and ensures that the generated text captures as much information as possible from the reference text.
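
To make "recall-oriented" concrete, here is a toy unigram-recall score; real METEOR additionally uses stemming, synonym matching, and a fragmentation penalty:

```python
def unigram_recall(candidate: str, reference: str) -> float:
    """Fraction of reference tokens that also appear in the candidate.
    A toy, recall-oriented overlap score (real METEOR adds stemming,
    synonym matching, and a fragmentation penalty)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not ref:
        return 0.0
    matched = sum(1 for tok in ref if tok in cand)
    return matched / len(ref)

# 4 of the 6 reference tokens appear in the candidate.
print(unigram_recall("the cat sat", "the cat sat on the mat"))
```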

Example prompt: “Source Text: {{Document}} Summary: {{Summary}} Evaluation Form (scores ONLY): – Relevance:” Different prompts for different evaluation criteria are available.

The extensive use of LLMs for text summarization, including the integration of summarization features in search engines, makes research in this field highly popular and relevant.

3 months ago @ neptune.ai
Observability in LLMOps: Different Levels of Scale

The demand for observability and the scale of the required infrastructure vary significantly along the LLMOps value chain.

The distributed structure of agentic networks adds another level of complexity, which is not yet addressed fully by LLM observability tools and practices.

The value chain of LLMOps: The LLMOps value chain starts with training foundation models and subsequent task-specific fine-tuning.

Each step has different observability needs and requires different scales of observability tooling and infrastructure.

Full talk: “Observability in LLMOps: Different Levels of Scale” – watch on YouTube.

3 months, 1 week ago @ neptune.ai
LLM Observability: Fundamentals, Practices, and Tools

Large Language Model (LLM) observability comprises methods for monitoring, tracing, and analyzing an LLM system’s behavior and responses.

Before we discuss the different pillars of LLM observability in detail, let’s clarify how LLM observability relates to LLM monitoring and ML observability.

LLM monitoring vs. LLM observability: The difference between LLM monitoring and LLM observability is analogous to the one between traditional monitoring and observability.

LLM observability vs. ML observability: Machine learning (ML) observability is an established practice, so it is natural to ask why LLM applications require a new approach.

Langfuse is an open-source LLM observability platform tha…

3 months, 1 week ago @ neptune.ai
3 Takes on End-to-End For the MLOps Stack: Was It Worth It?

End-to-end (E2E) MLOps platforms promise to simplify the complicated process of building, deploying, and maintaining ML models in production.

However, while E2E MLOps platforms promise convenience and integration, they may not always align with an organization’s specific needs, existing infrastructure, or long-term goals.

She brings a wealth of experience to the discussion on end-to-end MLOps platforms.

However, she highlighted a core problem: “The main challenge of using end-to-end ML platforms for your MLOps stack is that nothing works exactly as you need.”

As an avid podcast fan, I’ve also learned much from listening to experienced MLOps engineers share their experiences building platforms …

4 months ago @ neptune.ai
Adversarial Machine Learning: Defense Strategies

Defense strategies like adversarial learning, monitoring, defensive distillation, and differential privacy improve robustness against adversarial attacks.

In this article, we’ll review common attack strategies and dive into the latest defense mechanisms for shielding machine learning systems against adversarial attacks.

Regardless of the level of access to the targeted machine learning model, adversarial attacks can be further categorized as: evasion attacks, data-poisoning attacks, Byzantine attacks, and model-extraction attacks.

Related: “Adversarial Attacks on Neural Networks: Exploring the Fast Gradient Sign Method.” Data-poisoning attacks are another flavor of advers…
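
The Fast Gradient Sign Method referenced above fits in a few lines; this sketch attacks a toy logistic-regression model rather than a neural network:

```python
import numpy as np

def fgsm(x, y, w, b, eps):
    """Fast Gradient Sign Method: perturb x by eps in the direction
    of the sign of the loss gradient with respect to the input."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted P(y=1)
    grad_x = (p - y) * w                    # d(cross-entropy)/dx for logistic regression
    return x + eps * np.sign(grad_x)

w = np.array([2.0, -1.0]); b = 0.0
x = np.array([1.0, 1.0]); y = 1.0           # correctly classified (decision score > 0)
x_adv = fgsm(x, y, w, b, eps=0.6)
# The small perturbation pushes the decision score across the boundary.
print(x @ w + b, x_adv @ w + b)
```

Against neural networks the gradient comes from backpropagation, but the perturbation rule is the same.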

4 months, 1 week ago @ neptune.ai
Building LLM Applications With Vector Databases

Large Language Model (LLM): A machine-learning model that takes in a textual prompt and outputs an answer.

These documents and the user query comprise the prompt for the LLM. Vector databases for context retrieval: The simplest way to leverage vector databases in LLM systems is to use them to efficiently search for context that can help your LLM provide accurate answers.
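
At its core, that context search is nearest-neighbor retrieval over embeddings; a minimal cosine-similarity sketch (real vector databases add approximate indexes such as HNSW):

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k most cosine-similar document embeddings."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q
    return np.argsort(-sims)[:k]

# Toy 2-d "embeddings"; the query points almost along the x-axis.
docs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
print(top_k(np.array([1.0, 0.05]), docs))  # [0 1]
```

In a RAG pipeline the returned documents are then pasted into the prompt alongside the user query.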

Dynamic few-shot prompting: The prompt is constructed by combining the original user query and examples selected through retrieval from a vector database. Related: “Zero-Shot and Few-Shot Learning with LLMs.” How to build LLM applicat…

4 months, 2 weeks ago @ neptune.ai
How to Migrate From MLFlow to Neptune

Your MLflow run logs can easily be exported to the neptune.ai app using a dedicated plugin.

Seven reasons for migrating from MLflow to neptune.ai: Before diving into the migration process, let’s see how MLflow and Neptune differ and why migrating from MLflow to Neptune is worth the effort.

Export MLflow logs to neptune.ai: We will migrate your existing MLflow run logs and models to the Neptune experiment tracker.

Export MLflow logs to Neptune: Depending on whether you store your MLflow data locally or work with a remote tracking server, the export works slightly differently.

Here’s an example of how to create a Neptune tracking URI and pass it to MLflow: import os; import mlflow; from neptune_mlfl…

4 months, 3 weeks ago @ neptune.ai
Introducing Redesigned Navigation, Run Groups, Reports, and More

We have consolidated the separate runs table, run details, and compare runs views into one single view, freeing up a ton of space at the top of your screen.

Run groups: Level up the analysis of your experiments. To some extent, grouping experiments has always been possible in Neptune.

But we’re now elevating run groups to be first-class citizens of Neptune.

Here’s what’s new: most importantly, you can now group experiments by string sets, e.g., tags and the newly introduced group tags.

A few changes in the runs table include: we’re working on bringing the run name to the front row of the runs table.

4 months, 4 weeks ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
last post 1 month ago
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

This paper (by Apple) questions the mathematical reasoning abilities of current LLMs and designs a synthetic template-based dataset distribution to investigate various aspects of LLM performance on grade-school-level math questions. Paper: https://arxiv.org/abs/2410.05229 Abstract:

Recent advancements in Large Language Models (LLMs) have sparked interest in their formal reasoning capabilities, particularly in mathematics. The GSM8K benchmark is widely used to assess the mathematical reasoning of models on grade-school-level questions. While the performance of LLMs on GSM8K has significantly improved in recent years, it remains unclear whether their mathematical reasoning capabilities hav…

1 month ago @ youtube.com
Were RNNs All We Needed? (Paper Explained)

This paper posits an interesting question: how much of the performance of Mamba, S4, and other state-space-like models is actually attributable to a few very core concepts rather than to their elaborate architectures? The authors construct minimal versions of GRUs and LSTMs and report competitive performance. Paper: https://arxiv.org/abs/2410.01201 Abstract:

The scalability limitations of Transformers regarding sequence length have renewed interest in recurrent sequence models that are parallelizable during training. As a result, many novel recurrent architectures, such as S4, Mamba, and Aaren, have been proposed that achieve comparable performance. In this work, we revisit traditional …
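
The minimal-GRU idea can be sketched as a cell whose gate depends only on the current input, which is what makes the recurrence parallelizable at training time; dimensions and initialization here are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def min_gru_step(x_t, h_prev, W_z, W_h):
    """One step of a minGRU-style cell: both the update gate and the
    candidate state depend on the input only (no h_prev in the gate),
    so the recurrence can be unrolled with a parallel scan at train time."""
    z = sigmoid(x_t @ W_z)      # update gate, no hidden-state dependence
    h_tilde = x_t @ W_h         # candidate state
    return (1.0 - z) * h_prev + z * h_tilde

rng = np.random.default_rng(0)
W_z, W_h = rng.standard_normal((2, 5, 3))   # input dim 5, hidden dim 3
h = np.zeros(3)
for x_t in rng.standard_normal((10, 5)):    # run a short sequence
    h = min_gru_step(x_t, h, W_z, W_h)
print(h.shape)  # (3,)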

1 month, 1 week ago @ youtube.com
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper)

How can one best use extra FLOPS at test time? Paper: https://arxiv.org/abs/2408.03314 Abstract:

Enabling LLMs to improve their outputs by using more test-time computation is a critical step towards building generally self-improving agents that can operate on open-ended natural language. In this paper, we study the scaling of inference-time computation in LLMs, with a focus on answering the question: if an LLM is allowed to use a fixed but non-trivial amount of inference-time compute, how much can it improve its performance on a challenging prompt? Answering this question has implications not only on the achievable performance of LLMs, but also on the future of LLM pretraining and how one s…
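
One simple way to spend extra inference-time compute is best-of-N sampling against a verifier or scorer; the sketch below uses toy stand-ins for the sampler and the scorer:

```python
import random

def best_of_n(generate, score, n=8, seed=0):
    """Spend extra test-time compute: sample n candidates, keep the best-scoring one."""
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n)]
    return max(candidates, key=score)

# Toy stand-ins: the "LLM" samples numbers, the "verifier" prefers values near 3.0.
answer = best_of_n(generate=lambda rng: rng.gauss(0, 1),
                   score=lambda x: -abs(x - 3.0), n=64)
first_try = random.Random(0).gauss(0, 1)          # what a single sample would give
print(abs(answer - 3.0) <= abs(first_try - 3.0))  # True: more samples never hurt here
```

The paper studies when this kind of search (and revision-based alternatives) beats simply using a larger model for the same FLOP budget.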

1 month, 2 weeks ago @ youtube.com
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models (Paper Explained)

#llm #privacy #finetuning Can you tamper with a base model in such a way that it will exactly remember its fine-tuning data? This paper presents a method of doing exactly that, and implements it in modern transformers. OUTLINE:

0:00 - Intro & Overview

10:50 - Core idea: single-use data traps

44:30 - Backdoors in transformer models

58:00 - Additional numerical tricks

1:00:35 - Experimental results & conclusion Paper: https://arxiv.org/abs/2404.00473

Code: https://github.com/ShanglunFengatETHZ/PrivacyBackdoor Abstract:

Practitioners commonly download pretrained machine learning models from open repositories and finetune them to fit specific applications. We show that this practice introduces a…

3 months, 2 weeks ago @ youtube.com
Scalable MatMul-free Language Modeling (Paper Explained)

Matrix multiplications (MatMuls) are pervasive throughout modern machine learning architectures. However, they are also very resource intensive and require special accelerators (GPUs). This paper explores architectures that do away with MatMuls and use quantization and recurrence to keep performance up. OUTLINE:

0:00 - Intro

2:30 - MatMul is everywhere

5:55 - Ternary accumulation as a substitute for matrix multiplication

16:35 - Replacing attention layers with recurrent layers

32:40 - Replacing dense layers with ternary channel mixing

38:30 - Language modelling results & scaling laws

45:00 - Other experimental results

48:20 - Conclusion Paper: https://arxiv.org/abs/2406.02528

Code: https://…
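
The ternary-accumulation substitute discussed in the video can be verified in a few lines: with weights restricted to {-1, 0, +1}, a matrix product reduces to signed sums of inputs:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
w = rng.choice([-1, 0, 1], size=(8, 3))   # ternary weights

# With ternary weights, each output is a sum/difference of selected inputs:
# no multiplications are needed, only accumulation.
out_acc = np.stack([
    x[:, w[:, j] == 1].sum(axis=1) - x[:, w[:, j] == -1].sum(axis=1)
    for j in range(w.shape[1])
], axis=1)

print(np.allclose(out_acc, x @ w))  # True
```

This is the arithmetic trick; the paper combines it with quantization-aware training so that the ternary constraint does not destroy model quality.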

4 months, 2 weeks ago @ youtube.com
Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools (Paper Explained)

#rag #hallucinations #legaltech An in-depth look at a recent Stanford paper examining the degree of hallucinations in various LegalTech tools that incorporate LLMs. OUTLINE:

0:00 - Intro

1:58 - What are legal research tools and how are large language models used by them?

5:30 - Overview and abstract of the paper

9:29 - What is a hallucination and why do they occur?

15:45 - What is retrieval augmented generation (RAG)?

25:00 - Why LLMs are a bad choice when reasoning is involved

29:16 - The products that were tested

32:00 - Some shady practices by the researchers in the back and forth with the legal research companies

37:00 - Legal technology companies’ marketing claims to eliminate or solve…

4 months, 3 weeks ago @ youtube.com
xLSTM: Extended Long Short-Term Memory

xLSTM is an architecture that combines the recurrency and constant memory requirement of LSTMs with the large-scale training of transformers and achieves impressive results. Paper: https://arxiv.org/abs/2405.04517 Abstract:

In the 1990s, the constant error carousel and gating were introduced as the central ideas of the Long Short-Term Memory (LSTM). Since then, LSTMs have stood the test of time and contributed to numerous deep learning success stories, in particular they constituted the first Large Language Models (LLMs). However, the advent of the Transformer technology with parallelizable self-attention at its core marked the dawn of a new era, outpacing LSTMs at scale. We now raise a sim…

5 months, 3 weeks ago @ youtube.com
[ML News] OpenAI is in hot waters (GPT-4o, Ilya Leaving, Scarlett Johansson legal action)

#gpt4o #sky #scarlettjohansson After the release of their flagship model GPT-4o, OpenAI finds itself in multiple controversies and an exodus of senior personnel, notably Ilya Sutskever. References:

https://openai.com/index/gpt-4o-and-more-tools-to-chatgpt-free/

https://openai.com/index/hello-gpt-4o/

https://x.com/LiamFedus/status/1790064963966370209?t=rx2YBT9AdDdKPhI6dUH4zA&s=09

https://x.com/lmsysorg/status/1790097588399779991?t=rx2YBT9AdDdKPhI6dUH4zA&s=09

https://x.com/bindureddy/status/1790127425705120149?t=mMUBqFBRphx-bDuZ1j3mjQ&s=09

https://openai.com/index/improvements-to-data-analysis-in-chatgpt/

https://openai.com/index/openai-and-reddit-partnership/

https://archive.ph/jHlMm

https:/…

6 months ago @ youtube.com
ORPO: Monolithic Preference Optimization without Reference Model (Paper Explained)

Paper: https://arxiv.org/abs/2403.07691 Abstract:

While recent preference alignment algorithms for language models have demonstrated promising results, supervised fine-tuning (SFT) remains imperative for achieving successful convergence. In this paper, we study the crucial role of SFT within the context of preference alignment, emphasizing that a minor penalty for the disfavored generation style is sufficient for preference-aligned SFT. Building on this foundation, we introduce a straightforward and innovative reference model-free monolithic odds ratio preference optimization algorithm, ORPO, eliminating the necessity for an additional preference alignment phase. We demonstrate, both empiri…
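
The odds-ratio term at the heart of ORPO can be sketched as follows; the length-normalized likelihoods and the way the term is combined with the SFT loss are simplified here:

```python
import numpy as np

def odds_ratio_loss(logp_chosen: float, logp_rejected: float) -> float:
    """ORPO-style odds-ratio penalty (sketch): -log sigmoid of the log-odds
    gap between the preferred and disfavored completions. In the paper this
    term is added to the usual SFT loss on the chosen answer."""
    def log_odds(logp):
        p = np.exp(logp)               # (length-normalized) sequence likelihood
        return np.log(p) - np.log1p(-p)
    margin = log_odds(logp_chosen) - log_odds(logp_rejected)
    return float(np.log1p(np.exp(-margin)))  # == -log σ(margin)

# The penalty shrinks as the model prefers the chosen completion more strongly.
print(odds_ratio_loss(-0.5, -3.0) < odds_ratio_loss(-2.0, -2.0))  # True
```

Because the penalty is computed from the model's own likelihoods, no frozen reference model is needed, which is the paper's headline simplification over DPO-style objectives.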

6 months, 3 weeks ago @ youtube.com
[ML News] Chips, Robots, and Models

OUTLINE:

0:00 - Intro

0:19 - Our next-generation Meta Training and Inference Accelerator

01:39 - ALOHA Unleashed

03:10 - Apple Inks $50M Deal with Shutterstock for AI Training Data

04:28 - OpenAI Researchers, Including Ally of Sutskever, Fired for Alleged Leaking

05:01 - Adobe's Ethical Firefly AI was Trained on Midjourney Images

05:52 - Trudeau announces $2.4 billion for AI-related investments

06:48 - RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

07:15 - CodeGemma - an official Google release for code LLMs

07:24 - Mistral AI: Cheaper, Better, Faster, Stronger

08:08 - Vezora/Mistral-22B-v0.1

09:00 - WizardLM-2, next generation state-of-the-art-LLM

09:31 - Idefic…

6 months, 3 weeks ago @ youtube.com
TransformerFAM: Feedback attention is working memory

Paper: https://arxiv.org/abs/2404.09173 Abstract:

While Transformers have revolutionized deep learning, their quadratic attention complexity hinders their ability to process infinitely long inputs. We propose Feedback Attention Memory (FAM), a novel Transformer architecture that leverages a feedback loop to enable the network to attend to its own latent representations. This design fosters the emergence of working memory within the Transformer, allowing it to process indefinitely long sequences. TransformerFAM requires no additional weights, enabling seamless integration with pre-trained models. Our experiments show that TransformerFAM significantly improves Transformer performance on long-…

6 months, 3 weeks ago @ youtube.com
[ML News] Devin exposed | NeurIPS track for high school students

OUTLINE:

0:00 - Intro

0:21 - Debunking Devin: "First AI Software Engineer" Upwork lie exposed!

07:24 - NeurIPS 2024 will have a track for papers from high schoolers.

13:29 - Opus can operate as a Turing machine.

13:47 - An AI-Powered, Self-Running Propaganda Machine for $105

14:27 - TechScape: How cheap, outsourced labour in Africa is shaping AI English

16:25 - Is ChatGPT Transforming Academics' Writing Style? References:

https://news.ycombinator.com/item?id=40008109&s=09

https://www.youtube.com/watch?v=tNmgmwEtoWE

https://www.youtube.com/watch?v=xE2fxcETP5E

https://twitter.com/itsandrewgao/status/1779369373737668669?t=omW3DvRNmZyce8oo0Ehf1g&s=09

https://twitter.com/0interestrates/status/17…

6 months, 3 weeks ago @ youtube.com
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Google researchers achieve supposedly infinite context attention via compressive memory. Paper: https://arxiv.org/abs/2404.07143 Abstract:

This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. A key component in our proposed approach is a new attention technique dubbed Infini-attention. The Infini-attention incorporates a compressive memory into the vanilla attention mechanism and builds in both masked local attention and long-term linear attention mechanisms in a single Transformer block. We demonstrate the effectiveness of our approach on long-context language modeling benchmarks, 1M …
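The compressive-memory idea can be sketched in a few lines. This is a hedged toy version based on standard linear-attention memories, not code from the paper — the σ feature map, the update rule, and the normalizer are assumptions:

```python
import numpy as np

def sigma(x):
    """ELU(x) + 1: a positive feature map, common in linear attention."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def infini_memory(segments, d=8):
    """Fold each segment's (K, V) into a fixed d x d memory M plus a
    normalizer z; reads cost O(d^2) regardless of total context length."""
    M = np.zeros((d, d))
    z = np.zeros(d) + 1e-6
    reads = []
    for Q, K, V in segments:                        # each (seg_len, d)
        sQ, sK = sigma(Q), sigma(K)
        reads.append((sQ @ M) / (sQ @ z)[:, None])  # retrieve long-term memory
        M += sK.T @ V                               # compressive update: memory
        z += sK.sum(axis=0)                         # grows in content, not size
    return reads, M
```

The bounded-memory claim falls out directly: `M` and `z` have fixed shape no matter how many segments stream through.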

7 months ago @ youtube.com
[ML News] Llama 3 changes the game

Meta's Llama 3 is out. New model, new license, new opportunities. References:

https://llama.meta.com/llama3/

https://ai.meta.com/blog/meta-llama-3/

https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md

https://llama.meta.com/trust-and-safety/

https://ai.meta.com/research/publications/cyberseceval-2-a-wide-ranging-cybersecurity-evaluation-suite-for-large-language-models/

https://github.com/meta-llama/llama-recipes/tree/main/recipes/responsible_ai

https://llama.meta.com/llama3/license/

https://about.fb.com/news/2024/04/meta-ai-assistant-built-with-llama-3/?utm_source=twitter&utm_medium=organic_social&utm_content=thread&utm_campaign=imagineflash

https://twitter.com/minchoi/status/178277…

7 months ago @ youtube.com
Hugging Face got hacked

Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2

Litecoin (LTC): LQW2TRyKYetVC8WjFkhpP…

7 months, 1 week ago @ youtube.com
Henry AI Labs
last post 3 months, 1 week ago
Chunking with Generative Feedback Loops

Hey everyone! I am super excited to share a quick notebook on using Generative Feedback Loops to chunk code files and better structure how they are indexed in the Weaviate Vector Database! Chunking is one of the key topics in Vector Search. We need to break up long documents into smaller parts that we can encode with a pre-trained embedding model and index in a vector index, such as HNSW-PQ. Most solutions use some form of a rolling token window, such as taking every 300 tokens as a chunk, with, say, 50 tokens overlapping between each window. Unfortunately, this approach doesn't work particularly well for code. We don't want a chunk to cut off in the middle of a function or class defini…
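The rolling-window scheme described above is only a few lines of Python (the 300/50 figures are just the examples from the text):

```python
def chunk_tokens(tokens, window=300, overlap=50):
    """Split a token list into windows of `window` tokens, with `overlap`
    tokens shared between consecutive chunks."""
    step = window - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):   # last window reached the end
            break
    return chunks
```

The video's point is that for code, boundaries should instead track syntactic units (functions, classes) rather than fixed token counts — which is where the generative feedback loop comes in.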

3 months, 1 week ago @ youtube.com
Gemini 1.5 Pro and Flash - Demo of Long Context LLMs!

Hey everyone! Thanks so much for watching this video exploring Gemini Pro 1.5 and Gemini Flash! Long Context LLMs!! This video covers 3 key tests, the classic "Lost in the Middle" exploration, using Long Context LLMs as Re-rankers in Search, and finally, testing Many-Shot In-Context Learning! I am really excited about the potential of Many-Shot In-Context Learning with DSPy's `BootstrapFewShot` and Gemini, curious to know what you think! Notebook: https://github.com/weaviate/recipes/blob/main/integrations/dspy/llms/Gemini-1.5-Pro-and-Flash.ipynb Gemini 1.5 Technical Report: https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf Chapters

0:00 Gemini 1.5!!

1:25 Setup and …

6 months, 1 week ago @ youtube.com
Llama 3 RAG Demo with DSPy Optimization, Ollama, and Weaviate!

Hey everyone! Thank you so much for watching this overview of Llama 3 looking at the release notes and seeing a demo of how to integrate it with DSPy through Ollama and how to use DSPy's MIPRO to find the optimal prompt when using this new large language model for RAG! We are hosting an event in San Francisco on May 1st with Arize AI and Cohere, featuring a talk from Omar Khattab, the lead author of DSPy! Hope to see you there! https://lu.ma/dspy Introducing Meta Llama 3: https://ai.meta.com/blog/meta-llama-3/ Ollama Llama 3: https://ollama.com/library/llama3 Weaviate Recipes: https://github.com/weaviate/recipes/blob/main/integrations/dspy/llms/Llama3.ipynb Chapters

0:00 Llama3!!

1:28 Relea…

7 months ago @ youtube.com
Building RAG with Command R+ from Cohere, DSPy, and Weaviate!

Hey everyone! Thank you so much for watching this overview of Command R+ showing you how you can use the new model in DSPy and a quick RAG demo, as well as walking through the details of the release post! Congratulations to the Cohere team! Super exciting times to be working with LLM systems! Introducing Command R+: A Scalable LLM Built for Business - https://txt.cohere.com/command-r-plus-microsoft-azure/ Link to demo notebook - https://github.com/weaviate/recipes/blob/main/integrations/dspy/llms/Command-R-Plus.ipynb Chapters

0:00 Welcome! Command R+!

1:12 Demo with Cohere, DSPy, and Weaviate

6:06 Command R+ Announcement Post

9:24 LLM Evals

7 months, 2 weeks ago @ youtube.com
Structured Outputs with DSPy

Unfortunately, Large Language Models will not consistently follow the instructions that you give them. This is a massive problem when you are building AI systems that require a particular type of output from the previous step to feed into the next one! For example, imagine you are building a blog post writing system that first takes a question and retrieved context to output a list of topics. These topics have to be formatted in a particular way, such as a comma-separated list or a JSON of Topic objects, such that the system can continue writing the blog post! I am SUPER excited to share the 4th video in my DSPy series, diving into 3 solutions to structuring outputs in DSPy programs: (1) **…
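As a minimal illustration of the problem (not DSPy's actual mechanism — DSPy has its own abstractions for this), here is a hedged sketch of accepting a topic list that an LLM may return either as JSON or as comma-separated text:

```python
import json

def parse_topics(llm_output: str) -> list[str]:
    """Accept either a JSON array of topics or a comma-separated list.
    A production system would re-prompt the model on parse failure."""
    try:
        parsed = json.loads(llm_output)
        if isinstance(parsed, list):
            return [str(t).strip() for t in parsed]
    except json.JSONDecodeError:
        pass
    # Fallback: treat the output as a plain comma-separated list.
    return [t.strip() for t in llm_output.split(",") if t.strip()]
```

Normalizing both formats to the same `list[str]` is what lets the next step of the pipeline consume the topics without caring how the model chose to format them.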

7 months, 3 weeks ago @ youtube.com
Adding Depth to DSPy Programs

Hey everyone! Thank you so much for watching the 3rd edition of the DSPy series, Adding Depth to DSPy Programs!! You can find the examples and links to community resources / news on https://github.com/weaviate/recipes! Chapters

0:00 Intro

0:50 Chapters Overview

5:06 Weaviate Recipes

5:24 DSPy News and Community Notes

13:51 Adding Depth to RAG Programs

18:40 Multi-Model DSPy Programs

20:18 DSPy Optimizers

25:30 Deep Dive Optimizers

27:55 Into the Optimizer Code!

37:48 Demo #1: Adding Depth to RAG

1:05:25 Demo #2: Questions to Blogs

1:07:48 Thank you so much for watching!

8 months, 3 weeks ago @ youtube.com
Getting Started with RAG in DSPy!

Hey everyone! Thank you so much for watching this tutorial on getting started with RAG programming in DSPy! This video will take you through 4 major aspects of building DSPy programs (1) Installation, settings, and Datasets with dspy.Example, (2) LLM Metrics, (3) The DSPy programming model, and (4) Optimization!! The notebook used in the video can be found here: https://github.com/weaviate/recipes/blob/main/integrations/dspy/1.Getting-Started-with-RAG-in-DSPy.ipynb All future videos, as well as additional utils like data import scripts, will be in this folder: https://github.com/weaviate/recipes/tree/main/integrations/dspy Please leave a star, it helps a lot! DSPy on GitHub: https://github.…

9 months, 1 week ago @ youtube.com
DSPy Explained!

Hey everyone! Thank you so much for watching this explanation of DSPy! DSPy is a super exciting new framework for developing LLM programs! Pioneered by frameworks such as LangChain and LlamaIndex, we can build much more powerful systems by chaining together LLM calls! This means that the output of one call to an LLM is the input to the next, and so on. We can think of chains as programs, with each LLM call analogous to a function that takes text as input and produces text as output. DSPy offers a new programming model, inspired by PyTorch, that gives you a massive amount of control over these LLM programs. Further the Signature abstraction wraps prompts and structured input / outputs to cle…
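The chain-as-program idea — each LLM call a text-to-text function whose output feeds the next call — can be sketched like this (the `summarize`/`outline` steps are hypothetical stand-ins for real LLM calls):

```python
from typing import Callable

TextFn = Callable[[str], str]

def chain(*steps: TextFn) -> TextFn:
    """Compose text->text functions into one program: the output of each
    step becomes the input of the next."""
    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return run

# Hypothetical stand-ins for LLM calls:
summarize = lambda t: f"summary({t})"
outline = lambda t: f"outline({t})"
pipeline = chain(summarize, outline)
```

Frameworks like DSPy build on this shape but add typed signatures and optimizers on top of the raw composition.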

9 months, 3 weeks ago @ youtube.com
3blue1brown
last post 16 hours ago
The scale of training LLMs

From this 7-minute LLM explainer: https://youtu.be/LPZh9BOjkQs

16 hours ago @ youtube.com
Large Language Models explained briefly

Dig deeper here: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi

Technical details as a talk: https://youtu.be/KJtZARuO3JY

Made for an exhibit at the Computer History Museum: https://computerhistory.org/

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support Timestamps:

0:00 - Who this was made for

0:41 - What are large language models?

7:48 - Where to learn more No secret end-screen vlog for this one, the end-screen real estate was all full! ------------------ These animations are largely made using a custom Python library, manim. See the FAQ comments here:

https://3b1b.co/faq#manim

https://github.com/3b1b/manim

https:/…

16 hours ago @ youtube.com
This puzzle is trickier than it seems

From this full video: https://youtu.be/piJkuavhV50

1 day, 12 hours ago @ youtube.com
Sphere surface area proof sketch

Full video: https://youtu.be/GNcFjFmqEc8

3 days, 15 hours ago @ youtube.com
Newton’s Fractal is beautiful

Full video: https://youtu.be/-RdOwhmqP5s

4 days, 14 hours ago @ youtube.com
The Triangle Of Power

The referenced Stack Exchange post by username 2'5 9'2: http://math.stackexchange.com/questions/30046/alternative-notation-for-exponents-logs-and-roots

5 days, 15 hours ago @ youtube.com
The twirling tiles puzzle

Full video: https://youtu.be/piJkuavhV50

1 week ago @ youtube.com
A bizarre probability fact

I learned this from Matt Parker, who then made a full video: https://youtu.be/ga9Qk38FaHM

1 week, 1 day ago @ youtube.com
Why 4d geometry makes me sad

A series of delightful geometry puzzles. Bonus video with extra puzzles: https://www.patreon.com/posts/115570453

The artwork at the end is by Kurt Bruns

Thanks to Daniel Kim for sharing the first two puzzles with me.

The idea to include the tetrahedron volume example was based on a conversation with Po-Shen Loh about these puzzles, during which he mentioned the case of one dimension lower.

I received the cone correction to the proof of Monge's theorem via Akos Zahorsky. If any of you know the original source, please let me know! I referenced quaternions at the end, and if you're curious to learn more, here are a few options. This video walks through concretely what the computation is for using…

1 week, 5 days ago @ youtube.com
LLMs are next-word predictors

Full video: https://youtu.be/wjZofJX0v4M

2 weeks, 2 days ago @ youtube.com
How I animate 3Blue1Brown | A Manim demo with Ben Sparks

A behind-the-scenes look at how I animate videos.

Code for all the videos: https://github.com/3b1b/videos

Manim: https://github.com/3b1b/manim

Community edition: https://github.com/ManimCommunity/manim/

Example scenes shown near the end: https://github.com/3b1b/manim/blob/master/example_scenes.py I added some more details about the workflow shown in this video to the readme of the videos repo: https://github.com/3b1b/videos?tab=readme-ov-file#workflow These lessons are funded directly by viewers: https://3b1b.co/support Timestamp:

0:00 - Intro

2:39 - Hello World

10:32 - Coding up a Lorenz attractor

23:46 - Add some tracking points

28:52 - The globals().update(locals()) hack

32:57 - Final st…
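The `globals().update(locals())` hack from the 28:52 chapter is, presumably, about dumping a function's local variables into the module namespace so they can be inspected interactively after the function returns — a minimal sketch (the `build_scene` function and its variables are hypothetical):

```python
def build_scene():
    points = [0.0, 1.0, 2.0]
    total = sum(points)
    # Expose every local (points, total, ...) at module level so an
    # interactive session can keep poking at them after this returns.
    globals().update(locals())

build_scene()
print(total)  # prints 3.0
```

It is a development-time convenience, not something to ship: it silently overwrites any module-level names that collide with the function's locals.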

1 month, 1 week ago @ youtube.com
Hologram preview

Full video: https://youtu.be/EmKQsSDlaa4

1 month, 1 week ago @ youtube.com
Holograms are wild

Full video: https://youtu.be/EmKQsSDlaa4

1 month, 2 weeks ago @ youtube.com
How are holograms possible? | Optics puzzles 5

3d scenes on 2d film, and a diffraction lesson along the way.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

An equally valuable form of support is to share the videos. Gabor's Nobel Prize lecture:

https://www.nobelprize.org/uploads/2018/06/gabor-lecture.pdf A few resources we found helpful for this video

Seeing the Light, by Falk, Brill, and Stork

https://amzn.to/3Ngdiqh Practical Holography, by Saxby and Zarcharovas

https://amzn.to/3ZR2MNN Principles of Holography by Howard Smith

https://amzn.to/3ZOihFZ Timestamps

0:00 - What is a Hologram?

3:28 - The recording process

11:45 - The simplest hologram

17:12 - Diffraction gratings

25:15 - …

1 month, 2 weeks ago @ youtube.com
A cute probability fact (part 3)

Why max(rand(), rand()) is the same as sqrt(rand()). See Matt Parker's video for more:

https://youtu.be/ga9Qk38FaHM
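The fact rests on both quantities sharing the CDF F(x) = x² on [0, 1]: P(max(U₁, U₂) ≤ x) = x · x, and P(√U ≤ x) = P(U ≤ x²) = x², so both have mean 2/3. A quick simulation check:

```python
import random

random.seed(0)
N = 200_000

# Sample mean of max(U1, U2) over N draws.
mean_max = sum(max(random.random(), random.random()) for _ in range(N)) / N
# Sample mean of sqrt(U) over N draws.
mean_sqrt = sum(random.random() ** 0.5 for _ in range(N)) / N

# Both estimates should sit near the shared theoretical mean of 2/3.
```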

2 months, 1 week ago @ youtube.com
Two Minute Papers
last post 5 days, 16 hours ago
Unreal Engine 5.5 - It Gets More Incredible!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 Unreal Engine 5.5 is available here:

https://www.unrealengine.com/en-US/blog/unreal-engine-5-5-is-now-available 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael T…

5 days, 16 hours ago @ youtube.com
Crazy AI Learned Minecraft - Try It Out For Free!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers Oasis: A Universe in a Transformer - try it out now:

https://oasis.decart.ai/welcome More info:

https://oasis-model.github.io/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrec…

1 week, 5 days ago @ youtube.com
OpenAI Takes On Google Search…With A Twist!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper is available here:

https://openai.com/index/introducing-simpleqa/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Sundv…

2 weeks, 1 day ago @ youtube.com
NVIDIA’s New Ray Tracing Tech Should Be Impossible!

❤️ Try Macro for free and supercharge your learning: https://macro.com/papers 📝 The paper "3D Gaussian Ray Tracing: Fast Tracing of Particle Scenes" is available here:

https://gaussiantracer.github.io/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael T…

2 weeks, 6 days ago @ youtube.com
OpenAI’s New AI Model: 50x Faster!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers

📝 The paper is available here:

https://openai.com/index/simplifying-stabilizing-and-scaling-continuous-time-consistency-models/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albre…

3 weeks, 5 days ago @ youtube.com
Why The Future of AI Might Be Video Games!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝Apple reasoning paper: https://arxiv.org/pdf/2410.05229 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Sundvall, Taras Bobrovytsk…

4 weeks ago @ youtube.com
AI Looks At 4,000,000 Frames, Learns To Walk!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers Guide on how I use Lambda to make images: https://docs.lambdalabs.com/on-demand-cloud/how-to-serve-the-flux.1-prompt-to-image-models-using-lambda-cloud-on-demand-instances 📝 The paper "Interactive Character Control with Auto-Regressive Motion Diffusion Models" is available here:

https://xbpeng.github.io/projects/AMDM/index.html

https://yi-shi94.github.io/amdm_page/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to …

1 month ago @ youtube.com
We Drop The Ball…And Things Go Really Wrong!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papersllm 📝 The paper is available here:

https://visualcomputing.ist.ac.at/publications/2024/PDNSF/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness…

1 month ago @ youtube.com
Microsoft’s New ChatGPT AI Plays A Video Game!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The papers are available here:

https://allenai.github.io/discoveryworld/

https://www.youtube.com/watch?v=hcBFhJKdAvk

https://cg.informatik.uni-freiburg.de/publications/2023_CGF_monolithic_contact_friction.pdf 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, Joh…

1 month, 1 week ago @ youtube.com
NVIDIA’s New AI Trains In A Minecraft-like World!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence" is available here:

https://research.nvidia.com/labs/prl/publication/williams2024fvdb/

https://developer.nvidia.com/fvdb

https://blogs.nvidia.com/blog/fvdb-bigger-digital-models/ Llama 3.2: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/

Gaussian Haircut paper: https://eth-ait.github.io/GaussianHaircut/ NVLM vision model:

https://research.nvidia.com/labs/adlr/NVLM-1/

https://huggingface.co/nvidia/NVLM-D-72B 📝 My paper on simulations that look almost like reality is a…

1 month, 1 week ago @ youtube.com
Unreal Engine 5.5 Is Here - Mega Greatness!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers Unreal Engine + 5.5 preview:

https://www.unrealengine.com/en-US

https://forums.unrealengine.com/t/unreal-engine-5-5-preview/2048423 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael A…

1 month, 2 weeks ago @ youtube.com
OpenAI’s New ChatGPT Voice Mode Is Crazy, NotebookLM, AlphaChip

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers NotebookLM: https://notebooklm.google/

Our paper on ray tracing: https://users.cg.tuwien.ac.at/zsolnai/gfx/adaptive_metropolis/ Give it a face: https://x.com/HalimAlrasihi/status/1840052149012570283

It sings with you: https://x.com/ai_for_success/status/1840079574048133350

AlphaChip: https://deepmind.google/discover/blog/how-alphachip-transformed-computer-chip-design/

Open NotebookLM: https://huggingface.co/spaces/gabrielchua/open-notebooklm Other sources:

https://x.com/emollick/status/1840225474905084079?s=46

https://x.com/OpenAI/status/1838642444365369814 and more

https://x.com/ai_for_success/status/1…

1 month, 2 weeks ago @ youtube.com
NVIDIA’s New AI: The King Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The papers are available here:

https://research.nvidia.com/labs/par/maskedmimic/

https://3dtopia.github.io/3DTopia-XL/

https://kovenyu.com/wonderworld/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biew…

1 month, 3 weeks ago @ youtube.com
OpenAI’s New ChatGPT: 7 Incredible Capabilities!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/paper Play the Tron game: https://agpallav.com/tron.html Sources:

https://www.youtube.com/watch?v=M9YOO7N5jF8

https://x.com/ammaar/status/1834312398016074083

https://x.com/MatthewBerman/status/1834295485773054312

https://x.com/slow_developer/status/1835685158248219106

https://x.com/mehran__jalali/status/1834299971828679079

https://x.com/MattVidPro/status/1836144228084117520

https://x.com/pallavmac/status/1836197280325488792

https://agpallav.com/tron.html 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickab…

1 month, 3 weeks ago @ youtube.com
NVIDIA’s Tech: Impossible Water Simulation!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/paper 📝 The papers are available here:

https://research.nvidia.com/labs/prl/shallow-water-simulation/

https://cs.uwaterloo.ca/~c2batty/papers/Takahashi2024/Takahashi2024.pdf

https://dl.acm.org/doi/abs/10.1145/3414685.3417859

https://cgl.ethz.ch/publications/papers/paperChen20a.php 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex…

2 months ago @ youtube.com
DataFest Video
last post 2 months, 2 weeks ago
Interview with Juergen Schmidhuber at Data Christmas 2020

02:00-05:38 What do you think were the most outstanding underestimated news and achievements in AI field in 2020?

05:41-11:28 What do you think about trends in ML like transformers trying to replace LSTMs in NLP?

11:29-16:06 Are you working on any new types of models right now?

16:07-20:41 What is your opinion on the most underestimated ML subfield like Reinforcement Learning?

20:42-22:17 Your best recommendation for our community is to look into AI in the real physical world, right?

22:18-33:10 Do you think it is possible to achieve great results in creative AI, particularly in subjective beauty?

33:17-35:50 What prevents chat bots from reaching more intelligent levels?

36:03-39:39 What is…

2 months, 2 weeks ago @ youtube.com
Mikita Shchutski | A small BERT towards Large Medical Models

Mikita Shchutski | Lead Machine Learning Engineer, Quantori Training large medical models using electronic health records in order to create a highly informative medical embedding space

2 months, 2 weeks ago @ youtube.com
Data Fest Online 2020 AI Hardware Track Premiere

DataFest Online 2020

AI Hardware track https://ods.ai/tracks/ai-hardware-df2020 Register and get access to the tracks: https://ods.ai/events/datafest2020

Join the community: https://ods.ai/

2 months, 2 weeks ago @ youtube.com
JetBrains Research Seminars
last post: none
Yandex. Computer Science
last post 2 days, 21 hours ago
How to write code faster #programming #coding #IT #code

This is a fragment of a talk by Viktor Ploshikhin, head of the ML lab at Yandex Platform Engineering. In his talk at Practical ML Conf 2024, Viktor explained how his team built an AI assistant for developers: how they fine-tuned models on real code, why they chose to predict whole statements, and which metrics and quality-evaluation methods they developed. The full talk is on our channel.

2 days, 21 hours ago @ youtube.com
RAG: balancing faithfulness and usefulness #programming #coding

This is a fragment of a talk by Katya Serazhim, head of the quality division at Yandex Search, and Alexey Gusakov, CTO of Yandex Search. At Practical ML Conf 2024, Katya covered the architecture and training process of LLMs for Yandex's large product projects and, together with Alexey, answered questions from the audience. The full talk is on our channel: it shows what is under the hood of Neuro and which challenges the team solved at each stage of building it.

1 week ago @ youtube.com
What matters to people who listen to audiobooks #audiobook #reading

This is a fragment of a talk by Stepan Komkov, senior developer on the speech synthesis team at Yandex Search. At Practical ML Conf 2024, Stepan shared the experience of building the virtual narrator in Yandex Books: why it was created and what users want, how to get the most out of the previous generation of technology and bring long context into a low-resource real-time model, and how GPT and diffusion models revolutionized speech synthesis. Watch the full talk on our channel.

1 week, 5 days назад @ youtube.com
The ideal voice for a virtual narrator #аудиокнига #чтение

This is an excerpt from a talk by Stepan Komkov, senior developer in the speech synthesis team at Yandex Search. At Practical ML Conf 2024, Stepan shared the experience of building the virtual narrator for Yandex Books: why it was created and what users want, how to get the most out of the previous generation of technology and bring long context into a low-resource real-time model, and how GPT and diffusion models have revolutionized speech synthesis. Watch the full talk on our channel.

2 weeks, 2 days ago @ youtube.com
Developing text-recognition skills in VLMs / Anton Klochkov

This is a talk from ML Party in Belgrade by Anton Klochkov, head of the text-recognition subgroup for VLMs. Anton described how visual language models are taught to recognize characters in images, and showed use cases where this is applied (beyond deciphering memes). More ML materials: https://t.me/yandexforml

2 weeks, 5 days ago @ youtube.com
How we teach Alisa to respond without her name / Dmitry Solodukha

This is a talk from ML Party in Belgrade by Dmitry Solodukha, head of the Alisa and smart devices team at Yandex. Dmitry described how his team teaches smart devices to activate on intonation and quick commands, so that users have to say "Alisa" less often. More ML materials: https://t.me/yandexforml

2 weeks, 5 days ago @ youtube.com
Restaurant ads in Yandex Eats: auctions, ranking, pricing / Ilya Irkhin

This is a talk from ML Party in Belgrade by Ilya Irkhin, head of analytics at Yandex Eats. Ilya showed how the problem of ranking partner restaurants is solved when promoting them in the service's listings. More ML materials: https://t.me/yandexforml

2 weeks, 6 days ago @ youtube.com
Transformers in Weather, or How to start forecasting millimeters of precipitation / Pyotr Vytovtov

This is a talk from ML Party in Belgrade by Pyotr Vytovtov, head of machine learning for the advertising platform. Pyotr explained how Yandex Weather forecasts precipitation down to the millimeter anywhere on the globe, and showed how precipitation intensity is measured with a bucket and a satellite. More ML materials: https://t.me/yandexforml

2 weeks, 6 days ago @ youtube.com
How imperfections make a virtual narrator's voice better #аудиокнига #programming

This is an excerpt from a talk by Stepan Komkov, senior developer in the speech synthesis team at Yandex Search. At Practical ML Conf 2024, Stepan shared the experience of building the virtual narrator for Yandex Books: why it was created and what users want, how to get the most out of the previous generation of technology and bring long context into a low-resource real-time model, and how GPT and diffusion models have revolutionized speech synthesis. Watch the full talk on our channel.

3 weeks ago @ youtube.com
How we trained the virtual narrator for Yandex Books #аудиокнига #книги

This is an excerpt from a talk by Stepan Komkov, senior developer in the speech synthesis team at Yandex Search. At Practical ML Conf 2024, Stepan shared the experience of building the virtual narrator for Yandex Books: why it was created and what users want, how to get the most out of the previous generation of technology and bring long context into a low-resource real-time model, and how GPT and diffusion models have revolutionized speech synthesis. Watch the full talk on our channel.

4 weeks ago @ youtube.com
Practical ML Conf. "Code" hall

A hardcore conference for experts: high-quality technical talks from key engineers, packed with practical knowledge about applying ML. We discuss how businesses can put ML to work today.

2 months, 1 week ago @ youtube.com
Practical ML Conf. "Data" hall

A hardcore conference for experts: high-quality technical talks from key engineers, packed with practical knowledge about applying ML. We discuss how businesses can put ML to work today.

2 months, 1 week ago @ youtube.com
ML Trainings
latest post 1 day ago
Anna Arinicheva | An interpretable model for detecting cognitive distortions in text

Speaker: Anna Arinicheva

Title: An interpretable model for detecting cognitive distortions in natural-language text. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

Track: https://ods.ai/tracks/sibfest5-student

_____

Our social media:

Telegram: https://t.me/datafest

VK: https://vk.com/datafest

Job openings channel on Telegram: https://t.me/odsjobs

Course updates channel: https://t.me/odscourses

How to join the ODS Mattermost community chat: https://ods.ai/tracks/mattermost

1 day ago @ youtube.com
Anton Kolonin | Interpretable natural language processing

Speaker: Anton Kolonin

Title: Interpretable natural language processing (fundamental results and applied cases). Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

Track: https://ods.ai/tracks/sibfest5-ml-applied

1 day ago @ youtube.com
Vladislav Negodin | Format following and hallucinations: benchmarks for YandexGPT

Speaker: Vladislav Negodin. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

NLP track: https://ods.ai/tracks/sibfest5-nlp

1 day ago @ youtube.com
Artem Vakhrushev | Developing a deep learning model for automated processing of XPS spectra

Speaker: Artem Vakhrushev. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

Track: https://ods.ai/tracks/sibfest5-student

2 days ago @ youtube.com
Vlad Goloshchapov | Smart unstructured pruning and the limits of transformer compressibility

Speaker: Vlad Goloshchapov. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

Applied ML track: https://ods.ai/tracks/sibfest5-ml-applied

2 days ago @ youtube.com
Vlad Goloshchapov | What grokking is not

Speaker: Vlad Goloshchapov

Title: What grokking is not: a demo of the open-source visualization library in sight. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

Applied ML track: https://ods.ai/tracks/sibfest5-ml-applied

2 days ago @ youtube.com
Dmitry Piletsky | In IDE Code Retriever

Speaker: Dmitry Piletsky. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

NLP track: https://ods.ai/tracks/sibfest5-nlp

2 days ago @ youtube.com
Alexander Mamylov | Cutting out the routine: how an AI assistant automates content generation

Speaker: Alexander Mamylov

Title: Cutting out the routine: how an AI assistant automates the generation of expert content for SIEM. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

NLP track: https://ods.ai/tracks/sibfest5-nlp

2 days ago @ youtube.com
Anna Latushko, Julia Mazine | How to review all of Wikipedia without losing your mind?

Speakers: Anna Latushko, Julia Mazine

Title: How to review all of Wikipedia without losing your mind? The story of building the wiki-synonyms library. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

NLP track: https://ods.ai/tracks/sibfest5-nlp

2 days, 22 hours ago @ youtube.com
Pavel Veshkin | How OpenAI can help if you use an open-source LLM

Speaker: Pavel Veshkin. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

NLP track: https://ods.ai/tracks/sibfest5-nlp

2 days, 22 hours ago @ youtube.com
Viktor Panshin | On building a SOTA approach to a well-known medical problem

Speaker: Viktor Panshin. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

CV track: https://ods.ai/tracks/sibfest5-cv

2 days, 23 hours ago @ youtube.com
Marina Rumenskikh | Applying machine learning to the study of extrasolar worlds

Speaker: Marina Rumenskikh. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

Track: https://ods.ai/tracks/sibfest5-ml-applied

6 days ago @ youtube.com
Ilya Pakharukov | Forecasting LTV from player behavior patterns

Speaker: Ilya Pakharukov

Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

ML and Business track: https://ods.ai/tracks/sibfest5-ml-business

6 days ago @ youtube.com
Elena Lebedeva | LLMs vs. human assessors: how to get high-quality text data annotation

Speaker: Elena Lebedeva. Data Fest Siberia 5: https://ods.ai/events/datafestsiberia5

NLP track: https://ods.ai/tracks/sibfest5-nlp

6 days ago @ youtube.com
DS Career, early game edition, teaser | Data Fest 2024

4 weeks, 1 day ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
latest post 15 hours ago
#453 – Javier Milei: President of Argentina – Freedom, Economics, and Corruption

Javier Milei is the President of Argentina.

This episode is available in both English and Spanish.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep453-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to http://netsuite.com/lex. BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex. AG1: All-in-one daily nutrition drinks.

15 hours ago @ lexfridman.com
#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

Dario Amodei is the CEO of Anthropic, the company that created Claude.

Amanda Askell is an AI researcher working on Claude’s character and personality.

Chris Olah is an AI researcher working on mechanistic interpretability.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep452-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

(3:49:02) – Character training (3:50:01) – Nature of truth (3:54:38) – Optimal rate of failure (4:01:49) – AI consciousness (4:16:20) – AGI (4:24:58) – Chris Olah – Mechanistic Interpretability (4:29:49) – Features, Circuits, Universality (4:47:23) – Superposition (4:58:22) – Monosemanticity (5:05…

1 week, 2 days ago @ lexfridman.com
#451 – Rick Spence: CIA, KGB, Illuminati, Secret Societies, Cults & Conspiracies

Rick Spence is a historian specializing in the history of intelligence agencies, espionage, secret societies, conspiracies, the occult, and military history.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep451-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to http://netsuite.com/lex. BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex. MasterClass: Online classes from world-class experts.

Go to https://masterclass.com/lexpod. Shopify: Sell stuff online.

3 weeks ago @ lexfridman.com
#450 – Bernie Sanders Interview

Bernie Sanders is a US Senator from Vermont and a two-time presidential candidate.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep450-sc. See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://eightsleep.com/lex. Saily: An eSIM for international travel.

Go to https://saily.com/lex. Ground News: Unbiased news source.

Go to https://drinkLMNT.com/lex. OUTLINE: (00:00) – Introduction (08:51) – MLK Jr (11:43) – Corruption in politics (23:00) – Healthcare in US (31:33) – 2016 election (37:32) – Barack Obama (43:26) – Capitalism (51:35) – Response to attacks (56:32) – AOC and progressive politics (1:04:24) – Mortality (1:06:30) – Hope fo…

4 weeks ago @ lexfridman.com
#449 – Graham Hancock: Lost Civilization of the Ice Age & Ancient Human History

Graham Hancock is a journalist and author who for over 30 years has explored the controversial possibility that there existed a lost civilization during the last Ice Age, and that it was destroyed in a global cataclysm some 12,000 years ago.

He is the presenter of the Netflix documentary series “Ancient Apocalypse”, the 2nd season of which has just been released.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep449-sc. See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://notion.com/lex. Riverside: Platform for recording podcasts and videos from everywhere.

Go to https://creators.riverside.fm/LEX. LMNT: Zero-sugar electro…

1 month ago @ lexfridman.com
#448 – Jordan Peterson: Nietzsche, Hitler, God, Psychopathy, Suffering & Meaning

Jordan Peterson is a psychologist, author, lecturer, and podcast host.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep448-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://eightsleep.com/lex. Ground News: Unbiased news source.

Go to https://betterhelp.com/lex. LMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lex. OUTLINE: (00:00) – Introduction (07:07) – Nietzsche (14:48) – Power and propaganda (19:54) – Nazism (24:54) – Religion (41:18) – Communism (47:03) – Hero myth (49:12) – Belief in God (59:24) – Advice for young people (1:12:02) – Sex (1:32:00) – Good and evil (1:44:46) – Psychopathy (1…

1 month, 1 week ago @ lexfridman.com
#447 – Cursor Team: Future of Programming with AI

Aman Sanger, Arvid Lunnemark, Michael Truell, and Sualeh Asif are creators of Cursor, a popular code editor that specializes in AI-assisted programming.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep447-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://encord.com/lex. MasterClass: Online classes from world-class experts.

Go to https://masterclass.com/lexpod. Shopify: Sell stuff online.

Go to http://netsuite.com/lex. AG1: All-in-one daily nutrition drinks.

1 month, 2 weeks ago @ lexfridman.com
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Ed Barnhart is an archaeologist and explorer specializing in ancient civilizations of the Americas.

He is the Director of the Maya Exploration Center, host of the ArchaeoEd Podcast, and lecturer on the ancient history of North, Central, and South America.

Ed is in part known for his groundbreaking work on ancient astronomy, mathematics, and calendar systems.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep446-sc. See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://drinkag1.com/lex. Notion: Note-taking and team collaboration.

1 month, 3 weeks ago @ lexfridman.com
#445 – Vivek Ramaswamy: Trump, Conservatism, Nationalism, Immigration, and War

Vivek Ramaswamy is a conservative politician, entrepreneur, and author of many books on politics, including his latest titled Truths: The Future of America First.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep445-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to http://netsuite.com/lex. Ground News: Unbiased news source.

Go to https://eightsleep.com/lex. OUTLINE: (00:00) – Introduction (12:50) – Conservatism (16:06) – Progressivism (21:41) – DEI (26:33) – Bureaucracy (33:25) – Government efficiency (48:34) – Education (1:02:59) – Military Industrial Complex (1:25:18) – Illegal immigration (1:46:53) – Donald Trump (…

1 month, 3 weeks ago @ lexfridman.com
#444 – Vejas Liulevicius: Communism, Marxism, Nazism, Stalin, Mao, and Hitler

Vejas Liulevicius is a historian specializing in Germany and Eastern Europe, who has lectured extensively on Marxism and the rise, the reign, and the fall of Communism.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep444-sc. See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://drinkag1.com/lex. BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex. Notion: Note-taking and team collaboration.

Go to https://notion.com/lex. LMNT: Zero-sugar electrolyte drink mix.

2 months ago @ lexfridman.com
#443 – Gregory Aldrete: The Roman Empire – Rise and Fall of Ancient Rome

Gregory Aldrete is a historian specializing in ancient Rome and military history.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep443-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://drinkLMNT.com/lex. Shopify: Sell stuff online.

Go to https://shopify.com/lex. AG1: All-in-one daily nutrition drinks.

Go to https://drinkag1.com/lex. BetterHelp: Online therapy and counseling.

2 months, 1 week ago @ lexfridman.com
#442 – Donald Trump Interview

Donald Trump is the 45th President of the United States and the Republican candidate in the 2024 US Presidential Election.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep442-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://ground.news/lex. Encord: AI tooling for annotation & data management.

Go to https://encord.com/lex. Eight Sleep: Temp-controlled smart mattress.

Go to https://eightsleep.com/lex. NetSuite: Business management software.

2 months, 2 weeks ago @ lexfridman.com
#441 – Cenk Uygur: Trump vs Harris, Progressive Politics, Communism & Capitalism

Cenk Uygur is a progressive political commentator and host of The Young Turks.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep441-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://policygenius.com/lex. AG1: All-in-one daily nutrition drinks.

Go to https://drinkag1.com/lex. MasterClass: Online classes from world-class experts.

Go to https://masterclass.com/lexpod. LMNT: Zero-sugar electrolyte drink mix.

2 months, 3 weeks ago @ lexfridman.com
#440 – Pieter Levels: Programming, Viral AI Startups, and Digital Nomad Life

Pieter Levels (aka levelsio on X) is a self-taught developer and entrepreneur who has designed, programmed, and launched over 40 startups, many of which are highly successful.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep440-sc. See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc.

Go to https://shopify.com/lex. Motific: Generative AI deployment.

Go to https://drinkag1.com/lex. MasterClass: Online classes from world-class experts.

Go to https://masterclass.com/lexpod. BetterHelp: Online therapy and counseling.

3 months ago @ lexfridman.com
#439 – Craig Jones: Jiu Jitsu, $2 Million Prize, CJI, ADCC, Ukraine & Trolling

Craig Jones is a legendary jiu jitsu personality, competitor, co-founder of B-Team, and organizer of the CJI tournament that offers over $2 million in prize money.

Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep439-sc. See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

Go to https://eightsleep.com/lex. LMNT: Zero-sugar electrolyte drink mix.

Go to https://drinkLMNT.com/lex. BetterHelp: Online therapy and counseling.

Go to https://betterhelp.com/lex. NetSuite: Business management software.

3 months, 1 week ago @ lexfridman.com
Microsoft Research Podcast
latest post 1 day, 17 hours ago
Ideas: The journey to DNA data storage

Joining me today to discuss the state of DNA data storage and some of our contributions are several members of the DNA Data Storage Project at Microsoft Research: Principal Researcher Bichlien Nguyen, Senior Researcher Jake Smith, and Partner Research Manager Sergey Yekhanin.

Once we do this, we return to our original data and we’ve completed, let’s call it, one DNA data storage cycle.

So, like, I mean, coding is an important aspect of this whole idea of DNA data storage because we have to deal with errors—it’s a new medium—but talking about error-correcting codes in the context of DNA data storage, so, I mean, usually, like … what are error-correcting codes about?

In DNA data storage, the …

1 day, 17 hours ago @ microsoft.com
Abstracts: November 14, 2024


6 days, 16 hours ago @ microsoft.com
Collaborators: Prompt engineering with Siddharth Suri and David Holtz

Researcher Siddharth Suri and professor David Holtz give a brief history of prompt engineering, discuss the debate behind their recent collaboration, and share what they found from studying how people’s approaches to prompting change as models advance.

Learn more:

1 week, 2 days ago @ blubrry.com
Abstracts: November 5, 2024

Chris and Jay, thank you for joining us today for Abstracts and congratulations!

HAWBLITZEL: Yeah, so I think, you know, traditionally verification or this formal software verification that we’re doing has been considered a little bit of a pie-in-the-sky research agenda.

They’ve discovered that you can get the high performance that you want for systems code without having to sacrifice the ability to reason about ownership and lifetimes, concurrency.

TINGLE: Well, finally, Chris, what are some of the open questions or future opportunities for formal software verification research, and what might you and your collaborators tackle next?

I’m Amber Tingle, and we hope you’ll join us again for Ab…

2 weeks, 1 day ago @ microsoft.com
Abstracts: November 4, 2024

And at that time, large language model was really in its infancy and people just started exploring what large language model can help us in terms of improving software reliability.

But of course, you know, large language model is a field that is moving so fast.

And that’s one of the reasons we used large language models, because traditional static analysis or traditional program analysis cannot capture this.

I’m actually working on how to leverage large language model to verify the correctness of code, code that may be generated by large language model itself.

STOICA: So we’re thinking of, as Shan mentioned, exploring what large language models can do in this bug-finding/testing arena furth…

2 weeks, 2 days ago @ microsoft.com
Intern Insights: Vaishnavi Ranganathan with Angela Busheska

Today, I’m speaking with my intern, Angela Busheska, about her work this summer and her experience as a Microsoft Research intern.

And I think I’m really grateful that we took this direction because we had a chance to understand at a granular level what is actually happening.

RANGANATHAN: And I think we realized this and you can chime in, Angela, but I feel that the traceability is a vehicle for data.

BUSHESKA: Yeah, so this will be a fall really spent with a lot of applications because of, like, graduate school.

As all the listeners can probably see, I’m a person who really speaks a lot, even more than needed sometimes.

3 weeks, 6 days ago @ microsoft.com
Abstracts: September 30, 2024


1 month, 3 weeks ago @ microsoft.com
Collaborators: Silica in space with Richard Black and Dexter Greene

[MUSIC FADES] Today I’m talking to Dr. Richard Black, a senior principal research manager and the research director of Project Silica at Microsoft Research.

Richard and Dexter are involved in a unique multidisciplinary, multi-institutional, and multigenerational collaboration called Avenues Golden Record, a current effort to communicate with extraterrestrial intelligence.

So the project we’re talking about today is Avenues Golden Record, but it’s not the first Golden Record to exist.

GREENE: And I guess moving on to future attempts … so what we’re doing, my team, is we’re working on creating an updated Golden Record.

HUIZINGA: Yeah, yeah.

2 months, 2 weeks ago @ microsoft.com
What’s Your Story: Lex Story

Model maker and fabricator Lex Story helps bring research to life through prototyping.

He discusses his take on failure; the encouragement and advice that has supported his pursuit of art and science; and the sabbatical that might inspire his next career move.

Learn more:

3 months ago @ blubrry.com
Abstracts: August 15, 2024

In this episode, Microsoft Product Manager Shrey Jain and OpenAI Research Scientist Zoë Hitzig join host Amber Tingle to discuss “Personhood credentials: Artificial intelligence and the value of privacy-preserving tools to distinguish who is real online.” In their paper, Jain, Hitzig, and their coauthors describe how malicious actors can draw on increasingly advanced AI tools to carry out deception, making online deception harder to detect and more harmful.

Bringing ideas from cryptography into AI policy conversations, they identify a possible mitigation: a credential that allows its holder to prove they’re a person––not a bot––without sharing any identifying information.

This exploratory r…

3 months, 1 week ago @ blubrry.com
Collaborators: AI and the economy with Brendan Lucier and Mert Demirer

Brendan and Mert are exploring the economic impact of job automation and generative AI as part of Microsoft’s AI, Cognition, and the Economy, or AICE, research initiative.

And more recently, my research focuses on AI, both, like, the adoption of AI and the productivity impact of AI.

HUIZINGA: Right, right.

Like, we need to predict what will be the effect of AI, how AI is going to affect the economy, especially in the long run.

HUIZINGA: Right, right.

3 months, 2 weeks ago @ microsoft.com
What’s Your Story: Emre Kiciman

Emre Kiciman shares how some keen observations and a desire to have front-end impact led him to make the jump from systems and networking to computational social science and now causal analysis and large-scale AI—and how systems thinking still impacts his work.

Learn more:

3 months, 3 weeks ago @ blubrry.com
Abstracts: July 29, 2024

ZHANG: OK, so this paper is about how to effectively extend the context window of large language models beyond 2 million tokens.

With our method, we can push LLM context window to over 2 million tokens.

For example, pretraining with an efficient model architecture, using RAG (retrieval-augmented generation), and extending the context window with RoPE positional interpolation.

RoPE positional interpolation solves this by downscaling positional embeddings to fit within the pretrained range.
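
The downscaling idea behind RoPE positional interpolation can be sketched in a few lines. This is an illustrative toy, not the method from the paper being discussed; the function name `rope_angles` and the lengths used are invented for the example.

```python
import math

def rope_angles(pos, dim=8, base=10000.0, scale=1.0):
    """Rotary-embedding angles for one position.

    With `scale < 1`, the position index is downscaled so that positions
    beyond the pretrained context length map back into the range the
    model was trained on (positional interpolation)."""
    p = pos * scale  # downscale the position index
    return [p / (base ** (2 * i / dim)) for i in range(dim // 2)]

# Toy setup: a model pretrained on 4k tokens, queried at position 16,000.
pretrained_len, target_len = 4096, 16384
scale = pretrained_len / target_len  # 0.25
angles = rope_angles(16000, scale=scale)
# The effective position (16000 * 0.25 = 4000) lies inside the pretrained range.
```

The key point is that no new positions are extrapolated; existing positions are packed more densely, which is why some short-context performance can degrade, as the episode notes.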

Although we have managed to recover some of the short performance for long-context LLM, it has not recovered 100 percent.

3 months, 3 weeks ago @ microsoft.com
Abstracts: July 18, 2024

Senior Researcher Arindam Mitra introduces AgentInstruct.

Using raw data sources, the automated multi-agent framework can create diverse, high-quality synthetic data at scale for the post-training of small and large language models.

Microsoft Research Podcast, by researchers across the Microsoft research community (Redmond, WA): an ongoing series of conversations bringing you right up to the cutting edge of Microsoft Research.

4 months ago @ blubrry.com
Collaborators: Sustainable electronics with Jake Smith and Aniruddh Vashisth

Printed circuit boards are abundant—in the stuff we use and in landfills.

Researcher Jake Smith and professor Aniruddh Vashisth discuss the development of vitrimer-based PCBs that perform comparably to traditional PCBs but have less environmental impact.

Learn more:

4 months, 1 week ago @ blubrry.com
Data Skeptic
last post 2 days, 16 hours ago
Lessons from eGamer Networks

Alex Bisberg, a PhD candidate at the University of Southern California, specializes in network science and game analytics, with a focus on understanding social and competitive success in multiplayer online games. In this episode, listeners can expect to learn, from a network perspective, about players’ interactions and patterns of behavior. Through his research on games, Alex sheds light on how network analysis and statistical tests might explain positive contagious behaviors, such as generosity, and explore the dynamics of collaboration and competition in gaming environments. These insights offer valuable lessons not only for game developers in enhancing player experience, engagement and rete…

2 days, 16 hours ago @ dataskeptic.com
Github Collaboration Network

In this episode we discuss the GitHub Collaboration Network with Behnaz Moradi-Jamei, assistant professor at James Madison University. As a network scientist, Behnaz created and analyzed a network of about 700,000 contributors to GitHub repositories. The network of collaborators on GitHub was created by identifying developers (nodes) and linking them with edges based on shared contributions to the same repositories. This means that if two developers contributed to the same project, an edge (connection) was formed between them, representing a collaborative relationship network consisting of 32 million such connections. By using algorithms for Community Detection, Behnaz's analysis reveals in…
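
The construction described here (developers as nodes, shared repositories inducing edges) can be sketched with a toy, stdlib-only example; the repository names and contributors below are invented for illustration:

```python
from itertools import combinations
from collections import defaultdict

# Toy contribution log: repository -> set of contributors (invented data).
repos = {
    "repo-a": {"alice", "bob", "carol"},
    "repo-b": {"bob", "dave"},
    "repo-c": {"alice", "bob"},
}

# Link two developers whenever they contributed to the same repository;
# the edge weight counts how many repositories they share.
edges = defaultdict(int)  # (dev1, dev2) -> number of shared repos
for contributors in repos.values():
    for u, v in combinations(sorted(contributors), 2):
        edges[(u, v)] += 1

print(edges[("alice", "bob")])  # alice and bob share repo-a and repo-c -> 2
```

At the scale discussed in the episode (700,000 nodes, 32 million edges), the same co-contribution idea is applied, but with sparse graph libraries rather than plain dictionaries.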

1 week, 2 days ago @ dataskeptic.com
Graphs and ML for Robotics

We are joined by Abhishek Paudel, a PhD Student at George Mason University with a research focus on robotics, machine learning, and planning under uncertainty, using graph-based methods to enhance robot behavior. He explains how graph-based approaches can model environments, capture spatial relationships, and provide a framework for integrating multiple levels of planning and decision-making.

2 weeks, 2 days ago @ dataskeptic.com
Graphs for HPC and LLMs

We are joined by Maciej Besta, a senior researcher of sparse graph computations and large language models at the Scalable Parallel Computing Lab (SPCL). In this episode, we explore the intersection of graph theory and high-performance computing (HPC), Graph Neural Networks (GNNs) and LLMs.

3 weeks, 1 day ago @ dataskeptic.com
Graph Databases and AI

In this episode, we sit down with Yuanyuan Tian, a principal scientist manager at Microsoft Gray Systems Lab, to discuss the evolving role of graph databases in various industries such as fraud detection in finance and insurance, security, healthcare, and supply chain optimization.

1 month ago @ dataskeptic.com
Network Analysis in Practice

Our new season "Graphs and Networks" begins here! We are joined by new co-host Asaf Shapira, a network analysis consultant and the podcaster of NETfrix – the network science podcast. Kyle and Asaf discuss ideas to cover in the season and explore Asaf's work in the field.

1 month, 1 week ago @ dataskeptic.com
Animal Intelligence Final Exam

Join us for our capstone episode on the Animal Intelligence season. We recap what we loved, what we learned, and things we wish we had gotten to spend more time on. This is a great episode to see how the podcast is produced. Now that the season is ending, our current co-host, Becky, is moving to emeritus status. In this last installment we got to spend a little more time getting to know Becky and where her work will take her after this. Did Data Skeptic inspire her to learn more about machine learning? Tune in and find out.

1 month, 2 weeks ago @ dataskeptic.com
Process Mining with LLMs

David Obembe, a recent University of Tartu graduate, discussed his Masters thesis on integrating LLMs with process mining tools. He explained how process mining uses event logs to create maps that identify inefficiencies in business processes. David shared his research on LLMs' potential to enhance process mining, including experiments evaluating their performance and future improvements using Retrieval Augmented Generation (RAG).
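
The map-building step described here typically starts from a "directly-follows" graph over the event log; a minimal sketch with invented events (the activities and case IDs are hypothetical):

```python
from collections import defaultdict

# Toy event log: (case_id, activity), already ordered by timestamp per case.
log = [
    (1, "receive"), (1, "check"), (1, "approve"),
    (2, "receive"), (2, "check"), (2, "reject"),
    (3, "receive"), (3, "check"), (3, "check"), (3, "approve"),
]

# Group activities into one trace per case.
by_case = defaultdict(list)
for case, activity in log:
    by_case[case].append(activity)

# Count how often activity a is directly followed by activity b.
dfg = defaultdict(int)
for trace in by_case.values():
    for a, b in zip(trace, trace[1:]):
        dfg[(a, b)] += 1

# A self-loop like ("check", "check") can flag rework -- one of the
# inefficiencies a process map makes visible.
```

An LLM layered on top of such a tool, as in the research discussed, would consume these counts and narrate or query them, not replace the counting itself.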

1 month, 3 weeks ago @ dataskeptic.com
open-animal-tracks

Our guest today is Risa Shinoda, a PhD student at the Kyoto University Agricultural Systems Engineering Lab, where she applies computer vision techniques. She talked about the OpenAnimalTracks dataset and what it was used for. The dataset helps researchers classify animal footprints. She also discussed how she built a model for classifying animal tracks, the algorithms used, the accuracy they achieved, and further opportunities to improve the model.

2 months ago @ dataskeptic.com
Bird Distribution Modeling with Satbird

This episode features an interview with Mélisande Teng, a PhD candidate at Université de Montréal. Her research lies in the intersection of remote sensing and computer vision for biodiversity monitoring.

2 months, 1 week ago @ dataskeptic.com
Ant Encounters

In this interview with author Deborah Gordon, Kyle asks questions about the mechanisms at work in an ant colony and what ants might teach us about how to build artificial intelligence. Ants are surprisingly adaptive creatures whose behavior emerges from their complex interactions. Aspects of network theory and the statistical nature of ant behavior are just some of the interesting details you'll get in this episode.

2 months, 3 weeks ago @ dataskeptic.com
Computing Toolbox

This season it’s become clear that computing skills are vital for working in the natural sciences. In this episode, we were fortunate to speak with Madlen Wilmes, co-author of the book "Computing Skills for Biologists: A Toolbox". We discussed the book and why it’s a great resource for students and teachers. In addition to the book, Madlen shared her experience and advice on transitioning from academia to an industry career and how data analytic skills transfer to jobs that young professionals might not always consider. Join us and learn more about the book and careers using transferable skills.

3 months ago @ dataskeptic.com
Biodiversity Monitoring

In this episode, we talked shop with Hager Radi about her biodiversity monitoring work. While biodiversity modeling may sound simple (count organisms and mark their locations), there is a lot more to it than that! Incomplete and biased data can make estimation hard. There are also many species with very few observations in the wild. Using machine learning and remote sensing data, scientists can build models that predict species distributions even with limited data. Listen in and hear about Hager’s work tackling these challenges and the tools she has built.

3 months, 1 week ago @ dataskeptic.com
Hacking the Colony

Today, Ashay Aswale and Tony Lopez shared their work on swarm robotics and what they have learned from ants. Robotic swarms must solve the same problems that eusocial insects do. What if your pheromone trail goes cold? What if you’re getting bad information from a bad-actor within the swarm? Answering these questions can help tackle serious robotic challenges. For example, a swarm of robots can lose a few members to accidents and malfunctions, but a large robot cannot. Additionally, a swarm could be host to many castes like an ant colony. Specialization with redundancy built in seems like a win-win! Tune in and hear more about this fascinating topic.

3 months, 2 weeks ago @ dataskeptic.com
SuperDataScience
last post 1 day, 19 hours ago
837: Career Success in the AI Era, with Deepali Vyas

Deepali is Senior Partner and Global Head of the Data, AI and Financial Technology Practice of Korn Ferry, one of the world's largest executive search firms.

00:04:31 At Korn Ferry, you lead executive search and leadership consulting for strategic AI, data, and analytics leaders across industries.

And so there are things that you're going to have to do to show where those soft skills are more transferable.

And so they really adopted it really, really well.

I'm going to lean on my industry friends and I'm going to bring this to the younger generation."

1 day, 19 hours ago @ superdatascience.com
836: How to Become Happier, with Dr. Nat Ware

Economist and social-impact innovator Dr. Nat Ware reveals how our expectations shape happiness and why chasing it often leaves us unfulfilled. He shares insights on the “hedonic treadmill” and the effects of constant comparison on our well-being. Find out how to build a more meaningful life by making memories, taking chances, and focusing on genuine connections. Additional materials: www.superdatascience.com/836 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

5 days, 19 hours ago @ podtrac.com
835: AI Systems as Productivity Engines, with You.com’s Bryan McCann

And now we're seeing this translate into B2B use cases for us, which are where it gets really, really exciting as well.

And then for years after, it was, oh, it was obvious to me that we needed to get really, really large context windows because you need to have longer context.

But an AI system can scale way, way, way, way more than us.

Bryan McCann: 01:26:43 Kind of, general topology, and just seeing the world in terms of stuff and functions.

Bryan McCann: 01:28:29 I'm most active right now on LinkedIn, so you can follow me and connect with me there.

1 week, 1 day ago @ superdatascience.com
834: In Case You Missed It in October 2024

This podcast is not available yet, please come back soon.

Meanwhile, we invite you to check out our podcasts: subscribe on the website, Apple Podcasts, Spotify, Stitcher Radio, or TuneIn.

1 week, 5 days ago @ superdatascience.com
833: The 10 Reasons AI Projects Fail, with Dr. Martin Goodson

It is really, really good at extracting characters.

That's really, really critical.

I don't think the industry has really, or the field has really, really figured this out.

It's an open-source model and is really, really interesting.

I've really, really enjoyed it.

2 weeks, 1 day ago @ superdatascience.com
832: The Anthropic CEO’s Techno-Utopia

(01:48): The aspect of Dario’s article that made the biggest splash was his explicitly defining the term “Powerful AI”.

Instead, Powerful AI has all the interfaces available to a human that’s working virtually – text, audio, video, mouse and keyboard control, and internet access.

And while the Powerful AI doesn't have a physical body, it can control existing tools, robots, and laboratory equipment through computer interfaces.

Potentially within as few as 5-10 years after developing this Powerful AI system, we could see transformative changes across multiple domains.

(00:02): This is Episode #832 on the Anthropic CEO’s “Powerful AI” Utopia.

2 weeks, 5 days ago @ superdatascience.com
831: PyTorch Lightning, Lit-Serve and Lightning Studios, with Dr. Luca Antiga

Jon Krohn: 00:09:57 So you are the CTO of Lightning AI, the makers of AI Studio, PyTorch Lightning, and many more open source products.

So PyTorch Lightning was released, was born many years before, created by William Falcon, the CEO and founder of Lightning AI.

And incidentally, I got to know PyTorch Lightning before I got to know William because my company in Italy, Orobix landed on PyTorch Lightning to standardize their training code.

And in fact, PyTorch Lightning became one of the ways PyTorch lands in organizations because PyTorch Lightning doesn't wrap PyTorch.

To recap, in today's episode, Dr. Luca Antiga filled us in on how Lightning AI offers tools like PyTorch Lightning, Lightning…

3 weeks, 1 day ago @ superdatascience.com
830: The “A.I.” Nobel Prizes (in Physics and Chemistry??)

Geoffrey Hinton and Sir Demis Hassabis: A Nobel Prize is an achievement of the highest order, awarding physicists, chemists, physiologists, medical practitioners, writers, pacifists, and economists perhaps the greatest honor in their respective fields. In this week’s Five-Minute Friday, Jon Krohn discusses how two AI pioneers came to win prizes in chemistry and physics. Additional materials: www.superdatascience.com/830 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

3 weeks, 5 days ago @ podtrac.com
829: Neuroscience Fueled by ML, with Prof. Bradley Voytek

Neuroscientist Bradley Voytek outlines to Jon Krohn the incredible use of data science and machine learning in his research and how recent discoveries about action potentials and neurons have propelled the field to a new understanding of the brain and its functions. You’ll also hear what Bradley thinks is most important when hiring data scientists and his contributions to Uber’s algorithm when it was still a startup. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this e…

4 weeks, 1 day ago @ podtrac.com
828: Are “Citizen Data Scientists” A Myth? With Keith McCormick

The citizen data scientist: Fact or fiction? Jon Krohn holds a conversation across episodes in this Five-Minute Friday, with today’s guest Keith McCormick, in part responding to Nick Elprin’s interview in episode 811: Scaling Data Teams Effectively. Additional materials: www.superdatascience.com/828 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month ago @ podtrac.com
827: Polars: Past, Present and Future, with Polars Creator Ritchie Vink

Ritchie Vink, CEO and Co-Founder of Polars, Inc., speaks to Jon Krohn about the new achievements of Polars, an open-source library for data manipulation. This is the episode for any data scientist on the fence about using Polars, as it explains how Polars managed to make such improvements, the APIs and integration libraries that make it so versatile, and what’s next for this efficient library. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. …

1 month ago @ podtrac.com
826: In Case You Missed It in September 2024

Next-gen IDEs, efficiency-boosting open-source Python libraries, and changes in hiring for data scientists: This episode of In Case You Missed It gives you our best clips of September’s interviews, hosted by Jon Krohn. Additional materials: www.superdatascience.com/826 Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.

1 month, 1 week ago @ podtrac.com
825: Data Contracts: The Key to Data Quality, with Chad Sanderson

Data contracts are redefining data quality and governance, and Chad Sanderson, CEO of Gable.ai, joins host Jon Krohn to explain how they can transform your data strategy. He breaks down what data contracts are, how they shift data quality checks closer to production, and why they’re essential for reducing data debt. Chad also highlights how better alignment between data producers and consumers can elevate data reliability and tackle change-management challenges in modern organizations. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie…
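
At its simplest, a data contract like those described here is a schema the producer validates against before publishing, moving quality checks toward production. A minimal stdlib sketch, with the contract fields and helper name invented for illustration:

```python
# Toy data contract: field -> expected type, enforced on the producer side
# (hypothetical fields; real contracts also cover semantics, SLAs, ownership).
CONTRACT = {"order_id": int, "amount": float, "currency": str}

def violates_contract(record: dict) -> list:
    """Return a list of contract violations for one record."""
    problems = []
    for field, typ in CONTRACT.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], typ):
            problems.append(f"bad type for {field}")
    return problems

good = {"order_id": 1, "amount": 9.99, "currency": "EUR"}
bad = {"order_id": "1", "amount": 9.99}
assert violates_contract(good) == []
# The bad record is caught before it ships, instead of becoming data debt
# discovered by downstream consumers.
```

The design point matches the episode's argument: the producer, who knows the data best, rejects bad records at the source rather than leaving consumers to clean up later.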

1 month, 1 week ago @ podtrac.com
824: Llama 3.2: Open-Source Edge and Multimodal LLMs

Llama 3.2 brings a new era of AI innovation with lightweight models tailored for on-device applications and powerful vision models for handling complex image inputs. Host Jon Krohn explores how this release pushes the boundaries of open-source AI, making it more accessible and versatile for developers. He also covers the Llama Stack toolkit, designed to streamline deployment, and Llama Guard 3, Meta’s latest content moderation solution. With extensive support from major cloud and hardware partners, Llama 3.2 is set to unlock groundbreaking possibilities for AI across mobile and beyond. Tune in to hear more. Additional materials: www.superdatascience.com/824 Interested in sponsoring a SuperD…

1 month, 2 weeks ago @ podtrac.com
823: Virtual Humans and AI Clones, with Natalie Monbiot

Virtual humans are rewriting the rules of digital communication and reshaping entire industries. This week, Jon Krohn welcomes Natalie Monbiot, Head of Strategy at Hour One, to shed light on how AI avatars are revolutionizing L&D and e-commerce by turning traditional training and product listings into captivating, presenter-led content. This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. In this episode you will learn:

• How do you create a virt…

1 month, 2 weeks ago @ podtrac.com
Data Science at Home
last post 1 week ago
AI vs. The Planet: The Energy Crisis Behind the Chatbot Boom (Ep. 271)

In this episode of Data Science at Home, we dive into the hidden costs of AI’s rapid growth — specifically, its massive energy consumption.

With tools like ChatGPT reaching 200 million weekly active users, the environmental impact of AI is becoming impossible to ignore.

Each query, every training session, and every breakthrough come with a price in kilowatt-hours, raising questions about AI’s sustainability.

Join us as we uncover the staggering figures behind AI’s energy demands and explore practical solutions for the future.

From efficiency-focused algorithms and specialized hardware to decentralized learning, this episode examines how we can balance AI’s advancements with our planet’s …

1 week ago @ datascienceathome.com
Love, Loss, and Algorithms: The Dangerous Realism of AI (Ep. 270)

Subscribe to our new channel: https://www.youtube.com/@DataScienceatHome

In this episode of Data Science at Home, we confront a tragic story highlighting the ethical and emotional complexities of AI technology.

This devastating event has sparked urgent discussions on the mental health risks, ethical responsibilities, and potential regulations surrounding AI chatbots, especially as they become increasingly lifelike.

🎙️ Topics Covered: AI & Emotional Attachment: How hyper-realistic AI chatbots can foster intense emotional bonds with users, especially vulnerable groups like adolescents.

Mental Health Risks: The potential for AI to unintentionally contribute to mental health issues, and the challe…

2 weeks ago @ datascienceathome.com
VC Advice Exposed: When Investors Don’t Know What They Want (Ep. 269)

Ever feel like VC advice is all over the place?

That’s because it is.

In this episode, I expose the madness behind the money and how to navigate their confusing advice!

Watch the video at https://youtu.be/IBrPFyRMG1Q and subscribe to our new YouTube channel: https://www.youtube.com/@DataScienceatHome

00:00 – Introduction
00:16 – The Wild World of VC Advice
02:01 – Grow Fast vs. Grow Slow
05:00 – Listen to Customers or Innovate Ahead
09:51 – Raise Big or Stay Lean?
14:20 – The Real VC Secret: Focus on Your Team and Vision
17:03 – Outro

3 weeks, 3 days ago @ datascienceathome.com
AI Says It Can Compress Better Than FLAC?! Hold My Entropy 🍿 (Ep. 268)

In this episode of Data Science at Home, Frag dives deep into the wild claims that Large Language Models (LLMs) like Chinchilla 70B are beating traditional lossless compression algorithms.

🧠💥But before you toss out your FLAC collection, let’s break down Shannon’s Source Coding Theorem and why entropy sets the ultimate limit on lossless compression.

We explore:
⚙️ How LLMs leverage probabilistic patterns for compression
📉 Why compression efficiency doesn’t equal general intelligence
🚀 The practical (and ridiculous) challenges of using AI for compression
💡 Can AI actually BREAK Shannon’s limit—or is it just an illusion?
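
Shannon's bound mentioned above can be checked numerically: the entropy of the source is a floor on the expected code length of any lossless code, LLM-based or otherwise. A stdlib sketch with a toy four-symbol source:

```python
import math

# Toy source distribution over four symbols (invented for the example).
p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

# Shannon entropy: the lower bound on expected bits per symbol
# for ANY lossless code.
entropy = -sum(q * math.log2(q) for q in p.values())  # 1.75 bits

# An optimal prefix code for this dyadic distribution hits the bound exactly:
code_len = {"a": 1, "b": 2, "c": 3, "d": 3}
avg_len = sum(p[s] * code_len[s] for s in p)  # 1.75 bits

# No compressor can beat `entropy` on average; a better model of the data
# only gets closer to the limit, it never breaks it.
```

This is why an LLM "beating FLAC" is a claim about better probability modeling of audio, not about escaping the source coding theorem.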

If you love AI, algorithms, or just enjoy some good old myth-busting, thi…

1 month ago @ datascienceathome.com
What Big Tech Isn’t Telling You About AI (Ep. 267)

Are AI giants really building trustworthy systems?

A groundbreaking transparency report by Stanford, MIT, and Princeton says no.

In this episode, we expose the shocking lack of transparency in AI development and how it impacts bias, safety, and trust in the technology.

We’ll break down Gary Marcus’s demands for more openness and what consumers should know about the AI products shaping their lives.

Check our new YouTube channel https://www.youtube.com/@DataScienceatHome and Subscribe!

1 month, 1 week ago @ datascienceathome.com
Money, Cryptocurrencies, and AI: Exploring the Future of Finance with Chris Skinner [RB] (Ep. 266)

We’re revisiting one of our most popular episodes from last year, where renowned financial expert Chris Skinner explores the future of money.

In this fascinating discussion, Skinner dives deep into cryptocurrencies, digital currencies, AI, and even the metaverse.

He touches on government regulations, the role of tech in finance, and what these innovations mean for humanity.

Now, one year later, we encourage you to listen again and reflect—how much has changed?

Are Chris Skinner’s predictions still holding up, or has the financial landscape evolved in unexpected ways?

1 month, 1 week ago @ datascienceathome.com
Kaggle Kommando’s Data Disco: Laughing our Way Through AI Trends (Ep. 265) [RB]

In this episode, join me and the Kaggle Grand Master, Konrad Banachewicz, for a hilarious journey into the zany world of data science trends.

From algorithm acrobatics to AI, creativity, Hollywood movies, and music, we just can’t get enough.

It’s a typical episode, with a dose of nerdy comedy you didn’t know you needed.

Buckle up, it’s a data disco, and we’re breaking down the binary!

Sponsors: Intrepid AI is an AI-assisted all-in-one platform for robotics teams.

1 month, 2 weeks ago @ datascienceathome.com
AI and Video Game Development: Navigating the Future Frontier (Ep. 264) [RB]

In this episode we delve into the dynamic realm of game development and the transformative role of artificial intelligence (AI).

Join Frag, Jim and Mike as they explore the current landscape of game development processes, from initial creative ideation to the integration of AI-driven solutions.

With Mike’s expertise as a software executive and avid game developer, we uncover the potential of AI to revolutionize game design, streamline development cycles, and enhance player experiences.

Sponsors: Intrepid AI (https://intrepid.ai) is an AI-assisted all-in-one platform for robotics teams.

Build robotics applications in minutes, not months.

1 month, 3 weeks ago @ datascienceathome.com
LLMs: Totally Not Making Stuff Up (they promise) (Ep. 263)

In this episode, we dive into the wild world of Large Language Models (LLMs) and their knack for… making things up.

Can they really generalize without throwing in some fictional facts?

Or is hallucination just part of their charm?

Let’s separate the genius from the guesswork in this insightful breakdown of AI’s creativity problem.

TL;DR: LLM generalisation without hallucinations.

1 month, 3 weeks ago @ datascienceathome.com
AI: The Bubble That Might Pop—What’s Next? (Ep. 262)

The hype around Generative AI is real, but is the bubble about to burst?

Join me as we dissect the recent downturn in AI investments and what it means for the tech giants like OpenAI and Nvidia.

Could this be the end of the AI gold rush, or just a bump in the road?

2 months, 2 weeks ago @ datascienceathome.com
Data Guardians: How Enterprises Can Master Privacy with MetaRouter (Ep. 261)

In this insightful episode, we dive deep into the pressing issue of data privacy, where 86% of U.S. consumers express growing concerns and 40% don’t trust companies to handle their data ethically.

Join us as we chat with the Vice President of Engineering at MetaRouter, a cutting-edge platform enabling enterprises to regain control over their customer data.

We explore how MetaRouter empowers businesses to manage data in a 1st-party context, ensuring ethical, compliant handling while navigating the complexities of privacy regulations.

Sponsors: Intrepid AI (https://intrepid.ai) is an AI-assisted all-in-one platform for robotics teams.

Build robotics applications in minutes, not months. Referenc…

3 months, 2 weeks ago @ datascienceathome.com
Low-Code Magic: Can It Transform Analytics? (Ep. 260)

Join us as David Marom, Head of Panoply Business, explores the benefits of all-in-one data platforms.

Learn how tech stack consolidation boosts efficiency, improves data accuracy, and cuts costs.

David shares insights on overcoming common challenges, enhancing data governance, and success stories from organizations thriving with Panoply.

4 months ago @ datascienceathome.com
Do you really know how GPUs work? (Ep. 259)

Join us in this exciting episode of the Data Science at Home podcast.

It’s all about GPUs.

We’ll take you on a journey through the inner workings of these powerful processors, explaining how they handle complex computations and drive everything from gaming graphics to scientific simulations.

Whether you’re a budding programmer or a tech enthusiast, understanding GPUs is key to unlocking new levels of performance and efficiency in your projects.

Tune in and get ready to turbocharge your tech knowledge!

5 months ago @ datascienceathome.com
Harnessing AI for Cybersecurity: Expert Tips from QFunction (Ep. 258)

In this episode, we sit down with Ryan Smith, Founder of QFunction LLC, to explore how AI and machine learning are revolutionizing cybersecurity.

With over 8 years of experience, including work at NASA’s Jet Propulsion Laboratory, Ryan shares insights on the future of threat detection and prevention, the challenges businesses face in maintaining effective cybersecurity, and the ethical considerations of AI implementation.

Learn about cost-effective strategies for small businesses, the importance of collaboration in combating cyber threats, and how QFunction tailors its AI solutions to meet diverse industry needs.

QFunction does cybersecurity differently.

By relying on scientific breakthroug…

5 months, 1 week ago @ datascienceathome.com
Rust in the Cosmos Part 4: What happens in space? (Ep. 257)

In this final episode of the “Rust in the Cosmos” series, we ask what actually happens in space: which projects are currently active, and what can we learn from those of the past?

What about Rust and space applications?

Build robotics applications in minutes, not months.

Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve.

Communities: Intrepid AI, AeroRust, Bytenook. Intrepid AI Discord: https://discord.gg/cSSzche6Ct — Intrepid AI website: https://intrepid.ai — AeroRust Discord invite: https://discord.com/invite/6jJyx5nEUq — AeroRust website: AeroRust.org. References

5 months, 3 weeks ago @ datascienceathome.com