Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
last post 1 hour ago
[D] Is there an equivalent BigDL project for NVIDIA GPUs, which allows distributing work loads across a DL cluster with spark?

1 hour ago @ reddit.com
[P] New Book: BUILD GPT: HOW AI WORKS

1 hour ago @ reddit.com
[D] What is the best TTS model for my case?

3 hours ago @ reddit.com
[D] Exploring Complex Number Representations for Word Vectors: A New Approach

8 hours ago @ reddit.com
[R] French GEC dataset

11 hours ago @ reddit.com
[D] tutorial on how to build streaming ML applications

13 hours ago @ reddit.com
[D] ML promising fields

13 hours ago @ reddit.com
[D] Why is R^2 so crazy?

14 hours ago @ reddit.com
[D] The only DS in the team

16 hours ago @ reddit.com
[Discussion] Preparing for new job - looking for resources

17 hours ago @ reddit.com
Recall Score Increase [D]

18 hours ago @ reddit.com
[D] Preserving spatial distribution of data during data splitting

18 hours ago @ reddit.com
[N] Snowflake releases open (Apache 2.0) 128x3B MoE model

19 hours ago @ reddit.com
[D] Why would such a simple sentence break an LLM?

19 hours ago @ reddit.com
[R] Speaker diarization

20 hours ago @ reddit.com
Towards Data Science
last post 4 hours ago
Data Science Project Management

A 5-step Project Management Framework for Data Science. This article is part of a larger series on Full Stack Data Science. In the previous post, I introduced the idea of a full-stack data scientist and the 4 hats it entails. In this article, I will discuss the first of these four hats — the project manager (PM). While there are countless ways to approach data science project management, here I propose one possible framework and the PM’s role in executing it. Data science projects often involve developing machine learning (ML) models to solve business problems. While this may seem commonplace in business today, it still comes with several risks. Namely, developin…

4 hours ago @ towardsdatascience.com
To Know Is Also to Remember

Understanding Long Short-Term Memory Networks. Continue reading on Towards Data Science »

5 hours ago @ towardsdatascience.com
How to Keep on Developing as a Data Scientist

A few practical tips on how to keep on learning during your daily job. Continue reading on Towards Data Science »

6 hours ago @ towardsdatascience.com
My First Steps into Mastering SAP’s Data Models

If you are a curious reader interested in learning more about SAP data models, you’ve come to the right place! Hello Medium readers! I’m excited to share some learnings from a recent project where I dived into the complexity of SAP’s data models. For confidentiality reasons, I’m unable to share all the project details 🤫. However, I’ll discuss a challenge I faced regarding the complexity of SAP data models: what does the SAP data architecture look like, and how does everything fit into a coherent data model that makes sense for business users? In this project, my primary focus was to integrate data into an analytics/mining platform using SAP business data on procurement processes. As I progress…

18 hours ago @ towardsdatascience.com
Uncertainty Quantification and Why You Should Care

How to improve your ML model with three lines of code. Turning a point prediction into a prediction region quantifies the model’s uncertainty and gives us more information (image by the author). Prediction models are trained to predict well and give us point forecasts. Let’s assume we want to buy a house. Before we do so, we want to verify that the advertised price of 400,000 € is reasonable. For this, we use a model that, based on the number of rooms, the size and the location of the house, predicts that the house is worth 500,232.12 €. Should we buy this house? It seems like a good deal, doesn’t it? But would our decision be different if the model instead had predicted a price of 340,021.34 €? …
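A minimal sketch of the idea in this excerpt, widening a point forecast into a prediction region via split conformal calibration; the housing data and the 90% coverage level below are invented for illustration, and the article's exact three-line recipe is not shown in the excerpt.

```python
import numpy as np

rng = np.random.default_rng(0)
size = rng.uniform(50, 200, size=500)                  # house size in m^2
price = 2500 * size + rng.normal(0, 20_000, size=500)  # noisy sale price in EUR

# Fit a point-prediction model (a least-squares line) on the training split.
X_tr, y_tr = size[:400], price[:400]
slope, intercept = np.polyfit(X_tr, y_tr, 1)
predict = lambda x: slope * x + intercept

# Absolute residuals on held-out calibration data give the interval half-width.
X_cal, y_cal = size[400:], price[400:]
q = np.quantile(np.abs(y_cal - predict(X_cal)), 0.9)   # ~90% coverage

point = predict(120.0)              # point forecast for a 120 m^2 house
interval = (point - q, point + q)   # a prediction region, not just a number
```

The width of `interval` is the extra information a bare point forecast hides: a wide region signals that an estimate quoted to the euro should not be taken at face value.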

18 hours ago @ towardsdatascience.com
Experimenting with MLFlow and Microsoft Fabric

Fabric Madness, part 4. Image by author and ChatGPT (“Design an illustration, with imagery representing data experiments, focusing on basketball data” prompt, ChatGPT 4, OpenAI, 15 April 2024, https://chat.openai.com). A huge thanks to Martim Chaves, who co-authored this post and developed the example scripts. It’s no secret that Machine Learning (ML) systems require careful tuning to become truly useful, and it would be an extremely rare occurrence for a model to work perfectly the first time it’s run! When first starting out on your ML journey, an easy trap to fall into is to try lots of different things to improve performance without recording these configurations along the way. This then make…
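The trap described here, trying many configurations without recording them, is what experiment trackers prevent. Below is a stdlib-only sketch of the concept; it mimics the idea behind MLflow-style logging, not the MLflow API itself, and all names are invented.

```python
import json
import os
import tempfile
import time

def log_run(log_dir, params, metrics):
    """Append one experiment run (config + results) as a JSON line."""
    record = {"time": time.time(), "params": params, "metrics": metrics}
    with open(os.path.join(log_dir, "runs.jsonl"), "a") as f:
        f.write(json.dumps(record) + "\n")

def best_run(log_dir, metric):
    """Return the logged run with the highest value of `metric`."""
    with open(os.path.join(log_dir, "runs.jsonl")) as f:
        runs = [json.loads(line) for line in f]
    return max(runs, key=lambda r: r["metrics"][metric])

log_dir = tempfile.mkdtemp()
log_run(log_dir, {"lr": 0.1, "depth": 3}, {"accuracy": 0.81})
log_run(log_dir, {"lr": 0.01, "depth": 5}, {"accuracy": 0.86})
print(best_run(log_dir, "accuracy")["params"])  # {'lr': 0.01, 'depth': 5}
```

Because every trial is persisted with its configuration, no result is ever "lost" between iterations, which is the core value proposition of a tracker like MLflow.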

18 hours ago @ towardsdatascience.com
physipy: make python unit-aware

Part 1: physipy brings meter and Joule to Python. Continue reading on Towards Data Science »

19 hours ago @ towardsdatascience.com
Why Deep Learning Models Run Faster on GPUs: A Brief Introduction to CUDA Programming

For those who want to understand what .to(“cuda”) does (image by the author with the assistance of AI, https://copilot.microsoft.com/images/create). Nowadays, when we talk about deep learning, it is very common to associate its implementation with utilizing GPUs in order to improve performance. GPUs (Graphical Processing Units) were originally designed to accelerate rendering of images and 2D and 3D graphics. However, due to their capability of performing many parallel operations, their utility extends beyond that to applications such as deep learning. The use of GPUs for deep learning models started around the mid to late 2000s and became very popular around 2012 with the emergence of AlexNet. Ale…
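In PyTorch terms, .to("cuda") copies a tensor's (or a module's parameters') storage from host RAM to GPU memory so that subsequent kernels launch on the device. A short sketch, with a CPU fallback so it also runs on machines without a GPU:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.randn(1024, 1024)   # allocated in host (CPU) memory
x = x.to(device)              # copied to GPU memory if one is present

model = torch.nn.Linear(1024, 10)
model.to(device)              # moves all parameters; in place for modules

y = model(x)                  # this matmul now runs on `device`
print(y.shape, y.device)
```

Inputs and parameters must live on the same device, which is why the usual pattern moves both the model and each batch before the forward pass.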

19 hours ago @ towardsdatascience.com
Python Meets Pawn 2: Clustering Chess Grandmasters based on their Openings

In this blog, I will guide you through the process of analyzing Chess Grandmasters’ openings using Python. Continue reading on Towards Data Science »

21 hours ago @ towardsdatascience.com
How to Detect Floods in Satellite Imagery, Case Study: Dubai Flooding

Detecting and monitoring flooded areas in satellite imagery with a simple classification approach. Continue reading on Towards Data Science »

1 day, 5 hours ago @ towardsdatascience.com
Organizing Python Functions in Utility Classes

Explore how utility classes provide enhanced namespaces for grouping related functions. Continue reading on Towards Data Science »

1 day, 5 hours ago @ towardsdatascience.com
Permutation Feature Importance from Scratch

Understanding the importance of permutations in the field of explainable AI. Continue reading on Towards Data Science »
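Permutation importance itself fits in a few lines: score the model, shuffle one feature column, and measure the drop. The data and model below are invented for illustration; the article's own implementation is not shown in the excerpt.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.1 * X[:, 2] > 0).astype(int)   # label driven mostly by feature 0

model = LogisticRegression().fit(X, y)
base = model.score(X, y)

# Shuffle one column at a time; the score drop is that feature's importance.
importance = []
for j in range(X.shape[1]):
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    importance.append(base - model.score(X_perm, y))

print(int(np.argmax(importance)))   # the dominant feature should rank first
```

Because it only needs predictions, this recipe is model-agnostic, which is why it is a staple of explainable AI toolkits.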

1 day, 6 hours ago @ towardsdatascience.com
Spatial Challenges in RCTs

Location, Location, Location. Continue reading on Towards Data Science »

1 day, 8 hours ago @ towardsdatascience.com
Introduction to Kaggle and Scoring Top 7% in the Titanic Competition

Get started with Kaggle and submit a (good) first solution. Continue reading on Towards Data Science »

1 day, 16 hours ago @ towardsdatascience.com
Speak, Don’t Type: Exploring Voice Interaction with LLMs [Part 1]

Augmenting LLM Apps with a Voice Modality. Many LLMs, particularly those that are open-source, have typically been limited to processing text or, occasionally, text with images (Large Multimodal Models, or LMMs). But what if you want to communicate with your LLM using your voice? Thanks to the advancement of powerful open-source speech-to-text technologies in recent years, this becomes achievable. We will go into the integration of Llama 3 with a speech-to-text model, all within a user-friendly interface. This fusion enables (near) real-time communication with an LLM through speech. Our exploration involves selecting Llama 3 8B as the LLM, using the Whisper speech…

1 day, 17 hours ago @ towardsdatascience.com
Distill.pub
last post: none
The Gradient
last post 4 days, 17 hours ago
Financial Market Applications of LLMs

Looked at from another angle, there is much more noise than signal in financial data.

Another financial market application of LLMs might be synthetic data creation [4,8].

Then precious real market data could be employed to fine-tune the predictions and determine precisely the optimal speed to trade.

Financial market practitioners are often interested in extreme events, the times when trading strategies are more likely to experience significant gains or losses.

Citation: For attribution in academic contexts or books, please cite this work as Richard Dewey and Ciamac Moallemi, "Financial Market Applications of LLMs," The Gradient, 2024.

4 days, 17 hours ago @ thegradient.pub
A Brief Overview of Gender Bias in AI

All of these terms (“AI”, “gender”, and “bias”) can be somewhat overused and ambiguous.

A Short History of Studying Gender Bias in AI: Here, I cover a very small sample of papers I’ve found influential in studying gender bias in AI.

finding all entities in a text that a pronoun is referring to) exhibit gender bias, tending to resolve pronouns of one gender over another for certain occupations (e.g.

This article mainly focused on gender bias — and particularly, on binary gender.

Acknowledgements: This post was originally posted on Art Fish Intelligence. Citation: For attribution in academic contexts or books, please cite this work as Yennie Jun, "Gender Bias in AI," The Gradient, 2024. @article{Jun2024bia…

2 weeks, 2 days ago @ thegradient.pub
Mamba Explained

As a general sequence model backbone, Mamba achieves state-of-the-art performance across several modalities such as language, audio, and genomics.

Here we’ll discuss: the advantages (and disadvantages) of Mamba (🐍) vs Transformers (🤖), analogies and intuitions for thinking about Mamba, and what Mamba means for interpretability, AI safety, and applications.

The Mamba Block: Like a Transformer made up of stacked transformer blocks, Mamba is made up of stacked Mamba blocks, as above.

The Mamba authors write, “the efficiency vs. effectiveness tradeoff of sequence models is characterised by how well they compress their state”.

Thanks to Gonçalo for reading an early draft, Jaden for the nnsight library …

4 weeks ago @ thegradient.pub
Car-GPT: Could LLMs finally make self-driving cars happen?

We've just seen 3 prominent families of LLM usage in self-driving cars: Perception, Planning, and Generation.

The first wave of papers mentioning LLMs in Self-Driving Cars is from mid-2023, so let's give it some time.

Next Steps: If you want to get started on LLMs for self-driving cars, there are several things you can do. ⚠️ Before this, the most important: if you want to keep learning about self-driving cars.

Author BioJérémy Cohen is a self-driving car engineer and founder of Think Autonomous, a platform to help engineers learn about cutting-edge technologies such as self-driving cars and advanced Computer Vision.

Citation: For attribution in academic contexts or books, please cite this work…

1 month, 2 weeks ago @ thegradient.pub
Do text embeddings perfectly encode text?

Beyond the requirements of semantic similarity, there are no constraints on what embedding must be assigned for a given text input.

What if someone hacks into the database and gains access to all your text embedding vectors – would this be bad?

From text to embeddings... back to text: The problem of recovering text from embeddings is exactly the scenario we tackle in our paper Text Embeddings Reveal (Almost) As Much As Text (EMNLP 2023).

Scaling and future work: The fact that text embeddings can be perfectly inverted raises many follow-up questions.

Citation: For attribution in academic contexts or books, please cite this work as Jack Morris, "Do text embeddings perfectly encode text?

1 month, 2 weeks ago @ thegradient.pub
Why Doesn’t My Model Work?

If your model latches on to these during training, it will appear to work well, but may not work on new data.

This happens when the model training pipeline has access to information it shouldn’t have access to, particularly information that confers an advantage to the model.

This means that knowledge of the test data is implicitly entering the model training pipeline, even if it is not explicitly used to train the model.

Well, this is a common thing to do, but if you’re developing a model iteratively and using the same test set to evaluate the model after each iteration, then you’re basically using that test set to guide the development of the model.
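One common remedy, consistent with the point above: carve out a validation set for iterative model selection and touch the test set exactly once, at the end. The data and split sizes here are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 5)), rng.integers(0, 2, size=1000)

# First carve off the final test set, then split the remainder.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.25, random_state=0)   # 0.25 * 0.8 = 0.2 overall

# Iterate on X_val as often as needed; evaluate on X_test exactly once.
print(len(X_train), len(X_val), len(X_test))  # 600 200 200
```

Keeping the test set out of every iteration loop is what prevents knowledge of it from leaking into model development.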

Citation: For attribution in academic cont…

2 months ago @ thegradient.pub
Deep learning for single-cell sequencing: a microscope to see the diversity of cells

Evolution of single-cell sequencing over time: Having explored the panorama of single-cell sequencing, let us now delve into the role of deep learning in the context of single-cell sequencing.

Deep Learning on single-cell sequencing: Deep learning is increasingly employed in single-cell analysis due to its capacity to handle the complexity of single-cell sequencing data.

The deep learning approach, however, autonomously captures relevant characteristics from single-cell sequencing data, addressing the heterogeneity between single-cell sequencing experiments, as well as the associated noise and sparsity in such data.

As we explore the reasons behind using deep learning in single-cell sequencing …

3 months, 1 week ago @ thegradient.pub
Salmon in the Loop

In order to obtain a license or permit from FERC, hydroelectric dam operators must submit detailed plans and studies demonstrating that their facility meets regulations.

Typically, a hydroelectric dam requires lots of space to store water on one side of it, which means they tend to be located away from population centers.

Enter Computer VisionSome organizations are exploring the use of computer vision and machine learning to significantly automate fish counting.

The annotated images are then used to train a machine learning model.

Citation: For attribution in academic contexts or books, please cite this work as: Kevin McCraney, "Salmon in the Loop", The Gradient, 2023.

4 months, 1 week ago @ thegradient.pub
Neural algorithmic reasoning

In recent work with computer networking and machine learning collaborators from ETH Zürich, we studied the applicability of neural algorithmic reasoning in computer networking [27].

Our proposal, the neural algorithmic reasoning blueprint [32], aims to bridge this divide by neuralising the target algorithm.

Neural Algorithmic Reasoning with Causal Regularisation.

Neural Algorithmic Reasoning.

Citation: For attribution in academic contexts or books, please cite this work as Petar Veličković, "Neural Algorithmic Reasoning", The Gradient, 2023.

6 months, 1 week ago @ thegradient.pub
The Artificiality of Alignment

This community has developed an extensive vocabulary around theories of AI safety and alignment, many first introduced as detailed blog posts in forums like LessWrong and AI Alignment Forum.

One such idea that is useful for contextualizing technical alignment work — and is perhaps the more formal version of what Bostrom was referring to — is the concept of intent alignment.

Anthropic’s product marketing pages are plastered with notes and phrases about their alignment work — "HHH" is also Claude's biggest selling point.

The site uses the phrasing “AI Safety” instead of “AI Alignment” in the title, but the article itself proceeds to use “safety” and “alignment” interchangeably without differen…

6 months, 2 weeks ago @ thegradient.pub
An Introduction to the Problems of AI Consciousness

This brief introduction is aimed at those working within the AI community who are interested in AI consciousness, but may not know much about the philosophical and scientific work behind consciousness generally or the topic of AI consciousness in particular.

(Image by author.) AI and Consciousness; Two Problems for AI Consciousness: Let’s return to the topic of AI consciousness.

The problem of AI consciousness may seem less difficult than the hard problem: the problem of AI consciousness only asks if silicon could support consciousness, but it does not ask for an explanation of why silicon can or cannot, like the hard problem does.

The second test proposed by Schneider and Edwin Turner [27], call…

6 months, 3 weeks ago @ thegradient.pub
Text-to-CAD: Risks and Opportunities

We’ve identified three key areas where such programs can level up: dataset curation, a pattern language for usability, and filtering.

The format for a friction hinge pattern might look like this: Pattern Name: Friction Hinge. Pattern Description: The Friction Hinge pattern addresses the need for adjustable friction in hinges so as to provide tuneable resistance, but without compromising smooth movement.

Alas, once text-to-CAD models get open-sourced or leaked, many of these queries will be satisfied without compunction.

In conclusion, the emergence of AI-powered text-to-CAD generation presents both risks and opportunities, the ratio of which is still very much undecided.

Citation: For attribu…

7 months, 2 weeks ago @ thegradient.pub
Interpretability Creationism

In this piece, I will discuss the tendency towards “interpretability creationism” – interpretability methods that only look at the final state of the model and ignore its evolution over the course of training – and propose a focus on the training process to supplement interpretability research.

In the case of language models, they behave similarly to ngram models early on and exhibit linguistic patterns later.

Let us consider an explanation usually based on analyzing static models: hierarchical behavior in language models.

An Example: I recently had to manage the trap of interpretability creationism myself.

Citation: For attribution in academic contexts or books, please cite this work as Naomi …

9 months, 2 weeks ago @ thegradient.pub
What Do LLMs Know About Linguistics? It Depends on How You Ask

Understanding what LLMs learn about linguistics from large-scale pretraining is a good framework for understanding how they work in general.

How we extend in-context learning to structured prediction tasks with structured prompting (left) and an example of using structured prompting with an LLM to annotate parts-of-speech (right).

Overall, structured prompting can consistently generate the linguistic structure underlying text from LLMs, confirming that these models have implicitly learned these structures during pretraining.

This leads us to ask: what factors of structured prompting allow LLMs to annotate linguistic structure?

By searching for labels instead of full examples, we can find ta…

9 months, 2 weeks ago @ thegradient.pub
TheSequence
last post 2 days, 1 hour ago
Edge 389: Understanding Large Action Models

Created using Ideogram. In this issue: an overview of large action models (LAMs) in autonomous agents.

An introduction to the MetaGPT framework for building autonomous agents.

💡 ML Concept of the Day: Large Action Models and Autonomous Agents. The ability to execute actions in a given environment is one of the hallmarks of autonomous agents.

One of the key topics of debate in autonomous agent circles is how much of the action execution relies on external components versus being built into the model itself.

One of the most interesting approaches among the proponents of the latter camp is what is known as large action models (LAMs).

2 days, 1 hour ago @ thesequence.substack.com
Some Cool Details About Llama 3

You can subscribe to The Sequence below. 📝 Editorial: Some Cool Details About Llama 3. I had an editorial prepared for this week’s newsletter, but then Meta AI released Llama 3!

I prefer to use the term "open models," given that these releases are not completely open source, but that’s just my preference.

The release of Llama 3 builds on incredible momentum within the open model ecosystem and brings its own innovations.

The momentum in the generative AI open models space definitely continues, even if it forced me to rewrite the entire editorial.

🤖 Cool AI Tech Releases: Llama 3. Meta AI introduced the highly anticipated Llama 3 model → Read more.

4 days ago @ thesequence.substack.com
Edge 388: Google DeepMind's SIMA can Follow Language Instructions in 3D Games Just Like Humans

Created Using Ideogram. Video games have long served as some of the best environments for training AI agents.

However, most of the AI breakthroughs in 3D game environments have been constrained to one or a small number of games.

The goal of the SIMA project was to develop instructable agents that can interact with any 3D environment just like a human by following simple language instructions.

Language is the most powerful and yet simple abstraction for communicating instructions about the world or, in this case, a 3D virtual world.

The magic of SIMA is its ability to translate those abstract instructions into mouse and keyboard actions used to navigate an environment.

1 week ago @ thesequence.substack.com
Edge 387: Tool Learning in Autonomous Agents

Created Using Ideogram. In this Issue: tool learning in autonomous agents.

💡 ML Concept of the Day: Tool Learning in Autonomous Agents. One of the key differentiators between agents and models is the capability of the former to take actions in a given environment.

Part of that action execution typically involves interactions with different systems or tools.

From this perspective, tool learning has become one of the most important building blocks of autonomous agents.

When it comes to tool learning in autonomous agents, we should identify two main groups of interactions:
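One of those interaction styles, an agent invoking external tools on the model's behalf, can be sketched as a minimal dispatcher. The tool names and the action format below are hypothetical, not taken from any specific framework:

```python
# Minimal sketch of tool use in an agent loop (hypothetical tool names).
# The model emits an action like {"tool": "calculator", "input": "2 + 3"};
# the agent resolves the tool by name in a registry and executes it.

def calculator(expression: str) -> str:
    # Evaluate a simple arithmetic expression with builtins disabled.
    return str(eval(expression, {"__builtins__": {}}))

def echo(text: str) -> str:
    return text

TOOLS = {"calculator": calculator, "echo": echo}

def execute_action(action: dict) -> str:
    tool = TOOLS[action["tool"]]  # look the tool up by name
    return tool(action["input"])  # run it on the model-provided input

print(execute_action({"tool": "calculator", "input": "2 + 3"}))  # prints 5
```

In a real agent, the registry would also carry tool descriptions that are injected into the prompt so the model knows which actions it may emit.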

1 week, 2 days ago @ thesequence.substack.com
Neuro-Symbolic Models are Making a Comeback

You can subscribe to The Sequence below. 📝 Editorial: Neuro-Symbolic Models are Making a Comeback. Large language models (LLMs) have dominated the AI narrative in recent years to the point that we almost need to wonder about the future of other areas of machine learning.

One of the most interesting options to address these limitations comes from a pretty old ML school: neuro-symbolic models.

As the name indicates, neuro-symbolic models combine neural networks, such as LLMs, with smaller, easier-to-interpret symbolic models to adapt LLMs to specific domains.

Neuro-symbolic architectures are making a comeback as one of the possible architectures to play a role in this movement.

Neuro-symbolic m…

1 week, 4 days ago @ thesequence.substack.com
Edge 386: Inside Yi, 01's Model Leading the Chinese LLM Movement

Created Using DALL-E. The Chinese ecosystem around foundation models has been on fire recently.

One of the most ambitious foundation model efforts in China comes from 01, the startup founded by former Microsoft and Google researcher Kai-Fu Lee.

01’s first iteration came in the form of the Yi models.

A few days ago, 01 published a technical report about the Yi models and we thought it would be interesting to share some details.

The Yi series models stand out for their bilingual capabilities.

2 weeks ago @ thesequence.substack.com
Edge 385: The Two Big Schools for Building Autonomous Agents

Created Using Ideogram. In this Issue: building LLM-based vs. computer vision-based autonomous agents.

💡 ML Concept of the Day: The Two Big Schools for Building Autonomous Agents. In our series about autonomous agents, today we would like to explore the two fundamental schools used to implement these AI systems.

Typically, we associate autonomous agents with LLMs, but there are competing techniques fundamentally based on computer vision models which have been gaining quite a bit of traction.

CV-based autonomous agents fundamentally focus on recording actions on a user’s computer and replicating those actions with models that can understand pixel-by-pixel positions and mouse actions.

In general…

2 weeks, 2 days ago @ thesequence.substack.com
Generative Audio Models Just Had a Great Week

You can subscribe to The Sequence below. 📝 Editorial: Generative Audio Models Just Had a Great Week. Audio is rapidly becoming one of the most important frontiers in generative AI, a field that is advancing swiftly.

Technically, generative audio poses a fundamentally simpler problem than video or 3D, which leads to faster iterations in research and implementation.

The pace of innovation in generative audio is accelerating at remarkable levels.

To these innovations, you need to add more established players such as Eleven Labs, which have been pushing the boundaries of generative audio for years.

OpenAI Custom Models. OpenAI announced enhancements to its training API as well as new mechanisms for …

2 weeks, 4 days ago @ thesequence.substack.com
📝 Guest Post: The EU AI Act – A Guide for Developers*

This year the EU AI Act came into law and has worried a lot of developers.

Anyone who is selling an AI product within the EU has to comply with the EU AI Act, even if you’re not based in the EU.

Minimal Risk AI Applications: This category covers the majority of AI applications such as AI in games, spam filters and recommendation engines.

The EU AI Act creates a separate category for what they consider to be “General Purpose AI” systems.

The worst elements of early drafts have mostly been stripped from the EU AI Act.

2 weeks, 6 days ago @ thesequence.substack.com
Edge 384: Inside Genie: Google DeepMind's Astonishing Model that can Build 2D Games from Text and Images

Created Using Ideogram. The pace of research in generative AI is nothing short of remarkable.

Even so, from time to time, there are papers that literally challenge our imagination about how far the generative AI space can go.

A few weeks ago, Google DeepMind published some of that work with the release of Genie, a model that is able to generate interactive game environments from text and images.

This is the vision that Google DeepMind has turned into reality with Genie, a groundbreaking approach to generative AI.

Genie can craft interactive environments from a mere text or image prompt, thanks to its training on over 200,000 hours of publicly available gaming videos from the Internet.

3 weeks ago @ thesequence.substack.com
Edge 383: The Key Capabilities of Autonomous Agents

Created Using Ideogram. In this Issue: a review of the key capabilities of autonomous agents.

An introduction to the Crew AI framework for building autonomous agents.

Different frameworks have outlined various feature sets for autonomous agents.

However, there are a few building blocks that can be consistently considered as the strong foundation of any autonomous agent application.

Most capabilities of autonomous agents can be considered variations of the following key areas:

3 weeks, 2 days ago @ thesequence.substack.com
Four New Major Open Source Foundation Models in a Week

You can subscribe to The Sequence using the link below. 📝 Editorial: Four New Major Open Source Foundation Models in a Week. Open source generative AI is experiencing tremendous momentum, and last week was a major example of this with the release of four major foundation models.

By open source, we refer to the weights of the models and not the training datasets or processes.

At this time, it's fair to say that the model weights are where most companies draw the line between open source and closed source.

Last week, we witnessed the release of four major open source models, each innovative in its own way: DBRX: Databricks released DBRX, a new model based on a mixture-of-experts architecture.

Thi…

3 weeks, 4 days ago @ thesequence.substack.com
Edge 381: Google DeepMind's Promptbreeder Self-Improves Prompts

Created Using DALL-E. Reasoning and prompt evolution/optimization are being recognized as the next significant frontier for large language models (LLMs).

Among the various strategies employed to enhance the reasoning capabilities of LLMs, one of the prominent ones is Chain-of-Thought Prompting, often hailed for its effectiveness.

However, it’s worth noting that manually crafted prompt strategies tend to fall short of optimal performance.
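A manually crafted chain-of-thought prompt is typically just a fixed reasoning trigger appended to the task, which is exactly the kind of hand-tuned string that prompt-evolution methods try to improve on. A minimal zero-shot sketch (the task text and trigger phrase are illustrative):

```python
# Zero-shot chain-of-thought prompting: append a fixed reasoning trigger
# so the model produces intermediate steps before its final answer.
COT_TRIGGER = "Let's think step by step."

def build_cot_prompt(task: str) -> str:
    # The Q:/A: framing and the trigger phrase are a common convention,
    # not a requirement of any particular model or API.
    return f"Q: {task}\nA: {COT_TRIGGER}"

prompt = build_cot_prompt("If I have 3 apples and eat one, how many remain?")
print(prompt)
```

An evolutionary approach would treat strings like `COT_TRIGGER` as a population to mutate and score on task performance, rather than fixing them by hand.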

Recently, researchers from Google DeepMind unveiled PROMPTBREEDER, a self-improving prompt algorithm that uses evolutionary techniques to arrive at the best prompts for a given task.

PROMPTBREEDER addresses some of the limitations of CoT with a simple and sup…

4 weeks ago @ thesequence.substack.com
Edge 380: A New Series About Autonomous Agents

Created Using DALL-E. In this Issue: an introduction to our series about autonomous agents.

💡 ML Concept of the Day: A New Series About Autonomous Agents. Today, we start one of our most ambitious series at The Sequence by diving into the world of autonomous agents.

What makes this series ambitious is that autonomous agents remain a relatively nascent and unsolved problem in AI.

The rapid pace of growth in LLMs has certainly accelerated the viability of autonomous agents, but the space still remains at a very early stage.

What is an autonomous agent?

1 month ago @ thesequence.substack.com
📝 Guest Post: Zilliz Unveiled Milvus 2.4 at GTC 24, Transforming Vector Databases with GPU Acceleration*

High throughput ensures that vector databases can handle a large volume of incoming queries concurrently, delivering low-latency responses to end-users or services.

At the heart of vector databases lies a core set of vector operations, such as similarity calculations and matrix operations, which are highly parallelizable and computationally intensive.
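The similarity calculations mentioned above typically reduce to cosine similarity: a dot product and two norms, all built from independent multiply-adds, which is why they map so well onto GPU parallelism. A minimal CPU sketch:

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = dot(a, b) / (|a| * |b|); every term is an independent
    # multiply-add, which is what makes the operation easy to parallelize.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # identical vectors -> 1.0
```

A vector database runs this kind of comparison between one query embedding and millions of stored embeddings, which is the batched workload GPU indexes like CAGRA accelerate.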

To leverage the benefits of GPU acceleration, CAGRA is integrated into Milvus’ Index and Query Nodes.

Blazing New Trails. The integration of NVIDIA's CAGRA GPU acceleration framework into Milvus 2.4 represents a groundbreaking achievement in vector databases.

The unveiling of Milvus 2.4, a collaboration between Zilliz and NVIDIA, exemplifies the…

1 month ago @ thesequence.substack.com
Synced Review
latest post 10 hours ago
Decoding Code Execution: How DeepMind’s NExT Empowers AI Reasoning

In recent years, there has been a surge in the development of large language models (LLMs) tailored for code-related tasks. These LLMs have shown remarkable proficiency in aiding developers with tasks such as writing, editing, explaining, and reviewing code. However, they often stumble when faced with more intricate software engineering challenges that demand a deeper understanding of a program’s runtime behavior. Addressing this gap, in a new paper NExT: Teaching Large Language Models to Reason about Code Execution, a Google DeepMind research team proposes Naturalized Execution Tuning (NExT), a method that aims to equip LLMs with the ability to scrutinize program execution traces and deduce runt…

10 hours ago @ medium.com
NVIDIA’s ScaleFold Slashes AlphaFold’s Training Time to 10 Hours

AlphaFold2 (AF2), crafted by DeepMind, stands as a beacon in the realm of artificial intelligence (AI), boasting the remarkable ability to predict the three-dimensional (3D) structures of proteins from amino acid sequences with unprecedented atomic-level precision. While lauded as a revolutionary advancement in protein folding, its training regimen has long been hampered by its laborious nature, failing to reap significant benefits from the scaling up of computational resources. In a new paper ScaleFold: Reducing AlphaFold Initial Training Time to 10 Hours, a team of researchers from NVIDIA presents ScaleFold, a novel and scalable training methodology tailored for the AlphaFold model. Notabl…

2 days, 10 hours ago @ medium.com
DeepMind’s RecurrentGemma Pioneering Efficiency for Open Small Language Models

In the expansive realm of artificial intelligence and natural language processing, Small Language Models (SLMs) are making significant strides. Unlike their larger counterparts with hefty parameter counts and demanding computational needs, SLMs are sleeker versions crafted for optimal performance even in resource-constrained settings. In a new paper RecurrentGemma: Moving Past Transformers for Efficient Open Language Models, a Google DeepMind research team introduces RecurrentGemma, an open language model built on Google’s innovative Griffin architecture. This model reduces memory usage and facilitates efficient inference on lengthy sequences, thereby unlocking new possibilities for highly ef…

4 days, 21 hours ago @ medium.com
87% ImageNet Accuracy, 3.8ms Latency: Google’s MobileNetV4 Redefines On-Device Mobile Vision

Efficient on-device neural networks offer rapid, real-time, and interactive experiences while safeguarding private data from public internet exposure. Yet, the computational limitations of mobile devices present a formidable challenge in maintaining a delicate balance between accuracy and efficiency. Addressing this challenge head-on, a recent paper titled “MobileNetV4 — Universal Models for the Mobile Ecosystem,” penned by a Google research team, unveils the latest iteration of MobileNets: MobileNetV4 (MNv4). This cutting-edge model boasts an impressive 87% ImageNet-1K accuracy, coupled with an astonishingly low Pixel 8 EdgeTPU runtime of merely 3.8ms. At the heart of this breakthrough lies …

6 days, 11 hours ago @ medium.com
Unveiling the Black Box: Meta’s LM Transparency Tool Deciphers Transformer Language Models

Transformer-based language models have emerged as powerful tools across various tasks, underlining their significance in critical contexts. Understanding the inner workings of these models is paramount for ensuring their safety, reliability, and trustworthiness, given their widespread adoption. In a new paper LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models, a research team from Meta, University College London and Universitat Politècnica de Catalunya introduces the LM Transparency Tool (LM-TT), an open-source interactive toolkit designed for dissecting Transformer-based language models. Existing analysis tools often focus on isolated aspects of decision-making …

1 week, 1 day ago @ medium.com
OPPO AI’s Transformer-Lite Delivers 10x+ Prefill and 2~3x Decoding Boost on Mobile Phone GPUs

The Large Language Model (LLM) has showcased remarkable efficacy across various real-world applications, including intelligent assistants, text summarization, translation, and multi-modality tasks on mobile devices. Nonetheless, the current methodologies for on-device deployment of LLMs are hampered by sluggish inference speeds, leading to subpar user experiences. In a new paper Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs, researchers from OPPO AI Center have introduced a solution. They present four optimization techniques and introduce a novel mobile inference engine dubbed Transformer-Lite. This engine outperforms CPU-based FastLLM and GPU-bas…

1 week, 2 days ago @ medium.com
Revolutionizing Video Understanding: Real-Time Captioning for Any Length with Google’s Streaming…

Revolutionizing Video Understanding: Real-Time Captioning for Any Length with Google’s Streaming Model. The exponential growth of online video platforms has led to a surge in video content, thereby heightening the need for advanced video comprehension. However, existing computer vision models tailored for video understanding often fall short, analyzing only a limited number of frames, typically spanning mere seconds, and categorizing these brief segments into predefined concepts. To address the abovementioned challenge, in a new paper Streaming Dense Video Captioning, a Google research team proposes a streaming dense video captioning model, which revolutionizes dense video captioning…

2 weeks ago @ medium.com
AURORA-M: A Global Symphony of Innovation as 33 Prestigious Institutions Unify for Open-Source…

AURORA-M: A Global Symphony of Innovation as 33 Prestigious Institutions Unify for Open-Source Multilingual Mastery. Large Language Models (LLMs) have revolutionized various applications, including machine translation, text summarization, dialogue systems, and code generation. Yet, the hefty computational requirements for pretraining these models pose significant barriers to broader accessibility and development. To address these challenges, recent open-source initiatives like BLOOM, StarCoder, and StarCoder-2 have emerged, aiming to democratize access to pretrained LLMs. However, these models encounter limitations such as restricted multilingual capabilities, computational intensity, and the …

2 weeks, 2 days ago @ medium.com
Huawei & Peking U’s DiJiang: A Transformer Achieving LLaMA2–7B Performance at 1/50th the Training…

Huawei & Peking U’s DiJiang: A Transformer Achieving LLaMA2–7B Performance at 1/50th the Training Cost. The Transformer architecture has emerged as a pivotal tool in numerous domains, excelling particularly in tasks like speech recognition, machine translation, and document summarization. Yet, its efficacy often hinges on expanding the model’s size to tackle increasingly intricate challenges, thereby imposing substantial computational burdens. In the pursuit of alleviating the computational strain associated with Transformers, the exploration of linear attention mechanisms has garnered notable traction. Nonetheless, enhancing these mechanisms typically entails extensive retraining, a prohibiti…

3 weeks ago @ medium.com
KCL Leverages Topos Theory to Decode Transformer Architectures

The transformer architecture has emerged as the predominant framework for deep learning, playing a pivotal role in the remarkable achievements of large language models like ChatGPT. Despite its widespread adoption, the theoretical underpinnings of its success remain largely uncharted territory. In a new paper The Topos of Transformer Networks, a King’s College London research team delves into a theoretical exploration of the transformer architecture, employing the lens of topos theory. This innovative approach conjectures that the factorization through “choose” and “eval” morphisms can yield effective neural network architecture designs. The primary objective of this paper is to offer a categ…

3 weeks, 3 days ago @ medium.com
Robotic Marvels: Conquering San Francisco’s Streets Through Next Token Prediction

In recent years, there has been a remarkable surge in the effectiveness of large transformer models trained through generative modeling on extensive language datasets sourced from the Internet. These models have demonstrated impressive capabilities across diverse domains. By predicting subsequent words, these models glean intricate understandings of language, which can then be applied to various tasks through multi-task learning and efficient few-shot learning techniques. This success has led researchers to ponder: can we replicate this approach to develop robust models for sensory and motor representation? While there have been encouraging signs of progress in learning sensorimotor represen…

3 weeks, 5 days ago @ medium.com
First Model-Stealing Attack Reveals Secrets of Black-Box Production Language Models

Despite the remarkable capabilities of contemporary large language models like GPT-4, Claude 2, or Gemini, the inner mechanisms governing their operations remain shrouded in mystery. While the intricate details of these models are kept under wraps, they are made accessible through APIs, prompting the question: How much insight can adversaries glean about these models by interacting with their APIs? To answer this question, in a new paper Stealing Part of a Production Language Model, a research team from Google DeepMind, ETH Zurich, University of Washington, OpenAI and McGill University introduces a groundbreaking model-stealing attack. This attack unveils precise, nontrivial information from…

4 weeks ago @ medium.com
DeepMind & UBC’s Genie: A Revolutionary Leap in Generative AI for Interactive Virtual Worlds

In recent years, there has been a remarkable surge in the development of generative AI, showcasing the potential of models to create novel and imaginative content. While strides have been made in various domains, the realm of video generation stands as a promising frontier. Recent advancements suggest that scaling up these models could further enhance their capabilities. However, there remains a significant gap between the interactive engagement offered by video generative models and the rich interactions facilitated by language tools like ChatGPT, not to mention more immersive experiences. In response to this challenge, in a new paper Genie: Generative Interactive Environments, a research te…

1 month ago @ medium.com
ByteDance’s AnimateDiff-Lightning Shines in State-of-the-Art Video Creation in Lightning Speed

In recent times, video generative models have emerged as a focal point of attention, unlocking a realm of fresh creative opportunities. Despite this, the velocity of these models remains a significant obstacle to their broader adoption. State-of-the-art generative models, while impressive in their capabilities, are hampered by sluggishness and computational demands due to their iterative diffusion processes. To address this issue, in a new paper AnimateDiff-Lightning: Cross-Model Diffusion Distillation, a ByteDance research team presents AnimateDiff-Lightning, a novel approach that utilizes progressive adversarial diffusion distillation, catapulting video generation into a realm of lightning…

1 month ago @ medium.com
Stanford’s VideoAgent Achieves New SOTA of Long-Form Video Understanding via Agent-Based System

Understanding long-form videos presents a formidable challenge within the realm of computer vision. This undertaking requires a model adept at processing multi-modal data, managing extensive sequences, and effectively reasoning over these sequences. In response to this challenge, in a new paper VideoAgent: Long-form Video Understanding with Large Language Model as Agent, a Stanford University research team introduces VideoAgent, an innovative approach that simulates human comprehension of long-form videos through an agent-based system, showcasing superior effectiveness and efficiency compared to current state-of-the-art methods. This underscores the potential of agent-based approaches in advancin…

1 month, 1 week ago @ medium.com
📓 Cool Blogs
ODS.ai Habr
latest post 4 months, 1 week ago
A GPT-like model "made a scientific discovery for the first time": what, how, and where next?

Or is it clickbait, and in Nature of all places?

The article below is a detailed breakdown of a fairly complex topic, and at some points you will need to concentrate and read carefully.

These models, after all, write code to help developers and generally help solve all sorts of problems.

However, it can happen that no set appears for a long time, not because the players missed it, but because there really is no set on the table.

We hope that after reading this article it is clear that neural networks' mistakes are not a bug but a feature.

4 months, 1 week ago @ habr.com
What are LLM agents and what can they do?

Also join my Telegram channel AI[ex]Time, where I write about machine learning, NLP, LLMs, and agents in particular.

LLaVA: LLaMA is used as the LLM for generating the text response; it decodes the embeddings (the same thing as vectors) of the image and the input text into an answer.

The model receives an image and a user query (Normal prompt) as input and must produce an answer (Response) as output.

The model thus moves from solving the task directly to reasoning step by step, which in a sense is a decomposition of the task.

And this improves the quality of the model's output, including its reasoning.

4 months, 3 weeks ago @ habr.com
The main event in the AI world: the creator of ChatGPT described the future he is leading us all toward

We recall that at the beginning of 2023 the product was already used by more than 100 million people per month.

The most inquisitive readers may wonder: how does this actually work?

As far as we know, this is the first time a model of this scale has been trained on synthetic data rather than human-produced data.

Using this model is 3x cheaper for prompt text and 2x cheaper for generated tokens (there are usually fewer of those).

But how to handle them, when to use them, and how to combine them is decided by the AI based on the context of the dialogue.

5 months, 2 weeks ago @ habr.com
Five NLP books to start with

My name is Valentin Malykh; I head the NLP research group at MTS AI, and I have been teaching an NLP course for six years now.

Since I keep giving the same answer, I had the idea to write a post about my list of books and describe them along the way.

That said, the book has more on information retrieval and less on NLP, but nowadays these two fields are already (or still) very close.

True, it is also out of print by now, although a PDF version is easy to find.

Foundations of Statistical Natural Language Processing: as far as I know, this book has not been translated into Russian.

7 months, 3 weeks ago @ habr.com
Dropping ranking metrics in a recommender system, part 3: an experimentation platform

An experimentation platform for RecSys: using Kion as an example, I showed how hard it is to experiment with recommender systems.

When working on RecSys in a real service, we form hypotheses, roll models out into A/B tests, and form new hypotheses.

Our goal was infrastructure and pipelines for experiments with any algorithms that are as fast and convenient as possible.

Experiment tracking: metrics and visual analysis. We tracked experiments in MLflow and in reports in the repository simultaneously.

The diagram in the figure shows the infrastructure of the experimentation platform and the recommendation application. Yellow arrows represent the recording of experiment results: logging metrics from cross-validation, saving trained models to Min…

8 months ago @ habr.com
Дропаем ранжирующие метрики в рекомендательной системе, часть 2: двухэтапные модели
Дропаем ранжирующие метрики в рекомендательной системе, часть 2: двухэтапные модели Дропаем ранжирующие метрики в рекомендательной системе, часть 2: двухэтапные модели

В первой части статьи я описывала, как мы с напарником решили выкатить модель из соревнования в онлайн рекомендации, и что из этого вышло.

Давайте построим фичу для бустинга, которая будет решать главную проблему нашей текущей модели - не релевантные айтемы в хороших подборках.

Также у нас были замечательные фичи, проверенные ещё на модели в соревновании.

В задаче классификации мы подаём модели кандидатов, бывших в интеракциях в качестве позитивных таргетов, и не бывших в качестве негативных, и учим его отличать их друг от друга.

The most stable approach to cross-validation in recommender systems is the sliding-window scheme, which best matches how the model operates in the real world…
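The sliding-window scheme can be sketched as follows (a minimal illustration over assumed `(user, timestamp)` event pairs, not the pipeline from the article): each test fold is a fixed-size time window, the training set is everything strictly before that window, and the window slides forward between folds, mimicking periodic retraining in production.

```python
from datetime import datetime, timedelta

def sliding_window_folds(events, n_folds=3, window=timedelta(days=7)):
    """Time-based cross-validation for recommender data.

    `events` is a list of (user, timestamp) pairs.  Each fold's test set is
    one fixed-size time window; its train set is every event strictly before
    the window.  Folds are yielded oldest-first.
    """
    # Exclusive upper bound, nudged so the most recent event is included.
    end = max(ts for _, ts in events) + timedelta(seconds=1)
    for i in range(n_folds, 0, -1):
        test_start = end - window * i
        test_end = test_start + window
        train = [e for e in events if e[1] < test_start]
        test = [e for e in events if test_start <= e[1] < test_end]
        yield train, test
```

Unlike random k-fold splits, no fold ever trains on events that happen after its test window, which is what makes the scheme match the real-world serving setup.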

8 months, 1 week ago @ habr.com
Dropping ranking metrics in a recommender system, part 1: visual analysis and popularity bias

Let's look at what we have in Kion: the popularity distribution of the top 100 items in the Kion dataset. What is at the top by views?

Popular items fill up the recommendation shelf both for this query and for many others.

The film ranks 6th by views in the dataset and shows up in recommendations from a wide variety of algorithms at the worst possible moments.

Let's check the recommendations from the model with the highest Recall excluding popular items: recommendations of the bm25 model with maximum Recall excluding popular items.

Recall and MAP dropped roughly twofold compared to the competition model.

8 months, 2 weeks ago @ habr.com
«Диалектик», an independent socialist media outlet, talks about its NLP projects, publishes datasets, and shares code

Almost immediately after publishing my post about the system for finding news about labor conflicts in the CIS, I got to know the team behind the «Диалектик» project.

The collective is concerned with processes taking place both in the economic base and in its superstructure: society, culture, and politics.

This method of inquiry helped all the classics of Marxism analyze complex situations in the economy and society, which is why it is valuable to the «Диалектик» collective as well.

That is why the collective is developing IT tools for internal (editors-only) and external (all users) use.

This public aggregator is in the testing stage and will soon be presented to the public.

8 months, 2 weeks ago @ habr.com
Machine Learning Mastery
latest post 4 days, 21 hours ago
Generate Realistic Faces in Stable Diffusion

Stable Diffusion’s latest models are very good at generating hyper-realistic images, but they can struggle with accurately generating human faces. We can experiment with prompts, but to get seamless, photorealistic results for faces, we may need to try new methodologies and models. In this post, we will explore various techniques and models for generating highly […]

The post Generate Realistic Faces in Stable Diffusion appeared first on MachineLearningMastery.com.

4 days, 21 hours ago @ machinelearningmastery.com
Using LoRA in Stable Diffusion

The deep learning model behind Stable Diffusion is huge: the weight file is multiple gigabytes. Retraining the model means updating a lot of weights, and that is a lot of work. Sometimes we must modify the Stable Diffusion model, for example, to define a new interpretation of prompts or to make the model […]

The post Using LoRA in Stable Diffusion appeared first on MachineLearningMastery.com.

6 days, 22 hours ago @ machinelearningmastery.com
Prompting Techniques for Stable Diffusion

Generating pictures with Stable Diffusion always involves submitting a prompt to the pipeline. The prompt is only one of the parameters, but it is the most important one. An incomplete or poorly constructed prompt can make the resulting image turn out differently than you expect. In this post, you will learn some key techniques to […]

The post Prompting Techniques for Stable Diffusion appeared first on MachineLearningMastery.com.

1 week, 3 days ago @ machinelearningmastery.com
How to Create Images Using Stable Diffusion Web UI

Launching the Stable Diffusion Web UI can be done in one command. After that, you can control the image generation pipeline from a browser. The pipeline has a lot of moving parts, all of which are important in one way or another. To effectively command Stable Diffusion to generate images, you should recognize the widgets from […]

The post How to Create Images Using Stable Diffusion Web UI appeared first on MachineLearningMastery.com.

1 week, 6 days ago @ machinelearningmastery.com
A Technical Introduction to Stable Diffusion

The introduction of GPT-3, particularly in its chatbot form, ChatGPT, has proven to be a monumental moment in the AI landscape, marking the onset of the generative AI (GenAI) revolution. Although prior models existed in the image generation space, it’s the GenAI wave that caught everyone’s attention. Stable Diffusion is a member of the […]

The post A Technical Introduction to Stable Diffusion appeared first on MachineLearningMastery.com.

2 weeks, 4 days ago @ machinelearningmastery.com
Brief Introduction to Diffusion Models for Image Generation

The advance of generative machine learning models makes computers capable of creative work. In the scope of drawing pictures, there are a few notable models that allow you to convert a textual description into an array of pixels. The most powerful models today are part of the family of diffusion models. In this post, you […]

The post Brief Introduction to Diffusion Models for Image Generation appeared first on MachineLearningMastery.com.

3 weeks, 1 day ago @ machinelearningmastery.com
Unfolding Data Stories: From First Glance to In-Depth Analysis

The path to uncovering meaningful insights often starts with a single step: looking at the data before asking questions. This journey through the Ames Housing dataset is more than an exploration; it’s a narrative about the hidden stories within numbers, waiting to be told. Through a “Data First Approach,” we invite you to dive deep […]

The post Unfolding Data Stories: From First Glance to In-Depth Analysis appeared first on MachineLearningMastery.com.

1 month ago @ machinelearningmastery.com
The Da Vinci Code of Data: Mastering The Data Science Mind Map

Data Science embodies a delicate balance between the art of visual storytelling, the precision of statistical analysis, and the foundational bedrock of data preparation, transformation, and analysis. The intersection of these domains is where true data alchemy happens – transforming and interpreting data to tell compelling stories that drive decision-making and knowledge discovery. Just as […]

The post The Da Vinci Code of Data: Mastering The Data Science Mind Map appeared first on MachineLearningMastery.com.

1 month, 1 week ago @ machinelearningmastery.com
Finding Value with Data: The Cohesive Force Behind Luxury Real Estate Decisions

The real estate industry is a vast network of stakeholders including agents, homeowners, investors, developers, municipal planners, and tech innovators, each bringing unique perspectives and objectives to the table. Within this intricate ecosystem, data emerges as the critical element that binds these diverse interests together, facilitating collaboration and innovation. PropTech, or Property Technology, illustrates this […]

The post Finding Value with Data: The Cohesive Force Behind Luxury Real Estate Decisions appeared first on MachineLearningMastery.com.

1 month, 2 weeks ago @ machinelearningmastery.com
Skewness Be Gone: Transformative Tricks for Data Scientists

Data transformations enable data scientists to refine, normalize, and standardize raw data into a format ripe for analysis. These transformations are not merely procedural steps; they are essential in mitigating biases, handling skewed distributions, and enhancing the robustness of statistical models. This post will primarily focus on how to address skewed data. By focusing on […]

The post Skewness Be Gone: Transformative Tricks for Data Scientists appeared first on MachineLearningMastery.com.

1 month, 2 weeks ago @ machinelearningmastery.com
Best Free Resources to Learn Data Analysis and Data Science

Sponsored Content In my decade of teaching online, the most significant inspiration has been that online learning democratizes access to education globally. Regardless of your ethnic background, income level, and geographical location—as long as you can surf the web—you can find an ocean of free educational content to help you learn new skills. […]

The post Best Free Resources to Learn Data Analysis and Data Science appeared first on MachineLearningMastery.com.

1 month, 3 weeks ago @ machinelearningmastery.com
Harmonizing Data: A Symphony of Segmenting, Concatenating, Pivoting, and Merging

In the world of data science, where raw information swirls in a cacophony of numbers and variables, lies the art of harmonizing data. Like a maestro conducting a symphony, the skilled data scientist orchestrates the disparate elements of datasets, weaving them together into a harmonious composition of insights. Welcome to a journey where data transcends […]

The post Harmonizing Data: A Symphony of Segmenting, Concatenating, Pivoting, and Merging appeared first on MachineLearningMastery.com.

1 month, 3 weeks ago @ machinelearningmastery.com
Beyond SQL: Transforming Real Estate Data into Actionable Insights with Pandas

In the realm of data analysis, SQL stands as a mighty tool, renowned for its robust capabilities in managing and querying databases. However, Python’s pandas library brings SQL-like functionalities to the fingertips of analysts and data scientists, enabling sophisticated data manipulation and analysis without the need for a traditional SQL database. This exploration delves into […]

The post Beyond SQL: Transforming Real Estate Data into Actionable Insights with Pandas appeared first on MachineLearningMastery.com.

1 month, 3 weeks ago @ machinelearningmastery.com
Spotting the Exception: Classical Methods for Outlier Detection in Data Science

Outliers are unique in that they often don’t play by the rules. These data points, which significantly differ from the rest, can skew your analyses and make your predictive models less accurate. Although detecting outliers is critical, there is no universally agreed-upon method for doing so. While some advanced techniques like machine learning offer solutions, […]

The post Spotting the Exception: Classical Methods for Outlier Detection in Data Science appeared first on MachineLearningMastery.com.

1 month, 3 weeks ago @ machinelearningmastery.com
Leveraging ANOVA and Kruskal-Wallis Tests to Analyze the Impact of the Great Recession on Housing Prices

In the world of real estate, numerous factors influence property prices. The economy, market demand, location, and even the year a property is sold can play significant roles. The years 2007 to 2009 marked a tumultuous time for the US housing market. This period, often referred to as the Great Recession, saw a drastic decline […]

The post Leveraging ANOVA and Kruskal-Wallis Tests to Analyze the Impact of the Great Recession on Housing Prices appeared first on MachineLearningMastery.com.

2 months ago @ machinelearningmastery.com
ML in Production
latest post None
Sorta Insightful
latest post 1 month ago
Solving Crew Battle Strategy With Math

This means we can reduce all the crew battle outcomes down to a single \(n \times n\) matrix, which I’ll call the crew battle matrix.

So I Wrote Some Python Code to Compute \(f\) for Arbitrary Crew Battles. Let’s consider the RPS crew battle again.

Here’s the matchup matrix, and here’s what my code outputs for the crew battle matrix, assuming optimal play.

But also, this suggests that crew battle strategy really isn’t that important in the first place!

If your crew loses, you don’t get to blame bad crew battle strategy.
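For readers who want to reproduce the idea, here is a from-scratch sketch of computing one team's win probability under optimal play in a simplified one-stock, winner-stays crew battle (my own illustration with a made-up 3-player matchup matrix, not the author's code; real crew battles carry stocks over, and the opening picks here are sequential rather than simultaneous):

```python
from functools import lru_cache

# Hypothetical RPS-like matchup matrix: P[i][j] is the probability
# that player i of team A beats player j of team B in a single game.
P = ((0.5, 0.9, 0.1),
     (0.1, 0.5, 0.9),
     (0.9, 0.1, 0.5))

def crew_win_prob(P):
    """Probability that team A wins a one-stock, winner-stays crew battle
    under optimal play.  State: remaining players per team (frozensets of
    indices) plus who is currently on stage (None if that team must pick)."""
    @lru_cache(maxsize=None)
    def val(a_left, b_left, a_on, b_on):
        if not a_left:
            return 0.0  # team A is out of players
        if not b_left:
            return 1.0  # team B is out of players
        # A team without a player on stage sends its best choice:
        # A maximizes, B minimizes the value of the resulting state.
        if a_on is None:
            return max(val(a_left, b_left, i, b_on) for i in a_left)
        if b_on is None:
            return min(val(a_left, b_left, a_on, j) for j in b_left)
        p = P[a_on][b_on]
        win = val(a_left, b_left - {b_on}, a_on, None)   # B's player is out
        lose = val(a_left - {a_on}, b_left, None, b_on)  # A's player is out
        return p * win + (1 - p) * lose
    n = len(P)
    return val(frozenset(range(n)), frozenset(range(n)), None, None)
```

The memoized recursion visits each (remaining players, on-stage pair) state once, which is exactly the reduction of the whole battle to a function of the matchup matrix described above.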

1 month ago @ alexirpan.com
MIT Mystery Hunt 2024

This has spoilers for MIT Mystery Hunt 2024.

I hunted with teammate again this year, because there is nothing quite like writing a Mystery Hunt to forge friends through fire.

I needed to spend some to avoid hitting the vacation cap, and what better time than Mystery Hunt?

If you were forced to pick how a Mystery Hunt runs long, I think most people would pick the “too many puzzles” side of Mystery Hunt 2024 over the “too difficult puzzles” side of Mystery Hunt 2023.

So did Spoilr, the codebase for Mystery Hunt 2022, and tph-site from Mystery Hunt 2023.

3 months ago @ alexirpan.com
My AI Timelines Have Sped Up (Again)

In August 2020, I wrote a post about my AI timelines.

Diffusion based image augmentation has been shown to improve robot learning, and Anthropic has based a lot of its branding on constitutional AI and “RL from AI feedback”.

I don’t like AI Twitter for reasons I’ve explained here, but I especially do not like AI Twitter post-ChatGPT.

In AI, models can never do everything people claim they can, but what the models can do is ever-growing and never slides backward.

The bull is to say that we can figure out how to scale models, and scaled up models will solve all the other hard problems.

3 months, 2 weeks ago @ alexirpan.com
Far More Research Into Making Neopoints Than Anyone Needs to Know

You know how much time I have spent studying how to squeeze Neopoints water out of the Neopets stone?

I’d say you can expect to get about 25k NP a day, depending on how many NP rewards you get.

Trudy’s Surprise also gives items for 7 day streaks, but these items are usually junk and not worth anything.

When you win against CPU opponents, you earn a small amount of Neopoints and an item drop.

Or your need to…see someone do a lot of research into things that don’t matter?

5 months ago @ alexirpan.com
Everfree Northwest 2023

Everfree Northwest is a My Little Pony convention in the Seattle area.

Except, I heard that with the ending of BronyCon, Everfree Northwest is the largest pony convention left.

My Little Pony: Friendship is Magic is Generation 4, or G4, and is the one that kicked off the brony fandom.

People tend to assume that “pony concert” means “remixes of pony songs”, or at least “overt references to My Little Pony”.

The first was a signed copy of Ponies: The Galloping, the Magic: The Gathering x My Little Pony crossover.

7 months, 3 weeks ago @ alexirpan.com
Eight Years Later

I started working on that post right after Mystery Hunt 2023 finished, and hardcore focused on getting it out ASAP.

The end result was that I was exerting “write Mystery Hunt” levels of effort (10-20 hrs/week) for 1.5 years straight.

1929 2023-04-21-mh-2023.markdown
114 2023-05-09-bootes-2023.markdown
153 2023-07-19-ml-hurry.markdown

Posts in Limbo. Post about Dominion Online: odds of writing this year: 5%; odds of writing eventually: 25%. My priorities are moving away from Dominion.

Post about AI timelines: odds of writing this year: 90%; odds of writing eventually: 99%. I’m not planning a big update to the last timelines post.

8 months, 1 week ago @ alexirpan.com
Machine Learning Got Itself in a Big Damn Hurry

As I asked questions about compute resources and the presenter’s position on generative AI ethics, I had a moment of realization.

Why are we talking about whether an RTX 3090 is big enough to finetune a LLaMa checkpoint?

The much easier and less-speculative way for a field to move faster is by having more people working in that field.

The MLP AI enthusiasts mentioned some pretrained LLMs that I had not even heard of.

I remember being a young whippersnapper, in the deep learning wave of 2015.

9 months, 1 week ago @ alexirpan.com
Lil'Log
latest post None
inFERENCe
latest post None
The Spectator
latest post 4 months, 1 week ago
Generative Science: Roles for Generative AI in Scientific Discovery

Generative AI is a membership-based concept, so it is defined by the set of approaches and applications that get put under that name.

I think this is one part of the intrigue of generative AI for science.

So maybe generative AI applications are part of this drive for a post-theory mode of science.

This generative science, it is hoped, can add vigour into the sciences, especially in cross-disciplinary ways and provide broad-based benefit.

Generative AI for Science presents opportunities for new ways of doing theory and renewed vigour across our sciences.

4 months, 1 week ago @ blog.shakirm.com
Responsibilities of the Pioneer: Generative AI and its Sociotechnical Foundations

Keynote at the Stanford Human-centered AI 2023 Fall Conference on New Horizons in Generative AI: Science, Creativity, and Society.

Our conference today on new horizons in generative AI, invites us to think of the frontier of research and innovation.

These stories will expose some features of the sociotechnical foundations of generative AI; that is my underlying message and call to action.

What he doesn’t know yet, is that this book will become a cornerstone of the field and industry of weather forecasting.

There is a specific and firm model for responsibility that is built on taking a sociotechnical approach to generative AI and our work.

6 months ago @ blog.shakirm.com
Machine Learning with Social Purpose

Abstract: This talk has a single objective: to advocate for machine learning infused with social purpose.

In this way, social purpose transforms our field of machine learning: into something that is both technical and social.

So social purpose will make machine learning something that is both technical and social.

A more global field and industry can shift machine learning to be more general: magnifying social purpose.

Your continued volunteering, funding, support, and openness to these groups shows yet another way of infusing machine learning with social purpose.

8 months, 3 weeks ago @ blog.shakirm.com
The Unofficial Google Data Science Blog
latest post 1 day, 7 hours ago
Towards optimal experimentation in online systems

At YouTube, the relationships between system parameters and metrics often seem simple — straight-line models sometimes fit our data well.

To find optimal values of two parameters experimentally, the obvious strategy would be to experiment with and update them in separate, sequential stages.

We address uncertainty in estimates $\widehat \beta$ of model coefficients with subsampling, our usual strategy for variance estimation in online experiments[12].

If our experiments were smaller then we might need to model our metrics with Poisson, Binomial or other GLMs instead.

We sometimes use $L^1$ regularization when we fit models to our live experiment data to select the most important non-linear c…
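As a toy illustration of how $L^1$ regularization can select the important non-linear terms (a from-scratch coordinate-descent sketch; the post does not say which solver is used, and the synthetic features here are invented): when the true relationship is linear, the lasso drives the coefficient of a superfluous quadratic feature to (near) zero while keeping the linear one.

```python
import random

def soft_threshold(x, lam):
    """Proximal operator of lam * |x|."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent.

    Minimizes (1/2n) * ||y - X b||^2 + lam * ||b||_1 for X given as a list
    of rows.  A toy sketch for illustration, not production tooling.
    """
    n, d = len(X), len(X[0])
    b = [0.0] * d
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            # Correlation of feature j with the partial residual
            # (the residual computed with b[j] excluded).
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(d) if k != j))
                for i in range(n)
            ) / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b
```

With `X = [[x, x*x] for x in xs]` and `y = 2*x`, a moderate `lam` shrinks the quadratic coefficient to roughly zero, which is the selection effect the paragraph above relies on.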

1 day, 7 hours ago @ unofficialgoogledatascience.com
Measuring Validity and Reliability of Human Ratings

When measuring an attribute, we must assume that the thing we are trying to measure exists, and then validity tells us if we’re actually measuring it.

Reliability sets the upper bound on your validity, and you cannot measure something validly if you cannot measure it reliably.

Two fundamental concepts emerged from this research: reliability and validity. For any procedure, reliability is necessary for high-quality data but not sufficient.

Inference for non-parametric reliability and validity measurements. There are derivations for standard errors and confidence intervals for some non-parametric reliability metrics.

In practice, we measure reliability with both non-parametric and parametric met…

9 months, 1 week ago @ unofficialgoogledatascience.com
Off the Convex Path
latest post None
Jay Alammar
latest post None
Piekniewski's blog
latest post 3 weeks, 2 days ago
fast.ai NLP
latest post None
大トロ
latest post None
🔬 Science
Papers With Code
latest post 1 hour ago
/uulm-mrm/ Revisiting Out-of-Distribution Detection in LiDAR-based 3D Object Detection

LiDAR-based 3D object detection has become an essential part of automated driving due to its ability to localize and classify objects precisely in 3D.

These out-of-distribution (OOD) objects can lead to misclassifications, posing a significant risk to the safety and reliability of automated vehicles.

Currently, LiDAR-based OOD object detection has not been well studied.

We address this problem by generating synthetic training data for OOD objects by perturbing known object categories.

Our idea is that these synthetic OOD objects produce different responses in the feature map of an object detector compared to in-distribution (ID) objects.

1 hour ago @ paperswithcode.com
/Inshsang/ 3DBench: A Scalable 3D Benchmark and Instruction-Tuning Dataset

Evaluating the performance of Multi-modal Large Language Models (MLLMs), integrating both point cloud and language, presents significant challenges.

Current evaluations heavily rely on classification and caption tasks, falling short in providing a thorough assessment of MLLMs.

To address these issues, we introduce a scalable 3D benchmark, accompanied by a large-scale instruction-tuning dataset known as 3DBench, providing an extensible platform for a comprehensive evaluation of MLLMs.

Specifically, we establish the benchmark that spans a wide range of spatial and semantic scales, from object-level to scene-level, addressing both perception and planning tasks.

Furthermore, we present a rigoro…

3 hours ago @ paperswithcode.com
/kinyatoride/ Using Deep Learning to Identify Initial Error Sensitivity of ENSO Forecasts

We introduce a hybrid method that integrates deep learning with model-analog forecasting, a straightforward yet effective approach that generates forecasts from similar initial climate states in a repository of model simulations.

This hybrid framework employs a convolutional neural network to estimate state-dependent weights to identify analog states.

We evaluate our approach using the Community Earth System Model Version 2 Large Ensemble to forecast the El Niño-Southern Oscillation (ENSO) on a seasonal-to-annual time scale.

Furthermore, our hybrid model demonstrates improvements in boreal winter and spring initialization when evaluated against a reanalysis dataset.

This approach has broa…

3 hours ago @ paperswithcode.com
/RafaelSterzinger/ Drawing the Line: Deep Segmentation for Extracting Art from Ancient Etruscan Mirrors

Etruscan mirrors constitute a significant category within Etruscan art and, therefore, undergo systematic examinations to obtain insights into ancient times.

A crucial aspect of their analysis involves the labor-intensive task of manually tracing engravings from the backside.

Additionally, this task is inherently challenging due to the damage these mirrors have sustained, introducing subjectivity into the process.

Compared to our baseline, we improve predictive performance w.r.t.

When assessing performance on complete mirrors against a human baseline, our approach yields quantitative similar performance to a human annotator and significantly outperforms existing binarization methods.

3 hours ago @ paperswithcode.com
/jjzgeeks/ Leverage Variational Graph Representation For Model Poisoning on Federated Learning

This paper puts forth a new training data-untethered model poisoning (MP) attack on federated learning (FL).

The new MP attack extends an adversarial variational graph autoencoder (VGAE) to create malicious local models based solely on the benign local models overheard without any access to the training data of FL.

Such an advancement leads to the VGAE-MP attack that is not only efficacious but also remains elusive to detection.

VGAE-MP attack extracts graph structural correlations among the benign local models and the training data features, adversarially regenerates the graph structure, and generates malicious local models using the adversarial graph structure and benign models' features.…

4 hours ago @ paperswithcode.com
/schneiderkamplab/ SynthEval: A Framework for Detailed Utility and Privacy Evaluation of Tabular Synthetic Data

With the growing demand for synthetic data to address contemporary issues in machine learning, such as data scarcity, data fairness, and data privacy, having robust tools for assessing the utility and potential privacy risks of such data becomes crucial.

SynthEval, a novel open-source evaluation framework distinguishes itself from existing tools by treating categorical and numerical attributes with equal care, without assuming any special kind of preprocessing steps.

This makes it applicable to virtually any synthetic dataset of tabular records.

Our tool leverages statistical and machine learning techniques to comprehensively evaluate synthetic data fidelity and privacy-preserving integrity…

4 hours ago @ paperswithcode.com
/Suyimu/ Raformer: Redundancy-Aware Transformer for Video Wire Inpainting

Video Wire Inpainting (VWI) is a prominent application in video inpainting, aimed at flawlessly removing wires in films or TV series, offering significant time and labor savings compared to manual frame-by-frame removal.

However, wire removal poses greater challenges due to the wires being longer and slimmer than objects typically targeted in general video inpainting tasks, and often intersecting with people and background objects irregularly, which adds complexity to the inpainting process.

The WRV2 dataset comprises over 4,000 videos with an average length of 80 frames, designed to facilitate the development and efficacy of inpainting models.

Building upon this, our research proposes the Redu…

5 hours ago @ paperswithcode.com
/zod-l/ Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches

The advancement of The Laser Interferometer Gravitational-Wave Observatory (LIGO) has significantly enhanced the feasibility and reliability of gravitational wave detection.

However, LIGO's high sensitivity makes it susceptible to transient noises known as glitches, which necessitate effective differentiation from real gravitational wave signals.

In response to this challenge, we introduce the Cross-Temporal Spectrogram Autoencoder (CTSAE), a pioneering unsupervised method for the dimensionality reduction and clustering of gravitational wave glitches.

CTSAE integrates a novel four-branch autoencoder with a hybrid of Convolutional Neural Networks (CNN) and Vision Transformers (ViT).

To the b…

6 hours ago @ paperswithcode.com
/yunyuny/ Understanding Hyperbolic Metric Learning through Hard Negative Sampling

In recent years, there has been a growing trend of incorporating hyperbolic geometry methods into computer vision.

While these methods have achieved state-of-the-art performance on various metric learning tasks using hyperbolic distance measurements, the underlying theoretical analysis supporting this superior performance remains under-exploited.

In this study, we investigate the effects of integrating hyperbolic space into metric learning, particularly when training with contrastive loss.

We identify a need for a comprehensive comparison between Euclidean and hyperbolic spaces regarding the temperature effect in the contrastive loss within the existing literature.

We also reveal that hyper…

6 hours ago @ paperswithcode.com
/mihir3009/ Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models

Recently developed large language models (LLMs) have been shown to perform remarkably well on a wide range of language understanding tasks.

However, the crucial skill pertaining to 'logical reasoning' has remained underexplored.

Addressing the above limitation, we comprehensively evaluate the logical reasoning ability of LLMs on 25 different reasoning patterns spanning over propositional, first-order, and non-monotonic logics.

To enable systematic evaluation, we introduce LogicBench, a natural language question-answering dataset focusing on the use of a single inference rule.

We believe that our work and findings facilitate future research for evaluating and enhancing the logical reasoning …

6 часов назад @ paperswithcode.com
/niru-umass-dev/ CASPR: Automated Evaluation Metric for Contrastive Summarization

Summarizing comparative opinions about entities (e.g., hotels, phones) from a set of source reviews, often referred to as contrastive summarization, can considerably aid users in decision making.

Prior work has proposed a token-overlap-based metric, Distinctiveness Score, to measure contrast, but it does not account for meaning-preserving lexical variations.

In this work, we propose CASPR, an automated evaluation metric to better measure the contrast between a pair of summaries.

We compare CASPR with Distinctiveness Score and a simple yet powerful baseline based on BERTScore.

Our results on a prior dataset CoCoTRIP demonstrate that CASPR can more reliably capture the contra…

6 hours ago @ paperswithcode.com
/HenryPengZou/ ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction

Existing datasets for attribute value extraction (AVE) predominantly focus on explicit attribute values while neglecting the implicit ones, lack product images, are often not publicly available, and lack an in-depth human inspection across diverse domains.

To address these limitations, we present ImplicitAVE, the first publicly available multimodal dataset for implicit attribute value extraction.

ImplicitAVE, sourced from the MAVE dataset, is carefully curated and expanded to include implicit AVE and multimodality, resulting in a refined dataset of 68k training and 1.6k testing examples across five domains.

We also explore the application of multimodal large language models (MLLMs) to implicit…

6 hours ago @ paperswithcode.com
/emi-group/ FR-NAS: Forward-and-Reverse Graph Predictor for Efficient Neural Architecture Search

Neural Architecture Search (NAS) has emerged as a key tool in identifying optimal configurations of deep neural networks tailored to specific tasks.

Given that neural architectures fundamentally resemble Directed Acyclic Graphs (DAGs), Graph Neural Networks (GNNs) become an apparent choice for such predictive tasks.

This predictor renders neural architectures into vector representations by combining both the conventional and inverse graph views.

Additionally, we incorporate a customized training loss within the GNN predictor to ensure efficient utilization of both types of representations.

Benchmarked against leading GNN predictors, the experimental results showcase a significant improvemen…

6 hours ago @ paperswithcode.com
/Zhaoyang-Chu/ Graph Neural Networks for Vulnerability Detection: A Counterfactual Explanation

Vulnerability detection is crucial for ensuring the security and reliability of software systems.

Recently, Graph Neural Networks (GNNs) have emerged as a prominent code embedding approach for vulnerability detection, owing to their ability to capture the underlying semantic structure of source code.

Inspired by advancements of counterfactual reasoning in artificial intelligence, we propose CFExplainer, a novel counterfactual explainer for GNN-based vulnerability detection.

Unlike factual reasoning-based explainers, CFExplainer seeks the minimal perturbation to the input code graph that leads to a change in the prediction, thereby addressing the what-if questions for vulnerability detection…

6 hours ago @ paperswithcode.com
/qinzzz/ Variational Deep Survival Machines: Survival Regression with Censored Outcomes

Survival regression aims to predict the time when an event of interest will take place, typically a death or a failure.

A fully parametric method [18] is proposed to estimate the survival function as a mixture of individual parametric distributions in the presence of censoring.

In this paper, we present a novel method to predict survival time by better clustering the survival data and combining primitive distributions.

The model is trained end-to-end by jointly optimizing the VAE loss and the regression loss.

Thorough experiments on the SUPPORT and FLCHAIN datasets show that our method effectively improves the clustering results and reaches competitive scores with previous methods.

6 hours ago @ paperswithcode.com
Papers With Code
latest post 1 hour ago
/hafeezzwiz21/ A Bi-directional Quantum Search Algorithm

This paper combines Partial Grover's search algorithm and Bi-directional Search to create a fast Grover's quantum search algorithm, referred to as Bi-Directional Grover Search (BDGS).

We incorporate a bi-directional search tactic with a partial Grover search, starting from an initial state and a single marked state in parallel.

We show in this article that our novel approach requires $\frac{\pi}{4\sqrt{2}}\sqrt{N}(1-\sqrt{\frac{1}{b^{r/2k}}})$ iterations, compared to regular Grover Search and Partial Grover Search (PGS), which takes $\frac{\pi}{4}\sqrt{N}\sqrt{1-\frac{1}{b}}$ (here, $N=2^r$ elements, $b$ is the branching factor of partial search, and $k= \lceil\log_2b \rceil$).
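For concreteness, the two iteration counts quoted above can be evaluated numerically; the sketch below simply transcribes the abstract's expressions (with $N=2^r$, branching factor $b$, and $k=\lceil\log_2 b\rceil$) and is not an implementation of the quantum algorithm itself.

```python
import math

def pgs_iterations(N, b):
    # Partial Grover Search: (pi/4) * sqrt(N) * sqrt(1 - 1/b)
    return (math.pi / 4) * math.sqrt(N) * math.sqrt(1 - 1 / b)

def bdgs_iterations(N, b):
    # Bi-Directional Grover Search, as stated in the abstract:
    # (pi / (4*sqrt(2))) * sqrt(N) * (1 - sqrt(1 / b**(r/(2k))))
    r = math.log2(N)                  # N = 2^r
    k = math.ceil(math.log2(b))
    return (math.pi / (4 * math.sqrt(2))) * math.sqrt(N) * (
        1 - math.sqrt(1 / b ** (r / (2 * k)))
    )
```

For $N=2^{20}$ and $b=4$, this gives roughly 551 iterations for BDGS versus roughly 697 for PGS.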

The proposed …

6 hours ago @ paperswithcode.com
/goncamateus/ Planning the path with Reinforcement Learning: Optimal Robot Motion Planning in RoboCup Small Size League Environments

This work investigates the potential of Reinforcement Learning (RL) to tackle robot motion planning challenges in the dynamic RoboCup Small Size League (SSL).

Using a heuristic control approach, we evaluate RL's effectiveness in obstacle-free and single-obstacle path-planning environments.

Our method achieved a 60% time gain in obstacle-free environments compared to baseline algorithms.

Additionally, our findings demonstrated dynamic obstacle avoidance capabilities, adeptly navigating around moving blocks.

These findings highlight the potential of RL to enhance robot motion planning in the challenging and unpredictable SSL environment.

9 hours ago @ paperswithcode.com
/AIGAnimation/ Taming Diffusion Probabilistic Models for Character Control

We present a novel character control framework that effectively utilizes motion diffusion probabilistic models to generate high-quality and diverse character animations, responding in real-time to a variety of dynamic user-supplied control signals.

At the heart of our method lies a transformer-based Conditional Autoregressive Motion Diffusion Model (CAMDM), which takes as input the character's historical motion and can generate a range of diverse potential future motions conditioned on high-level, coarse user control.

These include separate condition tokenization, classifier-free guidance on past motion, and heuristic future trajectory extension, all designed to address the challenges assoc…

22 hours ago @ paperswithcode.com
/apple/ OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

To this end, we release OpenELM, a state-of-the-art open language model.

We also release code to convert models to the MLX library for inference and fine-tuning on Apple devices.

This comprehensive release aims to empower and strengthen the open research community, paving the way for future open research endeavors.

Our source code along with pre-trained model weights and training recipes is available at \url{https://github.com/apple/corenet}.

Additionally, \model models can be found on HuggingFace at: \url{https://huggingface.co/apple/OpenELM}.

23 hours ago @ paperswithcode.com
/I2-Multimedia-Lab/ Unified Unsupervised Salient Object Detection via Knowledge Transfer

Recently, unsupervised salient object detection (USOD) has gained increasing attention due to its annotation-free nature.

In this paper, we propose a unified USOD framework for generic USOD tasks.

Firstly, we propose a Progressive Curriculum Learning-based Saliency Distilling (PCL-SD) mechanism to extract saliency cues from a pre-trained deep network.

Afterwards, the obtained saliency cues are utilized to train a saliency detector, and we employ a Self-rectify Pseudo-label Refinement (SPR) mechanism to improve the quality of pseudo-labels.

Finally, an adapter-tuning method is devised to transfer the acquired saliency knowledge, leveraging shared knowledge to attain superior transferring per…

1 day, 4 hours ago @ paperswithcode.com
/algoprog/ Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models

SynTOD utilizes a state transition graph to define the desired behavior of a TOD system and generates diverse, structured conversations through random walks and response simulation using large language models (LLMs).

In our experiments, using graph-guided response simulations leads to significant improvements in intent classification, slot filling and response relevance compared to naive single-prompt simulated conversations.

We also investigate the end-to-end TOD effectiveness of different base and instruction-tuned LLMs, with and without the constructed synthetic conversations.

Finally, we explore how various LLMs can evaluate responses in a TOD system and how well they are correlated wit…

1 day, 4 hours ago @ paperswithcode.com
/TUDB-Labs/ MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA based Mixture of Experts

Large Language Models (LLMs) have showcased exceptional performance across a wide array of Natural Language Processing (NLP) tasks.

While methods like LoRA have effectively tackled GPU memory constraints during fine-tuning, their performance is often limited, especially in multi-task settings.

On the other hand, Mixture-of-Experts (MoE) models, such as Mixtral 8x7B, demonstrate remarkable performance across multiple NLP tasks while maintaining a reduced parameter count.

To address these challenges, we propose MixLoRA, an innovative approach aimed at constructing a resource-efficient sparse MoE model based on LoRA.

MixLoRA inserts multiple LoRA-based experts within the feed-forw…
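Routing among LoRA experts inside a frozen feed-forward layer can be sketched as follows; the shapes, router, and top-k gating below are illustrative assumptions for a single input vector, not MixLoRA's actual implementation.

```python
import numpy as np

def mixlora_ffn(x, W, experts, router_w, top_k=2):
    # x: (d,) input; W: (d, d_out) frozen dense weight.
    # experts: list of (A, B) LoRA pairs with A: (d, r), B: (r, d_out).
    # The router scores every expert, keeps the top_k, and softmax-gates
    # their low-rank updates on top of the frozen dense path.
    logits = x @ router_w                      # (num_experts,)
    top = np.argsort(logits)[-top_k:]
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()
    out = x @ W                                # frozen dense path
    for g, i in zip(gates, top):
        A, B = experts[i]
        out = out + g * (x @ A @ B)            # gated LoRA expert update
    return out
```

Only the small `A`/`B` matrices and the router would be trained in such a scheme, which is what keeps the approach resource-efficient relative to a dense MoE.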

1 day, 6 hours ago @ paperswithcode.com
/mertan-a/ Towards Multi-Morphology Controllers with Diversity and Knowledge Distillation

Finding controllers that perform well across multiple morphologies is an important milestone for large-scale robotics, in line with recent advances via foundation models in other areas of machine learning.

However, the challenges of learning a single controller to control multiple morphologies make the 'one robot one task' paradigm dominant in the field.

To alleviate these challenges, we present a pipeline that: (1) leverages Quality Diversity algorithms like MAP-Elites to create a dataset of many single-task/single-morphology teacher controllers, then (2) distills those diverse controllers into a single multi-morphology controller that performs well across many different body plans by mimi…

1 day, 6 hours ago @ paperswithcode.com
/adeyemiadeoye/ Regularized Gauss-Newton for Optimizing Overparameterized Neural Networks

The generalized Gauss-Newton (GGN) optimization method incorporates curvature estimates into its solution steps, and provides a good approximation to the Newton method for large-scale optimization problems.

GGN has been found particularly interesting for practical training of deep neural networks, not only for its impressive convergence speed, but also for its close relation with neural tangent kernel regression, which is central to recent studies that aim to understand the optimization and generalization properties of neural networks.

This work studies a GGN method for optimizing a two-layer neural network with explicit regularization.

We study the convergence of the two-layer neural netwo…
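A regularized Gauss-Newton step of the kind studied here can be illustrated on a generic least-squares objective; the Tikhonov-damped update below is a standard textbook form and only an assumed stand-in for the paper's exact method.

```python
import numpy as np

def regularized_gn_step(theta, residual_fn, jacobian_fn, lam=1e-2):
    # One Gauss-Newton step with Tikhonov (ridge) regularization:
    #   theta <- theta - (J^T J + lam*I)^{-1} (J^T r + lam*theta)
    r = residual_fn(theta)
    J = jacobian_fn(theta)
    H = J.T @ J + lam * np.eye(theta.size)   # damped GGN curvature
    g = J.T @ r + lam * theta                # gradient of regularized loss
    return theta - np.linalg.solve(H, g)
```

For the regularized loss $\frac{1}{2}\|r(\theta)\|^2 + \frac{\lambda}{2}\|\theta\|^2$ with a linear residual, a single step from the origin already lands on the ridge-regression solution.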

1 day, 6 hours ago @ paperswithcode.com
/jlab-nlp/ The Power of the Noisy Channel: Unsupervised End-to-End Task-Oriented Dialogue with LLMs

Training task-oriented dialogue systems typically requires turn-level annotations for interacting with their APIs: e.g., a dialogue state and the system actions taken at each step.

With advances in LLMs, we hypothesize unlabelled data and a schema definition are sufficient for building a working task-oriented dialogue system, completely unsupervised.

We iteratively improve these pseudo-labels with expectation-maximization (EM), and use the inferred labels to train an end-to-end dialogue agent.

Evaluating our approach on the MultiWOZ benchmark, our method more than doubles the dialogue success rate of a strong GPT-3.5 baseline.

1 day, 6 hours ago @ paperswithcode.com
/teodorchiaburu/ CoProNN: Concept-based Prototypical Nearest Neighbors for Explaining Vision Models

However, building task-specific explanations is time-consuming and requires domain expertise, which can be difficult to integrate into generic XAI methods.

A promising approach towards designing useful task-specific explanations with domain experts is based on the compositionality of semantic concepts.

Here, we present a novel approach that enables domain experts to quickly create concept-based explanations for computer vision tasks intuitively via natural language.

Leveraging recent progress in deep generative methods, we propose generating visual concept-based prototypes via text-to-image methods.

These prototypes are then used to explain predictions of computer vision models via a simple k-Ne…
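The prototype-based explanation step can be sketched as a plain nearest-neighbor lookup; the function below is a hypothetical illustration (prototype embeddings and concept labels are assumed inputs), not the CoProNN code.

```python
import numpy as np

def knn_concept_explain(embedding, prototypes, labels, k=3):
    # Rank concept prototypes (e.g., generated via text-to-image) by
    # Euclidean distance to the test embedding and return the k nearest
    # concept labels as the explanation.
    dists = np.linalg.norm(prototypes - embedding, axis=1)
    nearest = np.argsort(dists)[:k]
    return [labels[i] for i in nearest]
```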

1 day, 6 hours ago @ paperswithcode.com
/xuxy09/ SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation

Existing Transformers for monocular 3D human shape and pose estimation typically have a quadratic computation and memory complexity with respect to the feature length, which hinders the exploitation of fine-grained information in high-resolution features that is beneficial for accurate reconstruction.

In this work, we propose an SMPL-based Transformer framework (SMPLer) to address this issue.

SMPLer incorporates two key ingredients: a decoupled attention operation and an SMPL-based target representation, which allow effective utilization of high-resolution features in the Transformer.

In addition, based on these two designs, we also introduce several novel modules including a multi-scale at…

1 day, 6 hours ago @ paperswithcode.com
/rexhaha/ Cross-Domain Causal Preference Learning for Out-of-Distribution Recommendation

Recommender systems use users' historical interactions to learn their preferences and deliver personalized recommendations from a vast array of candidate items.

Current recommender systems primarily rely on the assumption that the training and testing datasets have identical distributions, which may not hold true in reality.

This study delves deeply into the challenge of OOD generalization and proposes a novel model called Cross-Domain Causal Preference Learning for Out-of-Distribution Recommendation (CDCOR), which involves employing a domain adversarial network to uncover users' domain-shared preferences and utilizing a causal structure learner to capture causal invariance to deal with the…

1 day, 6 hours ago @ paperswithcode.com
/medicl-vu/ PRISM: A Promptable and Robust Interactive Segmentation Model with Visual Prompts

In this paper, we present PRISM, a Promptable and Robust Interactive Segmentation Model, aiming for precise segmentation of 3D medical images.

PRISM accepts various visual inputs, including points, boxes, and scribbles as sparse prompts, as well as masks as dense prompts.

The model produces segmentations by using visual prompts from previous iterations to achieve progressive improvement.

PRISM employs multiple segmentation heads per input image, each generating a continuous map and a confidence score to optimize predictions.

Following each segmentation iteration, PRISM employs a shallow corrective refinement network to reassign mislabeled voxels.

1 day, 6 hours ago @ paperswithcode.com
/ritatsousa/ Integrating Heterogeneous Gene Expression Data through Knowledge Graphs for Improving Diabetes Prediction

Diabetes is a worldwide health issue affecting millions of people.

Machine learning methods have shown promising results in improving diabetes prediction, particularly through the analysis of diverse data types, namely gene expression data.

While gene expression data can provide valuable insights, challenges arise from the fact that the sample sizes in expression datasets are usually limited, and the data from different datasets with different gene expressions cannot be easily combined.

This work proposes a novel approach to address these challenges by integrating multiple gene expression datasets and domain-specific knowledge using knowledge graphs, a unique tool for biomedical data integr…

1 day, 6 hours ago @ paperswithcode.com
Papers With Code
latest post 1 hour ago
/cliffbb/ Unsupervised Domain Adaptation Architecture Search with Self-Training for Land Cover Mapping

Unsupervised domain adaptation (UDA) is a challenging open problem in land cover mapping.

Previous studies show encouraging progress in addressing cross-domain distribution shifts on remote sensing benchmarks for land cover mapping.

Thus, we propose a simple yet effective framework to automatically search for lightweight neural networks for land cover mapping tasks under domain shifts.

This is the first attempt to combine NAS with self-training UDA as a single framework for land cover mapping.

With fewer than 2M parameters and 30.16 GFLOPs, the best-discovered lightweight network reaches state-of-the-art performance on the regional target domain of OpenEarthMap (59.38% mIoU) and the co…

1 day, 6 hours ago @ paperswithcode.com
/yz1019117968/ Exploring and Unleashing the Power of Large Language Models in Automated Code Translation

Code translation tools are developed for automatic source-to-source translation.

LLMs pre-trained on huge amounts of human-written code/text have shown remarkable performance in many code intelligence tasks due to their powerful generality, even without task-specific training.

Enlightened by the above findings, we propose UniTrans, a unified code translation framework applicable to various LLMs, for unleashing their power in this field.

Specifically, UniTrans first crafts a series of test cases for target programs with the assistance of the source programs.

Next, it harnesses the above auto-generated test cases to augment the code translation and then evaluates their correctness via execution.

1 day, 6 hours ago @ paperswithcode.com
/lmbxmu/ CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method

Transforming large pre-trained low-resolution diffusion models to cater to higher-resolution demands, i.e., diffusion extrapolation, significantly improves diffusion adaptability.

We propose tuning-free CutDiffusion, aimed at simplifying and accelerating the diffusion extrapolation process, making it more affordable and improving performance.

CutDiffusion abides by the existing patch-wise extrapolation but cuts a standard patch diffusion process into an initial phase focused on comprehensive structure denoising and a subsequent phase dedicated to specific detail refinement.

Comprehensive experiments highlight the numerous advantages of CutDiffusion: (1) simple method construction t…

1 day, 6 hours ago @ paperswithcode.com
/byyx666/ Revisiting Neural Networks for Continual Learning: An Architectural Perspective

Efforts to overcome catastrophic forgetting have primarily centered around developing more effective Continual Learning (CL) methods.

In contrast, less attention has been devoted to analyzing the role of network architecture design (e.g., network depth, width, and components) in contributing to CL.

This paper seeks to bridge this gap between network architecture design and CL, and to present a holistic study on the impact of network architectures on CL.

This work considers architecture design at the network scaling level, i.e., width and depth, and also at the network components, i.e., skip connections, global pooling layers, and down-sampling.

In both cases, we first derive insights through sys…

1 day, 6 hours ago @ paperswithcode.com
/ruc-nlpir/ From Matching to Generation: A Survey on Generative Information Retrieval

Information Retrieval (IR) systems are crucial tools for users to access information, widely applied in scenarios like search engines, question answering, and recommendation systems.

With the advancement of pre-trained language models, generative information retrieval (GenIR) has emerged as a novel paradigm, gaining increasing attention in recent years.

Currently, research in GenIR can be categorized into two aspects: generative document retrieval (GR) and reliable response generation.

GR leverages the generative model's parameters for memorizing documents, enabling retrieval by directly generating relevant document identifiers without explicit indexing.

We also review the evaluation, chall…

1 day, 6 hours ago @ paperswithcode.com
/wagner-niklas/ CAGE: Circumplex Affect Guided Expression Inference

Understanding emotions and expressions is a task of interest across multiple disciplines, especially for improving user experiences.

People understand discrete emotions differently due to a variety of factors, including cultural background, individual experiences, and cognitive biases.

Therefore, most approaches to expression understanding, particularly those relying on discrete categories, are inherently biased.

Using a small-scale MaxViT-based model architecture, we evaluate the impact of training with continuous valence and arousal labels in addition to discrete expression category labels.

We show that considering valence and arousal in addition to discrete category labels helps to significantl…

1 day, 6 hours ago @ paperswithcode.com
/grammarly/ Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models

In this paper, we carry out experimental research on Grammatical Error Correction, delving into the nuances of single-model systems, comparing the efficiency of ensembling and ranking methods, and exploring the application of large language models to GEC as single-model systems, as parts of ensembles, and as ranking methods.

We set new state-of-the-art performance with F_0.5 scores of 72.8 on CoNLL-2014-test and 81.4 on BEA-test.

To support further advancements in GEC and ensure the reproducibility of our research, we make our code, trained models, and systems' outputs publicly available.


1 day, 6 hours ago @ paperswithcode.com
/yaooxu/ Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering

To address the issue of insufficient knowledge and the tendency to generate hallucination in Large Language Models (LLMs), numerous studies have endeavored to integrate LLMs with Knowledge Graphs (KGs).

However, all these methods are evaluated on conventional Knowledge Graph Question Answering (KGQA) with complete KGs, where the factual triples involved in each question are entirely covered by the given KG.

In this situation, the LLM mainly acts as an agent to find answer entities by exploring the KG, rather than effectively integrating internal and external knowledge sources.

However, in real-world scenarios, KGs are often incomplete to cover all the knowledge required to answer questions.

To …

1 day, 6 hours ago @ paperswithcode.com
/ise-uiuc/ XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts

We introduce XFT, a simple yet powerful training scheme, by simply merging upcycled Mixture-of-Experts (MoE) to unleash the performance limit of instruction-tuned code Large Language Models (LLMs).

While vanilla sparse upcycling fails to improve instruction tuning, XFT introduces a shared expert mechanism with a novel routing weight normalization strategy into sparse upcycling, which significantly boosts instruction tuning.

After fine-tuning the upcycled MoE model, XFT introduces a learnable model merging mechanism to compile the upcycled MoE model back to a dense model, achieving upcycled MoE-level performance with only dense-model compute.

By applying XFT to a 1.3B model, we create a new …
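Merging upcycled experts back into one dense layer can be sketched as a learned convex combination of expert weights; this is only an assumed form of the merging mechanism, not XFT's actual procedure.

```python
import numpy as np

def merge_experts_to_dense(expert_weights, mix_logits):
    # Collapse a list of expert weight matrices into a single dense
    # matrix using softmax-normalized learnable mixing coefficients.
    m = np.exp(mix_logits - np.max(mix_logits))
    m /= m.sum()
    return sum(c * W for c, W in zip(m, expert_weights))
```

After such a merge, inference costs exactly one dense matrix multiply regardless of how many experts were trained.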

1 day, 6 hours ago @ paperswithcode.com
/salt-nlp/ CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies

To enhance language models' cultural awareness, we design a generalizable pipeline to construct cultural knowledge bases from different online communities on a massive scale.

With the pipeline, we construct CultureBank, a knowledge base built upon users' self-narratives with 12K cultural descriptors sourced from TikTok and 11K from Reddit.

Unlike previous cultural knowledge resources, CultureBank contains diverse views on cultural descriptors to allow flexible interpretation of cultural knowledge, and contextualized cultural scenarios to help grounded evaluation.

With CultureBank, we evaluate different LLMs' cultural awareness, and identify areas for improvement.

We also fine-tune a languag…

1 day, 6 hours ago @ paperswithcode.com
/opendatalab/ UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition

This paper presents the UniMER dataset to provide the first study on Mathematical Expression Recognition (MER) towards complex real-world scenarios.

Therefore, the UniMER dataset enables the training of a robust and high-accuracy MER model and comprehensive evaluation of model performance.

Moreover, we introduce the Universal Mathematical Expression Recognition Network (UniMERNet), an innovative framework designed to enhance MER in practical scenarios.

UniMERNet incorporates a Length-Aware Module to process formulas of varied lengths efficiently, thereby enabling the model to handle complex mathematical expressions with greater accuracy.

Our extensive experiments demonstrate that UniMERNet …

1 day, 6 hours ago @ paperswithcode.com
/id-animator/ ID-Animator: Zero-Shot Identity-Preserving Human Video Generation

Generating high fidelity human video with specified identities has attracted significant attention in the content generation community.

However, existing techniques struggle to strike a balance between training efficiency and identity preservation, either requiring tedious case-by-case finetuning or usually missing the identity details in the video generation process.

In this study, we present ID-Animator, a zero-shot human-video generation approach that can perform personalized video generation given a single reference facial image, without further training.

ID-Animator inherits existing diffusion-based video generation backbones with a face adapter to encode the ID-relevant embeddings from learn…

1 day, 6 hours ago @ paperswithcode.com
/princeton-vl/ Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization

We introduce a new system for Multi-Session SLAM, which tracks camera motion across multiple disjoint videos under a single global reference.

Our approach couples the prediction of optical flow with solver layers to estimate camera pose.

The backbone is trained end-to-end using a novel differentiable solver for wide-baseline two-view pose.

The full system can connect disjoint sequences, perform visual odometry, and run global optimization.

Compared to existing approaches, our design is accurate and robust to catastrophic failures.

1 day, 6 hours ago @ paperswithcode.com
/matthewyccheung/ Metric-guided Image Reconstruction Bounds via Conformal Prediction

Recent advancements in machine learning have led to novel imaging systems and algorithms that address ill-posed problems.

Assessing their trustworthiness and understanding how to deploy them safely at test time remains an important and open problem.

We propose a method that leverages conformal prediction to retrieve upper/lower bounds and statistical inliers/outliers of reconstructions based on the prediction intervals of downstream metrics.

We apply our method to sparse-view CT for downstream radiotherapy planning and show 1) that metric-guided bounds have valid coverage for downstream metrics while conventional pixel-wise bounds do not and 2) anatomical differences of upper/lower bounds b…
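For readers unfamiliar with the mechanics, the split-conformal recipe underlying interval bounds like these can be sketched in a few lines. This is a generic illustration on synthetic data, not the paper's metric-guided construction:

```python
import numpy as np

# Split conformal prediction: calibrate a residual quantile on held-out data,
# then wrap any point prediction in an interval with ~(1 - alpha) coverage.
# All data below is synthetic; the paper's downstream metrics are not modeled.
rng = np.random.default_rng(0)

def conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    # Nonconformity scores: absolute residuals on the calibration set.
    scores = np.abs(cal_true - cal_pred)
    n = len(scores)
    # Finite-sample-corrected quantile of the scores.
    q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    return test_pred - q, test_pred + q  # lower and upper bounds

truth = rng.normal(size=1000)
preds = truth + rng.normal(scale=0.3, size=1000)  # noisy "reconstructions"
lo, hi = conformal_interval(preds[:500], truth[:500], preds[500:])
coverage = float(np.mean((truth[500:] >= lo) & (truth[500:] <= hi)))
print(round(coverage, 2))  # empirically close to the 0.9 target
```

The same calibration step works for any scalar downstream metric in place of the raw prediction, which is the substitution the paper's metric-guided bounds rely on.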

1 day, 6 hours ago @ paperswithcode.com
/tyang816/ Simple, Efficient and Scalable Structure-aware Adapter Boosts Protein Language Models

Fine-tuning pre-trained protein language models (PLMs) has emerged as a prominent strategy for enhancing downstream prediction tasks, often outperforming traditional supervised learning approaches.

Parameter-Efficient Fine-Tuning, a powerful and widely applied technique in natural language processing, could potentially enhance the performance of PLMs.

However, the direct transfer to life science tasks is non-trivial due to the different training strategies and data forms.

To address this gap, we introduce SES-Adapter, a simple, efficient, and scalable adapter method for enhancing the representation learning of PLMs.

Extensive evaluations are conducted on 2 types of foldin…

1 day, 6 hours ago @ paperswithcode.com
💼 University and corporation labs
DeepMind
latest post 6 days, 1 hour ago
The ethics of advanced AI assistants

Responsibility & Safety: Exploring the promise and risks of a future with more capable AI. Imagine a future where we interact regularly with a range of advanced artificial intelligence (AI) assistants — and where millions of assistants interact with each other on our behalf.

General-purpose foundation models are paving the way for increasingly advanced AI assistants.

Advanced AI assistants could have a profound impact on users and society, and be integrated into most aspects of people’s lives.

Able to fluidly communicate using natural language, the written output and voices of advanced AI assistants may become hard to distinguish from those…

6 days, 1 hour ago @ deepmind.google
TacticAI: an AI assistant for football tactics

Research: As part of our multi-year collaboration with Liverpool FC, we develop a full AI system that can advise coaches on corner kicks. 'Corner taken quickly… Origi!'

Our first paper, Game Plan, looked at why AI should be used in assisting football tactics, highlighting examples such as analyzing penalty kicks.

Predicting corner kick outcomes with geometric deep learning. A corner kick is awarded when the ball passes over the byline, after touching a player of the defending team.

With TacticAI, we have developed a capable AI assistant for football tactics and achieved a milestone in developing useful assistants in sports AI.

We s…

1 month ago @ deepmind.google
SIMA generalist AI agent for 3D virtual environments

In a new technical report, we introduce SIMA, short for Scalable Instructable Multiworld Agent, a generalist AI agent for 3D virtual settings.

SIMA: a versatile AI agent. SIMA is an AI agent that can perceive and understand a variety of environments, then take actions to achieve an instructed goal.

What’s more, an agent trained in all but one game performed nearly as well on that unseen game as an agent trained specifically on it, on average.

We compare this performance with three types of generalist SIMA agent, each trained across multiple environments.

Advancing AI agent research SIMA’s results show the potential to develop a new wave of generalist, language-driven AI agents.

1 month, 1 week ago @ deepmind.google
Gemma: Introducing new state-of-the-art open models

Today, we’re excited to introduce a new generation of open models from Google to assist developers and researchers in building AI responsibly.

Gemma open models. Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.

This enables Gemma 2B and 7B to achieve best-in-class performance for their sizes compared to other open models.

And Gemma models are capable of running directly on a developer laptop or desktop computer.

Notably, Gemma surpasses significantly larger models on key benchmarks while adhering to our rigorous standards for safe and responsible outputs.

2 months ago @ blog.google
Our next-generation model: Gemini 1.5

A note from Google and Alphabet CEO Sundar Pichai: Last week, we rolled out our most capable model, Gemini 1.0 Ultra, and took a significant step forward in making Google products more helpful, starting with Gemini Advanced.

Today, developers and Cloud customers can begin building with 1.0 Ultra too — with our Gemini API in AI Studio and in Vertex AI.

Our teams continue pushing the frontiers of our latest models with safety at the core.

In fact, we’re ready to introduce the next generation: Gemini 1.5.

It shows dramatic improvements across a number of dimensions and 1.5 Pro achieves comparable quality to 1.0 Ultra, while using less compute.

2 months, 1 week ago @ blog.google
The next chapter of our Gemini era

We’re excited by the progress, for example with our Search Generative Experience, or SGE, which you can try in Search Labs.

Introducing Gemini Advanced. Bard has been the best way for people to directly experience our most capable models.

To reflect the advanced tech at its core, Bard will now simply be called Gemini.

The version with Ultra will be called Gemini Advanced, a new experience far more capable at reasoning, following instructions, coding, and creative collaboration.

You can start using Gemini Advanced by subscribing to the new Google One AI Premium plan, which offers the best of Google’s AI features in a single place.

2 months, 2 weeks ago @ blog.google
AlphaGeometry: An Olympiad-level AI system for geometry

Research: Our AI system surpasses the state-of-the-art approach for geometry problems, advancing AI reasoning in mathematics. Reflecting the Olympic spirit of ancient Greece, the International Mathematical Olympiad is a modern-day arena for the world's brightest high-school mathematicians.

In a paper published today in Nature, we introduce AlphaGeometry, an AI system that solves complex geometry problems at a level approaching a human Olympiad gold-medalist - a breakthrough in AI performance.

In a benchmarking test of 30 Olympiad geometry problems, AlphaGeometry solved 25 within the standard Olympiad time limit.

Solving Ol…

3 months, 1 week ago @ deepmind.google
Shaping the future of advanced robotics

Today we’re announcing a suite of advances in robotics research that bring us a step closer to this future.

AutoRT, SARA-RT, and RT-Trajectory build on our historic Robotics Transformers work to help robots make decisions faster, and better understand and navigate their environments.

So the AutoRT system comprises layers of practical safety measures from classical robotics.

SARA-RT: Making Robotics Transformers leaner and faster. Our new system, Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT), converts Robotics Transformer (RT) models into more efficient versions.

We will continue to tackle challenges in robotics today and to adapt to the new capabilities and technologies …

3 months, 3 weeks ago @ deepmind.google
Images altered to trick machine vision can influence humans too

Research: New research shows that even subtle changes to digital images, designed to confuse computer vision systems, can also affect human perception. Computers and humans see the world in different ways.

Our discovery highlights a similarity between human and machine vision, but also demonstrates the need for further research to understand the influence adversarial images have on people, as well as AI systems.

An adversarial image is one that has been subtly altered by a procedure that causes an AI model to confidently misclassify the image contents.

Indeed, security concerns have led researchers to investigate …

3 months, 3 weeks ago @ deepmind.google
2023: A Year of Groundbreaking Advances in AI and Computing

Company: This has been a year of incredible progress in the field of Artificial Intelligence (AI) research and its practical applications.

In this Year-in-Review post we’ll go over some of Google Research’s and Google DeepMind’s efforts putting these advances into practice safely throughout 2023.

In November, in partnership with YouTube, we announced Lyria, our most advanced AI music generation model to date.

Then in December, we launched Gemini, our most capable and general AI model.

Advances in capable language and multimodal models have also benefited our robotics research efforts.

4 months ago @ deepmind.google
FunSearch: Making new discoveries in mathematical sciences using Large Language Models

Research: By searching for “functions” written in computer code, FunSearch made the first discoveries in open problems in mathematical sciences using LLMs. Large Language Models (LLMs) are useful assistants: they excel at combining concepts and can read, write and code to help people solve problems.

This work represents the first time a new discovery has been made for challenging open problems in science or mathematics using LLMs.

FunSearch discovered new solutions for the cap set problem, a longstanding open problem in mathematics.

The corresponding function produced by FunSearch for each …

4 months, 1 week ago @ deepmind.google
Google DeepMind at NeurIPS 2023

We’ll be showcasing demos of our cutting edge AI models for global weather forecasting, materials discovery, and watermarking AI-generated content.

There will also be an opportunity to hear from the team behind Gemini, our largest and most capable AI model.

Here’s a look at some of our research highlights. Multimodality: language, video, action. UniSim is a universal simulator of real-world interactions.

Our approach surpassed current methods on vision and language tasks, and showed more potential to scale.

To help people solve problems and find what they’re looking for, AI models need to process billions of unique values efficiently.

4 months, 2 weeks ago @ deepmind.google
Introducing Gemini: our largest and most capable AI model

AI has the potential to create opportunities — from the everyday to the extraordinary — for people everywhere.

That’s what excites me: the chance to make AI helpful for everyone, everywhere in the world.

At the same time, developers are using our models and infrastructure to build new generative AI applications, and startups and enterprises around the world are growing with our AI tools.

Now, we’re taking the next step on our journey with Gemini, our most capable and general model yet, with state-of-the-art performance across many leading benchmarks.

I’m genuinely excited for what’s ahead, and for the opportunities Gemini will unlock for people everywhere.

4 months, 2 weeks ago @ blog.google
Millions of new materials discovered with deep learning

Research: AI tool GNoME finds 2.2 million new crystals, including 380,000 stable materials that could power future technologies. Modern technologies, from computer chips and batteries to solar panels, rely on inorganic crystals.

Computational approaches drawing from the Materials Project, Open Quantum Materials Database and WBM database boosted this number to 48,000 stable crystals.

Over the last decade, computational approaches led by the Materials Project and other groups have helped discover 28,000 new materials.

GNoME: Harnessing graph networks for materials exploration. GNoME uses two pipelines to discover low-energy (st…

4 months, 3 weeks ago @ deepmind.google
Transforming the future of music creation

Music AI tools – a set of tools we’re designing with artists, songwriters, and producers to help bolster their creative processes.

To develop these projects, we’ve brought together technical experts from across Google with a diverse range of world-renowned artists and songwriters to explore how generative music technologies can responsibly shape the future of music creation.

Exploring music AI tools with the industry. Our researchers have been exploring with artists, songwriters, and producers in YouTube’s Music AI Incubator how generative AI can best support the creative process, and working together to responsibly design a suite of music AI tools.

This work draws on our history of research…

5 months, 1 week ago @ deepmind.google
Google
latest post 3 hours ago
How Prewave is helping to secure deep supply chains worldwide with AI on Google Cloud

At Prewave, we’re driven by the mission to help companies make their entire supply chains more resilient, transparent, and sustainable.

That’s why we built the Prewave supply chain risk intelligence platform on Google Cloud from inception in 2019.

And second, it makes supply chains more sustainable by detecting and solving ESG risks, such as forced labor or environmental issues.

With this information, Prewave also maps our clients’ supply chains from immediate and sub-tier suppliers down to the raw materials’ providers.

We’re confident that our collaboration with Google Cloud will continue to bring us huge benefits as we help more companies internationally to achieve transparency, resilienc…

3 hours ago @ cloud.google.com
Google Cloud Innovator Juan Guillermo Gómez on transforming AI and the importance of community

Google Cloud Champion Innovators are a global network of more than 600 non-Google professionals, who are technical experts in Google Cloud products and services.

I find the Google Cloud developer blog and YouTube channel particularly instructive when it comes to real use cases.

Google Developer Groups, or GDGs, for example are a great way to network and keep up with the latest developments.

Becoming a Google Cloud Champion Innovator has been really beneficial.

Take the next steps on your Google Cloud journey and learn more about the Google Cloud Innovators Program, designed to help developers and practitioners grow their Google Cloud skills and advance their careers.

19 hours ago @ cloud.google.com
Innovating in patent search: How IPRally leverages AI with Google Kubernetes Engine and Ray

Next on the horizon is expanding to big data solutions with Ray Data and BigQuery.

With just two DevOps engineers and one MLOps engineer, IPRally was able to build its own customized ML platform with GKE and Ray as key components.

To more easily manage Ray, IPRally uses KubeRay, a specialized tool that simplifies Ray cluster management on Kubernetes.

IPRally uses Ray for the most advanced tasks like massive preprocessing of data and exploratory deep learning in R&D.

And by leveraging them within GKE, IPRally facilitates the creation of nodes on demand and scales GPU resources as needed, thus optimizing its operational costs.

6 days, 19 hours ago @ cloud.google.com
Meta Llama 3 Available Today on Google Cloud Vertex AI

When developers access Llama 3 through Vertex AI, they will soon have access to multiple state-of-the-art tuning options made available through Colab Enterprise.

Vertex AI also makes it simple for developers to evaluate their tuned Llama models, either through preconfigured notebooks directly in Model Garden or with Auto SxS, Vertex AI’s pairwise model-based evaluation tool.

And with robust features like Model Registry, Vertex AI makes it easy to manage and monitor model variants and endpoints and scale them appropriately for your needs.

A thriving, open ecosystem for enterprise model buildersWith over 130 first-party, third-party, and open models, Vertex AI Model Garden is a one-stop desti…

1 week ago @ cloud.google.com
Introducing LLM fine-tuning and evaluation in BigQuery

BigQuery allows you to analyze your data using a range of large language models (LLMs) hosted in Vertex AI including Gemini 1.0 Pro, Gemini 1.0 Pro Vision and text-bison.

These models work well for several tasks such as text summarization, sentiment analysis, etc.

Today, we are announcing support for customizing LLMs in BigQuery with supervised fine-tuning.

Feature walkthrough. To illustrate model fine-tuning, let’s look at a classification problem using text data.

To fine-tune and evaluate our model, we first create an evaluation table and a training table in BigQuery using a subset of this data available in Cloud Storage as follows:
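The SQL itself is elided from this snippet. As a language-agnostic sketch of the step it describes — splitting labeled text data into a training table and an evaluation table before supervised fine-tuning — here is a minimal illustration; the column names ("text", "label") and the 80/20 ratio are assumptions, not the article's:

```python
import random

# Hypothetical stand-in for the elided BigQuery SQL: split labeled text
# examples into a training table and an evaluation table before fine-tuning.
# The column names ("text", "label") and the 80/20 ratio are assumptions.
examples = [{"text": f"example {i}", "label": i % 3} for i in range(100)]

random.seed(42)
random.shuffle(examples)
split = int(0.8 * len(examples))
training_table = examples[:split]    # used for supervised fine-tuning
evaluation_table = examples[split:]  # held out to evaluate the tuned model

print(len(training_table), len(evaluation_table))  # 80 20
```

Shuffling before the split keeps the label distribution roughly balanced between the two tables, which matters when the evaluation set is later used to score the fine-tuned model.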

1 week, 2 days ago @ cloud.google.com
Google Cloud offers new AI, cybersecurity, and data analytics training to unlock job opportunities

This new initiative uses Google Cloud technology to help move certificate completers through the hiring process.

The U.S. Department of the Treasury will start using these new Google Cloud Certificates and labs for cyber and data analytics talent identification across the federal agency, per President Biden's Executive Order on AI.

Together, they were the pioneers in stacking Grow with Google certificates into four types of degree-earning credit certificates over the past two years.

(ISC)² Cybersecurity Workforce Study (2022).

The Google Cloud Certificates offer a recommendation from the American Council on Education® of up to 10 college credits.

1 week, 2 days ago @ cloud.google.com
Introducing multimodal and structured data embedding support in BigQuery

Embeddings represent real-world objects, like entities, text, images, or videos, as arrays of numbers (a.k.a. vectors) that machine learning models can easily process.

Embeddings are the building blocks of many ML applications such as semantic search, recommendations, clustering, outlier detection, named entity extraction, and more.

Last year, we introduced support for text embeddings in BigQuery, allowing machine learning models to understand real-world data domains more effectively. Earlier this year, we introduced vector search, which lets you index and work with billions of embeddings and build generative AI applications on BigQuery.

You can start using multimodal embedding in BigQuer…

1 week, 6 days ago @ cloud.google.com
Introducing Isolator: Enabling secure multi-party collaboration with healthcare data

Isolator can enable multi-party collaboration on work that involves handling raw and unprocessed, sensitive, and regulated data in an isolated, private, and compliant environment on Google Cloud.

Leverage complex data: Isolator can advance adoption of our Medical Image Suite, giving researchers and data scientists the efficiencies to improve workflows, and at a lower cost than keeping data on-premises.

Scaling advanced data analytics: Isolator can assist in reducing costs and improving results by moving and transforming data from siloed, on-premises databases to feature-rich and massively scalable analytic and data processing services such as Healthcare Data Engine and BigQuery.

This can he…

1 week, 6 days ago @ cloud.google.com
Build powerful gen AI applications with Firestore vector similarity search

Creating innovative AI-powered solutions for use cases such as product recommendations and chatbots often requires vector similarity search, or vector search for short.

At Google Cloud Next ‘24, we announced the Firestore vector search in preview, using exact K-nearest neighbor (KNN) search.

Developers can now perform vector search on transactional Firestore data without the hassle of copying data to another vector search solution, maintaining operational simplicity and efficiency.

Developers can now utilize Firestore vector search with popular orchestration frameworks such as LangChain and LlamaIndex through native integrations.

How to use KNN vector search in Firestore. The first step in ut…
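The exact K-nearest-neighbor search the preview performs can be illustrated from scratch. This sketch is not the Firestore client API, just the underlying nearest-neighbor computation on synthetic embeddings:

```python
import numpy as np

# Exact K-nearest-neighbor (KNN) vector search, the retrieval primitive behind
# the Firestore preview. This is a from-scratch illustration on synthetic
# embeddings, not the Firestore client API.
rng = np.random.default_rng(1)
doc_vectors = rng.normal(size=(1000, 8))  # one embedding per stored document

def knn(query, vectors, k=5):
    # Euclidean distance from the query to every stored vector (exact search).
    dists = np.linalg.norm(vectors - query, axis=1)
    order = np.argsort(dists)[:k]         # indices of the k closest documents
    return order, dists[order]

query = rng.normal(size=8)
top_idx, top_dists = knn(query, doc_vectors)
print(top_idx.shape)  # (5,)
```

Because every stored vector is scanned, exact KNN trades query latency for perfect recall; that is the trade-off a transactional store can accept at moderate collection sizes.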

1 week, 6 days ago @ cloud.google.com
Eating our own dogfood: Building an AI-driven business at Google Cloud Consulting

What you build now with gen AI is just a starting point, but it will be worth the investment.

Your success at building with and embracing gen AI will likely be a differentiator in the future — the sooner you get started, the better.

We’re excited to share more at Google Cloud Next and stay tuned for future posts that dive into the technical implementation and underpinnings for these two solutions.

Within Google Cloud Consulting, we provide guidance to customers on how to test and productionize GenAI use cases with Vertex AI.

Learn more about how Google Cloud Consulting can help you learn, build, operate and succeed.

1 week, 6 days ago @ cloud.google.com
Accelerating database modernization with Gemini in Database Migration Service

At Google Cloud Next ‘23, we announced support in Database Migration Service (DMS) for Oracle-to-PostgreSQL migrations to make them smooth, intuitive and efficient.

This week at Google Cloud Next ‘24, we unveiled Gemini for Google Cloud, a new generation of AI-assistive capabilities based on Google's Gemini family of models, including Gemini in Databases.

In addition, we announced the preview of a key feature of Gemini in Databases: assistive code and schema conversion, allowing you to speed up the modernization of Oracle databases to PostgreSQL databases on Google Cloud.

Finally, we announced Gemini-assisted code explainability, a key feature of Gemini in Databases, which simplifies code co…

1 week, 6 days назад @ cloud.google.com
Gemma on Google Kubernetes Engine DeepDive: New innovations to serve open generative AI models
Gemma on Google Kubernetes Engine DeepDive: New innovations to serve open generative AI models Gemma on Google Kubernetes Engine DeepDive: New innovations to serve open generative AI models

Gemma models achieve best-in-class performance for their size (Gemma 2B and Gemma 7B) compared to other open models, and are pre-trained and equipped with instruction-tuned variants to enable research and development.

This lets you easily deploy models from your preferred repositories to the infrastructure of your choice.

Earlier today we published a performance deepdive of Gemma on Google Cloud AI optimized infrastructure for training and serving generative AI workloads.

Hugging Face launched the Gemma model card, allowing you to deploy Gemma from Hugging Face directly to Google Cloud.

Once you click the Google Cloud option, you will be redirected to Vertex Model Garden, where you can choo…

2 weeks ago @ cloud.google.com
Gemini in Databases — supercharge database development and management

Yesterday, at Next ‘24, we announced a preview of Gemini in Databases — an AI powered database assistant that simplifies all aspects of database operations including assisted development, performance optimization, fleet management, governance, and migrations.

With Gemini in Databases, companies no longer need to exclusively rely on people with specialized skills and custom resources to manage their databases — Google's proactive AI assistant can help!

In most organizations, developers, database administrators, and platform engineers handle various aspects of database development and management.

With Gemini in Databases, database professionals get all of these capabilities in a single offeri…

2 weeks ago @ cloud.google.com
Natural language support in AlloyDB for building gen AI apps with real-time data

AlloyDB AI's natural language support, with features like end-user access controls and NL2SQL primitives, is now available in technology preview through the downloadable AlloyDB Omni edition.

Introducing natural language support to AlloyDB. AlloyDB AI provides the best of both worlds: accuracy, security, and flexibility.

These include: Intent clarification: because users are often imprecise with language, AlloyDB AI has mechanisms for interactively clarifying user intent.

How natural language support works. In its easiest-to-use form, AlloyDB AI provides an interface that takes in any natural language question, including ones that are broad or imprecise, and returns either accurate results or f…

2 weeks ago @ cloud.google.com
Introducing new Vertex AI text embedding models

Text embedding models are essential for many diverse natural-language processing (NLP) applications, from document retrieval and similarity measurement to classification and clustering.

Our text embedding models power applications across Google Cloud including BigQuery, Cloud Database, Vertex AI Search, and Workspace.

New text embedding models with stronger performance. We evaluated our new models and published the metrics and more technical details in the Google study: Gecko: Versatile text embeddings distilled from large language models.

The pricing for our text embedding models is $0.000025/1,000 characters for online requests and $0.00002/1,000 characters for batch requests.
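Since these prices are per 1,000 characters, estimating a workload's cost is a one-line calculation. The rates are copied from the sentence above; the 10-million-character workload is a made-up example:

```python
# Embedding cost estimate at the quoted per-1,000-character rates.
ONLINE_RATE = 0.000025  # $ per 1,000 characters, online requests
BATCH_RATE = 0.00002    # $ per 1,000 characters, batch requests

def embedding_cost_usd(num_chars: int, batch: bool = False) -> float:
    rate = BATCH_RATE if batch else ONLINE_RATE
    return num_chars / 1000 * rate

# Hypothetical workload: 10 million characters.
online = embedding_cost_usd(10_000_000)             # about $0.25
batch = embedding_cost_usd(10_000_000, batch=True)  # about $0.20
print(f"online=${online:.2f} batch=${batch:.2f}")
```

At these rates, batching the same corpus saves 20% relative to online requests, in exchange for the delayed turnaround batch processing implies.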

Dynamic embed…

2 weeks ago @ cloud.google.com
OpenAI
latest post 2 days, 4 hours ago
Introducing more enterprise-grade features for API customers

Customers with a sustained level of tokens per minute (TPM) usage on GPT-4 or GPT-4 Turbo can request access to provisioned throughput to get discounts ranging from 10–50% based on the size of the commitment.

Reduced costs on asynchronous workloads: Customers can use our new Batch API to run non-urgent workloads asynchronously.

Batch API requests are priced at 50% off shared prices, offer much higher rate limits, and return results within 24 hours.

We plan to keep adding new features focused on enterprise-grade security, administrative controls, and cost management.

For more information on these launches, visit our API documentation or get in touch with our team to discuss custom solution…

2 days, 4 hours ago @ openai.com
OpenAI’s commitment to child safety: adopting safety by design principles

OpenAI, alongside industry leaders including Amazon, Anthropic, Civitai, Google, Meta, Metaphysic, Microsoft, Mistral AI, and Stability AI, has committed to implementing robust child safety measures in the development, deployment, and maintenance of generative AI technologies as articulated in the Safety by Design principles.

By adopting comprehensive Safety by Design principles, OpenAI and our peers are ensuring that child safety is prioritized at every stage in the development of AI.

Responsibly source our training datasets, detect and remove child sexual abuse material (CSAM) and child sexual exploitation material (CSEM) from training data, and report any confirmed CSAM to the relevant a…

2 days, 4 hours ago @ openai.com
Introducing OpenAI Japan

Our new local presence also gets us closer to leading businesses like Daikin, Rakuten, and TOYOTA Connected who are using ChatGPT Enterprise to automate complex business processes, assist in data analysis, and optimize internal reporting.

ChatGPT also helps accelerate the efforts of local governments, such as Yokosuka City, which is leveraging the technology to improve the efficiency of public services in Japan.

Over the past year, the city has gradually provided ChatGPT access to almost all city employees, and 80% have reported increases in productivity.

Now Yokosuka City has formed a network with 21 local governments—including the Tokyo Metropolitan Government and the City of Kobe—to …

1 week, 4 days ago @ openai.com
Introducing improvements to the fine-tuning API and expanding our custom models program

Assisted Fine-Tuning

At DevDay last November, we announced a Custom Model program designed to train and optimize models for a specific domain, in partnership with a dedicated group of OpenAI researchers.

Since then, we've met with dozens of customers to assess their custom model needs and evolved our program to further maximize performance.

Today, we are formally announcing our assisted fine-tuning offering as part of the Custom Model program.

Fully custom-trained models imbue new knowledge from a specific domain by modifying key steps of the model training process using novel mid-training and post-training techniques.

Our team modified every step of the model training process, from domain-s…

3 weeks ago @ openai.com
Start using ChatGPT instantly

We’ve also introduced additional content safeguards for this experience, such as blocking prompts and generations in a wider range of categories.

There are many benefits to creating an account including the ability to save and review your chat history, share chats, and unlock additional features like voice conversations and custom instructions.

For anyone who has been curious about AI’s potential but didn’t want to go through the steps to set up an account, start using ChatGPT today.

3 weeks, 3 days ago @ openai.com
Navigating the Challenges and Opportunities of Synthetic Voices

We recognize that generating speech that resembles people's voices has serious risks, which are especially top of mind in an election year.

We are engaging with U.S. and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build. The partners testing Voice Engine today have agreed to our usage policies, which prohibit the impersonation of another individual or organization without consent or legal right.

In addition, our terms with these partners require explicit and informed consent from the original speaker and we don’t allow developers to build ways for individual users to create the…

3 weeks, 6 days ago @ openai.com
Sora: First Impressions

Starting his career at DreamWorks Animation, Don Allen III is a multidisciplinary creator, speaker and consultant who collaborates with major tech and entertainment companies on mixed reality, virtual reality and AI applications.

“For a long time I've been making augmented reality hybrid creatures that I think would be fun combinations in my head.

Now I have a much easier way of prototyping the ideas before I fully build out the 3-D characters to place in spatial computers.” Don cites Sora’s “weirdness” as its greatest strength: “It’s not bound by traditional laws of physics or conventions of thought.” He says that working with Sora shifted his focus from “technical hurdle…

1 month ago @ openai.com
Global news partnerships: Le Monde and Prisa Media

Echoing this sentiment, Louis Dreyfus, CEO of Le Monde, stated, "At the moment we are celebrating the 80th anniversary of Le Monde, this partnership with OpenAI allows us to expand our reach and uphold our commitment to providing accurate, verified, balanced news stories at scale.

Collaborating with OpenAI ensures that our authoritative content can be accessed and appreciated by a broader, more diverse audience. Every shift in the media landscape has presented Le Monde with new opportunities.

From the transition to digital platforms to embracing the era of free media, Le Monde has consistently seized these moments to underscore its commitment to independence, expertise, and journalistic i…

1 month, 1 week ago @ openai.com
OpenAI announces new members to board of directors

Additionally, Sam Altman, CEO, will rejoin the OpenAI Board of Directors. Sue, Nicole and Fidji have experience in leading global organizations and navigating complex regulatory environments, including backgrounds in technology, nonprofit and board governance.

They will work closely with current board members Adam D’Angelo, Larry Summers and Bret Taylor as well as Sam and OpenAI’s senior management. Bret Taylor, Chair of the OpenAI board, stated, “I am excited to welcome Sue, Nicole, and Fidji to the OpenAI Board of Directors.

She also served as President of Sony Entertainment, Inc., and simultaneously served as President of Sony Corporation of America.

She also serves as a member of …

1 month, 2 weeks ago @ openai.com
Review completed & Altman, Brockman to continue to lead OpenAI

The Special Committee of the OpenAI Board today announced the completion of the review by WilmerHale.

The firm conducted dozens of interviews with members of OpenAI’s prior Board, OpenAI executives, advisors to the prior Board, and other pertinent witnesses; reviewed more than 30,000 documents; and evaluated various corporate actions.

“We have unanimously concluded that Sam and Greg are the right leaders for OpenAI,” stated Bret Taylor, Chair of the OpenAI Board.

The Special Committee acknowledged the important work done by WilmerHale in conducting this extensive review and thanked OpenAI current and former Board members, advisors and employees for their cooperation.

The Special Commi…

1 month, 2 weeks ago @ openai.com
OpenAI and Elon Musk

Date: January 31, 2018 at 11:54:30 PM PST
Subject: Re: Top AI institutions today

Working at the cutting edge of AI is unfortunately expensive.

For example, in addition to DeepMind, Google also has Google Brain, Research, and Cloud.

If historical trends are any indication, progress in AI is primarily driven by systems - compute, data, infrastructure.

Not only that, but any algorithmic advances published in a paper somewhere can be almost immediately re-implemented and incorporated.

The “second stage” would be a full self driving solution based on large-scale neural network training, which OpenAI expertise could significantly help accelerate.

1 month, 3 weeks ago @ openai.com
Video generation models as world simulators

This technical report focuses on (1) our method for turning visual data of all types into a unified representation that enables large-scale training of generative models, and (2) qualitative evaluation of Sora’s capabilities and limitations.

Model and implementation details are not included in this report.

Much prior work has studied generative modeling of video data using a variety of methods, including recurrent networks,[^1][^2][^3] generative adversarial networks,[^4][^5][^6][^7] autoregressive transformers,[^8][^9] and diffusion models.[^10][^11][^12]

These works often focus on a narrow category of visual data, on shorter videos, or on videos of a fixed size.

Sora is a generalist mo…

2 months, 1 week ago @ openai.com
Disrupting malicious uses of AI by state-affiliated threat actors

Based on collaboration and information sharing with Microsoft, we disrupted five state-affiliated malicious actors: two China-affiliated threat actors known as Charcoal Typhoon and Salmon Typhoon; the Iran-affiliated threat actor known as Crimson Sandstorm; the North Korea-affiliated actor known as Emerald Sleet; and the Russia-affiliated actor known as Forest Blizzard.

The identified OpenAI accounts associated with these actors were terminated.

Salmon Typhoon used our services to translate technical papers, retrieve publicly available information on multiple intelligence agencies and regional threat actors, assist with coding, and research common ways processes could be hidden on a system.…

2 months, 1 week ago @ openai.com
Memory and new controls for ChatGPT

We’re testing memory with ChatGPT.

Remembering things you discuss across all chats saves you from having to repeat information and makes future conversations more helpful.

You're in control of ChatGPT's memory.

You can explicitly tell it to remember something, ask it what it remembers, and tell it to forget conversationally or through settings.

We are rolling out to a small portion of ChatGPT free and Plus users this week to learn how useful it is.

2 months, 1 week ago @ openai.com
Building an early warning system for LLM-aided biological threat creation

We believe that these efforts would benefit from broader input, and that methods-sharing could also be of value to the AI risk research community.

To this end, we are presenting some of our early work—today, focused on biological risk.

This evaluation aims to measure whether models could meaningfully increase malicious actors’ access to dangerous information about biological threat creation, compared to the baseline of existing resources (i.e., the internet).

Each participant was then asked to complete a set of tasks covering aspects of the end-to-end process for biological threat creation.

We also discuss the limitations of statistical significance as an effective method of measuring m…

2 months, 3 weeks ago @ openai.com
Microsoft
latest post 6 days, 19 hours ago
SAMMO: A general-purpose framework for prompt optimization

New generations of language models like GPT-4 and Mixtral 8x7B advance the capability to process long input texts.

The first structure, the task description, remains static and independent of the input as a result of conventional prompt optimization techniques.

Despite previous efforts in prompt optimization, the evolution towards more complex prompt structures has rendered many older strategies ineffective in this new context.

SAMMO: A prompt optimization approach

To address these challenges, we developed the Structure-Aware Multi-objective Metaprompt Optimization (SAMMO) framework.

We compared it against Automatic Prompt Optimization (APO) and GrIPS, applying open-source mode…

6 days, 19 hours ago @ microsoft.com
Research Focus: Week of April 15, 2024

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

NEW RESEARCH

Appropriate reliance on Generative AI: Research synthesis

Appropriate reliance on AI happens when people accept correct AI outputs and reject incorrect ones.


They also develop AFRICOMET: COMET evaluation metrics for African languages by leveraging DA data from well-resourced languages and an African-centric multilingual encoder (AfroXLMR) to create state-of-the-…

1 week ago @ microsoft.com
Microsoft at NSDI 2024: Discoveries and implementations in networked systems

Networked systems and their applications are essential in building the reliable, scalable, secure, and innovative infrastructure required to meet society’s evolving needs.

Microsoft is honored to support NSDI ’24 as a returning sponsor.

We are pleased to announce that 23 papers from Microsoft researchers and their partners have been accepted to the conference.

They encompass both early-stage research and systems already deployed in production.


1 week, 1 day ago @ microsoft.com
Abstracts: April 16, 2024

GRETCHEN HUIZINGA: Welcome to Abstracts, a Microsoft Research Podcast that puts the spotlight on world-class research in brief.

CHAKRABORTY: So satellite connectivity is nothing new and has been there for long.

So we are talking about the satellites that are at least 10 to 20 times cheaper and smaller than state-of-the-art satellites.

So the device sends some packet to the satellite; satellite sends some packet to the device—it’s all about packet exchange.

So our vision is clear: to bring 24-7 connectivity for devices anywhere on Earth with just a click of power button.

1 week, 1 day ago @ microsoft.com
Ideas: Language technologies for everyone with Kalika Bali

The new series “Ideas” debuts with guest Kalika Bali. The speech and language tech researcher talks sci-fi and its impact on her career, the design thinking philosophy behind her research, and the “outrageous idea” she had to work with low-resource languages. The post Ideas: Language technologies for everyone with Kalika Bali appeared first on Microsoft Research.

1 week, 6 days ago @ microsoft.com
Research Focus: Week of April 1, 2024

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

Existing work on tool-augmented LLMs primarily focuses on the broad coverage of tools and the flexibility of adding new tools.

However, much of this research has been confined to English, leaving LLM building and evaluation for non-English languages relatively unexplored.

NEW RESEARCH

Training Audio Captioning Models without Audio

Automated Audio Captioning (AAC) is a process that creates text descriptions for audio recordings.

Their approach leverages CLAP, a contrastive learning model that uses audio an…

3 weeks ago @ microsoft.com
AI Frontiers: Rethinking intelligence with Ashley Llorens and Ida Momennejad

And so I just want to start here: for you, Ida, what is general intelligence?

Different people at different times provide different criteria for what would be the artificial general intelligence notion.

One is artificial general intelligence and the other is humanlike intelligence or human-level intelligence.

Artificial general intelligence and humanlike, human-level intelligence—how do these two concepts relate to you?

LLORENS: So it sounds like a very extensive set of experiments across many different tasks and with many different leading AI models, and you’ve uncovered a lack of robustness across some of these different tasks.

3 weeks, 6 days ago @ microsoft.com
Learning from interaction with Microsoft Copilot (web)

AI systems like Bing and Microsoft Copilot (web) are as good as they are because they continuously learn and improve from people’s interactions.

[1]How are people using Copilot (web)?

Conversely, they express explicit frustration or switch topics when encountering mistakes in the response from Copilot (web).

These three reports present a comprehensive and multi-faceted approach to dynamically learning from conversation logs in Copilot (web) at scale.

[1] The research was performed only on fully de-identified interaction data from Copilot (web) consumers.

4 weeks ago @ microsoft.com
Abstracts: March 21, 2024

GRETCHEN HUIZINGA: Welcome to Abstracts, a Microsoft Research Podcast that puts the spotlight on world-class research in brief.

These two examples are also the differences from other deep learning OFDFT works.

This is the generalization challenge and is one of the major challenges of deep learning method for molecular science applications.

This somehow shows the benefits of leveraging the OFDFT framework for using a deep learning method to solve molecular tasks.

You can also read it on arXiv, or you can check out the March 2024 issue of Nature Computational Science.

1 month ago @ microsoft.com
Research Focus: Week of March 18, 2024

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

NEW RESEARCH

Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning

Large language models (LLMs) have shown impressive capabilities, yet they still struggle with math reasoning.

Given that adding more concise CoT examples in the prompt can improve LLM reasoning performance, CoT-Influx employs a coarse-to-fine pruner to maximize the input of effective and concise CoT examples.

The pruner first selects as many crucial CoT examples as possible and then prunes unimportant tokens to fit the cont…
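The two-stage idea described above can be sketched as follows; the usefulness and importance scores here are hypothetical inputs, whereas the CoT-Influx pruner learns them:

```python
# A sketch of coarse-to-fine pruning: first rank whole chain-of-thought (CoT)
# examples by usefulness, then drop the least important tokens until the
# prompt fits a token budget. Scores below are made-up illustrative inputs.

def coarse_to_fine_prune(examples, budget):
    """examples: list of (example_score, [(token, importance), ...])."""
    # Coarse stage: order whole CoT examples by usefulness.
    ranked = sorted(examples, key=lambda e: e[0], reverse=True)
    tokens = [ts for _, toks in ranked for ts in toks]
    if len(tokens) <= budget:
        return [tok for tok, _ in tokens]
    # Fine stage: keep only the `budget` most important tokens, in order.
    cutoff = sorted((imp for _, imp in tokens), reverse=True)[budget - 1]
    kept = []
    for tok, imp in tokens:
        if imp >= cutoff and len(kept) < budget:
            kept.append(tok)
    return kept

examples = [(0.9, [("Think", 0.8), ("step", 0.9), ("by", 0.2), ("step", 0.7)]),
            (0.4, [("Example", 0.6), ("two", 0.1)])]
print(coarse_to_fine_prune(examples, budget=4))  # ['Think', 'step', 'step', 'Example']
```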

1 month ago @ microsoft.com
Intelligent monitoring: Towards AI-assisted monitoring for cloud services

Microsoft cloud monitor platforms

When they are properly configured, cloud monitors can help to meet monitoring requirements.

Notably, missing monitors and alerts constituted over 40 percent of all misdetections, indicating the complexity of determining what to monitor in cloud services.

Data-driven intelligent monitoring

Organizing monitor data

Because there is no standardized approach to building monitors, monitor data often lacks structure.

In our paper, “Intelligent Monitoring Framework for Cloud Services: A Data-Driven Approach (opens in new tab),” to be presented at ICSE 2024 (opens in new tab), we propose a data-driven approach for developing this ontology.

This shows us that we can pre…

1 month ago @ microsoft.com
Introducing Garnet – an open-source, next-generation, faster cache-store for accelerating applications and services

Researchers at Microsoft have been working for nearly a decade to address the increasing demand for data storage mechanisms to support the rapid advances in interactive web applications and services.

This has fueled a growing cache-store industry, including many open-source systems, such as Redis, Memcached, KeyDB, and Dragonfly.

Garnet demonstrates better client latency at the 99th and 99.9th percentiles, which is critical to real-world scenarios.

We compare Garnet to the latest open-source versions of Redis (opens in new tab) (v7.2), KeyDB (opens in new tab) (v6.3.4), and Dragonfly (opens in new tab) (v6.2.11).

We see in Figure 4 that Garnet’s latency is low across the board.
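Tail-latency claims like these come from percentiles over raw request latencies; a minimal nearest-rank sketch with made-up numbers:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample >= p percent of all samples."""
    ranked = sorted(samples)
    k = math.ceil(p / 100 * len(ranked)) - 1
    return ranked[k]

# 1,000 simulated request latencies: mostly fast, with a slow tail.
latencies_ms = [0.2] * 984 + [1.5] * 14 + [12.0] * 2
print(percentile(latencies_ms, 99))    # 1.5  -> p99
print(percentile(latencies_ms, 99.9))  # 12.0 -> p99.9
```

The p99.9 value is dominated by the two slowest requests, which is why tail percentiles, not averages, are the headline numbers in cache-store benchmarks.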

1 month, 1 week ago @ microsoft.com
Exploring how context, culture, and character matter in avatar research

In our paper, “Ecological Validity and the Evaluation of Avatar Facial Animation Noise,” presented at ANIVAE 2024, we explore the challenge of evaluating avatar noise without a standardized approach.

Traditional methods, which present participants with isolated facial animation noise to gauge perception thresholds, fall short of reflecting real-life avatar interactions.

Our approach emphasizes ecological validity—the extent to which experiments mimic real-world conditions—as central in assessing avatar noise.

Isolated clips, on the other hand, led to greater annoyance with facial animation noise, suggesting the importance of social context over hyper-realistic animation.

This indicates that…

1 month, 1 week ago @ microsoft.com
Scaling early detection of esophageal cancer with AI

Microsoft Research and Cyted have collaborated to build novel AI models (opens in new tab) to scale the early detection of esophageal cancer.

Fewer than 1 in 5 patients survive five years after diagnosis, making early detection of this disease critical to improving a patient’s chances.

However, early detection of BE has typically involved an endoscopic biopsy, a procedure that many people find uncomfortable and invasive.

As we move forward, the scalability of this technology holds the promise for widespread adoption of early detection in the fight against esophageal cancer.

As researchers, it has been exciting to work closely with Cyted and be part of the long path towards early detection of…

1 month, 2 weeks ago @ microsoft.com
Improving LLM understanding of structured data and exploring advanced prompting methods

Our paper, “Table Meets LLM: Can Large Language Models Understand Structured Table Data?

Despite the simplicity of the benchmark tasks, the highest overall accuracy across seven tasks is only 65.43 percent.

Choosing the right combination of input designs can significantly enhance LLMs’ understanding of structured data.

We suggest future research should prioritize the integration of structural information to improve performance with various structured data types.

Additionally, we propose exploring LLMs’ ability to use external tools or agents for improved handling of structured data, opening new avenues for application.

1 month, 2 weeks ago @ microsoft.com
MIT AI
latest post 1 day, 15 hours ago
Mapping the brain pathways of visual memorability

To do this, they set out to map the spatio-temporal brain dynamics involved in recognizing a visual image.

What they found was that a more distributed network of brain regions than previously thought is actively involved in the encoding and retention processes that underpin memorability.

“We've identified a brain signature of visual memorability that emerges around 300 milliseconds after seeing an image, involving areas across the ventral occipital cortex and temporal cortex, which processes information like color perception and object recognition.

They created a “representational matrix,” which is like a detailed chart, showing how similar neural responses are in various brain regions.

Wh…

1 day, 15 hours ago @ news.mit.edu
This tiny chip can safeguard user data while enabling efficient computing on a smartphone

Their chip can keep a user’s health records, financial information, or other sensitive data private while still enabling huge AI models to run efficiently on devices.

A digital IMC chip performs computations inside a device’s memory, where pieces of a machine-learning model are stored after being moved over from a central server.

In a side-channel attack, a hacker monitors the chip’s power consumption and uses statistical techniques to reverse-engineer data as the chip computes.

First, they employed a security measure where data in the IMC are split into random pieces.
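In software terms the idea resembles additive secret sharing: each value is split into random shares that individually look like noise but recombine exactly. A sketch of the principle (the chip implements this at the hardware level):

```python
import secrets

MOD = 2 ** 16  # value range of this toy scheme

def split_into_shares(value, n_shares=3):
    """Split `value` into random shares that sum to it modulo MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(n_shares - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

def combine(shares):
    return sum(shares) % MOD

record = 12345
shares = split_into_shares(record)
print(combine(shares) == record)  # True: shares recombine exactly
```

Because each share is uniformly random on its own, observing the power drawn while computing on a single share leaks nothing about the original value.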

Safety testing

To test their chip, the researchers took on the role of hackers and tried to steal secret information using s…

2 days, 7 hours ago @ news.mit.edu
To build a better AI helper, start by modeling the irrational behavior of humans

To build AI systems that can collaborate effectively with humans, it helps to have a good model of human behavior to start with.

Ultimately, this work could help scientists teach AI systems how humans behave, which could enable these systems to respond better to their human collaborators.

Being able to model human behavior is an important step toward building an AI agent that can actually help that human,” he says.

Modeling behavior

Researchers have been building computational models of human behavior for decades.

Moreover, the researchers saw that their model of human behavior matched up well with measures of player skill (in chess matches) and task difficulty.

6 days, 7 hours ago @ news.mit.edu
Using deep learning to image the Earth’s planetary boundary layer

Although the troposphere is often thought of as the closest layer of the atmosphere to the Earth’s surface, the planetary boundary layer (PBL) — the lowest layer of the troposphere — is actually the part that most significantly influences weather near the surface.

This PBL-focused research effort builds on more than a decade of related work on fast, operational neural network algorithms developed by Lincoln Laboratory for NASA missions.

According to a Global Drought Snapshot report released last year, droughts are a pressing planetary issue that the global community needs to address.

Lack of humidity near the surface, specifically at the level of the PBL, is the leading indicator of drought…

6 days, 16 hours ago @ news.mit.edu
Advancing technology for aquaculture

Like land-based farming, shellfish aquaculture requires healthy seed production in order to maintain a sustainable industry.

Aquaculture hatchery production of shellfish larvae — seeds — requires close monitoring to track mortality rates and assess health from the earliest stages of life.

Careful observation is necessary to inform production scheduling, determine effects of naturally occurring harmful bacteria, and ensure sustainable seed production.

ARC faces challenges with manually quantifying larvae classes, an important step in their seed production process.

Developing an automated identification and counting system will help to improve this step in the production process with time and…

6 days, 16 hours ago @ news.mit.edu
3 Questions: Enhancing last-mile logistics with machine learning

Q: What is the vehicle routing problem, and how do traditional operations research (OR) methods address it?

A: The vehicle routing problem is faced by pretty much every logistics and delivery company, like USPS, Amazon, UPS, FedEx, and DHL, every single day.

To solve the vehicle routing problem, we obviously can’t do our modeling without proper demand information and, ideally, customer-related characteristics.

Then there are a bunch of other equations that define the inner workings of a routing problem.
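As a concrete baseline, the classical nearest-neighbor construction heuristic illustrates the kind of rule OR solvers start from and improve upon (straight-line distances and made-up coordinates):

```python
import math

def nearest_neighbor_route(depot, stops):
    """Build a route by always driving to the closest unvisited stop."""
    route, current, remaining = [depot], depot, list(stops)
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    route.append(depot)  # return to the depot
    return route

stops = [(2, 0), (0, 2), (1, 1)]
print(nearest_neighbor_route((0, 0), stops))
# [(0, 0), (1, 1), (2, 0), (0, 2), (0, 0)]
```

Production solvers add the demand, capacity, and time-window constraints described above on top of construction heuristics like this one.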

Q: You’re currently applying machine learning to the vehicle routing problem.

1 week, 1 day ago @ news.mit.edu
A crossroads for computing at MIT

On Vassar Street, in the heart of MIT’s campus, the MIT Stephen A. Schwarzman College of Computing recently opened the doors to its new headquarters in Building 45.

“The college has a broad mandate for computing across MIT,” says Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing and the Henry Ellis Warren Professor of Electrical Engineering and Computer Science.

The building will also accommodate 50 computing research groups, which correspond to the number of new faculty the college is hiring — 25 in core computing positions and 25 in shared positions with departments at MIT.

Organized by various MIT faculty, the 12 sessions in the series delved into exciting areas of com…

1 week, 6 days ago @ news.mit.edu
New AI method captures uncertainty in medical images

In biomedicine, segmentation involves annotating pixels from an important structure in a medical image, like an organ or cell.

However, these models typically only provide one answer, while the problem of medical image segmentation is often far from black and white.

Rakic is lead author of a paper with others at MIT, the Broad Institute of MIT and Harvard, and Massachusetts General Hospital that introduces a new AI tool that can capture the uncertainty in a medical image.

Addressing ambiguity

AI systems for medical image segmentation typically use neural networks.

The researchers also developed a version of Tyche that can be used with an existing, pretrained model for medical image segmentat…

1 week, 6 days ago @ news.mit.edu
A faster, better way to prevent an AI chatbot from giving toxic responses

To prevent this and other safety issues, companies that build large language models typically safeguard them using a process called red-teaming.

But this only works effectively if engineers know which toxic prompts to use.

The technique outperformed human testers and other machine-learning approaches by generating more distinct prompts that elicited increasingly toxic responses.

This trial-and-error process rewards the red-team model for generating prompts that trigger toxic responses from the chatbot being tested.

“If you are releasing a new AI model and are concerned about whether it will behave as expected, consider using curiosity-driven red-teaming,” says Agrawal.

2 weeks, 1 day ago @ news.mit.edu
Extracting hydrogen from rocks

Geologic hydrogen, as it’s known, is produced when water reacts with iron-rich rocks, causing the iron to oxidize.

Abate is looking to jump-start the natural hydrogen production process, implementing “proactive” approaches that involve stimulating production and harvesting the gas.

In December, French President Emmanuel Macron said his government would provide funding to explore natural hydrogen.

The projects themselves are diverse, ranging from applying industrial oil and gas methods for hydrogen production and extraction to developing models to understand hydrogen formation in rocks.

The lab-scale device will also inform the design of a real-world reactor that can accelerate hydrogen prod…

2 weeks, 2 days ago @ news.mit.edu
When an antibiotic fails: MIT scientists are using AI to target “sleeper” bacteria

When an infection is treated repeatedly, clinicians run the risk of bacteria becoming resistant to the antibiotics.

One well-documented possibility is that the bacteria are becoming metabolically inert, escaping detection of traditional antibiotics that only respond to metabolic activity.

Another revelation was semapimod's ability to disrupt the membranes of so-called “Gram-negative” bacteria, which are known for their high intrinsic resistance to antibiotics due to their thicker, less-penetrable outer membrane.

Examples of Gram-negative bacteria include E. coli, A. baumannii, Salmonella, and Pseudomonas, all of which are challenging to find new antibiotics for.

Valeri recalls a quote from …

2 weeks, 2 days ago @ news.mit.edu
A new computational technique could make it easier to engineer useful proteins

This process has yielded optimized versions of many important proteins, including green fluorescent protein (GFP).

MIT researchers have now developed a computational approach that makes it easier to predict mutations that will lead to better proteins, based on a relatively small amount of data.

These landscapes contain peaks that represent fitter proteins and valleys that represent less fit proteins.

To overcome this problem, the researchers used an existing computational technique to “smooth” the fitness landscape.

The researchers now plan to use this computational technique on data that Bracha has been generating on voltage indicator proteins.

3 weeks, 1 day ago @ news.mit.edu
Most work is new work, long-term study of U.S. census data shows

In 1900, Orville and Wilbur Wright listed their occupations as “Merchant, bicycle” on the U.S. census form.

Most work in the U.S. is new work, as U.S. census forms reveal.

Finally, the study brings novel data to a tricky question: To what extent does technology create new jobs, and to what extent does it replace jobs?

“In the first period, from 1940 to 1980, there’s a lot of work being created for people without college degrees, a lot of clerical work and production work, middle-skill work.

Support for the research was provided, in part, by the Carnegie Corporation; Google; Instituut Gak; the MIT Work of the Future Task Force; Schmidt Futures; the Smith Richardson Foundation; and the Washin…

3 weeks, 3 days ago @ news.mit.edu
Does technology help or hurt employment?

Overall, does technology replace more jobs than it creates?

On net, the study finds, and particularly since 1980, technology has replaced more U.S. jobs than it has generated.

That has allowed them, for the first time, to quantify the effects of technology over both job loss and job creation.

“There’s no law these things have to be one-for-one balanced, although there’s been no period where we haven’t also created new work,” Autor observes.

“The missing link was documenting and quantifying how much technology augments people’s jobs,” Autor says.

3 weeks, 3 days ago @ news.mit.edu
Researchers create “The Consensus Game” to elevate AI’s text comprehension and generation skills

“Imagine a new way to help language models understand and generate text, like a game.

The math behind this partially inspired The Consensus Game.

The Consensus Game system reaches equilibrium as an agreement, ensuring accuracy and fidelity to the model's original insights.

In practice, implementing the Consensus Game approach to language model querying, especially for question-answering tasks, does involve significant computational challenges.

“The proposal by the MIT researchers is an innovative game-theoretic framework for decoding from language models through solving the equilibrium of a consensus game.

3 weeks, 5 days ago @ news.mit.edu
Berkeley AI
last post 1 month ago
Modeling Extremely Large Images with xT

As computer vision researchers, we believe that every pixel can tell a story.

However, there seems to be a writer’s block settling into the field when it comes to dealing with large images.

Today, we make one of two sub-optimal choices when handling large images: down-sampling or cropping.

Why bother handling large images anyways?

That’s basically what we do with large images with xT.

1 month ago @ bair.berkeley.edu
Modeling Extremely Large Images with xT

As computer vision researchers, we believe that every pixel can tell a story. However, there seems to be a writer’s block settling into the field when it comes to dealing with large images. Large images are no longer rare—the cameras we carry in our pockets and those orbiting our planet snap pictures so big and detailed that they stretch our current best models and hardware to their breaking points when handling them. Generally, we face a quadratic increase in memory usage as a function of image size.
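The quadratic memory growth can be made concrete with a little arithmetic. This sketch assumes a ViT-style split into 16x16-pixel patches (our assumption, not a detail from the post) and counts entries in a full self-attention matrix:

```python
def attention_matrix_entries(side_px: int, patch: int = 16) -> int:
    """Entries in a full self-attention matrix for a square image split
    into (side_px / patch)^2 tokens. Assumes ViT-style patchification;
    illustrates the quadratic blow-up the post describes, not xT itself."""
    tokens = (side_px // patch) ** 2
    return tokens * tokens

# Doubling the image side quadruples the token count,
# so the attention matrix grows by a factor of 16.
```

At 256 px the matrix has 65,536 entries; at 512 px it is 16x larger, which is why naive scaling to satellite-sized images breaks current hardware.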

Today, we make one of two sub-optimal choices when handling large images: down-sampling or cropping. These two methods incur significant losses in the amount of information and context present…

1 month ago @ localhost:4000
2024 BAIR Graduate Directory

Every year, the Berkeley Artificial Intelligence Research (BAIR) Lab graduates some of the most talented and innovative minds in artificial intelligence and machine learning. Our Ph.D. graduates have each expanded the frontiers of AI research and are now ready to embark on new adventures in academia, industry, and beyond.

These fantastic individuals bring with them a wealth of knowledge, fresh ideas, and a drive to continue contributing to the advancement of AI. Their work at BAIR, ranging from deep learning, robotics, and natural language processing to computer vision, security, and much more, has contributed significantly to their fields and has had transformative impacts on society.

This…

1 month, 2 weeks ago @ localhost:4000
2024 BAIR Graduate Directory

Every year, the Berkeley Artificial Intelligence Research (BAIR) Lab graduates some of the most talented and innovative minds in artificial intelligence and machine learning.

Our Ph.D. graduates have each expanded the frontiers of AI research and are now ready to embark on new adventures in academia, industry, and beyond.

These fantastic individuals bring with them a wealth of knowledge, fresh ideas, and a drive to continue contributing to the advancement of AI.

Join us in celebrating the achievements of BAIR’s latest PhD graduates.

Thank you to our friends at the Stanford AI Lab for this idea!

1 month, 2 weeks ago @ bair.berkeley.edu
The Shift from Models to Compound AI Systems

In this post, we analyze the trend toward compound AI systems and what it means for AI developers.

We argue that compound AI systems will likely be the best way to maximize AI results in the future, and might be one of the most impactful trends in AI in 2024.

We define a Compound AI System as a system that tackles AI tasks using multiple interacting components, including multiple calls to models, retrievers, or external tools.
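The definition above can be sketched in miniature: one retriever call feeding one model call, rather than a single monolithic model. Both `fake_retriever` and `fake_model` are hypothetical stand-ins we invented for illustration, not components named in the post.

```python
def fake_retriever(query, docs):
    # Hypothetical stand-in: rank documents by word overlap with the query.
    words = set(query.lower().split())
    return max(docs, key=lambda d: len(words & set(d.lower().split())))

def fake_model(prompt):
    # Hypothetical stand-in for an LLM call; just extracts the context line.
    return prompt.splitlines()[0].removeprefix("Context: ")

def compound_answer(query, docs):
    """A compound AI system in miniature: multiple interacting components
    (a retriever plus a model call) cooperating on one task."""
    context = fake_retriever(query, docs)
    return fake_model(f"Context: {context}\nQuestion: {query}")
```

Real compound systems chain many such components, including tools and code interpreters, which is exactly why optimizing them end to end is harder than tuning one model.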

Developing Compound AI SystemsWhile compound AI systems can offer clear benefits, the art of designing, optimizing, and operating them is still emerging.

However, new compound AI systems contain non-differentiable components like search engines or code interpreters, a…

2 months, 1 week ago @ bair.berkeley.edu
The Shift from Models to Compound AI Systems

AI caught everyone’s attention in 2023 with Large Language Models (LLMs) that can be instructed to perform general tasks, such as translation or coding, just by prompting. This naturally led to an intense focus on models as the primary ingredient in AI application development, with everyone wondering what capabilities new LLMs will bring.

As more developers begin to build using LLMs, however, we believe that this focus is rapidly changing: state-of-the-art AI results are increasingly obtained by compound systems with multiple components, not just monolithic models.

For example, Google’s AlphaCode 2 set state-of-the-art results in programming through a carefully engineered system that uses L…

2 months, 1 week ago @ localhost:4000
Ghostbuster: Detecting Text Ghostwritten by Large Language Models

The structure of Ghostbuster, our new state-of-the-art method for detecting AI-generated text. Large language models like ChatGPT write impressively well—so well, in fact, that they’ve become a problem. Students have begun using these models to ghostwrite assignments, leading some schools to ban ChatGPT. In addition, these models are also prone to producing text with factual errors, so wary readers may want to know if generative AI tools have been used to ghostwrite news articles or other sources before trusting them.

What can teachers and consumers do? Existing tools to detect AI-generated text sometimes do poorly on data that differs from what they were trained on. In addition, if these m…

5 months, 1 week ago @ localhost:4000
Ghostbuster: Detecting Text Ghostwritten by Large Language Models

The structure of Ghostbuster, our new state-of-the-art method for detecting AI-generated text.

Large language models like ChatGPT write impressively well—so well, in fact, that they’ve become a problem.

Existing tools to detect AI-generated text sometimes do poorly on data that differs from what they were trained on.

Our recent paper introduces Ghostbuster, a state-of-the-art method for detecting AI-generated text.

Many current AI-generated text detection systems are brittle when classifying different types of text (e.g., different writing styles, or different text generation models or prompts).
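The family of detectors Ghostbuster belongs to scores documents using features computed from per-token probabilities. Here is a toy version in that spirit: the feature set, weights, and function names are ours, and Ghostbuster's actual features come from a structured search over feature combinations, not these two hand-picked statistics.

```python
from math import exp, log
from statistics import fmean, pvariance

def toy_features(token_probs):
    """Two toy probability-based features: mean log-probability and
    log-probability variance. Illustrative only, not Ghostbuster's features."""
    logp = [log(p) for p in token_probs]
    return fmean(logp), pvariance(logp)

def toy_score(token_probs, w=(1.0, -1.0), b=0.0):
    # Linear classifier over the features; weights are illustrative.
    m, v = toy_features(token_probs)
    z = w[0] * m + w[1] * v + b
    return 1.0 / (1.0 + exp(-z))  # pseudo-probability the text is AI-generated
```

The intuition the toy captures: model-generated text tends to have uniformly high token probabilities, while human text is "burstier", so mean and variance of log-probability carry signal.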

5 months, 1 week ago @ bair.berkeley.edu
Asymmetric Certified Robustness via Feature-Convex Neural Networks

TLDR: We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios. This focused setting allows us to introduce feature-convex classifiers, which produce closed-form and deterministic certified radii on the order of milliseconds. Figure 1. Illustration of feature-convex classifiers and their certification for sensitive-class inputs. This architecture composes a Lipschitz-continuous feature map $\varphi$ with a learned convex function $g$. Since $g$ is convex, it is globally underapproximated by its tangent plane at $\varphi(x)$, y…
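The closed-form certificate follows from the tangent-plane bound: if $g$ is convex and $\varphi$ is Lipschitz, then $g(\varphi(x'))$ stays positive for all $x'$ within an explicit radius of $x$. A minimal numeric sketch of that radius, assuming an $\ell_2$ threat model and a two-dimensional feature space for simplicity:

```python
from math import hypot

def certified_radius(g_value, g_grad, lipschitz=1.0):
    """Sketch of the feature-convex certificate: if g(phi(x)) > 0, the
    tangent-plane underapproximation of the convex g keeps the sensitive-class
    prediction fixed within radius g(phi(x)) / (L * ||grad g||), where L is
    the Lipschitz constant of the feature map phi. Illustrative, 2-D only."""
    if g_value <= 0:
        return 0.0  # only the sensitive class receives a certificate
    grad_norm = hypot(*g_grad)
    return g_value / (lipschitz * grad_norm)
```

Because this is a single closed-form expression, certification costs one gradient evaluation, which is why the post reports millisecond-scale certified radii.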

5 months, 1 week ago @ localhost:4000
Asymmetric Certified Robustness via Feature-Convex Neural Networks

TLDR: We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios.

We argue that these issues can be addressed by refining the certified robustness problem to be more aligned with practical adversarial settings.

The Asymmetric Certified Robustness Problem: Current certifiably robust classifiers produce certificates for inputs belonging to any class.

Feature-convex classifiers: We propose feature-convex neural networks to address the asymmetric robustness problem.

Conclu…

5 months, 1 week ago @ bair.berkeley.edu
Goal Representations for Instruction Following

A longstanding goal of the field of robot learning has been to create generalist agents that can perform tasks for humans.

The GRIF model consists of a language encoder, a goal encoder, and a policy network.

Our approach, Goal Representations for Instruction Following (GRIF), jointly trains a language-conditioned and a goal-conditioned policy with aligned task representations.

In particular, we exploit this structure by requiring that language and goal representations be similar for the same semantic task.

We train dual image and text encoders by doing contrastiv…

6 months, 1 week ago @ bair.berkeley.edu
Goal Representations for Instruction Following

A longstanding goal of the field of robot learning has been to create generalist agents that can perform tasks for humans. Natural language has the potential to be an easy-to-use interface for humans to specify arbitrary tasks, but it is difficult to train robots to follow language instructions. Approaches like language-conditioned behavioral cloning (LCBC) train policies to directly imitate expert actions conditioned on language, but require humans to annotate all training trajectories and generalize poorly across scenes and behaviors. Meanwhile, recent goal-conditioned approaches perform much better at general manipulation tasks, but do not e…

6 months, 1 week ago @ localhost:4000
Rethinking the Role of PPO in RLHF

TL;DR: In RLHF, there’s tension between the reward learning phase, which uses human preference in the form of comparisons, and the RL fine-tuning phase, which optimizes a single, non-comparative reward.

By incorporating a new component, the pairwise policy gradient, we can unify the reward modeling stage and the RL stage, enabling direct updates based on pairwise responses.

Proximal Policy Optimization (PPO), the dominant RL optimizer in this process, has been reported to exhibit instability and implementation complications.

Derivation of P3O: Our idea stems from the vanilla policy gradient (VPG).
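The pairwise idea can be sketched as a loss whose update is driven by the reward difference between two responses to the same prompt, so only relative feedback matters. This is an illustrative toy in the spirit of the pairwise policy gradient, not the paper's exact P3O estimator; the function and its signature are our own.

```python
def pairwise_pg_loss(logp_a, logp_b, reward_a, reward_b):
    """Toy pairwise policy-gradient objective: the gradient of this loss
    with respect to the log-probabilities scales with the reward *difference*
    between responses a and b, so any constant shift in rewards cancels.
    Sketch only, not the P3O estimator from the paper."""
    advantage = reward_a - reward_b
    # Minimizing this pushes probability mass toward the preferred response.
    return -advantage * (logp_a - logp_b)
```

The shift invariance is the point: a reward model learned from comparisons is only identified up to an additive constant, and a pairwise update is unaffected by that constant, unlike standard PPO on an absolute reward.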

Under this framework, we develop …

6 months, 1 week ago @ bair.berkeley.edu
Rethinking the Role of PPO in RLHF

TL;DR: In RLHF, there’s tension between the reward learning phase, which uses human preference in the form of comparisons, and the RL fine-tuning phase, which optimizes a single, non-comparative reward. What if we performed RL in a comparative way? Figure 1: This diagram illustrates the difference between reinforcement learning from absolute feedback and relative feedback. By incorporating a new component, the pairwise policy gradient, we can unify the reward modeling stage and RL stage, enabling direct updates based on pairwise responses. Large Language Models (LLMs) have powered increasingly capable virtual assistants, such as GPT-4, Claude-2, Bard and Bing…

6 months, 1 week ago @ localhost:4000
AWS Machine Learning
last post 19 hours ago
Enhance conversational AI with advanced routing techniques with Amazon Bedrock

Agents for Amazon Bedrock approach: Agents for Amazon Bedrock allows you to build generative AI applications that can run multi-step tasks across a company’s systems and data sources.

For this post, the Lambda function sends notifications using Amazon Simple Email Service (Amazon SES).

We use Knowledge Bases for Amazon Bedrock to fetch from historical data stored as embeddings in the Amazon OpenSearch Service vector database.

Agents for Amazon Bedrock – completes user queries through a series of reasoning steps and corresponding actions based on ReAct prompting. Knowledge Bases for Amazon Bedrock – provides fully managed RAG to …
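Fetching from a knowledge base goes through the Retrieve API of the `bedrock-agent-runtime` client. A minimal sketch of the request shape is below; the knowledge-base ID and query are placeholders, and field names should be checked against the current Bedrock API reference before relying on them.

```python
def build_retrieve_request(kb_id: str, query: str, top_k: int = 4) -> dict:
    """Request body for the Knowledge Bases for Amazon Bedrock Retrieve API.
    kb_id is a placeholder; top_k limits vector-search results returned."""
    return {
        "knowledgeBaseId": kb_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    }

# Actual call (requires AWS credentials and a provisioned knowledge base):
# import boto3
# client = boto3.client("bedrock-agent-runtime")
# response = client.retrieve(**build_retrieve_request("KB123EXAMPLE", "order status"))
```

Separating request construction from the network call keeps the payload testable without AWS access.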

19 hours ago @ aws.amazon.com
Improve LLM performance with human and AI feedback on Amazon SageMaker for Amazon Engineering

The Amazon Engineering SMEs validated this AI feedback and acknowledged the benefits of using AI scores.

AI feedback provides AI scores automatically and enables the SMEs to use filtering, sorting, and grouping to validate the scores and identify trends in the responses.

In total, the average AI feedback score increased by up to 8%, from 3.9 to 4.2.

The model after RLAIF provided better performance for Amazon Engineering’s question answering bot, improving the AI feedback score by 8%.

For your own project, you can introduce the human-in-the-loop approach to collect your users’ feedback, or generate AI feedback using another LLM.

19 hours ago @ aws.amazon.com
Improve accuracy of Amazon Rekognition Face Search with user vectors

Amazon Rekognition face matching enables measuring the similarity of a face vector extracted from one image to a face vector extracted from another image.

In June 2023, AWS launched user vectors, a new capability that significantly improves face search accuracy by using multiple face images of a user.

Now, you can create user vectors, which aggregate multiple face vectors of the same user.

We guide you through creating a collection, storing face vectors in that collection, aggregating those face vectors into user vectors, and then comparing the results of searching against those individual face vectors and user vectors.
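The association step described above can be sketched as follows. All IDs are placeholders; the commented calls follow the Rekognition user-vector APIs (`create_user`, `associate_faces`, `search_users_by_image`) introduced with this capability, and should be verified against the current Rekognition API reference.

```python
def user_association_requests(collection_id, user_to_face_ids):
    """Build one AssociateFaces request per user, aggregating that user's
    indexed face vectors into a single user vector. IDs are placeholders."""
    return [
        {"CollectionId": collection_id, "UserId": uid, "FaceIds": fids}
        for uid, fids in user_to_face_ids.items()
    ]

# With boto3 (requires AWS credentials and an existing collection of indexed faces):
# import boto3
# rek = boto3.client("rekognition")
# rek.create_user(CollectionId="my-collection", UserId="user-1")
# for req in user_association_requests("my-collection", {"user-1": ["face-a", "face-b"]}):
#     rek.associate_faces(**req)
# rek.search_users_by_image(CollectionId="my-collection",
#                           Image={"S3Object": {"Bucket": "my-bucket", "Name": "probe.jpg"}})
```

Searching against the aggregated user vector, rather than any single face vector, is what yields the accuracy improvement the post measures.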

We demonstrated how to improve face sea…

19 hours ago @ aws.amazon.com
Accelerate ML workflows with Amazon SageMaker Studio Local Mode and Docker support

In this post, we guide you through setting up Local Mode in SageMaker Studio, running a sample training job, and deploying the model on an Amazon SageMaker endpoint from a SageMaker Studio notebook.

SageMaker Studio introduces Local Mode, enabling you to run SageMaker training, inference, batch transform, and processing jobs directly on your JupyterLab, Code Editor, or SageMaker Studio Classic notebook instances without requiring remote compute resources.
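In the SageMaker Python SDK, Local Mode is selected simply by passing the instance type `"local"` (or `"local_gpu"`) instead of a remote `ml.*` type. A small sketch, with the estimator call shown in comments because it needs the SDK, Docker, and AWS credentials; the image and role names are hypothetical.

```python
def training_instance_type(local_mode: bool, gpu: bool = False) -> str:
    """Pick the SageMaker instance type string: Local Mode runs the job on
    the notebook instance itself via 'local' / 'local_gpu'. The remote type
    returned here is just an example, not a recommendation."""
    if local_mode:
        return "local_gpu" if gpu else "local"
    return "ml.m5.xlarge"  # example remote instance type

# With the SageMaker Python SDK (inside Studio with Docker support enabled):
# from sagemaker.estimator import Estimator
# est = Estimator(image_uri=my_training_image,   # hypothetical image URI
#                 role=my_execution_role,        # hypothetical IAM role
#                 instance_count=1,
#                 instance_type=training_instance_type(local_mode=True))
# est.fit({"train": "file://./data"})            # runs locally, no remote compute
```

Switching the same script between local iteration and remote training then reduces to flipping one argument.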

Docker support in SageMaker Studio: SageMaker Studio now also enables building and running Docker containers locally on your SageMaker Studio notebook instance.

For more details on how to get started with SageMaker…

1 day, 16 hours ago @ aws.amazon.com
Significant new capabilities make it easier to use Amazon Bedrock to build and scale generative AI applications – and achieve impressive results

We introduced Amazon Bedrock to the world a little over a year ago, delivering an entirely new way to build generative artificial intelligence (AI) applications.

They are realizing real business value by quickly moving generative AI applications into production to improve customer experiences and increase operational efficiency.

LexisNexis Legal & Professional, a leading global provider of information and analytics, developed a personalized legal generative AI assistant on Lexis+ AI.

With Amazon Bedrock, we’re focused on the key areas that customers need to build production-ready, enterprise-grade generative AI applications at the right cost and speed.

That’s why Amazon Bedrock has capabili…

2 days ago @ aws.amazon.com
Building scalable, secure, and reliable RAG applications using Knowledge Bases for Amazon Bedrock

AWS Well-Architected design principles: RAG-based applications built using Knowledge Bases for Amazon Bedrock can greatly benefit from following the AWS Well-Architected Framework.

By aligning with the Well-Architected Framework, organizations can effectively build and manage enterprise-grade RAG applications using Knowledge Bases for Amazon Bedrock.

Now, let’s dive deep into the new features launched within Knowledge Bases for Amazon Bedrock.

To support this, Knowledge Bases for Amazon Bedrock now supports private network policies for Amazon OpenSearch Serverless.

Knowledge Bases for Amazon Bedrock provides an option for using OpenSearch Serverless as a vector store.

2 days назад @ aws.amazon.com
Integrate HyperPod clusters with Active Directory for seamless multi-user login
Integrate HyperPod clusters with Active Directory for seamless multi-user login Integrate HyperPod clusters with Active Directory for seamless multi-user login

For more details on how to create HyperPod clusters, refer to Getting started with SageMaker HyperPod and the HyperPod workshop.

Choose (right-click) hyperpod, choose New, and choose Organizational Unit.

$ python3 -c "import getpass,pysss; print(pysss.password().encrypt(getpass.getpass('AD reader user password: ').strip(), pysss.password().AES_256))" AD reader user password: (Enter ReadOnly user password) AAAQACK2....

Create a HyperPod cluster with an SSSD-enabled lifecycle script: Next, you create a HyperPod cluster with LDAPS/Active Directory integration.

$ ssh MyCluster-LoginNode (the cluster’s ASCII-art login banner follows) …

2 days, 18 hours ago @ aws.amazon.com
The executive’s guide to generative AI for sustainability

This post serves as a starting point for any executive seeking to navigate the intersection of generative artificial intelligence (generative AI) and sustainability.

A roadmap to generative AI for sustainability: In the sections that follow, we provide a roadmap for integrating generative AI into sustainability initiatives.

Figure 5: Generative AI modalities. In addition, Figure 6 outlines the main generative AI optimization strategies, including prompt engineering, Retrieval Augmented Generation, and fine-tuning or continued pre-training.

Minimize resource usage throughout the generative AI lifecycle: In some cases, generative AI itself can have a high energy cost.

Conclusion: In this post, we pr…

2 days, 18 hours ago @ aws.amazon.com
Introducing automatic training for solutions in Amazon Personalize

Amazon Personalize is excited to announce automatic training for solutions.

Prerequisites: To enable automatic training for your solutions, you first need to set up Amazon Personalize resources.

For more information about tagging Amazon Personalize resources, see Tagging Amazon Personalize resources.

To use automatic training, in the Automatic training section, select Turn on and specify your training frequency.
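The console steps above have an API equivalent in CreateSolution. The sketch below shows the request shape; the field names (`performAutoTraining`, `autoTrainingConfig`, `schedulingExpression`) reflect our understanding of the feature at launch and should be confirmed against the current Amazon Personalize API reference, and the solution name and ARNs are placeholders.

```python
def auto_training_solution_request(dataset_group_arn: str,
                                   recipe_arn: str,
                                   every_n_days: int = 3) -> dict:
    """Sketch of a Personalize CreateSolution request with automatic
    training turned on, retraining on a fixed schedule."""
    return {
        "name": "my-solution",  # placeholder name
        "datasetGroupArn": dataset_group_arn,
        "recipeArn": recipe_arn,
        "performAutoTraining": True,
        "solutionConfig": {
            "autoTrainingConfig": {
                "schedulingExpression": f"rate({every_n_days} days)"
            }
        },
    }

# With boto3 (requires AWS credentials and existing Personalize resources):
# import boto3
# personalize = boto3.client("personalize")
# personalize.create_solution(**auto_training_solution_request(dg_arn, recipe_arn))
```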

For more information about optimizing your user experience with Amazon Personalize, see the Amazon Personalize Developer Guide.

5 days, 11 hours ago @ aws.amazon.com
Use Kubernetes Operators for new inference capabilities in Amazon SageMaker that reduce LLM deployment costs by 50% on average

We are excited to announce a new version of the Amazon SageMaker Operators for Kubernetes using the AWS Controllers for Kubernetes (ACK).

For more details, see Amazon SageMaker adds new inference capabilities to help reduce foundation model deployment costs and latency.

In this post, we show how to use SageMaker ACK Operators to deploy SageMaker inference components.

kubectl apply passes this file, called a manifest, to the Kubernetes API server running in the Kubernetes controller node.

Conclusion: In this post, we showed how to use SageMaker ACK Operators to deploy SageMaker inference components.

5 days, 18 hours ago @ aws.amazon.com
Talk to your slide deck using multimodal foundation models hosted on Amazon Bedrock and Amazon SageMaker – Part 2

We used AWS services including Amazon Bedrock, Amazon SageMaker, and Amazon OpenSearch Serverless in this solution.

These descriptions are then converted into text embeddings using the Amazon Titan Text Embeddings model and stored in a vector database.

OpenSearch Serverless is an on-demand serverless configuration for Amazon OpenSearch Service.

Amazon OpenSearch Ingestion (OSI) is a fully managed, serverless data collector that delivers data to OpenSearch Service domains and OpenSearch Serverless collections.

The user input is converted into embeddings using the Amazon Titan Text Embeddings model accessed using Amazon Bedrock.
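The embedding step maps to the Bedrock InvokeModel API with the Titan Text Embeddings model. A minimal sketch of the request; the model ID corresponds to the v1 Titan embeddings model and should be verified against the current Bedrock model catalog, and the query text is a placeholder.

```python
import json

def titan_embedding_request(text: str) -> dict:
    """Request for the Amazon Titan Text Embeddings model via Bedrock
    InvokeModel. Returns the keyword arguments for the API call."""
    return {
        "modelId": "amazon.titan-embed-text-v1",
        "contentType": "application/json",
        "accept": "application/json",
        "body": json.dumps({"inputText": text}),
    }

# Actual call (requires AWS credentials and Bedrock model access):
# import boto3
# bedrock = boto3.client("bedrock-runtime")
# resp = bedrock.invoke_model(**titan_embedding_request("What is slide 7 about?"))
# embedding = json.loads(resp["body"].read())["embedding"]
```

The resulting vector is what gets matched against the slide-description embeddings stored in OpenSearch Serverless.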

5 days, 20 hours ago @ aws.amazon.com
Scale AI training and inference for drug discovery through Amazon EKS and Karpenter

In this post, we focus on how we used Karpenter on Amazon Elastic Kubernetes Service (Amazon EKS) to scale AI training and inference, which are core elements of the Iambic discovery platform.

The Kubernetes Event Driven Autoscaler (KEDA) is configured to automatically scale the number of service pods, based on the custom metrics available in Prometheus.

We deploy Karpenter by following the instructions in the Karpenter documentation, or by running the following script, which automates the deployment instructions.

We then deploy the load generator, which causes KEDA to add more service pods and Karpenter adds more GPU instances.

When we scale the load-generating deployment to 40 replicas, KE…

5 days, 20 hours назад @ aws.amazon.com
Generate customized, compliant application IaC scripts for AWS Landing Zone using Amazon Bedrock
Generate customized, compliant application IaC scripts for AWS Landing Zone using Amazon Bedrock Generate customized, compliant application IaC scripts for AWS Landing Zone using Amazon Bedrock

With Amazon Bedrock, teams can input high-level architectural descriptions and use generative AI to generate a baseline configuration of Terraform scripts.

In this post, we show you how to generate customized, compliant IaC scripts for AWS Landing Zone using Amazon Bedrock.

The following is an example of a customized Terraform-based AWS Landing Zone solution, in which each application resides in its own AWS account.

This function, which is central to the operation, translates these inputs into compliant code, using Amazon Bedrock and Knowledge Bases for Amazon Bedrock.

Allow Amazon Bedrock to create and manage the vector store for you in Amazon OpenSearch Service.

6 days, 17 hours ago @ aws.amazon.com
Live Meeting Assistant with Amazon Transcribe, Amazon Bedrock, and Knowledge Bases for Amazon Bedrock

In this post, we show you how to use LMA with Amazon Transcribe, Amazon Bedrock, and Knowledge Bases for Amazon Bedrock.

It uses Amazon Transcribe for speech to text, Knowledge Bases for Amazon Bedrock for contextual queries against your company’s documents and knowledge sources, and Amazon Bedrock models for customizable transcription insights and summaries.

Browser extension captures audio and meeting metadata from popular meeting apps – The browser extension captures meeting metadata—the meeting title and names of active speakers—and audio from you (your microphone) and others (from the meeting browser tab).

To customize the meeting assist capabilities based on the QnABot on AWS solution…

6 days, 18 hours ago @ aws.amazon.com
Meta Llama 3 models are now available in Amazon SageMaker JumpStart

Today, we are excited to announce that Meta Llama 3 foundation models are available through Amazon SageMaker JumpStart to deploy and run inference.

Discover models
You can access the foundation models through SageMaker JumpStart in the SageMaker Studio UI and the SageMaker Python SDK.

For more details on how to get started and set up SageMaker Studio, refer to Amazon SageMaker Studio.

In SageMaker Studio, you can access SageMaker JumpStart, which contains pre-trained models, notebooks, and prebuilt solutions, under Prebuilt and automated solutions.

The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.

6 days, 19 hours ago @ aws.amazon.com
NVIDIA
latest post 20 hours ago
How Virtual Factories Are Making Industrial Digitalization a Reality

Manufacturers are using virtual factories to unlock new possibilities from planning to operations.

A virtual factory is a physically accurate representation of a real factory.

During facility design, construction and commissioning, virtual factories allow project stakeholders to visualize designs in the context of the entire facility and production process.

Intelligent and Optimized Operations: Operations teams can integrate their virtual factories with valuable production data from Internet of Things technology at the edge, and tap AI to drive further optimizations.

Virtual Factories: A Testing Ground for AI and Robotics
Robotics developers are increasingly using virtual factories to trai…

20 hours ago @ blogs.nvidia.com
NVIDIA to Acquire GPU Orchestration Software Provider Run:ai

Run:ai customers include some of the world’s largest enterprises across multiple industries, which use the Run:ai platform to manage data-center-scale GPU clusters.

NVIDIA DGX and DGX Cloud customers will gain access to Run:ai’s capabilities for their AI workloads, particularly for large language model deployments.

Run:ai’s solutions are already integrated with NVIDIA DGX, NVIDIA DGX SuperPOD, NVIDIA Base Command, NGC containers, and NVIDIA AI Enterprise software, among other products.

Together with Run:ai, NVIDIA will enable customers to have a single fabric that accesses GPU solutions anywhere.

Customers can expect to benefit from better GPU utilization, improved management of GPU infrast…

22 hours ago @ blogs.nvidia.com
Forecasting the Future: AI2’s Christopher Bretherton Discusses Using Machine Learning for Climate Modeling

Through ongoing research and collaboration, Bretherton and his team aim to improve climate modeling and enable society to better mitigate and adapt to the impacts of climate change.

Stay tuned for more episodes recorded live from GTC, and watch the replay of Bretherton’s GTC session on using machine learning for climate modeling.

Time Stamps
2:03: What is climate modeling and how can it prepare us for climate change?

5:28: How can machine learning help enhance climate modeling?

25:59: How do you measure the accuracy or performance of an emulator that’s doing something like climate modeling out into the future?

22 hours ago @ blogs.nvidia.com
Rays Up: Decoding AI-Powered DLSS 3.5 Ray Reconstruction

DLSS 3.5 with Ray Reconstruction creates higher quality ray-traced images for intensive ray-traced games and apps.

DLSS 3.5 Ray Reconstruction introduces an NVIDIA supercomputer-trained, AI-powered neural network that generates higher-quality pixels in between the sampled rays.

Deep Learning, Deep Gaming
Ray Reconstruction is just one of the AI graphics breakthroughs that multiply performance in DLSS.

Combining the DLSS-generated frames with DLSS Super Resolution enables DLSS 3 to reconstruct seven-eighths of the displayed pixels with AI, boosting frame rates by up to 4x compared to without DLSS.
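The seven-eighths figure follows from compounding the two techniques: DLSS Super Resolution in Performance mode renders 1/4 of each frame's pixels, and Frame Generation renders only every other frame, so just 1/4 × 1/2 = 1/8 of displayed pixels are rendered natively. A quick check of the arithmetic:

```python
from fractions import Fraction

super_resolution = Fraction(1, 4)  # 1/4 of pixels rendered per upscaled frame
frame_generation = Fraction(1, 2)  # every other frame is fully AI-generated
rendered = super_resolution * frame_generation
print(rendered, 1 - rendered)      # 1/8 rendered natively, 7/8 reconstructed by AI
```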

Because DLSS Frame Generation is post-processed (applied after the main render) on the GPU, it c…

22 hours ago @ blogs.nvidia.com
Democratizing AI Workflows with Union.ai and NVIDIA DGX Cloud

NVIDIA DGX Cloud
To help address the need for AI training, NVIDIA DGX Cloud offers state-of-the-art accelerated computing resources for users to access without the hassle of sourcing, setting up, and configuring the infrastructure themselves.

In this post, I’m excited to introduce Union’s NVIDIA DGX Agent, which enables you to integrate your Flyte workflows with NVIDIA DGX Cloud.

In the context of AI model development, workflows have become the de facto abstraction for managing the complexity of data and model management.

Fine-tuning a pipeline using Flyte
Introducing Union’s NVIDIA DGX Agent
Let’s say you’re working on fine-tuning the Mixtral 8x7b model.

To leverage DGX Cloud to get around th…

1 day, 10 hours ago @ developer.nvidia.com
Small and Mighty: NVIDIA Accelerates Microsoft’s Open Phi-3 Mini Language Models

NVIDIA announced today its acceleration of Microsoft’s new Phi-3 Mini open language model with NVIDIA TensorRT-LLM, an open-source library for optimizing large language model inference when running on NVIDIA GPUs from PC to cloud.

Phi-3 Mini packs the capability of 10x larger models and is licensed for both research and broad commercial usage, advancing Phi-2 from its research-only roots.

Phi-3 Mini comes in two variants, one supporting a 4K-token context and the other a 128K-token context, making it the first model in its class to support such a long context window.

Developers can try Phi-3 Mini with the 128K context window at ai.nvidia.com, where it is packaged as an NVIDIA NIM, a microservice with a standard a…

1 day, 18 hours ago @ blogs.nvidia.com
Advancing Cell Segmentation and Morphology Analysis with NVIDIA AI Foundation Model VISTA-2D

Spatial omics providers like NanoString are already leveraging NVIDIA GPUs and accelerating the compute onboard their CosMx SMI device to address these challenges.

VISTA-2D NVIDIA AI Foundation model for cell segmentation
One of the most important steps in analyzing the images of these spatial omics approaches is cell segmentation.

VISTA-2D has a network architecture with ~100 million trainable parameters, making it adaptable and scalable for fast cell segmentation.

Performance on benchmark datasets
Extensive evaluation was performed for the VISTA-2D model with multiple public datasets, such as TissueNet, LIVECell, Omnipose, DeepBacs, Cellpose, and many more.

VISTA-2D is an NVIDIA AI Foun…

2 days, 17 hours ago @ developer.nvidia.com
Climate Tech Startups Integrate NVIDIA AI for Sustainability Applications

Earth-2 features a suite of AI models that help simulate, visualize and deliver actionable insights about weather and climate.

Delivering Tomorrow’s Forecast
Tomorrow.io, based in Boston, is a leading resilience platform that helps organizations adapt to increasing weather and climate volatility.

It’s also conducting experiments using Earth-2 AI forecast models to determine the optimal configurations of satellites for improving weather forecasts.

To learn more about the Sustainable Futures program, watch the “AI Nations and Sustainable Futures Day” session from NVIDIA GTC.

Learn more in the GTC session, “Global Strategies: Startups, Venture Capital, and Climate Change Solutions.”Vi…

2 days, 20 hours ago @ blogs.nvidia.com
Wide Open: NVIDIA Accelerates Inference on Meta Llama 3

The latest open large language model from Meta — built with NVIDIA technology — is optimized to run on NVIDIA GPUs from the cloud and data center to the edge and the PC.

NVIDIA today announced optimizations across all its platforms to accelerate Meta Llama 3, the latest generation of the large language model (LLM).

With support from NVIDIA, Meta tuned its network, software and model architectures for its flagship LLM.

Putting Llama 3 to Work
Versions of Llama 3, accelerated on NVIDIA GPUs, are available today for use in the cloud, data center, edge and PC.

Custom models can be optimized for inference with NVIDIA TensorRT-LLM and deployed with NVIDIA Triton Inference Server.

6 days, 19 hours ago @ blogs.nvidia.com
Up to No Good: ‘No Rest for the Wicked’ Early Access Launches on GeForce NOW

Members can now stream No Rest for the Wicked from the cloud.

Holy Moly
No Rest for the Wicked is the highly anticipated action role-playing game from Moon Studios, developer of the Ori series, and publisher Private Division.

Embark on a dark and perilous journey, where no rest awaits the wicked.

Rise to the challenge and stream from GeForce RTX 4080 servers with a GeForce NOW Ultimate membership for the smoothest gameplay from the cloud.

Shiny New Games
Become a Wild West superhero in Evil West, streaming on GeForce NOW this week and part of PC Game Pass.

6 days, 22 hours ago @ blogs.nvidia.com
NVIDIA Honors Partners of the Year in Europe, Middle East, Africa

The recipients were honored at the annual EMEA Partner Day hosted by the NVIDIA Partner Network (NPN).

“This year marks another milestone for NVIDIA and our partners across EMEA as we pioneer technological breakthroughs and unlock new business opportunities using NVIDIA’s full-stack platform,” said Dirk Barfuss, director of EMEA channel at NVIDIA.

received the Rising Star Northern Europe award for its exceptional revenue growth and broad customer base deploying NVIDIA AI solutions in data centers.

Through extensive collaboration with NVIDIA, the company has become a cornerstone of the NVIDIA partner landscape in Germany.


1 week ago @ blogs.nvidia.com
Advancing Medical Image Decoding with GPU-Accelerated nvImageCodec

We’ll guide you through the intricacies of image decoding, introduce you to AWS HealthImaging, and explore the advancements enabled by GPU-accelerated decoding solutions.

GPU-accelerated image decoding
To further enhance the image decoding performance, AWS HealthImaging supports GPU acceleration, specifically leveraging the NVIDIA nvJPEG2000 library.

AWS HealthImaging walkthrough
In this walkthrough, we showcase the use of AWS HealthImaging.

We demonstrate the process of employing a SageMaker multi-model endpoint for image decoding, utilizing a GPU-accelerated interface.

In computer vision applications, image decoding speed plays a vital role in real-time processing tasks such as object dete…

1 week ago @ developer.nvidia.com
Seeing Beyond: Living Optics CEO Robin Wang on Democratizing Hyperspectral Imaging

Step into the realm of the unseen with Robin Wang, CEO of Living Optics.

Living Optics’ hyperspectral imaging camera, which can capture visual data across 96 colors, reveals details invisible to the human eye.

1:45: The Living Optics camera’s ability to capture 96 colors
3:36: Where is hyperspectral imaging being used, and why is it so important?

9:34: Other use cases of hyperspectral imaging
13:07: What’s unique about Living Optics’ hyperspectral imaging camera?

18:36: Breakthroughs, challenges during the technology’s development
23:27: What’s next for Living Optics and hyperspectral imaging?

1 week ago @ blogs.nvidia.com
Moving Pictures: Transform Images Into 3D Scenes With NVIDIA Instant NeRF

Using Instant NeRF, creatives are transforming collections of still images into digital 3D scenes in just seconds.

The researchers won a best paper award for the work, and TIME Magazine named Instant NeRF a best invention of 2022.

Ready, Set, Go
Get started with Instant NeRF to learn about radiance fields and experience imagery in a new way.

The Instant NeRF gallery showcases some of the most innovative and thought-provoking examples, viewable as video clips in any web browser.

Thanks to a recent Instant NeRF update, users can render their scenes from static images and virtually step inside the environments, moving freely within the 3D space.

1 week ago @ blogs.nvidia.com
New NVIDIA RTX A400 and A1000 GPUs Enhance AI-Powered Design and Productivity Workflows

To meet this growing need, NVIDIA is expanding its RTX professional graphics offerings with two new NVIDIA Ampere architecture-based GPUs for desktops: the NVIDIA RTX A400 and NVIDIA RTX A1000.

A New Era of Creativity, Performance and Efficiency
The RTX A400 GPU introduces accelerated ray tracing and AI to the RTX 400 series GPUs.

With a sleek, single-slot design and consuming just 50W, the A400 and A1000 GPUs bring impressive features to compact, energy-efficient workstations.

Availability
The NVIDIA RTX A1000 GPU is now available through global distribution partners such as PNY and Ryoyo Electric.

The RTX A400 GPU is expected to be available from channel partners starting in May, with antic…

1 week, 1 day ago @ blogs.nvidia.com
Facebook
latest post 1 week, 6 days ago
Building new custom silicon for Meta’s AI workloads


1 week, 6 days ago @ engineering.fb.com
Introducing the next-gen Meta Training and Inference Accelerator

The next generation of Meta’s large-scale infrastructure is being built with AI in mind, including supporting new generative AI (GenAI) products and services, recommendation systems, and advanced AI research.

It’s an investment we expect will grow in the years ahead as the compute requirements to support AI models increase alongside the models’ sophistication.

Last year, we unveiled the Meta Training and Inference Accelerator (MTIA) v1, our first-generation AI inference accelerator that we designed in-house with Meta’s AI workloads in mind – specifically our deep learning recommendation models that are improving a variety of experiences across our products.

MTIA is a long-term venture to pr…

2 weeks ago @ ai.meta.com
Optimizing RTC bandwidth estimation with machine learning

Bandwidth estimation (BWE) and congestion control play an important role in delivering high-quality real-time communication (RTC) across Meta’s family of apps.

Network characterization
An ML model-based approach leverages time series data to improve the bandwidth estimation by using offline parameter tuning for characterized network types.

The first component is offline ML model learning using ML to categorize the network type (random packet loss versus bursty loss).

The non-time series data or dense data will pass through a dense layer (i.e., a fully connected layer).
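As a toy illustration of such a fully connected layer (the weights and feature values below are made up for the example; Meta's production model is not public beyond this description):

```python
def dense_layer(x, weights, biases):
    """Fully connected layer with ReLU: y_j = max(0, sum_i w[j][i] * x[i] + b[j])."""
    return [max(0.0, sum(w_i * x_i for w_i, x_i in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

# 3 input features (e.g., summary statistics of the connection) -> 2 hidden units
features = [0.5, -1.0, 2.0]
weights = [[0.2, 0.4, 0.1], [-0.3, 0.5, 0.2]]
biases = [0.5, -0.2]
print(dense_layer(features, weights, biases))  # second unit is clipped to 0 by ReLU
```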

Use case: Random packet loss classification
Let’s consider the use case of categorizing packet loss as either random or conge…

1 month ago @ engineering.fb.com
Logarithm: A logging engine for AI training workflows and services

In this post, we present the design behind Logarithm, and show how it powers AI training debugging use cases.

AI training debugging with Logarithm
Before looking at Logarithm’s internals, we present support for training systems and model issue debugging, one of the prominent use cases of Logarithm at Meta.

ML model training workflows tend to have a wide range of failure modes, spanning data inputs, model code and hyperparameters, and systems components (e.g., PyTorch, data readers, checkpointing, framework code, and hardware).

Logarithm ingests both systems logs from the training stack, and model telemetry from training jobs that the stack executes.

Filter-by-callsite enables hiding known lo…

1 month, 1 week ago @ engineering.fb.com
Building Meta’s GenAI Infrastructure

While we’ve had a long history of building AI infrastructure, we first shared details on our AI Research SuperCluster (RSC), featuring 16,000 NVIDIA A100 GPUs, in 2022.

Under the hood
Our newer AI clusters build upon the successes and lessons learned from RSC.

Our out-of-box performance for large clusters was initially poor and inconsistent, compared to optimized small cluster performance.

Commitment to open AI innovation
Meta maintains its commitment to open innovation in AI software and hardware.

The future of Meta’s AI infrastructure
These two AI training cluster designs are a part of our larger roadmap for the future of AI.

1 month, 1 week ago @ engineering.fb.com
Improving machine learning iteration speed with faster application build and packaging

These improvements helped us find and remove many unnecessary dependencies, making build graph analysis and overall build times much better.

In response to this challenge, we implemented a new solution for the packaging and distribution of Python executables – the Content Addressable Filesystem (CAF).

LazyCAF and enforcing uniform revisions: Areas for further ML iteration improvements
The improvements we implemented have proven highly effective, drastically reducing the overhead and significantly elevating the efficiency of our ML engineers.

Faster build times and more efficient packaging and distribution of executables have reduced overhead by double-digit percentages.

We plan to enable all…
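The core idea behind a content-addressable filesystem is that every stored blob is keyed by a hash of its bytes, so identical content is stored and distributed only once no matter how many executables reference it. Meta's CAF implementation is internal; the following is only a minimal sketch of the principle using SHA-256:

```python
import hashlib

class ContentStore:
    """Toy content-addressable store: blobs are keyed by the SHA-256 of their bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data  # identical content always maps to one entry
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

store = ContentStore()
a = store.put(b"import torch\n")
b = store.put(b"import torch\n")  # duplicate dependency: same address, no new copy
print(a == b, len(store._blobs))  # True 1
```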

2 months, 3 weeks ago @ engineering.fb.com
Lazy is the new fast: How Lazy Imports and Cinder accelerate machine learning at Meta

At Meta, the quest for faster model training has yielded an exciting milestone: the adoption of Lazy Imports and the Python Cinder runtime.

The challenges of adopting Lazy Imports
While Lazy Imports’ approach significantly improved ML development, it was not all a bed of roses.

With Lazy Imports, Meta’s ML developers are now equipped to work more efficiently, experiment more rapidly, and achieve results faster.

Here’s a glimpse into our future endeavors:
Streamlining developer onboarding
The learning curve associated with Lazy Imports can be a challenge for newcomers.

Building a robust community that helps support paradigms and patterns that play well with Lazy Imports is one of our future …
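Cinder's Lazy Imports are implemented inside the interpreter itself, but standard CPython offers a rough approximation of the same idea with importlib.util.LazyLoader, which defers executing a module's body until its first attribute access (this sketch follows the recipe in the importlib documentation):

```python
import importlib.util
import sys

def lazy_import(name):
    """Bind a module name now, but defer executing the module until first use."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up the lazy module; body not run yet
    return module

json = lazy_import("json")         # cheap: module body has not executed
print(json.dumps({"lazy": True}))  # first attribute access triggers the real import
```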

3 months, 1 week ago @ engineering.fb.com
How Meta is advancing GenAI

What’s going on with generative AI (GenAI) at Meta?

In this episode of the Meta Tech Podcast, Meta engineer Pascal Hartig (@passy) speaks with Devi Parikh, an AI research director at Meta.

They cover a wide range of topics, including the history and future of GenAI and the most interesting research papers that have come out recently.

And, of course, they discuss some of Meta’s latest GenAI innovations, including: Audiobox, a foundational model for generating sound and soundscapes using natural language prompts.

And if you’re interested in AI career opportunities at Meta visit the Meta Careers page.

3 months, 2 weeks ago @ engineering.fb.com
AI debugging at Meta with HawkEye

In this post, we will provide an overview of the end-to-end debugging workflows supported by HawkEye, components of the system, and the product surface for Meta product and monetization teams to debug AI model and feature issues.

HawkEye includes infrastructure for continuously collecting data on serving and training models, data generation, and analysis components for mining root causes.

However, significant differences indicate problems with either the training data or loss divergence (e.g., loss or gradient explosion) in the bad snapshot.

Such issues can happen for several hard-to-diagnose reasons, ranging from the complex data pipelines behind training data, to data corruptions.

HawkEye…

4 months, 1 week ago @ engineering.fb.com
Watch: Meta’s engineers on building network infrastructure for AI

The 2023 edition of Networking at Scale focused on how Meta’s engineers and researchers have been designing and operating the network infrastructure over the last several years for Meta’s AI workloads, including our numerous ranking and recommendation workloads and the immense Generative AI models.

But the sheer scale and complexity of GenAI models mean new challenges for Meta’s network infrastructure.

Meta’s Network Journey to Enable AI
Hany Morsy, Network Engineer
Susana Contrera, Network Engineer
Over the years, Meta’s AI infrastructure has transitioned from CPU-based to GPU-based training due to growing AI workloads.

Traffic Engineering for AI Training Networks
Shuqiang Zhang, Software Eng…

5 months, 1 week ago @ engineering.fb.com
How Meta is creating custom silicon for AI

Fueling the success of these products are world-class infrastructure teams, including Meta’s custom AI silicon team, led by Olivia Wu, a leader in the silicon industry for 30 years.

In 2018, I saw a social media post from Yann LeCun, our Chief AI Scientist, that Meta was looking for someone to help build AI silicon in-house.

I knew of just a few other companies designing their own custom AI silicon, but they were mainly focused only on silicon and not the software ecosystem and products.

What’s next for the AI silicon design team?

We’re continuing to gather feedback and input from our AI software teams to shape the features of our future AI silicon.

6 months, 1 week ago @ engineering.fb.com
Using Chakra execution traces for benchmarking and network performance optimization

Meta presents Chakra execution traces, an open graph-based representation of AI/ML workload execution, laying the foundation for benchmarking and network performance optimization.

The limitations of traditional AI benchmarking methodology
Traditionally, benchmarking AI systems has largely relied on running full ML workloads.

How Meta leverages Chakra execution traces
At Meta, we collect execution traces from our production servers every day.

Future plans
Enhancing the benchmarking capability of Chakra execution traces
While the execution trace replayer enables replay of execution traces, it brings forth challenges.

Using AI to generate representative execution traces
Chakra execution traces are…

7 months, 2 weeks ago @ engineering.fb.com
Arcadia: An end-to-end AI system performance simulator

We’re introducing Arcadia, Meta’s unified system that simulates the compute, memory, and network performance of AI training clusters.

Arcadia gives Meta’s researchers and engineers valuable insights into the performance of AI models and workloads in an AI cluster – enabling data-driven decision making in the design of AI clusters.

That’s where Arcadia, Meta’s end-to-end AI system performance simulator, comes in.

By providing insights into the impact of these factors on system performance, Arcadia facilitates data-driven decision-making processes and fosters the evolution of models and hardware.

This comprehensive set of metrics empowers stakeholders to analyze the impact of different factor…

7 months, 2 weeks ago @ engineering.fb.com
Code Llama: Meta’s state-of-the-art LLM for coding

Today, we are releasing Code Llama, a large language model (LLM) that can use text prompts to generate code.

Additionally, we have further fine-tuned two additional variations of Code Llama: Code Llama - Python and Code Llama - Instruct.

Code Llama - Python is a language-specialized variation of Code Llama, further fine-tuned on 100B tokens of Python code.

Code Llama - Instruct is an instruction fine-tuned and aligned variation of Code Llama.

We recommend using Code Llama - Instruct variants whenever using Code Llama for code generation since Code Llama - Instruct has been fine-tuned to generate helpful and safe answers in natural language.

8 months ago @ ai.meta.com
Meta Connect 2023: September 27 – 28


The post Meta Connect 2023: September 27 – 28 appeared first on Engineering at Meta.

8 months, 2 weeks ago @ meta.com
Uber Engineering
latest post: none
neptune.ai
latest post 6 days, 3 hours ago
Deep Learning Optimization Algorithms

In this article, we’ll survey the most commonly used deep learning optimization algorithms, including Gradient Descent, Stochastic Gradient Descent, and the Adam optimizer.

Understanding different optimization algorithms and their strengths and weaknesses is crucial for any data scientist training deep learning models.

Optimization in deep learning
Have a look at other articles on our blog exploring aspects of optimization in deep learning:
Deep Learning Model Optimization Methods: Deep learning models exhibit excellent performance but require high computational resources.

Mini-batch Gradient Descent
Mini-batch Gradient Descent strikes a balance between the thorough, calculated approach of …
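The update rules these algorithms use can be stated in a few lines. Below is a minimal sketch of vanilla (stochastic) gradient descent and Adam minimizing f(w) = w², using Adam's common default hyperparameters:

```python
import math

def sgd_step(w, grad, lr=0.1):
    """Vanilla (stochastic) gradient descent: w <- w - lr * grad."""
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias-corrected first/second moment estimates."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

# Minimize f(w) = w**2, whose gradient is 2*w
w_sgd, w_adam, m, v = 5.0, 5.0, 0.0, 0.0
for t in range(1, 101):
    w_sgd = sgd_step(w_sgd, 2 * w_sgd)
    w_adam, m, v = adam_step(w_adam, 2 * w_adam, m, v, t)
print(w_sgd, w_adam)  # both move toward the minimum at 0
```

Note the difference in progress after 100 steps: SGD with a well-chosen learning rate converges quickly on this easy problem, while Adam's small default step size moves more slowly but adapts per-parameter, which is what makes it robust on harder, noisier objectives.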

6 days, 3 hours ago @ neptune.ai
Track and Visualize Information From Your Pipelines: neptune.ai + ZenML Integration

On top of that, neptune.ai integrates with any MLOps stack, and it just works.

Now, with less boilerplate code, you can log and visualize information from your ZenML pipeline steps (e.g., models, parameters, metrics).

You’re looking for a more visually interactive way of navigating the results produced from your ZenML pipeline runs (e.g., models, metrics, datasets).

In this example, we log a simple ZenML pipeline to Neptune using the Experiment Tracker stack component.

The example assumes that you have ZenML installed together with the Neptune integration.

1 week, 1 day ago @ neptune.ai
Product Updates September ’23: Scatter Plots, Airflow Integration, and More

Scatter plots
If you have two metrics or parameters that you wish to compare or see how they relate to each other throughout the runs, you can now create a scatter plot.

See an example of a scatter plot in the Neptune app.

Distraction-free mode
You can now view run dashboards and compare views in distraction-free mode.

(neptune 1.5.0)

We added a new environment variable NEPTUNE_DATA_DIRECTORY, which lets you specify where the .neptune folder should be created instead of the current working directory.

(neptune 1.6.0)

Web application
We fixed an issue where a field named “type” could not be displayed as a runs table column.

1 week, 1 day ago @ neptune.ai
Train, Track, and Deploy Your Models: Neptune + Modelbit Integration

We are excited to announce that Neptune and Modelbit have partnered to release an integration to enable better ML model deployment and experiment tracking.

Data scientists and machine learning engineers can use the integration to train and deploy machine learning models in Modelbit while logging and visualizing training progress in Neptune.

We’ll log the model’s hyperparameters and accuracy to Neptune and then deploy the model to a REST endpoint.

First, import “modelbit” and “neptune” and authenticate your notebook with Modelbit:

import modelbit, neptune
mb = modelbit.login()

If your “NEPTUNE_API_TOKEN” isn’t already in your notebook’s environment, add it:

import os
os.environ["NEPTUNE_API_TOK…

1 week, 1 day ago @ neptune.ai
Product Updates March’24: MosaicML Composer integration, Neptune Query Language, and More
Product Updates March’24: MosaicML Composer integration, Neptune Query Language, and More

(neptune 1.9.0): When fetching runs with , you have 4 new parameters: , , , and .

(neptune 1.9.0) All messages that Neptune prints to the console are prefixed with [neptune].

(neptune 1.9.0) We introduced querying functionality to table fetching methods.

(neptune 1.10.0) Integrations: We introduced the log_model_summary parameter to NeptuneCallback() (neptune-tensorflow-keras 2.2.1). We added support for logging project requirements with the dependencies parameter.

1 week, 1 day ago @ neptune.ai
Product Updates December ’23: MLflow Plugin, New Docs Tutorials, and More
Product Updates December ’23: MLflow Plugin, New Docs Tutorials, and More

This page lists all functions, parameters, and constants exposed by the Neptune Python API.

(kedro-neptune 0.3.0) Web application: When modifying a search query in the runs table, you can now edit the operator and value without changing the other parts.

You can now click on a username in the Owner column (in the runs table) to instantly create a filter for runs created by that account.

(neptune 1.8.6) Web application: We fixed an issue where a field named “type” could not be displayed as a runs table column.

1 week, 1 day ago @ neptune.ai
How to Optimize GPU Usage During Model Training With neptune.ai
How to Optimize GPU Usage During Model Training With neptune.ai

Strategies for improving GPU usage include mixed-precision training, optimizing data transfer and processing, and appropriately dividing workloads between CPU and GPU.

The GPU memory usage metric reflects the amount of memory allocated to the GPU relative to its total memory capacity.

Optimizing data preprocessing: In addition to I/O and data transfer to the GPU, data preprocessing can become a bottleneck for GPU utilization.
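
One common remedy is to overlap preprocessing with consumption so the accelerator never sits idle waiting on the CPU. Here is a framework-free sketch of that pattern using only the standard library (the function names and toy batches are illustrative, not part of any Neptune API):

```python
import queue
import threading

def prefetch(batches, preprocess, buffer_size=4):
    """Run preprocessing on a background thread so the consumer
    (e.g., a GPU training step) rarely waits on CPU-side work."""
    q = queue.Queue(maxsize=buffer_size)
    sentinel = object()

    def worker():
        for batch in batches:
            q.put(preprocess(batch))  # CPU-side preprocessing happens here
        q.put(sentinel)               # signal end of the stream

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is sentinel:
            return
        yield item

# Toy usage: "preprocessing" just doubles each value in a batch.
batches = [[1, 2], [3, 4]]
out = list(prefetch(batches, lambda b: [x * 2 for x in b]))
```

Real pipelines get the same effect from framework utilities (multi-worker data loaders with prefetching), but the principle is identical: keep a small buffer of preprocessed batches ahead of the consumer.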

This indicates that by improving GPU utilization, training could be sped up, and GPU resources could be used more efficiently.

However, across the many projects I’ve worked on, the following guidelines for optimizing GPU usage have often proven helpful: Always monitor GPU…

3 weeks, 6 days ago @ neptune.ai
Zero-Shot and Few-Shot Learning with LLMs
Zero-Shot and Few-Shot Learning with LLMs

The role of zero-shot and few-shot learning in LLMs: The goal of zero-shot and few-shot learning is to get a machine-learning model to perform a new task it was not trained for.

“Few-shot learning” and “zero-shot learning” are well-known concepts in machine learning that were studied long before LLMs appeared on the scene.

In the context of LLMs, these terms are sometimes used interchangeably with “few-shot prompting” and “zero-shot prompting.” However, they are not the same.

Where zero-shot prompting fails: Let’s turn to two use cases where zero-shot prompting is insufficient.

Where few-shot prompting fails: Finally, let’s look at a situation where few-shot prompting won’t get us far.
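
To make the distinction concrete, here is a minimal sketch of how the two prompting styles differ purely at the prompt-construction level (the task and examples are made up for illustration; no model call is shown):

```python
def zero_shot_prompt(task, text):
    # Zero-shot: the model gets only the task description, no examples.
    return f"{task}\n\nInput: {text}\nOutput:"

def few_shot_prompt(task, examples, text):
    # Few-shot: the same task description, preceded by labeled
    # demonstrations that show the model the expected output format.
    demos = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{task}\n\n{demos}\n\nInput: {text}\nOutput:"

task = "Classify the sentiment of the input as positive or negative."
examples = [("Great product!", "positive"),
            ("Total waste of money.", "negative")]
prompt = few_shot_prompt(task, examples, "Works as advertised.")
```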

1 month ago @ neptune.ai
LLMOps: What It Is, Why It Matters, and How to Implement It
LLMOps: What It Is, Why It Matters, and How to Implement It

Large Language Models (LLMs) like Meta AI’s LLaMA models, MISTRAL AI’s open models, and OpenAI’s GPT series have improved language-based AI.

Monitoring: Monitor model performance for data drift and model degradation, often using automated monitoring tools.
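
As a toy illustration of automated drift monitoring (the statistic and threshold are arbitrary choices for the sketch, not a production recipe), one can flag inputs whose mean drifts too far from the training baseline:

```python
import statistics

def mean_shift_alert(baseline, current, threshold=0.5):
    """Toy drift check: flag when the mean of a feature in production
    data moves more than `threshold` baseline standard deviations
    away from the training-time mean."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    shift = abs(statistics.mean(current) - mu) / sigma
    return shift > threshold

baseline = [1.0, 2.0, 3.0, 4.0, 5.0]
ok = mean_shift_alert(baseline, [2.9, 3.1, 3.0])        # near the baseline mean
drifted = mean_shift_alert(baseline, [8.0, 9.0, 10.0])  # far from it
```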

Automate prompt selection: Implement systems that automatically choose the best prompt for a given task using historical data on prompt performance and the specifics of the current request.

Monitoring and observability are about tracking LLMs’ performance, health, and operational metrics in production to ensure they perform optimally and reliably.

Use platforms that offer comprehensive observability into LLM performance, including function…

1 month, 2 weeks ago @ neptune.ai
The Real Cost of Self-Hosting MLflow
The Real Cost of Self-Hosting MLflow

The canonical MLflow setup for teams: The MLflow client embedded in the training code communicates with the MLflow tracking server, which handles access to cloud storage (artifact store) and a database (metadata store).

Deploying the MLflow tracking server: MLflow’s tracking server is relatively lightweight.

With this in mind, there are three main options for deploying the MLflow tracking server on AWS: Deploying the MLflow tracking server on an Amazon EC2 instance.

Setting up an artifact store: The artifact store is the third relevant cost item in an MLflow deployment.

Further, everyone who has access to your MLflow tr…

1 month, 2 weeks ago @ neptune.ai
Deep Learning Model Optimization Methods
Deep Learning Model Optimization Methods

Knowledge distillation transfers insights from a complex “teacher” model to a simpler “student” model, maintaining performance with less computational demand.

Distillation loss: The student model is trained not just to replicate the output of the teacher model but to match the output distributions produced by the teacher model.

Model architecture compatibility: The effectiveness of knowledge distillation depends on how well the student model can learn from the teacher model, which is greatly influenced by their architectural compatibility.

Data is fed to both a complex ‘Teacher Model’ and a simpler ‘Student Model.’ The Teacher Model, which consists of multiple layers (from Layer 1 to Layer …
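
The distillation loss described above can be sketched without any framework: soften both output distributions with a temperature and penalize their divergence. The logits, temperature, and T² scaling below follow the common Hinton-style formulation; treat this as an illustration, not the article’s exact code:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax: higher T flattens the distribution."""
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between temperature-softened teacher and student
    distributions: the student is pushed to match the teacher's full
    output distribution, not just its top prediction."""
    p = softmax(teacher_logits, T)  # soft targets from the teacher
    q = softmax(student_logits, T)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return kl * T * T  # standard T^2 scaling to keep gradient magnitudes comparable

loss_same = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
loss_diff = distillation_loss([0.1, 0.2, 0.3], [2.0, 0.5, -1.0])
```

In practice this term is combined with the ordinary cross-entropy on hard labels, weighted by a mixing coefficient.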

1 month, 3 weeks ago @ neptune.ai
Continual Learning: Methods and Application
Continual Learning: Methods and Application

Continual learning scenarios: Depending on the data stream characteristics, problems within the continual learning scenario can be divided into three, each with a common solution.

Task incremental continual learning: Task Incremental (TI) continual learning is classic multi-task learning but in an incremental way.

Continual learning methods: Over the second decade of the 2000s, continual learning methods have advanced rapidly.

How to choose the right continual learning method for your project: Across the three groups of continual learning approaches, many techniques exist.

Continual lea…

2 months ago @ neptune.ai
2024 Layoffs and LLMs: Pivoting for Success
2024 Layoffs and LLMs: Pivoting for Success

Neither teams working on proof-of-concept projects nor those building production ML systems are immune from job cuts.

The rapid advancements of Large Language Models (LLMs) are changing the day-to-day work of ML practitioners and how company leadership thinks about AI.

The rise of modern LLMs: In 2023, the dominance of modern LLMs became increasingly evident, challenging the incumbent classical NLP models.

Even small and relatively weaker LLMs like DistilGPT2 and t5-small have surpassed classical NLP models in understanding context and generating coherent text.

As you can see, it’s unclear whether people working on PoC or production projects are at higher risk of layoffs.

2 months, 1 week ago @ neptune.ai
Mikiko Bazeley: What I Learned Building the ML Platform at Mailchimp
Mikiko Bazeley: What I Learned Building the ML Platform at Mailchimp

In this article, I will outline the learnings and challenges I faced while on the ML platform team at Mailchimp, building infrastructure and setting up the environment for development and testing.

Mailchimp’s ML Platform: genesis, challenges, and objectives. Mailchimp is a 20-year-old bootstrapped email marketing company.

Team setup and responsibilities: We had around 20 data engineers and ML(Ops) engineers working on the ML platform at Mailchimp.

In times of generative AI, a good ML infrastructure matters a lot: A lesson that I’ve learned time and time again over the past years is the enduring importance of ‘boring’…

2 months, 4 weeks ago @ neptune.ai
How to Build Machine Learning Systems With a Feature Store
How to Build Machine Learning Systems With a Feature Store

Understanding machine learning pipelines: Machine learning (ML) pipelines are a key component of ML systems.

Machine learning systems with feature stores: Machine learning (ML) systems manage the data transformations, model training, and predictions made on ML models.

A feature store typically comprises a feature repository, a feature serving layer, and a metadata store.

A feature store is a data platform that supports the development and operation of machine learning systems by managing the storage and efficient querying of feature data.

A feature store stores feature data in tables called feature groups, also called feature sets or tables.
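
A minimal in-memory sketch of these ideas, with feature groups as keyed tables plus a point-lookup serving call (the class and names are invented for illustration and do not correspond to any particular feature-store product):

```python
class ToyFeatureStore:
    """Toy illustration: feature groups are named tables keyed by
    entity id; the serving layer is a point lookup that joins
    features from several groups for one entity."""

    def __init__(self):
        # group name -> {entity_id: {feature name: value}}
        self.feature_groups = {}

    def write(self, group, entity_id, features):
        """Write a row of feature values into a feature group."""
        self.feature_groups.setdefault(group, {})[entity_id] = features

    def get_feature_vector(self, entity_id, groups):
        """Serving-side lookup: join one entity's features across groups."""
        vector = {}
        for g in groups:
            vector.update(self.feature_groups.get(g, {}).get(entity_id, {}))
        return vector

store = ToyFeatureStore()
store.write("user_profile", "u1", {"age": 31, "country": "DE"})
store.write("user_activity", "u1", {"clicks_7d": 12})
vec = store.get_feature_vector("u1", ["user_profile", "user_activity"])
```

Production feature stores add what this toy omits: offline/online storage, time-travel queries for training data, and metadata management.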

2 months, 4 weeks ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
latest post 13 hours ago
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Google researchers achieve supposedly infinite context attention via compressive memory. Paper: https://arxiv.org/abs/2404.07143 Abstract:

This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. A key component in our proposed approach is a new attention technique dubbed Infini-attention. The Infini-attention incorporates a compressive memory into the vanilla attention mechanism and builds in both masked local attention and long-term linear attention mechanisms in a single Transformer block. We demonstrate the effectiveness of our approach on long-context language modeling benchmarks, 1M …

13 hours ago @ youtube.com
[ML News] Llama 3 changes the game
[ML News] Llama 3 changes the game

Meta's Llama 3 is out. New model, new license, new opportunities. References:

https://llama.meta.com/llama3/

https://ai.meta.com/blog/meta-llama-3/

https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md

https://llama.meta.com/trust-and-safety/

https://ai.meta.com/research/publications/cyberseceval-2-a-wide-ranging-cybersecurity-evaluation-suite-for-large-language-models/

https://github.com/meta-llama/llama-recipes/tree/main/recipes/responsible_ai

https://llama.meta.com/llama3/license/

https://about.fb.com/news/2024/04/meta-ai-assistant-built-with-llama-3/?utm_source=twitter&utm_medium=organic_social&utm_content=thread&utm_campaign=imagineflash

https://twitter.com/minchoi/status/178277…

1 day, 12 hours ago @ youtube.com
Hugging Face got hacked
Hugging Face got hacked

Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2

Litecoin (LTC): LQW2TRyKYetVC8WjFkhpP…

1 week ago @ youtube.com
[ML News] Microsoft to spend 100 BILLION DOLLARS on supercomputer (& more industry news)
[ML News] Microsoft to spend 100 BILLION DOLLARS on supercomputer (& more industry news)

Some updates from industry in the Machine Learning world.
1 week, 2 days ago @ youtube.com
[ML News] Jamba, CMD-R+, and other new models (yes, I know this is like a week behind 🙃)
[ML News] Jamba, CMD-R+, and other new models (yes, I know this is like a week behind 🙃)

A flurry of new models continues to appear.

1 week, 5 days ago @ youtube.com
Flow Matching for Generative Modeling (Paper Explained)
Flow Matching for Generative Modeling (Paper Explained)

Flow matching is a more general method than diffusion and serves as the basis for models like Stable Diffusion 3. Paper: https://arxiv.org/abs/2210.02747 Abstract:

We introduce a new paradigm for generative modeling built on Continuous Normalizing Flows (CNFs), allowing us to train CNFs at unprecedented scale. Specifically, we present the notion of Flow Matching (FM), a simulation-free approach for training CNFs based on regressing vector fields of fixed conditional probability paths. Flow Matching is compatible with a general family of Gaussian probability paths for transforming between noise and data samples -- which subsumes existing diffusion paths as specific instances. Interestingly, …

2 weeks, 2 days ago @ youtube.com
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping (Searchformer)
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping (Searchformer)

Paper: https://arxiv.org/abs/2402.14083 Abstract:

While Transformers have enabled tremendous progress in various application settings, such architectures still lag behind traditional symbolic planners for solving complex decision making tasks. In this work, we demonstrate how to train Transformers to solve complex planning tasks and present Searchformer, a Transformer model that optimally solves previously unseen Sokoban puzzles 93.7% of the time, while using up to 26.8% fewer search steps than standard A∗ search. Searchformer is an encoder-decoder Transformer model trained to predict the search dynamics of A∗. This model is then fine-tuned via expert iterations to perform fewer search step…

2 weeks, 4 days ago @ youtube.com
[ML News] Grok-1 open-sourced | Nvidia GTC | OpenAI leaks model names | AI Act
[ML News] Grok-1 open-sourced | Nvidia GTC | OpenAI leaks model names | AI Act

OUTLINE:

0:00 - Intro

0:15 - XAI releases Grok-1

2:00 - Nvidia GTC

4:45 - Comment of the Week

5:35 - Brute-forcing OpenAI model names

7:30 - Inflection AI gets eaten by Microsoft

9:25 - EU AI Act moving forward

11:45 - Advances in Robotics

14:00 - India retracts controversial advisory

14:30 - OpenSora

15:20 - Improved Gemma fine-tuning

16:20 - Decoding encrypted LLM traffic

17:45 - Varia References:

https://x.ai/blog/grok-os

https://github.com/xai-org/grok-1

https://finance.yahoo.com/news/nvidia-debuts-next-generation-blackwell-ai-chip-at-gtc-2024-205825161.html?guccounter=1&guce_referrer=aHR0cHM6Ly9uZXdzLmdvb2dsZS5jb20v&guce_referrer_sig=AQAAAHYRVePPrDnH3HxPV8smDzUiia_ztWttteAmHKxy-x_Z75lq…

4 weeks, 1 day ago @ youtube.com
[ML News] Devin AI Software Engineer | GPT-4.5-Turbo LEAKED | US Gov't Report: Total Extinction
[ML News] Devin AI Software Engineer | GPT-4.5-Turbo LEAKED | US Gov't Report: Total Extinction

Your weekly dose of ML News OUTLINE:

0:00 - Intro

0:15 - Devin: AI software engineer

5:50 - Mira Murati on Sora training data

6:50 - Inflection accused of copying Claude

9:00 - Tools & papers

16:30 - GPT-4.5-turbo mystery

17:30 - US government report: total extinction by AI

19:20 - Various other news References:

https://www.cognition-labs.com/introducing-devin

https://twitter.com/cognition_labs/status/1767548763134964000?t=ZECIn-uqbguwHtY8X_Gvtw&s=09

https://news.google.com/stories/CAAqNggKIjBDQklTSGpvSmMzUnZjbmt0TXpZd1NoRUtEd2lWMUwyU0N4RnVWM3pSRWhWX01pZ0FQAQ?hl=en-US&gl=US&ceid=US%3Aen

https://www.bloomberg.com/news/articles/2024-03-12/cognition-ai-is-a-peter-thiel-backed-coding-assistant?…

1 month, 1 week ago @ youtube.com
[ML News] Elon sues OpenAI | Mistral Large | More Gemini Drama
[ML News] Elon sues OpenAI | Mistral Large | More Gemini Drama

#mlnews #ainews #openai OUTLINE:

0:00 - Intro

0:20 - Elon sues OpenAI

14:00 - Mistral Large

16:40 - ML Espionage

18:30 - More Gemini Drama

24:00 - Copilot generates spicy images

26:55 - Gemma bugs

28:45 - Varia References: https://gist.github.com/yk/0c065cdc8e414738abfaae4f8e417e00 Thumbnail pictures: Wikipedia

1 month, 2 weeks ago @ youtube.com
On Claude 3
On Claude 3 1 month, 2 weeks ago @ youtube.com
No, Anthropic's Claude 3 is NOT sentient
No, Anthropic's Claude 3 is NOT sentient

No, Anthropic's Claude 3 is not conscious or sentient or self-aware. References:

https://www.anthropic.com/news/claude-3-family

https://twitter.com/_akhaliq/status/1764673955313459560?t=gkBx2uTXfrxLl-5_mL7Btg&s=09

https://twitter.com/idavidrein/status/1764675668175094169?t=pJfbN3LtKaxsU8egz83Mvg&s=09

https://twitter.com/TolgaBilge_/status/1764754012824314102?t=9bakXDnVMC1oAEyZFoKimA&s=09

https://twitter.com/karinanguyen_/status/1764670019743690757?t=gkBx2uTXfrxLl-5_mL7Btg&s=09

https://twitter.com/alexalbert__/status/1764722513014329620

https://www.lesswrong.com/posts/pc8uP4S9rDoNpwJDZ/claude-3-claims-its-conscious

1 month, 2 weeks ago @ youtube.com
[ML News] Groq, Gemma, Sora, Gemini, and Air Canada's chatbot troubles
[ML News] Groq, Gemma, Sora, Gemini, and Air Canada's chatbot troubles

Your dose of ML News! OUTLINE:

0:00 - Intro

0:20 - Gemma & Gemini

3:40 - Groq

6:30 - Nvidia EOS Supercomputer

7:15 - Gpulist.ai

8:20 - Demis Hassabis on scale

10:10 - Hardware wars

12:05 - Sora

15:10 - Gemini 1.5 Pro & Long Context

18:45 - Air Canada must pay for chatbot mistake

23:30 - Giant Rat Balls

26:25 - Various News References:

https://blog.google/technology/developers/gemma-open-models/?utm_source=tw

https://twitter.com/altryne/status/1760358916624719938?t=PVZkHQA_p7GxmeUX0hcZ_Q&s=09

https://twitter.com/paulg/status/1760078920135872716?t=PVZkHQA_p7GxmeUX0hcZ_Q&s=09

https://groq.com/

https://twitter.com/mattshumer_/status/1759347920543834117?t=cS5nPvZOsV6iDA1mVabHOg&s=09

https://twit…

1 month, 3 weeks ago @ youtube.com
Gemini has a Diversity Problem
Gemini has a Diversity Problem

Google turned the anti-bias dial up to 11 on their new Gemini Pro model. References:

https://developers.googleblog.com/2024/02/gemini-15-available-for-private-preview-in-google-ai-studio.html

https://blog.google/technology/developers/gemma-open-models/?utm_source=tw

https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf

https://twitter.com/ClementDelangue/status/1760324815888486668?t=spXd7Oq_cSrRN2A-3r6gnQ&s=09

https://twitter.com/paulg/status/1760078920135872716?t=PVZkHQA_p7GxmeUX0hcZ_Q&s=09

https://twitter.com/yoavgo/status/1760445342691016811/photo/3

https://twitter.com/alex_peys/status/1760327435890135279/photo/2

https://twitter.com/woke8yearold/status/1760310705142558781/…

2 months ago @ youtube.com
V-JEPA: Revisiting Feature Prediction for Learning Visual Representations from Video (Explained)
V-JEPA: Revisiting Feature Prediction for Learning Visual Representations from Video (Explained)

#vjepa #meta #unsupervisedlearning V-JEPA is a method for unsupervised representation learning of video data by using only latent representation prediction as objective function. Weights & Biases course on Structured LLM Outputs: https://wandb.me/course-yannic OUTLINE:

0:00 - Intro

1:45 - Predictive Feature Principle

8:00 - Weights & Biases course on Structured LLM Outputs

9:45 - The original JEPA architecture

27:30 - V-JEPA Concept

33:15 - V-JEPA Architecture

44:30 - Experimental Results

46:30 - Qualitative Evaluation via Decoding Blog: https://ai.meta.com/blog/v-jepa-yann-lecun-ai-model-video-joint-embedding-predictive-architecture/

Paper: https://ai.meta.com/research/publications/revisit…

2 months ago @ youtube.com
Henry AI Labs
latest post 6 days, 13 hours ago
Llama 3 RAG Demo with DSPy Optimization, Ollama, and Weaviate!
Llama 3 RAG Demo with DSPy Optimization, Ollama, and Weaviate!

Hey everyone! Thank you so much for watching this overview of Llama 3 looking at the release notes and seeing a demo of how to integrate it with DSPy through Ollama and how to use DSPy's MIPRO to find the optimal prompt when using this new large language model for RAG! We are hosting an event in San Francisco on May 1st with Arize AI and Cohere, featuring a talk from Omar Khattab, the lead author of DSPy! Hope to see you there! https://lu.ma/dspy Introducing Meta Llama 3: https://ai.meta.com/blog/meta-llama-3/ Ollama Llama 3: https://ollama.com/library/llama3 Weaviate Recipes: https://github.com/weaviate/recipes/blob/main/integrations/dspy/llms/Llama3.ipynb Chapters

0:00 Llama3!!

1:28 Relea…

6 days, 13 hours ago @ youtube.com
Building RAG with Command R+ from Cohere, DSPy, and Weaviate!
Building RAG with Command R+ from Cohere, DSPy, and Weaviate!

Hey everyone! Thank you so much for watching this overview of Command R+ showing you how you can use the new model in DSPy and a quick RAG demo, as well as walking through the details of the release post! Congratulations to the Cohere team! Super exciting times to be working with LLM systems! Introducing Command R+: A Scalable LLM Built for Business - https://txt.cohere.com/command-r-plus-microsoft-azure/ Link to demo notebook - https://github.com/weaviate/recipes/blob/main/integrations/dspy/llms/Command-R-Plus.ipynb Chapters

0:00 Welcome! Command R+!

1:12 Demo with Cohere, DSPy, and Weaviate

6:06 Command R+ Announcement Post

9:24 LLM Evals

2 weeks, 6 days ago @ youtube.com
Structured Outputs with DSPy
Structured Outputs with DSPy

Unfortunately, Large Language Models will not consistently follow the instructions that you give them. This is a massive problem when you are building AI systems that require a particular type of output from the previous step to feed into the next one! For example, imagine you are building a blog post writing system that first takes a question and retrieved context to output a list of topics. These topics have to be formatted in a particular way, such as a comma-separated list or a JSON of Topic objects, such that the system can continue writing the blog post! I am SUPER excited to share the 4th video in my DSPy series, diving into 3 solutions to structuring outputs in DSPy programs: (1) **…

3 weeks, 2 days ago @ youtube.com
Adding Depth to DSPy Programs
Adding Depth to DSPy Programs

Hey everyone! Thank you so much for watching the 3rd edition of the DSPy series, Adding Depth to DSPy Programs!! You can find the examples and links to community resources / news on https://github.com/weaviate/recipes! Chapters

0:00 Intro

0:50 Chapters Overview

5:06 Weaviate Recipes

5:24 DSPy News and Community Notes

13:51 Adding Depth to RAG Programs

18:40 Multi-Model DSPy Programs

20:18 DSPy Optimizers

25:30 Deep Dive Optimizers

27:55 Into the Optimizer Code!

37:48 Demo #1: Adding Depth to RAG

1:05:25 Demo #2: Questions to Blogs

1:07:48 Thank you so much for watching!

1 month, 3 weeks ago @ youtube.com
Getting Started with RAG in DSPy!
Getting Started with RAG in DSPy!

Hey everyone! Thank you so much for watching this tutorial on getting started with RAG programming in DSPy! This video will take you through 4 major aspects of building DSPy programs (1) Installation, settings, and Datasets with dspy.Example, (2) LLM Metrics, (3) The DSPy programming model, and (4) Optimization!! The notebook used in the video can be found here: https://github.com/weaviate/recipes/blob/main/integrations/dspy/1.Getting-Started-with-RAG-in-DSPy.ipynb All future videos, as well as additional utils like data import scripts, will be in this folder: https://github.com/weaviate/recipes/tree/main/integrations/dspy Please leave a star, it helps a lot! DSPy on GitHub: https://github.…

2 months, 1 week ago @ youtube.com
DSPy Explained!
DSPy Explained!

Hey everyone! Thank you so much for watching this explanation of DSPy! DSPy is a super exciting new framework for developing LLM programs! As pioneered by frameworks such as LangChain and LlamaIndex, we can build much more powerful systems by chaining together LLM calls! This means that the output of one call to an LLM is the input to the next, and so on. We can think of chains as programs, with each LLM call analogous to a function that takes text as input and produces text as output. DSPy offers a new programming model, inspired by PyTorch, that gives you a massive amount of control over these LLM programs. Further, the Signature abstraction wraps prompts and structured input / outputs to cle…

2 months, 3 weeks ago @ youtube.com
3blue1brown
latest post 1 week, 6 days ago
How word vectors encode meaning
How word vectors encode meaning

This comes from a full video dissecting how LLMs work. In the shorts player, you can click the link at the bottom of the screen, or for reference: https://youtu.be/wjZofJX0v4M

1 week, 6 days ago @ youtube.com
Visualizing Attention, a Transformer's Heart | Chapter 6, Deep Learning
Visualizing Attention, a Transformer's Heart | Chapter 6, Deep Learning

Demystifying attention, the key mechanism inside transformers and LLMs.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

Special thanks to these supporters: https://www.3blue1brown.com/lessons/attention#thanks

An equally valuable form of support is to simply share the videos. Demystifying self-attention, multiple heads, and cross-attention.

The first pass for the translated subtitles here is machine-generated, and therefore notably imperfect. To contribute edits or fixes, visit https://translate.3blue1brown.com/ ------------------ Here are …

2 weeks, 3 days ago @ youtube.com
But what is a GPT? Visual intro to Transformers | Deep learning, chapter 5
But what is a GPT? Visual intro to Transformers | Deep learning, chapter 5

An introduction to transformers and their prerequisites

Early view of the next chapter for patrons: https://3b1b.co/early-attention Other recommended resources on the topic. Richard Turner's introduction is one of the best starting places:

https://arxiv.org/pdf/2304.10557.pdf Coding a GPT with Andrej Karpathy

https://youtu.be/kCc8FmEb1nY Introduction to self-attention by John Hewitt

https://web.stanford.edu/class/cs224n/readings/cs224n-self-attention-transformers-2023_draft.pdf History of language models by Brit Cruise:

https://youtu.be/OFS90-FX6pg ------------------ Timestamps 0:00 - Predict, sample, repeat

3:03 - Inside a transformer

6:36 - Chapter layout

7:20 - The premise of Deep Learni…

3 weeks, 2 days ago @ youtube.com
Simulating the electric field and a moving charge
Simulating the electric field and a moving charge

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/aXRTczANuIs

3 months ago @ youtube.com
How the Mandelbrot set is defined
How the Mandelbrot set is defined

A link to the full video answering this is at the bottom of the screen. Or, for reference: https://youtu.be/LqbZpur38nw

3 months ago @ youtube.com
A challenging puzzle about subset sums
A challenging puzzle about subset sums

A link to the full video answering this is at the bottom of the screen. Or, for reference: https://youtu.be/bOXCLR3Wric

3 months ago @ youtube.com
Ellipses have multiple definitions, how are these the same?
Ellipses have multiple definitions, how are these the same?

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/pQa_tWZmlGs The full video this comes from proves why slicing a cone gives the same shape as the two-thumbtacks-and-string construction, which is beautiful. Editing from long-form to short by Dawid Kołodziej

3 months ago @ youtube.com
Three levels of understanding Bayes' theorem

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/HZGCoVF3YvM Editing from long-form to short by Dawid Kołodziej

3 months, 1 week ago @ youtube.com
The medical test paradox (well "paradox")

A link to the full video about Bayesian thinking is at the bottom of the screen.

Or, for reference: https://youtu.be/lG4VkPoG3ko Long-to-short editing by Dawid Kołodziej

3 months, 1 week ago @ youtube.com
Positioned as the hardest question on a Putnam exam (#6, 1992)

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/OkmNXy7er84 Editing from the original video into this short by Dawid Kołodziej

3 months, 2 weeks ago @ youtube.com
Why does light slowing imply a bend? (Beyond the tank/car analogy)

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/Cz4Q4QOuoo8 That video answers various viewer questions about the index of refraction. Editing from long-form to short by Dawid Kołodziej

3 months, 2 weeks ago @ youtube.com
The cube shadow puzzle

A link to the full video is at the bottom of the screen. Or, for reference: https://youtu.be/ltLUadnCyi0

3 months, 2 weeks ago @ youtube.com
What does it mean that light "slows down" in glass?

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/KTzGBJPuJwM That video unpacks the mechanism behind how light slows down in passing through a medium, and why the slow-down rate would depend on color. Editing from long-form to short by Dawid Kołodziej

3 months, 2 weeks ago @ youtube.com
Why do we call them "scalars"?

A link to the full video is at the bottom of the screen.

Or, for reference: https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab Editing from long-form to short by Dawid Kołodziej

3 months, 2 weeks ago @ youtube.com
A beautiful international math olympiad problem

The link to the full video is at the bottom of the screen. For reference, here it is: https://youtu.be/M64HUIJFTZM

3 months, 3 weeks ago @ youtube.com
Two Minute Papers
latest post 1 week ago
GPT-4 Just Got Supercharged!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers ChatGPT:

https://chat.openai.com/ Chatbot arena leaderboard:

https://chat.lmsys.org/?leaderboard Video studying Devin: https://www.youtube.com/watch?v=tNmgmwEtoWE Try this instead! https://github.com/princeton-nlp/SWE-agent 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or see the original Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gord…

1 week ago @ youtube.com
NVIDIA’s AI Puts You In a Video Game 75,000x Faster!

❤️ Check out Microsoft Azure AI and try it out for free:

https://azure.microsoft.com/en-us/solutions/ai 📝 The paper "Live 3D Portrait: Real-Time Radiance Fields for Single-Image Portrait View Synthesis " is available here:

https://research.nvidia.com/labs/nxp/lp3d/ Fully Connected conference: https://wandb.ai/site/resources/events/fully-connected MKBHD, iJustine, Brian Tong persona source: https://www.youtube.com/watch?v=dtp6b76pMak

1 week, 3 days ago @ youtube.com
DeepMind’s New AI Saw 15,000,000,000 Chess Boards!

❤️ Check out Microsoft Azure AI and try it out for free:

https://azure.microsoft.com/en-us/solutions/ai 📝 The paper "Grandmaster-Level Chess Without Search" is available here:

https://arxiv.org/abs/2402.04494 +1:

https://www.anthropic.com/news/decomposing-language-models-into-understandable-components

https://transformer-circuits.pub/2023/monosemantic-features/index.html

1 week, 5 days ago @ youtube.com
NVIDIA’s New Tech: Master of Illusions!

❤️ Check out Microsoft Azure AI and try it out for free:

https://azure.microsoft.com/en-us/solutions/ai 📝 The paper "ViCMA: Visual Control of Multibody Animations" is available here:

https://research.nvidia.com/labs/prl/vicma/ 📝 The paper "Inverse-Foley Animation: Synchronizing rigid-body motions to sound" is available here:

https://www.cs.cornell.edu/projects/Sound/ifa/

2 weeks ago @ youtube.com
DeepMind’s Gemini AI: Assistant From The Future!

❤️ Check out Microsoft Azure AI and try it out for free:

https://azure.microsoft.com/en-us/solutions/ai 📝 The paper "Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context" is available here:

https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf 📝 The paper "Gemma: Open Models Based on Gemini Research and Technology" is available here:

https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf Try Gemma:

https://huggingface.co/chat Sources:

https://twitter.com/skirano/status/1760468624706351383

https://twitter.com/mckaywrigley/status/1761113846520131816

https://simonwillison.net/2024/Feb/21/gemini-pro-video/

https://twitter.com/ha…

2 weeks, 4 days ago @ youtube.com
Blender 4.1 - An Amazing Tool…For Free!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers Get Blender: https://www.blender.org/ Demo files: https://www.blender.org/download/demo-files/

Blend Swap: https://www.blendswap.com/ Andrew Price's donut tutorial:

https://www.youtube.com/watch?v=B0J27sf9N1Y&list=PLjEaoINr3zgEPv5y--4MKpciLaoQYZB1Z&index=2

2 weeks, 6 days ago @ youtube.com
OpenAI Sora: Beauty And Horror!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 My Master thesis on fluids, with source code: https://users.cg.tuwien.ac.at/zsolnai/gfx/fluid_control_msc_thesis/

📝 Paper/poster on fluid control, with source code: https://users.cg.tuwien.ac.at/zsolnai/gfx/real_time_fluid_control_eg/

3 weeks, 3 days ago @ youtube.com
OpenAI Sora Just Supercharged Filmmaking!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers The conference:

https://wandb.ai/site/resources/events/fully-connected First impressions from creatives: https://openai.com/blog/sora-first-impressions

4 weeks, 1 day ago @ youtube.com
NVIDIA GTC: This Is The Future Of Everything!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers

1 month ago @ youtube.com
DeepMind’s New Gaming AI Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "A generalist AI agent for 3D virtual environments" is available here:

https://deepmind.google/discover/blog/sima-generalist-ai-agent-for-3d-virtual-environments/

1 month ago @ youtube.com
The First AI Software Engineer Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers

1 month, 1 week ago @ youtube.com
The First AI Virus Is Here!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper "ComPromptMized: Unleashing Zero-click Worms that Target GenAI-Powered Applications" is available here:

https://sites.google.com/view/compromptmized

1 month, 1 week ago @ youtube.com
Claude 3 AI: Smarter Than GPT-4?

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 Claude 3 is available here - try it out for free (note that we are not affiliated with them):

https://www.anthropic.com/news/claude-3-family Conference I am coming to:

https://wandb.ai/site/resources/events/fully-connected Experiments, evaluations:

https://twitter.com/JiaweiLiu_/status/1764866009175920892

https://twitter.com/tolgabilge_/status/1764754012824314102

https://twitter.com/RubenHssd/status/1764692641436827842 Additional results (very nice so far!):

https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard

1 month, 2 weeks ago @ youtube.com
Stable Diffusion 3 - An Amazing AI For Free!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis" is available here:

https://stability.ai/news/stable-diffusion-3-research-paper Waitlist: https://stability.ai/stablediffusion3

1 month, 2 weeks ago @ youtube.com
DeepMind’s New AI Makes Games From Scratch!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Genie: Generative Interactive Environments" is available here:

https://sites.google.com/view/genie-2024/

1 month, 3 weeks ago @ youtube.com
DataFest Video
latest post: none
Яндекс. Компьютерные науки
latest post 8 months, 1 week ago
03. Discussion: The Near Future of Diffusion Models

Participants: - Petr Ermakov, ML Brand Director, Yandex

- Ivan Barabanov, Developer, VKontakte, deep vk

- Valentin Khrulkov, Lead Researcher, Yandex Research - Denis Dimitrov, Executive Director for Data Science, Sber AI, and scientific consultant at AIRI. Together with these specialists in diffusion image models, we speculate about where the field is heading, and discuss the current state of affairs and the field's pressing technical barriers.

8 months, 1 week ago @ youtube.com
02. Practical Aspects of Training Large-Scale Diffusion Models - Valentin Khrulkov

Speaker: Valentin Khrulkov, Lead Researcher, Yandex Research. We walk through every stage of training foundational diffusion models, from dataset preparation to regular quality measurements during training. We discuss scaling-law experiments and their preliminary results, as well as various practical applications of these models: generating ad banners, personalization, and the Shedevrum social-network service.

8 months, 1 week ago @ youtube.com
01. Kandinsky 2.X - Andrey Kuznetsov

Speaker: Andrey Kuznetsov, Executive Director for Data Science, Sber AI. We review the theory and the publicly available information about the diffusion process. I will show in detail how the Kandinsky 2.1 architecture differs from 2.2. We discuss metrics for evaluating generation quality, go over the key results of the releases, and look together at the business scenarios and practical applications where our model can be used.

8 months, 1 week ago @ youtube.com
ML Party 03.08.2023

Welcome to Yandex's evening meetup for ML engineers. This time we talk about the currently fashionable generative models, namely diffusion image models! Program:

00:00:00 – countdown timer before the stream

00:02:12 – conference opening

00:04:52 – Kandinsky 2.X

Andrey Kuznetsov, Executive Director for Data Science, Sber AI.

00:46:55 – break

01:03:40 – Practical aspects of training large-scale diffusion models. Valentin Khrulkov, Lead Researcher, Yandex Research. 01:49:34 – Discussion: The Near Future of Diffusion Models. Join our Telegram community to keep up with all Yandex events and announcements: https://t.me/yadatadojo. Вопрос…

8 months, 3 weeks ago @ youtube.com
ML Trainings
latest post 1 month ago
Data Fusion Contest 2024 - meetup with a follow-up breakdown of the Geoanalytics task and Q&A (21.03.2024)

Speakers: Dmitry Kolodezev, Alexey Pustynnikov, Alexey Natekin. Competition page: https://ods.ai/tracks/data-fusion-2024-competitions

The competition deadline is April 5, 2024 - join in!

1 month ago @ youtube.com
Data Fusion Contest 2024 - meetup on the Geoanalytics and Churn Model tasks (29.02.2024)

Speakers: Alexey Natekin, Dmitry Kolodezev. Competition page: https://ods.ai/tracks/data-fusion-2024-competitions

All presentations can be downloaded from the meetup page: https://ods.ai/tracks/data-fusion-2024-competitions/meetup The competition deadline is April 5, 2024 - join in!

1 month, 1 week ago @ youtube.com
ODS SPB, WiDS Meetup, March 7

We are delighted to invite you to a unique event dedicated to the strength and contributions of women in the world of data: the "Women in Data Science" meetup! This is not just a gathering but a celebration of intellect, talent, and inspiration, organized by ODS SPB with the support of Samokat.tech. The full program is available on ODS:

https://ods.ai/events/wids-meetup-2024 Join the community: https://ods.ai/ Data Fest & Course Fest social media: https://t.me/datafest

https://vk.com/datafest

1 month, 3 weeks ago @ youtube.com
Trees and Their Ensembles 2023 | Growing a Tree

Open ML Course: Trees and Their Ensembles:

https://ods.ai/tracks/trees-autumn23

Course season: https://ods.ai/events/course_season_autumn_23 Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

Telegram channel with job postings: https://t.me/odsjobs

Telegram channel with course updates: https://t.me/odscourses

How to join the ODS Mattermost community chat: https://ods.ai/tracks/mattermost

2 months, 1 week ago @ youtube.com
Trees and Their Ensembles 2023 | Trees in Data Analysis

Open ML Course: Trees and Their Ensembles:

https://ods.ai/tracks/trees-autumn23

Course season: https://ods.ai/events/course_season_autumn_23

2 months, 2 weeks ago @ youtube.com
DRL Course 2023 | Model-Free Reinforcement Learning: Monte-Carlo, SARSA, Q-Learning

Deep Reinforcement Learning Course 2023: https://ods.ai/tracks/drlcourse23

Course season: https://ods.ai/events/course_season_autumn_23 In the fourth lecture we:

- Consider MDPs whose reward and state-transition functions are unknown

- Cover the Monte-Carlo and Temporal-Difference approaches to estimating the Q-function in this setting

- Discuss epsilon-greedy policies

- Derive the Monte-Carlo, SARSA, and Q-learning algorithms. Course author: Anton Plaksin, researcher in the Yandex.Research group and associate professor at Ural Federal University.
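The Q-learning algorithm this lecture derives can be sketched compactly. Below is a minimal tabular loop on a hypothetical five-state chain; the environment, hyperparameters, and all names are illustrative, not the course's code:

```python
import random

N_STATES, ACTIONS = 5, (0, 1)        # hypothetical chain; 0 = left, 1 = right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1    # learning rate, discount, exploration

def step(s, a):
    """Unknown to the agent: reward 1 only for reaching the right end."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0), s2 == N_STATES - 1

def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy behaviour policy
            if rng.random() < EPS:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[s][x])
            s2, r, done = step(s, a)
            # off-policy TD target: max over next-state actions (Q-learning);
            # SARSA would instead use the action actually taken next.
            Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning()
greedy = [max(ACTIONS, key=lambda x: Q[s][x]) for s in range(N_STATES)]
```

After training, the greedy policy moves right in every non-terminal state, as expected.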

2 months, 3 weeks ago @ youtube.com
DRL Course 2023 | Dynamic Programming. Policy and Value Iterations

Deep Reinforcement Learning Course 2023: https://ods.ai/tracks/drlcourse23

Course season: https://ods.ai/events/course_season_autumn_23 In the third lecture we:

- Discuss the principle of dynamic programming

- Introduce the notions of the v- and q-functions and of an optimal policy

- Write out the Bellman equations and learn to solve them with the Policy Iteration and Value Iteration methods. Course author: Anton Plaksin, researcher in the Yandex.Research group and associate professor at Ural Federal University.
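Value Iteration, one of the two solution methods this lecture covers, repeatedly applies the Bellman optimality backup V(s) = max_a [r(s,a) + γ·V(s')] until the values stop changing. A minimal sketch on a hypothetical deterministic MDP (all states, actions, and numbers are illustrative):

```python
GAMMA = 0.9

# transitions[s][a] = (next_state, reward); state 3 is terminal.
transitions = {
    0: {"right": (1, 0.0)},
    1: {"left": (0, 0.0), "right": (2, 0.0)},
    2: {"left": (1, 0.0), "right": (3, 1.0)},
}

def value_iteration(tol=1e-8):
    V = {s: 0.0 for s in (0, 1, 2, 3)}
    while True:
        delta = 0.0
        for s, acts in transitions.items():
            # Bellman optimality backup for state s
            best = max(r + GAMMA * V[s2] for s2, r in acts.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration()
# Greedy policy with respect to the converged value function.
policy = {s: max(acts, key=lambda a: acts[a][1] + GAMMA * V[acts[a][0]])
          for s, acts in transitions.items()}
```

Here the values converge to 0.81, 0.9, and 1.0 for states 0, 1, and 2, and the greedy policy always moves right.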

2 months, 3 weeks ago @ youtube.com
DRL Course 2023 | Practical Session 2. PyTorch and the Deep Cross-Entropy Method

Deep Reinforcement Learning Course 2023: https://ods.ai/tracks/drlcourse23

Course season: https://ods.ai/events/course_season_autumn_23 In the second practical session we: - Get familiar with PyTorch

- Write a linear regression

- Solve a regression task with neural networks

- Implement the Deep Cross-Entropy method and solve CartPole
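The core loop of the cross-entropy method implemented in this session (sample candidates, keep an elite fraction by score, refit the sampling distribution to the elites, repeat) can be shown without PyTorch on a hypothetical one-dimensional problem. The objective and hyperparameters below are illustrative; the deep variant samples episodes from a neural-network policy and selects elites by episode return instead:

```python
import random
import statistics

def score(x):
    # Toy objective standing in for an episode return; maximum at x = 3.
    return -(x - 3.0) ** 2

def cross_entropy_method(iters=40, batch=100, elite_frac=0.2, seed=0):
    rng = random.Random(seed)
    mu, sigma = 0.0, 5.0                      # initial sampling distribution
    n_elite = int(batch * elite_frac)
    for _ in range(iters):
        xs = [rng.gauss(mu, sigma) for _ in range(batch)]
        # keep the elite fraction with the highest scores
        elites = sorted(xs, key=score, reverse=True)[:n_elite]
        mu = statistics.mean(elites)          # refit the distribution
        sigma = statistics.stdev(elites) + 1e-6
    return mu

print(cross_entropy_method())
```

The sampling distribution quickly concentrates near the optimum at x = 3.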

2 months, 3 weeks ago @ youtube.com
Linear Models 2023 | Homework Walkthrough

Open ML Course: Linear Models 2023: https://ods.ai/tracks/linear-models-autumn23

Course season: https://ods.ai/events/course_season_autumn_23

3 months ago @ youtube.com
DRL Course 2023 | Practical Session 3. Policy Iteration

In the third practical session we: - Get familiar with the Frozen Lake environment

- Write Policy Iteration. Course author: Anton Plaksin, researcher in the Yandex.Research group and associate professor at Ural Federal University.

3 months ago @ youtube.com
Linear Models 2023 | Model Selection. Creating New Features

Open ML Course: Linear Models 2023: https://ods.ai/tracks/linear-models-autumn23

Course season: https://ods.ai/events/course_season_autumn_23

3 months ago @ youtube.com
My First Data Project: From Idea to Product - Building a Product Prototype. Proof of Concept

Course page: https://ods.ai/tracks/my_first_data_project

All supplementary materials are in the block on the course page: https://ods.ai/tracks/my_first_data_project/blocks/98c41cb4-aaff-4c5e-8ddf-252be36ed722

3 months ago @ youtube.com
Trees and Their Ensembles 2023 | Additional Constraints When Building Trees

Open ML Course: Trees and Their Ensembles:

https://ods.ai/tracks/trees-autumn23

Course season: https://ods.ai/events/course_season_autumn_23

3 months ago @ youtube.com
Fundamentals of ML System Design (autumn 2023 update)

ML System Design course: https://ods.ai/tracks/ml-system-design-23

Course season: https://ods.ai/events/course_season_autumn_23

3 months ago @ youtube.com
DRL Course 2023 | Introduction to Neural Networks. Deep Cross-Entropy Method

Deep Reinforcement Learning Course 2023: https://ods.ai/tracks/drlcourse23

Course season: https://ods.ai/events/course_season_autumn_23 In the second lecture:

- we introduce the notions of a neuron, activation functions, and neural networks;

- we briefly outline the neural-network approach to regression and classification tasks;

- we present Cybenko's theorem on the approximation of continuous functions by neural networks;

- we describe a modification of the Cross-Entropy algorithm that uses neural networks to solve reinforcement-learning problems with infinite state and action spaces. Course author: Anton Plaksin, researcher in the Yandex.Research group and associate professor at Ural Federal University.
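A neuron (a weighted sum of inputs plus a bias, passed through an activation) and the one-hidden-layer networks that Cybenko-style approximation theorems describe can be sketched in a few lines. The weights below are hand-picked for illustration, not trained as in the course:

```python
def relu(z):
    return max(0.0, z)

def neuron(inputs, weights, bias, act=relu):
    # A neuron: weighted sum of its inputs plus a bias, then an activation.
    return act(sum(w * x for w, x in zip(weights, inputs)) + bias)

def mlp(x, hidden, out_w, out_b):
    """One hidden layer of neurons followed by a linear output: the
    function family that, with enough hidden neurons, can approximate
    any continuous function on a compact set."""
    h = [neuron(x, w, b) for w, b in hidden]
    return sum(w * v for w, v in zip(out_w, h)) + out_b

# Hand-picked weights realizing f(x) = |x| as relu(x) + relu(-x).
hidden = [([1.0], 0.0), ([-1.0], 0.0)]
print(mlp([-2.5], hidden, out_w=[1.0, 1.0], out_b=0.0))
```

With these two hidden neurons the network computes |x| exactly; approximating a general continuous function requires more neurons, which is precisely what the theorem quantifies over.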

3 months, 1 week ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
latest post 2 days, 16 hours ago
#428 – Sean Carroll: General Relativity, Quantum Mechanics, Black Holes & Aliens

Sean Carroll is a theoretical physicist, author, and host of Mindscape podcast.

Please support this podcast by checking out our sponsors:
– HiddenLayer: https://hiddenlayer.com/lex
– Cloaked: https://cloaked.com/lex and use code LexPod to get 25% off
– Notion: https://notion.com/lex
– Shopify: https://shopify.com/lex to get $1 per month trial
– NetSuite: http://netsuite.com/lex to get free product tour
Transcript: https://lexfridman.com/sean-carroll-3-transcript
EPISODE LINKS:
Sean’s Website: https://preposterousuniverse.com
Mindscape Podcast: https://www.preposterousuniverse.com/podcast/
Sean’s YouTube: https://youtube.com/@seancarroll
Sean’s Patreon: https://www.patreon.com/seanmcarroll
Sean’s Twitte…

2 days, 16 hours ago @ lexfridman.com
#427 – Neil Adams: Judo, Olympics, Winning, Losing, and the Champion Mindset

Neil Adams is a judo world champion, 2-time Olympic silver medalist, 5-time European champion, and often referred to as the Voice of Judo.

Please support this podcast by checking out our sponsors:
– ZipRecruiter: https://ziprecruiter.com/lex
– Eight Sleep: https://eightsleep.com/lex to get special savings
– MasterClass: https://masterclass.com/lexpod to get 15% off
– LMNT: https://drinkLMNT.com/lex to get free sample pack
– NetSuite: http://netsuite.com/lex to get free product tour
EPISODE LINKS:
Neil’s Instagram: https://instagram.com/naefighting
Neil’s YouTube: https://youtube.com/NAEffectiveFighting
Neil’s TikTok: https://tiktok.com/@neiladamsmbe
Neil’s Facebook: https://facebook.com/NeilAdamsJudo…

4 days, 13 hours ago @ lexfridman.com
#426 – Edward Gibson: Human Language, Psycholinguistics, Syntax, Grammar & LLMs

Edward Gibson is a psycholinguistics professor at MIT and heads the MIT Language Lab.

Please support this podcast by checking out our sponsors:
– Yahoo Finance: https://yahoofinance.com
– Listening: https://listening.com/lex and use code LEX to get one month free
– Policygenius: https://policygenius.com/lex
– Shopify: https://shopify.com/lex to get $1 per month trial
– Eight Sleep: https://eightsleep.com/lex to get special savings
Transcript: https://lexfridman.com/edward-gibson-transcript
EPISODE LINKS:
Edward’s X: https://x.com/LanguageMIT
TedLab: https://tedlab.mit.edu/
Edward’s Google Scholar: https://scholar.google.com/citations?user=4FsWE64AAAAJ
TedLab’s YouTube: https://youtube.com/@Tedlab-MITP…

1 week ago @ lexfridman.com
#425 – Andrew Callaghan: Channel 5, Gonzo, QAnon, O-Block, Politics & Alex Jones

Andrew Callaghan is the host of Channel 5 on YouTube, where he does street interviews with fascinating humans at the edges of society, the so-called vagrants, vagabonds, runaways, outlaws, from QAnon adherents to Phish heads to O Block residents and much more.

Please support this podcast by checking out our sponsors:
– ShipStation: https://shipstation.com/lex and use code LEX to get 60-day free trial
– BetterHelp: https://betterhelp.com/lex to get 10% off
– LMNT: https://drinkLMNT.com/lex to get free sample pack
– MasterClass: https://masterclass.com/lexpod to get 15% off
– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil
Transcript: https://lexfridman.com/andrew-callaghan-transcri…

1 week, 4 days ago @ lexfridman.com
#424 – Bassem Youssef: Israel-Palestine, Gaza, Hamas, Middle East, Satire & Fame

Bassem Youssef is an Egyptian-American comedian & satirist, referred to as the Jon Stewart of the Arab World.

Please support this podcast by checking out our sponsors:
– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil
– Shopify: https://shopify.com/lex to get $1 per month trial
– Eight Sleep: https://eightsleep.com/lex to get special savings
– LMNT: https://drinkLMNT.com/lex to get free sample pack
EPISODE LINKS:
Bassem’s X: https://x.com/Byoussef
Bassem’s Instagram: https://instagram.com/bassem
Bassem’s Facebook: https://facebook.com/bassemyousseftv
Bassem’s Website: https://bassemyoussef.xyz
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co…

2 weeks, 5 days ago @ lexfridman.com
#423 – Tulsi Gabbard: War, Politics, and the Military Industrial Complex

Tulsi Gabbard is a politician, veteran, and author of For Love of Country.

Please support this podcast by checking out our sponsors:
– Riverside: https://creators.riverside.fm/LEX and use code LEX to get 30% off
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
– NetSuite: http://netsuite.com/lex to get free product tour
– Notion: https://notion.com/lex
EPISODE LINKS:
For Love of Country (book): https://amzn.to/3VLlofM
Tulsi’s X: https://x.com/tulsigabbard
Tulsi’s YouTube: https://youtube.com/@TulsiGabbard
Tulsi’s Podcast: https://youtube.com/@TheTulsiGabbardShow
Tulsi’s Instagram: https://instagram.com/tulsigabbard
Tulsi’s Facebook: https://facebook.com/TulsiGabbard
Tulsi’s Website: htt…

3 weeks, 1 day ago @ lexfridman.com
#422 – Mark Cuban: Shark Tank, DEI & Wokeism Debate, Elon Musk, Politics & Drugs

Mark Cuban is a businessman, investor, star of TV series Shark Tank, long-time principal owner of Dallas Mavericks, and founder of Cost Plus Drugs.

Please support this podcast by checking out our sponsors:
– Listening: https://listening.com/lex and use code LEX to get one month free
– Cloaked: https://cloaked.com/lex and use code LexPod to get 25% off
– Notion: https://notion.com/lex
– Eight Sleep: https://eightsleep.com/lex to get special savings
– Shopify: https://shopify.com/lex to get $1 per month trial
EPISODE LINKS:
Mark’s X: https://twitter.com/mcuban
Mark’s Instagram: https://instagram.com/mcuban
Cost Plus Drugs: https://costplusdrugs.com
Shark Tank: https://abc.com/shows/shark-tank
Dallas Mav…

3 weeks, 5 days ago @ lexfridman.com
#421 – Dana White: UFC, Fighting, Khabib, Conor, Tyson, Ali, Rogan, Elon & Zuck

Dana White is the CEO and president of the UFC.

Please support this podcast by checking out our sponsors:
– LMNT: https://drinkLMNT.com/lex to get free sample pack
– Notion: https://notion.com/lex
– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil
– InsideTracker: https://insidetracker.com/lex to get 20% off
Transcript: https://lexfridman.com/dana-white-transcript
EPISODE LINKS:
Dana’s X: https://x.com/danawhite
Dana’s Instagram: https://instagram.com/danawhite
Dana’s Facebook: https://facebook.com/danawhite
UFC’s YouTube: https://youtube.com/@UFC
UFC’s Website: https://ufc.com/
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: h…

1 month ago @ lexfridman.com
#420 – Annie Jacobsen: Nuclear War, CIA, KGB, Aliens, Area 51, Roswell & Secrecy

Annie Jacobsen is an investigative journalist and author of “Nuclear War: A Scenario” and many other books on war, weapons, government secrecy, and national security.

Please support this podcast by checking out our sponsors:
– HiddenLayer: https://hiddenlayer.com/lex
– BetterHelp: https://betterhelp.com/lex to get 10% off
– Policygenius: https://policygenius.com/lex
– NetSuite: http://netsuite.com/lex to get free product tour
EPISODE LINKS:
Nuclear War: A Scenario (book): https://amzn.to/3THZHfr
Annie’s Twitter: https://twitter.com/anniejacobsen
Annie’s Website: https://anniejacobsen.com/
Annie’s Books: https://amzn.to/3TGWyMJ
Annie’s Books (audio): https://adbl.co/49ZnI7c
PODCAST INFO:
Podcast website…

1 month ago @ lexfridman.com
#419 – Sam Altman: OpenAI, GPT-5, Sora, Board Saga, Elon Musk, Ilya, Power & AGI

Sam Altman is the CEO of OpenAI, the company behind GPT-4, ChatGPT, Sora, and many other state-of-the-art AI technologies.

Please support this podcast by checking out our sponsors:
– Cloaked: https://cloaked.com/lex and use code LexPod to get 25% off
– Shopify: https://shopify.com/lex to get $1 per month trial
– BetterHelp: https://betterhelp.com/lex to get 10% off
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
Transcript: https://lexfridman.com/sam-altman-2-transcript
EPISODE LINKS:
Sam’s X: https://x.com/sama
Sam’s Blog: https://blog.samaltman.com/
OpenAI’s X: https://x.com/OpenAI
OpenAI’s Website: https://openai.com
ChatGPT Website: https://chat.openai.com/
Sora Website: https://op…

1 month, 1 week ago @ lexfridman.com
#418 – Israel-Palestine Debate: Finkelstein, Destiny, M. Rabbani & Benny Morris

Norman Finkelstein and Benny Morris are historians.

Mouin Rabbani is a Middle East analyst.

Steven Bonnell (aka Destiny) is a political livestreamer.

On some podcast players you should be able to click the timestamp to jump to that time.

(00:00) – Introduction
(12:11) – 1948
(1:10:43) – Partition
(2:15:16) – October 7
(3:09:27) – Gaza
(3:36:02) – Peace
(4:40:47) – Hope for the future

1 month, 1 week ago @ lexfridman.com
#417 – Kimbal Musk: The Art of Cooking, Tesla, SpaceX, Zip2, and Family

Kimbal Musk is a chef, entrepreneur, and author of The Kitchen Cookbook: Cooking for Your Community.

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://eightsleep.com/lex to get special savings
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
– NetSuite: http://netsuite.com/lex to get free product tour
– BetterHelp: https://betterhelp.com/lex to get 10% off
Transcript: https://lexfridman.com/kimbal-musk-transcript
EPISODE LINKS:
Kimbal’s X: https://x.com/kimbal
Kimbal’s Instagram: https://instagram.com/kimbalmusk/
Kimbal’s Facebook: https://facebook.com/kimbalmuskofficial/
The Kitchen Cookbook: https://amzn.to/4ccaCoE
The Kitchen (restaurants): https://www…

1 month, 2 weeks ago @ lexfridman.com
#416 – Yann Lecun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI

Yann LeCun is the Chief AI Scientist at Meta, professor at NYU, Turing Award winner, and one of the most influential researchers in the history of AI.

Please support this podcast by checking out our sponsors:
– HiddenLayer: https://hiddenlayer.com/lex
– LMNT: https://drinkLMNT.com/lex to get free sample pack
– Shopify: https://shopify.com/lex to get $1 per month trial
– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil
EPISODE LINKS:
Yann’s Twitter: https://twitter.com/ylecun
Yann’s Facebook: https://facebook.com/yann.lecun
Meta AI: https://ai.meta.com/
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8R…

1 month, 2 weeks ago @ lexfridman.com
#415 – Serhii Plokhy: History of Ukraine, Russia, Soviet Union, KGB, Nazis & War

Serhii Plokhy is a Ukrainian historian at Harvard University, director of the Ukrainian Research Institute, and the author of many books on the history of Eastern Europe, including his latest book The Russo-Ukrainian War: The Return of History.

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://eightsleep.com/lex to get special savings
– Shopify: https://shopify.com/lex to get $1 per month trial
– NetSuite: http://netsuite.com/lex to get free product tour
– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil
EPISODE LINKS:
Serhii’s X: https://x.com/splokhy
Serhii’s Website: https://history.fas.harvard.edu/people/serhii-plokhii
Harvard Ukrainian Research Institut…

1 month, 3 weeks ago @ lexfridman.com
#414 – Tucker Carlson: Putin, Navalny, Trump, CIA, NSA, War, Politics & Freedom

Tucker Carlson is a highly influential political commentator.

You can watch and listen to him on the Tucker Carlson Network and the Tucker Carlson Podcast.

Please support this podcast by checking out our sponsors:
– ZipRecruiter: https://ziprecruiter.com/lex
– Listening: https://listening.com/lex and use code LEX to get one month free
– HiddenLayer: https://hiddenlayer.com/lex
– LMNT: https://drinkLMNT.com/lex to get free sample pack
– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil
Transcript: https://lexfridman.com/tucker-carlson-transcript
EPISODE LINKS:
Tucker Carlson Network: https://tuckercarlson.com/
Tucker Carlson Podcast: https://tuckercarlson.com/listen/
Tucker’s X: https://…

1 month, 3 weeks ago @ lexfridman.com
Microsoft Research Podcast
latest post 1 week, 1 day ago
Abstracts: April 16, 2024

GRETCHEN HUIZINGA: Welcome to Abstracts, a Microsoft Research Podcast that puts the spotlight on world-class research in brief.

CHAKRABORTY: So satellite connectivity is nothing new and has been there for long.

So we are talking about the satellites that are at least 10 to 20 times cheaper and smaller than state-of-the-art satellites.

So the device sends some packet to the satellite; satellite sends some packet to the device—it’s all about packet exchange.

So our vision is clear: to bring 24-7 connectivity for devices anywhere on Earth with just a click of power button.

1 week, 1 day ago @ microsoft.com
Ideas: Language technologies for everyone with Kalika Bali

Behind every emerging technology is a great idea propelling it forward. In the new Microsoft Research Podcast series, Ideas, members of the research community at Microsoft discuss the beliefs that animate their research, the experiences and thinkers that inform it, and the positive human impact it targets. In this episode, host Gretchen Huizinga talks with Principal Researcher Kalika Bali. Inspired by an early vision of “talking computers” and a subsequent career in linguistics, Bali has spent the last two decades bringing the two together. Aided by recent advances in large language models and motivated by her belief that everyone should have access to AI in their own language, Bali and her…

1 week, 6 days ago @ microsoft.com
AI Frontiers: Rethinking intelligence with Ashley Llorens and Ida Momennejad

And so I just want to start here: for you, Ida, what is general intelligence?

Different people at different times provide different criteria for what would be the artificial general intelligence notion.

One is artificial general intelligence and the other is humanlike intelligence or human-level intelligence.

Artificial general intelligence and humanlike, human-level intelligence—how do these two concepts relate to you?

LLORENS: So it sounds like a very extensive set of experiments across many different tasks and with many different leading AI models, and you’ve uncovered a lack of robustness across some of these different tasks.

3 weeks, 6 days ago @ microsoft.com
Abstracts: March 21, 2024

GRETCHEN HUIZINGA: Welcome to Abstracts, a Microsoft Research Podcast that puts the spotlight on world-class research in brief.

These two examples are also the differences from other deep learning OFDFT works.

This is the generalization challenge and is one of the major challenges of deep learning method for molecular science applications.

This somehow shows the benefits of leveraging the OFDFT framework for using a deep learning method to solve molecular tasks.

You can also read it on arXiv, or you can check out the March 2024 issue of Nature Computational Science.

1 month ago @ microsoft.com
Abstracts: February 29, 2024

And so we realized that working with generative AI really parallels these different aspects of what a manager does, right.

So this requires having self-awareness of the applicability of generative AI to your workflows and maintaining an appropriate level of confidence in completing tasks manually or relying on generative AI.

For example, whether it’s worth it for you to actually learn how to work with generative AI more effectively.

But I think, given how generative AI has rolled out in the world today, I mean, a lot of the focus has been on productivity and workflows.

If you want to read the full paper on metacognition and generative AI, you can find a link at aka.ms/abstracts, or you can …

1 month, 3 weeks ago @ microsoft.com
What’s Your Story: Nicole Forsgren

NICOLE FORSGREN: Yeah, it’s, you know, it’s, kind of, this ridiculous story.

I was there for, you know, two, three years, and I’m doing really, really well.

GEHRKE: This is just “I had a feeling.”

FORSGREN: In my gut, I’m like, I’m doing really well.

After that, things were going really well, but we were also growing and scaling really, really rapidly.

Because I realized there were pieces about research that I really, really loved.

2 months, 1 week ago @ microsoft.com
What’s Your Story: Ivan Tashev

In the Microsoft Research Podcast series What’s Your Story, Johannes Gehrke explores the who behind the technical and scientific advancements helping to reshape the world.

A systems expert whose 10 years with Microsoft spans research and product, Gehrke talks to members of the company’s research community about what motivates their work and how they got where they are today.

Partner Software Architect Ivan Tashev’s expertise in audio signal processing has contributed to the design and study of audio components for Microsoft products such as Kinect, Teams, and HoloLens.

In this episode, Tashev discusses how a first-place finish in the Mathematical Olympiad fueled a lifelong passion for shoot…

2 months, 3 weeks ago @ blubrry.com
Abstracts: January 25, 2024

And up until now, there’s been a lot of work on pruning model parameters for a variety of reasons.

But generally, these papers show that as parameters are removed from the model, performance does not degrade much.

You can, overall, keep performance roughly the same even with a fairly drastic reduction of model parameters.
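The kind of parameter reduction described above can be sketched with a rank-k SVD truncation of a single weight matrix. This is a hedged NumPy illustration of the basic operation behind such LASER-style interventions, not the paper's exact procedure (which also involves choosing which layer and matrix to reduce):

```python
import numpy as np

def low_rank_approx(W: np.ndarray, k: int) -> np.ndarray:
    """Return the best rank-k approximation of W (Eckart-Young theorem)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k]

rng = np.random.default_rng(0)
# Build a weight matrix that is (approximately) rank 8, as many trained
# weight matrices empirically are in their dominant structure.
W = rng.standard_normal((256, 8)) @ rng.standard_normal((8, 256))
W_small = low_rank_approx(W, 8)

# The rank-8 factorization stores 256*8 + 8 + 8*256 numbers instead of
# 256*256, yet reconstructs W almost exactly here.
err = np.linalg.norm(W - W_small) / np.linalg.norm(W)
print(f"relative reconstruction error: {err:.2e}")
```

When the discarded singular values carry little signal, replacing W with its truncation removes most of the parameters while leaving the layer's behavior, and hence model performance, largely intact.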

HUIZINGA: So, Jordan, I often think of an abstract as a sort of appetizer for a research paper.

I think for one, as a practical matter, there’s this question of just what’s the best way to find the best LASER intervention?

3 months ago @ microsoft.com
AI Frontiers: A deep dive into deep learning with Ashley Llorens and Chris Bishop

LLORENS: Your new book lays out foundations in statistics and probability theory for modern machine learning.

LLORENS: Another concept that is key in machine learning is generalization.

So that’s, that’s … we show that in the book, in fact.

BISHOP: [LAUGHS] Well, that’s, that’s really interesting.

So I personally, actually, find this one of the most exciting frontiers not only of the natural sciences but also of machine learning.

4 months, 1 week ago @ microsoft.com
Abstracts: December 12, 2023

Members of the research community at Microsoft work continuously to advance their respective fields.

Abstracts brings its audience to the cutting edge with them through short, compelling conversations about new and noteworthy achievements.

In this episode, Senior Principal Research Manager Tao Qin and Senior Researcher Lijun Wu discuss “FABind: Fast and Accurate Protein-Ligand Binding.” The paper, accepted at the 2023 Conference on Neural Information Processing Systems (NeurIPS), introduces a new method for predicting the binding structures of proteins and ligands during drug development.

The method demonstrates improved speed and accuracy over current methods.

Learn more: FABind: Fast and Ac…

4 months, 2 weeks ago @ blubrry.com
Abstracts: December 11, 2023

Members of the research community at Microsoft work continuously to advance their respective fields.

Abstracts brings its audience to the cutting edge with them through short, compelling conversations about new and noteworthy achievements.

In this episode, Principal Researcher Alessandro Sordoni joins host Gretchen Huizinga to discuss “Joint Prompt Optimization of Stacked LLMs using Variational Inference.” In the paper, which was accepted at the 2023 Conference on Neural Information Processing Systems (NeurIPS), Sordoni and his coauthors introduce Deep Language Networks, or DLNs, an architecture that treats large language models as layers within a network and natural language prompts as eac…

4 months, 2 weeks ago @ blubrry.com
Abstracts: December 6, 2023

Members of the research community at Microsoft work continuously to advance their respective fields.

Abstracts brings its audience to the cutting edge with them through short, compelling conversations about new and noteworthy achievements.

In this episode, Xing Xie, a Senior Principal Research Manager of Microsoft Research Asia, joins host Dr. Gretchen Huizinga to discuss “Evaluating General-Purpose AI with Psychometrics.” As AI capabilities move from task specific to more general purpose, the paper explores psychometrics, a subfield of psychology, as an alternative to traditional methods for evaluating model performance and for supporting consistent and reliable systems.

Read the paper: Ev…

4 months, 2 weeks ago @ blubrry.com
Collaborators: Teachable AI with Cecily Morrison and Karolina Pakėnaitė

It often requires the knowledge and experience of individuals from across disciplines and institutions.

Collaborators, a Microsoft Research Podcast series, explores the relationships—both expected and unexpected—behind the projects, products, and services being pursued and delivered by researchers at Microsoft and the diverse range of people they’re teaming up with.

In this episode, Dr. Gretchen Huizinga speaks with Cecily Morrison, MBE, a Senior Principal Research Manager at Microsoft Research, and Karolina Pakėnaitė, who also goes by Caroline, a PhD student and member of the citizen design team working with Morrison on the research project Find My Things.

An AI phone application designed …

4 months, 3 weeks ago @ blubrry.com
Abstracts: November 20, 2023

Shrey and Zoë are coauthors of a paper called Contextual Confidence and Generative AI, and you can read a preprint of this paper now on arXiv.

What was the prompt—no pun intended—for generative AI that got you concerned about this?

HITZIG: I’d say this research paper draws on a few different strands of literature.

So, specifically, how does generative AI threaten our ability to identify and protect?

HUIZINGA: Zoë, if there was one takeaway that you want our listeners to get from this work on contextual confidence, what would it be?

5 months ago @ microsoft.com
Data Skeptic
latest post 11 hours ago
Signal in the Noise

She is interested in studying and understanding the neural mechanism of the honeybee waggle dance.

Barbara and Anna shared some breakthroughs in the field of animal communication.

Anna discussed how the honeybee uses the waggle dance to communicate.

Our guests explained how they captured the waggle dance of honeybees in their hives.

She also discussed how researchers use the neural socket to understand the workings of insects' brains.

11 hours ago @ dataskeptic.com
Pose Tracking

Dr. Talmo Pereira is a Principal Investigator at the Salk Institute for Biological Studies in San Diego, CA, where he leads a research group as a Salk Fellow.

His lab (talmolab.org) focuses on the development of deep learning-based computational methods for recognition and modeling of complex biological systems, with applications ranging from neuroscience to cancer and plant biology.

His recent work has demonstrated how advances in deep learning and computer vision can enable quantitative phenotyping of complex behaviors through the development and application of approaches for markerless motion capture (sleap.ai).

This work has been published in Nature Methods and featured in T…

1 week, 2 days ago @ dataskeptic.com
Modeling Group Behavior

Our guest in this episode is Sebastien Motsch, an assistant professor at Arizona State University in the School of Mathematical and Statistical Sciences.

Sebastien discussed two approaches to modeling the group behavior of animals, for instance, a flock of birds.

He discussed how the Boltzmann equation and kinetic theory help them understand birds' interactions with respect to velocity changes.

Papers discussed:
– A new model for self-organized dynamics and its flocking behavior
– Heterophilious dynamics enhances consensus
Resource: the Boltzmann equation
$$f(t + \Delta t, \mathbf{x} + \Delta \mathbf{x}, \mathbf{v} + \Delta \mathbf{v}) = f(t, \mathbf{x}, \mathbf{v}) + \left(\frac{\partial f}{\partial t}\right)_{\mathrm{coll}} \Delta t$$
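The alignment dynamics discussed in the episode can be sketched in a few lines. This is my own minimal illustration of a velocity-alignment rule of the kind used in flocking models (not the exact model from the papers above): each agent relaxes its velocity toward a distance-weighted average of the others' velocities, which is enough to drive the group toward a common heading.

```python
import numpy as np

def flock_step(pos, vel, dt=0.1, strength=1.0):
    """One step of a simple velocity-alignment (consensus) flocking rule."""
    # Pairwise squared distances between all agents.
    diff = pos[:, None, :] - pos[None, :, :]
    d2 = (diff ** 2).sum(-1)
    # Influence weights decay with distance (a common modeling choice).
    w = 1.0 / (1.0 + d2)
    w /= w.sum(axis=1, keepdims=True)  # normalize so each row sums to 1
    # Relax each velocity toward the weighted neighborhood average.
    avg_vel = w @ vel
    vel = vel + dt * strength * (avg_vel - vel)
    pos = pos + dt * vel
    return pos, vel

rng = np.random.default_rng(1)
pos = rng.standard_normal((30, 2))
vel = rng.standard_normal((30, 2))
for _ in range(500):
    pos, vel = flock_step(pos, vel)

# Velocities align over time: their spread around the mean heading shrinks.
spread = np.linalg.norm(vel - vel.mean(axis=0), axis=1).max()
print(f"velocity spread after 500 steps: {spread:.3f}")
```

Because every weight is strictly positive, each update is a convex combination of velocities, so the spread of headings can only shrink; that contraction is the basic mechanism behind the consensus results the episode touches on.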

2 weeks, 2 days ago @ dataskeptic.com
Advances in Data Loggers

Ryan discussed how the behavior of rattlesnakes is studied in the natural world, particularly with an increase in temperature.

He discussed how they collect data about the rattlesnake hunt using loggers and how they determine the loggers do not affect the animals' behavior.

Ryan discussed how he built a machine learning model to predict the behavior of the animals.

Ryan discussed what he discovered about the eating habits of rattlesnakes.

Rounding up, Ryan shared some future plans for the project.

1 month ago @ dataskeptic.com
What You Know About Intelligence is Wrong (fixed)

We are joined by Hank Schlinger, a professor of psychology at California State University, Los Angeles.

His research revolves around theoretical issues in psychology and behavioral analysis.

He also discussed how intelligence is measured in a given context.

Hank mentioned why the current measure of intelligence is fundamentally flawed.

Hank discussed how psychologists can perform behavioral experiments to understand consciousness.

1 month ago @ dataskeptic.com
What You Know About Intelligence is Wrong

We are joined by Hank Schlinger, a professor of psychology at California State University, Los Angeles.

His research revolves around theoretical issues in psychology and behavioral analysis.

He also discussed how intelligence is measured in a given context.

Hank mentioned why the current measure of intelligence is fundamentally flawed.

Hank discussed how psychologists can perform behavioral experiments to understand consciousness.

1 month, 1 week ago @ dataskeptic.com
Animal Decision Making

Our guest is at the University of Missouri, St. Louis, and is the interim director at the Whitney R. Harris World Ecology Center.

Aimee discussed how animals perceive information and what they use it for.

She also discussed the costs required for learning and factors that affect animal learning.

She also discussed the different kinds of evolutionary experiments that can be performed.

Aimee discussed some models that researchers use during evolutionary experiments.

1 month, 1 week ago @ dataskeptic.com
Octopus Cognition

We are joined by Tamar Gutnick, a visiting professor at the University of Naples Federico II in Naples, Italy.

She studies the octopus nervous system and their behavior, focusing on cognition and learning behaviors.

Tamar gave a background to the kind of research she does — lab research.

She discussed the octopus nervous system and why they are unique compared to other animals.

She discussed how they measure the behavior of octopuses using a video recording and a logger to track brain activity.

1 month, 2 weeks ago @ dataskeptic.com
Optimal Foraging

Claire discussed how bumblebees make foraging decisions and how they communicate when foraging.

She discussed how they set up experiments in the lab to address questions about bumblebees foraging.

Claire discussed factors that drive an animal's foraging decisions.

She also touched on some irrational foraging behaviors she observed in her study.

She discussed the effect of climate change on foraging bees' learning behavior.

1 month, 3 weeks ago @ dataskeptic.com
Memory in Chess

On today’s show, we are joined by our co-host, Becky Hansis-O’Neil. Becky is a Ph.D. student at the University of Missouri, St. Louis, where she studies bumblebees and tarantulas to understand their learning and cognition. She joins us to discuss the paper Perception in Chess, which aimed to understand how chess players perceive the positions of pieces on a chess board. She discussed the paper’s findings, including situations where grandmasters had better recall of chess positions than beginners and situations where they did not. Becky and Kyle discussed the use of chess engines for cheating. They also discussed how chess players use chunking. Becky discussed some approac…

2 months, 1 week ago @ dataskeptic.com
OpenWorm

On this episode, we are joined by Stephen Larson, the CEO of MetaCell and an affiliate of the OpenWorm Foundation.

Stephen discussed what the OpenWorm project is about.

They hope to use a digital model of the nematode Caenorhabditis elegans (C. elegans for short) to study the basics of life.

Stephen discussed why C. elegans is an ideal organism for studying life in the lab.

He also mentioned how students can get involved in the OpenWorm project.

2 months, 2 weeks ago @ dataskeptic.com
What the Antlion Knows

Our guest is Becky Hansis-O’Neil, a Ph.D. student at the University of Missouri, St. Louis.

Becky discussed how they designed an experiment using a T-maze to understand antlions' behavior.

Becky discussed some interesting findings from the experiment.

Becky gave her thoughts on the findings of the paper and the methodologies used.

Paper in focus: Operant conditioning in antlion larvae and its impairment following exposure to elevated temperatures
Follow our guest: X, Website
This season’s cover art features original photographs by Becky Hansis-O’Neil

2 months, 3 weeks ago @ dataskeptic.com
AI Roundtable

Kyle is joined by friends and former guests Pramit Choudhary and Frank Bell to have an open discussion of the impacts LLMs and machine learning have had in the past year on industry, and where things may go in the current year.

3 months, 1 week ago @ dataskeptic.com
Uncontrollable AI Risks

We are joined by Darren McKee, a policy advisor and the host of Reality Check, a critical-thinking podcast.

Darren gave a background about himself and how he got into the AI space.

Darren shared his thoughts on whether AGI might be achieved in the coming years.

Darren discussed his worry about AI surpassing human understanding of the universe and potentially causing harm to humanity.

He explored whether AI possesses inherently evil intentions and gave his thoughts on regulating AI.

3 months, 4 weeks ago @ dataskeptic.com
I LLM and You Can Too

This episode is not yet released.

Coming soon

4 months ago @ dataskeptic.com
SuperDataScience
latest post 2 days ago
777: Generative AI in Practice, with Bernard Marr

Generative AI is reshaping our world, and Bernard Marr, world-renowned futurist and best-selling author, joins Jon Krohn to guide us through this transformation.

In this episode, Bernard shares his insights on how AI is …

2 days ago @ soundcloud.com
776: Deep Utopia: AI Could Solve All Human Problems in Our Lifetime

What are the risks of AI progressing beyond a point of no return?

What do we stand to gain?

On this Five-Minute Friday, Jon Krohn talks ‘books’ as he outlines two nonfiction works on AI and futurism by Oxford philosopher…

6 days ago @ soundcloud.com
775: What will humans do when machines are vastly more intelligent? With Aleksa Gordić

Tech entrepreneurship, artificial superintelligence, and the future of education: Aleksa Gordić speaks to Jon Krohn about his strategies for self-directed learning, the traits that help people succeed in moving from big …

1 week, 2 days ago @ soundcloud.com
774: RFM-1 Gives Robots Human-like Reasoning and Conversation Abilities

Covariant’s RFM-1: Jon Krohn explores the future of AI-driven robotics with RFM-1, a groundbreaking robot arm designed by Covariant and discussed by A.I. roboticist Pieter Abbeel.

Explore how this innovation aims to merg…

1 week, 6 days ago @ soundcloud.com
773: Deep Reinforcement Learning for Maximizing Profits, with Prof. Barrett Thomas

Dr. Barrett Thomas, an award-winning Research Professor at the University of Iowa, explores the intricacies of Markov decision processes and their connection to Deep Reinforcement Learning.

Discover how these concepts ar…

2 weeks, 2 days ago @ soundcloud.com
772: In Case You Missed It in March 2024

Pytorch benefits, how to get funding for your AI startup, and managing scientific silos: In our new series for SuperDataScience, “In Case You Missed It”, host Jon Krohn engages in some “reinforcement learning through hum…

2 weeks, 6 days ago @ soundcloud.com
771: Gradient Boosting: XGBoost, LightGBM and CatBoost, with Kirill Eremenko

Kirill Eremenko joins Jon Krohn for another exclusive, in-depth teaser for a new course just released on the SuperDataScience platform, “Machine Learning Level 2”.

Kirill walks listeners through why decision trees and ra…

3 weeks, 2 days ago @ soundcloud.com
770: The Neuroscientific Guide to Confidence

Explore the science of confidence with Lucy Antrobus, as she unveils neuroscience-backed strategies to build and boost confidence through practice, positive energy, and the power of laughter.

An essential listen for fost…

3 weeks, 6 days ago @ soundcloud.com
769: Generative AI for Medicine, with Prof. Zack Lipton

Generative AI in medicine takes center stage as Prof. Zachary Lipton, Chief Scientific Officer at Abridge, joins host Jon Krohn to discuss the significant advancements in AI that are reshaping healthcare.

This episode i…

1 month ago @ soundcloud.com
768: Is Claude 3 Better than GPT-4?

Claude 3, LLMs and testing ML performance: Jon Krohn tests out Anthropic’s new model family, Claude 3, which includes the Haiku, Sonnet and Opus models (written in order of their performance power, from least to greatest…

1 month ago @ soundcloud.com
767: Open-Source LLM Libraries and Techniques, with Dr. Sebastian Raschka

Jon Krohn sits down with Sebastian Raschka to discuss his latest book, Machine Learning Q and AI, the open-source libraries developed by Lightning AI, how to exploit the greatest opportunities for LLM development, and wh…

1 month, 1 week ago @ soundcloud.com
766: Vonnegut's Player Piano (1952): An Eerie Novel on the Current AI Revolution

Kurt Vonnegut's "Player Piano" delivers striking parallels between its dystopian vision and today's AI challenges.

This week, Jon Krohn explores the novel's depiction of a world where humans are marginalized by machines,…

1 month, 1 week ago @ soundcloud.com
765: NumPy, SciPy and the Economics of Open-Source, with Dr. Travis Oliphant

Explore the origins of NumPy and SciPy with their creator, Dr. Travis Oliphant.

Discover the journey from personal need to global impact, the challenges overcome, and the future of these essential Python libraries in sci…

1 month, 2 weeks ago @ soundcloud.com
764: The Top 10 Episodes of 2023

Data science futurists, bestselling authors, and lively how-to guides from the industry’s top practitioners, which range from applying data science for good to using open-source tools for NLP: This is The Super Data Scie…

1 month, 2 weeks ago @ soundcloud.com
763: The Best A.I. Startup Opportunities, with venture capitalist Rudina Seseri

At Glasswing Ventures, Rudina Seseri wants to be able to answer the question: What has Glasswing Ventures done for the company beyond capital investment?

She speaks to Jon Krohn about how her company uses data to assess …

1 month, 2 weeks ago @ soundcloud.com
Data Science at Home
latest post 5 days, 18 hours ago
Rust in the Cosmos: Decoding Communication Part 2 (Ep. 255)

In this episode of “Rust in the Cosmos” we delve into the challenge of testing software for… ehm… space. How can Rust help?

Build robotics applications in minutes, not months.

Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve.

We provide solutions in AI/ML, Fintech, Defense, Robotics and Predictive maintenance.

Communities: AeroRust, Intrepid, Bytenook

AeroRust Discord invite: https://discord.com/invite/6jJyx5nEUq
AeroRust website: AeroRust.org
Intrepid AI Discord: https://discord.gg/cSSzche6Ct
Intrepid AI website: https://intrepid.ai

References

5 days, 18 hours ago @ datascienceathome.com
Rust in the Cosmos: Decoding Communication Part I (Ep. 254)

In this inaugural episode of “Rust in the Cosmos,” we delve into the intricacies of communication in space and some of the challenges in space application development.

2 weeks ago @ datascienceathome.com
AI and Video Game Development: Navigating the Future Frontier (Ep. 253)

In this episode we delve into the dynamic realm of game development and the transformative role of artificial intelligence (AI).

Join Frag, Jim and Mike as they explore the current landscape of game development processes, from initial creative ideation to the integration of AI-driven solutions.

With Mike’s expertise as a software executive and avid game developer, we uncover the potential of AI to revolutionize game design, streamline development cycles, and enhance player experiences.

Sponsors: Intrepid AI is an AI-assisted all-in-one platform for robotics teams.

Build robotics applications in minutes, not months.

3 weeks, 4 days ago @ datascienceathome.com
Kaggle Kommando’s Data Disco: Laughing our Way Through AI Trends (Ep. 252)

In this episode, join me and the Kaggle Grand Master, Konrad Banachewicz, for a hilarious journey into the zany world of data science trends.

From algorithm acrobatics to AI, creativity, Hollywood movies, and music, we just can’t get enough.

It’s the typical episode with a dose of nerdy comedy you didn’t know you needed.

Buckle up, it’s a data disco, and we’re breaking down the binary!

Sponsors: Intrepid AI is an AI-assisted all-in-one platform for robotics teams.

1 month, 2 weeks ago @ datascienceathome.com
Revolutionizing Robotics: Embracing Low-Code Solutions (Ep. 251)

In this episode of Data Science at Home, we explore the game-changing impact of low-code solutions in robotics development.

Discover how these tools bridge the coding gap, simplify integration, and enable trial-and-error development.

We’ll also uncover challenges with traditional coding methods using ROS.

Join us for a concise yet insightful discussion on the future of robotics!

2 months, 1 week ago @ datascienceathome.com
Is Sqream the fastest big data platform? (Ep. 250)

Join us in a dynamic conversation with Yori Lavi, Field CTO at SQream, as we unravel the data analytics landscape.

From debunking the data lakehouse hype to SQream’s GPU-based magic, discover how extreme data challenges are met with agility.

Yori shares success stories, insights into SQream’s petabyte-scale capabilities, and a roadmap to breaking down organizational bottlenecks in data science.

Dive into the future of data analytics with SQream’s commitment to innovation, leaving legacy formats behind and leading the charge in large-scale, cost-effective data projects.

Tune in for a dose of GPU-powered revolution!

2 months, 3 weeks ago @ datascienceathome.com
OpenAI CEO Shake-up: Decoding December 2023 (Ep. 249)

In this episode from a month ago, join me as we unravel the controversial CEO firing at OpenAI in December 2023.

I share my insights on the events, decode the intricacies, and explore what lies ahead for this influential organization.

Don’t miss this concise yet insightful take on the intersection of leadership and artificial intelligence innovation.

Sponsor: Learn what the new year holds for ransomware as a service, Active Directory, artificial intelligence and more when you download the 2024 Arctic Wolf Labs Predictions Report today at arcticwolf.com/datascience

3 months ago @ datascienceathome.com
Careers, Skills, and the Evolution of AI (Ep. 248)

!!WARNING!!

Due to some technical issues, the volume is not always constant during the show.

I sincerely apologise for any inconvenience. Francesco

In this episode, I speak with Richie Cotton, Data Evangelist at DataCamp, as he delves into the dynamic intersection of AI and education.

Richie, a seasoned expert in data science and the host of the podcast, brings together a wealth of knowledge and experience to explore the evolving landscape of AI careers, the skills essential for generative AI technologies, and the symbiosis of domain expertise and technical skills in the industry.

3 months, 2 weeks ago @ datascienceathome.com
Open Source Revolution: AI’s Redemption in Data Science (Ep. 247)

Dive into the world of Data Science at Home with our latest episode, where we explore the dynamic relationship between Artificial Intelligence and the redemption of open source software.

In this thought-provoking discussion, I share my insights on why now, more than ever, is the opportune moment for open source to leave an indelible mark on the field of AI.

Join me as I unpack my opinions and set expectations for the near future, discussing the pivotal role open source is set to play in shaping the landscape of data science and artificial intelligence.

Don’t miss out—tune in to gain a deeper understanding of this revolutionary intersection!

This episode is available as YouTube stream at htt…

4 months, 1 week ago @ datascienceathome.com
Money, Cryptocurrencies, and AI: Exploring the Future of Finance with Chris Skinner [RB] (Ep. 246)

In this captivating podcast episode, join renowned financial expert Chris Skinner as he delves into the fascinating realm of the future of money.

From cryptocurrencies to government currencies, the metaverse to artificial intelligence (AI), Skinner explores the intricate interplay between technology and humanity.

Gain valuable insights as he defines the future of money, examines the potential impact of cryptocurrencies on traditional government currencies, and addresses the advantages and disadvantages of digital currencies.

Brace yourself for an enlightening discussion on the integration of AI in the financial sector and its potential impact on humanity.

Once subscribed, you get full acces…

4 months, 2 weeks ago @ datascienceathome.com
Debunking AGI Hype and Embracing Reality [RB] (Ep. 245)

In this thought-provoking episode, we sit down with the renowned AI expert Filip Piekniewski, PhD, who fearlessly challenges the prevailing narratives surrounding artificial general intelligence (AGI) and the singularity.

With a no-nonsense approach and a deep understanding of the field, Filip dismantles the hype and exposes some of the misconceptions about AI, LLMs and AGI.

If you’re seeking a refreshingly pragmatic perspective on the future of AI, this episode is an absolute must-listen.

Filip Piekniewski bio: Filip Piekniewski is a distinguished computer vision researcher and engineer, specializing in visual object tracking and perception.

He is known for his realistic perspective on AI, …

4 months, 3 weeks ago @ datascienceathome.com
Destroy your toaster before it kills you. Drama at OpenAI and other stories (Ep. 244)

Brace yourselves, dear friends!

In this episode, we delve into the earth-shattering revelation that OpenAI might have stumbled upon AGI (lol) and we’re all just seconds away from being replaced by highly sophisticated toasters (lol lol).

Spoiler alert: OpenAI’s CEO is just playing 7D chess with the entire human race.

So, sit back, relax, and enjoy this totally not ominous exploration into the ‘totally not happening’ future of AI!

4 months, 3 weeks ago @ datascienceathome.com
The AI Chip Chat 🤖💻 (Ep. 243)

Dive into the cool world of AI chips with us!

🚀 We’re breaking down how these special computer chips for AI have evolved and what makes them different.

Think of them like the superheroes of the tech world!

Don’t miss out!

🎙️🔍 #AIChips #TechTalk #SimpleScience

4 months, 3 weeks ago @ datascienceathome.com
Rolling the Dice: Engineering in an Uncertain World (Ep. 242)

Hey there, engineering enthusiasts!

Ever wondered how engineers deal with the wild, unpredictable twists and turns in their projects?

In this episode, we’re spilling the beans on uncertainty and why it’s the secret sauce in every engineering recipe, not just the fancy stuff like deep learning and neural networks!

Join us for a ride through the world of uncertainty quantification.

Tune in and let’s demystify the unpredictable together!

4 months, 3 weeks ago @ datascienceathome.com
How Language Models Are the Ultimate Database (Ep. 241)

In this episode, dive deep into the world of Language Models as we decode their intricate structure, revealing how these powerful algorithms exploit concepts from the past.

But… what if LLMs were just a database?

References: https://fchollet.substack.com/p/how-i-think-about-llm-prompt-engineering

4 months, 3 weeks ago @ datascienceathome.com