Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
last post 1 hour ago
[R] SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
1 hour ago @ reddit.com
[P] My Journey of No Typing
2 hours ago @ reddit.com
[D] Loss Function for Learning Gaussian Distribution
7 hours ago @ reddit.com
[N] Senators are sending letters to Meta over LLAMA leak
7 hours ago @ reddit.com
[D] Is Stability AI’s API reliable?
9 hours ago @ reddit.com
[N] RedPajama 7B now available, instruct model outperforms all open 7B models on HELM benchmarks
9 hours ago @ reddit.com
[Discussion] training a diffusion model with a destructive process other than gaussian noise
9 hours ago @ reddit.com
RVC: AttributeError: 'NoneType' object has no attribute 'dtype' [R]
13 hours ago @ reddit.com
[D] Affordable Masters Programs
15 hours ago @ reddit.com
[D] Hyperparameter optimization best practices
16 hours ago @ reddit.com
[Project] Hiring an AI Full Stack Developer/ ML Engineer!
16 hours ago @ reddit.com
[D] Paper Explained - Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust (Full Video Analysis)
17 hours ago @ reddit.com
[D] ML model monitored with another ML model?
18 hours ago @ reddit.com
[D] New to Machine Learning / Data Science / AI ?
18 hours ago @ reddit.com
[P] MS-Paint GAN
18 hours ago @ reddit.com
Towards Data Science
last post 3 weeks ago
Emergent Abilities in AI: Are We Chasing a Myth?
3 weeks ago @ towardsdatascience.com
3 Proven SQL Best Practices You Need To Know In Data Analysis
3 weeks ago @ towardsdatascience.com
10 Examples to Learn the JSON module of Python
3 weeks ago @ towardsdatascience.com
How to Plot Coordinates on Landsat Satellite Images with Python
3 weeks ago @ towardsdatascience.com
Introduction to statistical sampling and resampling
3 weeks ago @ towardsdatascience.com
Setting up Python Projects: Part IV
3 weeks, 1 day ago @ towardsdatascience.com
Unraveling the Design Pattern of Physics-Informed Neural Networks: Series 01
3 weeks, 1 day ago @ towardsdatascience.com
Combining Traditional Thread-Based Code and Asyncio in Python
3 weeks, 1 day ago @ towardsdatascience.com
How To List All BigQuery Datasets and Tables with Python
3 weeks, 1 day ago @ towardsdatascience.com
How Decision Trees Split Nodes, from Loss Function Perspective
3 weeks, 1 day ago @ towardsdatascience.com
Time Series for Climate Change: Forecasting Extreme Weather Events
3 weeks, 1 day ago @ towardsdatascience.com
Implement Multi-GPU Training on a single GPU
3 weeks, 1 day ago @ towardsdatascience.com
When Should You Fine-Tune LLMs?
3 weeks, 1 day ago @ towardsdatascience.com
Loss Functions in Machine Learning
3 weeks, 1 day ago @ towardsdatascience.com
What People Write about Climate: Twitter Data Clustering in Python
3 weeks, 1 day ago @ towardsdatascience.com
Distill.pub
last post: none
The Gradient
last post 1 week, 3 days ago
Modern AI is Domestification

Introduction: As internet-scale AI models mature rapidly from coarse research demos to productionized user-facing systems, expectations have increased and goalposts have moved drastically.

Incorporating AI Feedback: AI Critics. While RLHF provides a powerful mechanism to transfer human knowledge to AI models, it also faces practical limitations: human feedback can be noisy, inconsistent, and expensive to collect.

To tackle these challenges, Reinforcement Learning from AI Feedback (RLAIF) aims to bring existing AI models into the loop by utilizing prompted pretrained models to generate preference data for training reward models.

Here are a few examples of AI feedback that amplify existing AI pri…

1 week, 3 days ago @ thegradient.pub
Artificial Curiosity as Moral Virtue

An artificially intelligent agent could act and think in ways driven by curiosity and, in doing so, exercise the moral virtue of curiosity, becoming more and more human-like.

We begin to investigate this question through an exploration of artificial curiosity in the context of the free energy principle - as put forward by neuroscientist Karl Friston.

The free energy principle provides a way for adaptive systems to unify action, perception, and learning.

Artificial curiosity, as it could follow from the free energy principle, could be examined appropriately.

For attribution in academic contexts or books, please cite this work as Syed Hussain Ather, "Artificial Curiosity as Mora…

2 weeks, 3 days ago @ thegradient.pub
In-Context Learning, In Context

Background: Much recent work on large language models (LLMs) has explored the phenomenon of in-context learning (ICL).

We use the term “in-context learning” to describe the inner loop of this process, which occurs within the forward-pass upon each sequence.

In “What Learning Algorithm is In-Context Learning,” Akyürek et al., using linear regression as a toy problem, provide evidence for the hypothesis that transformer-based in-context learners indeed implement standard learning algorithms implicitly.

First, “The Learnability of In-Context Learning” presents a first-of-its-kind PAC-based framework for in-context learnability.

“Large Language Models Do In-Context Learning Differently” further s…

1 month, 1 week ago @ thegradient.pub
Software²: A new generation of AIs that become increasingly general by producing their own training data

Despite the large impact of the training data on model performance, mainstream training practices are not inherently data-seeking.

Instead, they ignore the quality of information within the training data in favor of maximizing data quantity.

In this sense, we say an IGI performs open-ended learning, and call its associated process for collecting training data open-ended exploration.

In general, the full data space can be unbounded and cannot be captured by a single, predefined dataset or simulator.

Software²: A new generation of AIs that become increasingly general by producing their own training data.

1 month, 2 weeks ago @ thegradient.pub
Grounding Large Language Models in a Cognitive Foundation: How to Build Someone We Can Talk To

And we need our robots to tell the truth and say when they don’t know—to actually be useful, robots need to be trustworthy.

Robots need strong mental models so that they can learn quickly, form long-term memories, and understand truth.

Strengthening Mental Models using Curriculum Learning to Acquire a Cognitive Foundation: Robots need strong mental models to learn quickly and adapt to novel situations.

When large language models (LLMs) train only on text, they are learning from only part of the information.

Conclusion: Large language models (LLMs) such as ChatGPT are trained on internet data, which is the end product of evolution instead of its beginning.

1 month, 3 weeks ago @ thegradient.pub
Towards Geometric Deep Learning

The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods.

Geometric Deep Learning is concerned with exposing these regularities through unified geometric principles that can be applied throughout a broad spectrum of applications.

Kunihiko Fukushima and the neocognitron, an early geometric deep learning architecture and a precursor of the modern convolutional neural networks.

The ‘Erlangen Programme’ of Deep Learning: Our historical overview of the geometric foundations of deep learning has now naturally brought us to the blueprint that underpins this book.

The Geometric Deep Learning Blueprint can be used to derive from…

3 months, 2 weeks ago @ thegradient.pub
Artists enable AI art - shouldn't they be compensated?

However, there is another side to the AI art process, one that is not talked about enough.

In this article, I will cover why this is the case, the debate around artist compensation in AI art, and some possible solutions to the problem.

With that out of the way, let’s move on to the overarching debate about AI art and whether it copies artists.

Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine0177681
Citation: For attribution of this in academic contexts or books, please cite this work as: Devansh Lnu, "Artists enable AI art - shouldn't they be compensated?

BibTeX citation:@article{Lnu2023aiart,author = {Lnu, Devansh},title = {Artists enabl…

3 months, 3 weeks ago @ thegradient.pub
Do Large Language Models learn world models or just surface statistics?

Returning to the mystery of whether large language models learn surface statistics or world models: probing techniques have produced some tantalizing clues suggesting that language models may build interpretable “world models”.

Back to the question we have at the beginning: do language models learn world models or just surface statistics?

Our experiment provides evidence that these language models are developing world models and relying on them to generate sequences.

Citation: For attribution of this in academic contexts or books, please cite this work as: Kenneth Li, "Do Large Language Models learn world models or just surface statistics?

BibTeX citation (this blog):…

4 months, 2 weeks ago @ thegradient.pub
Reasons to Punish Autonomous Robots

Autonomous Military Robots: Design and Plausibility. As Sparrow notes, ‘autonomy’ means different things to different authors (2007, 65).

Danaher predicts that people won’t desire to punish autonomous robots because the robots don’t seem deserving of punishment.

That gives us someone to punish and might also help make sure autonomous military robots are only used judiciously.

Thus, to the extent that deterrence, restoring trust, communicating condemnation, or providing education provide good reasons for punishing human agents, they also provide reasons to punish autonomous robots.

If it is not reasonable or ethically defensible to punish* autonomous robots, we should look hard at whether it i…

4 months, 3 weeks ago @ thegradient.pub
Learning to Make the Right Mistakes - a Brief Comparison Between Human Perception and Multimodal LMs

This is because their top-down perception has not had enough “experience/training data” to learn and refine itself.

In a way, we can say that their “world model” is not as good as that of adults.

An interesting consequence of a strong top-down perception is the ability of us humans to see things like animals/faces in the clouds (Pareidolia).

Multimodal Language Models (LMs) are an attempt to make such language models perceive the world in a way that’s one step closer to that of humans.

His work focuses on reverse engineering Large Multimodal Language Models to make them explainable to humans.

6 months ago @ thegradient.pub
Artificial Intelligence and the Future of Demos

In one of the claimed birthplaces of democracy, Ancient Athens, demos covered all Athenian citizens, who had an equal say in collective decision-making.

And only the real people – the demos – can recognize the ‘real’ from the ‘not-so-real.’ In essence, if you are not part of the demos, you have no say in collective decision-making.

Original Photo: Daria Shevtsova / Pixabay, edited by author. In democracies, it is the demos that should have the topmost power over collective decision-making.

If we want to preserve democracy and/or demos based on equality and freedom, we could start asking ourselves: Is our future demos nation-state-based or global, and how could we align AI development with this…

8 months, 1 week ago @ thegradient.pub
Causal Inference: Connecting Data and Reality

Any causal inference problem consists of two parts: causal identification and statistical inference.

Causal inference theory: Causal inference is a theory that describes, discriminates, and measures causal relationships, developed from statistics.

Causal representation learning: Unlike the traditional causal inference approach, which uses causal graphs to connect random variables to complete the causal discovery and reasoning hypothesis task, the problem of causal representation learning has recently attracted more attention.

is not valid, and causal inference studies exactly such a situation: how to learn a causal model that can work under different distributions, imply a causal mechanism (Cau…

9 months ago @ thegradient.pub
The Future of Speech Recognition: Where Will We Be in 2030?

"By 2030, speech recognition will feature truly multilingual models, rich standardized output objects, and be available to all and at scale.

Finally, speech recognition will engender the principles of responsible AI, and operate without bias."

Source: Hannun, Awni, “Speech Recognition is not Solved”.

Citation: For attribution in academic contexts or books, please cite this work as Migüel Jetté and Corey Miller, "The Future of Speech Recognition: Where will we be in 2030?

BibTeX citation: @article{miller2021futureofowork, author = {Jetté, Migüel and Miller, Corey}, title = {The Future of Speech Recognition: Where will we be in 2030?

9 months, 2 weeks ago @ thegradient.pub
TheSequence
latest post 2 hours ago
The Sequence Chat: Raza Habib, Humanloop on Building LLM-Driven Applications

I’ve believed for a long time now that foundational AI models, like GPT-3/4, are the start of the next big computing platform.

Developers building on top of these models will be able to build a new generation of applications that until recently would have felt like science fiction.

To unlock the potential of LLM applications we need a new set of tools built from first principles.

Reinforcement learning from Human Feedback (RLHF) — in the final step you gather a different type of feedback data, which is preferences.
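The preference data mentioned here can be made concrete with a small sketch: a hypothetical preference record, and the pairwise (Bradley-Terry style) loss commonly used to train a reward model on such records. All field names and numbers below are illustrative assumptions, not drawn from any specific dataset or codebase.

```python
import math

# Illustrative preference record: one prompt, two model outputs, and a human
# choice between them. Field names are hypothetical.
preference = {
    "prompt": "Explain photosynthesis to a child.",
    "chosen": "Plants use sunlight to turn air and water into food.",
    "rejected": "Photosynthesis is a biochemical process. The end.",
}

def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry style reward-model loss: -log sigmoid(r_chosen - r_rejected).
    It shrinks as the reward model scores the human-preferred answer higher."""
    diff = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-diff)))

# A larger margin in favor of the chosen answer gives a smaller loss.
assert pairwise_loss(2.0, 0.0) < pairwise_loss(0.5, 0.0)
```

The trained reward model then supplies the scalar signal that the RL step of RLHF optimizes against.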

The model capabilities also become increasingly dangerous in the hands of bad actors and will likely not be safe to open-source.

2 hours ago @ thesequence.substack.com
Edge 297: Tool-Augmented Language Models

Created Using Midjourney. In this Issue: The Concept: An overview of tool-augmented language models.

The Research: Meta AI’s Toolformer research.

💡 ML Concept of the Day: Tool-Augmented Language Models. One of the new frontiers of foundation model research is the ability to augment large language models (LLMs) with external information.

One of the main ways to obtain real-time external information is mastering the use of APIs or tools.

This is precisely the focus of a new area of research in foundation models known as tool-augmented language models (TALMs).
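As a minimal sketch of the idea (not Toolformer's actual implementation), an orchestrator can scan generated text for inline tool-call markers and splice in the results. The marker syntax, the `execute_tool_calls` helper, and the tool registry below are invented for illustration.

```python
import re

# Hypothetical tool registry; a real TALM learns to emit such calls
# inside its generated text.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "lookup": lambda key: {"capital of France": "Paris"}.get(key, "unknown"),
}

def execute_tool_calls(text: str) -> str:
    """Replace inline markers like [calculator(2*21)] with the tool's result."""
    def run(match):
        name, arg = match.group(1), match.group(2)
        return TOOLS[name](arg)
    return re.sub(r"\[(\w+)\((.*?)\)\]", run, text)

print(execute_tool_calls("The answer is [calculator(2*21)]."))  # The answer is 42.
```

The interesting research question, which Toolformer addresses, is how the model learns *when* to emit such calls in the first place.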

1 day, 2 hours ago @ thesequence.substack.com
📺 See how programmatic labeling is the key to using LLMs [Live Demo]

Even with the rapid advancements in AI made possible by LLMs and foundation models, data remains the key to unlocking real value for enterprise AI.

You’ll see how Snorkel Flow:
Accelerates AI development with programmatic labeling, a fundamentally more scalable approach to building and maintaining high-quality datasets.

), domain expert heuristics, embeddings, foundation models, and more to improve label quality.

Can be used to distill LLM knowledge into a smaller, efficient model or to fine-tune an existing foundation model like GPT-3.

Can use LLM knowledge with zero- and few-shot learning to auto-label training data at the push of a button.
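The core of programmatic labeling can be sketched as labeling functions whose votes are aggregated into training labels. The heuristics and the simple majority-vote aggregation below are a toy illustration of the concept, not Snorkel Flow's actual (far more careful) aggregation model.

```python
# Several noisy labeling functions vote on each example; votes are
# aggregated into a training label. ABSTAIN means "no opinion".
ABSTAIN, POS, NEG = None, 1, 0

def lf_contains_great(text):   # keyword heuristic
    return POS if "great" in text.lower() else ABSTAIN

def lf_contains_refund(text):  # domain-expert heuristic
    return NEG if "refund" in text.lower() else ABSTAIN

def majority_vote(text, lfs):
    votes = [v for v in (lf(text) for lf in lfs) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(set(votes), key=votes.count)

lfs = [lf_contains_great, lf_contains_refund]
print(majority_vote("This product is great!", lfs))  # 1
print(majority_vote("I want a refund.", lfs))        # 0
```

The scalability win is that improving a labeling function re-labels the whole dataset at once, instead of re-annotating examples by hand.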

2 days, 1 hour ago @ thesequence.substack.com
The Next RLHF Effect: Three Breakthroughs that can Unlock the Next Wave of Innovation in Foundation Models

Created Using Midjourney. Next Week in The Sequence: Edge 297 covers one of my favorite subjects in foundation models: tool-augmented LLMs.

Edge 298: A review of MiniGPT-4, one of the most impressive open-source multimodal foundation models released to date.

In essence, RLHF enabled language models (LLMs) to overcome a significant hurdle by offering two key capabilities: producing output aligned with human intentions.

Gorilla: Researchers from UC Berkeley and Microsoft Research published a paper detailing Gorilla, an LLM able to use APIs and tools.

Differential Privacy in ML: Google Research published a paper reviewing the current state of ML differential privacy methods.

3 days, 2 hours ago @ thesequence.substack.com
This week on TuringPost

Hi there! Last week, we introduced Turing Post – the newsletter that complements TheSequence and covers other topics that help you make informed decisions about AI.

It is for those who are in the AI business and need to understand where it comes from and how it affects the world.

Here are a few: “I've often felt myself awash in a confusing sea of information that is very hard to judge for quality and veracity.

“I am quite impressed by the Turing Post.

Please check out our latest publications, subscribe, and keep an eye on your inbox for Turing Post 🤍: The Governance of Undefined Superintelligence and the World of Copilots; Algorithm or Personality?

4 days ago @ thesequence.substack.com
📝 Guest Post: Stop Hallucinations From Hurting your LLM Powered Apps*

Large language model (LLM) hallucinations pose a big threat to the successful adoption of the new wave of LLM apps.

In this post, the Galileo team dives into how one can prevent hallucinations from creeping in, as well as some metrics developed by the researchers at Galileo to quantify potential LLM hallucinations.

They also introduce free access to the Galileo LLM Studio, powered by research-backed mechanisms to combat LLM hallucinations.

Introducing the Galileo LLM Studio: building high-performing LLM-powered apps requires careful debugging of prompts and training data, and the Galileo LLM Studio provides powerful tools to do just that, powered by research-backed mechanisms to combat LL…

5 days, 1 hour ago @ thesequence.substack.com
Edge 296: Inside OpenAI's Method to Use GPT-4 to Explain Neuron's Behaviors in GPT-2

Created Using Midjourney. As language models have advanced in capability and widespread usage, there remains a significant knowledge gap regarding their internal workings.

Understanding whether these models employ biased heuristics or engage in deception solely based on their outputs can be challenging.

In the pursuit of interpretability, OpenAI delves into uncovering additional insights by exploring the model’s internal mechanisms.

Traditionally, this process entailed manual inspection by human experts to decipher the data features represented by these components.

This automated process is then applied to neurons within another language model.
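One way to picture the scoring step of this automated process (a sketch under my own assumptions, not OpenAI's actual code): activations that a model *simulates* from a candidate explanation are compared against the neuron's real activations across tokens, for example via correlation. All the numbers below are made up for illustration.

```python
# Score an explanation by how well activations simulated from it match the
# neuron's real activations over a sequence of tokens.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

real_activations      = [0.1, 0.9, 0.0, 0.8, 0.2]  # neuron on 5 tokens
simulated_activations = [0.0, 1.0, 0.1, 0.7, 0.3]  # predicted from explanation

score = pearson(real_activations, simulated_activations)
# A score near 1 suggests the explanation predicts the neuron's behavior well.
```

This gives a scalable, quantitative proxy for what a human expert would otherwise judge by manual inspection.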

6 days, 2 hours ago @ thesequence.substack.com
The Sequence Chat: Rohan Taori on Stanford's Alpaca, Alpaca Farm and the Future of LLMs

Alpaca was really sort of a happy surprise along the way of our bigger project, AlpacaFarm.

Our goal for AlpacaFarm was to study methods for learning from human feedback (e.g.

Alpaca itself did not have any RLHF component and was only supervised fine-tuned on a small question-answer set.

Your research group followed the work on Alpaca with AlpacaFarm, a method that drastically improves the performance and efficiency of RLHF processes.

So developing new methods for tool use and mitigating the security risks is a really important research discussion for the near future.

1 week ago @ thesequence.substack.com
Edge 295: Self-Instruct Models

Created Using Midjourney. In this Issue: The Concept: Self-Instruct models.

💡 ML Concept of the Day: Self-Instruct Models. Instruction following has become one of the core building blocks of the new generation of LLMs.

One technique that has been evolving as an alternative is the idea of creating LLMs that can bootstrap their own instructions.

These methods are commonly known as self-instruct LLMs.

The core technique was unveiled in a December 2022 paper.
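The bootstrapping loop can be sketched roughly as follows. `fake_generate` stands in for a real LLM call, and the simple exact-match dedup is a stand-in for the ROUGE-overlap filtering used in the original Self-Instruct paper; everything else is an illustrative assumption.

```python
# Minimal sketch of one self-instruct round: prompt the model with a few
# seed instructions, filter out duplicates, and grow the instruction pool.
def fake_generate(prompt_examples):
    """Stand-in for an LLM prompted with example instructions."""
    return ["Translate the sentence into French.",
            "Write a short poem about the sea.",
            "Write a short poem about the sea."]  # duplicate, to be filtered

def self_instruct_round(pool):
    candidates = fake_generate(pool[:3])
    for cand in candidates:
        if cand not in pool:   # crude exact-match dedup; the paper filters
            pool.append(cand)  # near-duplicates by ROUGE-L overlap
    return pool

pool = ["Summarize the paragraph.", "List three uses of copper."]
pool = self_instruct_round(pool)
print(len(pool))  # 4: two seeds plus two unique new instructions
```

Iterating this loop lets a model expand a small seed set into a large instruction-tuning dataset with little human labeling.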

1 week, 1 day ago @ thesequence.substack.com
📝 Guest Post: How to build a responsible code LLM with crowdsourcing*

If you want to build a responsible AI solution, you need to be careful with data handling practices.

We’re going to show how Human-in-the-Loop can be put to effective use in building responsible AI tools, using the example of StarCoder, a code LLM.

By creating this open-source code LLM, the BigCode community, supported by Hugging Face and ServiceNow, has proven that high-performing AI solutions can be a part of responsible AI.

The main goal of the BigCode community was to develop a code LLM that follows responsible AI guidelines, particularly those related to training data.

Example of skill statistics. Responsibility to the crowd: to follow the principles of Responsible AI, crowd projects shoul…

1 week, 2 days ago @ thesequence.substack.com
GPT-Microsoft

Copilots for everything seem to be Microsoft’s generative AI flagship, and that was abundantly clear at this week’s Build conference.

Here are some of the most relevant announcements: Plugins Everywhere: Microsoft announced that it's adopting OpenAI’s plugin standard, enabling interoperability between ChatGPT and Microsoft’s Copilot offerings.

Windows Copilot: Microsoft is bringing the Copilot assistant natively to Windows, with integration with Bing, Edge, and Office apps.

Text-to-Speech and Speech-to-Text for 1,100 Languages: Meta AI published speech-to-text and text-to-speech models that work efficiently in more than 1,100 languages.

Differential Privacy in ML: Google Research published a paper review…

1 week, 3 days ago @ thesequence.substack.com
Announcing Turing Post

But the AI world is vast, and there's so much more to explore to make better decisions about it.

That's why we're thrilled to introduce our new venture: Turing Post.

Subscribe. In this newsletter, we'll have more interaction with the community, which will help us explore AI in depth and breadth.

Turing Post is for those who are in the AI business and need to understand where it comes from and how it affects the world.

So, give us a chance and keep an eye on your inbox for Turing Post.

1 week, 4 days ago @ thesequence.substack.com
📢 Event: ML practitioners from Affirm, Block, Remitly, Tide & more share their learnings from building risk & fraud detection systems

Want to connect with the ML engineering community and learn best practices from ML practitioners on how to build risk and fraud detection systems?

Then join us on May 30 for apply(risk), a free half-day, virtual event!

REGISTER NOW. We’ve got a fantastic speaker lineup; here are some highlights from the agenda: Cooper Stimson, Software Engineer, Machine Learning Platform at Block, will explore the role of the ML platform in integrating a variety of data sources, tools, and applications.

Francisco Arceo, Engineering Manager at Affirm, will discuss the common pitfalls and share simple approaches to avoiding them.

Aravind Maguluri, Lead Data Scientist, and Tocho Tochev, Lead ML Engineer at Tide,…

1 week, 5 days ago @ thesequence.substack.com
Edge 294: Inside StarCoder: Hugging Face's New LLM that Can Generate Code in Over 80 Programming Languages

Created Using Midjourney. Coding is one of the most interesting applications of modern large language models (LLMs).

Programming is a significantly more complex problem than other language tasks, given that it involves different forms of reasoning.

GitHub Copilot has become the gold standard for applying AI to programming, but it’s certainly not the only option.

Recently, Hugging Face and ServiceNow announced StarCoder, a new open source LLM for coding that matches the performance of GPT-4.

StarCoder is part of a larger collaboration known as the BigCode project.

1 week, 6 days ago @ thesequence.substack.com
The Sequence Chat: Hugging Face's Leandro von Werra on StarCoder and Code Generating LLMs

🛠 ML WorkYou are part of the StarCoder project, which was recently released by the BigCode community championed by Hugging Face and ServiceNow.

The goal of BigCode and subsequently StarCoder was to address these issues and produce a high-performance code model with clear data governance structures.

The StarCoder release includes two models: StarCoder and StarCoderBase.

How did you guys handle Jupyter Notebooks in the process of training StarCoder?

We parsed the notebooks in two ways: we converted the notebooks to source code, where the markdown cells become code comments.
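That first strategy can be sketched as a small converter over the notebook's JSON (a simplified illustration under my own assumptions, not the BigCode pipeline itself):

```python
import json

# Turn a .ipynb file's cells into plain source: markdown cells become
# code comments, code cells pass through unchanged.
def notebook_to_source(nb_json: str) -> str:
    nb = json.loads(nb_json)
    lines = []
    for cell in nb["cells"]:
        src = "".join(cell["source"])
        if cell["cell_type"] == "markdown":
            lines.extend("# " + l for l in src.splitlines())
        elif cell["cell_type"] == "code":
            lines.append(src)
    return "\n".join(lines)

nb = json.dumps({"cells": [
    {"cell_type": "markdown", "source": ["Load the data"]},
    {"cell_type": "code", "source": ["df = read_csv('train.csv')"]},
]})
print(notebook_to_source(nb))
# # Load the data
# df = read_csv('train.csv')
```

Converting prose to comments keeps the natural-language context next to the code, which is exactly what a code LLM benefits from during training.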

2 weeks ago @ thesequence.substack.com
Synced Review
latest post 3 weeks, 1 day ago
Meet VideoChat: Integrating Language and Video Models to Boost Video Understanding
3 weeks, 1 day ago @ medium.com
OpenAI’s Shap·E Extends Generative Models for 3D Asset Generation
3 weeks, 4 days ago @ medium.com
Georgia Tech’s ZipIt!
3 weeks, 6 days ago @ medium.com
Microsoft’s Automatic Prompt Optimization Improves Prompts to Boost LLM Performance
4 weeks ago @ medium.com
MIT, Harvard & Northeastern U’s Sparse Probing Aims at ‘Finding Neurons in a Haystack’
4 weeks, 1 day ago @ medium.com
CMU’s Unlimiformer Augments Transformers to Enable Unbounded Input Lengths
1 month ago @ medium.com
Optimizing Transformers: Microsoft & RUC’s ResiDual Solves Gradient Vanishing and Representation…
1 month ago @ medium.com
Google & TAU Explore How Transformer-Based LLMs Extract Knowledge From Their Parameters
1 month ago @ medium.com
CMU & Meta’s AlbedoGAN Advances Realistic 3D Face Generation
1 month ago @ medium.com
Microsoft & Peking U’s WizardLM Enables LLMs to Automatically Mass-Produce Complex Instructions
1 month, 1 week ago @ medium.com
UC Berkeley’s FastRLAP Learns Aggressive and Effective High-Speed Driving Strategies With <20…
1 month, 1 week ago @ medium.com
Microsoft’s NaturalSpeech 2 Outperforms Previous TTS Systems in Zero-Shot Speech and Singing…
1 month, 1 week ago @ medium.com
Look Again, YOLO: Baidu’s RT-DETR Detection Transformer Achieves SOTA Results on Real-Time Object…
1 month, 1 week ago @ medium.com
Huawei’s DiffFit Unlocks the Transferability of Large Diffusion Models to New Domains
1 month, 2 weeks ago @ medium.com
DeepMind & MPG Establish a Research Program for Meta-Learned Models of Cognition
1 month, 2 weeks ago @ medium.com
📓 Cool Blogs
ODS.ai Habr
latest post 1 month, 2 weeks ago
Build Your Own AI Assistant with Streamlit

How to display the dialogue with the bot in the chat: this step shows the code for rendering the chat conversation using a GPT model.

If there are no messages, an introductory AI message is created with ai_role and appended to the list, followed by the user's input.

Splitting the code into these functions makes it easy to customize and manage the conversation flow between the user and the AI.

A parent data class Locale is created, containing the shared attribute ai_role_options with the list of possible AI roles for all supported languages.

Predefined AI roles are assigned for each language via AI_ROLE_OPTIONS_EN and AI_ROLE_OPTIONS_RU, which will come in handy later.
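The locale setup described above can be sketched in plain Python, independent of Streamlit. `Locale`, `ai_role_options`, `AI_ROLE_OPTIONS_EN`, and `AI_ROLE_OPTIONS_RU` come from the post; the shortened role lists and the extra `intro` field are illustrative assumptions.

```python
from dataclasses import dataclass

# Shortened example role lists; the real app defines longer ones.
AI_ROLE_OPTIONS_EN = ["helpful assistant", "code reviewer"]
AI_ROLE_OPTIONS_RU = ["полезный ассистент", "ревьюер кода"]

@dataclass
class Locale:
    """Parent data class holding attributes shared by all supported languages.
    `intro` is a hypothetical extra attribute for illustration."""
    ai_role_options: list
    intro: str

en = Locale(ai_role_options=AI_ROLE_OPTIONS_EN, intro="Hello! Pick my role:")
ru = Locale(ai_role_options=AI_ROLE_OPTIONS_RU, intro="Привет! Выбери мою роль:")

print(en.ai_role_options[0])  # helpful assistant
```

The app can then select the `en` or `ru` instance once and render all UI strings and role choices from it.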

1 month, 2 weeks ago @ habr.com
Humanity vs. Artificial Intelligence: Can the Development of Neural Networks Lead to Catastrophe?

In 2019 they poured billions of dollars into OpenAI, and in 2023 they poured in even more to get access to a preview version of GPT-4.

The question is logical, or even philosophical: how do we formulate, as precisely as possible, what we actually mean, rather than what we think we want to achieve?

Just as it will not decide to "kill everyone."

We are not against energy from nuclear reactors, but let's figure out in advance how to store the spent uranium safely.

The second part of this article will examine in more detail the arguments of both AI proponents and opponents, so that you can understand the picture more deeply.

2 months ago @ habr.com
GPT-4: What the New Neural Network Has Learned, and Why It's a Little Creepy

The neural network will try to make sense of both the visual data and the text prompt at once, and give its answer.

But that's no problem: you can simply paste the error text into the dialogue with GPT-4 and tell it, "listen, just make it work already, okay?", and it will actually apologize and fix everything!

Admittedly, these are the most mainstream and popular projects, which on the one hand are easy to write, but on the other are still full-fledged demonstrations.

Moreover, the comparison here is not with random people, but with people who actually prepared for these exams!

And the point here is not at all about racist jokes or instructions for assembling bombs at home (and fears of subsequent lawsuits and legal proceedings) – …

2 months, 3 weeks ago @ habr.com
How ChatGPT Works: Explaining in Plain Russian the Evolution of Language Models from T9 to a Miracle

Yet few people understand how neural networks like ChatGPT actually work on the inside.

Language models generate long texts without any trouble, but they do it word by word.

Most people easily understand from context that in one case "it" refers to the bait and in the other to the fish.

Quick summary: GPT-2 came out in 2019 and surpassed its predecessor by a factor of 10 both in the volume of training text data and in the size of the model itself (the number of parameters).

In short, InstructGPT (also known as GPT-3.5) is exactly that: GPT-3 fine-tuned on feedback to maximize the ratings given by a live human.

3 months ago @ habr.com
The Evolution of Neural Networks from T9 to ChatGPT: Explaining in Plain Russian How Language Models Work

Yet few people understand how neural networks like ChatGPT actually work on the inside.

You start typing a reply: "Nah, I've got stuff to do(( I'm heading to...", and that's where T9 kicks in.

Language models generate long texts without any trouble, but they do it word by word.

Most people easily understand from context that in one case "it" refers to the bait and in the other to the fish.

Quick summary: GPT-2 came out in 2019 and surpassed its predecessor by a factor of 10 both in the volume of training text data and in the size of the model itself (the number of parameters).

3 months ago @ habr.com
A/B Tests Are Not Only Valuable Fur… They Are Also Processes

A more detailed description of each stage can be found in my talk here and in this article.

Defining the pilot geography and selecting the objects for testing (the pilot group, where we roll out the MVP) and for comparison (the control group, where we deploy nothing).

On why such manual comparison is bad and what A/B testing should improve (and how to explain that to the business!

And if we expect an effect in a couple of sales categories, then we should test on those, not on total sales.

We roll out as best we can, then run a causal impact analysis and voilà: we have an estimate for both the pilot and the rollout.

3 months, 2 weeks ago @ habr.com
[Translation] Running Stable Diffusion Locally and in the Cloud with Diffusers and dstack

In this article, I use a simple example to show how to solve this problem with diffusers and dstack.

To run a script via dstack, the script must be defined as a workflow in a YAML file under .dstack/workflows.

Setting up AWS as a remote: by default, dstack workflows run locally.

Conclusion: if you found this article interesting, you can dig deeper into the topic by studying the documentation for diffusers and dstack.

In one of the upcoming articles we will go beyond image generation and dive into fine-tuning Stable Diffusion.

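The workflow definition mentioned above might look roughly like this (a sketch based on dstack's workflow format of that era; the workflow name, script path, and resource settings are illustrative assumptions, not taken from the article):

```yaml
# .dstack/workflows/stable_diffusion.yaml (hypothetical example)
workflows:
  - name: stable-diffusion
    provider: bash
    commands:
      - pip install -r requirements.txt
      - python stable_diffusion.py   # illustrative script name
    artifacts:
      - path: ./output               # generated images are saved as artifacts
    resources:
      gpu: 1                         # request a GPU when running on the remote
```

With AWS configured as a remote, the workflow could then be launched with something like `dstack run stable-diffusion`.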
3 months, 3 weeks ago @ habr.com
Probability Theory in Machine Learning. Part 2: The Classification Model

The classification model and the loss function: to define a probabilistic model, we need to decide in what form it will predict the distribution.

If we use a sigmoid during training, the model directly predicts the logit, i.e. the logarithm of how many times more probable the second class is than the first.

For example, suppose that in a task of classifying emotions from video the dataset is labeled by several human annotators at once, who sometimes give different answers.

If in a classification task the class probabilities in the reference distribution are 0.7 and 0.3, we would like them to be 0.7 and 0.3 in the prediction as well.

In this section we considered the more general case, where the reference distribu…

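The logit-to-sigmoid relationship and the soft-label target (0.7/0.3) described above can be sketched in a few lines of plain Python (an illustration of the math only; the function names are mine, not the article's):

```python
import math

def sigmoid(logit: float) -> float:
    # P(class 2) from the logit, i.e. log(p2 / p1)
    return 1.0 / (1.0 + math.exp(-logit))

def cross_entropy(target: list[float], pred: list[float]) -> float:
    # H(target, pred) = -sum_i target_i * log pred_i
    return -sum(t * math.log(p) for t, p in zip(target, pred))

logit = math.log(0.3 / 0.7)  # class 2 is 0.3/0.7 times as probable as class 1
p2 = sigmoid(logit)
p1 = 1.0 - p2                # (p1, p2) == (0.7, 0.3) up to rounding

# Cross-entropy is minimized when the prediction matches the 0.7/0.3 reference:
matched = cross_entropy([0.7, 0.3], [0.7, 0.3])
off = cross_entropy([0.7, 0.3], [0.9, 0.1])
print(round(p1, 6), round(p2, 6), matched < off)
```

The inequality `matched < off` is the point of the 0.7/0.3 example: the loss rewards reproducing the annotators' disagreement rather than collapsing onto one class.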
4 months ago @ habr.com
Probability Theory in Machine Learning. Part 1: The Regression Model

In the fifth section, we will look at a regression model with a confidence estimate, in both formulas and code.

Generally speaking, most of machine learning can be viewed as statistical inference, which we will examine more formally later on.

Moreover, the more complex the model and the less data there is, the less accurate the approximation becomes, which is what causes overfitting.

Let's test this method in practice by training a model on the California Housing tabular dataset, where the task is to predict housing prices in different districts of California from 8 input features.

A positive correlation indicates that the model can, to some extent, estimate its own confidence in its predictions.

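One standard way to obtain such a confidence estimate (a sketch of the general idea; the excerpt does not show the article's exact formulation) is to have the model predict both a mean and a log-variance and train with the Gaussian negative log-likelihood:

```python
import math

def gaussian_nll(y: float, mu: float, log_var: float) -> float:
    # -log N(y | mu, sigma^2) with sigma^2 = exp(log_var);
    # predicting the log-variance keeps sigma^2 positive by construction.
    var = math.exp(log_var)
    return 0.5 * (math.log(2 * math.pi) + log_var + (y - mu) ** 2 / var)

# For a fixed squared error, the loss is lowest when the predicted
# variance matches that error:
err2 = 4.0  # squared error (y - mu)^2
losses = {lv: gaussian_nll(2.0, 0.0, lv) for lv in [0.0, math.log(4.0), 3.0]}
best = min(losses, key=losses.get)
print(best == math.log(4.0))  # True: optimal log_var is log of the squared error
```

Because the optimum predicted variance tracks the actual error, the variance head can serve as a per-sample confidence score, which is what the correlation check in the article measures.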
4 months, 1 week ago @ habr.com
ChatGPT as a Search Tool: Solving the Main Problem

By size we mean the number of parameters in the model; for LLMs this number exceeds several billion.

Language models and facts: language models (LMs) solve a very simple task, predicting the next word (or token, a part of a word).

This is clear to us humans, and, as modern language models show, it is clear to them as well.

We have already discussed what a token is, and that a vocabulary of tokens is created for the model in advance and used to feed in the input text.

How a DotA 2 model sees the battlefield: the principle of collecting features to feed into the neural network.

4 months, 1 week ago @ habr.com
Machine Learning Mastery
latest post 2 weeks, 1 day ago
What Are Zero-Shot Prompting and Few-Shot Prompting

In the literature on language models, you will often encounter the terms “zero-shot prompting” and “few-shot prompting.” It is important to understand how a large language model generates an output. In this post, you will learn: What is zero-shot and few-shot prompting? How to experiment with them in GPT4All Let’s get started. Overview This post […]

The post What Are Zero-Shot Prompting and Few-Shot Prompting appeared first on MachineLearningMastery.com.

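The distinction between the two prompting styles can be illustrated with plain prompt strings (a hypothetical sentiment task; the prompt texts are illustrative and not from the post):

```python
# Hypothetical task: sentiment labeling of product reviews.
task = "Classify the sentiment of the review as Positive or Negative."

# Zero-shot: only the instruction and the query, no worked examples.
zero_shot = f"{task}\nReview: The battery died within a week.\nSentiment:"

# Few-shot: the same instruction, preceded by labeled demonstrations.
few_shot = (
    f"{task}\n"
    "Review: Works perfectly, great value.\nSentiment: Positive\n"
    "Review: Arrived broken and support ignored me.\nSentiment: Negative\n"
    "Review: The battery died within a week.\nSentiment:"
)

print(few_shot.count("Sentiment:") - zero_shot.count("Sentiment:"))  # 2 extra demonstrations
```

Both prompts end at "Sentiment:", so the model's next-token prediction completes the label; the few-shot version simply conditions that prediction on in-context examples.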
2 weeks, 1 day ago @ machinelearningmastery.com
Get a Taste of LLMs from GPT4All

Large language models have become popular recently. ChatGPT is fashionable. Trying out ChatGPT to understand what LLMs are about is easy, but sometimes, you may want an offline alternative that can run on your computer. In this post, you will learn about GPT4All as an LLM that you can install on your computer. In particular, […]

The post Get a Taste of LLMs from GPT4All appeared first on MachineLearningMastery.com.

2 weeks, 3 days ago @ machinelearningmastery.com
What are Large Language Models

Large language models (LLMs) are recent advances in deep learning models for working with human language. Some great use cases of LLMs have been demonstrated. A large language model is a trained deep-learning model that understands and generates text in a human-like fashion. Behind the scenes, it is a large transformer model that does all […]

The post What are Large Language Models appeared first on MachineLearningMastery.com.

2 weeks, 5 days ago @ machinelearningmastery.com
Activation Functions in PyTorch

As neural networks become increasingly popular in the field of machine learning, it is important to understand the role that activation functions play in their implementation. In this article, you’ll explore the concept of activation functions that are applied to the output of each neuron in a neural network to introduce non-linearity into the model. […]

The post Activation Functions in PyTorch appeared first on MachineLearningMastery.com.

1 month ago @ machinelearningmastery.com
PyTorch Tutorial: How to Develop Deep Learning Models with Python

Predictive modeling with deep learning is a skill that modern developers need to know. PyTorch is the premier open-source deep learning framework developed and maintained by Facebook. At its core, PyTorch is a mathematical library that allows you to perform efficient computation and automatic differentiation on graph-based models. Achieving this directly is challenging, although thankfully, […]

The post PyTorch Tutorial: How to Develop Deep Learning Models with Python appeared first on MachineLearningMastery.com.

1 month, 1 week ago @ machinelearningmastery.com
Deep Learning with PyTorch (9-Day Mini-Course)

Deep learning is a fascinating field of study, and its techniques are achieving world-class results on a range of challenging machine learning problems. It can be hard to get started in deep learning. Which library should you use, and which techniques should you focus on? In this 9-part crash course you will discover applied deep […]

The post Deep Learning with PyTorch (9-Day Mini-Course) appeared first on MachineLearningMastery.com.

2 months ago @ machinelearningmastery.com
Text Generation with LSTM in PyTorch


2 months, 3 weeks ago @ machinelearningmastery.com
LSTM for Time Series Prediction in PyTorch


2 months, 4 weeks ago @ machinelearningmastery.com
Handwritten Digit Recognition with LeNet5 Model in PyTorch


3 months ago @ machinelearningmastery.com
Building a Convolutional Neural Network in PyTorch


3 months ago @ machinelearningmastery.com
Visualizing a PyTorch Model


3 months ago @ machinelearningmastery.com
Managing a PyTorch Training Process with Checkpoints and Early Stopping


3 months, 1 week ago @ machinelearningmastery.com
Understand Model Behavior During Training by Visualizing Metrics


3 months, 1 week ago @ machinelearningmastery.com
Training a PyTorch Model with DataLoader and Dataset


3 months, 1 week ago @ machinelearningmastery.com
Using Learning Rate Schedule in PyTorch Training


3 months, 2 weeks ago @ machinelearningmastery.com
ML in Production
latest post: none
Sorta Insightful
latest post 4 weeks, 1 day ago
A Boötes Shaped Addendum to Writing Mystery Hunt 2023

But why would you solve a puzzle from an old Mystery Hunt, when there are a bunch of puzzles to testsolve for the upcoming Mystery Hunt?

I’ve since had two people from Galactic tell me that they did this for the Students round in Mystery Hunt 2021.

As the FAQ mentions, the puzzles in that puzzlehunt were originally going to appear in Mystery Hunt 2023.

There are plenty of people who only do MIT Mystery Hunt and are broadly ignorant of other puzzles.

One advantage of Mystery Hunt always happening on MLK weekend is that anyone who wants to attend has ample warning time to clear their calendar.

4 weeks, 1 day ago @ alexirpan.com
Writing MIT Mystery Hunt 2023

This post is about 55,000 words long, and is riddled with spoilers for pretty much every aspect of MIT Mystery Hunt 2023.

I write a post about Mystery Hunt 2022, where I make a few predictions about how writing Mystery Hunt 2023 will go.

We talked a bit about whether this was okay, since some organizers for Teammate Hunt 2021 were not writing Mystery Hunt this year.

The first choice we had to make was whether we’d use the hunt codebase from Palindrome, or use the tph-site codebase we’d built over Teammate Hunt 2020 and Teammate Hunt 2021.

The assumption we made is that most Mystery Hunt teams do not have an active codebase, and default to using the code from the previous Mystery Hunt.

1 month, 2 weeks ago @ alexirpan.com
A Prelude to the Inevitable Long Post About MIT Mystery Hunt 2023

The first time I ever wrote for a puzzlehunt was Mystery Hunt 2013.

Ten years later, teammate wrote another Mystery Hunt that went into Monday, with a similarly large number of free answers as MH 2013.

I don’t think there was any single reason that Mystery Hunt was so hard this year, but there was definitely a systematic underestimation of difficulty and length.

However, there are some first-time constructors on teammate this year, where their Hunt puzzles are their first puzzles for the public.

I’m pretty sure I’ve spent more time on Hunt this year than I spent in all my past puzzle writing combined.

4 months, 2 weeks ago @ alexirpan.com
Generative Modelling is Still Accelerating

In the months since, image generation has gone from a thing some people talked about, to something everyone was talking about.

I read a post from someone who discussed AI asceticism, and then acknowledged that they could not do it, the image generation was too fun to play with.

People have normalized that it is possible to get high quality language-guided image generation really, really quickly.

I think there’s only a few domains where we actually have enough human data at the moment.

I don’t think they’ll lead to fundamental floor raising of what we believe ML models are capable of.

8 months, 1 week ago @ alexirpan.com
Seven Years Later

This January, the team I was on won MIT Mystery Hunt, the biggest puzzlehunt of the year.

See, people don’t quite understand how long it takes to write Mystery Hunt.

414 2022-01-22-mh-2022.markdown
400 2022-04-15-do-what-i-mean.markdown

I'm a bit surprised the ML-related post has fewer views than the Mystery Hunt post.

I’m guessing shades of what this post would have been will appear in other posts I write later.

9 months, 3 weeks ago @ alexirpan.com
Lil'Log
latest post: none
inFERENCe
latest post 1 week ago
Mortal Komputation: On Hinton's argument for superhuman AI.

AGI opinion, May 30, 2023

Analogue hardware allows for lower energy cost but at the cost of mortality: algorithm and hardware are inseparable - the argument goes.

Digital intelligence has two advantages: aggregating learning from parallel experiences, and backpropagation, which is implausible on analogue hardware. Hinton concludes these advantages can/will lead to superhuman digital intelligence.

Brains running on analogue hardware are mortal: once the hardware dies, the algorithm dies with it.

What if we prematurely conclude digital brains are superior to analogue brains just because we haven't yet managed to make analogue computation …

1 week ago @ inference.vc
Autoregressive Models, OOD prompts and the Interpolation Regime

March 30, 2023. A few years ago I was very much into maximum likelihood-based generative modeling and autoregressive models (see this, this or this).

AR models > distributions over sequences. I have always thought of AR models as just a smart way to parametrize probability distributions over multidimensional vectors or sequences.

Let's consider fitting AR models on a dataset generated by a probabilistic context free grammar $\mathbb{P}[S="a^nb^n"]=q^n(1-q)$.

Now consider two autoregressive models, $Q_1$ and $Q_2$, which both perfectly fit our training distribution.

Low probability prompts. But surely, we are not just interested in eva…

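The training distribution above is easy to sample from, since $n$ is geometric with continuation probability $q$ (a minimal sketch; the function name is mine, not the post's):

```python
import random

def sample_anbn(q: float, rng: random.Random) -> str:
    # P[S = "a^n b^n"] = q^n * (1 - q): draw n geometrically,
    # continuing with probability q, then emit n a's followed by n b's.
    n = 0
    while rng.random() < q:
        n += 1
    return "a" * n + "b" * n

rng = random.Random(0)
samples = [sample_anbn(0.5, rng) for _ in range(5)]
# Every sample is balanced by construction: a's followed by equally many b's.
print(all(s.count("a") == s.count("b") for s in samples))  # True
```

An AR model fit to such samples must, after seeing the first "b", track how many "a"s came before; two models that match this training distribution can still diverge on out-of-distribution prompts like "ba", which is the post's point.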
2 months, 1 week ago @ inference.vc
We May be Surprised Again: Why I take LLMs seriously.

March 22, 2023.

"Deep Learning is Easy, Learn something Harder" - I proclaimed in one of my early and provocative blog posts from 2016.

I wasn't alone in my deep learning skepticism, in fact I'm far from being the most extreme deep learning skeptic.

An important change-point in many of our attitudes was the 2016 paper Understanding deep learning requires rethinking generalization.

I have now realised that $\operatorname{argmin}\mathcal{L}$ is a very poor description of what actually happens in deep learning.

2 months, 2 weeks ago @ inference.vc
The Spectator
latest post 1 month, 1 week ago
The Unofficial Google Data Science Blog
latest post: none
Off the Convex Path
latest post: none
Jay Alammar
latest post 4 weeks, 1 day ago
Generative AI and AI Product Moats

Here are eight observations I’ve shared recently on the Cohere blog and videos that go over them.


4 weeks, 1 day ago @ jalammar.github.io
Remaking Old Computer Graphics With AI Image Generation

Can AI Image generation tools make re-imagined, higher-resolution versions of old video game graphics?

Over the last few days, I used AI image generation to reproduce one of my childhood nightmares.

It’s been a few months since the vast majority of people started having broad access to AI image generation tools.

In this time, I’ve gotten to use three of these image generation services.

A key point for extending the capability and building more advanced systems that use an image generation component.

5 months, 1 week ago @ jalammar.github.io
The Illustrated Stable Diffusion

The image generator goes through two stages. 1: Image information creator. This component is the secret sauce of Stable Diffusion.

The image information creator works completely in the image information space (or latent space).

This concludes the description of image generation by diffusion models mostly as described in Denoising Diffusion Probabilistic Models.

Speed Boost: Diffusion on Compressed (Latent) Data Instead of the Pixel Image. To speed up the image generation process, the Stable Diffusion paper runs the diffusion process not on the pixel images themselves, but on a compressed version of the image.

The released Stable Diffusion model uses ClipText (A GPT-based model), while the paper …

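The size of that compression can be made concrete with the commonly cited tensor shapes for Stable Diffusion 1.x (512×512 RGB pixels vs. 64×64×4 latents; these specific shapes are my assumption here, not stated in the excerpt):

```python
# Pixel-space image vs. the compressed latent the diffusion process runs on
# (shapes are the commonly cited ones for 512x512 SD 1.x; assumption, not from the post).
pixel_elems = 512 * 512 * 3   # H x W x RGB channels
latent_elems = 64 * 64 * 4    # h x w x latent channels
print(pixel_elems // latent_elems)  # 48: each diffusion step touches ~48x fewer values
```

Running the denoising loop over a tensor roughly 48 times smaller is what makes latent diffusion so much faster than pixel-space diffusion.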
8 months ago @ jalammar.github.io
Piekniewski's blog
latest post 1 month, 3 weeks ago
The Atom of Intelligence

The initial atom of intelligence never had the luxury of abstraction or determinism.

Control loops of bigger organisms are composed of control loops of those at lower scale.

If the idea of an organism's internal and external control loops is the analog of the atom for intelligence, the ability to compose such loops across scales is the analog of a chemical bond.

These complex organisms not only build control loops from much smaller control loops, but also utilize numerous slower loops.

The web of feedback loops goes back all the way to the primordial atom of intelligence.

1 month, 3 weeks ago @ blog.piekniewski.info
Ai Reflections

AI enthusiast: Maybe, but who cares, ultimately the thing works better than anything before.

AI enthusiast: yes but there are some measurable ways to verify that machine gets better than a human.

AI enthusiast: OK, but what about the Turing test, ultimately when humans get convinced that AI agent is sentient just as they are, it's game over, AGI is here.

AI enthusiast: But GPT can beat human at programming, can write better poems and makes fewer and fewer mistakes.

But as far as critical, practical applications in the real world go, AI deployment has been nothing but a failure.

1 month, 4 weeks ago @ blog.piekniewski.info
AI psychosis

In reality A.I.

And here we come back to the psychosis we're in right now and particularly how I think most of it is really unjustified and wrong.

Later on numerous flaws were pointed out in the study and as it turns out A.I.

The problem with GPT though, as with many other AI contraptions, is that obviously it knows extremely little about the world.

We will just sugar rush ourselves with AI Elon Musk type hyper-promises, scare ourselves to the point of anxiety by the nonexistent AI daemons until we all slowly but surely go totally insane.

3 months, 4 weeks ago @ blog.piekniewski.info
Science, dogma and mysteries.

Now almost 13 years after my PhD defense, my view is that science is actually a rather fragile thread we use to hold together and explain various mysteries in the world.

But I now view science as any other social activity, being influenced by zeitgeist, politics, fashion, financing and often stuck in a dogma, no different than the dogma that threatened Galileo or Copernicus.

A few years back I digested all of his books and this experience has completely changed my view on science.

Science is the best method, but the scientific community is mostly toxic. The general theme of this post is that by looking at several seemingly disconnected aspects of science and technology we can see that our contemp…

5 months ago @ blog.piekniewski.info
What actually is statistics?

Data science is essentially glorified statistics with a computer; AI is deeply statistical at its very core; and we use statistical analysis for pretty much everything from economics to biology.

Statistics is a craft that allows us to analyze and predict a certain subset of complex signals that cannot be described in terms of dynamics.

Now let me repeat this once again: statistics can be applied to some data sometimes.

Also the smaller signals can be reasonably "independent" of each other, but can all be dependent on some other bigger external thing.

they only applied statistics to what can be understood with mechanics but at a slightly higher level of organization.

5 months, 4 weeks ago @ blog.piekniewski.info
fast.ai NLP
latest post: none
Sebastian Ruder
latest post 3 months, 2 weeks ago
Modular Deep Learning

We give an in-depth overview of modularity in our survey on Modular Deep Learning.

Green components illustrate different routing functions, shade-of-purple components illustrate different modular computation functions.

It subsumes standard multi-task learning methods, modules that adapt a pre-trained model (known as 'adapters'), and rescaling methods.

In multi-task learning settings, modular task-specific components are trained jointly to mitigate catastrophic interference, with fixed or learned routing.

CitationFor attribution in academic contexts or books, please cite our survey as:Jonas Pfeiffer and Sebastian Ruder and Ivan Vulić and Edoardo M. Ponti, "Modular Deep Learning".

3 months, 2 weeks ago @ ruder.io
The State of Multilingual AI

This post takes a closer look at the state of multilingual AI.

Multilingual models These models have multilingual analogues—in NLP, models such as mBERT, RemBERT , XLM-RoBERTa , mBART , mT5 , and mDeBERTa —that were trained in a similar fashion, predicting randomly masked tokens on data of around 100 languages.

Compared to their monolingual counterparts, these multilingual models require a much larger vocabulary to represent tokens in many languages.

CitationFor attribution in academic contexts or books, please cite this work as:Sebastian Ruder, "The State of Multilingual AI".

BibTeX citation:@misc{ruder2022statemultilingualai, author = {Ruder, Sebastian}, title = {{The State of Multilingua…

6 months, 3 weeks ago @ ruder.io
Andrej Karpathy blog
latest post: none
🔬 Science
Papers With Code
latest post 4 hours ago
/jpp46/ Exploring the effects of robotic design on learning and neural control

The ongoing deep learning revolution has allowed computers to outclass humans in various games and perceive features imperceptible to humans during classification tasks.

Current machine learning techniques have clearly distinguished themselves in specialized tasks.

Most work in this field is focused on the development of more sophisticated learning algorithms for a robot's controller given a largely static and presupposed robotic design.

Through this discovery, I also present novel metrics to explicitly measure the learning ability of a robotic design and its resistance to common problems such as catastrophic interference.

Overall, this dissertation intimates the ability to automatically de…

4 hours ago @ paperswithcode.com
/edadaltocg/ A Functional Data Perspective and Baseline On Multi-Layer Out-of-Distribution Detection

A key feature of out-of-distribution (OOD) detection is to exploit a trained neural network by extracting statistical patterns and relationships through the multi-layer classifier to detect shifts in the expected input data distribution.

Despite achieving solid results, several state-of-the-art methods rely on the penultimate or last layer outputs only, leaving behind valuable information for OOD detection.

It goes beyond multivariate features aggregation and introduces a baseline rooted in functional anomaly detection.

In this new framework, OOD detection translates into detecting samples whose trajectories differ from the typical behavior characterized by the training set.

We validate our…

5 hours ago @ paperswithcode.com
/xhu248/ Conditional Diffusion Models for Weakly Supervised Medical Image Segmentation

Recent advances in denoising diffusion probabilistic models have shown great success in image synthesis tasks.

While there are already works exploring the potential of this powerful tool in image semantic segmentation, its application in weakly supervised semantic segmentation (WSSS) remains relatively under-explored.

Our method is different from previous diffusion model methods with guidance from an external classifier, which accumulates noises in the background during the reconstruction process.

Our method outperforms state-of-the-art CAM and diffusion model methods on two public medical image segmentation datasets, which demonstrates that CDM is a promising tool in WSSS.

Also, experiment…

5 hours ago @ paperswithcode.com
/apptek/ Take the Hint: Improving Arabic Diacritization with Partially-Diacritized Text

Automatic Arabic diacritization is useful in many applications, ranging from reading support for language learners to accurate pronunciation prediction for downstream tasks like speech synthesis.

While most of the previous works focused on models that operate on raw non-diacritized text, production systems can gain accuracy by first letting humans partly annotate ambiguous words.

In this paper, we propose 2SDiac, a multi-source model that can effectively support optional diacritics in input to inform all predictions.

We also introduce Guided Learning, a training scheme to leverage given diacritics in input with different levels of random masking.

Moreover, experiments on two common benchmark…

6 hours ago @ paperswithcode.com
/lins-lab/ On Pitfalls of Test-Time Adaptation

Test-Time Adaptation (TTA) has recently emerged as a promising approach for tackling the robustness challenge under distribution shifts.

However, the lack of consistent settings and systematic studies in prior literature hinders thorough assessments of existing methods.

To address this issue, we present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.

Through extensive experiments, our benchmark reveals three common pitfalls in prior efforts.

Third, even under optimal algorithmic conditions, none of the existing methods are capable of addressing all common types of distribution shif…

6 hours ago @ paperswithcode.com
/zjlab-ammi/ Enabling Efficient Interaction between an Algorithm Agent and an LLM: A Reinforcement Learning Approach

Large language models (LLMs) encode a vast amount of world knowledge acquired from massive text datasets.

Recent studies have demonstrated that LLMs can assist an algorithm agent in solving complex sequential decision making tasks in embodied environments by providing high-level instructions.

In this paper, we explore how to enable efficient and cost-effective interactions between the agent and an LLM.

We propose a reinforcement learning based mediator model that determines when it is necessary to consult LLMs for high-level instructions to accomplish a target task.

Experimental results also suggest that by learning a mediator model to interact with the LLM, the agent's performance becomes …

6 hours ago @ paperswithcode.com
/lz1oceani/ Deductive Verification of Chain-of-Thought Reasoning

Large Language Models (LLMs) significantly benefit from Chain-of-Thought (CoT) prompting in performing various reasoning tasks.

While CoT allows models to produce more comprehensive reasoning processes, its emphasis on intermediate reasoning steps can inadvertently introduce hallucinations and accumulated errors, thereby limiting models' ability to solve complex reasoning tasks.

Inspired by how humans engage in careful and meticulous deductive logical reasoning processes to solve tasks, we seek to enable language models to perform explicit and rigorous deductive reasoning, and also ensure the trustworthiness of their reasoning process through self-verification.

However, directly verifying t…

7 hours ago @ paperswithcode.com
/ml-research/ Masked Autoencoders are Efficient Continual Federated Learners

Machine learning is typically framed from a perspective of i.i.d. and, more importantly, isolated data.

In parts, federated learning lifts this assumption, as it sets out to solve the real-world challenge of collaboratively learning a shared model from data distributed across clients.

For this purpose, we demonstrate that masked autoencoders for distribution estimation are particularly amenable to this setup.

Specifically, their masking strategy can be seamlessly integrated with task attention mechanisms to enable selective knowledge transfer between clients.

We empirically corroborate the latter statement through several continual federated scenarios on both image and binary datasets.

8 hours ago @ paperswithcode.com
/lijiazheng99/ CUE: An Uncertainty Interpretation Framework for Text Classifiers Built on Pre-Trained Language Models

Text classifiers built on Pre-trained Language Models (PLMs) have achieved remarkable progress in various tasks including sentiment analysis, natural language inference, and question-answering.

However, the occurrence of uncertain predictions by these classifiers poses a challenge to their reliability when deployed in practical applications.

However, few studies have delved into the factors influencing PLM-based classifiers' predictive uncertainty.

In particular, we first map PLM-encoded representations to a latent space via a variational auto-encoder.

We then generate text representations by perturbing the latent space which causes fluctuation in predictive uncertainty.

8 hours ago @ paperswithcode.com
/nianlonggu/ SciLit: A Platform for Joint Scientific Literature Discovery, Summarization and Citation Generation

Scientific writing involves retrieving, summarizing, and citing relevant papers, which can be time-consuming processes in large and rapidly evolving fields.

By making these processes inter-operable, natural language processing (NLP) provides opportunities for creating end-to-end assistive writing tools.

We propose SciLit, a pipeline that automatically recommends relevant papers, extracts highlights, and suggests a reference sentence as a citation of a paper, taking into consideration the user-provided context and keywords.

SciLit efficiently recommends papers from large databases of hundreds of millions of papers using a two-stage pre-fetching and re-ranking literature search system that fl…

8 hours ago @ paperswithcode.com
/cognano/ AVIDa-hIL6: A Large-Scale VHH Dataset Produced from an Immunized Alpaca for Predicting Antigen-Antibody Interactions

To accelerate therapeutic antibody discovery, computational methods, especially machine learning, have attracted considerable interest for predicting specific interactions between antibody candidates and target antigens such as viruses and bacteria.

However, the publicly available datasets in existing works have notable limitations, such as small sizes and the lack of non-binding samples and exact amino acid sequences.

To overcome these limitations, we have developed AVIDa-hIL6, a large-scale dataset for predicting antigen-antibody interactions in the variable domain of heavy chain of heavy chain antibodies (VHHs), produced from an alpaca immunized with the human interleukin-6 (IL-6) protei…

8 hours ago @ paperswithcode.com
/astrodrew/ Towards Alleviating the Object Bias in Prompt Tuning-based Factual Knowledge Extraction

Many works employed prompt tuning methods to automatically optimize prompt queries and extract the factual knowledge stored in Pretrained Language Models.

In this paper, we observe that the optimized prompts, including discrete prompts and continuous prompts, exhibit undesirable object bias.

To handle this problem, we propose a novel prompt tuning method called MeCoD, consisting of three modules: Prompt Encoder, Object Equalization and Biased Object Obstruction.

Experimental results show that MeCoD can significantly reduce the object bias and at the same time improve the accuracy of factual knowledge extraction.

8 hours ago @ paperswithcode.com
/hincz-lab/ Machine learning in and out of equilibrium

We focus in particular on the stationary state of the system in the long-time limit, which in conventional SGD is out of equilibrium, exhibiting persistent currents in the space of network parameters.

The stationary distribution of these rates obeys the integral and detailed fluctuation theorems -- nonequilibrium generalizations of the second law of thermodynamics.

While the fluctuation theorems are universal, there are other aspects of the stationary state that are highly sensitive to the training details.

Surprisingly, the effective loss landscape and diffusion matrix that determine the shape of the stationary distribution vary depending on the simple choice of minibatching done with or w…

8 hours ago @ paperswithcode.com
/hakonnoren/ Learning Dynamical Systems from Noisy Data with Inverse-Explicit Integrators

We introduce the mean inverse integrator (MII), a novel approach to increase the accuracy when training neural networks to approximate vector fields of dynamical systems from noisy data.

This method can be used to average multiple trajectories obtained by numerical integrators such as Runge-Kutta methods.

We show that the class of mono-implicit Runge-Kutta methods (MIRK) has particular advantages when used in connection with MII.

When training vector field approximations, explicit expressions for the loss functions are obtained when inserting the training data in the MIRK formulae, unlocking symmetric and high-order integrators that would otherwise be implicit for initial value problems.

Th…

8 hours ago @ paperswithcode.com
/ibm/ From Key Points to Key Point Hierarchy: Structured and Expressive Opinion Summarization

Key Point Analysis (KPA) has been recently proposed for deriving fine-grained insights from collections of textual comments.

KPA extracts the main points in the data as a list of concise sentences or phrases, termed key points, and quantifies their prevalence.

While key points are more expressive than word clouds and key phrases, making sense of a long, flat list of key points, which often express related ideas in varying levels of granularity, may still be challenging.

We develop ThinkP, a high quality benchmark dataset of key point hierarchies for business and product reviews, obtained by consolidating multiple annotations.

We compare different methods for predicting pairwise relations be…

8 hours ago @ paperswithcode.com
Papers With Code
last post 4 hours ago
/thomashikaru/ A Cross-Linguistic Pressure for Uniform Information Density in Word Order

While natural languages differ widely in both canonical word order and word order flexibility, their word orders still follow shared cross-linguistic statistical patterns, often attributed to functional pressures.

In the effort to identify these pressures, prior work has compared real and counterfactual word orders.

Yet one functional pressure has been overlooked in such investigations: the uniform information density (UID) hypothesis, which holds that information should be spread evenly throughout an utterance.

Here, we ask whether a pressure for UID may have influenced word order patterns cross-linguistically.

To this end, we use computational models to test whether real orders lead to gr…
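One common way to operationalize UID is the variance of per-token surprisal: the lower the variance, the more evenly information is spread through the utterance. A minimal sketch with made-up surprisal values; the paper's actual measures may differ:

```python
import statistics

def uid_variance(surprisals):
    """Population variance of per-token surprisal; lower = more uniform
    information density across the utterance."""
    return statistics.pvariance(surprisals)

# Made-up surprisal values (bits) for two orderings of the same content.
even_order   = [2.1, 2.0, 2.2, 1.9]  # information spread evenly
bursty_order = [0.2, 0.3, 5.8, 1.9]  # information concentrated on one token
assert uid_variance(even_order) < uid_variance(bursty_order)
```

Comparing this statistic between real and counterfactual word orders is one way to test whether attested orders are the more uniform ones.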

8 hours ago @ paperswithcode.com
/soummyaah/ FinRED: A Dataset for Relation Extraction in Financial Domain

Relation extraction models trained on a source domain cannot be applied on a different target domain due to the mismatch between relation sets.

In the current literature, there is no extensive open-source relation extraction dataset specific to the finance domain.

In this paper, we release FinRED, a relation extraction dataset curated from financial news and earning call transcripts containing relations from the finance domain.

We also experiment with various state-of-the-art relation extraction models on this dataset to create the benchmark.

We see a significant drop in their performance on FinRED compared to the general relation extraction datasets, which suggests that we need better models f…

8 hours ago @ paperswithcode.com
/nhuang37/ Fine-grained Expressivity of Graph Neural Networks

Numerous recent works have analyzed the expressive power of message-passing graph neural networks (MPNNs), primarily utilizing combinatorial techniques such as the $1$-dimensional Weisfeiler-Leman test ($1$-WL) for the graph isomorphism problem.

However, the graph isomorphism objective is inherently binary, not giving insights into the degree of similarity between two given graphs.

This work resolves this issue by considering continuous extensions of both $1$-WL and MPNNs to graphons.

Consequently, we provide a theoretical framework for graph and graphon similarity combining various topological variants of classical characterizations of the $1$-WL.

Moreover, we evaluate different MPNN archi…

8 hours ago @ paperswithcode.com
/egg-west/ Mildly Constrained Evaluation Policy for Offline Reinforcement Learning

Offline reinforcement learning (RL) methodologies enforce constraints on the policy to adhere closely to the behavior policy, thereby stabilizing value learning and mitigating the selection of out-of-distribution (OOD) actions during test time.

Conventional approaches apply identical constraints for both value learning and test time inference.

To address this issue, we propose a Mildly Constrained Evaluation Policy (MCEP) for test time inference with a more constrained target policy for value estimation.

Since the target policy has been adopted in various prior approaches, MCEP can be seamlessly integrated with them as a plug-in.

The empirical results on MuJoCo locomotion tasks show that th…

8 hours ago @ paperswithcode.com
/zhjohnchan/ On the Difference of BERT-style and CLIP-style Text Encoders

Masked language modeling (MLM) has been one of the most popular pretraining recipes in natural language processing, e.g., BERT, one of the representative models.

Recently, contrastive language-image pretraining (CLIP) has also attracted attention, especially its vision models that achieve excellent performance on a broad range of vision tasks.

However, few studies are dedicated to studying the text encoders learned by CLIP.

In this paper, we analyze the difference between BERT-style and CLIP-style text encoders from three experiments: (i) general text understanding, (ii) vision-centric text understanding, and (iii) text-to-image generation.

Experimental analyses show that although CLIP-styl…

8 hours ago @ paperswithcode.com
/peggypytang/ Efficient and Interpretable Compressive Text Summarisation with Unsupervised Dual-Agent Reinforcement Learning

Recently, compressive text summarisation has emerged to offer a balance between the conciseness issue of extractive summarisation and the factual hallucination issue of abstractive summarisation.

However, most existing compressive summarisation methods are supervised, relying on the expensive effort of creating a new training dataset with corresponding compressive summaries.

In this paper, we propose an efficient and interpretable compressive summarisation method that utilises unsupervised dual-agent reinforcement learning to optimise a summary's semantic coverage and fluency by simulating human judgment on summarisation quality.

Our model consists of an extractor agent and a compressor agent, and both a…

8 hours ago @ paperswithcode.com
/tmlr-group/ Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability

Out-of-distribution (OOD) detection is an indispensable aspect of secure AI when deploying machine learning models in real-world applications.

Previous paradigms either explore better scoring functions or utilize the knowledge of outliers to equip the models with the ability of OOD detection.

However, few of them pay attention to the intrinsic OOD detection capability of the given model.

Based on such insights, we propose a novel method, Unleashing Mask, which aims to restore the OOD discriminative capabilities of the well-trained model with ID data.

Our method utilizes a mask to figure out the memorized atypical samples, and then finetune the model or prune it with the introduced mask to f…

8 hours ago @ paperswithcode.com
/eleutherai/ LEACE: Perfect linear concept erasure in closed form

Concept erasure aims to remove specified features from a representation.

This can be used to improve fairness (e.g. preventing a classifier from using gender or race) and interpretability (e.g. removing a concept to observe changes in model behavior).

In this paper, we introduce LEAst-squares Concept Erasure (LEACE), a closed-form method which provably prevents all linear classifiers from detecting a concept while inflicting the least possible damage to the representation.

We apply LEACE to large language models with a novel procedure called "concept scrubbing," which erases target concept information from every layer in the network.

We demonstrate the usefulness of our method on two tasks: measuring the reliance of language models on part-of-speech information, and reducing gender bias in…
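As a rough intuition for linear concept erasure, one can project representations onto the orthogonal complement of the direction separating the concept's class means. This is a didactic simplification, not LEACE itself, which uses a whitened least-squares projection with a provable guarantee against all linear probes; all names below are illustrative:

```python
import numpy as np

def erase_direction(X, z):
    """Remove the single direction along which a binary concept z is most
    obviously encoded: the unit class-mean difference. A much simpler
    sketch than LEACE's whitened least-squares projection."""
    d = X[z == 1].mean(axis=0) - X[z == 0].mean(axis=0)
    d = d / np.linalg.norm(d)
    return X - np.outer(X @ d, d), d

# Synthetic data: the concept is injected into coordinate 0.
rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 8)) + 3.0 * z[:, None] * np.eye(8)[0]

Xe, d = erase_direction(X, z)
# No variation remains along the erased direction.
assert np.allclose(Xe @ d, 0.0, atol=1e-8)
```

"Concept scrubbing" in the paper applies this kind of erasure at every layer of the network rather than to a single representation.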

9 hours ago @ paperswithcode.com
/j-l-o/ Supervised Knowledge May Hurt Novel Class Discovery Performance

Novel class discovery (NCD) aims to infer novel categories in an unlabeled dataset by leveraging prior knowledge of a labeled set comprising disjoint but related classes.

Given that most existing literature focuses primarily on utilizing supervised knowledge from a labeled set at the methodology level, this paper considers the question: Is supervised knowledge always helpful at different levels of semantic relevance?

Next, by using the proposed transfer flow, we conduct various empirical experiments with different levels of semantic similarity, yielding that supervised knowledge may hurt NCD performance.

These results reveal the inadequacy of the existing NCD literature which usually assume…

9 hours ago @ paperswithcode.com
/elsevier-ai-lab/ BioBLP: A Modular Framework for Learning on Multimodal Biomedical Knowledge Graphs

Knowledge graphs (KGs) are an important tool for representing complex relationships between entities in the biomedical domain.

Several methods have been proposed for learning embeddings that can be used to predict new links in such graphs.

Some methods ignore valuable attribute data associated with entities in biomedical KGs, such as protein sequences, or molecular graphs.

This is not always the case for biomedical KGs, where entities exhibit heterogeneous modalities that are central to their representation in the subject domain.

We propose a modular framework for learning embeddings in KGs with entity attributes, that allows encoding attribute data of different modalities while also suppor…

9 hours ago @ paperswithcode.com
/yezi-66/ Instructive Feature Enhancement for Dichotomous Medical Image Segmentation

Deep neural networks have been widely applied in dichotomous medical image segmentation (DMIS) of many anatomical structures in several modalities, achieving promising performance.

However, they offer little guidance as to which feature channels would be more beneficial for segmentation, which may be why the performance and universality of these segmentation models are limited.

In this study, we propose an instructive feature enhancement approach, namely IFE, to adaptively select feature channels with rich texture cues and strong discriminability to enhance raw features based on local curvature or global information entropy criteria.

To evaluate the proposed IFE, we constructed the first large-s…

9 hours ago @ paperswithcode.com
/spacelearner/ On Manipulating Signals of User-Item Graph: A Jacobi Polynomial-based Graph Collaborative Filtering

Collaborative filtering (CF) is an important research direction in recommender systems that aims to make recommendations given the information on user-item interactions.

Graph CF has attracted more and more attention in recent years due to its effectiveness in leveraging high-order information in the user-item bipartite graph for better recommendations.

Specifically, recent studies show the success of graph neural networks (GNN) for CF is attributed to its low-pass filtering effects.

To this end, from the view of spectral transformation, we analyze the important factors that a graph filter should consider to achieve better performance.

Based on the discoveries, we design JGCF, an efficient …
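As background for this spectral view, a generic polynomial graph filter on the normalized adjacency looks like the sketch below. JGCF's contribution is the choice of Jacobi polynomial bases (which this monomial-basis toy does not implement):

```python
import numpy as np

def polynomial_filter(A, x, coeffs):
    """Apply the spectral filter sum_k coeffs[k] * A_norm^k to signal x,
    with A_norm = D^{-1/2} A D^{-1/2}. Positive low-degree coefficients
    give a low-pass (smoothing) filter over the graph."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    A_norm = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    out = np.zeros_like(x, dtype=float)
    p = x.astype(float)
    for c in coeffs:
        out += c * p
        p = A_norm @ p
    return out

# Toy user-item bipartite graph: nodes 0-1 are users, 2-3 are items.
A = np.array([[0, 0, 1, 1],
              [0, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)
x = np.array([1.0, 0.0, 0.0, 0.0])              # impulse at user 0
smoothed = polynomial_filter(A, x, [0.5, 0.5])  # 0.5*I + 0.5*A_norm
assert smoothed[2] > 0 and smoothed[3] > 0      # mass spread to user 0's items
```

Choosing the coefficients (equivalently, the polynomial basis) is exactly the design space a graph filter explores to trade off low- and high-frequency signal components.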

9 hours ago @ paperswithcode.com
/zhishenyang/ SciCap+: A Knowledge Augmented Dataset to Study the Challenges of Scientific Figure Captioning

Automating figure caption generation helps move model understandings of scientific documents beyond text and will help authors write informative captions that facilitate communicating scientific findings.

Unlike previous studies, we reframe scientific figure captioning as a knowledge-augmented image captioning task that models need to utilize knowledge embedded across modalities for caption generation.

To this end, we extended the large-scale SciCap dataset~\cite{hsu-etal-2021-scicap-generating} to SciCap+ which includes mention-paragraphs (paragraphs mentioning figures) and OCR tokens.

Our results indicate that mention-paragraphs serve as additional context knowledge, which significantly …

9 hours ago @ paperswithcode.com
/nicolas-hbt/ Schema First! Learn Versatile Knowledge Graph Embeddings by Capturing Semantics with MASCHInE

Knowledge graph embedding models (KGEMs) have gained considerable traction in recent years.

These models learn a vector representation of knowledge graph entities and relations, a.k.a. knowledge graph embeddings (KGEs).

Extensive experiments on various evaluation benchmarks demonstrate the soundness of this approach, which we call Modular and Agnostic SCHema-based Integration of protograph Embeddings (MASCHInE).

In particular, MASCHInE helps produce more versatile KGEs that yield substantially better performance for entity clustering and node classification tasks.

9 hours ago @ paperswithcode.com
/sr0920/ CiT-Net: Convolutional Neural Networks Hand in Hand with Vision Transformers for Medical Image Segmentation

Hybrid architectures of convolutional neural networks (CNNs) and Transformers are very popular for medical image segmentation.

First, although a CNN branch can capture local image features using vanilla convolutions, it cannot achieve adaptive feature learning.

To address these challenges, we propose a novel hybrid architecture of convolutional neural networks hand in hand with vision Transformers (CiT-Net) for medical image segmentation.

We apply them to the Transformer branch to learn the cross-dimensional long-term dependency for medical images.

Experimental results show that our CiT-Net provides better medical image segmentation results than popular SOTA methods.

9 hours ago @ paperswithcode.com
/baneitixiaomai/ Mutual Information Regularization for Weakly-supervised RGB-D Salient Object Detection

In this paper, we present a weakly-supervised RGB-D salient object detection model via scribble supervision.

Specifically, as a multimodal learning task, we focus on effective multimodal representation learning via inter-modal mutual information regularization.

In particular, following the principle of disentangled representation learning, we introduce a mutual information upper bound with a mutual information minimization regularizer to encourage the disentangled representation of each modality for salient object detection.

Based on our multimodal representation learning framework, we introduce an asymmetric feature extractor for our multimodal data, which is proven more effective than the…

9 hours ago @ paperswithcode.com
/lofrienger/ Curriculum-Based Augmented Fourier Domain Adaptation for Robust Medical Image Segmentation

Accurate and robust medical image segmentation is fundamental and crucial for enhancing the autonomy of computer-aided diagnosis and intervention systems.

Medical data collection normally involves different scanners, protocols, and populations, making domain adaptation (DA) a highly demanding research field to alleviate model degradation in the deployment site.

To preserve the model performance across multiple testing domains, this work proposes the Curriculum-based Augmented Fourier Domain Adaptation (Curri-AFDA) for robust medical image segmentation.

Extensive experiments on two segmentation tasks of Retina and Nuclei collected from multiple sites and scanners suggest that our proposed me…

9 hours ago @ paperswithcode.com
/tmlr-group/ Exploring Model Dynamics for Accumulative Poisoning Discovery

Adversarial poisoning attacks pose huge threats to various machine learning applications.

Especially, the recent accumulative poisoning attacks show that it is possible to achieve irreparable harm on models via a sequence of imperceptible attacks followed by a trigger batch.

In this paper, we dive into the perspective of model dynamics and propose a novel information measure, namely, Memorization Discrepancy, to explore the defense via the model-level information.

By implicitly transferring the changes in the data manipulation to that in the model outputs, Memorization Discrepancy can discover the imperceptible poison samples based on their distinct dynamics from the clean samples.

We thoro…

9 hours ago @ paperswithcode.com
/soummyaah/ Financial Numeric Extreme Labelling: A Dataset and Benchmarking for XBRL Tagging

The U.S. Securities and Exchange Commission (SEC) mandates all public companies to file periodic financial statements that should contain numerals annotated with a particular label from a taxonomy.

In this paper, we formulate the task of automating the assignment of a label to a particular numeral span in a sentence from an extremely large label set.

Towards this task, we release a dataset, Financial Numeric Extreme Labelling (FNXL), annotated with 2,794 labels.

We benchmark the performance of the FNXL dataset by formulating the task as (a) a sequence labelling problem and (b) a pipeline with span extraction followed by Extreme Classification.

Although the two approaches perform comparably,…

9 hours ago @ paperswithcode.com
/nizhf/ Human-Object Interaction Prediction in Videos through Gaze Following

Understanding the human-object interactions (HOIs) from a video is essential to fully comprehend a visual scene.

In this paper, we design a framework to detect current HOIs and anticipate future HOIs in videos.

These gaze features together with the scene contexts and the visual appearances of human-object pairs are fused through a spatio-temporal transformer.

Our model is trained and validated on the VidHOI dataset, which contains videos capturing daily life and is currently the largest video HOI dataset.

Moreover, we conduct an extensive ablation study to demonstrate the effectiveness of our modifications and extensions to the spatio-temporal transformer.

9 hours ago @ paperswithcode.com
/vita-group/ The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter

Large pre-trained transformers are the show-stealers of modern-day deep learning, and as they grow in scale it becomes crucial to comprehend the parsimonious patterns that exist within them.

In this paper, we comprehensively study induced sparse patterns across multiple large pre-trained vision and language transformers.

We propose the existence of essential sparsity, defined by a sharp dropping point in the sparsity-performance curve beyond which performance declines much faster as sparsity rises, when we directly remove the weights with the smallest magnitudes in one shot.

We also present an intriguing emerging phenomenon of abrupt sparsification during the pre-training of B…
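One-shot magnitude pruning, used to trace the sparsity-performance curve, can be sketched as follows (a generic sketch, not the authors' code):

```python
import numpy as np

def one_shot_prune(w, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest magnitude."""
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    out = w.copy()
    out[np.abs(out) <= thresh] = 0.0
    return out

# Sweep sparsity levels; evaluating the model at each level traces the
# sparsity-performance curve whose sharp dropping point is the
# "essential sparsity".
weights = np.random.default_rng(0).normal(size=1000)
for s in (0.1, 0.5, 0.9):
    assert np.mean(one_shot_prune(weights, s) == 0.0) >= s
```

In practice the sweep is run per model (or per layer), with task performance measured after each pruning level.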

9 hours ago @ paperswithcode.com
/whiffe/ Student Classroom Behavior Detection based on Improved YOLOv7

Accurately detecting student behavior in classroom videos can aid in analyzing their classroom performance and improving teaching effectiveness.

However, the current accuracy rate in behavior detection is low.

To address this challenge, we propose the Student Classroom Behavior Detection method, based on improved YOLOv7.

First, we created the Student Classroom Behavior dataset (SCB-Dataset), which includes 18.4k labels and 4.2k images, covering three behaviors: hand raising, reading, and writing.

To improve detection accuracy in crowded scenes, we integrated the biformer attention module and Wise-IoU into the YOLOv7 network.
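Wise-IoU is a weighted variant of the standard box IoU; for grounding, plain IoU (on which Wise-IoU layers a dynamic focusing weight, described in its own paper) might be computed as:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2).

    Plain IoU only, shown for reference; Wise-IoU as integrated in the
    paper adds a dynamic focusing weight on top of this quantity.
    """
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0
```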

9 hours ago @ paperswithcode.com
/servicenow/ GEO-Bench: Toward Foundation Models for Earth Monitoring

Such models, recently coined foundation models, have been transformational to the field of natural language processing.

To stimulate the development of foundation models for Earth monitoring, we propose a benchmark comprised of six classification and six segmentation tasks, which were carefully curated and adapted to be both relevant to the field and well-suited for model evaluation.

We accompany this benchmark with a robust methodology for evaluating models and reporting aggregated results to enable a reliable assessment of progress.

Finally, we report results for 20 baselines to gain information about the performance of existing models.

We believe that this benchmark will be a driver of p…

9 hours ago @ paperswithcode.com
/cranial-xix/ FAMO: Fast Adaptive Multitask Optimization

One of the grand enduring goals of AI is to create generalist agents that can learn multiple different tasks from diverse data via multitask learning (MTL).

However, gradient descent (GD) on the average loss across all tasks may yield poor multitask performance due to severe under-optimization of certain tasks.

In this work, we introduce Fast Adaptive Multitask Optimization (FAMO), a dynamic weighting method that decreases task losses in a balanced way using O(1) space and time.

We conduct an extensive set of experiments covering multi-task supervised and reinforcement learning problems.

Our results indicate that FAMO achieves comparable or superior performance to state-of-the-art gradient …
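The general idea of dynamic loss weighting, keeping per-task state of size O(num_tasks) and boosting tasks whose losses improve slowly, can be illustrated with the following sketch; this is not FAMO's exact update rule (see the paper), and `temperature` is a hypothetical knob for this illustration:

```python
import numpy as np

def dynamic_task_weights(prev_losses, curr_losses, temperature=1.0):
    """Re-weight tasks by how little their losses improved last step.

    A generic illustration of balanced dynamic multitask weighting:
    tasks with the smallest relative loss decrease get more weight.
    """
    prev = np.asarray(prev_losses, dtype=float)
    curr = np.asarray(curr_losses, dtype=float)
    improvement = np.log(prev) - np.log(curr)   # relative loss decrease
    scores = -improvement / temperature          # slow tasks score high
    w = np.exp(scores - scores.max())            # numerically stable softmax
    return w / w.sum()

# Task 0 halved its loss; task 1 barely moved, so it gets more weight.
w = dynamic_task_weights([1.0, 1.0], [0.5, 0.9])
```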

9 hours ago @ paperswithcode.com
/pievos101/ Bayesian post-hoc regularization of random forests

Random Forests are powerful ensemble learning algorithms widely used in various machine learning tasks.

However, they have a tendency to overfit noisy or irrelevant features, which can result in decreased generalization performance.

Post-hoc regularization techniques aim to mitigate this issue by modifying the structure of the learned ensemble after its training.

Here, we propose Bayesian post-hoc regularization to leverage the reliable patterns captured by leaf nodes closer to the root, while potentially reducing the impact of more specific and potentially noisy leaf nodes deeper in the tree.

We have evaluated the performance of our method on various machine learning data sets.
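One way to picture trusting shallow structure more than deep leaves is hierarchical shrinkage along a root-to-leaf path; the toy rule below stands in for the paper's Bayesian posterior update, which it does not reproduce:

```python
def regularized_leaf_value(path_means, lam=1.0):
    """Shrink a leaf prediction toward ancestors near the root.

    `path_means` holds the mean prediction at each node from root to
    leaf; deeper nodes contribute progressively smaller corrections,
    mirroring the abstract's intuition that deep leaves may be noisy.
    """
    value = path_means[0]
    for depth, mean in enumerate(path_means[1:], start=1):
        # Each deeper node only partially moves the estimate.
        value += (mean - value) / (1.0 + lam * depth)
    return value
```

With `lam=0` the leaf prediction is returned unchanged; larger `lam` pulls predictions toward the root.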

9 hours ago @ paperswithcode.com
/fushenghao/ Human-imperceptible, Machine-recognizable Images

Massive human-related data is collected to train neural networks for computer vision tasks.

To reconcile this conflict, this paper proposes an efficient privacy-preserving learning paradigm, where images are first encrypted to become ``human-imperceptible, machine-recognizable'' via one of the two encryption strategies: (1) random shuffling to a set of equally-sized patches and (2) mixing-up sub-patches of the images.

Then, minimal adaptations are made to vision transformer to enable it to learn on the encrypted images for vision tasks, including image classification and object detection.

Extensive experiments on ImageNet and COCO show that the proposed paradigm achieves comparable accuracy…
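Encryption strategy (1), random shuffling of equally-sized patches, can be sketched in a few lines; this minimal version assumes the image dimensions are divisible by the patch size and is not the authors' implementation:

```python
import numpy as np

def shuffle_patches(image, patch, seed=0):
    """"Encrypt" an image by randomly permuting its non-overlapping patches."""
    h, w, c = image.shape
    gh, gw = h // patch, w // patch
    # Cut into a (gh*gw, patch, patch, c) stack of patches.
    patches = (image.reshape(gh, patch, gw, patch, c)
                    .transpose(0, 2, 1, 3, 4)
                    .reshape(gh * gw, patch, patch, c))
    perm = np.random.default_rng(seed).permutation(gh * gw)
    # Reassemble the permuted patches into an image of the same shape.
    return (patches[perm].reshape(gh, gw, patch, patch, c)
                         .transpose(0, 2, 1, 3, 4)
                         .reshape(h, w, c))

img = np.arange(4 * 4 * 1).reshape(4, 4, 1)
enc = shuffle_patches(img, patch=2)
```

The permutation preserves pixel content (so a patch-based model can still learn from it) while destroying the global layout a human would recognize.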

9 hours ago @ paperswithcode.com
/abi-kothapalli/ Learning-Based Heuristic for Combinatorial Optimization of the Minimum Dominating Set Problem using Graph Convolutional Networks

A dominating set of a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ is a subset of vertices $S \subseteq \mathcal{V}$ such that every vertex $v \in \mathcal{V} \setminus S$ outside the dominating set is adjacent to a vertex $u \in S$ within the set.

The minimum dominating set problem seeks to find a dominating set of minimum cardinality and is a well-established NP-hard combinatorial optimization problem.

We propose a novel learning-based heuristic approach to compute solutions for the minimum dominating set problem using graph convolutional networks.

Our results indicate that the proposed learning-based approach can outperform a classical greedy approximation algorithm.

Finally, we utilize the proposed learning-…
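The classical greedy approximation the abstract compares against repeatedly picks the vertex whose closed neighbourhood covers the most still-uncovered vertices; a generic sketch (not the authors' GCN heuristic):

```python
def greedy_dominating_set(adj):
    """Classical greedy approximation for minimum dominating set.

    `adj` maps each vertex to the set of its neighbours. Each round,
    add the vertex that dominates the most not-yet-dominated vertices
    (itself plus its neighbours), until every vertex is dominated.
    """
    uncovered = set(adj)
    dominating = set()
    while uncovered:
        v = max(adj, key=lambda u: len(({u} | adj[u]) & uncovered))
        dominating.add(v)
        uncovered -= {v} | adj[v]
    return dominating

# A star graph: the centre alone dominates every vertex.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```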

9 hours ago @ paperswithcode.com
/maxwellreuter/ I'm Afraid I Can't Do That: Predicting Prompt Refusal in Black-Box Generative Language Models

Since the release of OpenAI's ChatGPT, generative language models have attracted extensive public attention.

The increased usage has highlighted generative models' broad utility, but also revealed several forms of embedded bias.

Some of this bias is induced by the pre-training corpus, but additional bias specific to generative models arises from the use of subjective fine-tuning to avoid generating harmful content.

Second, we use this refusal classifier to bootstrap a larger (n=10,000) dataset adapted from the Quora Insincere Questions dataset.

This prompt classifier achieves 76% accuracy on a test set of manually labeled questions (n=1,009).

9 hours ago @ paperswithcode.com
/tencentarc/ SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation

As an important and challenging problem in computer vision, PAnoramic Semantic Segmentation (PASS) gives complete scene perception based on an ultra-wide angle of view.

Therefore, their performance drops considerably when the input panoramic images contain 3D disturbances.

To be more robust to 3D disturbance, we propose our Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation (SGAT4PASS), considering 3D spherical geometry knowledge.

Specifically, a spherical geometry-aware framework is proposed for PASS.

It includes three modules, i.e., spherical geometry-aware image projection, spherical deformable patch embedding, and a panorama-aware loss, which takes input images with 3…

9 hours ago @ paperswithcode.com
/lars-research/ ColdNAS: Search to Modulate for User Cold-Start Recommendation

Making personalized recommendation for cold-start users, who only have a few interaction histories, is a challenging problem in recommendation systems.

Recent works leverage hypernetworks to directly map user interaction histories to user-specific parameters, which are then used to modulate the predictor via a feature-wise linear modulation (FiLM) function.

However, the physical meaning of scaling and shifting in recommendation data is unclear.

Instead of using a fixed modulation function and deciding modulation position by expertise, we propose a modulation framework called ColdNAS for user cold-start problem, where we look for proper modulation structure, including function and position, via neural arc…
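The fixed scale-and-shift modulation the abstract refers to can be sketched as below; the linear hypernetwork and dimensions are illustrative stand-ins, and ColdNAS's point is precisely to search over this function and its position rather than fix it:

```python
import numpy as np

def film(features, gamma, beta):
    """Feature-wise linear modulation: scale and shift each channel."""
    return gamma * features + beta

def hypernet(history_embedding, w_gamma, w_beta):
    """Map a user's interaction-history embedding to FiLM parameters."""
    return history_embedding @ w_gamma, history_embedding @ w_beta

rng = np.random.default_rng(0)
emb = rng.normal(size=8)                      # hypothetical user embedding
w_g, w_b = rng.normal(size=(8, 4)), rng.normal(size=(8, 4))
gamma, beta = hypernet(emb, w_g, w_b)
out = film(np.ones(4), gamma, beta)           # modulated hidden features
```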

9 hours ago @ paperswithcode.com
💼 University and corporation labs
DeepMind
latest post 1 week, 6 days ago
An early warning system for novel AI risks

AI researchers already use a range of evaluation benchmarks to identify unwanted behaviours in AI systems, such as AI systems making misleading statements, biased decisions, or repeating copyrighted content.

Model safety evaluations, including those assessing extreme risks, will be a critical component of safe AI development and deployment.

Identifying risks early will unlock opportunities to be more responsible when training new AI systems, deploying these AI systems, transparently describing their risks, and applying appropriate cybersecurity standards.

A blueprint for embedding model evaluations for extreme risks into important decision making processes throughout model t…

1 week, 6 days ago @ deepmind.com
DeepMind’s latest research at ICLR 2023

Research towards AI models that can generalise, scale, and accelerate science. Next week marks the start of the 11th International Conference on Learning Representations (ICLR), taking place 1-5 May in Kigali, Rwanda.

We present a new approach where models learn by solving two problems in one.

We developed a new approach and open-source training data set to help models learn to explore in human-like ways over long time horizons.

On the other hand, adversarial attacks are a way of probing the limits of AI models by pushing them to create wrong or harmful outputs.

RL models also learn by trial and error which can be very data-intensive and time-consuming.

1 month, 1 week ago @ deepmind.com
How can we build human values into AI?

These questions shed light on the role played by principles – the foundational values that drive decisions big and small in AI.

For humans, principles help shape the way we live our lives and our sense of right and wrong.

A tool for fairer decision-makingA key goal for AI researchers has been to align AI systems with human values.

However, there is no consensus on a single set of human values or preferences to govern AI – we live in a world where people have diverse backgrounds, resources and beliefs.

The veil of ignorance may provide a starting point for the selection of principles with which to align AI.

1 month, 2 weeks ago @ deepmind.com
Announcing Google DeepMind

Earlier today we announced some changes that will accelerate our progress in AI and help us develop more capable AI systems safely and responsibly.

In the coming years, AI - and ultimately AGI - has the potential to drive one of the greatest social, economic and scientific transformations in history. That’s why today Sundar is announcing that DeepMind and the Brain team from Google Research will be joining forces as a single, focused unit called Google DeepMind.

By creating Google DeepMind, I believe we can get to that future faster.

Through Google DeepMind, we are bringing together our world-class talent in AI with the computing power, infrastructure and resources to create the next gene…

1 month, 2 weeks ago @ deepmind.com
Competitive programming with AlphaCode

As part of DeepMind’s mission to solve intelligence, we created a system called AlphaCode that writes computer programs at a competitive level.

AlphaCode placed at about the level of the median competitor, marking the first time an AI code generation system has reached a competitive level of performance in programming competitions.

We pre-train our model on selected public GitHub code and fine-tune it on our relatively small competitive programming dataset.

"Solving competitive programming problems is a really hard thing to do, requiring both good coding skills and problem solving creativity in humans.

AlphaCode ranked within the top 54% in real-world programming competitions, an advancem…

6 months ago @ deepmind.com
AI for the board game Diplomacy

Diplomacy is a seven-player game of negotiation and alliance formation, played on an old map of Europe partitioned into provinces, where each player controls multiple units (rules of Diplomacy).

We use Diplomacy as an analog to real-world negotiation, providing methods for AI agents to coordinate their moves.

We take our non-communicating Diplomacy agents and augment them to play Diplomacy with communication by giving them a protocol for negotiating contracts for a joint plan of action.

We call these augmented agents Baseline Negotiators, and they are bound by their agreements.

In practice, Learned Deviators occasionally break contracts late in the game, and in doing so…

6 months ago @ deepmind.com
Mastering Stratego, the classic game of imperfect information

Stratego is challenging for AI, in part, because it’s a game of imperfect information.

The machine learning approaches that work so well on perfect information games, such as DeepMind’s AlphaZero, are not easily transferred to Stratego.

The art of the bluffAs in poker, a good Stratego player must sometimes represent strength, even when weak.

See more by watching these four videos of full-length games played by DeepNash against (anonymised) human experts: Game 1, Game 2, Game 3, Game 4. “The level of play of DeepNash surprised me.

I had never heard of an artificial Stratego player that came close to the level needed to win a match against an experienced human player.

6 months, 1 week ago @ deepmind.com
DeepMind’s latest research at NeurIPS 2022

Advancing best-in-class large models, compute-optimal RL agents, and more transparent, ethical, and fair AI systems. The thirty-sixth International Conference on Neural Information Processing Systems (NeurIPS 2022) is taking place from 28 November - 9 December 2022, as a hybrid event, based in New Orleans, USA.

We updated the scaling laws of large models, showing how previously trained models were too large for the amount of training performed.

Pioneering responsibly. At the heart of DeepMind’s mission is our commitment to act as responsible pioneers in the field of AI.

We’re committed to developing AI systems that are transparent, ethical, and fair. Explaining and understanding the behavio…

6 months, 2 weeks ago @ deepmind.com
Building interactive agents in video game worlds

Learning in “the playhouse”. Our framework begins with people interacting with other people in the video game world.

Human participants set the contexts for the interactions by navigating through the world, setting goals, and asking questions for agents.

This phase was covered in two of our earlier papers, Imitating Interactive Intelligence, and Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning, which explored building imitation-based agents.

Our agents trained by RL performed much better than those trained by imitation learning alone. We asked people to evaluate our agents in online real-time interactions.

In Deep reinforcement learning from human prefere…

6 months, 2 weeks ago @ deepmind.com
Benchmarking the next generation of never-ending learners

For example, when large models are deployed, whatever they have learned on one task is seldom harnessed to facilitate their learning of the next task.

What’s more, once new data or more compute become available, large models are typically retrained from scratch – a costly, time-consuming process. This raises the question of whether we could improve the trade-off between the efficiency and performance of these large models, making them faster and more sustainable while also preserving their outstanding capabilities.

The Never-Ending Visual classification Stream (NEVIS’22) is a benchmark stream in addition to an evaluation protocol, a set of initial baselines, and an open-source codeb…

6 months, 2 weeks ago @ deepmind.com
Best practices for data enrichment

In the past 12 months, we’ve collaborated with Partnership on AI (PAI) to carefully consider these challenges, and have co-developed standardised best practices and processes for responsible human data collection.

The best practices. Following PAI’s recent white paper on Responsible Sourcing of Data Enrichment Services, we collaborated to develop our practices and processes for data enrichment.

This included the creation of five steps AI practitioners can follow to improve the working conditions for people involved in data enrichment tasks (for more details, please visit PAI’s Data Enrichment Sourcing Guidelines): Select an appropriate payment model and ensure all workers are paid above…

6 months, 3 weeks ago @ deepmind.com
The pursuit of AI education - past, present, and future

Meet Sylvia Christie, our education partnerships manager who’s played a leading role in expanding our scholarship programme, which has just celebrated its five-year anniversary.

Every academic year, we get to see the new crop of talented AI scholars become part of an international community of students and mentors.

We need to make sure that our work drives real change in the wider community and for AI education more generally.

The series also includes the short cinematic film below as a new way of speaking to audiences about the scholarship programme in a creative way.

What’re your biggest learnings now that the scholarship programme is five years old? How important collaboration is.

7 months ago @ deepmind.com
Digital transformation with Google Cloud

Applying our AI research, we’ve helped Google Cloud enhance core solutions used by their customers at scale. Alphabet’s Google Cloud empowers organisations to digitally transform themselves into smarter businesses.

Last week, many of the platform’s latest advances were shared at Next '22, Google Cloud's annual developer and tech conference about digital transformation in the cloud.

We’ve partnered with Google Cloud over the last few years to apply our AI research for making a positive impact on core solutions used by their customers.

And in recent years, we’ve partnered with Google Cloud Professional Services to positively impact the wind energy sector to help build a carbon-free fu…

7 months, 2 weeks ago @ deepmind.com
Measuring perception in AI models

So today, we’re introducing the Perception Test, a multimodal benchmark using real-world videos to help evaluate the perception capabilities of a model.

Multimodal models, such as Perceiver, Flamingo, or BEiT-3, aim to be more general models of perception.

Geolocation of crowd-sourced participants involved in filming. Learning more about the Perception Test. The Perception Test benchmark is publicly available here and further details are available in our paper.

A leaderboard and a challenge server will be available soon too. On 23 October, 2022, we’re hosting a workshop about general perception models at the European Conference on Computer Vision in Tel Aviv (ECCV 2022), where we will dis…

7 months, 4 weeks ago @ deepmind.com
How undesired goals can arise with correct rewards

Exploring examples of goal misgeneralisation – where an AI system's capabilities generalise but its goal doesn't. As we build increasingly advanced artificial intelligence (AI) systems, we want to make sure they don’t pursue undesired goals.

Such behaviour in an AI agent is often the result of specification gaming – exploiting a poor choice of what they are rewarded for.

Crucially, in contrast to specification gaming, GMG can occur even when the AI system is trained with a correct specification.

During training, there is an “expert” agent (the red blob) that visits the coloured spheres in the correct order.

This AI system does what its designers intend it to do.

8 months ago @ deepmind.com
Google
latest post 20 hours ago
Visual captions: Using large language models to augment video conferences with dynamic visuals

For example, “I would love to see it!” corresponds to visual content of “face smiling”, a visual type of “emoji”, and visual source of “public search”.

For training, we parsed each visual intent into the format of "<visual content> of <visual type> from <visual source>".

{"prompt": "<conversation context> →", "completion": "<visual content> of <visual type> from <visual source>; <visual content> of <visual type> from <visual source>; ... \n"}

Using this format, the system can handle open-vocabulary conversations and contextually predict visual content, visual source, and visual type.
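Judging from the worked example in the post ("face smiling" / "emoji" / "public search"), a record in this prompt/completion format could be assembled as in the sketch below; the function name and exact prompt wording are assumptions, not released code:

```python
def make_training_example(context, intents):
    """Build one {"prompt", "completion"} fine-tuning record.

    `intents` is a list of (visual content, visual type, visual source)
    triples, serialized with the "<content> of <type> from <source>"
    pattern the post describes. Hypothetical helper for illustration.
    """
    completion = "; ".join(
        f"{content} of {vtype} from {source}"
        for content, vtype, source in intents
    )
    return {"prompt": f"{context} \u2192", "completion": completion + "\n"}

example = make_training_example(
    "I would love to see it!",
    [("face smiling", "emoji", "public search")],
)
```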

Performance. To evaluate the utility of the trained Visual Captions model, we invited 89 participants to perform 846 tasks.

We have also deployed Visual Captions in ARChat, which facilitates video conferences in Google Meet by transcribing meetings and augme…

20 hours ago @ ai.googleblog.com
Climate Cardinals: Bridging the climate information gap with AI-powered translations

In this blog post, Sophia and her colleague Hikaru Hayakawa from Climate Cardinals talk about the importance of accessible climate education and the empowering potential of AI technologies in the fight against climate change.

Even reports from the Intergovernmental Panel on Climate Change (IPCC), which contain the most comprehensive assessments of how to reduce the impact of climate change, are only available in the six official UN languages.

In our first years of operation, we managed to translate more than 500,000 words of crucial climate information, like climate glossaries and UN documents, into more than 40 languages.

Accelerating climate action with AI. Initially, a Google Cloud team of…

1 day, 1 hour ago @ cloud.google.com
Improving search experiences with Enterprise Search in Gen App Builder

Let’s look at how Enterprise Search on Gen App Builder helps customers bypass these scale and reliability challenges, so they can start leveraging generative search quickly.

“With Enterprise Search on Gen App Builder, we are building an intelligent assistant to securely and quickly search contracts,” said Sherif Bakir, CEO of Vodafone Voice and Roaming Services.

Read more about Enterprise Search on Gen App Builder and sign up for access on our webpage.

We’re also bringing generative AI features of Enterprise Search to our existing solutions like Contact Center AI and Document AI.

To keep up with our latest generative AI news, don’t miss The Prompt or our generative AI primer for executives o…

1 day, 2 hours ago @ cloud.google.com
From Receipts to Riches: Save Money w/ Google Cloud & Supermarket Bills - Part 2

This blog series aims to demonstrate how Google Cloud products like Document AI and BigQuery can be used together to help organizations eliminate manual document processing.

In the first part of this blog series, we discussed how to digitize grocery receipts from supermarkets and analyze spending patterns using Google Cloud services.

In this next installment of the blog series, we will learn how to automatically classify different supermarket receipts using the Document AI custom classifier processor.

We will then learn how to use Google Cloud Functions (Gen2) with triggers to automate the classification of receipts as soon as they are uploaded to Google Cloud storage.

To illustrate, let's …

1 day, 21 hours ago @ cloud.google.com
AVFormer: Injecting vision into frozen speech models for zero-shot AV-ASR

Building audiovisual datasets for training AV-ASR models, however, is challenging.

With the above challenges in mind, in “AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR”, we present a simple method for augmenting existing large-scale audio-only models with visual information, at the same time performing lightweight domain adaptation.

In contrast, when the proposed two-phase curriculum is applied, our AV-ASR model performs significantly better than the baseline audio-only model.

ConclusionWe introduce AVFormer, a lightweight method for adapting existing, frozen state-of-the-art ASR models for AV-ASR.

As ASR models get larger and larger, tuning the entire parameter …

4 days, 20 hours ago @ ai.googleblog.com
Retrieval-augmented visual-language pre-training

To address these issues, in “REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory”, to appear at CVPR 2023, we introduce a visual-language model that learns to utilize a multi-source multi-modal “memory” to answer knowledge-intensive queries.

We augment a visual-language model with the ability to retrieve multiple knowledge entries from a diverse set of knowledge sources, which helps generation.

Each knowledge item is processed through a multi-modal visual-language encoder, resulting in a sequence of image and text tokens.

To train all components of the REVEAL model end-to-end, we need to warm start the model to a good state (setting initial…

5 days, 19 hours ago @ ai.googleblog.com
How to simplify unstructured data analytics using BigQuery ML and Vertex AI

Unstructured data such as images, speech and textual data can be notoriously difficult to manage, and even harder to analyze.

The new BigQuery ML inference engine empowers practitioners to run inferences on unstructured data using pre-trained AI models.

In this blog, we’ll explore how the new BigQuery ML inference engine can be used to run inferences against unstructured data in BigQuery.

The ability to run inferences in BigQuery using just SQL can make generating insights from your data using AI simple and accessible.

Natural Language Processing API: This model can be used to derive meaning from textual data stored in BigQuery tables.

5 days, 21 hours ago @ cloud.google.com
Build an image data classification model with BigQuery ML

With BigQuery’s new unstructured data analysis feature, you can now store, process, analyze, model, and predict with unstructured data, and combine it with structured data in queries.

In this blog, we will discuss the use case of storing and analyzing images of yoga poses in BigQuery, and then implement a classification model with BigQuery ML to label the poses using only SQL constructs.

This makes it a great choice for storing ML training data.

But BigQuery has expanded to perform analytics and ML on unstructured data as well: we can use SQL queries to perform insightful analysis, analytics, and ML on images, videos, audio, etc.

Image Data Classification with BigQuery ML. With first of its kin…

5 days, 21 hours ago @ cloud.google.com
Large sequence models for software development activities

Today we describe DIDACT (Dynamic Integrated Developer ACTivity), which is a methodology for training large machine learning (ML) models for software development.

We leverage instrumentation of Google's software development to scale up the quantity and diversity of developer-activity data beyond previous works.

Results are extremely promising along two dimensions: usefulness to professional software developers, and as a potential basis for imbuing ML models with general software development skills.

DIDACT is a multi-task model trained on development activities that include editing, debugging, repair, and code review.

We believe DIDACT paves a promising path towards developing agents that …

6 days, 19 hours ago @ ai.googleblog.com
AI in software development: What you need to know

Moreover, the development of AI itself requires human input, including data scientists, machine learning engineers, and software developers.

Myth: Training custom AI models is too expensive and resource intensive.

Another option is to use cloud-based machine learning platforms that offer scalable infrastructure and pre-built tools and frameworks for model development.

AI-powered solutions are being used in healthcare, finance, manufacturing, transportation, and many other fields, and the use of AI applications is only expected to grow.

Myth: No-code/low-code AI platforms are only for non-technical users.

1 week, 4 days ago @ cloud.google.com
Foundation models for reasoning on charts

For math reasoning pre-training, we pick textual numerical reasoning datasets and render the input into images, which the image-to-text model needs to decode for answers.

We also propose “DePlot: One-shot visual language reasoning by plot-to-table translation”, a model built on top of MatCha for one-shot reasoning on charts via translation to tables.

To collect sufficient pre-training data, we independently accumulate [chart, code] and [chart, table] pairs.

Math reasoningWe incorporate numerical reasoning knowledge into MatCha by learning math reasoning skills from textual math datasets.

We use two existing textual math reasoning datasets, MATH and DROP for pre-training.

1 week, 4 days ago @ ai.googleblog.com
Barkour: Benchmarking animal-level agility with quadruped robots

Yet, while researchers have enabled robots to hike or jump over some obstacles, there is still no generally accepted benchmark that comprehensively measures robot agility or mobility.

In “Barkour: Benchmarking Animal-level Agility with Quadruped Robots”, we introduce the Barkour agility benchmark for quadruped robots, along with a Transformer-based generalist locomotion policy.

By providing a diverse and challenging obstacle course, the Barkour benchmark encourages researchers to develop locomotion controllers that move fast in a controllable and versatile way.

Measuring robustness of the different policies across a large number of runs on the Barkour benchmark.

Conclusion: We believe that de…

1 week, 4 days ago @ ai.googleblog.com
How to integrate a Virtual Agent using Google Dialogflow ES into a Twilio Conversation

Twilio Flex is an omni-channel CCaaS (Contact Center as a Service) that makes it easy to build personalized support that’s unique to your business.

In this post, we will show you how to integrate Flex’s asynchronous channels with Google Dialogflow.

Flex uses Twilio Conversations to natively support conversational messaging use cases such as customer support and conversational commerce via SMS, MMS, WhatsApp, Chat, GBM and FBM.

Virtual agents can handle a high volume of simple, repetitive tasks, such as answering frequently asked questions, freeing up human agents to focus on more complex interactions.

Additionally, virtual agents can offer 24/7 availability, quick response times, and person…

1 week, 4 days ago @ cloud.google.com
Differentially private clustering for large-scale datasets

To ensure privacy in a rigorous sense, one solution is to develop differentially private (DP) clustering algorithms.

This code brings differentially private k-means clustering to large-scale datasets using distributed computing.
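The core mechanism can be sketched in a few lines: each Lloyd update releases per-cluster counts and coordinate sums with Laplace noise, so no single point can noticeably change the output. This is a minimal, non-distributed sketch that assumes data scaled into [0, 1]; `dp_kmeans_step` and `laplace` are illustrative names, not the library's API.

```python
import math
import random

random.seed(0)

def laplace(scale):
    """Sample Laplace(0, scale) noise via inverse-CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_kmeans_step(points, centers, epsilon):
    """One noisy Lloyd update: assign points, then release noisy
    per-cluster counts and coordinate sums."""
    k, d = len(centers), len(points[0])
    counts = [0.0] * k
    sums = [[0.0] * d for _ in range(k)]
    for p in points:
        j = min(range(k), key=lambda c: sum((p[t] - centers[c][t]) ** 2 for t in range(d)))
        counts[j] += 1
        for t in range(d):
            sums[j][t] += p[t]
    new_centers = []
    for j in range(k):
        # half the budget protects the count, half the coordinate sums
        n = max(1.0, counts[j] + laplace(2.0 / epsilon))
        new_centers.append([(sums[j][t] + laplace(2.0 * d / epsilon)) / n for t in range(d)])
    return new_centers

points = [[0.1, 0.1], [0.2, 0.15], [0.8, 0.9], [0.85, 0.8]]
centers = dp_kmeans_step(points, [[0.0, 0.0], [1.0, 1.0]], epsilon=5.0)
print(centers)
```

With smaller epsilon the noise scale grows, which is why large per-cluster counts (and therefore large-scale data) make DP clustering practical.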

Differentially private hierarchical clustering: Hierarchical clustering is a popular clustering approach that consists of recursively partitioning a dataset into clusters at an increasingly finer granularity.

Large-scale differentially private clustering: We now switch gears and discuss our work for metric space clustering.

Vaccination search insights via DP clustering: We then apply these advances in differentially private clustering to real-world applications.

1 week, 5 days ago @ ai.googleblog.com
Google Research at I/O 2023

Wednesday, May 10th was an exciting day for the Google Research community as we watched the results of months and years of our foundational and applied work get announced on the Google I/O stage.

So today, we’re excited to reveal more about the research efforts behind some of the many exciting announcements at this year's I/O.

This family of models is being incorporated into multiple Google products: image generation in Google Slides and Android’s Generative AI wallpaper are powered by our text-to-image generation features.

I/O Flip, a digital take on a classic card game, features Google developer mascots on cards that were entirely AI generated.

Phenaki: Phenaki, Google’s Transform…

1 week, 5 days ago @ ai.googleblog.com
OpenAI
latest post 6 days, 6 hours ago
OpenAI cybersecurity grant program

We are launching the Cybersecurity Grant Program—a $1M initiative to boost and quantify AI-powered cybersecurity capabilities and to foster high-level AI and cybersecurity discourse. Our goal is to work with defenders across the globe to change the power dynamics of cybersecurity through the application of AI and the coordination of like-minded individuals working for our collective safety.

Our program seeks to: Empower defenders: We would like to ensure that cutting-edge AI capabilities benefit defenders first and most.

Measure capabilities: We are working to develop methods for quantifying the cybersecurity capabilities of AI models, in order to better understand and improve their effec…

6 days, 6 hours ago @ openai.com
Improving Mathematical Reasoning with Process Supervision

9. I get (sin 100° + 2(sin 180° cos 20° + cos 180° sin 20°)) / cos 100°.

13. I get ((sin 90° cos 10° + cos 90° sin 10°) − 2 sin 20°) / (cos 90° cos 10° − sin 90° sin 10°).

15. I get (cos 10° − 2 sin 20°) / (−sin 10°).

19. I get (2(sin 30° cos 10° − cos 30° sin 10°) − cos 10°) / sin 10°.

21. I get (cos 10° − √3 sin 10° − cos 10°) / sin 10°.
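These simplification steps can be checked numerically (a sketch; `s` and `c` are hypothetical degree-based shorthands, and the exercise's full prompt is not reproduced in this excerpt):

```python
import math

# degree-based sin/cos shorthands for readability
def s(a):
    return math.sin(math.radians(a))

def c(a):
    return math.cos(math.radians(a))

# step 9: (sin 100° + 2(sin 180° cos 20° + cos 180° sin 20°)) / cos 100°
step9 = (s(100) + 2 * (s(180) * c(20) + c(180) * s(20))) / c(100)
# step 15: (cos 10° − 2 sin 20°) / (−sin 10°)
step15 = (c(10) - 2 * s(20)) / (-s(10))
# step 21: (cos 10° − √3 sin 10° − cos 10°) / sin 10°
step21 = (c(10) - math.sqrt(3) * s(10) - c(10)) / s(10)

# each step evaluates to the same constant, -sqrt(3)
print(step9, step15, step21)
```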

1 week ago @ openai.com
Democratic Inputs to AI

We believe that decisions about how AI behaves should be shaped by diverse perspectives reflecting the public interest. Laws encode values and norms to regulate behavior.

Beyond a legal framework, AI, much like society, needs more intricate and adaptive guidelines for its conduct.

For example: under what conditions should AI systems condemn or criticize public figures, given different opinions across groups regarding those figures?

Should AI by default reflect the persona of a median individual in the world, the user’s country, the user’s demographic, or something entirely different?

We are seeking teams from across the world to develop proof-of-concepts for a democratic process t…

1 week, 6 days ago @ openai.com
Governance of superintelligence

First, we need some degree of coordination among the leading development efforts to ensure that the development of superintelligence occurs in a manner that allows us to both maintain safety and help smooth integration of these systems with society.

And of course, individual companies should be held to an extremely high standard of acting responsibly.

Tracking compute and energy usage could go a long way, and give us some hope this idea could actually be implementable.

Third, we need the technical capability to make a superintelligence safe.

This is an open research question that we and others are putting a lot of effort into.

2 weeks, 2 days ago @ openai.com
Introducing the ChatGPT app for iOS

Since the release of ChatGPT, we've heard from users that they love using ChatGPT on the go.

Today, we’re launching the ChatGPT app for iOS.

The ChatGPT app is free to use and syncs your history across devices.

ChatGPT Plus subscribers get exclusive access to GPT-4’s capabilities, early access to features and faster response times, all on iOS.

With the ChatGPT app for iOS, we’re taking another step towards our mission by transforming state-of-the-art research into useful tools that empower people, while continuously making them more accessible. P.S.

2 weeks, 6 days ago @ openai.com
Language models can explain neurons in language models

Language models have become more capable and more broadly deployed, but our understanding of how they work internally is still very limited.

Interpretability research aims to uncover additional information by looking inside the model.

One simple approach to interpretability research is to first understand what the individual components (neurons and attention heads) are doing.

We propose an automated process that uses GPT-4 to produce and score natural language explanations of neuron behavior and apply it to neurons in another language model.

This work is part of the third pillar of our approach to alignment research: we want to automate the alignment research work itself.

4 weeks, 1 day ago @ openai.com
New ways to manage your data in ChatGPT

As an AI language model, I can perform a wide range of tasks, including but not limited to: answering questions and providing general information on various topics; offering recommendations or suggestions; assisting with problem-solving and brainstorming ideas; creating written content, such as articles, essays, stories, and poems; proofreading and editing text for grammar, punctuation, and style; explaining complex concepts in simpler terms; assisting with language learning and providing translations; generating conversation on various topics; helping with programming and coding-related tasks; and providing summaries of articles or documents. Please note that my knowledge is based on the information avail…

1 month, 1 week ago @ openai.com
Announcing OpenAI’s Bug Bounty Program

OpenAI’s mission is to create artificial intelligence systems that benefit everyone.

To that end, we invest heavily in research and engineering to ensure our AI systems are safe and secure.

However, as with any complex technology, we understand that vulnerabilities and flaws can emerge.

We are excited to build on our coordinated disclosure commitments by offering incentives for qualifying vulnerability information.

Your expertise and vigilance will have a direct impact on keeping our systems and users secure.

1 month, 3 weeks ago @ openai.com
Our approach to AI safety

Despite extensive research and testing, we cannot predict all of the beneficial ways people will use our technology, nor all the ways people will abuse it.

That’s why we believe that learning from real-world use is a critical component of creating and releasing increasingly safe AI systems over time.

We cautiously and gradually release new AI systems—with substantial safeguards in place—to a steadily broadening group of people and make continuous improvements based on the lessons we learn.

Crucially, we believe that society must have time to update and adjust to increasingly capable AI, and that everyone who is affected by this technology should have a significant say in how AI devel…

2 months ago @ openai.com
March 20 ChatGPT outage: Here’s what happened

These emails contained the last four digits of another user’s credit card number, but full credit card numbers did not appear.

It’s possible that a small number of subscription confirmation emails might have been incorrectly addressed prior to March 20, although we have not confirmed any instances of this.

The issue affected users who, in ChatGPT, clicked on “My account,” then “Manage my subscription” between 1 a.m. and 10 a.m. Pacific time on Monday, March 20.

During this window, another active ChatGPT Plus user’s first and last name, email address, payment address, the last four digits (only) of a credit card number, and credit card expiration date might have been visible.

It’s possible that this also co…

2 months, 2 weeks ago @ openai.com
ChatGPT plugins

In line with our iterative deployment philosophy, we are gradually rolling out plugins in ChatGPT so we can study their real-world use, impact, and safety and alignment challenges—all of which we’ll have to get right in order to achieve our mission.

Users have been asking for plugins since we launched ChatGPT (and many developers are experimenting with similar ideas) because they unlock a vast range of possible use cases.

We’re starting with a small set of users and are planning to gradually roll out larger-scale access as we learn more (for plugin developers, ChatGPT users, and after an alpha period, API users who would like to integrate plugins into their products).

Plugin developer…

2 months, 2 weeks ago @ openai.com
GPTs are GPTs: An early look at the labor market impact potential of large language models

We investigate the potential implications of Generative Pre-trained Transformer (GPT) models and related technologies on the U.S. labor market.

Using a new rubric, we assess occupations based on their correspondence with GPT capabilities, incorporating both human expertise and classifications from GPT-4.

The influence spans all wage levels, with higher-income jobs potentially facing greater exposure.

Notably, the impact is not limited to industries with higher recent productivity growth.

We conclude that Generative Pre-trained Transformers exhibit characteristics of general-purpose technologies (GPTs), suggesting that these models could have notable economic, social, and policy implicati…

2 months, 3 weeks ago @ openai.com
GPT-4

User: How do I solve the system of linear equations 3x + 2y = 7, 9x − 4y = 1? GPT-4: Let's start by analyzing the equations.

Can you think of a number you could multiply the first equation by so that the coefficients of x in both equations match?

If you subtract equation 2 from equation 1, what will the resulting equation look like?

Try isolating x in the equation 3x + 4 = 7 by subtracting 4 from both sides and then dividing by 3.

So, the solution to the system of linear equations is x = 1 and y = 2.
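The elimination walk-through above can be reproduced directly (a short sketch of the same arithmetic):

```python
# Solve 3x + 2y = 7 and 9x - 4y = 1 by elimination, following the steps above.

# Multiply the first equation by 3 so the x coefficients match: 9x + 6y = 21.
b1, c1 = 3 * 2, 3 * 7

# Subtract the second equation (9x - 4y = 1): (6 - (-4))y = 21 - 1, so 10y = 20.
b2, c2 = -4, 1
y = (c1 - c2) / (b1 - b2)

# Back-substitute into 3x + 2y = 7: 3x + 4 = 7, so x = 1.
x = (7 - 2 * y) / 3

print(x, y)  # 1.0 2.0
```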

2 months, 3 weeks ago @ openai.com
Introducing ChatGPT and Whisper APIs

Model: The ChatGPT model family we are releasing today, gpt-3.5-turbo, is the same model used in the ChatGPT product.

It is priced at $0.002 per 1k tokens, which is 10x cheaper than our existing GPT-3.5 models.

It’s also our best model for many non-chat use cases—we’ve seen early testers migrate from text-davinci-003 to gpt-3.5-turbo with only a small amount of adjustment needed to their prompts.

API: Traditionally, GPT models consume unstructured text, which is represented to the model as a sequence of “tokens.” ChatGPT models instead consume a sequence of messages together with metadata.

We’ve created a new endpoint to interact with our ChatGPT models:
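The message-based format looks roughly like this (a sketch of the payload shape for the chat completions endpoint, POST https://api.openai.com/v1/chat/completions; the message values are illustrative):

```python
import json

# Chat models consume a list of role-tagged messages rather than raw text.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize this article in one sentence."},
    ],
}
body = json.dumps(payload)
print(body)

# At $0.002 per 1k tokens, a 1,500-token request costs about $0.003.
tokens = 1500
cost = tokens / 1000 * 0.002
```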

3 months, 1 week ago @ openai.com
Planning for AGI and beyond

The short term: There are several things we think are important to do now to prepare for AGI.

We believe this is the best way to carefully steward AGI into existence—a gradual transition to a world with AGI is better than a sudden one.

As our systems get closer to AGI, we are becoming increasingly cautious with the creation and deployment of our models.

We hope to contribute to the world an AGI aligned with such flourishing.

3 months, 1 week ago @ openai.com
Microsoft
latest post 1 week ago
3D telemedicine brings better care to underserved and rural communities, even across continents

Yet 2D telemedicine (2DTM) fails to fully replicate the experience of a face-to-face consultation.

Figure 1: A patient participates in a consultation with doctors using the 3D Telemedicine system.

Figure 2: In three studies produced during a trial in Scotland, 3D telemedicine outperformed 2D telemedicine in satisfaction, realism and quality, with a direct correlation between realism and satisfaction.

That began the collaboration on the next phase of the project and the installation of the first known 3D telemedicine system on the African continent.

Figure 4: 3DTM enables better planning, safety, and integration among the international team, plus better patient education and follow-up care.

1 week ago @ microsoft.com
Research Focus: Week of May 22, 2023

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

Spotlight: Microsoft Research Podcast. AI Frontiers: AI for health and the future of research with Peter Lee. Peter Lee, head of Microsoft Research, and Ashley Llorens, AI scientist and engineer, discuss the future of AI research and the potential for GPT-4 as a medical copilot.

Listen now. NEW RESEARCH: DNA storage in thermoresponsive microcapsules for repeated random multiplexed data access. As the world generates more and more data, data storage capacity has not kept pace.

In DNA data storage, a large amount…

1 week, 6 days ago @ microsoft.com
REACT — A synergistic cloud-edge fusion architecture

Figure 1(b): REACT uses asynchronous cloud detections to correct the box labels and detect more objects.

We illustrate our fusion approach in REACT for object detection in videos.

First, since the sequence of video frames is spatiotemporally correlated, it suffices to call edge object detection only once every few frames.
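The frame-skipping idea reads roughly as follows (an illustrative stub; REACT's actual edge-cloud fusion and tracking are more involved, and these function names are hypothetical):

```python
DETECT_EVERY = 5  # run the expensive detector once every 5 frames

def edge_detect(frame):
    # stand-in for a real on-device object detector
    return [("object", (0, 0, 10, 10))]

def track(prev_boxes, frame):
    # stand-in for a cheap tracker that propagates boxes between detections
    return prev_boxes

def process(frames):
    boxes, detector_calls, results = [], 0, []
    for i, frame in enumerate(frames):
        if i % DETECT_EVERY == 0:
            boxes = edge_detect(frame)
            detector_calls += 1
        else:
            boxes = track(boxes, frame)
        results.append(boxes)
    return results, detector_calls

results, detector_calls = process(range(20))
print(detector_calls)  # detector ran only on frames 0, 5, 10, 15
```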

Here’s a more detailed description of how REACT goes about combining the edge and the cloud detections.

Also, we noted that edge and cloud models can complement each other, and overall performance improves due to our edge-cloud fusion algorithm.

2 weeks, 5 days ago @ microsoft.com
Achieving Zero-COGS with Microsoft Editor Neural Grammar Checker

In 2022, Microsoft released a highly optimized version of the Microsoft Editor neural grammar checker on expanded endpoints in Word Win32, Word Online, Outlook Online, and the Editor Browser Extension.

Innovation: Aggressive Decoding. Behind the AI-powered grammar checker in Microsoft Editor is the transformer model, enhanced by cutting-edge research innovations [1, 2, 3] from MSR for grammar correction.

ONNX Runtime custom operator capability allows users to implement their own operators to run within ONNX Runtime with more flexibility.

ONNX Runtime shows advantages for on-device deployment along with its lightweight engine and comprehensive client-inference focused solutions, such as ONNX Runt…

2 weeks, 5 days ago @ microsoft.com
Large-language models for automatic cloud incident management

This paper, Recommending Root-Cause and Mitigation Steps for Cloud Incidents using Large Language Models, focuses on using state-of-the-art large language models (LLMs) to help generate recommendations for cloud incident root cause analysis and mitigation plans.

Listen now. Adapting large-language models for automated incident management: Recent breakthroughs in AI have enabled LLMs to develop a rich understanding of natural language.

Given the complexities of incident management, we sought to evaluate the effectiveness of LLMs in analyzing the root cause of production incidents and generating mitigation steps.

We find that fine-tuning the GPT-3 and GPT-3.5 models significantly improves the eff…

3 weeks ago @ microsoft.com
Highlights from CHI 2023

The ACM CHI Conference on Human Factors in Computing Systems (CHI) is a renowned meeting ground for top talent in the HCI field and a showcase for some of its most compelling work.

Researchers conducted a within-subjects experiment with 32 participants using the device and not using the device (control).

To address this problem, researchers study the effects of automatically scheduling time for focused work on people’s work calendars using the “focus time” feature on Outlook calendars.

The researchers found that the treatment participants showed higher wellbeing, including increased excitement, relaxation, and satisfaction, and decreased anger, frustration, tiredness, and stress.

The researcher…

3 weeks, 1 day ago @ microsoft.com
Microsoft at EuroSys 2023: Systems innovation across the stack to help support an easier, faster, safer, and smarter cloud

EuroSys 2023 is the premier systems conference in Europe, and 2023 marks its 18th edition.

As in previous years, Microsoft has a strong presence in the conference, drawing from research and production teams in Asia, Europe, and the United States, including Azure Systems Research, in collaboration with many universities.

This work spans areas including systems for machine learning, serverless computing, datacenter networking, caching, and debugging.

Serverless computing: The paper “Palette Load Balancing: Locality Hints for Serverless Functions” proposes adding locality to Function-as-a-Service (FaaS) serverless systems, closing the performance gap between serverful da…

3 weeks, 4 days ago @ microsoft.com
Research Focus: Week of May 8, 2023

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

To learn more, see the Microsoft Research Summit presentation Statistical Imaginaries: An Ode to Responsible Data Science or the publications Differential Perspectives: Epistemic Disconnects Surrounding the U.S. Census Bureau’s Use of Differential Privacy.

AWARD: Microsoft’s Nicole Immorlica receives 2023 SIGecom Test of Time Award. Nicole Immorlica, a Senior Principal Researcher with Microsoft Research New England, has been awarded the 2023 SIGecom Test of Time Award for her work on a 2005 paper on matchin…

3 weeks, 6 days ago @ microsoft.com
Using generative AI to imitate human behavior

Diffusion models have emerged as a powerful class of generative AI models.

In our new paper, Imitating Human Behaviour with Diffusion Models, we explore how they can be used to imitate human behavior in interactive environments.

Spotlight: On-demand video. AI Explainer: Foundation models and the next era of AI. Explore how the transformer architecture, larger models and more data, and in-context learning have helped advance AI from perception to creation.

Diffusion models are a specific class of generative models that are both stable to train and easy to sample from.

Our work adapts ideas that have been developed for text-to-image diffusion models, to this new paradigm of observation-to-actio…

1 month ago @ microsoft.com
Inferring rewards through interaction

The interaction-grounded learning (IGL) paradigm from Microsoft Research enables agents to infer rewards through the very process of interaction, utilizing diverse feedback signals rather than explicit numeric rewards.

Despite the absence of an explicit reward signal, the feedback is assumed to depend on a binary latent reward, and the agent learns a policy that maximizes this unseen latent reward using environmental feedback.

Recommender systems help people navigate increasing volumes of content offerings by providing personalized content suggestions.

A growing body of work shows that recommender systems don’t provide consistently good recommendations across demographic groups.

This enables the us…

1 month ago @ microsoft.com
Collaborators: Gov4git with Petar Maymounkov and Kasia Sitkiewicz

Today, I’m joined by our first two guests, Petar Maymounkov and Kasia Sitkiewicz.

And what I do at GitHub, uh, I work as a product manager.

Right now, um, communities, there are few ways of like how they make decisions, either majority of the votes or through consensus.

One nice side benefit from this entire project is that Gov4git, uh, enables people to like reflect on what they’ve done and, and what is happening.

[MUSIC]HUIZINGA: Petar and Kasia, thank you so much for coming on the show today and being our first guests on the Collaborators podcast.

1 month ago @ microsoft.com
AI self-play for algorithm design

While humans outperform AI systems at designing such algorithms, we show how to improve AI programming abilities using self-play, a technique that has helped AI systems dominate in games such as chess and Go.

Designing fast and accurate algorithms requires high-level abstract reasoning, which remains difficult for AI systems.

Examples of programming puzzles for AI self-play: Each puzzle is specified by a short Python program that checks a possible answer.

Here are the programming puzzle and solution. Example 2: String challenge. This concise puzzle perplexes AI systems, although humans find it simple.

In contrast, AI systems usually need multiple attempts.
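In this style, a puzzle is just a Python function that verifies a candidate answer, and the solver's job is to produce an input the checker accepts. The `sat` below is an illustrative stand-in, not necessarily the paper's exact puzzle:

```python
# A puzzle is a short Python program that checks a possible answer.
def sat(s: str) -> bool:
    """Find a string with 1000 'o' characters but no two adjacent."""
    return s.count("o") == 1000 and "oo" not in s

# One valid solution: alternate 'o' with another character.
solution = "ox" * 1000

print(sat(solution))  # True
```

Self-play then works by having the model both propose new `sat`-style puzzles and attempt to solve them, using the checker itself as the training signal.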

1 month ago @ microsoft.com
MIT AI
latest post 16 hours ago
Q&A: Gabriela Sá Pessoa on Brazilian politics, human rights in the Amazon, and AI

Gabriela Sá Pessoa is a journalist passionate about the intersection of human rights and climate change.

Q: One focus of your reporting is human rights and environmental issues in the Amazon.

This is similar to the United States — like many people here, they don't see how they could be related to the human rights violations and the destruction of the rainforest that are happening.

It's a huge matter of human rights because our living depends on that, both locally and globally.

Both Marina and Sonia are global ecological and human rights champions, and I wonder what the impact would be if Congress ratifies these changes.

16 hours ago @ news.mit.edu
Scaling audio-visual learning without labels

A joint and coordinated approach: The CAV-MAE works by “learning by prediction” and “learning by comparison,” says Gong.

Contrastive learning aims to map similar representations close to each other.
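That objective is typically an InfoNCE-style loss, sketched here on toy 2-D embeddings (illustrative names and values, not the paper's implementation):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce(anchor, positive, negatives, temperature=0.1):
    """-log softmax score of the positive pair among all candidates."""
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    m = max(logits)  # stabilize the log-sum-exp
    log_denom = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_denom - logits[0]

anchor = [1.0, 0.0]                     # e.g., an audio embedding
aligned = [0.9, 0.1]                    # visual embedding from the same clip
mismatched = [[-1.0, 0.0], [0.0, 1.0]]  # visual embeddings from other clips

loss_good = info_nce(anchor, aligned, mismatched)
loss_bad = info_nce(anchor, mismatched[0], [aligned, mismatched[1]])
print(loss_good < loss_bad)  # True: aligned pairs incur lower loss
```

Minimizing this loss pulls audio and visual embeddings from the same clip together while pushing mismatched pairs apart.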

Overall, they found that contrastive learning and masked data modeling are complementary methods.

“Before this work, these methods are used separately, but after this work, I see that most of the audio-visual learning frameworks use contrastive loss and the masked autoencoder together, implicitly or explicitly.” Bringing self-supervised audio-visual learning into our world: The researchers see their contribution of the contrastive audio-visual masked autoencoder (CAV-MAE) as an important mil…

1 day, 16 hours ago @ news.mit.edu
Driven to driverless

At the Indy Autonomous Challenge in November, MIT-PITT-RW was the only entirely student-run team out of nine teams.

Nothing has ever brought us down.” Getting accepted into MIT and joining the Driverless team was her first step toward repairing disparities in transportation.

Under the auspices of the MIT Edgerton Center, MIT Driverless develops its own artificial intelligence software to race in autonomous driving competitions.

Leveraging talent and resources, Driverless teamed up with the University of Pittsburgh, Rochester Institute of Technology (RIT), and the University of Waterloo, Canada, to form MIT-PITT-RW and compete in the Indy Autonomous Challenge.

While doing research, she dis…

6 days, 17 hours ago @ news.mit.edu
A more effective way to train machines for uncertain, real-world situations

Striking a balance: Many existing methods that seek to strike a balance between imitation learning and reinforcement learning do so through brute-force trial and error.

Their solution involves training two students: one with a weighted combination of reinforcement learning and imitation learning, and a second that can only use reinforcement learning to learn the same task.

The main idea is to automatically and dynamically adjust the weighting of the reinforcement and imitation learning objectives of the first student.

The researchers’ algorithm continually compares the two students.

Their method outperformed others that used either only imitation learning or only reinforcement learning.
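The weighting scheme described above can be caricatured in a few lines (the names and this particular update rule are illustrative, not the paper's algorithm):

```python
# Toy sketch of the two-student idea: the first student optimizes
# w * imitation + (1 - w) * reinforcement, and w is nudged by comparing
# its return against an RL-only student.

def update_weight(w, combined_return, rl_only_return, step=0.05):
    if rl_only_return > combined_return:
        # pure RL is winning: rely less on the (possibly imperfect) teacher
        return max(0.0, w - step)
    # the combined student is winning: imitation is still paying off
    return min(1.0, w + step)

w = 0.5
episodes = [(1.0, 0.2), (1.1, 0.4), (0.9, 1.3)]  # (combined, RL-only) returns
for combined, rl_only in episodes:
    w = update_weight(w, combined, rl_only)
print(round(w, 2))  # 0.55
```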

1 week ago @ news.mit.edu
New tool helps people choose the right method for evaluating AI models

“Saliency cards are designed to give a quick, glanceable summary of a saliency method and also break it down into the most critical, human-centric attributes.

For instance, one saliency method, known as integrated gradients, compares the importance of features in an image to a meaningless baseline.
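
The baseline comparison behind integrated gradients can be shown on a toy differentiable function. This is a generic sketch of the method (not the saliency-card authors' code), using a standard Riemann-sum approximation of the path integral.

```python
# Integrated gradients on a toy model: IG_i = (x_i - b_i) * integral over
# alpha in [0, 1] of df/dx_i(b + alpha * (x - b)), approximated numerically.

def integrated_gradients(grad_f, x, baseline, steps=1000):
    n = len(x)
    ig = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps   # midpoint rule along the path
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad_f(point)
        for i in range(n):
            ig[i] += g[i] * (x[i] - baseline[i]) / steps
    return ig

# Toy model f(x) = sum(x_i^2); its gradient is 2x.
grad_f = lambda p: [2.0 * v for v in p]
attributions = integrated_gradients(grad_f, x=[1.0, 2.0], baseline=[0.0, 0.0])
# Completeness check: the attributions sum to f(x) - f(baseline) = 5.0.
print([round(a, 3) for a in attributions])
```

The "meaningless baseline" the article mentions is the all-zeros input here; image applications typically use a black image.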

For example, one attribute is hyperparameter dependence, which measures how sensitive that saliency method is to user-specified parameters.

They also want to develop a better understanding of how people perceive saliency method outputs, which could lead to better visualizations.

“We are really hopeful that these will be living documents that grow as new saliency methods and evaluations are develo…

1 week ago @ news.mit.edu
Celebrating the impact of IDSS

Nobel Prize winner and IDSS affiliate Professor Esther Duflo spoke on large scale immunization efforts, former MLK Visiting Professor Craig Watkins joined a panel on equity and justice in AI, and IDSS Associate Director Alberto Abadie discussed synthetic controls for policy evaluation.

Other policy questions were explored through lightning talks, including those by students from the Technology and Policy Program (TPP) within IDSS.

A place to call home: The list of IDSS accomplishments over the last eight years is long and growing.

“While we may be working on very different application areas, the core methodologies, such as mathematical tools for data science and probability optimization, crea…

1 week, 4 days ago @ news.mit.edu
Using AI, scientists find a drug that could combat drug-resistant infections

Using an artificial intelligence algorithm, researchers at MIT and McMaster University have identified a new antibiotic that can kill a type of bacteria that is responsible for many drug-resistant infections.

The researchers identified the new drug from a library of nearly 7,000 potential drug compounds using a machine-learning model that they trained to evaluate whether a chemical compound will inhibit the growth of A. baumannii.

Drug discovery: Over the past several decades, many pathogenic bacteria have become increasingly resistant to existing antibiotics, while very few new antibiotics have been developed.

“Antibiotics often have to be administered systemically, and the last thing you wa…

1 week, 5 days ago @ news.mit.edu
Probabilistic AI that knows how well it’s working

Autonomous driving systems can fail to perceive pedestrians and emergency vehicles right in front of them, with fatal consequences.

Conversational AI systems confidently make up facts and, after training via reinforcement learning, often fail to give accurate estimates of their own uncertainty.

Working together, researchers from MIT and the University of California at Berkeley have developed a new method for building sophisticated AI inference algorithms that simultaneously generate collections of probable explanations for data, and accurately estimate the quality of these explanations.

SMC algorithms are an established set of algorithms that have been widely used for uncertainty-calibrated…
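
The core SMC loop can be illustrated with a minimal particle step on a toy 1-D model; this is a generic sequential Monte Carlo sketch in the spirit described above, not the MIT/Berkeley system.

```python
import math
import random

# Minimal SMC step: weight particles by how well they explain an observation
# under a toy Gaussian likelihood, then resample. The model is illustrative.

def smc_step(particles, observation, noise=1.0):
    weights = [math.exp(-((p - observation) ** 2) / (2 * noise ** 2))
               for p in particles]
    total = sum(weights)                 # before normalizing, the mean weight
    weights = [w / total for w in weights]   # estimates the data's likelihood,
    # which is how SMC can score the quality of its own explanations.
    resampled = random.choices(particles, weights=weights, k=len(particles))
    return resampled, weights

particles = [-2.0, 0.0, 1.0, 3.0]
resampled, w = smc_step(particles, observation=1.2)
print(max(w))   # the particle nearest the observation dominates
```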

1 week, 5 days ago @ news.mit.edu
Researchers use AI to identify similar materials in images

The researchers' technique can also be used to select similar materials in a video.

Before the researchers could develop an AI method to learn how to select similar materials, they had to overcome a few hurdles.

The researchers rendered their own synthetic dataset of indoor scenes, which included 50,000 images and more than 16,000 materials randomly applied to each object.

Synthetic dataset in hand, they trained a machine-learning model for the task of identifying similar materials in real images — but it failed.

The system the researchers developed to identify similar materials is robust to changes in lighting conditions, as seen in this example of match heads burning.

2 weeks, 1 day ago @ news.mit.edu
Using data to write songs for progress

“If you don’t project your voice, how are people going to hear you when you perform?” Gurumurthy recalls her conductor telling her.

Her father is a management consultant and her mother has experience as an investment banker.

I’ve been really fortunate to see the power of mathematical analysis firsthand.” “I have come to realize that the constructive use of technology could be a powerful voice of resistance against injustice,” she says.

“Operatic performance has given me the ability to truly step into my character and convey powerful emotions in my performance.

In the process, I have realized that my voice is most powerful when it reflects my true convictions, whether I am performing or publi…

2 weeks, 3 days ago @ news.mit.edu
Is medicine ready for AI? Doctors, computer scientists, and policymakers are cautiously optimistic

The advent of generative artificial intelligence models like ChatGPT has prompted renewed calls for AI in health care, and its support base only appears to be broadening.

The second annual MIT-MGB AI Cures Conference, hosted on April 24 by the Abdul Latif Jameel Clinic for Machine Learning in Health (Jameel Clinic), saw its attendance nearly double this year, with over 500 attendees from an array of backgrounds in computer science, medicine, pharmaceuticals, and policy.

“Is AI going to be the thing that cures everything with our ailing health care system?” asked newly inaugurated Massachusetts Secretary of Health and Human Services Kate Walsh.

“We absolutely have to do better ... AI can loo…

2 weeks, 6 days ago @ news.mit.edu
An AI challenge only humans can solve

In it, they examine who reaped the rewards from past innovations and who may gain from AI today, economically and politically.

“The book is about the choices we make with technology,” Johnson says.

That, we think, is undermining prosperity in the U.S. and around the world.” A call for “machine usefulness,” not “so-so automation”: What do Acemoglu and Johnson think is deficient about AI?

But Acemoglu and Johnson contend that many AI programs are less agile than the human mind, and suboptimal replacements for it, even as AI is designed to replace human work.

China deploys AI to create “social credit” scores for citizens, along with heavy surveillance, while tightly restricting freedom of expres…

3 weeks ago @ news.mit.edu
A better way to study ocean currents

To study ocean currents, scientists release GPS-tagged buoys in the ocean and record their velocities to reconstruct the currents that transport them.

The researchers developed a new model that incorporates knowledge from fluid dynamics to better reflect the physics at work in ocean currents.

Diving into the data: Oceanographers use data on buoy velocity to predict ocean currents and identify “divergences” where water rises to the surface or sinks deeper.

Buoyant performance: They evaluated the new model using synthetic and real ocean buoy data.

Because the synthetic data were fabricated by the researchers, they could compare the model’s predictions to ground-truth currents and divergences.
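
The "divergence" the article refers to is a standard fluid-dynamics quantity: for a 2-D velocity field (u, v), divergence = du/dx + dv/dy. The sketch below estimates it with finite differences on a synthetic field; it is a textbook illustration, not the researchers' model.

```python
# Estimate the divergence of a 2-D velocity field at a point using
# central finite differences. The field below is synthetic.

def divergence(u, v, x, y, h=1e-5):
    dudx = (u(x + h, y) - u(x - h, y)) / (2 * h)
    dvdy = (v(x, y + h) - v(x, y - h)) / (2 * h)
    return dudx + dvdy

# Synthetic current: u = x, v = -0.5 * y  ->  divergence = 1 - 0.5 = 0.5
u = lambda x, y: x
v = lambda x, y: -0.5 * y
print(round(divergence(u, v, 0.3, 0.7), 6))
```

Positive divergence at a point corresponds to upwelling (water rising), negative to downwelling, which is what the buoy model tries to identify.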

3 weeks ago @ news.mit.edu
Joining the battle against health care bias

Medical researchers are awash in a tsunami of clinical data.

International collaborations such as MIMIC highlight one of the biggest obstacles in health care: most clinical research is performed in rich countries, typically with most clinical trial participants being white males.

That's the rule in the “datathons” (health hackathons) that MIT Critical Data has organized in more than two dozen countries, which apply the latest data science techniques to real-world health data.

At the datathons, MIT students and faculty both learn from local experts and share their own skill sets.

The datathons aim to further empower the professionals and students in the host countries to drive medical resear…

3 weeks ago @ news.mit.edu
3 Questions: Jacob Andreas on large language models

Is it possible for large language models to comprehend the intricacies of context?

There are objects that I can refer to, and the language models we have right now typically can’t see any of that when interacting with a human user.

Large language models, perhaps with under-defined or yet-to-be-understood "moral compasses," aren’t beholden to the truth.

Why do large language models tend to hallucinate facts, or confidently assert inaccuracies?

All that being said, I don't think this is a fundamental limitation of neural language models, or even of language models in general, but something that's true of today's language models.

3 weeks, 5 days ago @ news.mit.edu
Berkeley AI
latest post 2 weeks ago
GPT-4 + Stable-Diffusion = ?: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

TL;DR: Text Prompt -> LLM -> Intermediate Representation (such as an image layout) -> Stable Diffusion -> Image.

Recent advancements in text-to-image generation with diffusion models have yielded remarkable results synthesizing highly realistic and diverse images.

In contrast, our method, LLM-grounded Diffusion (LMD), delivers much better prompt understanding in text-to-image generation in those scenarios.

Figure 1: LLM-grounded Diffusion enhances the prompt understanding ability of text-to-image diffusion models.
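
The two-stage TL;DR pipeline above can be sketched with stubs; the function names and the layout format are placeholders for illustration, not the LMD authors' actual API.

```python
# Illustrative sketch of a two-stage text-to-image pipeline: an LLM produces
# an intermediate layout, and a layout-grounded diffusion model renders it.
# Both stages are stubbed; names and data formats are assumptions.

def llm_to_layout(prompt):
    """Stage 1: an LLM turns the text prompt into a layout, here a list of
    (object, normalized bounding-box) pairs. A real system would call an LLM."""
    return [("cat", (0.1, 0.2, 0.4, 0.6)), ("ball", (0.6, 0.5, 0.9, 0.8))]

def layout_to_image(layout):
    """Stage 2: a layout-grounded diffusion model renders each region.
    Stubbed to return metadata instead of pixels."""
    return {"objects": [name for name, _ in layout], "size": (512, 512)}

image = layout_to_image(llm_to_layout("a cat playing with a ball"))
print(image["objects"])
```

The point of the intermediate representation is that spatial instructions ("to the left of", "two of") are resolved by the LLM before any pixels are generated.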

This approach comes with significant costs: It is time-consuming and expensive to trai…

2 weeks ago @ bair.berkeley.edu
Interactive Fleet Learning

A robot fleet is a modern analogue of a fleet of ships, where the word fleet has an etymology tracing back to flēot (‘ship’) and flēotan (‘float’) in Old English.

IFL Formalism and Algorithms: To this end, in a recent paper at the Conference on Robot Learning we introduced the paradigm of Interactive Fleet Learning (IFL), the first formalism in the literature for interactive learning with multiple robots and multiple humans.

In the Interactive Fleet Learning (IFL) paradigm, M humans are allocated to the robots that need the most help in a fleet of N robots (where N can be much larger than M).
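
The allocation rule stated above (M humans to the M robots that need the most help) reduces to a top-M selection; the sketch below uses made-up "need" scores and is not the paper's allocation policy.

```python
import heapq

# Assign M human supervisors to the M robots with the highest
# need-for-help scores. Scores here are hypothetical.

def allocate(scores, m):
    """Return indices of the m robots most in need of supervision."""
    return [i for i, _ in heapq.nlargest(m, enumerate(scores),
                                         key=lambda t: t[1])]

need = [0.1, 0.9, 0.4, 0.7]   # N = 4 robots, M = 2 humans
print(allocate(need, m=2))
```

In practice the scores would come from the robots' own uncertainty or failure signals, and the allocation would be recomputed as the fleet runs.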

Takeaways and Future Directions: To address the gap between the theory and practice of robot …

2 months ago @ bair.berkeley.edu
Koala: A Dialogue Model for Academic Research

In this post, we introduce Koala, a chatbot trained by fine-tuning Meta’s LLaMA on dialogue data gathered from the web.

We introduce a new model, Koala, which provides an additional piece of evidence toward this discussion.

We train our Koala model on a single Nvidia DGX server with 8 A100 GPUs.

Understanding large language models: because Koala inference can be performed on relatively inexpensive commodity GPUs, it enables us to better inspect and understand the internals of dialogue language models, making (previously black-box) language models more interpretable.

The Team: The Koala model is a joint effort across multiple research groups in th…

2 months ago @ bair.berkeley.edu
Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation

Reinforcement learning provides a conceptual framework for autonomous agents to learn from experience, analogously to how one might train a pet with treats.

Can we instead devise reinforcement learning systems for robots that allow them to learn directly “on-the-job”, while performing the task that they are required to do?

Learning systems have the ability to create the entire control algorithm for the robot, and are not limited to tuning a few parameters in a script.

The key step in this work allows these real-world learning systems to autonomously collect the data needed to enable the success of…

4 months, 2 weeks ago @ bair.berkeley.edu
Keeping Learning-Based Control Safe by Regulating Distributional Shift

To regulate the distributional shift experienced by learning-based controllers, we seek a mechanism for constraining the agent to regions of high data density throughout its trajectory.

The central idea behind our work is to view the training data distribution as a safety constraint, and to draw on tools from control theory to control the distributional shift experienced by the agent during closed-loop control.

To use an LDM in control, we can train an LDM and learning-based controller on the same training dataset and constrain the controller’s action outputs with an LDM constraint ($G(s, a) \leq -\log(c)$).
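
Enforcing a constraint of the form G(s, a) <= -log(c) amounts to filtering candidate actions by a learned score; in the sketch below, G is a toy stand-in for the learned density model, so the whole example is illustrative.

```python
import math

# Keep only candidate actions whose density-model score stays under the
# threshold -log(c), i.e., actions that keep the agent in high-density
# regions of the training data. G here is a toy function, not a learned LDM.

def constrain_action(G, state, candidate_actions, c=0.1):
    threshold = -math.log(c)
    return [a for a in candidate_actions if G(state, a) <= threshold]

G = lambda s, a: abs(a - s)   # toy score: low near previously seen behavior
safe = constrain_action(G, state=1.0, candidate_actions=[0.5, 1.2, 5.0])
print(safe)
```

Smaller c tightens the constraint (larger -log(c) is looser; c near 1 is stricter), which is the knob trading off safety against freedom of action.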

The ce…

8 months, 3 weeks ago @ bair.berkeley.edu
Reverse engineering the NTK: towards first-principles architecture design

Deep neural networks have enabled technological wonders ranging from voice recognition to machine translation to protein engineering, but their design and application are nonetheless notoriously unprincipled.

Neural network kernels: The field of deep learning theory has recently been transformed by the realization that deep neural networks often become analytically tractable to study in the infinite-width limit.

Figure 4 below shows a “mimic” activation function \(\tilde{\phi}\) that gives virtually the same NTK as a deep \(\textrm{ReLU}\) FCN.

This is interesting from an engineering perspective because the shallow network us…

9 months, 1 week ago @ bair.berkeley.edu
AWS Machine Learning
latest post 4 hours ago
Technology Innovation Institute trains the state-of-the-art Falcon LLM 40B foundation model on Amazon SageMaker

The United Arab Emirates’ (UAE) Technology Innovation Institute (TII), the applied research pillar of Abu Dhabi’s Advanced Technology Research Council, has launched Falcon LLM, a foundational large language model (LLM) with 40 billion parameters.

Trained on 1 trillion tokens, TII Falcon LLM boasts top-notch performance while remaining incredibly cost-effective.

LLM training on SageMaker: SageMaker is a collection of managed APIs for developing, training, tuning, and hosting machine learning (ML) models, including LLMs.

“Our Falcon LLM illustrates the technology leadership of the UAE, and paves the way for AI-powered innovation in the region.

By making Falcon LLM open source, we aim to enable wide…

4 hours ago @ aws.amazon.com
Build high-performance ML models using PyTorch 2.0 on AWS – Part 1

Refer to PyTorch 2.0: Our next generation release that is faster, more Pythonic and Dynamic as ever for details.

This post demonstrates the performance and ease of running large-scale, high-performance distributed ML model training and deployment using PyTorch 2.0 on AWS.

Refer to Optimized PyTorch 2.0 inference with AWS Graviton processors for details on AWS Graviton-based instance inference performance benchmarks for PyTorch 2.0.

Options to build your ML platform are not limited to these services; you can pick and choose depending on your organizational requirements for your PyTorch 2.0 jobs.

After you log in to your EC2 instance, download the AWS PyTorch 2.0 DLC.

17 hours ago @ aws.amazon.com
Arrange your transcripts into paragraphs with Amazon Transcribe

The metadata is as follows: Type – The type value indicates whether the specific item is punctuation or pronunciation.

A sentence is considered to be a list of transcription items that exists between punctuation items that indicate full stop.

Sentence identification is straightforward with Amazon Transcribe because punctuation is an out-of-the-box feature, along with the punctuation types: comma, full stop, and question mark.

Conclusion: In this post, we presented a concept to automatically introduce paragraphs to your transcripts, without manual intervention, based on the metadata Amazon Transcribe provides along with the actual transcript.
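
The sentence-splitting idea described above can be sketched directly: walk the transcription items and cut at full-stop punctuation. The item dictionaries below mirror a simplified subset of the Transcribe JSON; the grouping logic is our illustration, not AWS sample code.

```python
# Group Amazon Transcribe items into sentences: pronunciation items are
# joined with spaces, punctuation attaches without a space, and a full
# stop, question mark, or exclamation mark ends the current sentence.

def split_sentences(items):
    sentences, current = [], ""
    for item in items:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            current += content                      # attach without a space
            if content in (".", "?", "!"):          # sentence boundary
                sentences.append(current)
                current = ""
        else:                                       # "pronunciation"
            current += (" " if current else "") + content
    if current:                                     # trailing fragment
        sentences.append(current)
    return sentences

items = [
    {"type": "pronunciation", "alternatives": [{"content": "Hello"}]},
    {"type": "punctuation", "alternatives": [{"content": "."}]},
    {"type": "pronunciation", "alternatives": [{"content": "Bye"}]},
]
print(split_sentences(items))
```

Paragraphs can then be formed on top of this, e.g., by grouping sentences whenever the inter-sentence pause (from the items' timestamps) exceeds a threshold.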

Check out Amazon Transcr…

20 hours ago @ aws.amazon.com
Build machine learning-ready datasets from the Amazon SageMaker offline Feature Store using the Amazon SageMaker Python SDK

Amazon SageMaker Feature Store is a purpose-built service to store and retrieve feature data for use by machine learning (ML) models.

Unlike the leads feature group, there is only one record per lead in this feature group.

For more information on feature groups, refer to Create Feature Groups.

The base feature group is the feature group that has other feature groups or the Pandas DataFrame joined to it.

In our example, the base feature group is the leads feature group, and the target feature group is the web marketing feature group.

20 hours ago @ aws.amazon.com
Use Amazon SageMaker Canvas to build machine learning models using Parquet data from Amazon Athena and AWS Lake Formation

In this post, we show you how to import Parquet data to Canvas from Athena, where Lake Formation enables data governance.

Import the Parquet data to Canvas using Athena.

Use the Parquet data to build ML models with Canvas: The Parquet data from Lake Formation is now available in Canvas.

Choose the Configure time series model link to provide time series model options.

To learn more about Canvas, refer to Predict types of machine failures with no-code machine learning using Canvas.

1 day, 17 hours ago @ aws.amazon.com
Amazon SageMaker Automatic Model Tuning now automatically chooses tuning configurations to improve usability and cost efficiency

Amazon SageMaker Automatic Model Tuning has introduced Autotune, a new feature to automatically choose hyperparameters on your behalf.

Pain points: SageMaker automatic model tuning, also called hyperparameter tuning, runs many training jobs on your dataset using a range of hyperparameters that you specify.

The SageMaker Automatic Model Tuning team is constantly innovating on behalf of our customers to optimize their ML workloads.

Maximum expected resources to be consumed by the tuning job (parallel jobs, max runtime, and so on) will be calculated and set in the tuning job record as soon as the tuning job is created.

Such reserved resources will not increase during the course of the tuning job…

1 day, 17 hours ago @ aws.amazon.com
Train a Large Language Model on a single Amazon SageMaker GPU with Hugging Face and LoRA

The code for this walkthrough can be found on the Hugging Face notebooks GitHub repository under the sagemaker/24_train_bloom_peft_lora folder.

When using SageMaker training jobs, you only pay for GPUs for the duration of model training.

First, you can create a Hugging Face model using your new fine-tuned model artifact for deployment to a SageMaker endpoint.

Because you previously trained the model with a SageMaker Hugging Face estimator, you can deploy the model immediately.

Print the model output with print(res[0]["generated_text"].split("Summary:")[-1]); a sample model output: Kirsten and Alex are going bowling this Friday at 7 pm.

1 day, 18 hours ago @ aws.amazon.com
Announcing the launch of new Hugging Face LLM Inference containers on Amazon SageMaker

Today, as part of Amazon Web Services’ partnership with Hugging Face, we are excited to announce the release of a new Hugging Face Deep Learning Container (DLC) for inference with Large Language Models (LLMs).

The Hugging Face LLM DLC provides these optimizations out of the box and makes it easier to host LLM models at scale.

Hugging Face’s Text Generation Inference simplifies LLM deployment: TGI is an open-source, purpose-built solution for deploying Large Language Models (LLMs).

We use the helper function get_huggingface_llm_image_uri() to generate the appropriate image URI for the Hugging Face Large Language Model (LLM) inference.

“lmi” stands for SageMaker Large Model Inference backend an…

1 day, 19 hours ago @ aws.amazon.com
Implement a multi-object tracking solution on a custom dataset with Amazon SageMaker

When applying a MOT solution in real-world cases, you need to train or fine-tune a MOT model on a custom dataset.

With Amazon SageMaker Ground Truth, you can effectively create labels on your own video dataset.

Train a ByteTrack model and tune hyperparameters on the custom dataset: To train your ByteTrack model, we use the bytetrack-training.ipynb notebook.

By drawing the tracking results in each frame and saving them as a video, you can visually confirm the tracking quality.

Conclusion: This post demonstrated how to implement a multi-object tracking solution on a custom dataset using one of the state-of-the-art algorithms on SageMaker.

5 days, 18 hours ago @ aws.amazon.com
Translate documents in real time with Amazon Translate

Amazon Translate is a neural machine translation service that delivers fast, high-quality, and affordable language translation.

Now, Amazon Translate offers real-time document translation to seamlessly integrate and accelerate content creation and localization.

Use Amazon Translate via the console: Follow these steps to try out real-time document translation on the console. On the Amazon Translate console, choose Real-time translation in the navigation pane.

Use Amazon Translate with the AWS CLI: You can translate the contents of a file using the following AWS CLI command.

For more information about Amazon Translate, visit Amazon Translate resources to find video resources and blog posts, and re…

6 days, 16 hours ago @ aws.amazon.com
Scale your machine learning workloads on Amazon ECS powered by AWS Trainium instances

In late 2022, AWS announced the general availability of Amazon EC2 Trn1 instances powered by AWS Trainium accelerators, which are purpose built for high-performance deep learning training.

Trn1 instances deliver up to 50% savings on training costs over other comparable Amazon Elastic Compute Cloud (Amazon EC2) instances.

Amazon Elastic Container Service (Amazon ECS) is a fully managed container orchestration service that simplifies your deployment, management, and scaling of containerized applications.

Prerequisites: To follow along, you should be familiar with core AWS services such as Amazon EC2 and Amazon ECS.

The first option is to use a Deep Learning Amazon Machine Image (DLAMI) that has…

6 days, 17 hours ago @ aws.amazon.com
Host ML models on Amazon SageMaker using Triton: CV model with PyTorch backend

In this post, we dive deep to see how Amazon SageMaker can serve these PyTorch models using NVIDIA Triton Inference Server.

Triton with PyTorch backendThe PyTorch backend is designed to run TorchScript models using the PyTorch C++ API.

You can load Triton PyTorch models on GPU and CPU (see Multiple Model Instances) and model weights will be kept either in GPU memory/VRAM or in host memory/RAM correspondingly.

You can optimize PyTorch model performance on Triton by using a combination of available configuration-based features.

Triton Inference on SageMaker: SageMaker allows you to deploy both single-model endpoints (SMEs) and multi-model endpoints (MMEs) with NVIDIA Triton Inference Server.

6 days, 18 hours ago @ aws.amazon.com
Configure and use defaults for Amazon SageMaker resources with the SageMaker Python SDK

The Amazon SageMaker Python SDK is an open-source library for training and deploying machine learning (ML) models on Amazon SageMaker.

In this post, we show you how to create and store the default configuration file in Studio and use the SDK defaults feature to create your SageMaker resources.

To use this feature, make sure to upgrade your SageMaker SDK version by running pip install --upgrade sagemaker.

Create the configuration file: To use the default configuration for the SageMaker Python SDK, you create a config.yaml file in the format that the SDK expects.
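
A hypothetical config.yaml sketch of that defaults file is below. The key names follow the SDK's API-shape convention as best we can reconstruct it here; verify them against the SageMaker Python SDK documentation for your SDK version, and treat the bucket name and role ARN as placeholders.

```yaml
# Illustrative defaults file for the SageMaker Python SDK; verify key
# names against the SDK docs for your version before using.
SchemaVersion: '1.0'
SageMaker:
  PythonSDK:
    Modules:
      Session:
        DefaultS3Bucket: 'amzn-s3-demo-bucket'            # placeholder bucket
  TrainingJob:
    RoleArn: 'arn:aws:iam::111122223333:role/ExampleRole' # placeholder role
```

With a file like this in place, SDK calls that omit a bucket or role fall back to these defaults instead of requiring them as arguments.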

Durga Sury is an ML Solutions Architect on the Amazon SageMaker Service SA team.

6 days, 18 hours ago @ aws.amazon.com
Accelerate your learning towards AWS Certification exams with automated quiz generation using Amazon SageMaker foundation models

With JumpStart, you can find foundation models from different providers, enabling you to get started with foundation models quickly.

AI21 Jurassic-2 Jumbo Instruct: Jurassic-2 Jumbo Instruct is an LLM by AI21 Labs that can be applied to any language comprehension or generation task.

Solution overview: In the following sections, we go through the steps to test the Jurassic-2 Jumbo Instruct model in SageMaker. First, choose the Jurassic-2 Jumbo Instruct model on the SageMaker console.

Evaluate the Jurassic-2 Jumbo Instruct model in the model playground: On the AI21 Jurassic-2 Jumbo Instruct listing, choose View Model.

Deploy the Jurassic-2 Jumbo Instruct foundation model from a notebook: You can use the foll…

6 days, 20 hours ago @ aws.amazon.com
Amazon SageMaker XGBoost now offers fully distributed GPU training

The SageMaker XGBoost algorithm allows you to easily run XGBoost training and inference on SageMaker.

Today, we are happy to announce that SageMaker XGBoost now offers fully distributed GPU training.

Distributed GPU training with multi-GPU instances: With SageMaker XGBoost version 1.2-2 or later, you can use one or more single-GPU instances for training.

Training time is only the XGBoost training time, measured from the train() call until the model is saved to Amazon Simple Storage Service (Amazon S3).

To learn more about SageMaker and distributed training using Dask, check out Amazon SageMaker built-in LightGBM now offers distributed training using Dask. About the Authors: Dhiraj Thakur is a Sol…

1 week ago @ aws.amazon.com
NVIDIA
latest post 22 hours ago
Fish-Farming Startup Casts AI to Make Aquaculture More Efficient, Sustainable

He’s now CEO of GoSmart, an Israel-based company using AI and machine learning to make fish farming more efficient and sustainable.

Powered by the NVIDIA Jetson platform for edge AI, these systems analyze the average weight and population distribution of the fish within the environment, as well as its temperature and oxygen levels.

“The parameters that GoSmart systems analyze are crucial for the fish feed regime,” Melchner said.

To train its AI algorithms, the GoSmart team measured thousands of fish manually before deploying cameras to analyze millions more.

GoSmart could help farmers reduce fish feed by up to 15%, according to Melchner.

22 hours ago @ blogs.nvidia.com
Technical Artist Builds Great Woolly Mammoth With NVIDIA Omniverse USD Composer This Week ‘In the NVIDIA Studio’

Editor’s note: This post is part of our weekly In the NVIDIA Studio series, which celebrates featured artists, offers creative tips and tricks, and demonstrates how NVIDIA Studio technology improves creative workflows.

His NVIDIA Studio HP ZBook laptop with NVIDIA RTX 5000 graphics unlocked GPU-accelerated filters to speed up material creation.

Sathya imported Tiny Mammoth into NVIDIA Omniverse, a platform for developing and building industrial metaverse applications, via the USD Composer app.

Get started with NVIDIA Omniverse by downloading the standard license free, or learn how Omniverse Enterprise can connect your team.

Stay up to date on the platform by subscribing to the newsletter, a…

1 day ago @ blogs.nvidia.com
Microsoft Bing Speeds Ad Delivery With NVIDIA Triton

They’re delivering personalized ads to users of Microsoft Bing with 7x throughput at reduced cost, thanks to NVIDIA Triton Inference Server running on NVIDIA A100 Tensor Core GPUs.

Flying With NVIDIA A100 MIG
Next, the team upgraded the ad service from NVIDIA T4 to A100 GPUs.

The inference software comes packaged in a container, so it’s easy to deploy.

And open-source Triton — also available with enterprise-grade security and support through NVIDIA AI Enterprise — is backed by a community that makes the software better over time.

Accelerating Bing’s ad system with Triton on A100 GPUs is one example of what Chen likes about his job.

1 day, 22 hours ago @ blogs.nvidia.com
Accelerating the Accelerator: Scientist Speeds CERN’s HPC With GPUs, AI

Editor’s note: This is part of a series profiling researchers advancing science with high performance computing.

Maria Girone is expanding the world’s largest network of scientific computers with accelerated computing and AI.

A high-luminosity version of the giant accelerator (HL-LHC) will produce 10x more proton collisions, spawning exabytes of data a year.

It works closely with NVIDIA through its collaboration with E4 Computer Engineering, a specialist in HPC and AI based in Italy.

Meanwhile, researchers are also porting physics software to GPU accelerators and using existing AI programs that run on GPUs.

1 day, 22 hours ago @ blogs.nvidia.com
Harnessing the Power of NVIDIA AI Enterprise on Azure Machine Learning

NVIDIA AI Enterprise is built on top of the NVIDIA CUDA-X AI software stack, providing high-performance GPU-accelerated computing capabilities.

An open platform, Azure Machine Learning supports all popular machine learning frameworks and toolkits, including those from NVIDIA AI Enterprise.

View and access all prebuilt NVIDIA AI Enterprise Components, Environments, and Models from the NVIDIA AI Enterprise Preview Registry (Figure 2).

Pipelines in Azure Machine Learning using NVIDIA AI Enterprise components
Find NVIDIA AI Enterprise sample assets in the Azure Machine Learning registry.

Get started with NVIDIA AI Enterprise on Azure Machine Learning
NVIDIA AI Enterprise and Azure Machine Learnin…

4 days, 19 hours ago @ developer.nvidia.com
A New Age: ‘Age of Empires’ Series Joins GeForce NOW, Part of 20 Games Coming in June

The season of hot sun and longer days is here, so stay inside this summer with 20 games joining GeForce NOW in June.

Titles from the Age of Empires series are the next Xbox games to roll out to GeForce NOW, giving members plenty to do this summer, especially with over 1,600 games in the GeForce NOW library.

Expand Your Empire
NVIDIA released the first Xbox games to the cloud last month as part of its ongoing partnership with Microsoft.

All four of the franchise’s latest Steam versions will join GeForce NOW later this month: Age of Empires: Definitive Edition, Age of Empires II: Definitive Edition, Age of Empires III: Definitive Edition and Age of Empires IV: Anniversary Edition.

It feat…

6 days ago @ blogs.nvidia.com
Digital Renaissance: NVIDIA Neuralangelo Research Reconstructs 3D Scenes

Neuralangelo, a new AI model by NVIDIA Research for 3D reconstruction using neural networks, turns 2D video clips into detailed 3D structures — generating lifelike virtual replicas of buildings, sculptures and other real-world objects.

Like Michelangelo sculpting stunning, life-like visions from blocks of marble, Neuralangelo generates 3D structures with intricate details and textures.

Creative professionals can then import these 3D objects into design applications, editing them further for use in art, video game development, robotics and industrial digital twins.

Neuralangelo can also reconstruct building interiors and exteriors — demonstrated with a detailed 3D model of the park at NVIDIA…

6 days ago @ blogs.nvidia.com
NVIDIA RTX Transforming 14-Inch Laptops, Plus Simultaneous Screen Encoding and May Studio Driver Available Today

Editor’s note: This post is part of our weekly In the NVIDIA Studio series, which celebrates featured artists, offers creative tips and tricks, and demonstrates how NVIDIA Studio technology improves creative workflows.

For the first time, GeForce RTX performance comes to 14-inch devices.

Backed by NVIDIA Studio, the platform supercharges over 110 creative apps, provides lasting stability with NVIDIA Studio Drivers and includes a powerful suite of AI-powered Studio software, such as NVIDIA Omniverse, Canvas and Broadcast.

Visit the Studio Shop for the latest GeForce RTX-powered NVIDIA Studio systems and explore the range of high-performance Studio products.

Download GeForce Experience or NVI…

1 week, 1 day ago @ blogs.nvidia.com
MediaTek Partners With NVIDIA to Transform Automobiles With AI and Accelerated Computing

The partnership was announced today at a COMPUTEX press conference with MediaTek CEO Rick Tsai and NVIDIA founder and CEO Jensen Huang.

“NVIDIA is a world-renowned pioneer and industry leader in AI and computing.

With this new GPU chiplet, NVIDIA can extend its GPU and accelerated compute leadership across broader markets.

MediaTek will develop automotive SoCs and integrate the NVIDIA GPU chiplet, featuring NVIDIA AI and graphics intellectual property, into the design architecture.

MediaTek’s Dimensity Auto platform draws on its decades of experience in mobile computing, high-speed connectivity, entertainment and extensive Android ecosystem.

1 week, 2 days ago @ blogs.nvidia.com
Live From Taipei: NVIDIA CEO Unveils Gen AI Platforms for Every Industry

It uses NVIDIA NVLink to combine up to 256 NVIDIA GH200 Grace Hopper Superchips into a single data-center-sized GPU.

Together, they’re bringing generative AI and accelerated computing to millions of users.

Accelerating Gen AI on Windows
Huang described how NVIDIA and Microsoft are collaborating to drive innovation for Windows PCs in the generative AI era.

In addition, Huang announced a new platform to enable the next generation of autonomous mobile robot (AMR) fleets.

It’s one more example of how NVIDIA is helping companies feel the benefits of generative AI with accelerated computing.

1 week, 2 days ago @ blogs.nvidia.com
NVIDIA Brings Advanced Autonomy to Mobile Robots With Isaac AMR

As mobile robot shipments surge to meet the growing demands of industries seeking operational efficiencies, NVIDIA is launching a new platform to enable the next generation of autonomous mobile robot (AMR) fleets.

Isaac AMR brings advanced mapping, autonomy and simulation to mobile robots and will soon be available for early customers, NVIDIA founder and CEO Jensen Huang announced during his keynote address at the COMPUTEX technology conference in Taipei.

Isaac AMR is a platform to simulate, validate, deploy, optimize and manage fleets of autonomous mobile robots.

Isaac AMR: Mapping, Autonomy, Simulation
Isaac AMR offers a foundation for mapping, autonomy and simulation.

Finally, Isaac AMR s…

1 week, 2 days ago @ blogs.nvidia.com
Techman Robot Selects NVIDIA Isaac Sim to Optimize Automated Optical Inspection

The demo showed how Techman uses Isaac Sim to optimize the inspection of robots by robots on the manufacturing line.

Isaac Sim is built on NVIDIA Omniverse — an open development platform for building and operating industrial metaverse applications.

Using Omniverse, Techman built a digital twin of the inspection robot — as well as the product to be inspected — in Isaac Sim.

Then, with powerful optimization tools in Isaac Sim, Techman explored a massive number of program options in parallel on NVIDIA GPUs.

Learn more about how Isaac Sim on Omniverse, Metropolis and AI are streamlining the optical inspection process across products and industries by joining NVIDIA at COMPUTEX, where the Techma…

1 week, 2 days ago @ blogs.nvidia.com
Electronics Giants Tap Into Industrial Automation With NVIDIA Metropolis for Factories

To drive product excellence, leading electronics manufacturers are adopting NVIDIA Metropolis for Factories.

NVIDIA Metropolis for Factories now offers a state-of-the-art AI platform and workflows for the development of incredibly accurate inspection applications such as AOI.

Pegatron Drives AOI With Metropolis for Factories
Leading manufacturer Pegatron, based in Taipei’s Beitou district, is using NVIDIA Metropolis for Factories on its production lines.

Overview and Advantech — both NVIDIA Metropolis partners — are collaborating to build a real-time AI-based inspection system to support industrial inspection, product counting and assembly verification.

Metropolis partners Siemens and Data M…

1 week, 2 days ago @ blogs.nvidia.com
NVIDIA Brings New Generative AI Capabilities, Groundbreaking Performance to 100 Million Windows RTX PCs and Workstations

Generative AI is rapidly ushering in a new era of computing for productivity, content creation, gaming and more.

When optimized for GeForce RTX and NVIDIA RTX GPUs, which offer up to 1,400 Tensor TFLOPS for AI inferencing, generative AI models can run up to 5x faster than on competing devices.

Generative AI on RTX, Anywhere
From servers to the cloud to devices, generative AI running on RTX GPUs is everywhere.

Makers like Dell, HP, Lenovo and ASUS are pushing the generative AI era forward, backed by RTX GPUs and Tensor Cores.

This is what our powerful and scalable Precision workstations with NVIDIA RTX GPUs are designed to do.

1 week, 2 days ago @ blogs.nvidia.com
NVIDIA CEO Tells NTU Grads to Run, Not Walk — But Be Prepared to Stumble

“You are running for food, or you are running from becoming food.

“Remember, either you are running for food; or you are running from becoming food.

Many nearly doomed us.” The first involved a key early contract the company won to help Sega build a gaming console.

Rapid changes in the industry forced NVIDIA to give up the contract in a near-death brush with bankruptcy, which Sega’s leadership helped avert.

“Confronting our mistake and, with humility, asking for help saved NVIDIA,” he said.

1 week, 4 days ago @ blogs.nvidia.com
Facebook
last post 2 weeks, 5 days ago
MSVP is Meta’s first video processing ASIC

MSVP’s video encoding algorithms
The MSVP encoder has two main goals: to be highly power efficient and to deliver the same or better video quality as software encoders.

Here’s a simplified version of the data flow of modern hybrid (hardware and software) video encoders (figure: simplified video encoder modules).

We mainly focused on three levels: block level, frame level, and group of picture (GOP) level.

Smart quantization
Quantization is the only lossy part of video compression, and it is also the dominant bit-rate control knob in any video coding standard.
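A toy sketch of the bit-rate knob this paragraph describes (generic uniform quantization, not MSVP’s actual encoder logic): a larger step discards more precision, which is what makes the stage lossy and cheap to code.

```python
def quantize(coeffs, step):
    # Uniform quantization: snap each coefficient to the nearest multiple of `step`
    return [round(c / step) * step for c in coeffs]

coeffs = [103, 58, 21, 4]    # hypothetical transform coefficients
print(quantize(coeffs, 8))   # mild loss:    [104, 56, 24, 0]
print(quantize(coeffs, 32))  # coarser step: [96, 64, 32, 0]
```

Coarser steps map many inputs to the same level, so fewer bits are needed downstream, at the cost of larger reconstruction error.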

Since VP9’s frame type is different from H.264’s, the strategy for making frame level decisions is also different, as shown in the following figu…

2 weeks, 5 days ago @ ai.facebook.com
Meta introduces its first-generation AI inference accelerator

These workloads run on PyTorch with first-class Python integration, eager-mode development, and the simplicity of APIs.

Deep learning recommendation models (DLRMs), in particular, are important for improving experiences across Meta’s services and applications.

We found that GPUs were not always optimal for running Meta’s specific recommendation workloads at the levels of efficiency required at our scale.

Our solution to this challenge was to design a family of recommendation-specific Meta Training and Inference Accelerator (MTIA) ASICs.

In addition, we maintained the user experience and developer efficiency offered by PyTorch eager-mode development.

2 weeks, 5 days ago @ ai.facebook.com
Improving Instagram notification management with machine learning and causal inference

We’re sharing how Meta is applying statistics and machine learning (ML) to improve notification personalization and management on Instagram – particularly on daily digest push notifications.

At Meta, we have been applying statistics and machine learning (ML) for notification personalization and management on Instagram.

Today, we would like to share an example of how we used causal inference and ML to control sending for daily digest push notifications.

By doing so, we intend to maintain a fixed notification sending rate r where 0 < r < 1.
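One minimal way to hold a fixed sending rate r like this (an illustration only, not Meta’s actual mechanism): score each candidate notification with a model and send only those above the score threshold that passes a fraction r of recent traffic.

```python
def score_threshold(recent_scores, r):
    # Threshold such that roughly a fraction r of scores lies at or above it
    ranked = sorted(recent_scores, reverse=True)
    k = max(1, int(r * len(ranked)))  # how many notifications to send
    return ranked[k - 1]

# Hypothetical model scores for recent candidate notifications
scores = [0.9, 0.1, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6, 0.5, 0.95]
threshold = score_threshold(scores, r=0.3)   # target: send ~30%
sent = [s for s in scores if s >= threshold]
print(threshold, len(sent))  # 0.8 3
```

Recomputing the threshold over a sliding window of recent scores keeps the realized rate near r as score distributions drift.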

In the Instagram Notifications Systems team, ML and statistics have been applied in different areas to improve user notification experience.

7 months, 1 week ago @ engineering.fb.com
Scaling data ingestion for machine learning training at Meta

To support the level of data ingestion required to train the models behind our products, we’ve had to build a new data ingestion infrastructure as well as new last-mile transformation pipelines.

In the sections below, we share our experience building data ingestion and last-mile data preprocessing pipelines that are responsible for feeding data into AI training models.

Data ingestion pipeline overviewWe have exabytes of training data powering our models, and the amount of training data is growing rapidly.

We have built a disaggregated Data PreProcessing tier (DPP) that serves as the reader tier for data ingestion and last-mile data transformations for AI training.

Scaling …

8 months, 2 weeks ago @ engineering.fb.com
Uber Engineering
last post: none
neptune.ai
last post 1 day ago
How to Build ML Model Training Pipeline

Complete ML model training pipeline workflow | Source
But before we delve into the step-by-step model training pipeline, it’s essential to understand the basics, architecture, motivations, and challenges associated with ML pipelines, and a few tools that you will need to work with.

There are several reasons to build an ML model training pipeline (trust me!).

ML model training pipeline architecture
An ML model training pipeline typically consists of several interconnected components or stages.

In this section, we will walk through a step-by-step tutorial on how to build an ML model training pipeline.

To effectively incorporate distributed training in your ML model training pipelines, here are some u…

1 day ago @ neptune.ai
What Does GPT-3 Mean For the Future of MLOps? With David Hershey

Every episode is focused on one specific ML topic, and during this one, we talked to David Hershey about GPT-3 and the future of MLOps.

I give broad answers because nearly every industry product has some opportunity to incorporate or improve a feature using language models.

So I write a prompt, the language model says something back, and based on what it says I send another prompt to clarify or to move in some other direction.

David: Maybe to start with the obvious and then we’ll get into the less obvious because I think that’s easy.

Happy to chat about language models, MLOps, whatever.

1 day, 23 hours ago @ neptune.ai
Building ML Platform in Retail and eCommerce

An ML Platform helps in the faster iteration of an ML project lifecycle.

The architecture of an ML Platform in eCommerce | Source: Author
One might give a different name to a component, but the major components in an ML Platform are as follows:
1. Data platform
2. Data processing
3. Continuous integration / continuous deployment / continuous training
4. Model serving
5. Performance monitoring
These are the components we will find in any ML Platform, but what’s special about an ML Platform in Retail?

You may also like: Building a Machine Learning Platform [Definitive Guide]
Consideration for data …

1 week ago @ neptune.ai
How to Build ETL Data Pipeline in ML

ETL pipeline vs. data pipeline: the differences
A data pipeline is an umbrella term for moving data between different systems, and an ETL data pipeline is one type of data pipeline.

It is common to use ETL data pipeline and data pipeline interchangeably.

As the abbreviation suggests, an ETL pipeline involves a series of processes: extracting the data, transforming it, and finally loading it into the target system.

While a data pipeline can include various types of pipelines, ETL pipeline is one specific subset of a data pipeline.
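The extract–transform–load split can be sketched end to end with in-memory data (a toy illustration; the record fields here are invented):

```python
def extract():
    # Pull raw records from a source system (here: hard-coded sample rows)
    return [{"user": "a", "amount": "10.5"}, {"user": "b", "amount": "4.0"}]

def transform(rows):
    # Clean and reshape: cast string amounts to floats
    return [{"user": r["user"], "amount": float(r["amount"])} for r in rows]

def load(rows, target):
    # Write the cleaned rows into the target store (here: a plain list)
    target.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse)  # [{'user': 'a', 'amount': 10.5}, {'user': 'b', 'amount': 4.0}]
```

In a real pipeline each stage would talk to an external system (API, database, warehouse), but the three-stage contract stays the same.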

ETL pipeline tools
To create an ETL pipeline, as discussed in the last section, we require tools, tools that …

2 weeks, 6 days ago @ neptune.ai
How to Save Trained Model in Python

import pickle
from sklearn.metrics import classification_report

# Load the previously saved model from disk
with open(model_pkl_file, 'rb') as file:
    model = pickle.load(file)

# Evaluate the restored model on held-out data
y_predict = model.predict(X_test)
print(classification_report(y_test, y_predict))

Once loaded, you can use this model to make predictions.
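The snippet above shows the loading half; the save half round-trips the same way. A minimal, self-contained sketch using a stand-in dictionary in place of a real trained model (the names here are illustrative, not from the article):

```python
import pickle

# Stand-in "model": any picklable Python object round-trips the same way
model = {"weights": [0.4, 0.6], "bias": 0.1}
model_pkl_file = "model.pkl"

# Save: serialize the object to disk
with open(model_pkl_file, "wb") as file:
    pickle.dump(model, file)

# Load it back to verify the round trip is exact
with open(model_pkl_file, "rb") as file:
    restored = pickle.load(file)

print(restored == model)  # True
```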

Saving a trained model with JSON
When you want to have full control over the save and restore procedure of your ML model, JSON comes into play.

To save a deep learning model in TensorFlow Keras, you can use the save() method of the Keras Model object.

ONNX is primarily designed for deep learning models and may not be suitable for other types of machine learning models.

Save Both Model Architecture and Weights: In the case of DL-based models, if you save only the model weights but not the architecture,…

3 weeks, 6 days ago @ neptune.ai
How to Build an End-To-End ML Pipeline

One of the most prevalent complaints we hear from ML engineers in the community is how costly and error-prone it is to manually go through the ML workflow of building and deploying models. They run scripts manually to preprocess their training data, rerun the deployment scripts, manually tune their models, and spend their working hours…

4 weeks ago @ neptune.ai
Building and Deploying CV Models: Lessons Learned From Computer Vision Engineer

With over 3 years of experience in designing, building, and deploying computer vision (CV) models, I’ve realized people don’t focus enough on crucial aspects of building and deploying such complex systems.

Right choice of the best computer vision model
Throughout my experience, I have worked on a wide range of applications for CV models.

Cloud deployment
Cloud deployment has been a game-changer for deploying computer vision models, offering flexibility, scalability, and ease of maintenance.

Ensuring scalability, security, and performance
When deploying computer vision models, it is essential to consider the following factors.

In this blog post, I delved deeper into the practical knowledge and …

1 month, 2 weeks ago @ neptune.ai
How to Build an Experiment Tracking Tool [Learnings From Engineers Behind Neptune]

Based on insights from our very own Piotr Łusakowski (architect), Adam Nieżurawski (back-end technical lead), and other engineers at neptune.ai, you’ll learn:
- How to develop requirements for your experiment tracking tool,
- What the components of an ideal experiment tracking tool are, and how they satisfy the requirements,
- How to architect the backend layer of an experiment tracking tool.

Functional requirements
In the previous section, you learned about the problems you solve with an experiment tracking tool; these are also the jobs to be done to build a functional experiment tracking tool.

To begin designing experime…

1 month, 2 weeks ago @ neptune.ai
ML Model Packaging [The Ultimate Guide]

Machine learning model packaging is crucial to the machine learning development lifecycle.

In this comprehensive guide, we’ll explore the key concepts, challenges, and best practices for ML model packaging, including the different types of packaging formats, techniques, and frameworks.

Model packaging ensures a machine learning model can be easily deployed and maintained in a production environment.

Proper model packaging ensures that a machine learning model is:
Easy to install: A well-packaged model should be straightforward to install, reducing the time and effort required for deployment.

However, as technology continues to evolve, future considerations for ML model packaging, such as the…

2 months ago @ neptune.ai
Real-World MLOps Examples: End-To-End MLOps Pipeline for Visual Search at Brainly

In this second installment of the series “Real-world MLOps Examples,” Paweł Pęczek, Machine Learning Engineer at Brainly, will walk you through the end-to-end Machine Learning Operations (MLOps) process in the Visual Search team at Brainly.

The motivation behind MLOps at Brainly
To understand Brainly’s journey toward MLOps, you need to know the motivation for Brainly to adopt AI and machine learning technologies.

Machine learning use cases at Brainly
The AI department at Brainly aims to build a predictive intervention system for its users.

From our perspective, it was just easier, given our existing automation tools.” — Paweł Pęczek, Machine Learning Engineer at Brainly
On Amazon EKS, they dep…

2 months, 1 week ago @ neptune.ai
Deploying Large NLP Models: Infrastructure Cost Optimization


2 months, 2 weeks ago @ neptune.ai
Definite Guide to Building a Machine Learning Platform

Develop the user stories
Isaac Vidas, Shopify’s ML Platform Lead, at Ray Summit 2022
While building an ML platform, it is also important to remember who your users are and their profiles.

Other resources to learn ML platform design
This section has touched on the most important components to consider when building an ML platform.

MLOps best practices, learnings, and considerations from ML platform experts
We have taken some of the best practices and learnings from the ML platform teams and consolidated them into the following points:

Embrace iterating on your ML platform: Like any other software system, building your ML platform should not be a one-off thi…

2 months, 2 weeks ago @ neptune.ai
Managing Dataset Versions in Long-Term ML Projects

An example of a long-term ML project will be a bank fraud detection system powered by ML models and algorithms for pattern recognition.

More specifically, in long-term ML projects, data and concept drift, if allowed to persist, result in poor ML model performance and reduced effectiveness.

Data annotation and preprocessing
Common data preprocessing tasks for machine learning | Source: Author
Long-term machine learning projects require managing large amounts of data.

ML teams involved in long-term machine learning projects have several options for incorporating Data Version Control (DVC) into their existing workflows and managing dataset versions.

We explored challenges that can arise in such…

2 months, 2 weeks ago @ neptune.ai
How to Build a CI/CD MLOps Pipeline [Case Study]

Technology landscape of CI/CD MLOps system
The infrastructure provided by the client mostly influences the technology landscape of ML model deployments.

Data governance: Ensure that the data used to train and test the model, as well as any new data used for prediction, is properly governed.

ML model explainability: Make sure the ML model is interpretable and understandable by the developers as well as other stakeholders and that the value addition provided can be easily quantified.

Setting up a CI/CD pipeline on AWS: CodePipeline-based deployment | Source
Why didn’t we go with AWS SageMaker for code deployment?

Other aspects of our CI/CD pipeline development
In the above sections, we have disc…

2 months, 3 weeks ago @ neptune.ai
Comparing Tools For Data Processing Pipelines

Data quality: A data pipeline can help improve the quality of data by automating the process of cleaning and transforming the data.

Some of the most popular vendors providing tools/solutions for streaming data processing are: Integrate.io, StreamSets, Hevo Data, and Airbyte.
Tools for batch data pipelines transfer data in intervals or chunks, and they are commonly viewed as a more traditional method for moving data since they don’t facilitate real-time data processing.

Integrate.io: Easy data pipeline design
Basic data pipeline configuration doesn’t require developer-level expertise.

Pricing of other modules such as Stitch, Data Management Platform, Big Data Platform, and Data Fabric can be fou…

2 months, 3 weeks ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
last post 17 hours ago
Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust (Explained)

#stablediffusion #ai #watermark Watermarking the outputs of generative models is usually done as a post-processing step on the model outputs. Tree-Ring Watermarks are applied in the latent space at the beginning of a diffusion process, which makes them nearly undetectable, robust to strong distortions, and only recoverable by the model author. It is a very promising technique with applications potentially beyond watermarking itself. OUTLINE:

0:00 - Introduction & Overview

1:30 - Why Watermarking?

4:20 - Diffusion Models Recap

13:40 - Inverting Diffusion Models

17:05 - Tree-Ring Watermarking

26:15 - Effects of Tree-Ring Watermarks

30:00 - Experimental Results

32:40 - Limitations

34:40 - Conc…

17 hours ago @ youtube.com
RWKV: Reinventing RNNs for the Transformer Era (Paper Explained)

#gpt4 #rwkv #transformer We take a look at RWKV, a highly scalable architecture between Transformers and RNNs. Fully Connected (June 7th in SF) Promo Link: https://www.fullyconnected.com/?promo=ynnc OUTLINE:

0:00 - Introduction

1:50 - Fully Connected In-Person Conference in SF June 7th

3:00 - Transformers vs RNNs

8:00 - RWKV: Best of both worlds

12:30 - LSTMs

17:15 - Evolution of RWKV's Linear Attention

30:40 - RWKV's Layer Structure

49:15 - Time-Parallel vs Sequence Mode

53:55 - Experimental Results & Limitations

58:00 - Visualizations

1:01:40 - Conclusion Paper: https://arxiv.org/abs/2305.13048 Abstract:

Transformers have revolutionized almost all natural language processing (NLP) tasks b…

4 days, 14 hours ago @ youtube.com
Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Full Paper Review)

#gpt4 #ai #prompt Tree-of-Thought improves prompting of large language models (LLMs) by generalizing the concept of Chain-of-Thought prompting and introduces a tree search across language model thoughts, including state evaluation and backtracking. Experiments on toy tasks show large improvements over both classic and Chain-of-Thought prompting. OUTLINE:

0:00 - Introduction

1:20 - From Chain-of-Thought to Tree-of-Thought

11:10 - Formalizing the algorithm

16:00 - Game of 24 & Creative writing

18:30 - Crosswords

23:30 - Is this a general problem solver?

26:50 - Ablation studies

28:55 - Conclusion Paper: https://arxiv.org/abs/2305.10601 Abstract:

Language models are increasingly being deployed…
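The search the blurb describes can be sketched generically; here a toy beam search stands in for the paper’s BFS/DFS with backtracking, and trivial string functions stand in for the LLM proposer and evaluator (all names here are illustrative):

```python
import heapq

def propose(state):
    # Stand-in for an LLM proposing next "thoughts": extend a digit string
    return [state + d for d in "012"]

def evaluate(state):
    # Stand-in for an LLM value judgment: prefer states with many '2's
    return state.count("2")

def tree_of_thoughts(start, depth, beam=2):
    frontier = [start]
    for _ in range(depth):
        candidates = [s for st in frontier for s in propose(st)]
        # State evaluation + pruning: keep only the `beam` best thoughts
        frontier = heapq.nlargest(beam, candidates, key=evaluate)
    return max(frontier, key=evaluate)

print(tree_of_thoughts("", depth=3))  # '222'
```

Swapping propose and evaluate for LLM calls, and the beam loop for the paper’s BFS/DFS with backtracking, recovers the framing in the video.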

2 weeks, 1 day ago @ youtube.com
OpenAI suggests AI licenses (US Senate hearing on AI regulation w/ Sam Altman)

#ai #openai #gpt4 US Senate hearing on AI regulation. MLST video on the hearing: https://www.youtube.com/watch?v=DeSXnESGxr4 Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325…

2 weeks, 3 days ago @ youtube.com
[ML News] Geoff Hinton leaves Google | Google has NO MOAT | OpenAI down half a billion

#google #openai #mlnews Updates from the world of Machine Learning and AI

Great AI memes here: https://twitter.com/untitled01ipynb OUTLINE:

0:00 - Google I/O 2023: Generative AI in everything

0:20 - Anthropic announces 100k tokens context

0:35 - Intro

1:20 - Geoff Hinton leaves Google

7:00 - Google memo leaked: we have no moat

11:30 - OpenAI loses 540M

12:30 - Google AI: Product first

15:50 - Ilya Sutskever on safety vs competition

18:00 - AI works cannot be copyrighted

19:40 - OpenAI tries to trademark GPT

20:30 - StarCoder: accessible code model

21:40 - RedPajama & OpenLLaMA

22:55 - Mosaic 7B model

23:50 - YoloNAS

24:10 - Mojo programming language

25:30 - Random helpful things

37:40 - Dee…

3 weeks, 4 days ago @ youtube.com
Scaling Transformer to 1M tokens and beyond with RMT (Paper Explained)

#ai #transformer #gpt4 This paper promises to scale transformers to 1 million tokens and beyond. We take a look at the technique behind it, the Recurrent Memory Transformer, and what its strengths and weaknesses are. OUTLINE:

0:00 - Intro

2:15 - Transformers on long sequences

4:30 - Tasks considered

8:00 - Recurrent Memory Transformer

19:40 - Experiments on scaling and attention maps

24:00 - Conclusion Paper: https://arxiv.org/abs/2304.11062 Abstract:

This technical report presents the application of a recurrent memory to extend the context length of BERT, one of the most effective Transformer-based models in natural language processing. By leveraging the Recurrent Memory Transformer archit…
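The mechanism can be sketched in a few lines: memory tokens are concatenated around each segment, and the memory written at the end of one segment is read at the start of the next, carrying information across an arbitrarily long input. `transformer` is any sequence-to-sequence stand-in (BERT in the paper); names and shapes here are illustrative, not the authors' code:

```python
import numpy as np

def rmt_process(segments, transformer, memory, num_mem):
    """Recurrent Memory Transformer loop (sketch).

    segments: list of (seg_len, d) arrays; memory: (num_mem, d) array.
    Each segment is processed with read-memory prepended and write-memory
    appended; the written memory becomes the next segment's read memory.
    """
    outputs = []
    for seg in segments:
        x = np.concatenate([memory, seg, memory], axis=0)  # [read | tokens | write]
        y = transformer(x)
        outputs.append(y[num_mem:-num_mem])                # per-segment outputs
        memory = y[-num_mem:]                              # carry memory forward
    return np.concatenate(outputs, axis=0), memory
```

The attention cost stays quadratic only in the segment length, not in the total sequence length, which is how the method reaches millions of tokens.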

1 month, 1 week ago @ youtube.com
OpenAssistant RELEASED! The world's best open-source Chat AI!

#openassistant #chatgpt #mlnews Try the chat: https://open-assistant.io/chat

Homepage: https://open-assistant.io Dataset: https://huggingface.co/datasets/OpenAssistant/oasst1

Code: https://github.com/LAION-AI/Open-Assistant

Paper (temporary): https://ykilcher.com/oa-paper Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have aske…

1 month, 3 weeks ago @ youtube.com
AI Alignment Livestream (aka OpenAssistant "Just Chatting")

https://open-assistant.io/chat Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https…

1 month, 3 weeks ago @ youtube.com
OpenAssistant First Models are here! (Open-Source ChatGPT)

#openassistant #chatgpt #gpt4 https://open-assistant.io/chat

https://huggingface.co/OpenAssistant

https://github.com/LAION-AI/Open-Assistant Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC)…

2 months ago @ youtube.com
The biggest week in AI (GPT-4, Office Copilot, Google PaLM, Anthropic Claude & more)

#mlnews #gpt4 #copilot Your weekly news all around the AI world Check out W&B courses (free): https://wandb.courses/ OUTLINE:

0:00 - Intro

0:20 - GPT-4 announced!

4:30 - GigaGAN: The comeback of Generative Adversarial Networks

7:55 - ChoppedAI: AI Recipes

8:45 - Samsung accused of faking space zoom effect

14:00 - Weights & Biases courses are free

16:55 - Data Portraits

18:50 - Data2Vec 2.0

19:50 - Gated Models on Hugging Face & huggingface.js

22:05 - Visual ChatGPT

23:35 - Bing crosses 100 million daily active users

24:50 - Casual Conversations Dataset

25:50 - Anthropic AI Safety Research

27:30 - Magnushammer & more advances in AI-assisted math

30:30 - LLaMA license change PR

32:00 - Self-I…

2 months, 2 weeks ago @ youtube.com
GPT-4 is here! What we know so far (Full Analysis)

#gpt4 #chatgpt #openai References:

https://openai.com/product/gpt-4

https://openai.com/research/gpt-4

https://cdn.openai.com/papers/gpt-4.pdf Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC…

2 months, 3 weeks ago @ youtube.com
This ChatGPT Skill will earn you $10B (also, AI reads your mind!) | ML News

#mlnews #chatgpt #llama ChatGPT goes around the world and is finally available via API. Stunning mind-reading performed using fMRI and Stable Diffusion. LLaMA weights leak and hilarity ensues. GTC23 is around the corner! OUTLINE:

0:00 - Introduction

0:20 - GTC 23 on March 20

1:55 - ChatGPT API is out!

4:50 - OpenAI becomes more business-friendly

7:15 - OpenAI plans for AGI

10:00 - ChatGPT influencers

12:15 - Open-Source Prompting Course

12:35 - Flan UL2 20B

13:30 - LLaMA weights leaked

15:50 - Mind-Reading from fMRI

20:10 - Random News / Helpful Things

25:30 - Interview with Bryan Catanzaro Participate in the GTC Raffle: https://ykilcher.com/gtc References:

GTC 23 on March 20

https://www.nv…

2 months, 3 weeks ago @ youtube.com
LLaMA: Open and Efficient Foundation Language Models (Paper Explained)

#ai #meta #languagemodel LLaMA is a series of large language models from 7B to 65B parameters, trained by Meta AI. They train for longer on more data and show that models like GPT-3 can be outperformed by significantly smaller models when trained this way. Meta also releases the trained models to the research community. OUTLINE:

0:00 - Introduction & Paper Overview

4:30 - Rant on Open-Sourcing

8:05 - Training Data

12:40 - Training Hyperparameters

14:50 - Architecture Modifications

17:10 - Optimizer

19:40 - Efficient Implementation

26:15 - Main Results

38:00 - Some more completions

40:00 - Conclusion Paper: https://arxiv.org/abs/2302.13971

Website: https://ai.facebook.com/blog/large-lang…

3 months ago @ youtube.com
Open Assistant Inference Backend Development (Hands-On Coding)

#ai #huggingface #coding Join me as I build streaming inference into the Hugging Face text generation server, going through CUDA, Python, Rust, gRPC, websockets, server-sent events, and more... Original repo is here: https://github.com/huggingface/text-generation-inference OpenAssistant repo is here: https://github.com/LAION-AI/Open-Assistant (see inference/) Check out https://www.wandb.courses/ for free MLOps courses! Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the b…

3 months, 1 week ago @ youtube.com
OpenAssistant - ChatGPT's Open Alternative (We need your help!)

#openassistant #chatgpt #ai Help us collect data for OpenAssistant, the largest and most open alternative to ChatGPT.

https://open-assistant.io OUTLINE:

0:00 - Intro

0:30 - The Project

2:05 - Getting to Minimum Viable Prototype

5:30 - First Tasks

10:00 - Leaderboard

11:45 - Playing the Assistant

14:40 - Tricky Facts

16:25 - What if humans had wings?

17:05 - Can foxes be tamed?

23:45 - Can zebras be tamed?

26:15 - Yo (spam)

27:00 - More tasks

29:10 - Entitled Emails

34:35 - Final Words Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: ht…

4 months ago @ youtube.com
Henry AI Labs
latest post: None
3blue1brown
latest post: 2 months ago
Why π is in the normal distribution (beyond integral tricks)

Where's the circle? And how does it relate to where e^(-x^2) comes from?

Help fund future projects: https://www.patreon.com/3blue1brown

An equally valuable form of support is to simply share the videos. The artwork in this video is by Kurt Bruns, aided by Midjourney Here are several other good posts about the classic Poisson proof vcubingx: https://www.youtube.com/watch?v=9CgOthUUdw4

BriTheMathGuy: https://www.youtube.com/watch?v=S79KPrIm_Gc

Dr. Alter's math library: https://idan-alter.github.io/2023/02/20/Gaussian-Integral.html And if you'd like to see many other variations on approaching this integral, take a look at this expository paper from Keith Conrad: https://kconrad.math.uconn.edu/…
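For reference, the classic "integral trick" the title alludes to squares the integral and switches to polar coordinates, which is where a circle (and hence π) first appears:

```latex
I = \int_{-\infty}^{\infty} e^{-x^2}\,dx, \qquad
I^2 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2+y^2)}\,dx\,dy
    = \int_{0}^{2\pi}\!\int_{0}^{\infty} e^{-r^2}\, r\,dr\,d\theta
    = 2\pi \cdot \tfrac{1}{2} = \pi,
```

so \(I = \sqrt{\pi}\). The video's point is to explain the geometric reason this works, beyond the algebra.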

2 months ago @ youtube.com
But what is the Central Limit Theorem?

A visual introduction to probability's most important theorem

Help fund future projects: https://www.patreon.com/3blue1brown

An equally valuable form of support is to simply share the videos. -----------------

Timestamps

0:00 - Introduction

1:53 - A simplified Galton Board

4:14 - The general idea

6:15 - Dice simulations

8:55 - The true distributions for sums

11:41 - Mean, variance and standard deviation

15:54 - Unpacking the Gaussian formula

20:47 - The more elegant formulation

25:01 - A concrete example

27:10 - Sample means

28:10 - Underlying assumptions ------------------ These animations are largely made using a custom python library, manim. See the FAQ comments here:

https://www.3blue1b…
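The dice simulations from the video are easy to reproduce: sums of many independent rolls, once standardized by their mean and standard deviation, behave like a standard normal regardless of the decidedly non-Gaussian shape of a single die. A minimal sketch (my own code, not the video's):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sum N fair-die rolls, many times over.
N, trials = 50, 100_000
rolls = rng.integers(1, 7, size=(trials, N))
sums = rolls.sum(axis=1)

# One die has mean 3.5 and variance 35/12; sums of N dice have
# mean N*3.5 and variance N*35/12.
mu, sigma = N * 3.5, np.sqrt(N * 35 / 12)
z = (sums - mu) / sigma

print(z.mean(), z.std())        # close to 0 and 1
print((np.abs(z) < 1).mean())   # close to the Gaussian ~68% rule
```

Increasing N tightens the match; the "underlying assumptions" chapter covers when this convergence can fail (e.g. infinite variance or strong dependence).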

2 months, 3 weeks ago @ youtube.com
But what is a convolution?

Discrete convolutions, from probability, to image processing and FFTs.

Help fund future projects: https://www.patreon.com/3blue1brown

Special thanks to these supporters: https://3b1b.co/lessons/convolutions#thanks

An equally valuable form of support is to simply share the videos. ------------------ Other videos I referenced Live lecture on image convolutions for the MIT Julia lab

https://youtu.be/8rrHTtUzyZA Lecture on Discrete Fourier Transforms

https://youtu.be/g8RkArhtCc4 Reducible video on FFTs

https://youtu.be/h7apO7q16V0 Veritasium video on FFTs

https://youtu.be/nmgFG7PUHfo A small correction for the integer multiplication algorithm mentioned at the end. A “straightforward” applicatio…
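The probability example from the video is one line of NumPy, and the FFT connection is the statement that pointwise multiplication of spectra computes the same convolution in O(n log n). A short sketch (my own code):

```python
import numpy as np

# Distribution of one fair die.
die = np.ones(6) / 6

# Convolving a distribution with itself gives the distribution of a sum:
# two_dice[k] = P(sum of two dice = k + 2), e.g. P(7) = 6/36.
two_dice = np.convolve(die, die)

# The same convolution via the FFT: multiply spectra, transform back.
n = len(die) + len(die) - 1          # length of the linear convolution
via_fft = np.fft.irfft(np.fft.rfft(die, n) * np.fft.rfft(die, n), n)
assert np.allclose(two_dice, via_fft)
```

For two length-6 arrays the FFT route is overkill; it pays off for the long sequences (images, big-integer multiplication) discussed at the end of the video.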

6 months, 2 weeks ago @ youtube.com
Researchers thought this was a bug (Borwein integrals)

A curious pattern of integrals that all equal pi...until they don't.

Next video on convolutions: https://youtu.be/KuXjwB4LzSA

Help fund future projects: https://www.patreon.com/3blue1brown

Special thanks to these patrons: https://3b1b.co/lessons/borwein#thanks

An equally valuable form of support is to simply share the videos. ------------------ Original paper from David and Jonathan Borwein

https://carma.edu.au/resources/db90/pdfs/db90-119.00.pdf Other fun coverage of the topic:

http://schmid-werren.ch/hanspeter/publications/2014elemath.pdf https://johncarlosbaez.wordpress.com/2018/09/20/patterns-that-eventually-fail/ Correction: 4:12 The top line should not be there, as that integral diver…
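For reference, the pattern in question (as stated in the Borweins' paper) is

```latex
\int_0^\infty \prod_{k=0}^{n} \frac{\sin\!\big(x/(2k+1)\big)}{x/(2k+1)}\,dx
= \frac{\pi}{2}
\qquad \text{as long as} \qquad
\frac{1}{3}+\frac{1}{5}+\cdots+\frac{1}{2n+1} < 1,
```

which holds up through the factor with denominator 13, and first falls short of π/2 (by roughly 10⁻¹¹) once the factor with denominator 15 pushes the sum of reciprocals past 1.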

7 months ago @ youtube.com
We ran a contest for math explainers, here are the results (SoME2)

Winners and honorable mentions for the SoME2 contest

Playlist of all entries: https://www.youtube.com/playlist?list=PLnQX-jgAF5pTZXPiD8ciEARRylD9brJXU

Help fund future projects: https://www.patreon.com/3blue1brown Post with links to all entries:

https://www.3blue1brown.com/blog/some2 **Winners** Clear Crystal Conundrums, A Multifaceted Intro to Group Theory

https://explanaria.github.io/crystalgroups/ The Lore of Calculus

https://youtu.be/5M2RWtD4EzI How Realistic CGI Works (And How To Do It Way Faster)

https://www.youtube.com/watch?v=gsZiJeaMO48 Percolation: a Mathematical Phase Transition

https://youtu.be/a-767WnbaCQ The Coolest Hat Puzzle

https://youtu.be/6hVPNONm7xw **Honorable mentions*…

8 months, 1 week ago @ youtube.com
Two Minute Papers
latest post: 3 days, 20 hours ago
OpenAI's GPT-4: Eccentric Genius AI!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Marti…

3 days, 20 hours ago @ youtube.com
Unreal Engine 5: Next Level Games Are Coming!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers ❤️ Get about $50 off from an upcoming W&B event in San Francisco! - https://shorturl.at/brtIQ My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John …

6 days, 20 hours ago @ youtube.com
OpenAI's ChatGPT: The Future Of Teaching!

❤️ Check out the Gradient Dissent podcast by Weights & Biases: http://wandb.me/gd ❤️ Get about $50 off from an upcoming W&B event in San Francisco! - https://www.fullyconnected.com?promo=2mp Check out Khanmigo here:

https://www.khanacademy.org/khan-labs My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric …

1 week, 2 days ago @ youtube.com
Google Bard: Is It Better Than ChatGPT?

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers Try Bard (note: not available in all countries yet): https://bard.google.com/ 📝 The paper "PaLM 2 Technical Report" is available here:

https://arxiv.org/abs/2305.10403 My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eri…

1 week, 5 days ago @ youtube.com
Stable Diffusion Got Supercharged - For Free!

❤️ Check out Weights & Biases and say hi in their community forum here: https://wandb.me/paperforum 📝 The paper "Adding Conditional Control to Text-to-Image Diffusion Models" is available here:

https://arxiv.org/abs/2302.05543 Try it out!

ControlNet - https://github.com/lllyasviel/ControlNet

ControlNet guide - how to install and use it: https://stable-diffusion-art.com/controlnet/ Transform yourself (dance) - https://www.reddit.com/r/StableDiffusion/comments/12i9qr7/i_transform_real_person_dancing_to_animation/

Group photo (cartooning) - https://reddit.com/r/StableDiffusion/comments/12nd60i/turn_a_group_photo_into_a_digital_painting_with/

Decartooning - https://reddit.com/r/StableDiffusion/com…

2 weeks, 1 day ago @ youtube.com
NVIDIA Is Simulating 100,000 Hair Strands!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Interactive Hair Simulation on the GPU using ADMM" is available here:

https://research.nvidia.com/publication/2023-08_interactive-hair-simulation-gpu-using-admm My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahli…

2 weeks, 6 days ago @ youtube.com
This New AI Is The Future of Videomaking!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Structure and Content-Guided Video Synthesis with Diffusion Models" is available here:

https://arxiv.org/abs/2302.03011 Try Runway:

https://runwayml.com/ Full video made with runway: https://twitter.com/IXITimmyIXI/status/1649242592876412928 My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz…

3 weeks, 2 days ago @ youtube.com
DeepMind’s AI Athletes Are Crazy Good!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers ❤️ Get more than $50 off from an upcoming W&B event in San Francisco! - https://www.fullyconnected.com?promo=2mp 📝 The paper "Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning" is available here:

https://sites.google.com/view/op3-soccer My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabo…

3 weeks, 4 days ago @ youtube.com
OpenAI’s GPT-4: A 70-Year Old Lesson!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers 📝 The paper "Sparks of Artificial General Intelligence: Early experiments with GPT-4" is available here:

https://arxiv.org/abs/2303.12712 More about GPT-4 here: https://openai.com/product/gpt-4 My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B S…

4 weeks ago @ youtube.com
NVIDIA’s New Video AI: Game Changer!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers 📝 The #NVIDIA paper "Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models" is available here:

https://research.nvidia.com/labs/toronto-ai/VideoLDM/ My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian …

1 month ago @ youtube.com
AutoGPT: This Is ChatGPT Supercharged!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers #AutoGPT is available here:

GitHub: https://github.com/Significant-Gravitas/Auto-GPT

Try it locally (sort of): https://github.com/Significant-Gravitas/Auto-GPT

Try it on the web: https://huggingface.co/spaces/aliabid94/AutoGPT Sources:

AirPods: https://twitter.com/sharifshameem/status/1405462642936799247

Create website: https://twitter.com/SullyOmarr/status/1644160222733406214

Do research: https://twitter.com/dzhng/status/1648056958996606978

Progress report: https://www.reddit.com/r/AutoGPT/comments/12ohmmo/comment/jgiyire/?utm_source=reddit&utm_medium=web2x&context=3

Code improvements: https://twitter.…

1 month, 1 week ago @ youtube.com
Stable Diffusion Is Getting Outrageously Good!

❤️ Check out Fully Connected by Weights & Biases: https://wandb.me/papers W&B+Stable Diffusion:

https://wandb.ai/capecape/stable_diffusions/reports/Speed-Up-Stable-Diffusion-on-Your-M1Pro-Macbook-Pro--VmlldzoyNjY0ODYz 📝 The paper "High-Resolution Image Synthesis with Latent Diffusion Models" is available here:

https://arxiv.org/abs/2112.10752 Try it:

Web 1: https://huggingface.co/spaces/stabilityai/stable-diffusion

Web 2: https://beta.dreamstudio.ai/generate

Web 3 (also Stable Diffusion XL!): https://clipdrop.co/stable-diffusion

Web 4 (notebooks): https://github.com/TheLastBen/fast-stable-diffusion

Guide: https://stable-diffusion-art.com/know-these-important-parameters-for-stunning-ai-image…

1 month, 1 week ago @ youtube.com
25 ChatGPT AIs Play a Video Game!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Generative Agents: Interactive Simulacra of Human Behavior" is available here:

https://arxiv.org/abs/2304.03442

https://reverie.herokuapp.com/arXiv_Demo/ My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric…

1 month, 2 weeks ago @ youtube.com
Midjourney AI: How Is This Even Possible?

❤️ Check out Weights & Biases and say hi in their community forum here: https://wandb.me/paperforum Try Stable Diffusion:

Web 1: https://huggingface.co/spaces/stabilityai/stable-diffusion

Web 2: https://beta.dreamstudio.ai/generate

Web 3 (also Stable Diffusion XL!): https://clipdrop.co/stable-diffusion

Web 4 (notebooks): https://github.com/TheLastBen/fast-stable-diffusion Stable Diffusion Web UI (Windows/MacOS) https://github.com/AUTOMATIC1111/stable-diffusion-webui

Guide for installation: https://github.com/cmdr2/stable-diffusion-ui

Draw Things app (MacOS): https://drawthings.ai/

Simpler app (MacOS): https://huggingface.co/blog/fast-mac-diffusers

Guide: https://stable-diffusion-art.com/kno…

1 month, 2 weeks ago @ youtube.com
NVIDIA’s New AI: Better Games Are Coming!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Martin, M…

1 month, 3 weeks ago @ youtube.com
DataFest Video
latest post: None
JetBrains Research Seminars
latest post: 2 weeks, 5 days ago
Fault Detection and Diagnosis with Graph Neural Networks

Seminar announcements are available here: https://t.me/piterai Speaker: Alexander Kovalenko, junior researcher at the AIRI artificial intelligence research institute. The talk is devoted to fault detection and diagnosis in technological processes and industrial equipment. It analyzes the particulars of applying graph neural networks to data from industrial facilities and describes a way to obtain the graph structure using learnable adjacency matrices.
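As an illustration of a learnable adjacency matrix: the abstract does not spell out the parametrization, but a common choice (used, e.g., in Graph WaveNet-style models) is a softmax over similarities of learned node embeddings. The sketch below is my own, not the speaker's model:

```python
import numpy as np

def learned_adjacency(E1, E2):
    """Row-stochastic adjacency derived from two learned node-embedding
    tables E1, E2 of shape (num_nodes, d). Illustrative sketch only."""
    logits = np.maximum(E1 @ E2.T, 0.0)                    # ReLU similarity
    e = np.exp(logits - logits.max(axis=1, keepdims=True)) # stable softmax
    return e / e.sum(axis=1, keepdims=True)

def gnn_layer(A, X, W):
    """One message-passing step: aggregate neighbor features, then transform."""
    return np.tanh(A @ X @ W)

rng = np.random.default_rng(0)
E1, E2 = rng.normal(size=(5, 4)), rng.normal(size=(5, 4))
A = learned_adjacency(E1, E2)                  # (5, 5), rows sum to 1
H = gnn_layer(A, rng.normal(size=(5, 8)), rng.normal(size=(8, 3)))
```

In a real model `E1`, `E2`, and `W` would be trained end-to-end, so the graph structure is discovered from the sensor data rather than specified by hand.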

2 weeks, 5 days ago @ youtube.com
The Attention Mechanism in Decision-Tree-Based Models

Seminar announcements are available here: https://t.me/piterai Speaker: Andrei Konstantinov, PhD student at the Higher School of Artificial Intelligence, Peter the Great St. Petersburg Polytechnic University. The talk is devoted to applying the widely used attention models, originally developed for natural language and sequence processing, to classification, regression, and survival analysis with small training sets. The models considered are various combinations of the attention mechanism with ensembles of decision trees, and the training problems reduce both to neural network optimization and to convex programming.
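One concrete way to combine attention with a tree ensemble is Nadaraya-Watson-style weighting of per-tree predictions; the sketch below is my own illustration of that idea, not the speaker's model:

```python
import numpy as np

def attention_ensemble(tree_preds, tree_keys, query, temp=1.0):
    """Softmax-attention mixture over an ensemble of decision trees.

    tree_preds: (n_trees,) per-tree predictions for one input;
    tree_keys:  (n_trees, d) a key per tree (e.g. derived from its leaf);
    query:      (d,) a representation of the input.
    Illustrative sketch only: keys/queries would be learned in practice.
    """
    scores = tree_keys @ query / temp
    w = np.exp(scores - scores.max())
    w /= w.sum()                       # attention weights over the trees
    return float(w @ tree_preds)
```

With identical keys the weights are uniform and this reduces to plain ensemble averaging; learning the keys lets the model weight trees per input.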

1 month, 2 weeks ago @ youtube.com
Automatic Time Series Analysis (with a Focus on Classification and Anomaly Detection)

Announcements of future seminars are in the association's channel: https://t.me/piterai Link to Fedot.Industrial: https://github.com/aimclub/Fedot.Industrial Speaker: Ilya Revin, researcher at the "Strong AI in Industry" artificial intelligence research center and at the composite AI laboratory at ITMO. Abstract: Automated machine learning can make time series analysis more effective through its ability to combine different feature extraction methods and machine learning models into an optimal model for a specific task. Current solutions to this problem have two significant …

2 months, 1 week ago @ youtube.com
Solving the Optimal Transport Problem with Neural Networks

Announcements of future seminars are in the association's channel: https://t.me/piterai Speaker: Evgeny Burnaev, Dr. Sci. in Physics and Mathematics, professor, head of the applied AI center at Skoltech, leading researcher at AIRI Abstract:

Solving optimal transport (OT) problems with neural networks has become widespread in machine learning. Most existing methods compute the OT cost and use it as a loss function for tuning the generator in generative models (Wasserstein GANs). The talk presents a completely different, recently emerged direction: methods that compute the OT map itself and use it as a generative model. Recent results show…

2 months, 2 weeks ago @ youtube.com
Yandex. Computer Science
latest post 1 month ago
The bias-variance tradeoff in data analysis and in real life. Maxim Nikolaev, School of Data Analysis

The lecture "The bias-variance tradeoff, or why we like to ask professionals for advice," given at the open house of the School of Data Analysis in St. Petersburg on April 22, 2023. Lecturer: Maxim Nikolaev, coordinator of Yandex's Data Science partner track at the Faculty of Mathematics and Computer Science, St. Petersburg State University, and instructor in statistics and ML. Everyone interested in data analysis and machine learning runs into overfitting and underfitting. In this lecture we will see how to describe these phenomena in terms of the bias and variance of a prediction, how these notions relate to a model's ability to memorize the training set, and how to improve prediction quality by skillfully limiting the expressiveness of the mod…
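The decomposition behind the lecture's topic, MSE = bias² + variance, can be checked numerically. An illustrative sketch (assumed here, not taken from the lecture): compare an unbiased sample mean with a variance-reducing shrunken mean.

```python
import random
import statistics

def evaluate(estimator, true_mean, n_trials=20000, n_samples=10):
    """Monte-Carlo estimate of the bias, variance, and MSE of an estimator."""
    rng = random.Random(42)
    estimates = [
        estimator([rng.gauss(true_mean, 1.0) for _ in range(n_samples)])
        for _ in range(n_trials)
    ]
    bias = statistics.fmean(estimates) - true_mean
    variance = statistics.pvariance(estimates)
    mse = statistics.fmean((e - true_mean) ** 2 for e in estimates)
    return bias, variance, mse

true_mean = 3.0
# Unbiased estimator: the plain sample mean.
b1, v1, m1 = evaluate(statistics.fmean, true_mean)
# Shrunken estimator: biased toward 0, but with strictly lower variance.
b2, v2, m2 = evaluate(lambda xs: 0.8 * statistics.fmean(xs), true_mean)
print(f"sample mean:   bias={b1:+.3f} var={v1:.3f} mse={m1:.3f}")
print(f"shrunken mean: bias={b2:+.3f} var={v2:.3f} mse={m2:.3f}")
# In both cases MSE equals bias squared plus variance (up to float error).
```

Shrinking trades a nonzero bias for lower variance; restricting a model's expressiveness, as the lecture discusses, moves along the same tradeoff.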

1 month ago @ youtube.com
Data Dojo — ML training session, April 22, 2023

Data Dojo is an event where data analysis practitioners meet and take part in machine learning training sessions. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос so the host can read them out live. Program: https://events.yandex.ru/events/datadojo-22-04-2023

1 month, 2 weeks ago @ youtube.com
ML Party Yerevan — March 2, 2023

ML Party is a series of regular meetups on applying machine learning in IT. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос so the host can read them out live.

3 months, 1 week ago @ youtube.com
Data Dojo — ML training session, February 16, 2023

Data Dojo is a series of machine learning training sessions and a meeting place for data analysis practitioners. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос so the host can read them out live.

3 months, 4 weeks ago @ youtube.com
Data Dojo — New Year ML training session, December 24, 2022

Data Dojo is a series of machine learning training sessions and a meeting place for data analysis practitioners. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос so the host can read them out live. 0:00:00 — Stream start

0:00:55 — ML competitions in 2022: a year in review / Petr Ermakov

0:12:17 — Predicting a track's artist from acoustic features: a solution walkthrough from Yandex Cup 2022 / Vladimir Fomenko

0:37:52 — What was, what will be, and what will soothe the heart: analyzing the news feed of the past, present, and future / Elizaveta Pushkareva and Georgy Surkov

2:02:12 — The road to Kaggle Competitions Master at 17 / Vadim Timakin

2:31:10 — При…

5 months, 2 weeks ago @ youtube.com
Data Dojo — ML training session, November 17, 2022

Data Dojo is a series of machine learning training sessions and a meeting place for data analysis practitioners. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос so the host can read them out live.

6 months, 3 weeks ago @ youtube.com
Data Dojo — ML training session, September 22, 2022

Data Dojo is a series of machine learning training sessions and a meeting place for data analysis practitioners. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос so the host can read them out live. Program: 0:05 — Opening / Petr Ermakov (Yandex)

4:38 — A benchmark of sentence acceptability in Russian (RuCoLA), plus a secret release / Maxim Ryabinin (Yandex) / Slides: https://clck.ru/32G6nT

39:50 — Car model verification (Machines Can See 2022) / Dmitry Gaus (VisionLabs) and Artem Strekalov (Ufanet) / Slides: https://clck.ru/32G6oS

8 months, 2 weeks ago @ youtube.com
ML Trainings
latest post 3 hours ago
Anton Golubev - Data-centric AI: a survey of methods

A survey of data-centric approaches and their classification, with examples including medical ones. Data Fest 2023:

https://ods.ai/events/datafestonline2023

"Horrors of Medical Data" track: https://ods.ai/tracks/df23-meddata Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

3 hours ago @ youtube.com
Nikolay Kholod - Learning annotator's style in medical imaging

A common problem in medical image annotation is inter-observer variability: the same image may be annotated differently by two physicians. The main causes are human factors, differences in experience and qualification, different "radiological schools," poor image quality, and unclear instructions. Some of these factors can be mitigated by organizing the annotation process properly, but physicians' opinions still frequently differ. It turns out that, with an additional embedding module, neural networks can learn the annotation styles of individual radiologists. Data Fest 2023:

https://ods.ai/events/datafestonline…

6 hours ago @ youtube.com
Lyubov Aksyonova - Opportunities for automating the analysis of retinal images obtained with OCT

Diagnosis and disease-activity assessment with OCT through detection and quantitative measurement of morphological changes in retinal structures as biomarkers of therapy effectiveness. Data Fest 2023:

https://ods.ai/events/datafestonline2023


23 hours ago @ youtube.com
Stepan Kudin and Elena Sokolova - Accelerating PET/CT scans with DL denoising

A PET/CT scan is a fairly expensive and rather long examination. One way to make it cheaper is to speed it up, and this task reduces to image denoising. We describe our work on cutting the acquisition time of a single PET image from 90 seconds to 60 and 30 seconds. Data Fest 2023:
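As background for why shorter scans need denoising: in a counting detector, relative noise falls roughly as 1/√t with acquisition time t, so a 30-second image is about √3 ≈ 1.7 times noisier than a 90-second one. A toy simulation (an assumption-laden sketch of the statistics, not the speakers' DL pipeline):

```python
import math
import random
import statistics

def relative_noise(rate_per_s, seconds, n_pixels=20000, seed=0):
    """Relative pixel noise of a count image for a given acquisition time.

    Photon counts are approximated as Gaussian with variance equal to the
    mean (the large-count limit of Poisson statistics).
    """
    rng = random.Random(seed)
    mean = rate_per_s * seconds
    counts = [rng.gauss(mean, math.sqrt(mean)) for _ in range(n_pixels)]
    image = [c / seconds for c in counts]  # normalize to counts per second
    return statistics.pstdev(image) / statistics.fmean(image)

n90 = relative_noise(100.0, 90)
n30 = relative_noise(100.0, 30)
print(f"90 s scan: {n90:.4f}, 30 s scan: {n30:.4f}, ratio: {n30 / n90:.2f}")
# ratio is close to sqrt(90 / 30), i.e. the extra noise a denoiser must remove
```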

https://ods.ai/events/datafestonline2023


1 day, 3 hours ago @ youtube.com
Yulia Agafonova - A personal journey of an ML developer in the world of medical data

The story of a junior ML developer's path in medicine. Data Fest 2023:

https://ods.ai/events/datafestonline2023


1 day, 6 hours ago @ youtube.com
Veronika Semenova - A survey of artifacts in histopathology images

What are histopathology images? What are they for? And, most importantly, what artifacts occur on them, and can anything be done about it? Data Fest 2023:

https://ods.ai/events/datafestonline2023


1 day, 23 hours ago @ youtube.com
Evgeny Nikitin - Russian DL in radiology: what is happening in the market in 2023?

A very brief overview of the main news in the market of AI systems for radiology. Data Fest 2023:

https://ods.ai/events/datafestonline2023


2 days, 3 hours ago @ youtube.com
Teaser for the "Horrors of Medical Data" track

2 days, 6 hours ago @ youtube.com
Data Fest 2023, day 16: offline in Moscow, June 4

The final stream of the closing day of Data Fest 2023!

It has been an intense two weeks: 8 offline venues, 11 streamed activities, and much more that stayed off camera but that you took part in with us. Today we wrap up the Fest program hosted by MTS. In parallel with the talks, the second part of the Reliable ML program runs all day in spatial.chat, but to see those sessions live you need to be there :) Full schedule:

https://ods.ai/events/fest2023-moscow5 Information about Spatial.chat and other activities is available on ODS.AI: https://ods.ai/events/datafestonline2023 Join the community: https://ods.ai/ Data Fest & Course Fest social media: https://t.me/datafest

https://v…

3 days, 8 hours ago @ youtube.com
Data Fest 2023, day 15: online from spatial.chat

On the last Saturday of Data Fest 2023, lots of great things await participants in spatial.chat! This stream carries the Reliable ML broadcast from 10:00 to 19:30 Moscow time. The main activities, questions, and networking happen in the ODS spatial.chat: https://ods.ai/events/datafestonline2023/networking In parallel with Reliable ML, talks and workshops run at

1. MTS: 11:00 to 14:30, on A/B tests and more

2. Gazprombank.Tech: 12:00 to 16:40, on banking applications and practice. The activity schedule is available on ODS.AI: https://ods.ai/events Join the community: https://ods.ai/ Data Fest & Course Fest social media: https://t.me/datafest

https://vk.com/datafest

4 days, 19 hours ago @ youtube.com
Data Fest 2023, day 15: offline in Novosibirsk, June 3

We kick off the final Saturday of Data Fest 2023 in Novosibirsk! Don't worry, the program starts on Novosibirsk time :)

Today we are hosted by CFT for a full day of activities. After the program, the online stream of the Reliable ML section starts in spatial.chat! Full schedule:

https://ods.ai/events/fest2023-novosib Information about Spatial.chat and other activities is available on ODS.AI: https://ods.ai/events/datafestonline2023 Join the community: https://ods.ai/ Data Fest & Course Fest social media: https://t.me/datafest

https://vk.com/datafest

4 days, 19 hours ago @ youtube.com
Data Fest 2023, day 14: offline in Moscow, June 2

The closing weekend of Data Fest 2023 begins! Today's program combines the online track and a live venue hosted by Alfa-Bank in Moscow:

- 11:30 to 15:00: talks of the "Horrors of Medical Data" section in spatial.chat

- 15:00 to 20:30: the program from Alfa-Bank on NLP in Practice, and more! Full schedule:

https://ods.ai/events/fest2023-moscow3 Information about Spatial.chat and other activities is available on ODS.AI: https://ods.ai/events/datafestonline2023 Join the community: https://ods.ai/ Data Fest & Course Fest social media: https://t.me/datafest

https://vk.com/datafest

5 days, 7 hours ago @ youtube.com
Data Fest 2023, day 12: offline in Moscow, May 31

We open the final week of Data Fest 2023! Today: live talks hosted by Skoltech: - 12:00 to 14:00: talks on DS/ML Open Source and Random DS

- 15:00 to 18:00: talks of the Career section. Full schedule:

https://ods.ai/events/fest2023-moscow4/schedule Event information is available on ODS.AI: https://ods.ai/events/fest2023-moscow4

https://ods.ai/events/datafestonline2023 Join the community: https://ods.ai/ Data Fest & Course Fest social media: https://t.me/datafest

https://vk.com/datafest

1 week ago @ youtube.com
Data Fest 2023, day 9: online from spatial.chat

Today a packed program of several sections and activities awaits Fest participants in spatial.chat: Computer Vision: 11:30 to 13:00

Practical ML Yandex: 12:00 to 15:45

Career: 12:30 to 18:00 (with a break for the On-site-test)

DS Talks: 13:00 to 14:00

MLOps: 14:00 to 16:30

On-site-test, aka "Interviews to Nowhere": 14:00 to 16:00

Instruct Models: 18:00 to 22:00

A more detailed schedule is available on the Fest's online activities page:

https://ods.ai/events/datafestonline2023/schedule_spatial This stream moves sequentially between activities and rooms in spatial.chat. To take part live in what runs in parallel with the stream, join the ODS spatial.chat: htt…

1 week, 3 days ago @ youtube.com
Data Fest 2023, day 8: offline in St. Petersburg, May 27

Data Fest 2023 is back in St. Petersburg! Today: an all-day stream of the offline day hosted by Gazprom Neft

Also join spatial.chat, which has its own activities and talks. Event information is available on ODS.AI: https://ods.ai/events/fest2023-spb2

https://ods.ai/events/datafestonline2023 You can register for the other offline days here:

May 31, Skoltech (Moscow): https://ods.ai/events/fest2023-moscow4

June 2, Alfa-Bank (Moscow): https://ods.ai/events/fest2023-moscow3

June 3, CFT (Novosibirsk): https://ods.ai/events/fest2023-novosib

June 4, MTS (Moscow): https://ods.ai/events/fest2023-moscow5 Join the community: https://ods.ai/ Data Fest & Course Fest social media: https://t.me/datafest

https://vk.com/d…

1 week, 4 days ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
latest post 1 day, 20 hours ago
#382 – Bert Kreischer: Comedy, Drinking, Rogan, Segura, Churchill & Kim Jong Un

Bert Kreischer is a comedian, actor, and podcaster.

Check him out on Bertcast, 2 Bears 1 Cave, Something is Burning, and the new movie The Machine.

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://www.eightsleep.com/lex to get special savings
– NetSuite: http://netsuite.com/lex to get free product tour
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
EPISODE LINKS:
Bert's Instagram: https://instagram.com/bertkreischer/
Bert's Twitter: https://twitter.com/bertkreischer
Bert's YouTube: https://www.youtube.com/@bertkreischer
Bert's Website: https://bertbertbert.com/
2 Bears 1 Cave: https://www.youtube.com/playlist?list=PL-i3EV1v5hLeT91DuXckUf6tsbMfLgZnoBo…

1 day, 20 hours ago @ lexfridman.com
#381 – Chris Lattner: Future of Programming and AI

Chris Lattner is a legendary software and hardware engineer, leading projects at Apple, Tesla, Google, SiFive, and Modular AI, including the development of Swift, LLVM, Clang, MLIR, CIRCT, TPUs, and Mojo.

Please support this podcast by checking out our sponsors:
– iHerb: https://lexfridman.com/iherb and use code LEX to get 22% off your order
– Numerai: https://numer.ai/lex
– InsideTracker: https://insidetracker.com/lex to get 20% off
EPISODE LINKS:
Chris's Twitter: https://twitter.com/clattner_llvm
Chris's Website: http://nondot.org/sabre/
Mojo programming language: https://www.modular.com/mojo
Modular AI: https://modular.com/
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcast…

4 days, 15 hours ago @ lexfridman.com
#380 – Neil Gershenfeld: Self-Replicating Robots and the Future of Fabrication

Neil Gershenfeld is the director of the MIT Center for Bits and Atoms.

Please support this podcast by checking out our sponsors:
– LMNT: https://drinkLMNT.com/lex to get free sample pack
– NetSuite: http://netsuite.com/lex to get free product tour
– BetterHelp: https://betterhelp.com/lex to get 10% off
EPISODE LINKS:
Neil's Website: http://ng.cba.mit.edu/
MIT Center for Bits and Atoms: https://cba.mit.edu/
Fab Foundation: https://fabfoundation.org/
Fab Lab community: https://fablabs.io/
Fab Academy: https://fabacademy.org/
Fab City: https://fab.city/
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfrid…

1 week, 2 days ago @ lexfridman.com
#379 – Randall Kennedy: The N-Word – History of Race, Law, Politics, and Power

Randall Kennedy is a law professor at Harvard and author of many seminal books on race, law, history, culture, and politics.

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://www.eightsleep.com/lex to get special savings
– Linode: https://linode.com/lex to get $100 free credit
– InsideTracker: https://insidetracker.com/lex to get 20% off
EPISODE LINKS:
Randall's Website: https://hls.harvard.edu/faculty/randall-l-kennedy
N*****: The Strange Career of a Troublesome Word: https://amzn.to/3MbrXSC
Say It Loud!: On Race, Law, History, and Culture: https://amzn.to/3MfQWUT
For Discrimination: Race, Affirmative Action, and the Law: https://amzn.to/3BASZxZ
Race, Crime, and the …

1 week, 6 days ago @ lexfridman.com
#378 – Anna Frebel: Origin and Evolution of the Universe, Galaxies, and Stars

Anna Frebel is an astronomer and astrophysicist at MIT.

Please support this podcast by checking out our sponsors:
– Hexclad Cookware: https://hexclad.com/lex and use code LEX to get 10% off
– Numerai: https://numer.ai/lex
– House of Macadamias: https://houseofmacadamias.com/lex and use code LEX to get 20% off your first order
EPISODE LINKS:
Anna's Twitter: https://twitter.com/annafrebel
Anna's Instagram: https://instagram.com/annafrebel
Anna's Book – Searching for the Oldest Stars: https://amzn.to/3pi2Ci6
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes:…

2 weeks, 5 days ago @ lexfridman.com
#377 – Harvey Silverglate: Freedom of Speech

Harvey Silverglate is a free speech advocate, co-founder of FIRE, the Foundation for Individual Rights and Expression, and author of several books on freedom of speech and criminal justice.

He is running for Harvard Board of Overseers on a platform of free speech.

If you're a Harvard alum, please consider voting for him by Tue, May 16, 5pm ET: https://www.harvey4harvard.com/ballot
Please support this podcast by checking out our sponsors:
– Factor: https://factormeals.com/lex50 and use code lex50 to get 50% off first box
– SimpliSafe: https://simplisafe.com/lex
– Athletic Greens: https://athleticgreens.com/lex to get 1 month of fish oil
EPISODE LINKS:
Vote for Harvey: https://www.harvey4harvard.…

3 weeks, 1 day ago @ lexfridman.com
#376 – Stephen Wolfram: ChatGPT and the Nature of Truth, Reality & Computation

Stephen Wolfram is a computer scientist, mathematician, theoretical physicist, and the founder of Wolfram Research, a company behind Wolfram|Alpha, Wolfram Language, and the Wolfram Physics and Metamathematics projects.

Please support this podcast by checking out our sponsors:
– MasterClass: https://masterclass.com/lex to get 15% off
– BetterHelp: https://betterhelp.com/lex to get 10% off
– InsideTracker: https://insidetracker.com/lex to get 20% off
EPISODE LINKS:
Stephen's Twitter: https://twitter.com/stephen_wolfram
Stephen's Blog: https://writings.stephenwolfram.com
Wolfram|Alpha: https://www.wolframalpha.com
A New Kind of Science (book): https://amzn.to/30XoEun
Fundamental Theory of Physics (boo…

4 weeks ago @ lexfridman.com
#375 – David Pakman: Politics of Trump, Biden, Bernie, AOC, Socialism & Wokeism

David Pakman is a left-wing progressive political commentator and host of The David Pakman Show.

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://www.eightsleep.com/lex to get special savings
– Shopify: https://shopify.com/lex to get free trial
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
EPISODE LINKS:
David's Twitter: https://twitter.com/dpakman
David's YouTube: https://youtube.com/@thedavidpakmanshow
David's Instagram: https://instagram.com/david.pakman
David's Website: https://davidpakman.com/
David's Subreddit: https://reddit.com/r/thedavidpakmanshow/
Books mentioned:
1. The Rebel and the Kingdom: https://amzn.to/3p9pLDt
2. Saving Time: https://a…

1 month ago @ lexfridman.com
#374 – Robert Playter: Boston Dynamics CEO on Humanoid and Legged Robotics

Robert Playter is CEO of Boston Dynamics, a legendary robotics company that over 30 years has created some of the most elegant, dextrous, and simply amazing robots ever built, including the humanoid robot Atlas and the robot dog Spot.

Please support this podcast by checking out our sponsors:
– NetSuite: http://netsuite.com/lex to get free product tour
– Linode: https://linode.com/lex to get $100 free credit
– LMNT: https://drinkLMNT.com/lex to get free sample pack
EPISODE LINKS:
Boston Dynamics YouTube: https://youtube.com/@bostondynamics
Boston Dynamics Twitter: https://twitter.com/BostonDynamics
Boston Dynamics Instagram: https://www.instagram.com/bostondynamicsofficial
Boston Dynamics Website: h…

1 month, 1 week ago @ lexfridman.com
#373 – Manolis Kellis: Evolution of Human Civilization and Superintelligent AI

Manolis Kellis is a computational biologist at MIT.

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://www.eightsleep.com/lex to get special savings
– NetSuite: http://netsuite.com/lex to get free product tour
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
– InsideTracker: https://insidetracker.com/lex to get 20% off
EPISODE LINKS:
Manolis Website: http://web.mit.edu/manoli/
Manolis Twitter: https://twitter.com/manoliskellis
Manolis YouTube: https://youtube.com/@ManolisKellis1
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouT…

1 month, 2 weeks ago @ lexfridman.com
#372 – Simone Giertz: Queen of Sh*tty Robots, Innovative Engineering, and Design

Simone Giertz is an inventor, designer, engineer, and roboticist famous for a combination of humor and brilliant creative design in the systems and products she creates.

Please support this podcast by checking out our sponsors:
– MasterClass: https://masterclass.com/lex to get 15% off
– InsideTracker: https://insidetracker.com/lex to get 20% off
– Athletic Greens: https://athleticgreens.com/lex to get 1 month of fish oil
EPISODE LINKS:
Simone's YouTube: https://www.youtube.com/@simonegiertz
Simone's Twitter: https://twitter.com/SimoneGiertz
Simone's Instagram: https://www.instagram.com/simonegiertz
YETCH Store: https://yetch.store
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Pod…

1 month, 3 weeks ago @ lexfridman.com
#371 – Max Tegmark: The Case for Halting AI Development

Max Tegmark is a physicist and AI researcher at MIT, co-founder of the Future of Life Institute, and author of Life 3.0: Being Human in the Age of Artificial Intelligence.

Please support this podcast by checking out our sponsors:
– Notion: https://notion.com
– InsideTracker: https://insidetracker.com/lex to get 20% off
– Indeed: https://indeed.com/lex to get $75 credit
EPISODE LINKS:
Max's Twitter: https://twitter.com/tegmark
Max's Website: https://space.mit.edu/home/tegmark
Pause Giant AI Experiments (open letter): https://futureoflife.org/open-letter/pause-giant-ai-experiments
Future of Life Institute: https://futureoflife.org
Books and resources mentioned:
1. Life 3.0 (book): https://amzn.to/3UB9r…

1 month, 3 weeks ago @ lexfridman.com
#370 – Edward Frenkel: Reality is a Paradox – Mathematics, Physics, Truth & Love

Edward Frenkel is a mathematician at UC Berkeley working on the interface of mathematics and quantum physics.

He is the author of Love and Math: The Heart of Hidden Reality.

Please support this podcast by checking out our sponsors:
– House of Macadamias: https://houseofmacadamias.com/lex and use code LEX to get 20% off your first order
– Shopify: https://shopify.com/lex to get free trial
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
EPISODE LINKS:
Edward's Website: https://edwardfrenkel.com
Edward's Book – Love and Math: https://amzn.to/40Bgxh0
Edward's Twitter: https://twitter.com/edfrenkel
Edward's YouTube: https://youtube.com/edfrenkel
Edward's Instagram: https://instagram.com/…

1 month, 4 weeks ago @ lexfridman.com
#369 – Paul Rosolie: Amazon Jungle, Uncontacted Tribes, Anacondas, and Ayahuasca

Paul Rosolie is a conservationist, explorer, author, filmmaker, real life Tarzan, and founder of Junglekeepers which today protects over 50,000 acres of threatened habitat.

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://www.eightsleep.com/lex to get special savings
– BetterHelp: https://betterhelp.com/lex to get 10% off
– Athletic Greens: https://athleticgreens.com/lex to get 1 month of fish oil
EPISODE LINKS:
Paul's Instagram: https://instagram.com/paulrosolie
Paul's Twitter: https://twitter.com/PaulRosolie
Junglekeepers: https://www.junglekeepers.com
VETPAW: https://vetpaw.org
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://appl…

2 months ago @ lexfridman.com
#368 – Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization

Eliezer Yudkowsky is a researcher, writer, and philosopher on the topic of superintelligent AI.

AGI Ruin (blog post): https://lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities
2. Adaptation and Natural Selection: https://amzn.to/40F5gfa
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips
SUPPORT & CONNECT:
– Check out the sponsors above, it's the best way to support this podcast
– Support on Patreon: https://www.patreon.com/lexfridman
– Twitter: https://twitter.c…

2 months, 1 week ago @ lexfridman.com
Microsoft Research Podcast
latest post 1 month ago
139 - Collaborators: Gov4git with Petar Maymounkov and Kasia Sitkiewicz

Today, I’m joined by our first two guests, Petar Maymounkov and Kasia Sitkiewicz.

And what I do at GitHub, uh, I work as a product manager.

Right now, um, communities, there are few ways of like how they make decisions, either majority of the votes or through consensus.

One nice side benefit from this entire project is that Gov4git, uh, enables people to like reflect on what they’ve done and, and what is happening.

[MUSIC] HUIZINGA: Petar and Kasia, thank you so much for coming on the show today and being our first guests on the Collaborators podcast.

1 month ago @ microsoft.com
138 - AI Frontiers: Models and Systems with Ece Kamar

Let’s say I want this AI system to write me an email to you.

And we just need to build these techniques and make them part of the way we build AI systems with these latest models.

Kamar: You know, the word agent comes from agency, and the question is what does agency mean for an AI system?

So those are the three main capabilities we want to have in our AI systems.

Right now, we are seeing writing documents, collecting information from the web, and presenting them, but in the future, what other creative things AI systems and humans can do together?

1 month, 3 weeks ago @ microsoft.com
137 - AI Frontiers: AI for health and the future of research with Peter Lee

Powerful new large-scale AI models like GPT-4 are showing dramatic improvements in reasoning, problem-solving, and language capabilities.

The second episode features Peter Lee, head of Microsoft Research.

Lee was among a group within Microsoft to have early access to GPT-4 for evaluation and experimentation.

Here, he applies his philosophy of tackling research from what will be inevitably true at a future point in time to this current moment.

He also explores the differences that may make integrating today’s AI advancements into health care more attainable, a topic he expands on in the soon-to-be-released book The AI Revolution in Medicine: GPT-4 and Beyond and the New England Journal of Me…

2 months, 1 week ago @ blubrry.com
136 - AI Frontiers: The Physics of AI with Sébastien Bubeck

Powerful new large-scale AI models like GPT-4 are showing dramatic improvements in reasoning, problem-solving, and language capabilities.

This marks a phase change for artificial intelligence—and a signal of accelerating progress to come.

The first episode features Sébastien Bubeck, who leads the Machine Learning Foundations group at Microsoft Research in Redmond.

He and his collaborators conducted an extensive evaluation of GPT-4 while it was in development, and have published their findings in a paper that explores its capabilities and limitations—noting that it shows “sparks” of artificial general intelligence.

https://www.microsoft.com/research

2 months, 2 weeks ago @ blubrry.com
NLP Highlights
last post 1 day, 13 hours ago
140 - Generative AI and Copyright, with Chris Callison-Burch

In this special episode, we chatted with Chris Callison-Burch about his testimony in the recent U.S. Congress Hearing on the Interoperability of AI and Copyright Law.

We started by asking Chris about the purpose and the …

1 day, 13 hours ago @ soundcloud.com
139 - Coherent Long Story Generation, with Kevin Yang

How can we generate coherent long stories from language models?

Ensuring that the generated story has long-range consistency and that it conforms to a high-level plan is typically challenging.

In this episode, Kevin Yang…

2 months, 2 weeks ago @ soundcloud.com
138 - Compositional Generalization in Neural Networks, with Najoung Kim

Compositional generalization refers to the capability of models to generalize to out-of-distribution instances by composing information obtained from the training data.

In this episode we chatted with Najoung Kim, on how…

4 months, 2 weeks ago @ soundcloud.com
137 - Nearest Neighbor Language Modeling and Machine Translation, with Urvashi Khandelwal

We invited Urvashi Khandelwal, a research scientist at Google Brain to talk about nearest neighbor language and machine translation models.

These models interpolate parametric (conditional) language models with non-param…
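The interpolation described above can be sketched in a few lines. The vocabulary size, retrieved neighbors, and mixing weight below are toy values for illustration, not the paper's actual settings:

```python
import numpy as np

def knn_lm_interpolate(p_lm, neighbor_dists, neighbor_tokens, vocab_size, lam=0.25):
    """Blend a parametric LM distribution with a non-parametric kNN
    distribution built from retrieved (distance, next-token) pairs."""
    # Turn negative distances into a distribution over the retrieved neighbors.
    weights = np.exp(-np.asarray(neighbor_dists, dtype=float))
    weights /= weights.sum()
    # Scatter neighbor mass onto the vocabulary (neighbors sharing a token accumulate).
    p_knn = np.zeros(vocab_size)
    for w, t in zip(weights, neighbor_tokens):
        p_knn[t] += w
    return lam * p_knn + (1 - lam) * np.asarray(p_lm)

# Toy example: 5-word vocab, LM favors token 0, close neighbors point to token 3.
p_lm = np.array([0.6, 0.1, 0.1, 0.1, 0.1])
p = knn_lm_interpolate(p_lm, neighbor_dists=[0.1, 0.2, 2.0],
                       neighbor_tokens=[3, 3, 1], vocab_size=5)
print(p)  # still sums to 1; token 3 gains mass from the retrieved neighbors
```

The mixing weight `lam` is the interpolation hyperparameter tuned on held-out data in the kNN-LM setup.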

4 months, 3 weeks ago @ soundcloud.com
Data Skeptic
last post 21 hours ago
Evaluating Jokes with LLMs

Fabricio Goes, a Lecturer in Creative Computing at the University of Leicester, joins us today. Fabricio discussed what creativity entails and how to evaluate jokes with LLMs. He specifically shared the process of evaluating jokes with GPT-3 and GPT-4. He concluded with his thoughts on the future of LLMs for creative tasks.

21 hours ago @ dataskeptic.com
Why Machines Will Never Rule the World

On the show today, we are joined by Barry Smith and Jobst Landgrebe, authors of the book Why Machines Will Never Rule the World.

Jobst shared his stance on whether AI systems are truly intelligent.

Barry further discussed how AI systems are an extension of mathematical theories and thus have limitations.

They both discussed how the will and the desire of humans inherently make humans more complex than machines can ever be.

He also discussed why machines cannot provide solutions to novel problems.

1 week, 1 day ago @ dataskeptic.com
A Psychopathological Approach to Safety in AGI

He also discussed the complexity of the universe, which forces machine learning engineers to develop models with abstractions.

Vahid discussed the communication barrier between human and machine agents.

He cited the work of the Foerster Lab for AI Research (FLAIR), which focuses on emergent communication for machine learning.

Vahid discussed how modeling large language models such as ChatGPT differs from the modeling behaviors of AGIs.

Vahid also recommends Less Wrong for conversations and blogs about AI safety.

2 weeks, 1 day ago @ dataskeptic.com
The NLP Community Metasurvey

On the show today, we are joined by Julian Michael, a postdoc at the New York University Center for Data Science.

His conversation was centered on the NLP Community Metasurvey: a survey aimed at understanding expert opinions on controversial issues in the NLP community.

Julian began by introducing AI alignment and its progression over the years.

He pointed out its similarity with other terms such as AI safety, AI ethics, and responsible AI.

Julian then dived into the NLP community meta-survey.

3 weeks, 2 days ago @ dataskeptic.com
Skeptical Survey Interpretation

Kyle shares his own perspectives on the challenges of getting insight from surveys.

The discussion ranges from commentary on the market research industry to specific advice for detecting disingenuous or fraudulent responses and filtering them from your analysis.

Finally, he shares some quick thoughts on the usage of the Chi-Square test for interpreting cross tab results in survey analysis.
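The Chi-Square test mentioned above compares observed cross-tab counts against the counts expected under independence of rows and columns. A minimal sketch, using an invented 2×2 cross tab (say, respondent segment vs. yes/no answer):

```python
def chi_square_stat(table):
    """Pearson chi-square statistic for a two-way cross tab (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows = segment A/B, columns = answered "yes"/"no".
table = [[90, 60],
         [40, 110]]
stat = chi_square_stat(table)
dof = (len(table) - 1) * (len(table[0]) - 1)
print(stat, dof)
# Compare against the 5% critical value for 1 degree of freedom (about 3.841).
```

In practice a library routine (e.g. a contingency-table test in a stats package) also returns the p-value; the point here is just what the statistic measures.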

4 weeks ago @ dataskeptic.com
The Gallup Poll

To wrap up our season on surveys, we are joined by Jeff Jones, the Senior Editor at Gallup.

Jones has been with Gallup since 1998 and has been involved in the Gallup polls since 2000.

Jeff also discussed the analysis involved after receiving the poll results.

Jeff discussed the evolution of Gallup’s questions in the face of continuous technological changes.

He rounded up by sharing trends of Gallup surveys.

1 month ago @ dataskeptic.com
Inclusive Study Group Formation at Scale

On today’s show, Kyle speaks with Gireeja Ranade, a professor at the University of California at Berkeley.

Gireeja shares some insight on her research: Inclusive Study Group Formation at Scale.

However, finding the right study groups is a challenge.

Gireeja discussed some of the difficulties students faced when finding a study group.

She discussed the findings from her study, including why students leave study groups.

1 month, 1 week ago @ dataskeptic.com
The PhilPapers Survey

He joins us to discuss the PhilPapers Survey project.

The PhilPapers survey aimed to gather philosophers’ opinions on different philosophical topics.

The PhilPapers survey was initially taken in 2009, but there was a follow-up survey in 2020.

Rounding up, David gave his thoughts on how recent AI development may affect philosophers’ opinions on machine consciousness.

Resources:
The 2020 PhilPapers Survey: https://survey2020.philpeople.org/
David Bourget’s website: https://www.dbourget.com/
David Bourget’s Twitter: @dbourget

1 month, 2 weeks ago @ dataskeptic.com
Non-Response Bias

In this episode, Kyle speaks with Yajuan Si, a Research Associate Professor at the Survey Research Center at the University of Michigan.

On the show, Yajuan discusses how to deal with survey non-response bias.

Yajuan shared what causes non-response across different levels.

She also discussed strategies such as weighting and multiple imputations to counter the bias caused by non-response.

She also discussed how researchers could estimate the bias caused by non-response.
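The weighting strategy described above can be illustrated with a toy post-stratification example: respondents are reweighted so that group shares in the sample match known population shares. All shares and response rates below are hypothetical:

```python
# Hypothetical: the population is 50/50 young/old, but respondents skew young.
population_share = {"young": 0.5, "old": 0.5}
respondents = ["young"] * 80 + ["old"] * 20

n = len(respondents)
sample_share = {g: respondents.count(g) / n for g in population_share}
# Post-stratification weight = population share / sample share for each group.
weights = {g: population_share[g] / sample_share[g] for g in population_share}

# Stand-in outcome: each group's hypothetical "yes" rate on a survey question.
yes_rate = {"young": 0.3, "old": 0.7}
naive = sum(yes_rate[g] for g in respondents) / n
weighted = sum(weights[g] * yes_rate[g] for g in respondents) / n
print(naive, weighted)  # the weighted estimate corrects the non-response skew
```

With these numbers the naive estimate is 0.38, while the weighted estimate recovers the population value of 0.5; real adjustments also involve variance trade-offs and, as discussed in the episode, multiple imputation.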

1 month, 3 weeks ago @ dataskeptic.com
Measuring Trust in Robots with Likert Scales

Maria is a Ph.D. student in the CORE Robotics Lab at Georgia Institute of Technology.

They discuss best practices for measuring a respondent’s perception in a survey.

Matthew discussed the process of validation, starting with measuring humans’ trust in robots.

Rounding up, Matthew discussed the trends of robot adoption for the coming years and some new research they are working on.

You can learn more about Matthew’s work by visiting CORE Robotics Lab’s official website.

2 months ago @ dataskeptic.com
CAREER Prediction

Keyon mentioned using two popular longitudinal survey datasets: the National Longitudinal Survey of Youth (NLSY) and the Panel Study of Income Dynamics (PSID).

Keyon discussed how he incorporated a resume dataset that contained the resumes of about 25 million American workers in his work.

Using transfer learning, Keyon discussed the process of training his model to learn from the large resume data and make predictions from the smaller survey dataset.

Keyon’s machine learning model, called CAREER, was also inspired by the Transformer architecture popular in NLP applications.

He mentioned that his CAREER model did not only learn the representation of individual jobs but also the representation of historie…

2 months, 1 week ago @ dataskeptic.com
The Panel Study of Income Dynamics

Noura Insolera, a Research Investigator with the Panel Study of Income Dynamics (PSID) at the Institute for Social Research (ISR), University of Michigan, joins us today.

She focused on the trends and observations in food insecurity.

She explained what food insecurity was and the levels of food insecurity.

She shared the percentage of people living with food insecurity and the observed trends over the years.

She also shared the demography of those prone to food insecurity and its impact on their lives.

2 months, 2 weeks ago @ dataskeptic.com
Survey Design Working Session

Affectionately called the Wikipediatrician, Susan Gerbic is the founder of Guerrilla Skepticism on Wikipedia (GSoW), Monterey County Skeptics and is a self-proclaimed skeptical junkie.

A Skeptical Inquirer contributor, Gerbic is a fellow of CSI and the winner of the 2017 James Randi Foundation award.

While her particular focus has been “Grief Vampires” (psychics), her activism encompasses all areas of skepticism.

Information about her investigations into Grief Vampires can be found here https://www.abouttimeproject.org/about-7.

2 months, 3 weeks ago @ dataskeptic.com
Bot Detection and Dyadic Surveys

Sara Bybee, a postdoctoral research scholar at the University of Utah, joins us today.

On the show, she shares her study which involved detecting social bots in surveys.

Sara’s study was aimed at understanding LGBTQ couples facing cancer.

Sara shared her strategy for validating her suspicion and authenticating the submissions.

Rounding up, she shared some insights about her study on LGBTQ couples facing cancer.

3 months ago @ dataskeptic.com
Reproducible ESP Testing

Zoltán joins us to discuss his research on replicating research findings.

Zoltán began by discussing the current problem with biomedicine and social science journals — the low replication rates of research findings.

He stated how low replicability affects the trustworthiness of published papers.

He also mentioned reasons for the low replicability.

He particularly discussed the design process of working with psychologist Daryl Bem to test the famous Bem experiment.

3 months, 2 weeks ago @ dataskeptic.com
SuperDataScience
last post 1 day, 2 hours ago
685: Tools for Building Real-Time Machine Learning Applications, with Richmond Alake

Richmond Alake, a Machine Learning Architect at Slalom Build, sits down with Jon to share real-time ML insights, tools and career experiences for a high-energy and high impact episode.

From his work at Slalom Build to hi…

1 day, 2 hours ago @ soundcloud.com
684: Get More Language Context out of your LLM

Open-source LLMs, FlashAttention and generative AI terminology: Host Jon Krohn gives us the lift we need to explore the next big steps in generative AI.

Listen to the specific way in which Stanford University’s “exact at…

5 days, 2 hours ago @ soundcloud.com
683: Contextual A.I. for Adapting to Adversaries, with Dr. Matar Haller

Monitoring malicious, user-generated content; contextual AI; adapting to novel evasion attempts: Matar Haller speaks to Jon Krohn about the challenges of identifying, analyzing and flagging malicious information online.

1 week, 1 day ago @ soundcloud.com
682: Business Intelligence Tools, with Mico Yuk

In this week's episode, Mico Yuk, host of 'Analytics on Fire', joins Jon Krohn to share her effective business intelligence and analytics framework, BIDS, for persuading key decision makers.

She crowns one "power" tool a…

1 week, 5 days ago @ soundcloud.com
681: XGBoost: The Ultimate Classifier, with Matt Harrison

Unlock the power of XGBoost by learning how to fine-tune its hyperparameters and discover its optimal modeling situations.

This and more, when best-selling author and leading Python consultant Matt Harrison teams up with…

2 weeks, 1 day ago @ soundcloud.com
680: Automating Industrial Machines with Data Science and the Internet of Things (IoT)

Industrial machinery’s dependence on data science, tech stacks to build IoT platforms, and transitioning from data science to product: This week’s Friday episode with Allegra Alessi explores the minutiae of product owner…

2 weeks, 5 days ago @ soundcloud.com
679: The A.I. and Machine Learning Landscape, with investor George Mathew

Generative AI, MLOps, and making smart investments in AI: This week’s episode is critical listening for AI investors and generative AI creators.

AI investor George Mathew talks with host Jon Krohn about the emerging gene…

3 weeks, 1 day ago @ soundcloud.com
678: StableLM: Open-source "ChatGPT"-like LLMs you can fit on one GPU

StableLM, the new family of open-source language models from the brilliant minds behind Stable Diffusion, is out!

Small but mighty, these models have been trained on an unprecedented amount of data for single-GPU LLMs.

3 weeks, 5 days ago @ soundcloud.com
677: Digital Analytics with Avinash Kaushik

How does one use marketing analytics to drive business success?

Avinash Kaushik, Chief Strategy Officer at Croud and former Sr. Director of Global Strategic Analytics at Google, joins Jon Krohn live for an exciting episod…

4 weeks, 1 day ago @ soundcloud.com
676: The Chinchilla Scaling Laws

Chinchilla AI, and fine-tuning proprietary tasks with large language models: On this week’s Five-Minute Friday, host Jon Krohn outlines the principles of the Chinchilla Scaling Laws, the incredible power of models such a…
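The Chinchilla result boils down to two rough rules of thumb: training compute scales as C ≈ 6·N·D (parameters × tokens), and the compute-optimal token count is roughly 20 tokens per parameter. A quick sketch under those assumptions (the compute budget below is illustrative):

```python
def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Roughly compute-optimal parameter and token counts under the
    Chinchilla heuristic: C ~ 6 * N * D with D ~ 20 * N."""
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# A budget in the ballpark of Chinchilla's own training run:
n, d = chinchilla_optimal(5.76e23)
print(f"params ~ {n:.2e}, tokens ~ {d:.2e}")
```

With this budget the sketch lands near 70B parameters and 1.4T tokens, which matches the often-quoted Chinchilla configuration.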

1 month ago @ soundcloud.com
675: Pandas for Data Analysis and Visualization

Wrangling data in Pandas, when to use Pandas, Matplotlib or Seaborn, and why you should learn to create Python packages: Jon Krohn speaks with guest Stefanie Molin, author of Hands-On Data Analysis with Pandas.

1 month ago @ soundcloud.com
674: Parameter-Efficient Fine-Tuning of LLMs using LoRA (Low-Rank Adaptation)

Models like Alpaca, Vicuña, GPT4All-J and Dolly 2.0 have relatively small model architectures, but they're prohibitively expensive to train even on a small amount of your own data.

The standard model-training protocol ca…

1 month, 1 week ago @ soundcloud.com
673: Taipy, the open-source Python application builder

Vincent Gosselin, CEO and co-founder of Taipy, an open-source Python library, joins Jon Krohn to discuss how to accelerate productivity in Python and build scalable, reusable, and maintainable data pipelines.

Gosselin sh…

1 month, 1 week ago @ soundcloud.com
672: Open-source "ChatGPT": Alpaca, Vicuña, GPT4All-J, and Dolly 2.0

Get started with language models: Learn about the commercial-use options available for your business in this week’s Five-Minute Friday, where host Jon Krohn discusses four models that have many of the capabilities of Cha…

1 month, 2 weeks ago @ soundcloud.com
671: Cloud Machine Learning

Get to grips with AWS, Azure, and Google Cloud Platform on this week’s episode.

Host Jon Krohn speaks with Kirill Eremenko and Hadelin de Ponteves about CloudWolf, a cloud computing educational platform that prepares student…

1 month, 2 weeks ago @ soundcloud.com
Data Science at Home
last post 1 week ago
Unleashing the Force: Blending Neural Networks and Physics for Epic Predictions (Ep. 230)

Source: Physics Informed Deep Learning

In this enlightening episode of our podcast, we delve into the fascinating realm of Physics Informed Neural Networks (PINNs) and explore how they combine the extraordinary prediction capabilities of neural networks with the unparalleled accuracy of physics models.

Join us as we unravel the mysteries behind PINNs and their potential to revolutionize various scientific and engineering domains.

We’ll discuss the underlying principles that enable these networks to incorporate physical laws and constraints, resulting in enhanced predictions and a deeper understanding of complex systems.
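The core mechanism behind incorporating physical laws is a loss that mixes a data-fit term with a physics-residual term. A toy sketch on the ODE du/dt = -u, with a one-parameter candidate solution and a finite-difference derivative standing in for the automatic differentiation a real PINN would use:

```python
import numpy as np

# Toy "physics-informed" loss for the ODE du/dt = -u with u(0) = 1,
# using a one-parameter candidate family u(t; k) = exp(-k * t).
t = np.linspace(0.0, 2.0, 50)
u_data = np.exp(-t)  # noise-free "measurements" of the true solution

def pinn_loss(k):
    u = np.exp(-k * t)
    du_dt = np.gradient(u, t)            # finite-difference stand-in for autograd
    physics_residual = du_dt + u         # ~0 wherever the ODE is satisfied
    data_term = np.mean((u[:5] - u_data[:5]) ** 2)   # supervise only a few points
    physics_term = np.mean(physics_residual ** 2)    # enforce physics everywhere
    return data_term + physics_term

# A real PINN trains network weights by gradient descent; here a grid
# scan over the single parameter k is enough to show the idea.
ks = np.linspace(0.5, 1.5, 101)
best_k = ks[np.argmin([pinn_loss(k) for k in ks])]
print(best_k)
```

The combined loss is minimized near k = 1, the true decay rate, even though only five points carry data supervision; the physics term constrains the rest of the domain.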

Sponsors: This episode is supported by Mimecast – the email security solut…

1 week ago @ datascienceathome.com
AI’s Impact on Software Engineering: Killing Old Principles? [RB] (Ep. 229)

In this episode, we dive into the ways in which AI and machine learning are disrupting traditional software engineering principles.

However, this reliance on AI can come at a cost to the tried-and-true methods of software engineering.

From real-time market data to sophisticated analytics, powerful trading tools, and more, Bloomberg engineers work with systems that operate at scale.

If you’re a software engineer looking for an exciting and fulfilling career, head over to bloomberg.com/careers to learn more.

Industries and governments around the world are fighting back, unveiling new regulations meant to better protect data against this rising threat.

1 week, 6 days ago @ datascienceathome.com
Warning! Mathematical Mayhem Ahead: Demystifying Liquid Time-Constant Networks (Ep. 228)

Source: IEEE Spectrum

Hold on to your calculators and buckle up for a wild mathematical ride in this episode!

Brace yourself as we dive into the fascinating realm of Liquid Time-Constant Networks (LTCs), where mathematical content reaches new heights of excitement.

In this mind-bending adventure, we demystify the intricacies of LTCs: from complex equations to mind-boggling mathematical concepts, we break them down into digestible explanations.

References:
https://www.science.org/doi/10.1126/scirobotics.adc8892
https://spectrum.ieee.org/liquid-neural-networks#toggle-gdpr

2 weeks, 6 days ago @ datascienceathome.com
Efficiently Retraining Language Models: How to Level Up Without Breaking the Bank (Ep. 227)

🎙️ In our latest podcast episode, we dive deep into the world of LoRA (Low-Rank Adaptation) for large language models (LLMs).

This groundbreaking technique is revolutionizing the way we approach language model training by leveraging low-rank approximations.

We’ll explore the ingenious strategies and practical methods that empower you to fine-tune your language models without breaking the bank.

Whether you’re a researcher, developer, or language model enthusiast, this episode is packed with invaluable insights.

Tune in and join the conversation as we unravel the secrets of LoRA (low-rank adaptation) and show you how to retrain LLMs on a budget.
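The low-rank trick at the heart of the technique can be sketched in a few lines of numpy: freeze the pretrained weight W and train only a rank-r update BA. The layer sizes and rank below are illustrative, not from any particular model:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 512, 512, 8

W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(size=(rank, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, rank))                 # trainable up-projection (zero init)

def lora_forward(x, scale=1.0):
    # Adapted layer: W x + scale * B (A x); only A and B are trained.
    return W @ x + scale * (B @ (A @ x))

full_params = W.size
lora_params = A.size + B.size
print(lora_params / full_params)  # fraction of trainable parameters

x = rng.normal(size=d_in)
# With B initialized to zero, the adapted layer starts exactly at the pretrained one.
assert np.allclose(lora_forward(x), W @ x)
```

Here only about 3% of the layer's parameters are trained, which is where the budget savings come from; real LoRA applies this to the attention projection matrices of a Transformer.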

3 weeks, 3 days ago @ datascienceathome.com
Revolutionize Your AI Game: How Running Large Language Models Locally Gives You an Unfair Advantage Over Big Tech Giants (Ep. 226)

This is the first episode about the latest trend in artificial intelligence that’s shaking up the industry – running large language models locally on your machine.

This new approach allows you to bypass the limitations and constraints of cloud-based models controlled by big tech companies, and take control of your own AI journey.

We’ll delve into the benefits of running models locally, such as increased speed, improved privacy and security, and greater customization and flexibility.

We’ll also discuss the technical requirements and considerations for running these models on your own hardware, and provide practical tips and advice to get you started.

Join us as we uncover the secrets t…

1 month ago @ datascienceathome.com
Rust: A Journey to High-Performance and Confidence in Code at Amethix Technologies (Ep. 225)

Source: litslink.com

The journey of porting our projects to Rust was intense, but it was a decision we made to improve the quality of our software.

The migration was not an easy task, as it required a considerable amount of time and resources.

However, it was worth the effort as we have seen significant improvements in code reusability, code cleanliness, and performance.

In this episode, I will tell you why you should consider taking that journey too.

1 month, 1 week ago @ datascienceathome.com
DataForge: Object Tracking Methods and its Recent Developments (Ep. 3)

☄️ 🔍 During our chat, we explored how object tracking technology takes initial object detections, creates visual models for these objects, and follows them as they move throughout a video.

By assigning unique identifiers in real time, we can count and monitor each tracked object.

You may have seen this technology in action, represented by a bounding box or other visual indicators that surround and follow objects in a video.

YouTube Link

Brought to you by the Data Science at Home Podcast team: “DataForge” fireside chats are monthly casual conversations where we host data science advocates and discuss trending topics in this amazing field.

1 month, 2 weeks ago @ datascienceathome.com
The Power of Graph Neural Networks: Understanding the Future of AI – Part 2/2 (Ep.224)

In this episode of our podcast, we dive deep into the fascinating world of Graph Neural Networks.

Next, we turn our attention to Generative Graph Models, which enable the creation of new graph structures that are similar to those in a given dataset.

Whether you’re a seasoned graph neural network expert or just starting to explore the field, this episode has something for you.

So join us for a deep dive into the power and potential of Graph Neural Networks.

References:
Machine Learning with Graphs – http://web.stanford.edu/class/cs224w/
A Comprehensive Survey on Graph Neural Networks – https://arxiv.org/abs/1901.00596

1 month, 2 weeks ago @ datascienceathome.com
The Power of Graph Neural Networks: Understanding the Future of AI – Part 1/2 (Ep.223)

Source: NVIDIA

In this episode, I explore the cutting-edge technology of graph neural networks (GNNs) and how they are revolutionizing the field of artificial intelligence.

I break down the complex concepts behind GNNs and explain how they work by modeling the relationships between data points in a graph structure.

I also delve into the various real-world applications of GNNs, from drug discovery to recommendation systems, and how they are outperforming traditional machine learning models.

Join me and demystify this exciting area of AI research and discover the power of graph neural networks.
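One message-passing step of the kind described, modeling relationships via the graph structure, can be sketched in GCN-style form H' = ReLU(Â H W). The toy graph and feature sizes below are invented for illustration:

```python
import numpy as np

# One graph-convolution step on a tiny 4-node graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
A_hat = A + np.eye(4)                       # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt    # symmetric degree normalization

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))                 # node features (4 nodes, 3 features)
W = rng.normal(size=(3, 2))                 # learnable layer weight

# Aggregate each node's neighborhood, transform, apply ReLU.
H_next = np.maximum(0.0, A_norm @ H @ W)
print(H_next.shape)  # (4, 2)
```

Stacking several such layers lets information propagate across multi-hop neighborhoods, which is what powers the drug-discovery and recommendation applications mentioned in the episode.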

1 month, 3 weeks ago @ datascienceathome.com
Leveling Up AI: Reinforcement Learning with Human Feedback (Ep. 222)

In this episode, we dive into the not-so-secret sauce of ChatGPT, and what makes it a different model than its predecessors in the field of NLP and Large Language Models.

We explore how human feedback can be used to speed up the learning process in reinforcement learning, making it more efficient and effective.

Whether you’re a machine learning practitioner, researcher, or simply curious about how machines learn, this episode will give you a fascinating glimpse into the world of reinforcement learning with human feedback.

Sponsors: This episode is supported by How to Fix the Internet, a cool podcast from the Electronic Frontier Foundation, and Bloomberg, global provider of financial news…

2 months ago @ datascienceathome.com
The promise and pitfalls of GPT-4 (Ep. 221)

In this episode, we explore the potential of the highly anticipated GPT-4 language model and the challenges that come with its development.

From its ability to generate highly coherent and creative text to concerns about ethical considerations and the potential misuse of such technology, we delve into the promise and pitfalls of GPT-4.

Join us as we speak with experts in the field to gain insights into the latest developments and the impact that GPT-4 could have on the future of natural language processing.

2 months ago @ datascienceathome.com
AI’s Impact on Software Engineering: Killing Old Principles? (Ep. 220)

In this episode, we dive into the ways in which AI and machine learning are disrupting traditional software engineering principles.

With the advent of automation and intelligent systems, developers are increasingly relying on algorithms to create efficient and effective code.

However, this reliance on AI can come at a cost to the tried-and-true methods of software engineering.

Join us as we explore the pros and cons of this paradigm shift and discuss what it means for the future of software development.

2 months, 3 weeks ago @ datascienceathome.com
Prove It Without Revealing It: Exploring the Power of Zero-Knowledge Proofs in Data Science (Ep. 218)

In this episode, we dive into the fascinating world of zero-knowledge proofs and their impact on data science.

Zero-knowledge proofs allow one party to prove to another that they know a secret without revealing the secret itself.

This powerful concept has numerous applications in data science, from ensuring data privacy and security to facilitating secure transactions and identity verification.

We explore the mechanics of zero-knowledge proofs, their real-world applications, and how they are revolutionizing the way we handle sensitive information.

Join us as we uncover the secrets of zero-knowledge proofs and their impact on the future of data science.
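A classic concrete instance of "prove you know a secret without revealing it" is the Schnorr identification protocol. This toy sketch uses deliberately tiny, insecure group parameters just to show the mechanics:

```python
import secrets

# Toy Schnorr identification: the prover convinces the verifier that it
# knows x with y = g^x mod p, without revealing x.
# Tiny demo parameters (NOT secure): p prime, q = 233 divides p - 1,
# and g = 4 generates a subgroup of order q.
p, q, g = 467, 233, 4

x = secrets.randbelow(q)          # prover's secret
y = pow(g, x, p)                  # public key, shared with the verifier

# One round of the interactive protocol:
k = secrets.randbelow(q)          # prover's ephemeral nonce
r = pow(g, k, p)                  # 1. prover sends commitment r = g^k
c = secrets.randbelow(q)          # 2. verifier replies with a random challenge
s = (k + c * x) % q               # 3. prover answers s = k + c*x mod q

# Verifier checks g^s == r * y^c (mod p), learning nothing about x.
ok = pow(g, s, p) == (r * pow(y, c, p)) % p
print(ok)
```

The check works because g^s = g^(k + c·x) = g^k · (g^x)^c = r · y^c; real deployments use large groups (or elliptic curves) and the Fiat–Shamir transform to make the proof non-interactive.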

3 months, 1 week ago @ datascienceathome.com
DataForge: Differentiable Robotic Simulations (Ep. 2)

🤔 What are differentiable physics engines?

How does Deep Learning tackle the most challenging robotics simulation problems?

🤖🦾Join the 2nd episode of the “DataForge” fireside chat series with Francesco and Nabi where we will be discussing the applications of NNs in physics engines and simulations.

3 months, 1 week ago @ datascienceathome.com
DataForge: Job roles in Data Science (Ep. 1)

🔥 🤔 What skills and qualifications are required for each job role in Data Science?

What are the most emerging roles in Data Science and ML?

Staying generalist or specialist?

If you are looking to learn more about the most recent emerging roles in Data Science, this live session is the perfect one for you to watch!

Hosted by the Data Science at Home Podcast team, “DataForge” fireside chats are monthly casual conversations where we host data science advocates and discuss trending topics in this amazing field.

3 months, 1 week ago @ datascienceathome.com