Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
latest post 57 minutes ago
[Project] Fine tuning Hugging Face's GPT-2 transformer model for text generation
57 minutes ago @ reddit.com
[Discussion] - "data sourcing will be more important than model building in the era of foundational model fine-tuning"
2 hours ago @ reddit.com
[P] Releasing customized language model pre-training acceleration toolkit: ExtremeBERT
2 hours ago @ reddit.com
[P] Probably the Fastest Open Source Stable Diffusion is released
2 hours ago @ reddit.com
[Project] I used whisper to transcribe 2500 episodes from around 80 podcasts and made it searchable.
2 hours ago @ reddit.com
[D] Pretraining a visual model
2 hours ago @ reddit.com
[D] Annotations Tools' Bounding Box to Mask Feature Implementation
2 hours ago @ reddit.com
[P] Build a model that can translate English into Spanish. In short, Implemented Transformer from scratch in TensorFlow 2.x. 🚀
3 hours ago @ reddit.com
[D] Reproducing MS Vision API labels with open models
3 hours ago @ reddit.com
[D] Risk assessment
4 hours ago @ reddit.com
[D] Neural Network and Decision Tree are the same
4 hours ago @ reddit.com
IEEE ICASSP Clarity Challenge for ML/AI Speech Enhancement [R]
4 hours ago @ reddit.com
[D] Is it possible to get a confusion matrix for Optical Character Recognition
5 hours ago @ reddit.com
[D] Cloud providers for hobby use
6 hours ago @ reddit.com
[R] On Distillation of Guided Diffusion Models: “For diffusion models trained on the latent-space (Stable Diffusion), our approach is able to generate hi-fidelity images using as few as 1-4 denoising steps, accelerating inference by >10x compared to existi
9 hours ago @ reddit.com
Towards Data Science
latest post 41 minutes ago
How to estimate and reduce the carbon footprint of machine learning models
41 minutes ago @ towardsdatascience.com
Exploring Midjourney V4 for Creating Digital Art
2 hours ago @ towardsdatascience.com
Finding related articles with TF-IDF and Python
2 hours ago @ towardsdatascience.com
’Tis the Season to Explore our Best Deep Dives
3 hours ago @ towardsdatascience.com
How to Evaluate Clustering Performance without Ground Truth Labels
3 hours ago @ towardsdatascience.com
Sssneaky Data Problems that Creep in Over Time
9 hours ago @ towardsdatascience.com
Understanding SVR and Epsilon Insensitive Loss with Scikit-learn
9 hours ago @ towardsdatascience.com
Artificial Intelligence for Geospatial Analysis with Pytorch’s TorchGeo (part 2)
9 hours ago @ towardsdatascience.com
k-Nearest Neighbors for Lithology Classification from Well Logs Using Python
20 hours ago @ towardsdatascience.com
How to (Finally) Install TensorFlow GPU on WSL2
20 hours ago @ towardsdatascience.com
Data Retrieval with SQL — Tutorial & Examples
20 hours ago @ towardsdatascience.com
How To Lie With Data
20 hours ago @ towardsdatascience.com
Fluorescent Neuronal Cells dataset — part II
20 hours ago @ towardsdatascience.com
Techniques to Improve the Performance of a DQN Agent
20 hours ago @ towardsdatascience.com
How Type I Error, Confidence Intervals, Type II Error, and Power Are All Related
20 hours ago @ towardsdatascience.com
Distill.pub
latest post: none
The Gradient
latest post 1 month, 4 weeks ago
Artificial Curiosity as Moral Virtue

An artificially intelligent agent could act and think in ways driven by curiosity, and, in these ways, exercise the moral virtue of curiosity - letting them become more and more human-like.

We begin to investigate this question through an exploration of artificial curiosity in the context of the free energy principle - as put forward by neuroscientist Karl Friston.

The free energy principle provides a way for adaptive systems to unify action, perception, and learning.

Other applications of the free energy principle span areas of exploration and novelty seeking.

Artificial curiosity, as it could follow from the free energy principle, could be examined appropriately.

1 month, 4 weeks ago @ thegradient.pub
Artificial Intelligence and the Future of Demos

In one of the claimed birthplaces of democracy, Ancient Athens, demos covered all Athenian citizens, who had an equal say in collective decision-making.

And only the real people – the demos – can recognize the ‘real’ from the ‘not-so-real.’ In essence, if you are not part of the demos, you have no say in collective decision-making.

Original photo: Daria Shevtsova / Pixabay, edited by author.

In democracies, it is the demos that should have the topmost power over collective decision-making.

If we want to preserve democracy and/or demos based on equality and freedom, we could start asking ourselves: Is our future demos nation-state-based or global, and how could we align AI development with this…

2 months ago @ thegradient.pub
Causal Inference: Connecting Data and Reality

Any causal inference problem consists of two parts: causal identification and statistical inference.

Causal inference theory

Causal inference is a theory that describes, discriminates, and measures causal relationships, developed from statistics.

Causal representation learning

Unlike the traditional causal inference approach, which uses causal graphs to connect random variables to complete the causal discovery and reasoning hypothesis task, the problem of causal representation learning has recently attracted more attention.

is not valid, and causal inference studies exactly such a situation: how to learn a causal model that can work under different distributions, imply a causal mechanism (Cau…

2 months, 4 weeks ago @ thegradient.pub
The Future of Speech Recognition: Where Will We Be in 2030?

"By 2030, speech recognition will feature truly multilingual models, rich standardized output objects, and be available to all and at scale.

Finally, speech recognition will engender the principles of responsible AI, and operate without bias."

Source: Hannun, Awni, “Speech Recognition is not Solved”.

Citation: For attribution in academic contexts or books, please cite this work as Migüel Jetté and Corey Miller, "The Future of Speech Recognition: Where will we be in 2030?

BibTeX citation: @article{miller2021futureofowork, author = {Jetté, Migüel and Miller, Corey}, title = {The Future of Speech Recognition: Where will we be in 2030?

3 months, 1 week ago @ thegradient.pub
Symmetries, Scaffolds, and a New Era of Scientific Discovery

Figure 1: Timeline of the drug discovery procedure, from target validation to clinical launch, from [1].

This article will cover how the application of geometric deep learning and the field of molecular machine learning is ushering us into a new era of scientific discovery.

Citation: For attribution in academic contexts or books, please cite this work as Meilina Reksoprodjo, "Symmetries, Scaffolds, and a New Era of Scientific Discovery", The Gradient, 2022.

[7] J. Vamathevan et al., "Applications of machine learning in drug discovery and development", Nature Reviews Drug Discovery, vol.

Available: https://thegradient.pub/ai-scientific-revolution/

[13] H. Chen, O. Engkvist, Y. Wang, M. Olivecron…

4 months ago @ thegradient.pub
Overview of Graph Theory and Alzheimer's Disease

2015) Photos of the brains of Paul Broca’s two aphasic patients, Leborgne (top row) and Lelong (bottom row) (Dronkers et al.

During the last decade, more advanced techniques borrowed from graph theory have been applied to brain imaging research (Rubinov and Sporns 2010).

Importantly, graph-based analyses can model the dynamics of the entire brain network all at once, thereby enabling investigation of network-wide properties.

Citation: For attribution in academic contexts or books, please cite this work as Rebecca Ehrenkranz, "Overview of Graph Theory and Alzheimer's Disease", The Gradient, 2022.

BibTeX citation: @article{ehrenkranz_graph_ad, author = {Ehrenkranz, Rebecca}, title = {Overview of Gra…

4 months, 1 week ago @ thegradient.pub
Lessons from the GPT-4Chan Controversy

Preamble

This article contains an objective summary of a recent controversy related to an AI model named GPT-4chan, as well as a subjective commentary with my thoughts on it.

The main questions I will address are the following: Can GPT-4chan cause harm to people? Can GPT-4chan contribute to AI research? Is GPT-4chan more 'truthful' than GPT-3? Should the GPT-4chan model have been released to the public? What was the intent behind developing, deploying, and distributing GPT-4chan? Was deploying GPT-4chan bots to interact with people on a message board unethical?

Can GPT-4chan cause harm to people

Can a bot that disseminates hate speech on the internet (e.g.

Moreover, now that the whole ordeal predictably l…

5 months, 3 weeks ago @ thegradient.pub
AI is Ushering In a New Scientific Revolution

With manifold impacts stretching the length of the scientific method, AI is ushering in a scientific revolution through groundbreaking discoveries, novel techniques and augmented tools, and automated methods that advance the speed and accuracy of the scientific process.

Beyond the protein-folding problem, AI has proven its scientific worth with discoveries in a number of fields, from cosmology and chemistry to semiconductor design and materials science.

AI is ushering in a new scientific revolution by making remarkable breakthroughs in a number of fields, unlocking new approaches to science, and accelerating the pace of science and innovation.

Citation: For attribution in academic contexts or…

5 months, 4 weeks ago @ thegradient.pub
Working on the Weekends - an Academic Necessity?

For most people, these roles outside of work occupy their evenings, weekends and vacations, yet almost every academic I know seems to fill every available bit of time with academic pursuits.

Not working on weekends seemed like a graduation from the messy life of an undergrad into the more structured life of an adult.

And the strangest thing is, I do not know where I got the idea that I should be working on weekends.

Citation: For attribution in academic contexts or books, please cite this work as Claas Voelcker, "Working on the Weekends - an Academic Necessity?

BibTeX citation: @article{class2022working, author = {Voelcker, Claas}, title = {Working on the Weekends - an Academic Necessity?

6 months ago @ thegradient.pub
Lessons From Deploying Deep Learning To Production

I spent my last year at Berkeley doing research in deep learning for computer vision and working on Caffe, one of the first popular deep learning libraries.

Now I’m at Aquarium, where I get to help a multitude of companies deploying deep learning models to solve important problems for society.

I’ve learned a lot of lessons about doing deep learning in production, and I'd like to share some of those lessons with you so you don’t have to learn them the hard way.

For attribution in academic contexts or books, please cite this work as Peter Gao, "Lessons From Deploying Deep Learning To Production", The Gradient, 2022.

BibTeX citation: @article{gao2022lessons, author = {Gao, Peter}, title = {Lesson…

6 months, 2 weeks ago @ thegradient.pub
An Illustrated Tour of Applying BERT to Speech Data

The core idea behind wav2vec 2.0 is to teach the model to do two things in parallel: Quantize continuous speech data into discrete units automatically.

Wav2vec uses 2 groups with 320 possible words in each group, hence a theoretical maximum of 320 x 320 = 102,400 speech units.

The final context vectors then go through the last projection layer to match the dimension of the quantized speech units Qt.

Fine-tuning and downstream tasks

This concludes our tour of wav2vec 2.0 and its pre-training process.

HuBERT re-uses embeddings from the BERT encoder to improve targets, while wav2vec 2.0 only uses the output of the convolutional network for quantization.

6 months, 3 weeks ago @ thegradient.pub
Beyond Message Passing, a Physics-Inspired Paradigm for Graph Neural Networks

Graph Neural Networks (GNNs) are by far the most common among graph ML methods and the most popular neural network architectures overall [2].

Citation: For attribution in academic contexts or books, please cite this work as Michael Bronstein, "Beyond Message Passing, a Physics-Inspired Paradigm for Graph Neural Networks", The Gradient, 2022.

BibTeX citation: @article{dlneuro2022, author = {Bronstein, Michael}, title = {Beyond Message Passing, a Physics-Inspired Paradigm for Graph Neural Networks}, journal = {The Gradient}, year = {2022}, howpublished = {\url{https://thegradient.pub/graph-neural-networks-beyond-message-passing-and-weisfeiler-lehman}}, }

[1] See e.g.

A general form of message passing an…

6 months, 4 weeks ago @ thegradient.pub
Focus on the Process: Formulating AI Ethics Principles More Responsibly

On their own, AI ethics principles are insufficient to improve AI systems.

Instead, I suggest that each organization should articulate its own AI ethics principles, and I sketch ways to do so responsibly.

The search for universal AI ethics principles

In recent years, several research groups have sought unifying themes in current AI ethics principles.

A key question is how to formulate AI ethics principles responsibly and how to tell that an organization has developed its principles responsibly.

But the first question to ask is “which principles?”, and my answer is: Don’t settle for “universal” AI ethics principles.

7 months ago @ thegradient.pub
Deep Learning in Neuroimaging

Specifically, this overview will first explain some common neuroimaging modalities more in-depth and then discuss applications of deep learning in conjunction with some of the unique characteristics of neuroimaging data.

These unique characteristics tie into a broader movement in deep learning, namely that data understanding should be a goal in itself to maximize the impact of applied deep learning.

Unique and leverageable aspects of neuroimaging data

Among others [20], one critical challenge with deep learning in the field of neuroimaging is the limited number of samples; many neuroimaging datasets range from roughly 300 to 1300 subjects.

Data understanding and interpretation is an essentia…

7 months ago @ thegradient.pub
AI Startups and the Hunt for Tech Talent in Vietnam

Tech and AI startups have continued to gain prominence as the AI wave swept the country in 2018.

Hanoi and Ho Chi Minh City have developed a robust ecosystem for tech startups, with dominating sectors including AI, e-commerce, fin-tech, and enterprise solutions.

The ecosystem of 149 AI startups has attracted funding from both domestic and regional venture capital firms.

However, according to computer science professor Than Khoat, these structural initiatives have only started rolling out since 2020, and have not yet translated to a significant boost in tech talent supply to meet the pressing demand of tech talent of the fast growing tech ecosystem.

Citation: For attribution in academic contex…

7 months, 2 weeks ago @ thegradient.pub
TheSequence
latest post 4 hours ago
🚀🚀 Edge#248: Foundation Models are Creating the Industrial Era of AI

💥 What’s New in AI: Foundation Models are Creating the Industrial Era of AI

In recent months, we have seen a crazy proliferation of artificial intelligence (AI) applied to consumer and business needs.

From an outsider’s perspective, it might seem that all AI applications rely on versions of the same few models.

The AI community has already found a new term to refer to this phenomenon: foundation models.

Recently, I’ve been using the analogy of foundation models as the industrial era of AI.

Foundation Models

The term foundation model

4 hours ago @ thesequence.substack.com
📃 Edge#247: Classifying ML Interpretability Methods

The ML interpretability space is vast and full of distinctive methods.

From that perspective, it is essential to establish general criteria by which we can understand the different ML interpretability methods.

One of the most popular taxonomies in ML theory classifies interpretability techniques across three main dimensions:

2 days, 5 hours ago @ thesequence.substack.com
📝 Guest post: Burst Compute: Scaling Workloads Across Thousands of GPUs in the Cloud, Instantly*

The smartest companies are evolving toward more flexible, on-demand cloud infrastructure using a technique called burst compute, which provides enterprises with accessible, efficient, and cost-effective computing.

When you are able to access compute, legacy providers often charge exorbitant fees for ingress/egress, which can be debilitating for many clients.

CoreWeave Cloud is designed to address availability constraints, making it dead simple to scale up when your workloads require it, and scale down when they don’t.

Scale seamlessly across the industry's broadest range of NVIDIA GPUs on CoreWeave Cloud, paying only for the compute you need, when you need it.

The cost of bursting on Core…

3 days, 4 hours ago @ thesequence.substack.com
🤗 Stable Diffusion v2

A few months ago, Stability AI shocked the ML community by open-sourcing Stable Diffusion, taking a different approach from big AI labs like OpenAI, Google Brain, and Meta AI.

Since its initial release, Stable Diffusion has become the most widely used generative AI model, with applications across different domains.

Stable Diffusion v2 is a significant upgrade to its predecessor.

Depth2Img is another interesting addition to Stable Diffusion that can infer depth from an input image and represent that in the generated outputs.

Stable Diffusion v2 is another release that pushes the boundaries of the generative AI space.

4 days, 5 hours ago @ thesequence.substack.com
📝 Guest post: How to Succeed as an ML/AI Startup?

The startup business environment has been evolving over the last couple decades.

In fact, AI startups have come and gone faster, on average, than other businesses.

A comprehensive AI business solution requires a set of must-have components just to get off the ground, let alone to generate stable value and lucrative revenue streams.

With Managed AI, startups can offload a significant portion of tasks to an MSP.

Starting with Managed AI is a wise strategy for building a strong foundation for future success!

6 days, 4 hours ago @ thesequence.substack.com
🏋️‍♂️🤼‍♀️ Edge#246: OpenAI Used These Best Practices to Mitigate Risks While Training DALL-E 2

💥 What’s New in AI: OpenAI Used These Best Practices to Mitigate Risks While Training DALL-E 2

The deep learning space is experiencing a small innovation in the area of text-to-image synthesis.

This is the main reason why companies like OpenAI, Meta or Google haven’t yet released full implementations of these models.

How can we establish safety rails around the pretraining and inference process of text-to-image generation models?

In the case of OpenAI, the company recently discussed some of the best practices used to mitigate risk in DALL-E 2.

OpenAI’s approach to risk mitigation with DALL-E 2 can be summarized in three fundamental areas:

1 week, 2 days ago @ thesequence.substack.com
🔮 Edge#245: A New Series About Machine Learning Interpretability

In this issue: we start a new series about machine learning interpretability; we discuss Manifold, an architecture for debugging ML models; we explore Meta’s Captum, a framework for deep learning interpretability.

💡 ML Concept of the Day: A New Series About Machine Learning Interpretability

“If you can’t explain it simply, you don’t understand it well enough.”

The famous quote from Albert Einstein certainly doesn’t seem to apply to machine learning (ML) systems.

This new series will explore the most important ML interpretability methods and technologies developed in recent years.

In general, the value proposition of ML interpretability can be decomposed into four fundamental benefits:

1 week, 2 days ago @ thesequence.substack.com
📝 Guest post: What is a Vector Database?*

Specifically, he looked at 1) what features go into a mature vector database, 2) how a vector database differs from vector search libraries, 3) how a vector database differs from vector search plugins in traditional databases or search systems, and 4) the key challenges associated with building a vector database.

Vector search plugins for traditional databases

Great, now that we’ve established the difference between vector search libraries and vector databases, let’s take a look at how vector databases differ from vector search plugins.

Technical challenges

Earlier in this guest post, I listed the desired features a vector database should implement, before comparing vector databases to vector…

1 week, 3 days ago @ thesequence.substack.com
🌅 The Era of Foundation Models is Here

Foundation models are shifting the ML development paradigm from creating brand-new models to fine-tuning large pretrained models.

Stanford University created the Center for Research on Foundation Models (CRFM), a new initiative focused on studying best practices around foundation models.

Just this week, Snorkel AI released Data-centric Foundation Model Development, a new series of additions to the Snorkel Flow platform to fine-tune and distill foundation models.

Finally, the CRFM team unveiled a new benchmark to facilitate the holistic evaluation of foundation models.

The era of foundation models is definitely upon us!

1 week, 4 days ago @ thesequence.substack.com
📝 Guest post: Using One Methodology to Solve The Three Failure Modes

The technology is mostly ready, but bridging the gap between proof of concept and production depends on fixing the training data quality problem.

Encord Active uses different metric functions to parametrize the data, constructing metrics on different data features.

To improve model performance efficiently, they need to granularly decompose model performance by specific data features to provide targeted interventions that improve the composition of the dataset and by extension model performance.

Encord Active, Encord’s open-source active learning tool, uses one methodology to provide interventions for improving data and label quality across each failure mode.

For data quality, Encord Active provides inf…

1 week, 6 days ago @ thesequence.substack.com
🗣👥 Edge#244: This Google Model Combines Reasoning and Acting in a Single Language Model

On Thursdays, we dive deep into one of the freshest research papers or technology frameworks that is worth your attention.

💥 What’s New in AI: ReAct – a Model that Combines Reasoning and Acting in a Single Language Model

Reasoning is one of the emerging capabilities we see in the new generation of language-pretrained models.

Neural network architectures such as GPT-3 and PaLM have excelled in reasoning tasks such as question-answering or even solving mathematical problems.

However, most language models still fail to translate reasoning into direct action in a given environment.

On the other hand, we have seen models in areas such as embodied tasks and gaming that are incredibly efficient at …

2 weeks ago @ thesequence.substack.com
📌 Event: apply(recsys)—ML experts from Slack, ByteDance & more share their recommender system learnings

Are you building a machine learning recommender system or planning to?

Then you won’t want to miss apply(recsys), a free, virtual event that focuses on the specific challenges of building recommender systems.

Sign up today to learn from industry experts from Slack, ByteDance, Feast, and more.

apply() is an event series for machine learning and data teams to discuss the practical data engineering challenges faced when building real-time machine learning systems.

Participants learn from industry experts and share best practices with the community.

2 weeks, 1 day ago @ thesequence.substack.com
🔂 Edge#243: Text-to-Image Synthesis Models – Recap

→ In Edge#225: we explain latent diffusion models; discuss the original latent diffusion paper; explore Hugging Face Diffusers, a library for state-of-the-art diffusion models.

→ In Edge#227: we explain autoregressive text-to-image models; discuss Google’s Parti, an impressive autoregressive text-to-image model; explore MS COCO, one of the most common datasets in text-to-image models.

→ In Edge#231: we explore Text-to-image synthesis with GANs; discuss Google’s XMC-GAN, a modern approach to text-to-image synthesis; explore NVIDIA GauGAN2 Demo.

→ In Edge#239: we dive deeper into Stable Diffusion; discuss retrieval augmented diffusion models that bring memory to text-to-image synthesis; explo…

2 weeks, 2 days ago @ thesequence.substack.com
☝️CoreWeave to Offer NVIDIA HGX H100 Supercomputers - Supporting Cutting Edge AI & ML Companies*

CoreWeave is proud to be among the first providers to offer cloud instances with NVIDIA HGX H100 supercomputers.

NVIDIA’s HGX H100 platform represents a major leap forward for the AI community, enabling up to seven times better efficiency in high-performance computing (HPC) applications, up to nine times faster AI training on the largest models and up to 30 times faster AI inference than the NVIDIA HGX A100.

CoreWeave is purpose-built for large-scale GPU-accelerated workloads, specialized to serve the most demanding AI and machine learning applications.

Ready to Check Out the HGX H100?

Learn more about the NVIDIA HGX H100 today, or reserve now.

2 weeks, 3 days ago @ thesequence.substack.com
✂️✂️ ML Talent Layoffs and Priorities Reset

As a result, most tech companies hired ML talent aggressively and embarked on super-ambitious AI projects.

These layoffs have, awkwardly, coincided with sizable funding rounds by ML startups in areas such as generative AI.

What’s really happening with ML talent?

During the bull market, large tech companies tended to over-hire ML talent, and funding was available for capital-intensive ML efforts such as self-driving cars.

At the same time, the layoffs of ML talent in tech giants have pushed talent toward VC-backed startups.

2 weeks, 4 days ago @ thesequence.substack.com
Synced Review
latest post 18 hours ago
DeepMind Studies Process- vs Outcome-based Model Supervision, Significantly Reducing Reasoning…
18 hours ago @ medium.com
No Images Are Needed! Allen AI’s CLOSE Learns to Complete Visual Tasks From Text Inputs Alone
1 day, 22 hours ago @ medium.com
NeurIPS 2022 | Meta AI, Stanford & Tübingen U Beat Neural Scaling Laws via Data Pruning
2 days, 16 hours ago @ medium.com
NeurIPS 2022 | MIT & Meta Enable Gradient Descent Optimizers to Automatically Tune Their Own…
6 days, 12 hours ago @ medium.com
NeurIPS 2022 Announces Its Outstanding Main Track Papers, Outstanding Dataset & Benchmark Papers…
1 week, 1 day ago @ medium.com
Moody Moving Faces: NVIDIA’s SPACEx Delivers High-Quality Portrait Animation with Controllable…
1 week, 1 day ago @ medium.com
Talking to Models: Stanford U & Microsoft Method Enables Developers to Correct Model Bugs via…
1 week, 2 days ago @ medium.com
Running Fast Transformers on CPUs: Intel Approach Achieves Significant Speed Ups and SOTA…
1 week, 6 days ago @ medium.com
DeepMind’s Epistemic Neural Networks Enable Large Language Model Fine-Tuning With 50% Less Data
2 weeks ago @ medium.com
Solving Brain Dynamics Gives Rise to Flexible Machine Learning Models
2 weeks, 2 days ago @ medium.com
‘MrsFormer’ Employs a Novel Multiresolution-Head Attention Mechanism to Cut Transformers’ Compute…
2 weeks, 3 days ago @ medium.com
UT Austin & Sony AI’s VIOLA Object-Centric Imitation Learning Method for Robot Manipulation…
2 weeks, 6 days ago @ medium.com
MIT, Northeastern & Technion Propose ROME for Efficient Locating and Editing of Factual…
3 weeks, 2 days ago @ medium.com
GrowthEase Shares Its Latest Achievements in AI-Powered Technology with the World for the First…
3 weeks, 2 days ago @ medium.com
Baidu’s Parallel Evoformer and Branch Parallelism Strategy Accelerates AlphaFold2 Training by 38.67%
3 weeks, 6 days ago @ medium.com
📓 Cool Blogs
ODS.ai Habr
latest post 1 week, 4 days ago
What I Wish I Had Known About ML System Design Earlier

Clarifying the problem

Do not rush straight into solving the problem; it is better to ask as many good clarifying questions as possible.

This way you will show that you have broad experience from both the technical and the product perspective.

You can mention the perennial problem of training-serving skew (the discrepancy between model training and inference) and how feature stores can solve it.

Third-party data sources (Redis, Postgres, S3) needed to initialize the model and run inference are often added.

It is worth having at least some understanding of system architecture: which components systems usually consist of and how those components are connected.

1 week, 4 days ago @ habr.com
Practical Metric Learning

About the metric learning task

The metric learning task is to build a function of two objects that estimates the distance (similarity) between them.

Next we will look at solving this task with neural networks, that is, deep metric learning, where two main approaches stand out: Siamese.
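
To make the idea of a learned distance function concrete, here is a minimal Python sketch of the contrastive loss that is commonly paired with Siamese networks. It is an illustration under assumptions, not the article's own code: the embeddings are plain lists standing in for the output of a shared encoder (not shown), and the names `euclidean_distance`, `contrastive_loss`, and the `margin` parameter are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Contrastive loss for one pair of embeddings.

    same=True  -> penalize any distance between the two embeddings.
    same=False -> penalize only pairs that are closer than `margin`.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical embeddings, as if produced by the same encoder.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same=True))
print(contrastive_loss([0.0, 0.0], [0.5, 0.0], same=False))
```

Pairs labeled as similar are pulled together because the loss grows with distance, while dissimilar pairs contribute nothing once they are at least `margin` apart.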

Deep metric learning and classification tasks can blend into each other, which makes the terminology confusing.

If I had to single out a characteristic difference, I would say that in classification the classes in the train and test sets are the same, whereas in metric learning they need not be.

Yes, for metric learning, as for classification, there is a set of popular datasets, for example, image…

1 month ago @ habr.com
Running ML Scripts in the Cloud with dstack. Bonus: Launching Open-Source Projects

The problem with such platforms is that you have to integrate fully into the platform and use only the tools it offers.

The traditional approach is to use a database and to set up a central facade (backend) for working with this state.

About the launch

Here I would like to share my experience of launching projects in general and open-source projects in particular.

If you are not yet part of the community, you can ask a good acquaintance from that community to support your launch with a sincere like or share.

If you would like to talk about dstack, running ML scripts, or developing open-source projects, write in a private messag…

1 month, 2 weeks ago @ habr.com
Speech Recognition, Subtitle Generation, and Language Learning with Whisper

The result was a rather noisy dataset that includes, among other things, transcriptions produced by other ASR systems, a lot of silence and noise, laughter, applause, and so on.

The total came to 680,000 hours in 97 languages, of which 117,000 hours are not in English.

Transcription

Let's try to transcribe several typical videos of different genres and with different vocabulary.

And that is several times less.

This is neither good nor bad, since requirements for the result vary.

1 month, 3 weeks ago @ habr.com
New Launch of the Natural Language Processing Course

TL;DR: This autumn the Open Data Science community and Huawei are relaunching the course on natural language processing.

We are relaunching the Natural Language Processing course.

I will be giving the lectures; I have worked in NLP for the past 10 years, have worked at Yandex and VKontakte, and have defended a Candidate of Sciences dissertation.

The course is being run in this form for the fifth time.

The link will be in the course group.

2 months, 2 weeks ago @ habr.com
Data Science Pet Projects. FAQ

In your own project you are a jack of all trades, as well as the PO, CTO, CEO (and a bit of HR).

Finding a project topic and data to analyze

In data analysis pet projects, the topic is inseparable from the data.

Finding data for example 1: there are sensors that measure steps, pulse, breathing depth, heart rate, and body temperature.

What was released: an ML bot that can move around based on an incoming frame image and an inventory state vector.

ODS nickname: Sergei. Spent two years building a pet project on GAN/Deepfake and got much better at DL along the way; the project is described in a Habr article.

3 months, 3 weeks ago @ habr.com
AI, Crypto, MLOps, and a Team Pet Project

The third part covers how we organized the work, the difficulties we ran into, and the hacks for boosting team productivity that we eventually arrived at.

MLFlow and the model training service

We implemented an abstract training class with subclasses for a simple tf-idf & logreg model and for a BERT model.

Adversarial validation for detecting data drift

Adversarial validation is an approach I learned about on Kaggle that seems to be constantly reinvented under different names in both academia and industry.
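
A rough Python sketch of the adversarial validation idea: label rows from a reference sample 0 and rows from a newer sample 1, train a classifier to tell them apart, and read the held-out ROC AUC as a drift signal (near 0.5 suggests no detectable drift, near 1.0 suggests the samples differ). Everything here is an assumption made for illustration: the single numeric feature, the tiny hand-written logistic regression, and all function names are hypothetical and are not the project's actual tf-idf or BERT pipeline.

```python
import math
import random


def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logreg(xs, ys, lr=0.1, epochs=200):
    """One-feature logistic regression trained by batch gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    return w, b


def adversarial_auc(reference, current, seed=0):
    """Train on a shuffled half of the labeled rows, report AUC on the other half."""
    rows = [(x, 0) for x in reference] + [(x, 1) for x in current]
    random.Random(seed).shuffle(rows)
    half = len(rows) // 2
    train, test = rows[:half], rows[half:]
    w, b = fit_logreg([x for x, _ in train], [y for _, y in train])
    scores = [w * x + b for x, _ in test]
    return auc(scores, [y for _, y in test])


rng = random.Random(1)
reference = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(3.0, 1.0) for _ in range(200)]  # synthetic drifted sample
print(adversarial_auc(reference, shifted))
```

In practice one would swap the toy model for whatever classifier suits the real features; the only essential part is the "which sample did this row come from" label.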

Team roles

It seems that even in a pet project it is a good idea to assign roles, so that people do not jostle and fight over tasks.

It would be good to take the pet project to a polished demo, which you can also…

5 months ago @ habr.com
How We Took 1st Place in the Matching Task of the Data Fusion Contest 2022, or How a Neural Network Beat Boosting

That is, MRR = 1 if the correct answer is in first position, 0.5 if it is second, 0.33 if it is third, and 0.01 if it is last.
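
The values above follow from the reciprocal rank, 1/position of the correct answer, which is averaged over queries to obtain MRR. A minimal Python sketch, assuming 1-based positions; the function names are hypothetical, and the 100-candidate list in the usage note is inferred only from the 0.01 figure.

```python
def reciprocal_rank(ranked_candidates, correct):
    """Return 1/position (1-based) of `correct`, or 0.0 if it is absent."""
    for position, candidate in enumerate(ranked_candidates, start=1):
        if candidate == correct:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings, answers):
    """Average the reciprocal rank over all queries."""
    scores = [reciprocal_rank(r, a) for r, a in zip(rankings, answers)]
    return sum(scores) / len(scores)


# Two queries: the correct answer is ranked first in one and second in the other.
print(mean_reciprocal_rank([["a", "b", "c"], ["a", "b", "c"]], ["a", "b"]))
```

With a hypothetical list of 100 candidates, `reciprocal_rank(list(range(100)), 99)` corresponds to the last-position case and evaluates to 1/100.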

Transactions:

Here, besides the client ID, there is an MCC code with two fields containing its text description, a currency code with its description, the transaction amount (both positive and negative), and the transaction date and time.

We assumed that in this case the shared vector space would also be present at the input to the RNN, and that the network would be able to find, for example, semantically similar MCCs in transactions and categories in the clickstream.

For the timestamp field we computed the same time features as for transactions.

The difference from NLP methods was that we masked and predicted not the token (transaction) itself but its vector.

5 months, 3 weeks ago @ habr.com
DIY. Books for Everyone, for Free

Finding them, however, is not so easy, and most likely they will be children's books or selected classics.

K. stood for a long time on the wooden bridge that led from the main road to the Village, gazing into the seeming emptiness.

You can read more about the application's interface and how to use it here, and about the technical details here.

You can choose which language goes on which side of the page and which source text is used as the basis for forming the paragraphs of the parallel text.

The templates contain built-in variables for enabling color highlighting, the path to the cover image, hints for Japanese and Chinese, and a few others.

5 месяцев, 3 недели назад @ habr.com
Причинно-следственный анализ в машинном обучении: итоги 2021 г
Причинно-следственный анализ в машинном обучении: итоги 2021 г Причинно-следственный анализ в машинном обучении: итоги 2021 г

In this article, below the cut, we would like to cover the trends in the development of Causal Inference in ML in 2021. Causal Inference in ML: 2021 in review. First we will talk in general terms, and then go into more detail on the most interesting points.

The Nobel Prize in Economics was awarded for the development of CI methods, and the largest ML conferences (NeurIPS, ICML) held workshops on CI for ML.

Interpretable & Causal ML Track, Data Fest Online 2021: For the third time, the annual Data Fest hosted a track on Reliable ML, the Interpretable & Causal ML Track 2021.

The talk ranked among the top talks of the Open Data Science community in 2021. The topic of Causal Shapley Values made a splash in 2020, and in 2021…

The emergence of qual…

6 months ago @ habr.com
Braille recognition system. Reading what is written in white on white

This service is now used by hundreds of people both in Russia and abroad.

Laziness is the engine of progress: You have all seen Braille characters in elevators and in clinics.

In ordinary writing, letters are represented by connected lines, whereas in Braille they are a combination of 1 to 6 dots located at the nodes of an imaginary grid.
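To make the "1 to 6 dots" description concrete, here is a small sketch that enumerates every non-empty dot combination and maps it to a Unicode Braille character. It assumes the standard six-dot cell numbering (dots 1 to 3 in the left column, 4 to 6 in the right) and the Unicode Braille Patterns block, where dot n sets bit n-1 above U+2800. Neither detail comes from the article, and the function names are hypothetical.

```python
from itertools import combinations

BRAILLE_BASE = 0x2800  # start of the Unicode "Braille Patterns" block


def braille_char(dots):
    """Map a collection of raised dots (numbered 1..6) to its Unicode Braille character."""
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 6:
            raise ValueError(f"dot number out of range: {dot}")
        mask |= 1 << (dot - 1)  # dot n sets bit n-1
    return chr(BRAILLE_BASE + mask)


def all_cells():
    """Enumerate every non-empty combination of the six dots."""
    return [
        braille_char(combo)
        for size in range(1, 7)
        for combo in combinations(range(1, 7), size)
    ]


cells = all_cells()
print(len(cells))            # 63 non-empty patterns (2**6 - 1)
print(braille_char({1}))     # single raised dot 1
print(braille_char({1, 2}))  # raised dots 1 and 2
```

A recognizer only has to decide, for each of the six grid nodes, whether a dot is present, so each cell reduces to a 6-bit code.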

So it all pointed to the conclusion that applying the published methods requires a scanner, and a special one at that, since an ordinary household scanner is too small.

There is room for improvement in both the optical recognition and the web interface.

6 months ago @ habr.com
Interpretability in machine learning: 2021 in review

Below the cut, let us talk about what interesting things happened in interpretability in 2021. Key trends and events of 2021 in Interpretable ML: First we will talk in general terms, and then go into more detail on the most interesting points.

Perhaps the closest attention in XAI in 2021 went to evaluating the quality of interpretation methods, so that methods can be compared with one another.

Meanwhile, in January 2022 a landmark paper appeared on arXiv in which the authors systematize about 300 XAI papers published at CS conferences in 2014-2020.

Thus, in 2021 various business sources continued to cite the 2018 PwC report on Explainable AI. The review fairly…

6 months, 1 week ago @ habr.com
Causal inference in machine learning

In the next article we will discuss the key trends in the development of causal inference methods in machine learning in 2020-2021.

What is causal inference. Correlation doesn't imply causation: This is the main thesis of econometrics, which over the last 5 years has firmly made its way into ML as well: "Correlation does not imply causation."

In general, I covered business use cases of causal inference in 2021 in one of the posts on the @Reliable_ML Telegram channel back at the beginning of the year.

Causal Inference as the key to balancing classical ML and econometrics. Causal Inference in ML: In 2020, the State of AI report for the first time explicitly stated the need to integrate classical ML with the methods of Causal In…

7 months, 1 week ago @ habr.com
Nuances of speech recognition. Restoring punctuation, numbers, and capital letters

In other words, today we will talk about how to automatically restore punctuation and capitalization (making the right letters uppercase).

If you are building a model for a low-resource language, you can use the Lingtrain project, which I develop for creating parallel books (the project is open and ideas are welcome).

If you are building a model for a popular language, you can use ready-made datasets such as Tatoeba.

For convenience, I packaged the preparation and training scripts into the multipunct repository, so I will refer to it from here on.

If you need to export the model for inference, for example to TorchScript, there is an export() method for that.

7 months, 3 weeks ago @ habr.com
Clean AutoML for "dirty" data: how and why to automate table preprocessing in machine learning

We used them while developing our open-source AutoML framework FEDOT, which certainly has its own peculiarities in both architecture and development paradigm.

If you want to read more about tabular data preprocessing and its varieties, you can start with Предварительная обработка данных (Data Preprocessing) and continue with Data Preprocessing: Concepts.

Let us say up front that we will consider only the most critical changes to the data; transformations like "reshaping a one-dimensional target array into a column vector" are assumed to be performed as needed in any reasonably large AutoML library.

There is also optional preprocessing available for some models, …

8 months ago @ habr.com
Machine Learning Mastery
latest post 4 days, 21 hours ago
Implementing Gradient Descent in PyTorch
4 days, 21 hours ago @ machinelearningmastery.com
Training a Linear Regression Model in PyTorch
1 week ago @ machinelearningmastery.com
Making Linear Predictions in PyTorch
1 week ago @ machinelearningmastery.com
Loading and Providing Datasets in PyTorch
1 week, 5 days ago @ machinelearningmastery.com
Using Dataset Classes in PyTorch
2 weeks ago @ machinelearningmastery.com
Calculating Derivatives in PyTorch
2 weeks, 5 days ago @ machinelearningmastery.com
Two-Dimensional Tensors in Pytorch
3 weeks ago @ machinelearningmastery.com
One-Dimensional Tensors in Pytorch
3 weeks, 2 days ago @ machinelearningmastery.com
365 Data Science courses free until November 21
4 weeks, 1 day ago @ machinelearningmastery.com
Attend the Data Science Symposium 2022, November 8 in Cincinnati
1 month ago @ machinelearningmastery.com
A Brief Introduction to BERT
1 month ago @ machinelearningmastery.com
Data Engineering for ML: Optimize for Cost Efficiency
1 month ago @ machinelearningmastery.com
Interactive Machine Learning Live Course with Dr. Kirk Borne
1 month, 1 week ago @ machinelearningmastery.com
Inferencing the Transformer Model
1 month, 1 week ago @ machinelearningmastery.com
Plotting the Training and Validation Loss Curves for the Transformer Model
1 month, 1 week ago @ machinelearningmastery.com
ML in Production
latest post 6 months, 3 weeks ago
Driving Experimentation Forward through a Working Group (Experimentation Program Series: Guide 03)

The experimentation working group is a group of individuals whose goal is to implement the experimentation program i.e.

Data Science Participation in the Working Group: Which data science team members should participate in the working group?

This person is responsible for organizing the working group, leading the working group meetings (we’ll discuss this next), and influencing the working group participants.

The Working Group Meeting: The working group should meet regularly to achieve its goal of implementing the ExPr.

Tactical Tips for Running an Experimentation Working Group: Let’s discuss a few tactical tips for improving the working group’s probability of success.

6 months, 3 weeks ago @ mlinproduction.com
What is an Experimentation program and Who is Involved? (Experimentation Program Series: Guide 02)

An experimentation program is the mechanism by which a company uses randomized controlled experiments to generate positive business results.

An experimentation program can succeed only when the right people are involved.

Exactly how this ideation, planning, implementation, and analysis is completed is the process component of an experimentation program.

Data Science: Data science plays two important roles in an effective experimentation program.

This second role should be played by data science managers or product managers who sit on a data science team.

8 months ago @ mlinproduction.com
Building An Effective Experimentation Program – 01 Introduction

These are some of the words used to describe the products and services offered by the world’s largest and most successful businesses.

But it’s very likely that this experience wasn’t crafted in some sort of top-down, divine-intervention-like manner either.

Many business leaders today are familiar with examples of companies that have evolved their products and services, and correspondingly optimized their profit-and-loss statements, through experimentation.

Data science teams don’t need to be convinced of the benefits of running experiments.

But often they lack the business knowledge, cross-team relationships, and structured processes for engaging with business teams and helping them optimiz…

8 months, 2 weeks ago @ mlinproduction.com
Protected: TODO

My goal is to help data scientists, ML engineers, and AI product managers build and operate machine learning systems in production.

Learn more about why I started MLinProduction.

9 months, 4 weeks ago @ mlinproduction.com
Sorta Insightful
latest post 2 months ago
Generative Modelling is Still Accelerating

In the months since, image generation has gone from a thing some people talked about, to something everyone was talking about.

I read a post from someone who discussed AI asceticism, and then acknowledged that they could not do it, the image generation was too fun to play with.

People have normalized that it is possible to get high quality language-guided image generation really, really quickly.

I think there’s only a few domains where we actually have enough human data at the moment.

I don’t think they’ll lead to fundamental floor raising of what we believe ML models are capable of.

2 months ago @ alexirpan.com
Seven Years Later

This January, the team I was on won MIT Mystery Hunt, the biggest puzzlehunt of the year.

See, people don’t quite understand how long it takes to write Mystery Hunt.

…markdown
414 2022-01-22-mh-2022.markdown
400 2022-04-15-do-what-i-mean.markdown

I’m a bit surprised the ML-related post has fewer views than the Mystery Hunt post.

I’m guessing shades of what this post would have been will appear in other posts I write later.

3 months, 2 weeks ago @ alexirpan.com
I'm Bad at Twitter

I’m bad at Twitter.

I know I’m bad at Twitter.

There’s a machine learning Twitter, a philosophy Twitter, a history Twitter, a My Little Pony Twitter, a Smash Bros Twitter.

People tell me ML Twitter is worth it.

It’s quite likely that I’m losing out on both ML knowledge and career equity by not being more active on Twitter.

4 months, 2 weeks ago @ alexirpan.com
My 2022 r/place Adventure

Lots of big communities have little interest in r/place, and lots of little communities have outsized presence in r/place.

The Dustforce Discord talked about doing something for r/place, but hadn’t done anything, so I made a pixel art template in hopes it would get the ball rolling.

After scanning existing r/place pixel art, I realized our target image was somewhat big for our community size, so I prepared a smaller version instead.

Our art template and their art template overlapped by 1 pixel, and we both really wanted that pixel.

We even had time to adjust our template and fill in more space with Dustforce pixel art, adding the S+ icon we had last time r/place happened.

7 months ago @ alexirpan.com
The Dawn of Do What I Mean

SayCan is a robot learning system that we’ve been developing for about the past year.

The language generation is the easy part, while the value function + policy are the hard parts.

Meanwhile, Google Brain announced their PaLM language model, trained with 540B parameters on 780 billion tokens.

Let’s just say it’s not a good look for anyone claiming deep learning models are plateauing.

Similar to language generation, progress here might overstate the state of the field, because it’s improving things we naturally find interesting.

7 months, 2 weeks ago @ alexirpan.com
inFERENCe
latest post 9 months ago
Implicit Bayesian Inference in Large Language Models?

March 3, 2022

Exchangeable sequences as Implicit Learning Machines: Before talking about the paper, let me first refresh those old ideas about exchangeable sequences and implicit learning.

The de Finetti theorem connects such sequence models to Bayesian inference, saying that any such distribution can be decomposed as a mixture over i.i.d.
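For reference, the decomposition that the de Finetti theorem refers to is usually written as follows for an infinitely exchangeable sequence. The symbols $\theta$ (a latent parameter) and $\pi$ (a prior over it) are notation assumed here, not taken from the post, and the density form is a simplification of the general measure-theoretic statement.

```latex
p(x_1, \dots, x_n) = \int \prod_{i=1}^{n} p(x_i \mid \theta) \, \pi(\theta) \, d\theta
```

Read this way, predicting $x_{n+1}$ from $x_1, \dots, x_n$ under such a model amounts to implicitly updating a posterior over $\theta$.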

From Exchangeable sequences to Mixtures of HMMs: But GPT-3 is a language model, and clearly language tokens are not exchangeable.

In-context learning: The core idea of this paper is that perhaps in-context learning exploits this implicit Bayesian inference, inherent to statistical models of language, to solve…

9 months ago @ inference.vc
The Eastern European's Guide to Writing Reference Letters

One phrase I often use to describe what it's like to read reference letters for Eastern European applicants to PhD and Master's programs in Cambridge.

I decided to write this guide for students so they can share it with their professors when asking for reference letters.

Reference letters are often used as ammunition to justify decisions internally, and to determine who gets prioritised for various scholarship and funding competitions.

If you have a candidate you enthusiastically support, don't be afraid to ask for help writi…

9 months ago @ inference.vc
The Spectator
latest post None
The Unofficial Google Data Science Blog
latest post None
Off the Convex Path
latest post 4 months, 2 weeks ago
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Networks

We find that, analogously to matrix and tensor factorizations, the implicit regularization in hierarchical tensor factorization strives to lower a notion of rank (called hierarchical tensor rank).

For our current purpose it suffices to know that a hierarchical tensor factorization consists of multiple local tensor factorizations, whose components we call the local components of the hierarchical factorization.

Basically, if a tensor can be represented through hierarchical tensor factorization with few local components, then it has low hierarchical tensor rank.

Seeing that the implicit regularization in matrix and tensor factorizations leads to low matrix and tensor ranks, respectively, in ou…

4 months, 2 weeks ago @ offconvex.org
Predicting Generalization using GANs

A central problem of generalization theory is the following: Given a training dataset and a deep net trained with that dataset, give a mathematical estimate of the test error.

This blog post is about the topic of a NeurIPS 20 competition, Predicting Generalization in Deep Learning, which suggested using machine learning techniques to understand network properties that promote generalization!

This blog post describes our ICLR22 spotlight paper, coauthored with Nikunj Saunshi and Arushi Gupta, that gives a surprisingly easy method to predict generalization using Generative Adversarial Nets or GANs.

Observation 2) Training deep net classifiers usin…

5 months, 4 weeks ago @ offconvex.org
Jay Alammar
latest post 1 month, 4 weeks ago
The Illustrated Stable Diffusion

The image generator goes through two stages: 1- Image information creator: This component is the secret sauce of Stable Diffusion.

The image information creator works completely in the image information space (or latent space).

This concludes the description of image generation by diffusion models mostly as described in Denoising Diffusion Probabilistic Models.

Speed Boost: Diffusion on Compressed (latent) Data Instead of the Pixel Image. To speed up the image generation process, the Stable Diffusion paper runs the diffusion process not on the pixel images themselves, but on a compressed version of the image.

The released Stable Diffusion model uses ClipText (A GPT-based model), while the paper …

1 month, 4 weeks ago @ jalammar.github.io
Applying massive language models in the real world with Cohere

The company trains massive language models (both GPT-like and BERT-like) and offers them as an API (which also supports finetuning).

Semantic search has to be one of the most exciting applications of sentence embedding models.

Finetuning tends to lead to the best results language models can achieve.

This article explains the intuitions around finetuning representation/sentence embedding models.

This is a walkthrough of one of the most common use cases of embedding models -- text classification.

8 months, 4 weeks ago @ jalammar.github.io
Piekniewski's blog
latest post None
fast.ai NLP
latest post None
Sebastian Ruder
latest post None
🔬 Science
Papers With Code
latest post 6 hours ago
/FKarl/ Transformers are Short Text Classifiers: A Study of Inductive Short Text Classifiers on Benchmarks and Real-world Datasets

Short text classification is a crucial and challenging aspect of Natural Language Processing.

For this reason, there are numerous highly specialized short text classifiers.

In this work, we examine the performance of a variety of short text classifiers as well as the top performing traditional text classifier.

We further investigate the effects on two new real-world short text datasets in an effort to address the issue of becoming overly dependent on benchmark datasets with a limited number of characteristics.

Our experiments unambiguously demonstrate that Transformers achieve SOTA accuracy on short text classification tasks, raising the question of whether specialized short text techniques…

6 hours ago @ paperswithcode.com
/UniFeat/ Universal Feature Selection Tool (UniFeat): An Open-Source Tool for Dimensionality Reduction

The Universal Feature Selection Tool (UniFeat) is an open-source tool developed entirely in Java for performing feature selection processes in various research areas.

It provides a set of well-known and advanced feature selection methods within its significant auxiliary tools.

This allows users to compare the performance of feature selection methods.

Moreover, due to the open-source nature of UniFeat, researchers can use and modify it in their research, which facilitates the rapid development of new feature selection algorithms.

6 hours ago @ paperswithcode.com
/researching-the-unknown/ Towards Training GNNs using Explanation Directed Message Passing

With the increasing use of Graph Neural Networks (GNNs) in critical real-world applications, several post hoc explanation methods have been proposed to understand their predictions.

However, there has been no work in generating explanations on the fly during model training and utilizing them to improve the expressive power of the underlying GNN models.

In this work, we introduce a novel explanation-directed neural message passing framework for GNNs, EXPASS (EXplainable message PASSing), which aggregates only embeddings from nodes and edges identified as important by a GNN explanation method.

EXPASS can be used with any existing GNN architecture and subgraph-optimizing explainer to learn acc…

10 hours ago @ paperswithcode.com
/espnet/ EURO: ESPnet Unsupervised ASR Open-source Toolkit

This paper describes the ESPnet Unsupervised ASR Open-source Toolkit (EURO), an end-to-end open-source toolkit for unsupervised automatic speech recognition (UASR).

EURO adopts the state-of-the-art UASR learning method introduced by the Wav2vec-U, originally implemented at FAIRSEQ, which leverages self-supervised speech representations and adversarial training.

EURO is implemented in ESPnet and follows its unified pipeline to provide UASR recipes with a complete setup.

Extensive experiments on three mainstream self-supervised models demonstrate the toolkit's effectiveness and achieve state-of-the-art UASR performance on TIMIT and LibriSpeech datasets.

EURO will be publicly available at http…

10 hours ago @ paperswithcode.com
/ictnlp/ Rephrasing the Reference for Non-Autoregressive Machine Translation

Non-autoregressive neural machine translation (NAT) models suffer from the multi-modality problem that there may exist multiple possible translations of a source sentence, so the reference sentence may be inappropriate for the training when the NAT output is closer to other translations.

In response to this problem, we introduce a rephraser to provide a better training target for NAT by rephrasing the reference sentence according to the NAT output.

As we train NAT based on the rephraser output rather than the reference sentence, the rephraser output should fit well with the NAT output and not deviate too far from the reference, which can be quantified as reward functions and optimized by re…

10 hours ago @ paperswithcode.com
/aklein1995/ Towards Improving Exploration in Self-Imitation Learning using Intrinsic Motivation

In the absence of an expert (and its subsequent demonstrations), an option is to prioritize well-suited exploration experiences collected by the agent in order to bootstrap its learning process with good exploration behaviors.

However, this solution highly depends on the ability of the agent to discover such trajectories in the early stages of its learning process.

To tackle this issue, we propose to combine imitation learning with intrinsic motivation, two of the most widely adopted techniques to address problems with sparse reward.

In this work intrinsic motivation is used to encourage the agent to explore the environment based on its curiosity, whereas imitation learning allows repeating…

10 hours ago @ paperswithcode.com
/extreme-bert/ ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT

In this paper, we present ExtremeBERT, a toolkit for accelerating and customizing BERT pretraining.

Our goal is to provide an easy-to-use BERT pretraining toolkit for the research community and industry.

Thus, the pretraining of popular language models on customized datasets is affordable with limited resources.

Experiments show that, to achieve the same or better GLUE scores, the time cost of our toolkit is over $6\times$ lower for BERT Base and $9\times$ lower for BERT Large when compared with the original BERT paper.

The documentation and code are released at https://github.com/extreme-bert/extreme-bert under the Apache-2.0 license.

10 hours ago @ paperswithcode.com
/rsy6318/ GeoUDF: Surface Reconstruction from 3D Point Clouds via Geometry-guided Distance Representation

The recent neural implicit representation-based methods have greatly advanced the state of the art for solving the long-standing and challenging problem of reconstructing a discrete surface from a sparse point cloud.

These methods generally learn either a binary occupancy or signed/unsigned distance field (SDF/UDF) as surface representation.

Besides, we model the local geometric structure of the input point clouds by explicitly learning a quadratic polynomial for each point.

This not only facilitates upsampling the input sparse point cloud but also naturally induces unoriented normal, which further augments UDF estimation.

Finally, to extract triangle meshes from the predicted UDF we propos…

11 hours ago @ paperswithcode.com
/vlbthambawita/ MLC at HECKTOR 2022: The Effect and Importance of Training Data when Analyzing Cases of Head and Neck Tumors using Machine Learning

This paper presents the work done by team MLC for the 2022 version of the HECKTOR grand challenge held at MICCAI 2022.

For Task 1, the automatic segmentation task, our approach was, in contrast to earlier solutions using 3D segmentation, to keep it as simple as possible using a 2D model, analyzing every slice as a standalone image.

We proposed two approaches; one using only the CT scans to make predictions and another using a combination of the CT and PET scans.
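
The slice-wise setup described above can be sketched roughly as follows. This is only an illustration, not the team's code: the `(depth, height, width)` axis order, the function name `volume_to_slices`, and the choice to combine CT and PET by stacking them as channels are all assumptions.

```python
import numpy as np


def volume_to_slices(ct, pet=None):
    """Split a (depth, height, width) volume into standalone 2D inputs.

    Assumption: CT and PET are already resampled to the same grid.
    Each returned item has shape (channels, height, width), with one
    channel for CT only and two channels for CT plus PET.
    """
    if pet is None:
        stacked = ct[:, None, :, :]  # add a single channel axis
    else:
        if ct.shape != pet.shape:
            raise ValueError("CT and PET volumes must have the same shape")
        stacked = np.stack([ct, pet], axis=1)  # CT and PET as two channels
    # Every axial slice becomes an independent 2D sample.
    return [stacked[i] for i in range(stacked.shape[0])]


ct_volume = np.zeros((5, 8, 8), dtype=np.float32)
pet_volume = np.ones((5, 8, 8), dtype=np.float32)
ct_only = volume_to_slices(ct_volume)
ct_pet = volume_to_slices(ct_volume, pet_volume)
```

A 2D segmentation model would then be applied to each item independently, and the per-slice masks stacked back into a volume.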

For Task 2, the prediction of recurrence-free survival, we first proposed two approaches, one where we only use patient data and one where we combined the patient data with segmentations from the image model.

In our third approach, …

11 hours ago @ paperswithcode.com
/gamepiaynmo/ Robust and Fast Measure of Information via Low-rank Representation

The matrix-based Rényi's entropy allows us to directly quantify information measures from given data, without explicit estimation of the underlying probability distribution.
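
As a rough illustration of the original matrix-based quantity (not the low-rank variant this paper proposes), the entropy of order α can be computed from the eigenvalues of a unit-trace Gram matrix A as S_α(A) = log2(tr(A^α)) / (1 − α). The sketch below assumes a Gaussian kernel and α ≠ 1; the function names and the bandwidth `sigma` are illustrative choices, not taken from the paper.

```python
import numpy as np


def gaussian_gram(x, sigma=1.0):
    """Gaussian kernel Gram matrix for samples stored as rows of x."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def matrix_renyi_entropy(x, alpha=2.0, sigma=1.0):
    """Matrix-based Renyi entropy of order alpha (alpha != 1), in bits."""
    k = gaussian_gram(x, sigma)
    a = k / np.trace(k)  # normalize so that the eigenvalues sum to 1
    eig = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # drop tiny negatives
    return np.log2(np.sum(eig ** alpha)) / (1.0 - alpha)


# Four identical points carry no information; four far-apart points carry log2(4) bits.
identical = np.zeros((4, 2))
separated = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0], [100.0, 100.0]])
h_identical = matrix_renyi_entropy(identical)
h_separated = matrix_renyi_entropy(separated)
```

The full eigendecomposition costs O(n³) in the number of samples, which is the computational burden the abstract refers to.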

However, this information theoretical quantity is not robust against noise in the data, and is computationally prohibitive in large-scale applications.

To address these issues, we propose a novel measure of information, termed low-rank matrix-based Rényi's entropy, based on low-rank representations of infinitely divisible kernel matrices.

The proposed entropy functional inherits the specialty of the original definition to directly quantify information from data, but enjoys additional advantages including robustne…

11 hours ago @ paperswithcode.com
/pris-cv/ Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification

The main challenge for fine-grained few-shot image classification is to learn feature representations with higher inter-class and lower intra-class variations, with a mere few labelled samples.

To alleviate this problem, prior works predominately use a support set to reconstruct the query image and then utilize metric learning to determine its category.

Upon careful inspection, we further reveal that such unidirectional reconstruction methods only help to increase inter-class variations and are not effective in tackling intra-class variations.

In addition to using the support set to reconstruct the query set for increasing inter-class variations, we further use the query set to reconstruct …

11 hours ago @ paperswithcode.com
/googlebaba/ Directed Acyclic Graph Structure Learning from Dynamic Graphs

Estimating the structure of directed acyclic graphs (DAGs) of features (variables) plays a vital role in revealing the latent data generation process and providing causal insights in various applications.

Although there have been many studies on structure learning with various types of data, the structure learning on the dynamic graph has not been explored yet, and thus we study the learning problem of node feature generation mechanism on such ubiquitous dynamic graph data.

In a dynamic graph, we propose to simultaneously estimate contemporaneous relationships and time-lagged interaction relationships between the node features.

These two kinds of relationships form a DAG, which could effect…

11 hours ago @ paperswithcode.com
/migalkin/ Weisfeiler and Leman Go Relational

Many graph neural networks for such data emerged recently, often outperforming shallow architectures.

However, the design of such multi-relational graph neural networks is ad-hoc, driven mainly by intuition and empirical insights.

Here, we initiate the study of deriving a more principled understanding of multi-relational graph neural networks.

Namely, we investigate the limitations in the expressive power of the well-known Relational GCN and Compositional GCN architectures and shed some light on their practical learning performance.

Further, by leveraging recent progress in designing expressive graph neural networks, we introduce the $k$-RN architecture that provably overcomes the expressiv…

11 hours ago @ paperswithcode.com
/Ascend-Research/ AIO-P: Expanding Neural Performance Predictors Beyond Image Classification

Evaluating neural network performance is critical to deep neural network design but a costly procedure.

Neural predictors provide an efficient solution by treating architectures as samples and learning to estimate their performance on a given task.

However, existing predictors are task-dependent, predominantly estimating neural network performance on image classification benchmarks.

In this paper, we propose a novel All-in-One Predictor (AIO-P), which aims to pretrain neural predictors on architecture examples from multiple, separate computer vision (CV) task domains and multiple architecture spaces, and then transfer to unseen downstream CV tasks or neural architectures.

Extensive experime…

11 hours ago @ paperswithcode.com
/openvinotoolkit/ How to Train an Accurate and Efficient Object Detection Model on Any Dataset

The rapidly evolving industry demands high accuracy of the models without the need for time-consuming and computationally expensive experiments required for fine-tuning.

Moreover, a model and training pipeline, which was once carefully optimized for a specific dataset, rarely generalizes well to training on a different dataset.

This makes it unrealistic to have carefully fine-tuned models for each use case.

It can be used on its own or as a starting point for further fine-tuning for specific use cases when needed.

The source code is available as a part of the OpenVINO Training Extensions (https://github.com/openvinotoolkit/training_extensions).

11 hours ago @ paperswithcode.com
Papers With Code
last post 6 hours ago
/anaymehrotra/ Fair Ranking with Noisy Protected Attributes

Recent works, however, observe that errors in socially-salient (including protected) attributes of items can significantly undermine fairness guarantees of existing fair-ranking algorithms and raise the problem of mitigating the effect of such errors.

We study the fair-ranking problem under a model where socially-salient attributes of items are randomly and independently perturbed.

We present a fair-ranking framework that incorporates group fairness requirements along with probabilistic information about perturbations in socially-salient attributes.

We provide provable guarantees on the fairness and utility attainable by our framework and show that it is information-theoretically impossible…

11 hours ago @ paperswithcode.com
/zeeshankaleem/ SafeSpace MFNet: Precise and Efficient MultiFeature Drone Detection Network

The popularity of unmanned air vehicles (UAVs) is on the rise, as they enable services like traffic monitoring, emergency communications, deliveries, and surveillance.

However, the unauthorized usage of UAVs (a.k.a. drones) may violate security and privacy protocols for security-sensitive national and international institutions.

The presented challenges require fast, efficient, and precise detection of UAVs irrespective of harsh weather conditions, the presence of different objects, and their size to enable SafeSpace.

To overcome these limitations, we propose a precise and efficient multiscale and multifeature UAV detection network for SafeSpace, i.e., MultiFeatureNet (MFNet), an…

11 hours ago @ paperswithcode.com
/inkiinki/ Towards Interpreting Vulnerability of Multi-Instance Learning via Customized and Universal Adversarial Perturbations

The safety of MIL learners is concerning, though, as we can greatly fool them by introducing a few adversarial perturbations.

In this paper, we design two adversarial perturbations to interpret the vulnerability of MIL methods.

The first method can efficiently generate a bag-specific perturbation (called customized) with the aim of pushing the bag out of its original classification region.

We conduct various experiments to verify the performance of these two perturbations, and the results show that both of them can effectively fool MIL learners.

We additionally propose a simple strategy to lessen the effects of adversarial perturbations.

11 hours ago @ paperswithcode.com
/icetttb/ NOPE-SAC: Neural One-Plane RANSAC for Sparse-View Planar 3D Reconstruction

This paper studies the challenging two-view 3D reconstruction in a rigorous sparse-view configuration, which suffers from insufficient correspondences in the input image pairs for camera pose estimation.

We present a novel Neural One-PlanE RANSAC framework (NOPE-SAC for short) that exerts excellent capability to learn one-plane pose hypotheses from 3D plane correspondences.

Building on top of a siamese plane detection network, our NOPE-SAC first generates putative plane correspondences with a coarse initial pose.

It then feeds the learned 3D plane parameters of correspondences into shared MLPs to estimate the one-plane camera pose hypotheses, which are subsequently reweighed …

11 hours ago @ paperswithcode.com
/hulianyuyy/ Self-Emphasizing Network for Continuous Sign Language Recognition

Hand and face play an important role in expressing sign language.

They usually employ extra heavy pose-estimation networks to locate human body keypoints or rely on additional pre-extracted heatmaps for supervision.

To relieve this problem, we propose a self-emphasizing network (SEN) to emphasize informative spatial regions in a self-motivated way, with few extra computations and without additional expensive supervision.

Remarkably, with few extra computations, SEN achieves new state-of-the-art accuracy on four large-scale datasets, PHOENIX14, PHOENIX14-T, CSL-Daily, and CSL.

Visualizations verify the effects of SEN on emphasizing informative spatial and temporal features.

11 hours ago @ paperswithcode.com
/velocitycavalry/ CREPE: Open-Domain Question Answering with False Presuppositions

Information seeking users often pose questions with false presuppositions, especially when asking about unfamiliar topics.

Most existing question answering (QA) datasets, in contrast, assume all questions have well defined answers.

We introduce CREPE, a QA dataset containing a natural distribution of presupposition failures from online information-seeking forums.

We find that 25% of questions contain false presuppositions, and provide annotations for these presuppositions and their corrections.

CREPE provides a benchmark to study question answering in the wild, and our analyses provide avenues for future work in better modeling and further studying the task.

11 hours ago @ paperswithcode.com
/wooboyeong/ Automated anomaly-aware 3D segmentation of bones and cartilages in knee MR images from the Osteoarthritis Initiative

In medical image analysis, automated segmentation of multi-component anatomical structures, which often have a spectrum of potential anomalies and pathologies, is a challenging task.

Subsequently, the extracted data are used for downstream tasks involving semantic segmentation of individual bone and cartilage volumes as well as bone anomalies.

The reconstruction error was used to detect bone anomalies.

A second anomaly-aware network, which was compared to anomaly-naïve segmentation networks, was used to provide a final automated segmentation of the femoral, tibial and patellar bones and cartilages from the knee MR images containing a spectrum of bone anomalies.

The anomaly-aware segmentat…

11 hours ago @ paperswithcode.com
/datalab-fit-ctu/ WeatherFusionNet: Predicting Precipitation from Satellite Data

The short-term prediction of precipitation is critical in many areas of life.

The radar images are available only in areas with ground weather radars.

Thus, we aim to predict high-resolution precipitation from lower-resolution satellite radiance images.

A neural network called WeatherFusionNet is employed to predict severe rain up to eight hours in advance.

WeatherFusionNet is a U-Net architecture that fuses three different ways to process the satellite data: predicting future satellite frames, extracting rain information from the current frames, and using the input sequence directly.

11 hours ago @ paperswithcode.com
/giannisdaras/ Multiresolution Textual Inversion

We extend Textual Inversion to learn pseudo-words that represent a concept at different resolutions.

This allows us to generate images that use the concept with different levels of detail and also to manipulate different resolutions using language.

Once learned, the user can generate images at different levels of agreement to the original concept; "A photo of $S^*(0)$" produces the exact object while the prompt "A photo of $S^*(0.8)$" only matches the rough outlines and colors.

Our framework allows us to generate images that use different resolutions of an image (e.g. details, textures, styles) as separate pseudo-words that can be composed in various ways.

11 hours ago @ paperswithcode.com
/joyhsu0504/ Geoclidean: Few-Shot Generalization in Euclidean Geometry

Euclidean geometry is among the earliest forms of mathematical thinking.

Here we explore these questions by studying few-shot generalization in the universe of Euclidean geometry constructions.

We introduce Geoclidean, a domain-specific language for Euclidean geometry, and use it to generate two datasets of geometric concept learning tasks for benchmarking generalization judgements of humans and machines.

We find that humans are indeed sensitive to Euclidean geometry and generalize strongly from a few visual examples of a geometric concept.

Thus Geoclidean represents a novel few-shot generalization benchmark for geometric concept learning, where the performance of humans and of AI models di…

11 hours ago @ paperswithcode.com
/ljw-git/ Learning Motion-Robust Remote Photoplethysmography through Arbitrary Resolution Videos

Remote photoplethysmography (rPPG) enables non-contact heart rate (HR) estimation from facial videos which gives significant convenience compared with traditional contact-based measurements.

On one side, guided with representative-area information, PFE adaptively encodes the arbitrary resolution facial frames to the fixed-resolution facial structure features.

On the other side, leveraging the estimated optical flow, TFA is able to counteract the rPPG signal confusion caused by head movement, thus benefiting motion-robust rPPG signal recovery.

Besides, we also train the model with a cross-resolution constraint using a two-stream dual-resolution framework, which further helps PFE learn re…

11 hours ago @ paperswithcode.com
/jingjyyao/ CDSM: Cascaded Deep Semantic Matching on Textual Graphs Leveraging Ad-hoc Neighbor Selection

Deep semantic matching aims to discriminate the relationship between documents based on deep neural networks.

In recent years, it has become increasingly popular to organize documents with a graph structure, then leverage both the intrinsic document features and the extrinsic neighbor features to derive discrimination.

In this work, we propose a novel framework, Cascaded Deep Semantic Matching (CDSM), for accurate and efficient semantic matching on textual graphs.

In the second stage, a high-capacity graph-based matching network is employed to compute fine-grained relevance scores based on the well-selected neighbors.

It is worth noting that CDSM is a generic framework which accommodates most …

11 hours ago @ paperswithcode.com
/pennylaneai/ Predicting Properties of Quantum Systems with Conditional Generative Models

Machine learning has emerged recently as a powerful tool for predicting properties of quantum many-body systems.

For many ground states of gapped Hamiltonians, generative models can learn from measurements of a single quantum state to reconstruct the state accurately enough to predict local observables.

Alternatively, kernel methods can predict local observables by learning from measurements on different but related states.

In this work, we combine the benefits of both approaches and propose the use of conditional generative models to simultaneously represent a family of states, by learning shared structures of different quantum states from measurements.

We numerically validate our approach…

11 hours ago @ paperswithcode.com
/convlab/ ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format

Diverse data formats and ontologies of task-oriented dialogue (TOD) datasets hinder us from developing general dialogue models that perform well on many datasets and studying knowledge transfer between datasets.

To address this issue, we present ConvLab-3, a flexible dialogue system toolkit based on a unified TOD data format.

In ConvLab-3, different datasets are transformed into one unified format and loaded by models in the same way.

Compared to the previous releases of ConvLab (Lee et al., 2019b; Zhu et al., 2020b), ConvLab-3 allows developing dialogue systems with many more datasets and enhances the utility of the reinforcement learning (RL) toolkit for dialogue policies.

To showcase the…

11 hours ago @ paperswithcode.com
/bozhenhhu/ Protein Language Models and Structure Prediction: Connection and Progression

The prediction of protein structures from sequences is an important task for function prediction, drug design, and understanding related biological processes.

Recent advances have proved the power of language models (LMs) in processing the protein sequence databases, which inherit the advantages of attention networks and capture useful information in learning representations for proteins.

The past two years have witnessed remarkable success in tertiary protein structure prediction (PSP), including evolution-based and single-sequence-based PSP.

It seems that instead of using energy-based models and sampling procedures, protein language model (pLM)-based pipelines have emerged as mainstream p…

11 hours ago @ paperswithcode.com
Papers With Code
last post 6 hours ago
/passerer/ From Coarse to Fine: Hierarchical Pixel Integration for Lightweight Image Super-Resolution

Image super-resolution (SR) serves as a fundamental tool for the processing and transmission of multimedia data.

Recently, Transformer-based models have achieved competitive performances in image SR.

They divide images into fixed-size patches and apply self-attention on these patches to model long-range dependencies among pixels.

Specifically, LAM presents a hierarchical importance map where the most important pixels are located in a fine area of a patch and some less important pixels are spread in a coarse area of the whole image.

To access pixels in the coarse area, instead of using a very large patch size, we propose a lightweight Global Pixel Access (GPA) module that applies cross-atten…

11 hours ago @ paperswithcode.com
/HuangJunJie2017/ BEVPoolv2: A Cutting-edge Implementation of BEVDet Toward Deployment

It achieves this by omitting the calculation and preprocessing of the large frustum feature.

As a result, it can be processed within 0.82 ms even with a large input resolution of 640x1600, which is 15.1 times faster than the previous fastest implementation.

Besides, it also consumes less cache than the previous implementation, since it no longer needs to store the large frustum feature.

We offer an example of deployment to the TensorRT backend in branch dev2.0 and show how fast the BEVDet paradigm can be processed on it.

Other than BEVPoolv2, we also select and integrate some substantial progress that was proposed in the past year.

11 hours ago @ paperswithcode.com
/Ascend-Research/ GENNAPE: Towards Generalized Neural Architecture Performance Estimators

Predicting neural architecture performance is a challenging task and is crucial to neural architecture design and search.

In this paper, we propose GENNAPE, a Generalized Neural Architecture Performance Estimator, which is pretrained on open neural architecture benchmarks, and aims to generalize to completely unseen architectures through combined innovations in network representation, contrastive pretraining, and fuzzy clustering-based predictor ensemble.

Specifically, GENNAPE represents a given neural network as a Computation Graph (CG) of atomic operations which can model an arbitrary architecture.

Experiments show that GENNAPE pretrained on NAS-Bench-101 can achieve superior transferabil…

13 hours ago @ paperswithcode.com
/Ascend-Research/ AIO-P: Expanding Neural Performance Predictors Beyond Image Classification

Evaluating neural network performance is critical to deep neural network design but a costly procedure.

Neural predictors provide an efficient solution by treating architectures as samples and learning to estimate their performance on a given task.

However, existing predictors are task-dependent, predominantly estimating neural network performance on image classification benchmarks.

In this paper, we propose a novel All-in-One Predictor (AIO-P), which aims to pretrain neural predictors on architecture examples from multiple, separate computer vision (CV) task domains and multiple architecture spaces, and then transfer to unseen downstream CV tasks or neural architectures.

Extensive experime…

13 hours ago @ paperswithcode.com
/miaoxiong2320/ Birds of a Feather Trust Together: Knowing When to Trust a Classifier via Adaptive Neighborhood Aggregation

How do we know when the predictions made by a classifier can be trusted?

This is a fundamental problem that also has immense practical applicability, especially in safety-critical areas such as medicine and autonomous driving.

In this work, we argue that the trustworthiness of a classifier's prediction for a sample is highly associated with two factors: the sample's neighborhood information and the classifier's output.

To combine the best of both worlds, we design a model-agnostic post-hoc approach, NeighborAgg, to leverage these two essential sources of information via an adaptive neighborhood aggregation.

Empirically, extensive experiments on image and tabular benchmarks verify our theory and suggest th…

14 hours ago @ paperswithcode.com
/pjlab-adg/ Analyzing Infrastructure LiDAR Placement with Realistic LiDAR

Infrastructure sensors play a critical role in this research field; however, how to find the optimal placement of infrastructure sensors is rarely studied.

In this paper, we investigate the problem of infrastructure sensor placement and propose a pipeline that can efficiently and effectively find optimal installation positions for infrastructure sensors in a realistic simulated environment.

To better simulate and evaluate LiDAR placement, we establish a Realistic LiDAR Simulation library that can simulate the unique characteristics of different popular LiDARs and produce high-fidelity LiDAR point clouds in the CARLA simulator.

Through simulating point cloud data in different LiDAR placement…

14 hours ago @ paperswithcode.com
/bigdata-inha/ Better Generalized Few-Shot Learning Even Without Base Data

This paper introduces and studies zero-base generalized few-shot learning (zero-base GFSL), which is an extreme yet practical version of the few-shot learning problem.

Motivated by the cases where base data is not available due to privacy or ethical issues, the goal of zero-base GFSL is to newly incorporate the knowledge of a few samples of novel classes into a pretrained model without any samples of base classes.

According to our analysis, we find that both the mean and the variance of the weight distribution of novel classes are not properly established, compared to those of base classes.

In this paper, we overcome this limitation by proposing a simple yet effective normalization method th…

14 hours ago @ paperswithcode.com
/lisj575/ NeAF: Learning Neural Angle Fields for Point Normal Estimation

Normal estimation for unstructured point clouds is an important task in 3D computer vision.

Current methods achieve encouraging results by mapping local patches to normal vectors or learning local surface fitting using neural networks.

To resolve these issues, we propose an implicit function to learn an angle field around the normal of each point in the spherical coordinate system, dubbed Neural Angle Fields (NeAF).

Instead of directly predicting the normal of an input point, we predict the angle offset between the ground truth normal and a randomly sampled query normal.
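
As a rough illustration of the quantity described above, the angle offset between a ground-truth normal and a query normal can be computed from their dot product. This is only a sketch of that target value, not the network or the loss used in the paper; the function name and the sample vectors are assumptions made for the example.

```python
import math


def angle_between(u, v):
    """Return the angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_theta)


# Hypothetical vectors: a ground-truth normal and a sampled query normal.
ground_truth_normal = (0.0, 0.0, 1.0)
query_normal = (1.0, 0.0, 0.0)
offset = angle_between(ground_truth_normal, query_normal)
```

A query that already points along the ground-truth normal gives an offset of zero, which is presumably the direction a refinement step would move towards.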

To further leverage the prior learned by NeAF, we propose to refine the predicted normal vectors by minimizin…

14 hours ago @ paperswithcode.com
/glee4810/ EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records

We present a new text-to-SQL dataset for electronic health records (EHRs), where the utterances are collected from 222 hospital staff—including physicians, nurses, insurance review and health records teams, and more.

To construct the QA dataset on structured EHR data, we conducted a poll at a university hospital and templatized the responses to create seed questions.

Then, we manually linked them to two open-source EHR databases—MIMIC-III and eICU—and included them with various time expressions and held-out unanswerable questions in the dataset, which were all collected from the poll.

We believe our dataset, EHRSQL, could serve as a practical benchmark to develop and assess QA models on str…

16 hours ago @ paperswithcode.com
/facebookresearch/ Coder Reviewer Reranking for Code Generation

Sampling diverse programs from a code language model and reranking with model likelihood is a popular method for code generation, but it is prone to preferring degenerate solutions.

We augment Coder language models from past work, which generate programs given language instructions, with Reviewer models, which evaluate the likelihood of the instruction given the generated programs.

Experimental results show that Coder-Reviewer reranking leads to consistent and significant improvement (up to 17% absolute accuracy gain) over reranking with the Coder model only.

When combined with executability filtering, Coder-Reviewer reranking can often outperform the minimum Bayes risk method.
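
The reranking idea can be sketched in a few lines. The example below assumes the two scores are combined by adding the Coder log-likelihood (program given instruction) and the Reviewer log-likelihood (instruction given program); the `Candidate` class, the `rerank` helper, and all numbers are hypothetical and only illustrate the ordering step, not the models themselves.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    program: str
    coder_logprob: float     # log p(program | instruction), made-up value
    reviewer_logprob: float  # log p(instruction | program), made-up value


def rerank(candidates):
    """Sort candidates by the sum of Coder and Reviewer log-likelihoods."""
    return sorted(
        candidates,
        key=lambda c: c.coder_logprob + c.reviewer_logprob,
        reverse=True,
    )


# Hypothetical samples for an instruction such as "sort the list x".
candidates = [
    Candidate("return x", coder_logprob=-1.0, reviewer_logprob=-9.0),
    Candidate("return sorted(x)", coder_logprob=-2.5, reviewer_logprob=-1.5),
    Candidate("pass", coder_logprob=-0.5, reviewer_logprob=-12.0),
]
best = rerank(candidates)[0]
coder_only_best = max(candidates, key=lambda c: c.coder_logprob)
```

With these made-up scores, the Coder alone prefers the degenerate `pass`, while the combined score selects `return sorted(x)`, because a short degenerate program explains the instruction poorly.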

Coder-Reviewe…

17 hours ago @ paperswithcode.com
/gpoesia/ Peano: Learning Formal Mathematical Reasoning

General mathematical reasoning is computationally undecidable, but humans routinely solve new problems.

What structure enables this, and how might that inform automated mathematical reasoning?

We use Peano to formalize introductory algebra problems and axioms, obtaining well-defined search problems.

We observe existing reinforcement learning methods for symbolic reasoning to be insufficient to solve harder problems.

The recovered order has significant agreement with the expert-designed Khan Academy curriculum, and second-generation agents trained on the recovered curriculum learn significantly faster.

17 hours ago @ paperswithcode.com
/Altriaex/ Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning

Offline reinforcement learning (RL) has received rising interest due to its appealing data efficiency.

The present study addresses behavior estimation, a task that lays the foundation of many offline RL algorithms.

Behavior estimation aims at estimating the policy with which training data are generated.

In this case, neglecting data heterogeneity, existing approaches for behavior estimation suffer from behavior misspecification.

This model provides the agent with a fine-grained characterization of multi-source data and helps it overcome behavior misspecification.

17 hours ago @ paperswithcode.com
/jla-gardner/ Synthetic data enable experiments in atomistic machine learning

There have been major advances in developing descriptors and regression frameworks for this task, typically starting from (relatively) small sets of quantum-mechanical reference data.

The cheapness of this process, compared to the quantum-mechanical ground truth, allows us to generate millions of datapoints, in turn enabling rapid experimentation with atomistic ML models from the small- to the large-data regime.

This approach allows us here to compare regression frameworks in depth, and to explore visualisation based on learned representations.

We also show that learning synthetic data labels can be a useful pre-training task for subsequent fine-tuning on small datasets.

In the future, we e…

18 hours ago @ paperswithcode.com
/nii-yamagishilab/ Hiding speaker's sex in speech using zero-evidence speaker representation in an analysis/synthesis pipeline

The use of modern vocoders in an analysis/synthesis pipeline allows us to investigate high-quality voice conversion that can be used for privacy purposes.

Here, we propose to transform the speaker embedding and the pitch in order to hide the sex of the speaker.

ECAPA-TDNN-based speaker representation fed into a HiFiGAN vocoder is protected using a neural-discriminant analysis approach, which is consistent with the zero-evidence concept of privacy.

This approach significantly reduces the information in speech related to the speaker's sex while preserving speech content and some consistency in the resulting protected voices.

18 hours ago @ paperswithcode.com
💼 University and corporation labs
DeepMind
last post 17 hours ago
On the Expressivity of Markov Reward

For a Markov reward function to express this task, it would need to make these two policies strictly higher in value than all other deterministic policies.

However, there is no such Markov reward function: the optimality of a single “move clockwise” action will depend on whether the agent was already moving in that direction in the past.

Since the reward function must be Markov, it cannot convey this kind of information.

Similar examples demonstrate that Markov reward cannot capture every policy order and trajectory order, too.

Further, if there is a reward function that captures the given task, we would ideally like to be able to output such a reward function.

17 hours ago @ deepmind.com
DeepMind’s latest research at NeurIPS 2022

Advancing best-in-class large models, compute-optimal RL agents, and more transparent, ethical, and fair AI systems

The thirty-sixth International Conference on Neural Information Processing Systems (NeurIPS 2022) is taking place from 28 November to 9 December 2022, as a hybrid event, based in New Orleans, USA.

We updated the scaling laws of large models, showing how previously trained models were too large for the amount of training performed.

Pioneering responsibly

At the heart of DeepMind’s mission is our commitment to act as responsible pioneers in the field of AI.

We’re committed to developing AI systems that are transparent, ethical, and fair.

Explaining and understanding the behavio…

6 days, 17 hours ago @ deepmind.com
Building interactive agents in video game worlds

Learning in “the playhouse”

Our framework begins with people interacting with other people in the video game world.

Human participants set the contexts for the interactions by navigating through the world, setting goals, and asking questions for agents.

This phase was covered in two of our earlier papers, Imitating Interactive Intelligence, and Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning, which explored building imitation-based agents.

Our agents trained by RL performed much better than those trained by imitation learning alone.

We asked people to evaluate our agents in online real-time interactions.

In Deep reinforcement learning from human prefere…

1 week, 1 day ago @ deepmind.com
Benchmarking the next generation of never-ending learners

For example, when large models are deployed, whatever they have learned on one task is seldom harnessed to facilitate their learning of the next task.

What’s more, once new data or more compute become available, large models are typically retrained from scratch – a costly, time-consuming process. This raises the question of whether we could improve the trade-off between the efficiency and performance of these large models, making them faster and more sustainable while also preserving their outstanding capabilities.

The Never-Ending Visual classification Stream (NEVIS’22) is a benchmark stream in addition to an evaluation protocol, a set of initial baselines, and an open-source codeb…

1 week, 2 days ago @ deepmind.com
Best practices for data enrichment

The best practices

Following PAI’s recent white paper on Responsible Sourcing of Data Enrichment Services, we collaborated to develop our practices and processes for data enrichment.

This included the creation of five steps AI practitioners can follow to improve the working conditions for people involved in data enrichment tasks (for more details, please visit PAI’s Data Enrichment Sourcing Guidelines):

Select an appropriate payment model and ensure all workers are paid above the local living wage.

Design and run a pilot before launching a data enrichment project.

This has not only increased the efficiency of our approval and launch processes, but, importantly, has enhanced the experienc…

2 weeks, 1 day ago @ deepmind.com
The pursuit of AI education - past, present and future

Meet Sylvia Christie, our education partnerships manager who’s played a leading role in expanding our scholarship programme, which has just celebrated its five-year anniversary.

Every academic year, we get to see the new crop of talented AI scholars become part of an international community of students and mentors.

We need to make sure that our work drives real change in the wider community and for AI education more generally.

The series also includes the short cinematic film below as a new way of speaking to audiences about the scholarship programme in a creative way.

What’re your biggest learnings now that the scholarship programme is five years old?

How important collaboration is.

3 weeks, 2 days ago @ deepmind.com
Digital transformation with Google Cloud

Applying our AI research, we’ve helped Google Cloud enhance core solutions used by their customers at scale

Alphabet’s Google Cloud empowers organisations to digitally transform themselves into smarter businesses.

Last week, many of the platform’s latest advances were shared at Next '22, Google Cloud's annual developer and tech conference about digital transformation in the cloud.

We’ve partnered with Google Cloud over the last few years to apply our AI research for making a positive impact on core solutions used by their customers.

And in recent years, we’ve partnered with Google Cloud Professional Services to positively impact the wind energy sector to help build a carbon-free fu…

1 month, 1 week ago @ deepmind.com
Measuring perception in AI models

So today, we’re introducing the Perception Test, a multimodal benchmark using real-world videos to help evaluate the perception capabilities of a model.

Multimodal models, such as Perceiver, Flamingo, or BEiT-3, aim to be more general models of perception.

Geolocation of crowd-sourced participants involved in filming.

Learning more about the Perception Test

The Perception Test benchmark is publicly available here and further details are available in our paper.

A leaderboard and a challenge server will be available soon too.

On 23 October, 2022, we’re hosting a workshop about general perception models at the European Conference on Computer Vision in Tel Aviv (ECCV 2022), where we will dis…

1 month, 2 weeks ago @ deepmind.com
How undesired goals can arise with correct rewards

Exploring examples of goal misgeneralisation – where an AI system's capabilities generalise but its goal doesn't

As we build increasingly advanced artificial intelligence (AI) systems, we want to make sure they don’t pursue undesired goals.

Such behaviour in an AI agent is often the result of specification gaming – exploiting a poor choice of what they are rewarded for.

Crucially, in contrast to specification gaming, GMG can occur even when the AI system is trained with a correct specification.

During training, there is an “expert” agent (the red blob) that visits the coloured spheres in the correct order.

This AI system does what its designers intend it to do.

1 month, 3 weeks ago @ deepmind.com
Discovering novel algorithms with AlphaTensor

For centuries, mathematicians believed that the standard matrix multiplication algorithm was the best one could achieve in terms of efficiency.

We then trained an AlphaTensor agent using reinforcement learning to play the game, starting without any knowledge about existing matrix multiplication algorithms.

Through learning, AlphaTensor gradually improves over time, re-discovering historical fast matrix multiplication algorithms such as Strassen’s, eventually surpassing the realm of human intuition and discovering algorithms faster than previously known.

Single-player game played by AlphaTensor, where the goal is to find a correct matrix multiplication algorithm.

By exploring the space of …

1 month, 3 weeks ago @ deepmind.com
Supporting the next generation of AI leaders

These barriers not only contribute to the existing attainment gap, they directly impact the number of opportunities students have to pursue a career in STEM related fields, including AI, down the line.

Developing new AI resources with the Raspberry Pi Foundation

We will be working closely with the Raspberry Pi Foundation, a charity that promotes the study of computing and digital technologies, to develop new AI-focused resources including lesson plans for students and training for teachers.

By focusing on education at an early age, there’s an opportunity to help break down long-standing barriers that have facilitated a system of inequalities.

Amplifying the reach of existing programmes

De…

2 months ago @ deepmind.com
Building safer dialogue agents

However, dialogue agents powered by LLMs can express inaccurate or invented information, use discriminatory language, or encourage unsafe behaviour.

To create safer dialogue agents, we need to be able to learn from human feedback.

Applying reinforcement learning based on input from research participants, we explore new methods for training dialogue agents that show promise for a safer system.

Sparrow is a research model and proof of concept, designed with the goal of training dialogue agents to be more helpful, correct, and harmless.

Sparrow is a significant step forward in understanding how to train dialogue agents to be more useful and safer.

2 months, 1 week ago @ deepmind.com
How our principles helped define AlphaFold’s release

Our Operating Principles have come to define both our commitment to prioritising widespread benefit, as well as the areas of research and applications we refuse to pursue.

From principles to practice

Written principles are only part of the puzzle – how they’re put into practice is key.

A major release of protein structure predictions in partnership with EMBL-EBI (EMBL’s European Bioinformatics Institute), the established community leader.

As a public institution, EMBL-EBI enables anyone to look up protein structure predictions as easily as a Google search.

2 months, 2 weeks ago @ deepmind.com
Maximising the impact of our breakthroughs

Applying our AI research to help enrich the lives of billions of people around the world

Building useful products with new technologies has always been one of my greatest joys.

Taking research out of the lab

My main focus as CBO is on taking our cutting-edge research breakthroughs and matching our technologies to solving everyday business problems.

I’m often asked, as a future-facing research organisation, why it’s important to work on global challenges that impact people every day?

that makes DeepMind so special is our ability to bridge leading AI research to hundreds, if not thousands, of AI-ready problems that impact billions of people.

And as we go along this journey, we’re continuo…

2 months, 3 weeks ago @ deepmind.com
My journey from DeepMind intern to mentor

Former intern turned intern manager, Richard Everett, describes his journey to DeepMind, sharing tips and advice for aspiring DeepMinders.

However, after working on several research projects with my professors, I developed a taste for research and decided to continue on towards a PhD.

Can you describe the internship interview process?

Today's interns can expect the entire process to last just a few months, which includes a technical and a team interview.

Any tips for aspiring DeepMind interns?

2 months, 3 weeks ago @ deepmind.com
Google
last post 1 day, 1 hour ago
Making a Traversable Wormhole with a Quantum Computer

Experiment: Quantum Gravity in the Lab

Implementing these ideas on a Sycamore processor, we have constructed a quantum system that is dual to a traversable wormhole.

The presence of negative energy in a traversable wormhole is similar to negative energy in the Casimir effect, where vacuum energy pushes together closely spaced plates.

New ideas were needed to build a traversable wormhole on a quantum computer with a limited number of qubits.

Unlike experiments such as LIGO that record data about gravity in the world around us, quantum computers provide a tool to explore theories of quantum gravity.

We hope that quantum computers will help develop our understanding of future theories of quantu…

1 day, 1 hour ago @ ai.googleblog.com
Boost medical discoveries with AlphaFold on Vertex AI

There’s a lot we can learn from combining technology with science to help support the development of amazing discoveries.

By using an AI system to predict protein shapes, we have the potential to accelerate research in every field of biology.

Each one has a unique 3D shape that determines how it works and what it does.

DeepMind’s gigantic leap

In 2020, Alphabet’s artificial intelligence research arm, DeepMind, made a massive breakthrough in predicting protein structures using a deep learning model called AlphaFold.

AlphaFold is trained on publicly available data consisting of about 170,000 protein structures, and is the first computational method that can regularly predict the 3D shape of a …

1 day, 23 hours ago @ cloud.google.com
How InstaDeep used Cloud TPU v4 to help sustainable agriculture

Genomic language models for sustainable agriculture

Ever since farming began, we have been, directly or indirectly, trying to breed better crops with higher yields, better resilience and, if we’re lucky, better taste too.

However, the complexity of plant genomes often makes it difficult to identify which variants are beneficial.

InstaDeep partners with Google Cloud to train the new generation of AI models for genomics on TPUs

Researchers have demonstrated that large language models can be especially effective in proteomics.

This finding led InstaDeep researchers to train a set of increasingly large language models, ranging from 1 billion to 20 billion parameters, on genomics datasets.

The comp…

2 days ago @ cloud.google.com
Better Language Models Without Massive Compute

In recent years, language models (LMs) have become more prominent in natural language processing (NLP) research and are also becoming increasingly impactful in practice.

In this blog post, we explore two complementary methods for improving existing language models by a large margin without using massive computational resources.

Second, in “Scaling Instruction-Finetuned Language Models”, we explore fine-tuning a language model on a collection of datasets phrased as instructions, a process we call “Flan”.

Compute versus model performance of PaLM 540B and U-PaLM 540B on 26 NLP benchmarks (listed in Table 8 in the paper).

Our instruction–fine-tuned language model, Flan-PaLM, responds better to …

2 days ago @ ai.googleblog.com
Google at NeurIPS 2022

This week marks the beginning of the 36th annual Conference on Neural Information Processing Systems (NeurIPS 2022), the biggest machine learning conference of the year, which is being held in New Orleans, LA.

NeurIPS 2022 will be held in person with additional options for virtual attendees, and includes invited talks, demonstrations and presentations of some of the latest in machine learning research.

This year, NeurIPS is also offering a new track, called Spotlight Papers, which will provide opportunities to highlight papers presented in prestigious journals that would otherwise not have been eligible for submission.

Google is proud to be a Diamond level sponsor of NeurIPS this year and w…

3 days, 2 hours ago @ ai.googleblog.com
Conversation Summaries in Google Chat

Today, we are excited to introduce conversation summaries in Google Chat for messages in Spaces.

This feature is enabled by our state-of-the-art abstractive summarization model, Pegasus, which generates useful and concise summaries for chat conversations, and is currently available to selected premium Google Workspace business customers.

Conversation Summarization Modeling

The goal of text summarization is to provide helpful and concise summaries for different types of text, such as documents, articles, or spoken conversations.

The second one is misrepresentation, when the model’s generated summary misrepresents or contradicts the chat conversation.

Detecting low quality summaries: While the…

1 week, 5 days ago @ ai.googleblog.com
How to run a large scale ML workflow on Dataflow ML for autonomous driving

Google Cloud Dataflow is a fully managed data processing service that lets users run batch and streaming pipelines on large-scale data in a fast, scalable, and cost-effective manner.

Developers can write their pipelines using Apache Beam, which is an open-source, unified programming model that simplifies these large-scale data processing dynamics.

Building a simple ML pipeline to extract metadata

Now let’s run a Dataflow ML pipeline to process large amounts of data for autonomous driving.

Dataflow automatically scales with your data volume (more specifically, by throughput), so you will not need to modify your pipeline as the data grows 10x or even 1,000x.

It is easy to convert the pipeline …

1 week, 5 days ago @ cloud.google.com
Easily integrate machine learning models into applications with Vertex AI integration for Cloud Spanner

They want to react to business or customer events faster and in a scalable way by leveraging machine learning (ML) models instead of relying on manual actions.

Spanner integration with Vertex AI, Google Cloud’s ML platform, lets users leverage ML models in Vertex AI using a simple SQL query in Spanner.

With Spanner Vertex AI integration, in contrast, developers can easily access models built by data scientists and apply them to database transactions using familiar SQL.

Getting started with Vertex AI integration

In this release of Vertex AI integration, we’ve made it easy to consume any existing model already published in Vertex AI.

If you are unfamiliar with Vertex AI, check out this quick s…

1 week, 6 days ago @ cloud.google.com
The Data Cards Playbook: A Toolkit for Transparency in Dataset Documentation

Earlier this year at the ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT), we published Data Cards, a dataset documentation framework aimed at increasing transparency across dataset lifecycles.

Today, we introduce the Data Cards Playbook, a self-guided toolkit for a variety of teams to navigate transparency challenges with their ML datasets.

In the Data Cards Playbook, we’ve translated successful approaches into repeatable practices that can easily be adapted to unique team needs.

The Audit module helps data teams and organizations set up processes to evaluate completed Data Cards before they are published.

Conclusion

We present the Data Cards Playbook, a continuous a…

1 week, 6 days ago @ ai.googleblog.com
Automating self-service tech support with Tensorflow

What would tech support be like if an AI agent could recommend the right self-service article before a human support agent even got on the ticket?

Getting support faster

Googlers make use of knowledge-base support articles to help them troubleshoot and solve common technical issues at work.

The regexes are handcrafted rules and not as robust as a machine learning model could be.

Users submitting support tickets receive ML-suggested help center articles ahead of human-powered support.

Tensorflow and BERT

We tried out a few different types of machine learning models.

2 weeks ago @ cloud.google.com
Mixture-of-Experts with Expert Choice Routing

In “Mixture-of-Experts with Expert Choice Routing”, presented at NeurIPS 2022, we introduce a novel MoE routing algorithm called Expert Choice (EC).

Token Choice Routing.

Expert Choice Routing

To address the above issues, we propose a heterogeneous MoE that employs the expert choice routing method illustrated below.

Expert Choice Routing.

Evaluation

To illustrate the effectiveness of Expert Choice routing, we first look at training efficiency and convergence.

2 weeks ago @ ai.googleblog.com
Accelerate innovation in life sciences with Google Cloud

Google Cloud for life sciences

At Alphabet, we’ve made significant investments in healthcare and life sciences, helping to tackle the world’s biggest healthcare problems, from chronic disease management, to precision medicine, to protein folding.

Together with Google, you can transform your life sciences organization and deliver secure, data-driven innovation across the value chain.

Solutions like DocAI can enable optimal patient matching for clinical trials, helping organizations optimize clinical trial selection and increase time to value.

We’ll be taking a deep dive into each of the challenges outlined above in our life sciences video series.

Life Sciences Industry Becomes Latest Arena in…

2 weeks, 1 day ago @ cloud.google.com
SAP Build Process Automation is better with Google Document AI and Google Workspace

SAP Build Process Automation is designed to optimize business processes and boost efficiency.

With a focus on taking process automation to an even more advanced level and removing inefficiencies from workflows, SAP has introduced integrations with Google Cloud Document AI and Google Workspace for SAP Build Process Automation customers.

Integrated automation can drive results for invoice and purchase order processing

An example of the improvements these integrations have made to SAP Build Process Automation is the use of AI, productivity tools, and automation for processing purchase orders.

Today, through SAP’s integrated automation platform, customers can automatically extract and organize …

2 weeks, 1 day ago @ cloud.google.com
Using AI to increase asset utilization and production uptime for manufacturers

Cloud capabilities have matured at an accelerated pace, giving manufacturers practical avenues to achieve these goals.

Manufacturers are finding new ways to bring AI and machine learning (ML) to practical use cases, like predictive maintenance, anomaly detection, and asset utilization management.

In this post, we will explore a practical example of how manufacturers can use Google Cloud manufacturing solutions to train, deploy and extract value from ML-enabled capabilities to predict asset utilization and maintenance needs.

The journey to machine learning insights starts with accessible data

The first step to a successful machine learning project is to unify necessary data in a common reposi…

2 weeks, 3 days ago @ cloud.google.com
Redacting PII data in Dialogflow CX with Google Cloud Data Loss Prevention (DLP)

Redacting Session Parameters, Webhook data, and Response Messages

For Session Parameters, Webhook data, and other data logged by Dialogflow CX, including Fulfillment Response Messages, the approach to redact such information relies on Cloud Data Loss Prevention (DLP) inspection templates.

Key Components

Data Loss Prevention (DLP) Inspection Templates

Our solution uses Google Cloud Data Loss Prevention (DLP), which is a service that can identify, mask, obfuscate, de-identify, transform, or tokenize sensitive information in text using NLP- and rules-based methods.

In our case, the documents are the log messages that contain the Session Parameters, Webhook data, Fulfillment Response Messages and …
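To make the rules-based side of this concrete, here is a minimal sketch of masking sensitive values in a log message. It does not call Cloud DLP: the `RULES` table, the `redact` helper, and the sample log line are hypothetical stand-ins built on Python's standard `re` module, and a real inspection template would be configured in Google Cloud rather than written like this.

```python
import re

# Hypothetical rules for illustration only; these are NOT Cloud DLP infoTypes
# or inspection templates, just regular expressions standing in for them.
RULES = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE_NUMBER": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def redact(message: str) -> str:
    """Replace every match of every rule with a bracketed label."""
    for label, pattern in RULES.items():
        message = pattern.sub(f"[{label}]", message)
    return message


# Made-up log line resembling a logged session parameter payload.
log_line = "session param email=jane@example.com phone=555-123-4567"
print(redact(log_line))
```

Replacing a match with its label, rather than deleting it, keeps the log readable while still showing what kind of value was removed.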

2 weeks, 6 days ago @ cloud.google.com
OpenAI
latest post 23 hours ago
ChatGPT: Optimizing Language Models for Dialogue

Tell me about when Christopher Columbus came to the US in 2015

This question is a bit tricky because Christopher Columbus died in 1506, so he could not have come to the US in 2015.

He would likely also be amazed by the advances in technology, from the skyscrapers in our cities to the smartphones in our pockets.

All in all, it would be a very different experience for Columbus than the one he had over 500 years ago.

Tell me about when Christopher Columbus came to the US in 2015

Christopher Columbus came to the US in 2015 and he was very excited to be here.

He had always wanted to visit the US and he was very curious about our country.

23 hours ago @ openai.com
DALL·E API Now Available in Public Beta

Starting today, developers can begin building apps with the DALL·E API.

Developers can now integrate DALL·E directly into their apps and products through our API.

Microsoft Bing Microsoft is bringing DALL·E to a new graphic design app called Designer, which helps users create professional quality social media posts, invitations, digital postcards, graphics, and more.

Mixtiles uses the DALL·E API to create and frame emotionally resonating artwork, by guiding users through a creative process that captures childhood memories, dream destinations, and more.

All API customers can use the DALL·E API today.

4 weeks ago @ openai.com
DALL·E Now Available Without Waitlist

New users can start creating straight away.

Starting today, we are removing the waitlist for the DALL·E beta so users can sign up and start using it immediately.

Since we first previewed the DALL·E research to users in April, users have helped us discover new uses for DALL·E as a powerful creative tool.

We can't wait to see what users from around the world create with DALL·E.

Sign up today and start creating.

2 months ago @ openai.com
Introducing Whisper

We’ve trained and are open-sourcing a neural net called Whisper that approaches human-level robustness and accuracy on English speech recognition.

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web.

We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise and technical language.

We find this approach is particularly effective at learning speech to text translation and outperforms the supervised SOTA on CoVoST2 to English translation zero-shot.

Check out the paper, model card, and code to learn more det…

2 months, 1 week ago @ openai.com
DALL·E: Introducing Outpainting

Now, with Outpainting, users can extend the original image, creating large-scale images in any aspect ratio.

Outpainting takes into account the image’s existing visual elements — including shadows, reflections, and textures — to maintain the context of the original image.

More than one million people are using DALL·E, the AI system that generates original images and artwork from a natural language description, as a creative tool today.

Artists have already created remarkable images with the new Outpainting feature, and helped us better understand its capabilities in the process.

Original outpainting by Tyna Eloundou Original outpainting by OpenAI Outpainting by David Schnurr Original outpai…

3 months ago @ openai.com
Our approach to alignment research

Intro

Our alignment research aims to make artificial general intelligence (AGI) aligned with human values and follow human intent.

We believe that even without fundamentally new alignment ideas, we can likely build sufficiently aligned AI systems to substantially advance alignment research itself.

At a high level, our approach to alignment research focuses on engineering a scalable training signal for very smart AI systems that is aligned with human intent.

Instead, we aim for a more pragmatic approach: building and aligning a system that can make faster and better alignment research progress than humans can.

Therefore human researchers will focus more and more of their effort on reviewing a…

3 months, 1 week ago @ openai.com
New-and-Improved Content Moderation Tooling

We are introducing a new-and-improved content moderation tool: The Moderation endpoint improves upon our previous content filter, and is available for free today to OpenAI API developers.

To help developers protect their applications against possible misuse, we are introducing the faster and more accurate Moderation endpoint.

When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm — content prohibited by our content policy.

[Diagram: input text, Moderation endpoint, categories (Violence, Self-harm, Hate, Sexual), Flagged]

The Moderation endpoint helps developers to benefit from our infrastructure investments.

Use of the Moderation endpoint t…

3 months, 3 weeks ago @ openai.com
DALL·E Now Available in Beta

DALL·E, the AI system that creates realistic images and art from a description in natural language, is now available in beta.

Every DALL·E user will receive 50 free credits during their first month of use and 15 free credits every subsequent month.

Pricing

In this first phase of the beta, users can buy additional DALL·E credits in 115-credit increments (460 images) for $15 on top of their free monthly credits.
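The figures quoted above imply a per-credit and per-image price. The short sketch below only divides the numbers stated in the excerpt (115 credits, 460 images, $15); the constant and variable names are illustrative, and it assumes every credit yields the same number of images.

```python
# Figures taken from the excerpt above; names are illustrative.
CREDITS_PER_PACK = 115
IMAGES_PER_PACK = 460
PRICE_PER_PACK_USD = 15.0

# Assumes each credit produces the same number of images.
images_per_credit = IMAGES_PER_PACK / CREDITS_PER_PACK
price_per_credit = PRICE_PER_PACK_USD / CREDITS_PER_PACK
price_per_image = PRICE_PER_PACK_USD / IMAGES_PER_PACK

print(f"images per credit: {images_per_credit:.0f}")
print(f"price per credit:  ${price_per_credit:.4f}")
print(f"price per image:   ${price_per_image:.4f}")
```

Under that assumption, one credit corresponds to four images, at roughly 13 cents per credit and a little over 3 cents per image.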

Using DALL·E for commercial projects

Starting today, users get full usage rights to commercialize the images they create with DALL·E, including the right to reprint, sell, and merchandise.

We are excited to see what people create with DALL·E and look forward to us…

4 months, 2 weeks ago @ openai.com
Reducing Bias and Improving Safety in DALL·E 2

Today, we are implementing a new technique so that DALL·E generates images of people that more accurately reflect the diversity of the world’s population.

We plan to improve this technique over time as we gather more data and feedback.

We are continuing to research how AI systems, like DALL·E, might reflect biases in their training data and different ways we can address them.

These improvements have helped us gain confidence in the ability to invite more users to experience DALL·E.

Expanding access is an important part of deploying AI systems responsibly because it allows us to learn more about real-world use and continue to iterate on our safety systems.

4 months, 2 weeks ago @ openai.com
DALL·E 2: Extending Creativity

As part of our DALL·E 2 research preview, more than 3,000 artists from more than 118 countries have incorporated DALL·E into their creative workflows.

“We didn't know what an osteosarcoma villain would look like so we turned to DALL·E as our creative outlet.

That's a community effort — it's come from the past few months of me talking to other DALL·E artists on Twitter / Discord / DM.

We're all figuring it out together, how to play this beautiful new instrument.”

Tom Aviv

Israeli chef and MasterChef winner Tom Aviv is debuting his first U.S. restaurant in Miami in a few months and has used DALL·E for menu, decor, and ambiance inspiration, and his team have also used DALL·E in designing the…

4 months, 2 weeks ago @ openai.com
DALL·E 2 Pre-Training Mitigations

This post focuses on pre-training mitigations, a subset of these guardrails which directly modify the data that DALL·E 2 learns from.

This post is organized in three sections, each describing a different pre-training mitigation: In the first section, we describe how we filtered out violent and sexual images from DALL·E 2’s training dataset.

Reducing Graphic and Explicit Training Data

Since training data shapes the capabilities of any learned model, data filtering is a powerful tool for limiting undesirable model capabilities.

We refer to the former model as the unfiltered model, and the latter as the filtered model.

[Figure panels: Unfiltered, Filtered] Generations for the prompt “military protest” from our un…

5 months ago @ openai.com
Learning to Play Minecraft with Video PreTraining (VPT)

We trained a neural network to play Minecraft by Video PreTraining (VPT) on a massive unlabeled video dataset of human Minecraft play, while using only a small amount of labeled contractor data.

In order to utilize the wealth of unlabeled video data available on the internet, we introduce a novel, yet simple, semi-supervised imitation learning method: Video PreTraining (VPT).

Trained on 70,000 hours of IDM-labeled online video, our behavioral cloning model (the “VPT foundation model”) accomplishes tasks in Minecraft that are nearly impossible to achieve with reinforcement learning from scratch.

We then take each foundation model and fine-tune it to the house building dataset described in th…

5 months, 1 week ago @ openai.com
AI-Written Critiques Help Humans Notice Flaws

Major travel delays are expected late Friday and Friday night as rain turns into snow, the National Weather Service forecast said.

In counties like Sussex, Morris and Warren, expected snow accumulations range from 6 to 16 inches.

The winter storm warnings have been issued for Sussex, Warren, Morris, Hunterdon, Middlesex, Monmouth, Ocean and northwest Burlington counties.

Expect the National Weather Service’s Upton, N.Y. office, which covers northeastern N.J., to follow suit shortly.

With defenses already weakened, coastal communities could see major impacts from coastal flooding, with the worst coming Saturday morning, according to the National Weather Service.

5 months, 2 weeks ago @ openai.com
Techniques for Training Large Neural Networks

As cluster and model sizes have grown, machine learning practitioners have developed an increasing variety of techniques to parallelize model training over many GPUs.

[Figure: An illustration of various parallelism strategies on a three-layer model. Panels: Data Parallelism, Pipeline Parallelism, Tensor Parallelism, Expert Parallelism.]

[Figure: Comparison of GPipe and PipeDream pipelining schemes, using 4 microbatches per batch. Legend: Forward, Backward, Update, Idle.]

Matrix multiplication can be thought of as dot products between pairs of rows and columns; it's possible to compute independent dot products on dif…
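The observation about independent dot products can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the post's implementation: the helper names (`matmul`, `split_columns`, `sharded_matmul`) are hypothetical, the "devices" are simulated by list slices, and the split is by columns of the right-hand matrix.

```python
def matmul(a, b):
    """Plain matrix product: entry (i, j) is the dot product of row i of a and column j of b."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def split_columns(b, parts):
    """Split b into `parts` contiguous column blocks, one per simulated device."""
    n = len(b[0])
    bounds = [n * k // parts for k in range(parts + 1)]
    return [[row[lo:hi] for row in b] for lo, hi in zip(bounds, bounds[1:])]


def sharded_matmul(a, b, parts):
    """Compute each column block independently, then concatenate the partial results."""
    partials = [matmul(a, shard) for shard in split_columns(b, parts)]
    return [sum((p[i] for p in partials), []) for i in range(len(a))]


A = [[1, 2], [3, 4]]
B = [[5, 6, 7, 8], [9, 10, 11, 12]]
print(sharded_matmul(A, B, 2) == matmul(A, B))
```

Each block only needs its own slice of `B`, so the partial products can be computed without communication until the final concatenation.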

5 months, 3 weeks ago @ openai.com
Best Practices for Deploying Language Models

Joint Recommendation for Language Model Deployment

We’re recommending several key principles to help providers of large language models (LLMs) mitigate the risks of this technology in order to achieve its full promise to augment human capabilities.

Documentation should also include model and use-case-specific safety best practices.

Publicly disclose lessons learned regarding LLM safety and misuse in order to enable widespread adoption and help with cross-industry iteration on best practices.

Treat all labor in the language model supply chain with respect.

As LLM providers, publishing these principles represents a first step in collaboratively guiding safer large language model development an…

6 months ago @ openai.com
Microsoft
latest post 2 days ago
Research Focus: Week of November 28, 2022

This special edition of Research Focus highlights some of the 100+ papers from Microsoft Research that were accepted for publication at NeurIPS 2022 – the thirty-sixth annual Conference on Neural Information Processing Systems.

New research on generative models

Two papers covering new research on generative models will be presented at NeurIPS 2022.

Microsoft Research career opportunities – come join us!

We’re hiring for multiple roles including internships and researchers at all levels in multiple Microsoft Research labs.

Or you can browse our current openings at NeurIPS 2022 – Microsoft Research career opportunities.

2 days ago @ microsoft.com
Research trends in privacy, security and cryptography

Against that backdrop, Microsoft Research is focused on what comes next in security and privacy.

At Microsoft Research, we pursue ambitious projects to improve the privacy and security of everyone on the planet.

This is the first blog post in a series exploring the work we do in privacy, security and cryptography.

Managing the massive quantities of security data collected is increasingly challenging, which creates an urgent need for disruptive innovation in security analytics.

At Microsoft Research, we are creating an orchestration infrastructure for developers to deploy cross-platform, cross-device federated learning solutions.

1 week, 6 days ago @ microsoft.com
Research Focus: Week of November 17, 2022

Microsoft Research at NeurIPS 2022

Microsoft is a proud platinum sponsor of the 36th annual conference on Neural Information Processing Systems, running from November 28 to December 9.

The work was led by Precious Esie, a PhD student in epidemiology at Columbia’s Mailman School of Public Health, during her summer internship at Microsoft Research.

The research team found that over the month of July 2021, pollution disproportionately affected Hispanic and Latinx residents on the West side of the city.

This work shows how next-generation environmental sensing can help public health agencies target interventions when and where they are most needed.

Microsoft Research is actively co…

2 weeks ago @ microsoft.com
Expanding AI technology for unstructured biomedical text beyond English

With Text Analytics for Health, a part of Azure Cognitive Services, healthcare organizations around the world can now extract meaningful insights from unstructured text in eight languages and process it in a way that enables clinical decision support like never before.

2 weeks ago @ azure.microsoft.com
Any developer can be a space developer with the new Azure Orbital Space SDK

Today, we are announcing a crucial step towards democratizing access to space development, with the private preview release of Azure Orbital Space Software Development Kit (SDK), a secure hosting platform and application kit designed to enable developers to create in the cloud and deploy and operate applications on-orbit.

2 weeks ago @ azure.microsoft.com
AI and the need for purpose-built cloud infrastructure

Microsoft Azure is the only global public cloud service provider that offers purpose-built AI supercomputers with massively scalable scale-up-and-scale-out IT infrastructure comprised of NVIDIA InfiniBand interconnected NVIDIA Ampere A100 Tensor Core GPUs.

2 weeks, 1 day ago @ azure.microsoft.com
How IoT, AI, and Digital Twins are helping achieve sustainability goals

Azure IoT can help transform businesses to be more efficient, manage renewable energy production, reduce waste, or accelerate the development and launch of sustainably oriented apps. A range of end-to-end solutions from our partners addresses sustainability in a variety of ways.

2 weeks, 3 days ago @ azure.microsoft.com
Cloud Intelligence/AIOps – Infusing AI into Cloud Computing Systems

As a result, the non-functional properties of cloud computing platforms, including availability, reliability, performance, efficiency, security, and sustainability, have become immensely important.

AI for Customers: leveraging AI/ML to create unparalleled user experiences and achieve exceptional user satisfaction using cloud services.

These systems have been integrated in both Azure and Microsoft 365 (M365), which has considerably improved engineers’ ability to handle incidents in cloud systems.

Making cloud systems more proactive

AIOps makes cloud systems more proactive by introducing the concept of proactive design.

Making cloud systems more manageable

AIOps makes cloud systems more managea…

3 weeks ago @ microsoft.com
Do more with less using new Azure HX and HBv4 virtual machines for HPC

The all-new HX-series and HBv4-series VMs are optimized for a variety of HPC workloads such as computational fluid dynamics (CFD), finite element analysis, frontend and backend electronic design automation (EDA), rendering, molecular dynamics, computational geoscience, weather simulation, AI inference, and financial risk analysis.

3 weeks ago @ azure.microsoft.com
Research Focus: Week of November 7, 2022

The XXL model variant outperforms both XLM-R XXL and mT5 XXL while being ~2x and ~3x smaller, respectively.

GLUE – or the General Language Understanding Evaluation benchmark – is a collection of resources for training, evaluating, and analyzing natural language understanding systems.

Real-time 3D telemedicine has previously been proposed within a research setting only, with constraints on complexity, bandwidth and technology.

This research reports on an international collaboration on the participatory development and first validated clinical use of a novel, real-time 360-degree 3D telemedicine system worldwide.

However, the correctness of the resulting code with respect to user intent expre…

3 weeks, 2 days ago @ microsoft.com
Research Focus: Week of October 24, 2022

Welcome to Research Focus, a new series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

Meet the 2022 recipients of the Microsoft Research Global PhD Fellowship

Microsoft is thrilled to announce the 2022 Microsoft Research Global PhD Fellows from around the world.

With its complex semantics, biomedical text poses additional challenges in vision-language modelling, and previous work has used insufficiently adapted models that lack domain-specific language understanding.

In this study, we show that principled textual semantic modelling can substantially improve contrastive learning in bio…

1 month ago @ microsoft.com
Introducing Vision Studio, a UI-based demo interface for Computer Vision

Are you looking to improve the analysis and management of images and videos? The Computer Vision API provides access to advanced algorithms for processing media and returning information.

1 month ago @ azure.microsoft.com
Image Analysis 4.0 with new API endpoint and OCR model in preview

We are thrilled to announce the preview release of Computer Vision Image Analysis 4.0, which combines existing and new visual features such as Read optical character recognition (OCR), captioning, image classification and tagging, object detection, people detection, and smart cropping into one API.

1 month, 1 week ago @ azure.microsoft.com
ECCV 2022 highlights: Advancing the foundations of mixed reality

Using synthetic data helps us protect the privacy of data subjects and the rights of photographers and content creators.

There are other advantages to using synthetic data to train ML models, as well.

And it isn’t necessary to apply quality assurance (QA) processes on each labeled image when using synthetic data—another cost- and time-saving benefit.

However, this doesn’t reflect real AR scenarios—the combination of AR devices and applications—and the opportunity they provide.

Using the LaMAR benchmark

LaMAR is the first benchmark that focuses on a realistic setup for visual localization and mapping using AR devices.

1 month, 1 week ago @ microsoft.com
Azure Scales 530B Parameter GPT-3 Model with NVIDIA NeMo Megatron

Combining NVIDIA NeMo Megatron with our Azure AI infrastructure offers a powerful platform that anyone can spin up in minutes without having to incur the costs and burden of managing their own on-premises infrastructure. And of course, we have taken our benchmarking of the new framework to a new level, to truly show the power of the Azure infrastructure.

1 month, 1 week ago @ azure.microsoft.com
MIT AI
latest post 4 hours ago
Large language models help decipher clinical notes

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) believed that to disentangle the data, they needed to call on something bigger: large language models.

Using a set of short, publicly available clinical snippets, they cobbled together a small dataset to enable evaluation of the extraction performance of large language models.

“The research team’s advances in zero-shot clinical information extraction make scaling possible.

Imprisoned in an EHR

Experts have been steadily building up large language models (LLMs) for quite some time, but they burst into the mainstream with GPT-3’s widely covered ability to complete sentences.

“To this end, this work sets for…

4 hours ago @ news.mit.edu
Ushering in a new era of computing

Today, Huttenlocher serves as the inaugural dean at MIT Schwarzman College of Computing.

To highlight the significance of this moment in time, and the need for an interdisciplinary computing hub like the college of computing, he references the oft-cited prediction that software would gobble up and disrupt traditional industry structures.

There, he was the school’s first dean and vice provost, guiding its efforts to tie together industry and computing to enhance New York’s tech ecosystem.

“We want to harness the forefront of results in computing and infuse them with the other disciplines,” he says.

It aims to capitalize on the ubiquity of computing through a coordinated approach to computing…

22 hours ago @ news.mit.edu
Busy GPUs: Sampling and pipelining method speeds up deep learning on large graphs

Now, a new method, called SALIENT (SAmpling, sLIcing, and data movemeNT), developed by researchers at MIT and IBM Research, improves the training and inference performance by addressing three key bottlenecks in computation.

Further, the team found that the technique scales well when computational power is added from one to 16 graphics processing units (GPUs).

“We started to look at the challenges current systems experienced when scaling state-of-the-art machine learning techniques for graphs to really big datasets.

In previous systems, this sampling step was a multi-process approach, creating extra data and unnecessary data movement between the processes.

This new capacity will now allow r…

1 day, 20 hours ago @ news.mit.edu
Breaking the scaling limits of analog computing

MIT researchers have overcome this hurdle and found a way to effectively scale an optical neural network.

Their work could enable a super-fast, energy-efficient, analog neural network that can function with the same accuracy as a digital one.

Multiplying with light

An optical neural network is composed of many connected components that function like reprogrammable, tunable mirrors.

Neural network data are encoded into light, which is fired into the optical neural network from a laser.

This technique also increased the bandwidth of the optical neural network so it can run three times faster.

2 days, 7 hours ago @ news.mit.edu
Teresa Gao named 2024 Mitchell Scholar

MIT senior Teresa Gao has been named one of the 12 winners of the George J. Mitchell Scholarship’s Class of 2024.

Gao is the fifth MIT student to be named a Mitchell Scholar.

Mitchell Scholars are selected on the basis of academic achievement, leadership, and dedication to public service.

Currently, she is working to establish cognitive benchmarks for AI with the MIT Quest for Intelligence.

Completely self-taught on the viola, Gao earned a highly competitive seat in the MIT Chamber Music Society.

1 week, 1 day ago @ news.mit.edu
A simpler path to better computer vision

Instead of designing customized image generation programs for a particular training task, they gathered a dataset of 21,000 publicly available programs from the internet.

Then they used this large collection of basic image generation programs to train a computer vision model.

In the new work, they used an enormous dataset of uncurated image generation programs instead.

They used their massive dataset of image generation programs to pretrain computer vision models for both supervised and unsupervised image classification tasks.

While the accuracy levels were still lower than those of models trained on real data, their technique narrowed the performance gap between models trained on real data and those…

1 week, 1 day ago @ news.mit.edu
A far-sighted approach to machine learning

Researchers from MIT, the MIT-IBM Watson AI Lab, and elsewhere have developed a new approach that gives AI agents a farsighted perspective.

The agents then adapt their behaviors accordingly to influence other agents’ future behaviors and arrive at an optimal, long-term solution.

A key challenge is enabling AI agents to anticipate future behaviors of other agents when they are all learning simultaneously.

Reinforcement learning is a form of machine learning in which an AI agent learns by trial and error.

In both instances, the AI agents using FURTHER won the games more often.

1 week, 1 day ago @ news.mit.edu
Solving brain dynamics gives rise to flexible machine-learning models

These models have the same characteristics as liquid neural nets (flexible, causal, robust, and explainable) but are orders of magnitude faster, and scalable.

Imagine an end-to-end neural network that receives driving input from a camera mounted on a car.

In 2020, the team solved this by using liquid neural networks with 19 nodes, so 19 neurons plus a small perception module could drive a car.

“Neural network systems based on differential equations are tough to solve and scale to, say, millions and billions of parameters.

Getting that description of how neurons interact with each other, not just the threshold, but solving the physical dynamics between cells enables us to build up larger-sc…

2 weeks, 1 day ago @ news.mit.edu
Ensuring AI works with the right dose of curiosity

On the flip side, if you stick with what you know works well, you won't grow out of your narrow pathway.

Machines, in some cases, use “reinforcement learning” to accomplish a goal, where an AI agent iteratively learns from being rewarded for good behavior and punished for bad.

Too much curiosity can distract the agent from making good decisions, while too little means the agent will never discover good decisions.

In the pursuit of making AI agents with just the right dose of curiosity, researchers from MIT’s Improbable AI Laboratory and Computer Science and Artificial Intelligence Laboratory (CSAIL) created an algorithm that overcomes the problem of AI being too “curious” and getting distra…
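The trade-off between too much and too little exploration can be illustrated with a toy example. The sketch below is not the MIT algorithm: it swaps in plain epsilon-greedy exploration on a three-armed bandit, with hypothetical fixed payoffs, where `epsilon` plays the role of the "dose of curiosity".

```python
import random


def run_bandit(epsilon, payoffs=(0.2, 0.5, 0.9), steps=2000, seed=0):
    """Epsilon-greedy agent on a bandit with fixed, hypothetical payoffs per arm.

    Returns the total reward collected and how often each arm was pulled.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(payoffs)  # running mean reward per arm
    counts = [0] * len(payoffs)
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(payoffs))  # explore: pick a random arm
        else:
            # exploit: pick the arm with the best estimate (ties go to the lowest index)
            arm = max(range(len(payoffs)), key=estimates.__getitem__)
        reward = payoffs[arm]
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total, counts


for eps in (0.0, 0.1, 1.0):
    total, counts = run_bandit(eps)
    print(f"epsilon={eps}: total reward={total:.1f}, pulls per arm={counts}")
```

With `epsilon=0.0` the agent settles on the first arm it tries and never finds the better ones; with `epsilon=1.0` it keeps sampling at random even after the best arm is known. A small nonzero value sits between the two.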

3 weeks ago @ news.mit.edu
A whole new world of learning via MIT OpenCourseWare videos

And it was while searching for the answer to an AI-related question that Kasigazi first discovered MIT OpenCourseWare (OCW).

“The search results showed MIT lectures, and I thought, 'Which MIT is this?’” recalls Kasigazi, who admits he was initially skeptical as he opened the OCW YouTube channel.

For Kasigazi, the channel became a gateway to other open education resources, including the OpenCourseWare website and MITx courses, both part of MIT Open Learning.

“I am proud to say MIT OCW has made me fall in love with coding … it makes sense like it never has before,” he says.

Since coming across the OCW YouTube channel, Kasigazi has worked through all of the freely available MIT psychology cour…

3 weeks, 2 days ago @ news.mit.edu
Video on the record

Convenings like this one, she said, “keep awareness going about those injustices.”

Video evidence, then and now
Over the course of the three-day gathering, nine plenary speakers and more than 20 other presenters in thematically grouped sessions cracked open dialogues at the intersection of video and social justice.

The use of video evidence in courtrooms was a recurring theme in several other sessions — including one moderated by Harvard University faculty member and former NAACP president Cornell William Brooks.

Of more than 10,000 instances of police brutality in Triola’s dataset, fewer than 200 led to criminal charges against officers, and only 98 involved video.

But when video evidence di…

3 weeks, 5 days ago @ news.mit.edu
In machine learning, synthetic data can offer real performance improvements

But are synthetic data as “good” as real data?

“The ultimate goal of our research is to replace real data pretraining with synthetic data pretraining.

The researchers were surprised to see that all three synthetic models outperformed models trained with real video clips on four of the six datasets.

“We use synthetic datasets to prevent privacy issues or contextual or social bias, but what does the model actually learn?

Now that they have demonstrated this use potential for synthetic videos, they hope other researchers will build upon their work.

4 weeks ago @ news.mit.edu
Study urges caution when comparing neural networks to the brain

This allows grid cells to encode a large number of unique positions using a relatively small number of cells.

In several recent studies, researchers have trained neural networks to perform this same task, which is known as path integration.

These studies concluded that grid-cell-like representations would naturally emerge in any neural network trained to perform the path integration task.

The researchers say that their findings suggest that more caution is warranted when interpreting neural network models of the brain.

They are now working on models of grid cells that they hope will generate more accurate predictions of how grid cells in the brain work.

4 weeks, 1 day ago @ news.mit.edu
Using sound to model the world

They developed a machine-learning model that can capture how any sound in a room will propagate through the space, enabling the model to simulate what a listener would hear at different locations.

“I think this work opens up an exciting research direction on better utilizing sound to model the world,” Du says.

Sound and vision
In computer vision research, a type of machine-learning model called an implicit neural representation model has been used to generate smooth, continuous reconstructions of 3D scenes from images.

The MIT researchers employed the same type of model to capture how sound travels continuously through a scene.

But with sound, change locations and the sound one hears could be…

1 month ago @ news.mit.edu
Machine learning facilitates “turbulence tracking” in fusion reactors

However, scientists typically study blobs using averaging techniques, which trade details of individual structures in favor of aggregate statistics.

The researchers built a synthetic video dataset of plasma turbulence to make this process more effective and efficient.

“These kinds of events were not considered before with traditional approaches, but we could freely simulate those behaviors in the synthetic data,” Han says.

Then they tested the models using real video data from experiments.

“One goal of this work is to encourage participation in fusion research from the broader machine-learning community toward the broader goal of helping solve the critical problem of climate change,” he adds.

1 month ago @ news.mit.edu
Berkeley AI
latest post 2 months, 1 week ago
Keeping Learning-Based Control Safe by Regulating Distributional Shift

To regulate the distribution shift experienced by learning-based controllers, we seek a mechanism for constraining the agent to regions of high data density throughout its trajectory (left).

The central idea behind our work is to view the training data distribution as a safety constraint, and to draw on tools from control theory to control the distributional shift experienced by the agent during closed-loop control.

To use an LDM in control, we can train an LDM and learning-based controller on the same training dataset and constrain the controller’s action outputs with an LDM constraint ($G(s, a) \leq -\log(c)$).
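As a rough illustration of the constraint $G(s, a) \leq -\log(c)$, the sketch below filters a set of candidate actions against a threshold. Filtering over a discrete candidate set is an assumption made for brevity, `toy_G` is a hypothetical stand-in for a learned LDM, and `filter_actions` is an illustrative name rather than anything from the authors' code.

```python
import math


def filter_actions(state, candidates, G, c):
    """Keep only candidate actions a with G(state, a) <= -log(c)."""
    threshold = -math.log(c)
    return [a for a in candidates if G(state, a) <= threshold]


def toy_G(state, action):
    """Hypothetical stand-in for a learned LDM; larger means further
    from the training data. A real LDM would be a trained model."""
    return abs(action - state)


# With c = 0.5 the threshold is -log(0.5), roughly 0.693.
safe = filter_actions(0.0, [-2.0, -0.5, 0.0, 0.5, 2.0], toy_G, c=0.5)
```

Here a smaller `c` loosens the threshold and admits more actions, while `c` close to 1 keeps the controller near the data.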

The ce…

2 months, 1 week ago @ bair.berkeley.edu
Reverse engineering the NTK: towards first-principles architecture design

Foundational works showed how to find the kernel corresponding to a wide network.

The NTK of a 4HL $\textrm{ReLU}$ FCN as a function of the cosine between two input vectors $x_1$ and $x_2$.

Shallowification of a deep $\textrm{ReLU}$ FCN into a 1HL FCN with an engineered activation function $\tilde{\phi}$.

4 below shows a “mimic” activation function $\tilde{\phi}$ that gives virtually the same NTK as a deep $\textrm{ReLU}$ FCN.

This is interesting from an engineering perspective because the shallow network uses fewer parameters than the deep network to achieve the same performance.

3 months ago @ bair.berkeley.edu
Why do Policy Gradient Methods work so well in Cooperative MARL? Evidence from Policy Representation

In cooperative multi-agent reinforcement learning (MARL), due to their on-policy nature, policy gradient (PG) methods are typically believed to be less sample efficient than value decomposition (VD) methods, which are off-policy.

CTDE in Cooperative MARL: VD and PG methods
Centralized training and decentralized execution (CTDE) is a popular framework in cooperative MARL.

VD methods learn local Q networks and a mixing function that mixes the local Q networks to a global Q function.
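To make the mixing idea concrete, here is a minimal sketch that assumes the simplest possible mixing function: a plain sum of local Q-values. The source only says "a mixing function", so the additive choice, the tabular Q-values, and the helper names are all illustrative assumptions.

```python
def mix_additive(local_q_values):
    """Assumed mixing function: the global Q is the sum of local Qs."""
    return sum(local_q_values)


def greedy_joint_action(local_q_tables):
    """Each agent picks the argmax of its own local Q-values.

    With additive mixing, these decentralized choices also maximize
    the global Q.
    """
    return [max(range(len(q)), key=q.__getitem__) for q in local_q_tables]


def global_q(local_q_tables, joint_action):
    """Global Q-value of a joint action under the additive mixer."""
    chosen = [q[a] for q, a in zip(local_q_tables, joint_action)]
    return mix_additive(chosen)


# Two agents with two actions each (toy numbers).
tables = [[1.0, 3.0], [2.0, 0.5]]
joint = greedy_joint_action(tables)
total = global_q(tables, joint)
```

The additive form is what lets training use the global Q while each agent acts on its local Q alone at execution time.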

By contrast, PG methods directly apply policy gradient to learn an individual policy and a centralized value function for each agent.

The permutation game: a simple counterexample w…

4 months, 3 weeks ago @ bair.berkeley.edu
FIGS: Attaining XGBoost-level performance with the interpretability and speed of CART

FIGS (Fast Interpretable Greedy-tree Sums): A method for building interpretable models by simultaneously growing an ensemble of decision trees in competition with one another.

In this blog post we’ll cover FIGS, a new method for fitting an interpretable model that takes the form of a sum of trees.

Real-world experiments and theoretical results show that FIGS can effectively adapt to a wide range of structure in data, achieving state-of-the-art performance in several settings, all without sacrificing interpretability.

from imodels import FIGSClassifier , get_clean_dataset from sklearn.model_selection impor…

5 months ago @ bair.berkeley.edu
The Berkeley Crossword Solver

We recently built the Berkeley Crossword Solver (BCS), the first computer program to beat every human competitor in the world’s top crossword tournament.

in Berkeley (3)
Domain ender that UC Berkeley was one of the first schools to adopt (3)
Angeleno at Berkeley, say (8)

Our Approach
The BCS uses a two-step process to solve crossword puzzles.

Compared to the previous state-of-the-art method for answering crossword clues, this approach obtained a 13.4% absolute improvement in top-1000 QA accuracy.

Winning The American Crossword Puzzle Tournament
The American Crossword Puzzle Tournament (ACPT) is the largest and longest-running crossword tournament and is organiz…

6 months, 2 weeks ago @ bair.berkeley.edu
Rethinking Human-in-the-Loop for Artificial Augmented Intelligence

Figure 1: In real-world applications, we think there exists a human-machine loop where humans and machines are mutually augmenting each other.

For demonstration, we designed a recognition framework that was a combination of active learning, semi-supervised learning, and human-in-the-loop (Figure 3).

Low-confidence predictions are sent for human annotation, and high-confidence predictions are trusted for downstream tasks or pseudo-labels for model updates.
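The routing rule described above can be sketched as a simple confidence threshold. The threshold value of 0.9, the tuple layout, and the function name `route_predictions` are assumptions for illustration; the post does not specify them.

```python
def route_predictions(predictions, threshold=0.9):
    """Split predictions into human-review and trusted groups.

    predictions: iterable of (item_id, label, confidence) tuples.
    threshold: assumed cutoff; below it an item goes to a human.
    """
    to_human, trusted = [], []
    for item_id, label, confidence in predictions:
        if confidence < threshold:
            to_human.append(item_id)
        else:
            # Trusted labels can feed downstream tasks or be used as
            # pseudo-labels for model updates.
            trusted.append((item_id, label))
    return to_human, trusted


preds = [("img1", "deer", 0.97), ("img2", "fox", 0.42), ("img3", "deer", 0.91)]
to_human, trusted = route_predictions(preds)
```

Raising the threshold sends more items to annotators and yields cleaner pseudo-labels; lowering it reduces human effort at the cost of noisier labels.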

Thus, the goal of AI development changes from replacing human intelligence to mutually augmenting both human and machine intelligence.

However, this goal of replacing human e…

7 months ago @ bair.berkeley.edu
Designing Societally Beneficial Reinforcement Learning Systems

Deep reinforcement learning (DRL) is transitioning from a research field focused on game playing to a technology with real-world applications.

At the same time as the emergence of powerful RL systems in the real world, the public and researchers are expressing an increased appetite for fair, aligned, and safe machine learning systems.

A Taxonomy of Feedback
Reinforcement learning systems are often spotlighted for their ability to act in an environment, rather than passively make predictions.

Other supervised machine learning systems, such as computer vision, consume data and return a prediction that can be used by some decision ma…

7 months ago @ bair.berkeley.edu
Should I Use Offline RL or Imitation Learning?

Are there fundamental limitations to methods that rely on some form of imitation (BC, conditional BC, filtered BC) that offline RL addresses?

While it might be clear that offline RL should enjoy a large advantage over imitation learning when learning from diverse datasets that contain a lot of suboptimal behavior, we will also discuss how even cases that might seem BC-friendly can still allow offline RL to attain significantly better results.

Empirical Results Comparing Offline RL and BC
In our discussion so far, we have already studied settings such as the antmazes, where offline RL methods can significantly outperform imitation-style methods d…

7 months, 1 week ago @ bair.berkeley.edu
Offline RL Made Easier: No TD Learning, Advantage Reweighting, or Transformers

A demonstration of the RvS policy we learn with just supervised learning and a depth-two MLP.

Offline reinforcement learning (RL) is conventionally approached using value-based methods based on temporal difference (TD) learning.

These algorithms learn conditional policies by conditioning on goal states (Lynch et al., 2019; Ghosh et al., 2021), reward-to-go (Kumar et al., 2019; Chen et al., 2021), or language descriptions of the task (Lynch and Sermanet, 2021).
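For the reward-to-go variant, the conditioning signal at each step is the sum of rewards from that step to the end of the trajectory. The sketch below computes it for one trajectory; the undiscounted sum and the function name `rewards_to_go` are assumptions for illustration, and the papers cited above may define the quantity differently.

```python
def rewards_to_go(rewards):
    """Return the undiscounted sum of rewards from each step onward."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backward so each entry accumulates everything after it.
    for t in reversed(range(len(rewards))):
        running += rewards[t]
        rtg[t] = running
    return rtg


trajectory_rewards = [1.0, 0.0, 2.0]
rtg = rewards_to_go(trajectory_rewards)
# A conditional policy would then be trained with supervised learning
# on inputs (state_t, rtg[t]) and targets action_t.
```

At test time, conditioning on a high reward-to-go value asks the policy to imitate the behavior that led to high returns in the dataset.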

The video above shows the complex behavior we learn using just supervised learning with a depth-two MLP – no TD learning, data reweighting, or Transformer…

7 months, 2 weeks ago @ bair.berkeley.edu
Accelerating Ukraine Intelligence Analysis with Computer Vision on Synthetic Aperture Radar Imagery

EO imagery is commonplace—anyone who has used Google Maps or similar mapping software has interacted with EO satellite imagery.

In general, existing computer vision methods on other, non-aerial RGB imagery transfer very well to satellite imagery.

Synthetic Aperture Radar Imagery
Synthetic aperture radar (SAR) imagery is an active form of remote sensing in which a satellite transmits pulses of microwave radar waves down to the surface of the Earth.

Computer Vision on SAR Imagery for Ukraine
Imagery analysts are currently relying on both EO and SAR imagery where available over Ukraine.

Our top performing method, MAERS, for representation learning on RGB, SAR, and co-registered RGB + SAR build…

8 months, 2 weeks ago @ bair.berkeley.edu
Unsupervised Skill Discovery with Contrastive Intrinsic Control

Unsupervised Reinforcement Learning (RL), where RL agents pre-train with self-supervised rewards, is an emerging paradigm for developing RL agents that are capable of generalization.

This tension between the need to support large skill spaces and the limitation of current discriminators leads us to propose Contrastive Intrinsic Control (CIC).

Contrastive Intrinsic Control (CIC) introduces a new contrastive density estimator to approximate the conditional entropy (the discriminator).

For a practical algorithm, we use the CIC contrastive skill learning as an auxiliary loss during pre-training.

Our hope is that our approach encoura…

9 months, 1 week ago @ bair.berkeley.edu
AWS Machine Learning
latest post 3 hours ago
Protecting Consumers and Promoting Innovation – AI Regulation and Building Trust in Responsible AI

We recognize that responsible AI is the shared responsibility of all organizations that develop and deploy AI systems.

We are committed to providing tools and resources to aid customers using our AI and machine learning (ML) services.

This week at our re:Invent 2022 conference, we announced the launch of AWS AI Service Cards, a new transparency resource to help customers better understand our AWS AI services.

The new AI Service Cards deliver a form of responsible AI documentation that provides customers with a single place to find information.

Each AI Service Card covers four key topics to help you better understand the service or service features, including intended use cases and limitatio…

3 hours ago @ aws.amazon.com
Stability AI builds foundation models on Amazon SageMaker

We’re thrilled to announce that Stability AI has selected AWS as its preferred cloud provider to power its state-of-the-art AI models for image, language, audio, video, and 3D content generation.

Stability AI is a community-driven, open-source artificial intelligence (AI) company developing breakthrough technologies.

With Amazon SageMaker, Stability AI will build AI models on compute clusters with thousands of GPU or AWS Trainium chips, reducing training time and cost by 58%.

“Our mission at Stability AI is to build the foundation to activate humanity’s potential through AI.

Generative AI models and Stable Diffusion
Generative AI models can create text, images, audio, video, code, and more f…

20 hours ago @ aws.amazon.com
Launch Amazon SageMaker Autopilot experiments directly from within Amazon SageMaker Pipelines to easily automate MLOps workflows

For more information, see Move Amazon SageMaker Autopilot ML models from experimentation to production using Amazon SageMaker Pipelines.

Eventually, the model can be registered into the SageMaker model registry using the Model step in combination with a Condition step.

The following steps are required for this end-to-end Autopilot training process: Create and monitor an Autopilot training job using the AutoMLStep.

Summary
This post describes an easy-to-use ML pipeline approach to automatically train tabular ML models (AutoML) using Autopilot, Pipelines, and Studio.

For more information on Autopilot and Pipelines, refer to Automate model development with Amazon SageMaker Autopilot and Amazon…

22 hours ago @ aws.amazon.com
AI21 Jurassic-1 foundation model is now available on Amazon SageMaker

Today we are excited to announce that AI21 Jurassic-1 (J1) foundation models are available for customers using Amazon SageMaker.

You can now find foundation models from different model providers within JumpStart, enabling you to get started with foundation models quickly.

Amazon SageMaker offers the deepest and broadest set of ML services, and we’re excited to collaborate with Amazon SageMaker so that customers will be able to use these foundation models on SageMaker within their development environment.

Evaluate the Jurassic-1 Grande model with a test widget
On the Jurassic-1 Grande listing, choose View Model.

Conclusion
In this post, we showed you how you can test and use AI21’s Jurassic Gr…

22 hours ago @ aws.amazon.com
Introducing AWS AI Service Cards: A new resource to enhance transparency and advance responsible AI

To deliver the transparency that customers are asking for, we are excited to launch AWS AI Service Cards, a new resource to help customers better understand our AWS AI services.

AI Service Cards are a form of responsible AI documentation that provides customers with a single place to find information on the intended use cases and limitations, responsible AI design choices, and deployment and performance optimization best practices for our AI services.

AI Service Cards complement our existing developer guides and blog posts, which provide builders with descriptions of service features and detailed instructions for using our service APIs.

Our hope is that AI Service Cards will act as a useful …

23 hours ago @ aws.amazon.com
AWS Unveils New AI Service Features and Enhancements at re:Invent 2022

AWS AI services help customers create smoother, faster, and more efficient engagements with customers, driving greater efficiencies and lowering operational costs.

Customers can use AWS AI services with no ML expertise required.

Customers from different industries rely on AWS AI services to improve efficiency and reduce operational costs.

Conclusion
With these new features and capabilities, AWS continues to expand its portfolio of the broadest and deepest set of AI services.

2 days, 15 hours ago @ aws.amazon.com
Deploy an MLOps solution that hosts your model endpoints in AWS Lambda

Models are tracked and registered in the Amazon SageMaker model registry.

This pipeline automates and connects the data preprocessing, model training, model metrics tracking in SageMaker Experiments, data postprocessing, and model cataloging in SageMaker model registry.

If you follow the examples in Amazon SageMaker Projects, you get a template that hosts your model using a SageMaker endpoint.

Navigate the SageMaker Pipelines and SageMaker Experiments UI
A SageMaker pipeline is a series of interconnected steps that are defined using the Amazon SageMaker Python SDK.

Approve Lambda deployment in the model registry
As a next step, navigate to the model registry under SageMaker resources.

3 days ago @ aws.amazon.com
Introducing Amazon Kendra tabular search for HTML Documents

Amazon Kendra users can now quickly find the information they need from tables on a webpage (HTML tables) using Amazon Kendra tabular search.

In this post, we provide an example of how to use Amazon Kendra tabular search.

Get started with Amazon Kendra tabular search
Amazon Kendra tabular search is turned on by default and no special configuration is required to enable it.

For newer documents, Amazon Kendra tabular search will work by default.

To learn more about Amazon Kendra, visit the Amazon Kendra product page.

3 days, 11 hours ago @ aws.amazon.com
Enterprise administrative controls, simple sign-up, and expanded programming language support for Amazon CodeWhisperer

Amazon CodeWhisperer is a machine learning (ML)-powered service that helps improve developer productivity by generating code recommendations based on developers’ prior code and comments.

Additionally, individual users who don’t have AWS accounts can now use CodeWhisperer using their personal email with AWS Builder ID.

The sign-up process takes only a few minutes and enables developers to start using CodeWhisperer immediately without any waitlist.

In this post, we discuss enterprise administrative controls, the new AWS Builder ID sign-up for CodeWhisperer, and support for new programming languages.

About the Authors
Bharadwaj Tanikella is a Senior Product Manager for Amazon CodeWhisperer.

3 days, 13 hours ago @ aws.amazon.com
Optimize hyperparameters with Amazon SageMaker Automatic Model Tuning

In this post, we set up and run our first HPO job using Amazon SageMaker Automatic Model Tuning (AMT).

First, we familiarize ourselves with the environment and SageMaker Training by running a standalone training job, without any tuning for now.

Define the hyperparameters – SageMaker provides an interface to define the hyperparameters for our built-in algorithm.

Construct the estimator
We configure the training on an estimator object, which is a high-level interface for SageMaker Training.

Although SageMaker AMT orchestrates the HPO jobs, the HPO trials are all launched as individual SageMaker Training jobs and can be accessed as such.

6 days, 1 hour ago @ aws.amazon.com
How JPMorgan Chase & Co. uses AWS DeepRacer events to drive global cloud adoption

JPMorgan Chase’s AWS DeepRacer learning program was born in Chicago in 2019.

Our AWS DeepRacer learning program now runs in 20 cities and 3,500 people have participated over the past two years.

We recently introduced the AWS DeepRacer Driving License, so hiring managers can see that applicants have attained a recognized standard.

As a testament to the work our employees have done with AWS DeepRacer, seven of the 40 racers in AWS’s global championships were JPMorgan Chase technologists.

He also leads the JPMorgan Chase DeepRacer Learning Program to grow his team building skills and support the firm’s widespread public cloud adoption.

1 week ago @ aws.amazon.com
Apply fine-grained data access controls with AWS Lake Formation and Amazon EMR from Amazon SageMaker Studio

We’re excited to announce that Studio now supports applying this fine-grained data access control with Lake Formation when accessing data through Amazon EMR.

Separately, jobs submitted to Amazon EMR from Studio notebooks were unable to apply fine-grained data access control with Lake Formation.

For more details on using runtime roles with Amazon EMR, see Configure runtime roles for Amazon EMR steps.

Finally, refer to Prepare Data using Amazon EMR for detailed setup and networking instructions on integrating Studio with EMR clusters.

To learn more about using EMR with SageMaker Studio, visit Prepare Data using Amazon EMR.

1 week ago @ aws.amazon.com
AWS Cloud technology for near-real-time cardiac anomaly detection using data from wearable devices

When the data from the remote wearable devices reaches AWS IoT Core, it can be sent using an AWS IoT rule and associated actions.

The rule extracts data from the raw stream using a simple SQL statement, as outlined by the following AWS IoT Core rule definition SQL code.

On the Timestream console, you can observe and monitor various database metrics, as shown in the following screenshot.

ECG data processing
The processing layer is composed of Amazon EventBridge, Lambda, and Amazon Rekognition.

In this post, we explored one way to ingest, process, and monitor live ECG data generated from a synthetic wearable device in order to provide insights to help determine if anomalies might be present in…

1 week ago @ aws.amazon.com
Identifying landmarks with Amazon Rekognition Custom Labels

If you have other landmarks or buildings not yet supported by Amazon Rekognition, you can still use Amazon Rekognition Custom Labels.

You can then use your custom model via the Rekognition Custom Labels API and integrate it into your applications.

Create a project
To create your Rekognition Custom Labels project, complete the following steps: On the Rekognition Custom Labels console, choose Create a project.

But if you have other landmarks currently not yet supported by Amazon Rekognition Labels, look no further and try out Amazon Rekognition Custom Labels.

For more information about using custom labels, see What Is Amazon Rekognition Custom Labels?

1 week, 1 day ago @ aws.amazon.com
Implementing Amazon Forecast in the retail industry: A journey from POC to production

Recently, based on Amazon Forecast, we helped one of our retail customers achieve accurate demand forecasting, within 8 weeks.

In this post, we present the workflow and the critical elements to implement—from proof of concept (POC) to production—a demand forecasting system with Amazon Forecast, focused on challenges in the retail industry.

Background and current challenges of demand forecasting in the retail industry
The goal of demand forecasting is to estimate future demand from historical data, and to help store replenishment and capacity allocation.

However, retailers face multiple challenges when implementing ML-based demand forecasting systems into production.

In the retail industry, t…

1 week, 1 day ago @ aws.amazon.com
NVIDIA
latest post 1 hour ago
Meet the Omnivore: Cloud Architect Takes Infrastructure Visualization to New Heights With NVIDIA Omniverse

As a Microsoft Certified Azure cloud specialist and DevOps automation engineer, Gavin Stevens is deeply in tune with cloud architect workflows.

Dubbed Meta Cloud Explorer, the open-source extension generates digital 3D models of engineers’ cloud infrastructure components at scale, based on contextual metadata from their Azure cloud portals.

“There’s no shortage of ‘infrastructure diagram generation’ tools that can produce 2D representations of your cloud infrastructure,” Stevens said.

Discover how to build an Omniverse extension in less than 10 minutes.

Follow NVIDIA Omniverse on Instagram, Medium, Twitter and YouTube for additional resources and inspiration.

1 hour ago @ blogs.nvidia.com
Cheers to AI: Monarch Tractor Launches First Commercially Available Electric, ‘Driver Optional’ Smart Tractor

Local startup Monarch Tractor has announced the first of six Founder Series MK-V tractors are rolling off the production line at its headquarters.

The debut caps a two-year development sprint since Monarch, founded in 2018, hatched plans to deliver its smart tractor, complete with the energy-efficient NVIDIA Jetson edge AI platform.

The MK-V tractor cuts energy costs and diesel emissions, while also helping reduce harmful herbicides, which are expensive and deplete the soil.

Leading Farming AI Wave of Clean Tractors
Monarch Tractor founders include veterans of Silicon Valley’s EV scene who worked together at startup Zoox, now Amazon owned.

The NVIDIA Jetson platform provides energy-efficient…

1 hour ago @ blogs.nvidia.com
GFN Thursday Dashes Into December With 22 New Games, Including ‘Marvel Midnight Suns’ Streaming Soon

Rise up for Marvel’s Midnight Suns, from publisher 2K Games, streaming on GeForce NOW later this month.

Time to Assemble

From the creators of XCOM, and published by 2K Games, Marvel’s Midnight Suns is a tactical role-playing game set in the darker, supernatural side of the Marvel Universe.

It launches on Steam on Friday, Dec. 2, with GeForce NOW members getting into the action later this month.

Play as “The Hunter,” a legendary demon slayer with a mysterious past and the first-ever customizable superhero in the Marvel Universe.

Stay tuned for updates on the game’s release on GeForce NOW.

3 hours ago @ blogs.nvidia.com
Improving Machine Learning Security Skills at a DEF CON Competition

Machine learning (ML) security is a new discipline focused on the security of machine learning systems and the data they are built upon.

NVIDIA recently helped run an innovative ML security competition at the DEF CON 30 hacking and security conference.

The competition proved to be a valuable opportunity for participants to develop and improve their machine learning security skills.

The NVIDIA AI Red Team and AI Village joined together at DEF CON 30 to engage the information security community with a machine learning security competition.

With this familiar format in mind, the AI Village and NVIDIA AI Red Team built The AI Village CTF @ DEFCON.

20 hours ago @ developer.nvidia.com
Designing an Optimal AI Inference Pipeline for Autonomous Driving

To achieve a low latency inference workflow, electric vehicle manufacturer NIO integrated NVIDIA Triton Inference Server into their AD inference pipeline.

NVIDIA Triton Inference Server is an open source multi-framework inference serving software.

It also shows how NIO reduced network transmission to successfully speed up their AI inference workflow for AD use cases.

Istio is used to load balance traffic to NVIDIA Triton Inference Server and monitor the health of the service through liveness/readiness probes of NVIDIA Triton.

By moving the preprocessing logic to GPU using the NVIDIA Triton pipeline orchestration functionality, NIO achieved: faster image processing, freed CPU capacity, reduced ne…

23 hours ago @ developer.nvidia.com
Qubit Pharmaceuticals Accelerates Drug Discovery With Hybrid Quantum Computing

And companies are already making headway with hybrid approaches — those that combine classical and quantum computing — to tackle challenges like drug discovery for incurable diseases.

Qubit is building a drug discovery platform using the NVIDIA QODA programming model for hybrid quantum-classical computers and the startup’s Atlas software suite.

Qubit has one of France’s largest GPU supercomputers for drug discovery, powered by NVIDIA DGX systems.

The supercomputer runs Qubit’s Atlas software, performing in just a few hours calculations that would take several years with conventional methods.

The company’s Atlas software includes AI algorithms for every stage of the drug discovery cycle.

1 day, 9 hours ago @ blogs.nvidia.com
Siemens Taps Omniverse Replicator on AWS for Synthetic Data Generation to Accelerate Defect Detection Model Development by 5X

Synthetic data is turbocharging model development.

It’s boosting data sets for everything from German company Festo’s robotic arm work, to efforts at Amazon Robotics using synthetic data to train robots to identify packages.

At Siemens, synthetic data generation is being used beyond defect detection to assist in areas including, but not limited to, robotic bin picking, safety monitoring, welding and wiring inspections, and checking kits of parts.

Common synthetic data generation methods, however, weren’t sufficient for production-ready robustness in some use-cases, leading to a need for real data acquisition and labeling, which could take months.

To catch PCB defects, the Siemens Digital In…

2 days, 1 hour ago @ blogs.nvidia.com
3D Artist and Educator Hsin-Chien Huang Takes VR to the World Stage This Week ‘In the NVIDIA Studio’

3D artist, virtual reality expert, storyteller and educator Hsin-Chien Huang shares his unique creator journey and award-winning artwork Samsara this week In the NVIDIA Studio.

Unity’s light baking and Autodesk Maya’s Arnold renderer both require powerful GPUs, and his GeForce RTX 3070 GPU was equal to the task.

“Nowadays, a powerful GeForce RTX GPU is an indispensable tool for digital artists,” Huang said.

“Although the resolutions of these scanned models are low, it has the aesthetic of pixel art,” Huang said.

Yet again, his RTX GPU accelerates AI for the sharpening of images while retaining high-fidelity details.

2 days, 3 hours ago @ blogs.nvidia.com
NVIDIA Wins NeurIPS Awards for Research on Generative AI, Generalist AI Agents

Two NVIDIA Research papers — one exploring diffusion-based generative AI models and another on training generalist AI agents — have been honored with NeurIPS 2022 Awards for their contributions to the field of AI and machine learning.

“AI is an incredibly important technology, and NVIDIA is making fast progress across the gamut — from generative AI to autonomous AI agents,” said Jan Kautz, vice president of learning and perception research at NVIDIA.

“In generative AI, we are not only advancing our theoretical understanding of the underlying models, but are also making practical contributions that will reduce the effort of creating realistic virtual worlds and simulations.”

Reimagining the D…

3 days, 3 hours ago @ blogs.nvidia.com
MAP Once, Run Anywhere: MONAI Introduces Framework for Deploying Medical Imaging AI Apps

“Until now, most AI models would remain in an R&D loop, rarely reaching patient care,” said Jorge Cardoso, chief technology officer at the London Medical Imaging & AI Centre for Value-Based Healthcare.

Qure.ai: A member of the NVIDIA Inception program for startups, Qure.ai develops medical imaging AI models for use cases including lung cancer, traumatic brain injuries and tuberculosis.

Putting Medical Imaging AI on the MAP

The MAP specification was developed by the MONAI Deploy working group, a team of experts fro…

3 days, 3 hours ago @ blogs.nvidia.com
NVIDIA Partners With NHS Trusts to Deploy AI Platform in UK Hospitals

It’s built on MONAI, an open-source medical imaging AI framework co-developed by NVIDIA and the AI Centre, which allows AI applications to interface with hospital systems.

“These platforms will provide a scalable way for clinicians to deploy healthcare AI tools to support decision-making to improve the speed and precision of patient care.

“Across the healthcare ecosystem, researchers, hospitals and startups are realizing the power of incorporating a streamlined AI pipeline into their work,” said Haris Shuaib, AI transformation lead at the AI Centre.

The AI Centre has already developed algorithms to improve diagnosis of COVID-19, breast cancer, brain tumor, stroke detection and dementia risk…

3 days, 3 hours ago @ blogs.nvidia.com
Explainer: What Is a Machine Learning Model?

For the journey to AI, the most transformational technology of our time, the engine you need is a machine learning model.

What Is a Machine Learning Model?

Under the hood, a machine learning model is a mathematical representation of objects and their relationships to each other.

It uses a kind of machine learning model called a neural network because it was inspired by the patterns and functions of brain cells.

An ML Model for the Masses

Deep learning took its name from the structure of its machine learning models.

5 days, 21 hours ago @ blogs.nvidia.com
Turn Black Friday Into Green Thursday With New GeForce NOW Deal

For a limited time, get a free $20-value GeForce NOW membership gift card with every purchase of a $50-value GeForce NOW membership gift card.

Instant Streaming, Instant Savings

For one week only, from Nov. 23-Dec. 2, purchase a $50-value gift card (good toward a three-month RTX 3080 membership or a six-month Priority membership) and get a bonus $20-value GeForce NOW membership gift card for free, which is good toward a one-month RTX 3080 membership or a two-month Priority membership.

Recipients will be able to redeem these gift cards for the GeForce NOW membership level of their choice.

The $20-value free gift card will be delivered as a digital code — providing instant savings for instan…

1 week ago @ blogs.nvidia.com
What Is a Smart Hospital?

A smart hospital uses data and technology to accelerate and enhance the work healthcare professionals and hospital management are already doing, such as tracking hospital bed occupancy, monitoring patients’ vital signs and analyzing radiology scans.

What’s the Difference Between a Smart Hospital and a Traditional Hospital?

Smart hospital technology benefits healthcare systems, medical professionals and patients in the following ways:

Healthcare providers: Smart hospital data can be used to help healthcare facilities optimize their limited resources, increasing operational efficiency for a better patient-centric approach.

Telemedicine: Smart Hospital Technology at Home

Another part of smart h…

1 week, 1 day ago @ blogs.nvidia.com
Improving Network Performance of HPC Systems Using NVIDIA Magnum IO NVSHMEM and GPUDirect Async

It enables the GPU to bypass the CPU when issuing internode NVSHMEM communication without any changes to existing applications.

When using a proxy thread, NVSHMEM performs the following sequence of operations:

The application launches a CUDA kernel that produces data in GPU memory.

The application calls an NVSHMEM operation (such as nvshmem_put) to communicate with another processing element (PE).

The NVSHMEM proxy thread detects the work descriptor and initiates the corresponding network operation.

Magnum IO NVSHMEM evaluation

We compared the performance of the NVSHMEM IBGDA transport with the NVSHMEM IBRC transport, which uses a proxy thread to manage communication.

1 week, 2 days ago @ developer.nvidia.com
Facebook
last post 1 month ago
Improving Instagram notification management with machine learning and causal inference

We’re sharing how Meta is applying statistics and machine learning (ML) to improve notification personalization and management on Instagram – particularly on daily digest push notifications.

At Meta, we have been applying statistics and machine learning (ML) for notification personalization and management on Instagram.

Today, we would like to share an example of how we used causal inference and ML to control sending for daily digest push notifications.

By doing so, we intend to maintain a fixed notification sending rate r where 0 < r < 1.

In the Instagram Notifications Systems team, ML and statistics have been applied in different areas to improve user notification experience.
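
One way to picture a fixed sending rate r is as a ranking cutoff: score every candidate notification and send only the top fraction r. The sketch below is an illustration under that assumption, not Meta's method; the excerpt does not say how the rate is enforced, and `select_to_send` and its score inputs are hypothetical names.

```python
import math


def select_to_send(scores, r):
    """Return the indices of the top-r fraction of candidates by score.

    Hypothetical helper: `scores` stands in for whatever model output
    ranks candidate notifications; the source does not specify it.
    """
    if not 0 < r < 1:
        raise ValueError("r must satisfy 0 < r < 1")
    # Number of notifications allowed under the fixed rate.
    k = math.floor(r * len(scores))
    # Rank candidate indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

With four candidates and r = 0.5, the function keeps the two highest-scoring ones, so the realized sending rate matches r whenever r times the candidate count is a whole number.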

1 month ago @ engineering.fb.com
Scaling data ingestion for machine learning training at Meta

To facilitate the level of data ingestion required to support the training models supporting our products, we’ve had to build a new data ingestion infrastructure as well as new last-mile transformation pipelines.

In the sections below, we share our experience building data ingestion and last-mile data preprocessing pipelines that are responsible for feeding data into AI training models.

Data ingestion pipeline overview

We have exabytes of training data powering our models, and the amount of training data is growing rapidly.

We have built a disaggregated Data PreProcessing tier (DPP) that serves as the reader tier for data ingestion and last-mile data transformations for AI training [Ref].

Sc…

3 months, 3 weeks ago @ engineering.fb.com
Applying federated learning to protect data on mobile devices

FL-DP enhances privacy in two important ways:

It allows machine learning (ML) models to be trained in a distributed way so that users’ data remains on their mobile devices.

It adds noise to reduce the risk of an ML model memorizing user data.

Such an approach could enhance user privacy while still facilitating an intelligent, safe, and intuitive user experience across Meta’s family of technologies.

How it works:

With FL-DP, ML models are trained in a federated manner where mobile devices learn locally.

This architecture is a combination of infrastructure across mobile devices, trusted execution environments, and conventional back-end servers.
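
The two ideas in this excerpt, local training plus added noise, can be sketched as a single aggregation step. This is a minimal illustration rather than Meta's implementation: it assumes each device sends a model update as a list of floats, that updates are norm-clipped before averaging, and that Gaussian noise is added to the average. The excerpt does not say where noise is applied, and `clip`, `dp_federated_average`, `max_norm` and `noise_std` are hypothetical names.

```python
import math
import random


def clip(update, max_norm):
    """Scale an update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def dp_federated_average(client_updates, max_norm, noise_std, rng):
    """Average clipped client updates and add Gaussian noise.

    Raw user data never appears here: only model updates computed
    on-device are aggregated (an assumption of this sketch).
    """
    clipped = [clip(u, max_norm) for u in client_updates]
    n = len(clipped)
    dim = len(clipped[0])
    mean = [sum(u[j] for u in clipped) / n for j in range(dim)]
    # Noise limits how much any single client's update can be recovered.
    return [m + rng.gauss(0.0, noise_std) for m in mean]
```

Clipping bounds each device's influence on the average, which is what makes a fixed noise scale meaningful; a larger `noise_std` trades model accuracy for stronger protection.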

5 months, 2 weeks ago @ engineering.fb.com
VESPA: Static profiling for binary optimization

What the research is:

Recent research has demonstrated that binary optimization is important for achieving peak performance for various applications.

VESPA expands on ESP in several ways to make it useful in the context of binary optimizers.

VESPA increases the scope where binary optimizers can be used, thus enhancing the range of applications that can leverage these tools to improve their performance.

Once the static profile data produced by VESPA is injected into a binary optimizer, this tool can proceed with its optimization steps as usual, completely oblivious to how the profile data was computed.

VESPA, therefore, can very easily be integrated into existing binary optimizers, which we d…

8 months, 3 weeks ago @ engineering.fb.com
Uber Engineering
last post 4 months ago
ML Education at Uber: Program Design and Outcomes

Introduction

If you have read our previous article, ML Education at Uber: Frameworks Inspired by Engineering Principles, you have seen several examples of how Uber benefits from applying Engineering Principles to drive the ML Education Program’s content design and program frameworks.

How were the ML Education program creators able to capture and communicate this value so that the program could scale to what it is today?

When a larger percentage of non-ML engineers attend ML Education courses it means that we are distilling ML expertise to the broader ML market, increasing the overall internal ML market size for Uber.

Conclusion

Uber’s ML Education Program ha…

4 months ago @ eng.uber.com
ML Education at Uber: Frameworks Inspired by Engineering Principles

Part 1 will introduce our design principles and explain the benefits of applying these principles to technical education content design and program frameworks, specifically in the ML domain.

Core Principles of Uber’s ML Education Program

The capabilities of Uber’s ML infrastructure and ecosystem have enabled us to design, implement, and ground our ML Education program in our design principles.

Aside from the core principle of reproducibility discussed above, we have a list of other design principles that comprise Uber’s ML Education program:

Because our subject matter is highly technical, we felt it appropriate to derive our design principles from industry-recognized engineering principles.

H…

4 months ago @ eng.uber.com
Uber’s Real-Time Document Check

Real-Time Document Check Criteria

From the onset, we knew that the Real-Time Document Check product needed to meet 4 non-negotiable criteria:

Data privacy: Adherence to best practices for handling personal data, taking into account local laws, regulations, and norms in all countries where the product is available.

In the Document Image Processing module, a list of operations (including document classification, transcription, and fraud detection) are applied to the uploaded document images via different technologies (e.g., 3rd-party vendor, Uber in-house technology, and human review).

Looking into the Future

Real World Impact

As of May 2022, Real-Time ID Document Check is live in Brazil, Mexico,…

5 months, 3 weeks ago @ eng.uber.com
DeepETA: How Uber Predicts Arrival Times Using Deep Learning

By training machine learning (ML) models on top of the road graph prediction using historical data in combination with real-time signals, we can refine ETAs that better predict real-world outcomes.

To meet these challenges, Uber AI partnered with Uber’s Maps team on a project called DeepETA to develop a low-latency deep neural network architecture for global ETA prediction.

We take a similar approach to ETA prediction at Uber.

Conclusions and Future Work

We have launched this DeepETA model into production for global 4-wheel ETA prediction.

The DeepETA model launch makes it both possible and efficient to train and serve large-scale Deep Learning models that predict ETAs better than XGBoost ap…

9 months, 3 weeks ago @ eng.uber.com
neptune.ai
last post 3 weeks, 6 days ago
Argo vs Airflow vs Prefect: How Are They Different

Feature comparison: Argo, Airflow, Prefect

Workflows: Argo: Dynamic workflow; Airflow: Static workflow; Prefect: Dynamic workflow

Kubernetes support: Argo: Airflow: Prefect:

Airflow: Airflow uses Python-based DAG definition language. Prefect: Same as Airflow

Scalability: Argo: Parallel; Airflow: Horizontal; Prefect: Parallel

Accessibility: Argo: Open-sourced; Airflow: Open-source; Prefect: Open-sourced and subscription-based

Flexibility: Argo: Rigid; Airflow: Rigid and Complicated; Prefect: Flexible

Comparison of the features

Let’s start this section by exploring the User Interface.

3 weeks, 6 days ago @ neptune.ai
Good Design in ML Applications With Konrad Piercey

He talked about good design in ML applications – why it’s a growing topic, what it actually means and how to start implementing it.

There are no machine learning designers, there are machine learning and data science engineers out there.

Examples of good design in ML – Delivery Hero | Copyright Konrad Piercey

Patrycja: Maybe coming back to your specific project.

I am wondering, in terms of good design ML, or probably the next few steps, do you see imposing or adding limits or summary statistics to applications in ML becoming a standard?

However, it’s only a surface-level implementation of what we’ve been talking about so far for good design machine learning.

1 month ago @ neptune.ai
Your First MLOps System: What Does Good Look Like? With Andy McMahon

What does a good MLOps system look like?

Difference between an ML system and an MLOps system

Release Management of an MLOps system

And much more!

Difference between an ML system and an MLOps system

Stephen: Is there any clear difference between me talking about an ML system and an MLOps system?

1 month, 1 week ago @ neptune.ai
5 Tools That Will Help You Setup Production ML Model Testing

Data Validation, before the training, mostly while splitting the data into training and testing, and ML model testing.

You can also use it to check data, model drifts, model integrity, and model monitoring.

Drifter-ML

Drifter ML is an ML model testing tool specifically written for the Scikit-learn library.

Monitoring model in production | Source

Robust intelligence enables model testing, model protection during deployment, and model monitoring after deployment.

ML model testing frameworks play an integral part in defining how the model will perform when deployed to a real-world scenario.

2 months ago @ neptune.ai
How to Solve the Data Ingestion and Feature Store Component of the MLOps Stack

What is a feature store?

You could use Redshift as an offline feature store and DynamoDB or Redis as an online feature store.

But if this is not the case and you’re running a public cloud-heavy workload, using AWS SageMaker Feature Store or GCP Vertex AI Feature Store can be good options to start with.

Amazon SageMaker Feature Store for machine learning | Source

Databricks also offers an embedded Feature Store service, which is also a good option and would be perfectly compatible with a tool like MLflow.

Explanation of a feature store | Source

Who…

2 months, 2 weeks ago @ neptune.ai
Feature Selection Methods and How to Choose Them

Then, we will take a glimpse behind the hood of Boruta, the state-of-the-art feature selection algorithm, to check out a clever way to combine different feature selection methods. And we’ll look into how feature selection is leveraged in the industry.

This is what feature selection is, but it is equally important to understand what feature selection is not – it is neither feature extraction/feature engineering, nor is it dimensionality reduction.

Feature selection methods | Source: author

Unsupervised feature selection methods

Just like unsupervised learning is the type of learning that looks for patterns in unlabeled data, similarly, unsupervised feature selection methods are such methods that …

2 months, 3 weeks ago @ neptune.ai
Exploratory Data Analysis for Tabular Data

Exploratory Data Analysis vs. Classical Data Analysis

Apart from EDA, there are also other data analysis approaches, Classical Data Analysis being one of the most popular ones.

Both Exploratory Data Analysis and Classical Data Analysis start with a problem, followed by collecting the related data that can be used to understand the problem.

This is where their similarities end; let us see the differences now:

Parameters compared: Exploratory Data Analysis, Classical Data Analysis

Model: Exploratory Data Analysis: does not impose deterministic or probabilistic models on the data.

Exploratory Data Ana…

2 months, 3 weeks ago @ neptune.ai
Best ML Model Registry Tools

This article will discuss the model registry tools and evaluation criteria for such tools.

Evaluation criteria for choosing model registry tools

The model registry is an important part of MLOps platforms/tools.

Competence in managing the model dependencies

The model registry tool must have compatibility with all the dependencies the ML model needs.

Model registry tools

Here are a number of model registry tools that are used across the industry:

Neptune provides a central processing unit to store, log, compare, display, query, and organize all metadata.

Comparison of model registry tools

Every model registry tool has different features and performs various unique operations.

2 months, 3 weeks ago @ neptune.ai
Building ML Pipeline: 6 Problems & Solutions [From a Data Scientist’s Experience]

Problem 2: No high-level separation of concerns

The separation of concerns in ML code bases is often missing at a high level.

What this means is that more often than not, so-called ML code is also doing feature transformations like operations that have nothing to do with ML – think physical document ingestion, conversion of administrative data, etc.

Problem 4: No configuration Data Model

A data model for handling ML configuration is often missing.

Problem 5: Handling legacy models

Since the process of training an ML model often involves manual effort (see problem 1), it can take a really long time to do so.

MLConfiguration contains the ML data model: enums and classes that do not contain any processin…

2 months, 4 weeks ago @ neptune.ai
Recommender Systems: Lessons From Building and Deployment

If you look at recommender systems papers, a large number of them come from the industry instead of academia.

Recommender systems: model training

Large NLP or Vision models have billions of parameters distributed among linear, convolution, recurrent, or attention layers.

According to the team: “TorchRec is a PyTorch domain library built to provide common sparsity & parallelism primitives needed for large-scale recommender systems (RecSys).

Recommender systems: model evaluation

Offline evaluation

A typical classification task optimizes for metrics like accuracy, precision, recall, or F1-score.

Recommender systems: A/B testing

Improving recommender systems is a continuous process.

3 months, 1 week ago @ neptune.ai
Pillars of MLOps and How to Implement Them

This is exactly the problem that is supposed to be solved by MLOps (Machine Learning Operations).

In this article, I will explain: what it is about, what are the pillars of MLOps, and how to implement them in your current or future projects.

The pillars of MLOps: core ingredients for a robust MLOps strategy

Now that we have a basic understanding of MLOps and its general role in machine learning projects, let’s dig deeper to understand what are the key concepts/techniques that will help you implement MLOps best practices in your existing or future projects.

MLOps pillar: reproducibility and versioning

One of the core features of…

3 months, 1 week ago @ neptune.ai
Deploying ML Models: How to Make Sure the New Model Is Better Than the One in Production? [Practical Guide]

But before deploying a new model, we need to make sure that it’s indeed a better model than the old one.

Lastly, we have to point out the importance of testing before deploying an ML model.

ML model deployment is a process of integrating the model into an existing production environment to make practical business decisions.

ML models almost always require deployment to provide business value, but unfortunately, most of the models never make it to production.

Models run asynchronously: first the old model in production, and then the new shadow model.

3 months, 2 weeks ago @ neptune.ai
Leveraging Unlabeled Image Data With Self-Supervised Learning or Pseudo Labeling With Mateusz Opala

Every episode is focused on one specific ML topic, and during this one, we talked to Mateusz Opala about leveraging unlabeled image data with self-supervised learning or pseudo-labeling.

Sabine: With us today, we have Mateusz Opala, who is going to be answering questions about leveraging unlabeled image data with self-supervised learning or pseudo-labeling.

Can you walk us through some of the different use cases where you apply pseudo-labeling for image data in Brainly?

Mateusz: In general, most of the techniques we use it’s still supervised learning, and we label data, but it’s limited and it’s time-consuming.

Mateusz: My biggest challenge right now is connecting all the steps in the whole …

3 months, 2 weeks ago @ neptune.ai
How to Solve the Model Serving Component of the MLOps Stack

Serving Machine Learning models the right way

ML model serving has a tight relationship with metadata stores, ML model registries, monitoring components, and feature stores.

If we have a high-performance server that is a nightmare to integrate with our observability, feature stores, and model registries, we have a terrible model serving component.

Our ML serving component periodically checks in with the ML model registry, and if there’s a new model with the compatible tag, it will update the deployment.
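
The periodic check described here can be sketched as a small selection step that a scheduler would call on an interval. This is an illustration only, not the article's or Neptune's API: it assumes the registry returns entries as dictionaries with an integer `version` and a list of `tags`, and `pick_latest_compatible` and `sync_once` are hypothetical names.

```python
def pick_latest_compatible(entries, required_tag, current_version):
    """Return the newest tagged version above current_version, else None."""
    candidates = [
        e["version"]
        for e in entries
        if required_tag in e["tags"] and e["version"] > current_version
    ]
    return max(candidates) if candidates else None


def sync_once(fetch_entries, deploy, required_tag, current_version):
    """Run one polling cycle and return the version now deployed.

    `fetch_entries` and `deploy` are caller-supplied callables standing
    in for the registry client and the rollout step (both hypothetical).
    """
    newest = pick_latest_compatible(fetch_entries(), required_tag, current_version)
    if newest is None:
        return current_version  # nothing compatible and newer; keep serving
    deploy(newest)
    return newest
```

Keeping the selection logic separate from fetching and deploying makes the update rule easy to test without a live registry.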

Model versions visible in the Neptune model registry

Of course, as mentioned earlier, frequently, the model serving component has to interact with feature stores.

Think of the…

3 months, 2 weeks ago @ neptune.ai
Active Learning: Strategies, Tools, and Real-World Use Cases

Diagram of active learning system | Source: AuthorWhy do we need active learning?

Active learning use case in NLP (NER) | Source

As we can see above, all of the active learning strategies outperform the random sampling (RAND) baseline by a good margin.

Sample of selected frames via active learning | Source

Aside from the cost advantages, a significant improvement in mean average precision (from an object detection perspective) was observed using active learning.

The improvement of protein production | Source

Some popular frameworks used for Active Learning

1. modAL: A modular active learning framework for Python3

modAL is an active learning framework for Python3, de…

3 months, 3 weeks ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
latest post 4 days, 1 hour ago
[ML News] GPT-4 Rumors | AI Mind Reading | Neuron Interaction Solved | AI Theorem Proving

#ai #mlnews #gpt4 Your weekly news from the AI & Machine Learning world. OUTLINE:

0:00 - Introduction

0:25 - AI reads brain signals to predict what you're thinking

3:00 - Closed-form solution for neuron interactions

4:15 - GPT-4 rumors

6:50 - Cerebras supercomputer

7:45 - Meta releases metagenomics atlas

9:15 - AI advances in theorem proving

10:40 - Better diffusion models with expert denoisers

12:00 - BLOOMZ & mT0

13:05 - ICLR reviewers going mad

21:40 - Scaling Transformer inference

22:10 - Infinite nature flythrough generation

23:55 - Blazing fast denoising

24:45 - Large-scale AI training with MultiRay

25:30 - arXiv to include Hugging Face spaces

26:10 - Multilingual Diffusion

26:30 - Mu…

4 days, 1 hour ago @ youtube.com
CICERO: An AI agent that negotiates, persuades, and cooperates with people

#ai #cicero #diplomacy A team from Meta AI has developed Cicero, an agent that can play the game Diplomacy, in which players have to communicate via chat messages to coordinate and plan into the future. Paper Title: Human-level play in the game of Diplomacy by combining language models with strategic reasoning Commented game by human expert: https://www.youtube.com/watch?v=u5192bvUS7k OUTLINE:

0:00 - Introduction

9:50 - AI in cooperation games

13:50 - Cicero agent overview

25:00 - A controllable dialogue model

36:50 - Dialogue-conditional strategic planning

49:00 - Message filtering

53:45 - Cicero's play against humans

55:15 - More examples & discussion Homepage: https://ai.facebook.com/res…

5 days, 18 hours ago @ youtube.com
Galactica: A Large Language Model for Science (Drama & Paper Review)

#ai #galactica #meta Galactica is a language model trained on a curated corpus of scientific documents, such as papers, knowledge bases, reviews, and other articles. The model can be used in a generative fashion to assist scientific writing, do reference prediction, and much more, including a new approach to do step-by-step reasoning using a clever encoding of intermediate steps. This video explains the paper, but also dives into the drama that ensued once Meta released a public demo of the model. OUTLINE:

0:00 - Introduction

1:30 - Drama around the public demo

16:00 - Start of paper review

20:30 - Dataset construction and encoding

23:30 - Encoding step-by-step reasoning using a scratchpad

3…

1 week, 5 days ago @ youtube.com
[ML News] Multiplayer Stable Diffusion | OpenAI needs more funding | Text-to-Video models incoming

#mlnews #ai #mlinpl Your news from the world of Machine Learning! OUTLINE:

0:00 - Introduction

1:25 - Stable Diffusion Multiplayer

2:15 - Huggingface: DOI for Models & Datasets

3:10 - OpenAI asks for more funding

4:25 - The Stack: Source Code Dataset

6:30 - Google Vizier Open-Sourced

7:10 - New Models

11:50 - Helpful Things

20:30 - Prompt Databases

22:15 - Lexicap by Karpathy References:

Stable Diffusion Multiplayer

https://huggingface.co/spaces/huggingface-projects/stable-diffusion-multiplayer?roomid=room-0 Huggingface: DOI for Models & Datasets

https://huggingface.co/blog/introducing-doi OpenAI asks for more funding

https://www.theinformation.com/articles/openai-valued-at-nearly-20-billio…

2 weeks, 4 days ago @ youtube.com
The New AI Model Licenses have a Legal Loophole (OpenRAIL-M of BLOOM, Stable Diffusion, etc.)

#ai #stablediffusion #license So-called responsible AI licenses are stupid, counterproductive, and have a dangerous legal loophole in them. OpenRAIL++ License here: https://www.ykilcher.com/license OUTLINE:

0:00 - Introduction

0:40 - Responsible AI Licenses (RAIL) of BLOOM and Stable Diffusion

3:35 - Open source software's dilemma of bad usage and restrictions

8:45 - Good applications, bad applications

12:45 - A dangerous legal loophole

15:50 - OpenRAIL++ License

16:50 - This has nothing to do with copyright

26:00 - Final thoughts References:

https://huggingface.co/CompVis/stable-diffusion/tree/main

https://huggingface.co/spaces/CompVis/stable-diffusion-license

https://huggingface.co/bigsci…

3 weeks, 1 day ago @ youtube.com
ROME: Locating and Editing Factual Associations in GPT (Paper Explained & Author Interview)

#ai #language #knowledge Large Language Models have the ability to store vast amounts of facts about the world. But little is known about how these models actually do this. This paper aims at discovering the mechanism and location of storage and recall of factual associations in GPT models, and then proposes a mechanism for the targeted editing of such facts, in the form of a simple rank-one update to a single MLP layer. This has wide implications both for how we understand such models' inner workings, and for our ability to gain greater control over such models in the future. OUTLINE:

0:00 - Introduction

1:40 - What are the main questions in this subfield?

6:55 - How causal tracing reveals where fa…

3 weeks, 6 days ago @ youtube.com
Is Stability turning into OpenAI?

#stablediffusion #aiart #openai Stability AI has stepped into some drama recently. They are accused of a hostile takeover of the community-led sub-reddits and Discord servers, of going after an alternative web UI, and of falsely dealing out IP takedown notices. OUTLINE:

0:00 - Intro

2:40 - Stability takes over community Discord & Reddit

14:50 - AUTOMATIC1111 web UI, stolen or not?

24:50 - Stable Diffusion 1.5 takedown request

31:20 - Scary: Stability CIO statement on safety & openness References:

https://finance.yahoo.com/news/stability-ai-startup-behind-stable-170151950.html?guccounter=1

https://analyticsindiamag.com/when-stability-ai-went-rogue-on-reddit-rampage%ef%bf%bc/

https://www.red…

4 weeks, 1 day ago @ youtube.com
Neural Networks are Decision Trees (w/ Alexander Mattick)

#neuralnetworks #machinelearning #ai Alexander Mattick joins me to discuss the paper "Neural Networks are Decision Trees", which has generated a lot of hype on social media. We ask the question: Has this paper solved one of the large mysteries of deep learning and opened the black-box neural networks up to interpretability? OUTLINE:

0:00 - Introduction

2:20 - Aren't Neural Networks non-linear?

5:20 - What does it all mean?

8:00 - How large do these trees get?

11:50 - Decision Trees vs Neural Networks

17:15 - Is this paper new?

22:20 - Experimental results

27:30 - Can Trees and Networks work together? Paper: https://arxiv.org/abs/2210.05189 Abstract:

In this manuscript, we show that any feed…

1 month, 1 week ago @ youtube.com
This is a game changer! (AlphaTensor by DeepMind explained)

#alphatensor #deepmind #ai Matrix multiplication is the most used mathematical operation in all of science and engineering. Speeding this up has massive consequences. Thus, over the years, this operation has become more and more optimized. A fascinating discovery was made when it was shown that one actually needs fewer than N^3 multiplication operations to multiply two NxN matrices. DeepMind goes a step further and creates AlphaTensor, a Deep Reinforcement Learning algorithm that plays a single-player game, TensorGame, in order to find even more optimized algorithms for matrix multiplication. And it turns out, there exists a plethora of undiscovered matrix multiplication algorithms, which not…
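
To make the "fewer than N^3 multiplications" claim concrete, here is a minimal sketch of Strassen's classic scheme for the 2x2 case, which uses 7 scalar multiplications instead of the naive 8. The excerpt above does not name this algorithm, so treat it as an illustrative assumption rather than the video's own example; the function names are hypothetical.

```python
def strassen_2x2(a, b):
    """Multiply two 2x2 matrices using 7 scalar multiplications (Strassen's scheme)."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    # The seven products; everything else is additions and subtractions.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


def naive_2x2(a, b):
    """Reference implementation: 8 scalar multiplications."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


if __name__ == "__main__":
    x = [[1, 2], [3, 4]]
    y = [[5, 6], [7, 8]]
    print(strassen_2x2(x, y) == naive_2x2(x, y))
```

Applied recursively to block matrices, saving one multiplication per 2x2 step is what pushes the exponent below 3.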

1 month, 3 weeks ago @ youtube.com
[ML News] OpenAI's Whisper | Meta Reads Brain Waves | AI Wins Art Fair, Annoys Humans

#mlnews #openai #ai Everything important going on in the ML world right here! Sponsor: Paperspace

https://www.paperspace.com/?src=yannic OUTLINE:

0:00 - Introduction

0:20 - Whisper: Open-Source Speech Transcription

6:30 - Sponsor: Paperspace

9:30 - Meta: How the brain hears audio

11:25 - PyTorch moves to Linux Foundation

12:15 - French Government uses AI to find unlicensed swimming pools

13:35 - AlphaFold extends database

14:10 - John Carmack raises 20M to build AGI

16:10 - Cerebras achieves model size record

17:40 - Andrej Karpathy on YouTube

18:35 - ColabPro changes pricing

19:15 - Huggingface runs evaluation on the hub

20:35 - AI wins art fair

22:50 - PaLI: Multilingual L…

1 month, 4 weeks ago @ youtube.com
[ML News] Stable Diffusion Takes Over! (Open Source AI Art)

#stablediffusion #aiart #mlnews Stable Diffusion has been released and is riding a wave of creativity and collaboration. But not everyone is happy about this... Sponsor: NVIDIA

GPU Raffle: https://ykilcher.com/gtc OUTLINE:

0:00 - Introduction

0:30 - What is Stable Diffusion?

2:25 - Open-Source Contributions and Creations

7:55 - Textual Inversion

9:30 - OpenAI vs Open AI

14:20 - Journalists be outraged

16:20 - AI Ethics be even more outraged

19:45 - Do we need a new social contract?

21:30 - More applications

22:55 - Helpful Things

23:45 - Sponsor: NVIDIA (& how to enter the GPU raffle) References: https://early-hair-c20.notion.site/Stable-Diffusion-Takes-Over-Referenes-7a2f45b8f7e04ae0ba19db…

2 months, 1 week ago @ youtube.com
How to make your CPU as fast as a GPU - Advances in Sparsity w/ Nir Shavit

#ai #sparsity #gpu Sparsity is awesome, but only recently has it become possible to properly handle sparse models at good performance. Neural Magic does exactly this, using a plain CPU. No specialized hardware needed, just clever algorithms for pruning and forward-propagation of neural networks. Nir Shavit and I talk about how this is possible, what it means in terms of applications, and why sparsity should play a much larger role in the Deep Learning community. Sponsor: AssemblyAI

Link: https://www.assemblyai.com/?utm_source=youtube&utm_medium=social&utm_campaign=yannic_autochapters Check out Neural Magic: https://neuralmagic.com/

and DeepSparse: https://github.com/neuralmagic/deepsparse O…

2 months, 2 weeks ago @ youtube.com
More Is Different for AI - Scaling Up, Emergence, and Paperclip Maximizers (w/ Jacob Steinhardt)

#ai #interview #research Jacob Steinhardt believes that future AI systems will be fundamentally different than the ones we know currently. We talk about how emergence happens when scaling up, what implications that has on AI Safety, and why thought experiments like the Paperclip Maximizer might be more useful than most people think. OUTLINE:

0:00 Introduction

1:10 Start of Interview

2:10 Blog posts series

3:56 More Is Different for AI (Blog Post)

7:40 Do you think this emergence is mainly a property from the interaction of things?

9:17 How does phase transition or scaling-up play into AI and Machine Learning?

12:10 GPT-3 as an example of qualitative difference in scaling up

14:08 GPT-3 as a…

2 months, 2 weeks ago @ youtube.com
The hidden dangers of loading open-source AI models (ARBITRARY CODE EXPLOIT!)

#huggingface #pickle #exploit Did you know that something as simple as loading a model can execute arbitrary code on your machine? Try the model: https://huggingface.co/ykilcher/totally-harmless-model

Get the code: https://github.com/yk/patch-torch-save Sponsor: Weights & Biases

Go here: https://wandb.me/yannic OUTLINE:

0:00 - Introduction

1:10 - Sponsor: Weights & Biases

3:20 - How Hugging Face models are loaded

5:30 - From PyTorch to pickle

7:10 - Understanding how pickle saves data

13:00 - Executing arbitrary code

15:05 - The final code

17:25 - How can you protect yourself? Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannicki…
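
The point about pickle can be illustrated with a deliberately harmless sketch. It assumes only the standard library; the `Payload` class is hypothetical and is not the code from the linked repository. A pickled object may tell the loader to call an arbitrary callable, and here that callable is just `print`.

```python
import pickle


class Payload:
    """Hypothetical object whose pickled form asks the loader to call a function."""

    def __reduce__(self):
        # pickle records "call this callable with these arguments" and
        # performs the call while loading. Here the callable is the harmless
        # built-in print; the loader would run any other callable just the same.
        return (print, ("this line was printed by pickle.loads",))


blob = pickle.dumps(Payload())

# Loading the bytes triggers the call. The result is whatever the callable
# returned (None for print), not a Payload instance.
result = pickle.loads(blob)
```

This is why the Python documentation warns against unpickling data from untrusted sources.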

2 months, 4 weeks ago @ youtube.com
The Future of AI is Self-Organizing and Self-Assembling (w/ Prof. Sebastian Risi)

#ai #selforganization #emergence Read Sebastian's article here: https://sebastianrisi.com/self_assembling_ai/ OUTLINE:

0:00 - Introduction

2:25 - Start of Interview

4:00 - The intelligence of swarms

9:15 - The game of life & neural cellular automata

14:10 - What's missing from neural CAs?

17:20 - How does local computation compare to centralized computation?

25:40 - Applications beyond games and graphics

33:00 - Can we do away with goals?

35:30 - Where do these methods shine?

43:30 - The paradox of scales & brains

49:45 - Connections to graphical systems & GNNs

51:30 - Could this solve ARC?

57:45 - Where can people get started? References:

https://sebastianrisi.com/

https://modl.ai/

https:/…

3 months ago @ youtube.com
Henry AI Labs
latest post 3 months, 4 weeks ago
Weaviate User Experience - Weaviate Podcast Recap

Please check out the full podcast here: https://www.youtube.com/watch?v=gjJBYcYMB-o This video is a commentary on the latest Weaviate Podcast with Laura Ham on the Weaviate User Experience. User Experience describes a suite of things from the performance of the tech, API interfaces, documentation, and communication strategy -- as outlined by Bob van Luijt here: https://twitter.com/bobvanluijt/status/1552379772747096064. Laura has led the development of the GraphQL API that makes Weaviate so friendly and exciting to use! I really hope you enjoy learning more about these topics. Here are some additional links referenced in the video: Wikipedia Weaviate Example: https://weaviate.io/developers…

3 months, 4 weeks ago @ youtube.com
Thoughts on Weaviate v1.14 Release!

Hey everyone! Here are some of my thoughts and lessons learned on the new Weaviate v1.14 release! Please check out the full length podcast linked here: https://www.youtube.com/watch?v=eiQaZIhUS_o. Some references from the video:

Weaviate v1.14 Blog Post: https://weaviate.io/blog/2022/07/Weaviate-release-1-14.html#stronger-together

CO-Search: https://arxiv.org/pdf/2006.09595.pdf

Prometheus: https://prometheus.io/docs/introduction/overview/

Literature-Augmented Clinical Outcome Prediction: https://aclanthology.org/2022.findings-naacl.33.pdf

Sigmoid-MSE vs. Softmax Cross-Entropy: https://wandb.ai/ayush-thakur/dl-question-bank/reports/Sigmoid-MSE-vs-Softmax-Cross-Entropy--VmlldzoyMDA3ODQ

4 months, 3 weeks ago @ youtube.com
Approximate Nearest Neighbor Benchmarks - Weaviate Podcast Recap

Please check out the full podcast here: https://www.youtube.com/watch?v=kG3ji89AFyQ This video is a commentary on the latest Weaviate Podcast with Etienne Dilocker on ANN Benchmarks. ANN search -- short for Approximate Nearest Neighbors -- describes algorithms that enable efficient distance comparison between an encoded query vector and a vector database. For example, we may have 1 billion vectors to search through -- we don't want to do a dot product distance between our query and 1 billion candidate vectors! This podcast describes Weaviate's efforts to benchmark HNSW within the Weaviate system and give users a sense of how performance varies with respect to each dataset (and their respect…
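
For context, the exhaustive search that ANN methods try to avoid can be sketched in a few lines. This is a toy illustration with made-up vectors and a hypothetical function name, not Weaviate's implementation. It scores every stored vector against the query, so the cost grows linearly with the size of the database.

```python
def brute_force_top_k(query, vectors, k=1):
    """Return the indices of the k vectors with the largest dot product to query.

    Every vector is scored, so one query costs O(n * d) for n vectors of
    dimension d. With a billion vectors this is the work ANN indexes avoid.
    """
    scored = []
    for idx, vec in enumerate(vectors):
        score = sum(q * v for q, v in zip(query, vec))
        scored.append((score, idx))
    # Highest score first; ties broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


if __name__ == "__main__":
    database = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
    print(brute_force_top_k([1.0, 0.2], database, k=2))
```

An approximate index such as HNSW trades a small loss in recall for visiting only a fraction of the vectors per query.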

6 months ago @ youtube.com
Search through Y Combinator startups with Weaviate!

Please check out Eric Jang's article "Ranking YC Companies with a Neural Net": https://evjang.com/2022/04/02/yc-rank.html Please subscribe to SeMI Technologies on YouTube! https://www.youtube.com/c/SeMI-and-Weaviate Timecodes

0:00 Introduction

0:58 Weaviate Demo

3:40 Article Overview

10:45 NLP for Venture Capital and Data-Centric AI

8 months ago @ youtube.com
MosaicML Composer for faster and cheaper Deep Learning!

Please leave a star! https://github.com/mosaicml/composer Thank you so much for watching! This video presents some details of MosaicML's Composer launch and how to use it in Python. I am really excited about this company and their mission to deliver faster and cheaper Deep Learning training! I hope you find this video useful, happy to answer any questions you might have about this or these ideas in Efficient Deep Learning generally! The full Weaviate podcast with Jonathan Frankle will be uploaded very soon on SeMI Technologies YouTube, please subscribe!

https://www.youtube.com/c/SeMI-and-Weaviate Chapters

0:00 Introduction

1:45 Documentation Intro

4:20 Composer Notebooks

5:35 Functional API…

8 months, 1 week ago @ youtube.com
Jina AI DocArray - Documentation Overview

I hope you found this useful, please let me know if you have any questions or ideas! Docarray Documentation: https://docarray.jina.ai/ Full-Length Podcast: https://www.youtube.com/watch?v=HIGAQAE_xaI Code Tutorial (Weaviate + Jina AI for Image Search): https://www.youtube.com/watch?v=rBKvoIGihnY Please check out Jina AI on YouTube: https://www.youtube.com/c/JinaAI Please check out SeMI Technologies on YouTube: https://www.youtube.com/c/SeMI-and-Weaviate/videos

8 months, 2 weeks ago @ youtube.com
What lead Jina AI CEO Han Xiao to Neural Search?

This video explains one of the biggest lessons for me in interviewing Han Xiao from Jina AI. I hope this was a good explanation of the preprocessing / granularity of embeddings and how that can enable different kinds of search applications. Full-Length Podcast: https://www.youtube.com/watch?v=HIGAQAE_xaI Code Tutorial (Weaviate + Jina AI for Image Search): https://www.youtube.com/watch?v=rBKvoIGihnY Please check out Jina AI on YouTube: https://www.youtube.com/c/JinaAI Please check out SeMI Technologies on YouTube: https://www.youtube.com/c/SeMI-and-Weaviate/videos Chapters

0:00 Introduction

8 months, 2 weeks ago @ youtube.com
Full Stack Neural Search

This video explains one of the biggest lessons for me in interviewing Han Xiao from Jina AI. I hope this was a good explanation of the preprocessing / granularity of embeddings and how that can enable different kinds of search applications. Full-Length Podcast: https://www.youtube.com/watch?v=HIGAQAE_xaI Code Tutorial (Weaviate + Jina AI for Image Search): https://www.youtube.com/watch?v=rBKvoIGihnY Please check out Jina AI on YouTube: https://www.youtube.com/c/JinaAI Please check out SeMI Technologies on YouTube: https://www.youtube.com/c/SeMI-and-Weaviate/videos Chapters

0:00 Please check out SeMI YouTube!

0:15 My takeaways on Full Stack Neural Search

11:04 Podcast Clip - Han Xiao

8 months, 2 weeks ago @ youtube.com
Python Tutorial: How to use Weaviate and Jina AI for Image Search!

I hope this video helps you get started with Image Search using Weaviate and Jina AI - happy to answer any questions / help solve problems! Check out the full tutorial explanation from Laura Ham: https://www.youtube.com/watch?v=rBKvoIGihnY New podcast with Jina AI CEO Han Xiao! https://www.youtube.com/watch?v=HIGAQAE_xaI Full notebook code: https://github.com/laura-ham/HM-Fashion-image-neural-search/blob/main/hm-fashion-image-neural-search.ipynb Get started with the Weaviate Cloud Service: console.semi.technology

8 months, 2 weeks ago @ youtube.com
Causal Inference in Deep Learning (Podcast Overview with Brady Neal)

Hey everyone! Hopefully this video helps supplement the new Weaviate podcast with Brady Neal, I hope you find this interesting / useful! Check out Brady Neal on YouTube! https://www.youtube.com/c/BradyNealCausalInference/featured Weaviate Podcast: https://www.youtube.com/watch?v=t7g9s1GWcB8 0:00 New Weaviate Podcast!

0:42 Brady Neal Causal Inference

1:34 Oogway.ai

2:45 Whiteboard Ideas

5:35 Discussion Topics

9 months ago @ youtube.com
OpenAI Embeddings API - (Interview Recap and Background)

Hey everyone! I recently interviewed Arvind Neelakantan from OpenAI about the new OpenAI Embeddings API on the Weaviate Podcast! This video provides some additional detail for the different topics that were discussed. If you find this video to be informative, please check out SeMI technologies on youtube where we are working hard on developing content explaining concepts in Deep Learning for Search. Full Podcast: https://www.youtube.com/watch?v=uFxfZ0vLsoU SeMI Technologies on YouTube: https://www.youtube.com/channel/UCJKT6kJ3IFYybWnL7jbXxhQ

9 months, 3 weeks ago @ youtube.com
AI Weekly Update - February 7th, 2022

Thanks for watching! Please subscribe for more Deep Learning and AI videos, the list of papers is below under "Content Links" Content Links:

Fully Online Meta-Learning without Task Boundaries: https://arxiv.org/abs/2202.00263

Datamodels: Predicting Predictions from Training Data: https://arxiv.org/abs/2202.00622

Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization: https://arxiv.org/abs/2202.01334

Competition-Level Code Generation with AlphaCode: https://storage.googleapis.com/deepmind-media/AlphaCode/competition_level_code_generation_with_alphacode.pdf

GPT-NeoX-20B: https://blog.eleuther.ai/announcing-20b/

PromptSource: https://arxiv.org/abs/2202.01279

Chain of Thou…

9 months, 3 weeks ago @ youtube.com
3blue1brown
latest post 1 week, 6 days ago
But what is a convolution?

Discrete convolutions, from probability, to image processing and FFTs.

Help fund future projects: https://www.patreon.com/3blue1brown​

An equally valuable form of support is to simply share the videos. ------------------ Other videos I referenced Live lecture on image convolutions for the MIT Julia lab

https://youtu.be/8rrHTtUzyZA Lecture on Discrete Fourier Transforms

https://youtu.be/g8RkArhtCc4 Reducible video on FFTs

https://youtu.be/h7apO7q16V0 Veritasium video on FFTs

https://youtu.be/nmgFG7PUHfo These animations are largely made using a custom python library, manim. See the FAQ comments here:

https://www.3blue1brown.com/faq#manim

https://github.com/3b1b/manim

https://github.com/Manim…
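
As a small companion to the "from probability" part of the description, here is a hedged sketch of a discrete convolution in plain Python (the function name is hypothetical and this is not code from the video). Convolving the distribution of one fair die with itself gives the distribution of the sum of two dice.

```python
def convolve(a, b):
    """Discrete convolution: out[n] is the sum of a[i] * b[j] over all i + j == n."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


# One fair die: index 0 stands for face 1, index 5 for face 6.
die = [1 / 6] * 6

# Sum of two dice: index s stands for the total s + 2 (totals 2 through 12).
two_dice = convolve(die, die)

if __name__ == "__main__":
    for total, p in enumerate(two_dice, start=2):
        print(total, round(p, 4))
```

This direct double loop takes O(n * m) steps; the FFT mentioned in the description is the usual route to computing the same result faster for long inputs.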

1 week, 6 days ago @ youtube.com
Researchers thought this was a bug (Borwein integrals)

A curious pattern of integrals that all equal pi...until they don't.

Help fund future projects: https://www.patreon.com/3blue1brown​

Special thanks to these patrons: https://3b1b.co/lessons/borwein#thanks

An equally valuable form of support is to simply share the videos. ------------------ Original paper from David and Jonathan Borwein

https://carma.edu.au/resources/db90/pdfs/db90-119.00.pdf Timestamps

0:00 - The pattern

4:45 - Moving average analogy

10:41 - High-level overview of the connection

16:14 - What's coming up next These animations are largely made using a custom python library, manim. See the FAQ comments here:

https://www.3blue1brown.com/faq#manim

https://github.com/3b1b/manim

h…

3 weeks, 6 days ago @ youtube.com
Have you seen more math videos in your feed recently? (SoME2 results)

Winners and honorable mentions for the SoME2 contest

Help fund future projects: https://www.patreon.com/3blue1brown​

An equally valuable form of support is to simply share the videos. Winners

https://explanaria.github.io/crystalgroups

https://youtu.be/5M2RWtD4EzI

https://youtu.be/gsZiJeaMO48

https://youtu.be/a-767WnbaCQ

https://youtu.be/6hVPNONm7xw Honorable mentions:

https://youtu.be/v_HeaeUUOnc

https://youtu.be/piF6D6CQxUw

https://youtu.be/QC3CjBZLHXs

https://thenumb.at/Autodiff/

https://youtu.be/KufsL2VgELo

http://xperimex.com/blog/panorama-homography/

https://youtu.be/zR_hpai3XkY

https://youtu.be/HeBP3MG-WHg

https://youtu.be/2dwQUUDt5Is

https://youtu.be/3gyHKCDq1YA

https://youtu.be/nK2j…

2 months ago @ youtube.com
How to lie using visual proofs

Three false proofs, and what lessons they teach.

New notebooks: https://store.dftba.com/collections/3blue1brown/products/mathematical-quotebook-notebook

Help fund future projects: https://www.patreon.com/3blue1brown​

An equally valuable form of support is to simply share the videos. Time stamps:

0:00 - Fake sphere proof

1:39 - Fake pi = 4 proof

5:16 - Fake proof that all triangles are isosceles

9:54 - Sphere "proof" explanation

15:09 - pi = 4 "proof" explanation

16:57 - Triangle "proof" explanation and conclusion ------------------ These animations are largely made using a custom python library, manim. See the FAQ comments here:

https://www.3blue1brown.com/faq#manim

https://github.com/3b1b/…

5 months ago @ youtube.com
Summer of Math Exposition #2

Mailing-list: https://summerofmathexposition.substack.com/p/the-summer-of-math-exposition-is?s=r

Find collaborators here: https://github.com/leios/SoME_Topics/

Join the discord: https://discord.gg/dsp3zgB4qQ

Submission form: https://forms.gle/sNqosxqwCW2EjPVu5

Last year’s results: https://3b1b.co/blog/some1-results ------------------ Music by Vincent Rubinetti.

https://www.vincentrubinetti.com/ ------------------ 3blue1brown is a channel about animating math, in all senses of the word animate. And you know the drill with YouTube, if you want to stay posted on new videos, subscribe: http://3b1b.co/subscribe Various social media stuffs:

Website: https://www.3blue1brown.com

Twitter: https://tw…

5 months, 3 weeks ago @ youtube.com
Olympiad level counting

Generating functions, as applied to a hard puzzle used for IMO training.

Help fund future projects: https://www.patreon.com/3blue1brown​

An equally valuable form of support is to simply share the videos. Books mentioned 102 Combinatorial problems, by Titu Andreescu and Zuming Feng

https://amzn.to/3wAPoNq Generatingfunctionology by Herbert Wilf

https://amzn.to/3sPJ8Al Visualizing the Riemann zeta function

https://youtu.be/sD0NjbwqlYw Fourier series

https://youtu.be/r6sGWTCMz2k Timestamps

0:00 - Puzzle statement and motivation

4:31 - Simpler example

6:51 - The generating function

11:52 - Evaluation tricks

17:24 - Roots of unity

26:31 - Recap and final trick

30:13 - Takeaways -----------------…

6 months, 1 week ago @ youtube.com
Oh, wait, actually the best Wordle opener is not “crane”…

A slight correction to the previous video, with some more details about how the best first word was chosen.

Special thanks to these supporters: https://3b1b.co/lessons/wordle#thanks

Help fund future projects: https://www.patreon.com/3blue1brown​

An equally valuable form of support is to simply share the videos. Contents:

0:00 - The Bug

3:31 - How the best first guess is chosen

8:54 - Does this ruin the game? Nice post by Jonathan Olson on optimal wordle algorithms:

https://jonathanolson.net/experiments/optimal-wordle-solutions ------------------ These animations are largely made using a custom python library, manim. See the FAQ comments here:

https://www.3blue1brown.com/faq#manim

https://gi…

9 months, 2 weeks ago @ youtube.com
The mathematically optimal Wordle strategy

An excuse to teach a lesson on information theory and entropy.

Help fund future projects: https://www.patreon.com/3blue1brown​

Special thanks to these supporters: https://3b1b.co/thanks

An equally valuable form of support is to simply share the videos. Contents:

0:00 - What is Wordle?

2:43 - Initial ideas

8:04 - Information theory basics

18:15 - Incorporating word frequencies

27:49 - Final performance Original wordle site:

https://www.powerlanguage.co.uk/wordle/ Music by Vincent Rubinetti.

https://www.vincentrubinetti.com/ Shannon and von Neumann artwork by Kurt Bruns. Code for this video:

https://github.com/3b1b/videos/blob/master/_2022/wordle.py These animations are largely made using a c…
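
The entropy idea mentioned in the description can be sketched as follows. This is a minimal illustration that assumes every candidate answer is equally likely and uses a tiny made-up word list; it is not the code from the linked repository, and the function names are hypothetical. A guess is scored by the expected number of bits its colour pattern reveals.

```python
import math
from collections import Counter


def feedback(guess, answer):
    """Wordle-style pattern: 'g' right place, 'y' wrong place, '-' absent."""
    pattern = ["-"] * len(guess)
    unused = Counter()
    # First pass: mark greens and count the answer letters left over.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "g"
        else:
            unused[a] += 1
    # Second pass: a non-green letter is yellow only while copies remain.
    for i, g in enumerate(guess):
        if pattern[i] == "-" and unused[g] > 0:
            pattern[i] = "y"
            unused[g] -= 1
    return "".join(pattern)


def expected_information(guess, candidates):
    """Entropy in bits of the feedback pattern, with uniform candidates."""
    counts = Counter(feedback(guess, answer) for answer in candidates)
    total = len(candidates)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    words = ["crane", "crate", "slate", "trace"]  # hypothetical candidate list
    for w in words:
        print(w, expected_information(w, words))
```

A guess that splits the candidates into many equally sized groups scores highest, because each possible pattern then rules out the most alternatives.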

9 months, 4 weeks ago @ youtube.com
Two Minute Papers
latest post 23 hours ago
OpenAI’s New AI: Now Do My Homework! 🤖

❤️ Check out Fully Connected by Weights & Biases: https://wandb.me/papers 📝 The paper "Efficient Training of Language Models to Fill in the Middle" is available here:

https://arxiv.org/abs/2207.14255 Code for benchmarking the model is available here (note: this is not the full source code):

https://github.com/openai/human-eval-infilling I think you can try it with GPT-3 with the text-davinci-003 model: https://beta.openai.com/docs/models/gpt-3 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Ch…

23 hours ago @ youtube.com
Ubisoft’s New AI: Breathing Life Into Games!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers 📝 The paper "ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech" is available here:

https://github.com/ubisoft/ubisoft-laforge-ZeroEGGS 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Luke Dominique Warner, Matthew Allen Fisher, Matthew Valle, Michael Albrecht, Mi…

5 days, 1 hour ago @ youtube.com
Crushing 1,000,000 Particles With a Hydraulic Press!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Energetically Consistent Inelasticity for Optimization Time Integration" is available here (scroll down a little):

https://www.math.ucla.edu/~minchen/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Luke Dominique Warner, Matthew Allen Fisher, Matthew Valle, Michael…

1 week, 1 day ago @ youtube.com
NVIDIA’s AI Makes The Video Games of The Future!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "CLIP-based Neural Neighbor Style Transfer for 3D Assets" is available here:

https://granskog.xyz/clip-nnfm-for-3d Its source code is also available free of charge here:

https://github.com/shhra/clip3dstyle 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Luke Dominiq…

1 week, 5 days ago @ youtube.com
OpenAI’s Whisper Learned 680,000 Hours Of Speech!

❤️ Check out Anyscale and try it for free here: https://www.anyscale.com/papers 📝 The paper "Robust Speech Recognition via Large-Scale Weak Supervision" is available here:

https://openai.com/blog/whisper/ Try it out (note: the Scholarly Stampede appears to be in order - we barely published the video and there are already longer wait times): https://huggingface.co/spaces/openai/whisper Source code: https://github.com/openai/whisper Lex transcriptions by Andrej Karpathy: https://karpathy.ai/lexicap/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Ch…

2 weeks, 2 days ago @ youtube.com
OpenAI's DALL-E 2 Has Insane Capabilities! 🤖

❤️ Check out Runway and try it for free here: https://runwayml.com/papers/

Use the code TWOMINUTE at checkout to get 10% off! 📝 The paper "Hierarchical Text-Conditional Image Generation with CLIP Latents" is available here:

https://openai.com/dall-e-2/ ☀️My free Master-level light transport course is available here:

https://users.cg.tuwien.ac.at/zsolnai/gfx/rendering-course/ Our Separable Subsurface Scattering paper with Activision Blizzard:

https://users.cg.tuwien.ac.at/zsolnai/gfx/separable-subsurface-scattering-with-activision-blizzard/ Our earlier paper with the caustics:

https://users.cg.tuwien.ac.at/zsolnai/gfx/adaptive_metropolis/ Rendered images:

LuxCore Render / Sharlybg https://lu…

3 weeks ago @ youtube.com
Google’s New AI Learns Table Tennis! 🏓

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "i-Sim2Real: Reinforcement Learning of Robotic Policies in Tight Human-Robot Interaction Loops" is available here:

https://sites.google.com/view/is2r

https://twitter.com/lgraesser3/status/1547942995139301376 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Luke Domini…

3 weeks, 2 days ago @ youtube.com
Stable Diffusion Is Getting Outrageously Good! 🤯

❤️ Check out Anyscale and try it for free here: https://www.anyscale.com/papers 📝 The paper "High-Resolution Image Synthesis with Latent Diffusion Models" is available here:

https://ommer-lab.com/research/latent-diffusion-models/

https://github.com/mallorbc/stable-diffusion-klms-gui You can also try Stable diffusion for free here: https://huggingface.co/spaces/stabilityai/stable-diffusion Credits:

1. Prompt-image repository https://lexica.art + Variants from photos https://twitter.com/sharifshameem/status/157177206133663334

2. Infinite zoom https://twitter.com/matthen2/status/1564608723636654093 + how to do it https://twitter.com/matthen2/status/1564608773485895692

3. Lego to reality https:…

3 weeks, 4 days ago @ youtube.com
NVIDIA’s Amazing AI Clones Your Voice! 🤐

❤️ Check out Cohere and sign up for free today: https://cohere.ai/papers 📝 The paper "One TTS Alignment To Rule Them All" is available here:

https://arxiv.org/abs/2108.10447 Early access: https://developer.nvidia.com/riva/studio-early-access 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Luke Dominique Warner, Matthew Allen Fisher, Matthew Valle, Michael Albrecht,…

4 weeks ago @ youtube.com
Google’s Video AI: Outrageously Good! 🤖

❤️ Check out Runway and try it for free here: https://runwayml.com/papers/

Use the code TWOMINUTE at checkout to get 10% off! 📝 The paper "High Definition Video Generation with Diffusion Models" is available here:

https://imagen.research.google/video/ 📝 My paper "The flow from simulation to reality" with clickable citations is available here for free:

https://rdcu.be/cWPfD 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, …

1 month ago @ youtube.com
Google’s New AI: Kind of Like Tesla's Robot! 🤖

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Inner Monologue: Embodied Reasoning through Planning with Language Models" is available here:

https://innermonologue.github.io/ ❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: - https://www.patreon.com/TwoMinutePapers

- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Luk…

1 month ago @ youtube.com
Intel’s New AI: Amazing Ray Tracing Results! ☀️

❤️ Check out Weights & Biases and say hi in their community forum here: https://wandb.me/paperforum 📝 The paper "Temporally Stable Real-Time Joint Neural Denoising and Supersampling" is available here:

https://www.intel.com/content/www/us/en/developer/articles/technical/temporally-stable-denoising-and-supersampling.html 📝 Our earlier paper with the spheres scene that took 3 weeks:

https://users.cg.tuwien.ac.at/zsolnai/gfx/adaptive_metropolis/ ❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: - https://www.patreon.com/TwoMinutePapers

- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join 🙏 We would like to thank our generous Patreon supporters who…

1 month, 1 week ago @ youtube.com
Google’s New AI: DALL-E, But Now In 3D! 🤯

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers 📝 The paper "DreamFusion: Text-to-3D using 2D Diffusion" is available here:

https://dreamfusion3d.github.io/ Unofficial open source implementation: https://github.com/ashawkey/stable-dreamfusion Interpolation: https://twitter.com/xsteenbrugge/status/1558508866463219712

Full video of interpolation: https://www.youtube.com/watch?v=Bo3VZCjDhGI ❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: - https://www.patreon.com/TwoMinutePapers

- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join 🙏 We would like to thank our generous Patreon supporters who make Two Minut…

1 month, 1 week ago @ youtube.com
Ray Tracing: How NVIDIA Solved the Impossible!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The showcased papers are available here:

https://research.nvidia.com/publication/2021-07_rearchitecting-spatiotemporal-resampling-production

https://research.nvidia.com/publication/2022-07_generalized-resampled-importance-sampling-foundations-restir

https://graphics.cs.utah.edu/research/projects/gris/

https://users.cg.tuwien.ac.at/zsolnai/gfx/adaptive_metropolis/ If you wish to learn more about light transport, I have a course that is free for everyone, no strings attached: https://users.cg.tuwien.ac.at/zsolnai/gfx/rendering-course/ ❤️ Watch these videos in early access on our Patreon page or join us h…

1 month, 2 weeks ago @ youtube.com
NVIDIA's New AI: Teaching Virtual Humans to Walk! 🏃‍♂️

❤️ Check out Cohere and sign up for free today: https://cohere.ai/papers 📝 The paper "Accelerated Policy Learning with Parallel Differentiable Simulation" is available here:

https://short-horizon-actor-critic.github.io/ ❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: - https://www.patreon.com/TwoMinutePapers

- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John …

1 month, 2 weeks ago @ youtube.com
DataFest Video
last post None
JetBrains Research Seminars
last post 6 months ago
Learning to Recommend Method Names with Global Context

In many tasks, researchers work with small fragments of code: individual methods or, less often, files. But finding a high-quality solution often requires going beyond small pieces of code and using global information about the module or project. We will discuss various ways of using context information in ML models and what to watch for in order to evaluate their quality fairly. Speaker: Egor Bogomolov. Materials: https://arxiv.org/pdf/2201.10705.pdf

6 months ago @ youtube.com
Generating SQL Queries from Natural-Language Text

We will go over methods for generating SQL queries from natural-language descriptions and briefly discuss their broader application to generating DSL code. We will discuss why training models for DSLs can differ from training code generation models, the current approaches to the task based on the Spider dataset leaderboard, and their limitations. We will present a more scalable approach to SQL generation and our current results. Speaker: Denis Litvinov

6 months, 1 week ago @ youtube.com
Automating Reinforcement Learning Architecture Design for Code Optimization

Reinforcement learning (RL) is currently applied to a range of optimization problems in compilers, such as configuring compilation flags, choosing the optimal instruction execution order, and many others. However, choosing the optimal RL algorithm can be difficult, because it depends on the context of the specific task. Moreover, compiler developers are often not involved in the RL field, which makes the problem even harder. In the paper Automating Reinforcement Learning Architecture Design for Code Optimization, the authors propose Supersonic, a tool that automatically selects the optimal RL algorithm for solving optimization problems in compiler…

6 months, 2 weeks ago @ youtube.com
Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO

Although many recent advances in machine learning are tied to deep reinforcement learning, deep RL algorithms remain unreliable (compared with classical deep learning models) and hard to reproduce (in terms of results). The authors of the paper attribute these shortcomings to a lack of understanding of how the internal mechanisms used in RL algorithms affect the agent's behavior, both individually and in combination. At the seminar we will discuss the problem raised by the authors using Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) as examples, and look at experiments assessing the impact of the components of these algor…

7 months, 1 week ago @ youtube.com
Predicting What You Already Know Helps: Provable Self-Supervised Learning

In applied problems it is often impossible to collect a sufficiently large, suitably labeled dataset for training a model. A popular solution in this situation is self-supervised learning. In this approach, the model is first pre-trained on a synthetic, artificially constructed task, for which the training set is generated automatically from unlabeled data. Examples of such synthetic tasks include recovering masked tokens in NLP (the same approach is used in some models for working with code), restoring fragments or removing artificial noise when working with images, restoring the order of frames when working with video, etc. …

7 months, 2 weeks ago @ youtube.com
Emerging Properties in Self-Supervised Vision Transforms

Many of the most exciting recent breakthroughs in artificial intelligence have come from two recent innovations: self-supervised learning, which lets machines learn from random unlabeled examples, and Transformers, which let AI models selectively focus on particular parts of their input and thus reason more efficiently. The seminar will cover the new paper "Emerging Properties in Self-Supervised Vision Transforms", in which the authors use the aforementioned techniques to solve computer vision problems. Speaker: Olga Lavrichenko.

7 months, 2 weeks ago @ youtube.com
Multimodal Conditional Image Synthesis with Product-of-Experts GANs

Existing image generation frameworks can be conditioned on user input in a single modality, for example text, a sketch, a segmentation mask, or a style reference image. As a result, such approaches do not make use of available multimodal data. The authors of this paper propose the Product-of-Experts Generative Adversarial Networks (PoE-GAN) framework, which can synthesize an image conditioned on several modalities or any subset of them, and can also perform unconditional generation. The model also outperforms other approaches in the unimodal conditional generation setting. Speaker: Daria Evsikova.

7 months, 2 weeks ago @ youtube.com
Block-Recurrent Transformers

Transformers have long dominated many NLP tasks. While tasks with relatively short sequences (no more than 512 tokens) pose no problems, processing long texts is less straightforward. The problem is that memory consumption grows quadratically with the length of the processed sequence. There are various approaches to this problem: for example, one can linearize the softmax in the attention module, reducing the asymptotic complexity to O(N) (linear transformers), or exploit sparsity (BigBird). The authors of the paper, in turn, build on the ideas of sliding-window attention and Transformer-XL. At the seminar we will therefore discuss these approaches and the Block-Recurrent Transformer architecture. Speaker…

7 months, 3 weeks ago @ youtube.com
Assessing Project-Level Fine-Tuning of ML4SE Models

We will present a study on fine-tuning ML4SE models for a specific project. While most researchers train and test models on disjoint sets of projects, we asked ourselves: "What happens if we show the model data from the target project?" We will discuss the specifics of evaluating project-fine-tuned models and present results for three models on the method name prediction task.

Speaker: Egor Bogomolov

7 months, 3 weeks ago @ youtube.com
Type Prediction for Source Code Using Graph Neural Networks

At the seminar we will talk about our work on pre-training graph neural network (GNN) vector representations for source code. We assess the quality of the vectors using a type prediction task for Python, a dynamically typed language. A name prediction task is used for pre-training. According to our experiments, GNN vector representations achieve type classification accuracy comparable to CodeBERT. In addition, combining CodeBERT and GNN vectors into a hybrid model improves type classification accuracy. Moreover, the improvements are achieved even after training the GNN model for just one epoch, which is far less than…

7 months, 3 weeks ago @ youtube.com
Industry-scale IR-based Bug Localization: A Perspective from Facebook

In large companies where all code lives in a single repository, it is very important to be able to localize a bug quickly. The task gets harder when individual files consist of hundreds of lines and the problem is detected during end-to-end testing or in production. In such a situation an automated solution is needed that can quickly find the breaking commit, even though error messages are often hard to read and contain a large amount of information. At this seminar we will go over a paper from Facebook (https://arxiv.org/pdf/2010.09977.pdf) in which the authors propose an efficient unsupervised algorithm for localizing a bug to a commit using information retrieval methods. The described algorithm is adapted…

7 months, 3 weeks ago @ youtube.com
Code Smells for Machine Learning Applications

Software development involves finding and fixing bugs. Software engineering has long studied and described code smells: formal signs indicating the possible presence of problems. Examples of code smells include feature envy (a method accesses the data of another class more often than the data of its own) and parallel inheritance hierarchies (a situation where creating a new class in one class hierarchy almost always requires creating a matching class in another hierarchy). For each code smell, potential remedies are described, often amounting to some kind of refactoring.

However, projects involving machine learning have their own specifics and…

7 months, 3 weeks ago @ youtube.com
Fastformer: Additive Attention Can Be All You Need

The Transformer is a very good model for text understanding, but it is inefficient because of its quadratic asymptotic complexity in the length of the input sequence. Although many methods for speeding up the Transformer exist, they are still not efficient enough on long sequences. The authors of the paper propose Fastformer, an efficient Transformer model based on additive attention. At the seminar we will recall how Transformers work, get acquainted with additive attention and Fastformer, and see how it copes with various tasks. Speaker: Timur Khabibullin

7 months, 3 weeks ago @ youtube.com
Language Models are Unsupervised Multitask Learners

Natural language processing tasks such as machine translation, question answering, and text summarization are usually solved with supervised learning on datasets curated for the specific task. The authors of the paper show that it is possible to train a model capable of solving various tasks with minimal supervised training, using the WebText dataset, which consists of millions of diverse web pages. At the seminar we will discuss how the model handles tasks of different kinds and compare the authors' results with those of state-of-the-art models. Speaker: Margarita Chudova

7 months, 3 weeks ago @ youtube.com
Neural Code Completion: Research & Practice

I will talk about how the AI Team built a code completion system for the R language. I will cover the difficulties one can run into when developing and deploying a neural-network-based completion system. We will also look at some open research problems in neural code completion and discuss possible ways to solve them. Most of the talk is based on the paper Time-Efficient Code Completion Model for the R Programming Language (https://aclanthology.org/2021.nlp4prog-1.4/), published at the NLP4prog 2021 workshop (https://nlp4prog.github.io/2021/) of the ACL conference. Speaker: Artem Popov.

9 months, 1 week ago @ youtube.com
Yandex. Computer Science
last post 2 weeks, 2 days ago
Data Dojo: ML Training, November 17, 2022

Data Dojo is a series of machine learning training sessions and a meeting place for data analysis professionals. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос ("question") so that the host can read them out live.

2 weeks, 2 days ago @ youtube.com
Data Dojo: ML Training, September 22, 2022

Data Dojo is a series of machine learning training sessions and a meeting place for data analysis professionals. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос ("question") so that the host can read them out live.

Program:

19:00 - Opening

19:05 - Russian-language sentence acceptability benchmark (RuCoLA) + secret release / Maxim Ryabinin (Yandex)

19:40 - Break

20:00 - Car model verification (Machines Can See 2022) / Dmitry Gaus (VisionLabs) and Artem Strekalov (Ufanet JSC)

2 months, 1 week ago @ youtube.com
RMQ and LCA Problems. Part 2

Segment tree. The RSQ (range sum query) problem. The LCA (least common ancestor) and RMQ (range minimum query) problems. Solving RMQ with a sparse table. Reducing LCA to RMQ (the Farach-Colton and Bender algorithm). Reducing RMQ to LCA. The LA (level ancestors) problem. More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt
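
The sparse-table approach to RMQ listed above can be sketched roughly as follows. This is an illustrative version under simple assumptions (a static list of comparable values, 0-based inclusive query bounds), not the lecture's own code; the names `build_sparse_table` and `range_min` are made up for the example.

```python
def build_sparse_table(values):
    """Return table where table[k][i] = min(values[i : i + 2**k]).

    Building takes O(n log n) time and memory.
    """
    n = len(values)
    table = [list(values)]
    k = 1
    while (1 << k) <= n:
        prev = table[k - 1]
        half = 1 << (k - 1)
        # A block of length 2**k is the union of two blocks of length 2**(k-1).
        table.append(
            [min(prev[i], prev[i + half]) for i in range(n - (1 << k) + 1)]
        )
        k += 1
    return table


def range_min(table, left, right):
    """Minimum of values[left..right] (inclusive) in O(1); needs left <= right."""
    k = (right - left + 1).bit_length() - 1  # largest power of two that fits
    # Two overlapping blocks of length 2**k cover the whole range.
    return min(table[k][left], table[k][right - (1 << k) + 1])
```

The overlap of the two blocks is harmless because taking a minimum is idempotent; the same trick would not work directly for range sums.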

7 months, 1 week ago @ youtube.com
The Shortest Path Problem. The Bellman-Ford, Floyd, Dijkstra, and Johnson Algorithms

Shortest paths in graphs. Distance estimates and their relaxation. The Bellman-Ford, Floyd, and Dijkstra algorithms. Potentials. A criterion for conservative edge lengths in terms of the existence of feasible potentials. Finding feasible potentials with the Bellman-Ford algorithm. Johnson's algorithm. More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt

7 months, 1 week ago @ youtube.com
Longest Increasing Subsequence 2. Heaps. Heapsort

The longest increasing subsequence problem. The Heap-Sort and Intro-Sort sorting algorithms. Partial sorting using a heap and order-statistic selection. More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt

7 months, 1 week ago @ youtube.com
Bloom Filter and Count-Min Sketch

Building a perfect hash function by two-level hashing. Building a perfect hash function using acyclic graphs. Bloom filter. Estimating the false-positive probability. Count-min sketch. More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt
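
As a rough illustration of the Bloom filter and its false-positive estimate mentioned above, here is a minimal sketch. The class name, the use of SHA-256 with double hashing to derive the bit positions, and the string-only items are assumptions of this example rather than anything taken from the lecture.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)
        self.count = 0  # number of add() calls

    def _positions(self, item):
        # Derive num_hashes positions from two 64-bit values (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)
        self.count += 1

    def __contains__(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )

    def estimated_false_positive_rate(self):
        # Standard approximation (1 - e^(-k*n/m))^k for k hashes,
        # n inserted items and m bits.
        k, n, m = self.num_hashes, self.count, self.num_bits
        return (1.0 - math.exp(-k * n / m)) ** k
```

The filter never reports a false negative: every bit set by `add` stays set, so an inserted item always passes the membership check.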

7 months, 1 week ago @ youtube.com
Strongly Connected Components, Articulation Points, and Bridges

Strongly connected components. Articulation points: definition and finding them with depth-first search. Bridges. More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt

7 months, 1 week ago @ youtube.com
Models of Computation. Amortized Cost Analysis. Part 1

Time and memory as the main resources. The RAM machine. Complexity on a given input, worst-case complexity, average-case complexity, randomized complexity.

Amortized cost of operations, the potential method, the banker's method of complexity analysis.

Variable-size arrays. Reallocation. Amortized complexity analysis of the push-back operation. More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt

7 months, 1 week ago @ youtube.com
Models of Computation. Amortized Cost Analysis. Part 2

Time and memory as the main resources. The RAM machine. Complexity on a given input, worst-case complexity, average-case complexity, randomized complexity.

Amortized cost of operations, the potential method, the banker's method of complexity analysis.

Variable-size arrays. Reallocation. Amortized complexity analysis of the push-back operation. More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt

7 months, 1 week ago @ youtube.com
Queues and Stacks. Immutability and Persistence

Implementing a queue on a pair of stacks with constant amortized complexity. Dynamic minima and maxima in stacks and queues. Persistent data structures. Kinds of persistence. The Pointer Machine model of computation. Persistent stacks and queues. More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt
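
The queue built on a pair of stacks, the first topic above, might look like the sketch below. The class and attribute names are illustrative assumptions; Python lists stand in for the two stacks.

```python
class TwoStackQueue:
    """FIFO queue built from two LIFO stacks."""

    def __init__(self):
        self._inbox = []   # receives pushed elements
        self._outbox = []  # serves pops in reversed (FIFO) order

    def push(self, value):
        self._inbox.append(value)

    def pop(self):
        if not self._outbox:
            # Move everything over once; this reverses the order.
            while self._inbox:
                self._outbox.append(self._inbox.pop())
        if not self._outbox:
            raise IndexError("pop from empty queue")
        return self._outbox.pop()

    def __len__(self):
        return len(self._inbox) + len(self._outbox)
```

Each element is pushed onto and popped from each stack at most once, which is why the amortized cost per operation is constant even though a single `pop` can take linear time.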

7 months, 1 week ago @ youtube.com
Misra-Gries. Search Trees. Red-Black Trees. Cartesian Trees and Treaps.

The Misra-Gries algorithm.

Search trees. Inserting and deleting elements. In-order tree traversal. Red-black trees: definition and basic properties. Implementing insertion for a red-black tree. Treaps. Uniqueness of the treap for a given set of distinct keys and priorities. A logarithmic bound on the expected height of a treap. Merge and split operations for treaps. Insertion and deletion for treaps. More about admission to the School of Data Analysis: https://academy.yandex.ru/dataschool
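
For the Misra-Gries algorithm named at the top of this entry, a minimal sketch of the usual frequent-items formulation is given below. The function name and the choice of `k - 1` counters are assumptions of this example; the lecture may present a different variant.

```python
def misra_gries(stream, k):
    """One-pass summary with at most k - 1 counters (requires k >= 2).

    Any item occurring more than len(stream) / k times is guaranteed
    to be among the returned keys; the counts are underestimates.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

The returned keys are only candidates: a second pass over the data is needed to confirm which of them really exceed the frequency threshold.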

7 месяцев, 1 неделя назад @ youtube.com
Quicksort and merge sort 2. Binary search. Longest increasing subsequence

Quicksort (Quick-Sort). Ways of choosing the pivot element. Tail-recursion elimination. Order statistics. The randomized Quick-Select algorithm. A deterministic selection algorithm (the "median of medians" method).

Binary search. Galloping.

Linear-time merging of sorted sequences. Merging of sorted sequences that is optimal in the number of comparisons.

The longest increasing subsequence problem. Dynamic programming. An O(n log n) algorithm. More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt
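
One common way to reach the O(n log n) bound for the longest increasing subsequence is binary search over a "tails" array. The sketch below is not necessarily the variant presented in the lecture; the function name `lis_length` is an assumption, and it computes the length of a strictly increasing subsequence only.

```python
from bisect import bisect_left


def lis_length(seq):
    """Length of the longest strictly increasing subsequence of seq."""
    # tails[k] is the smallest possible last element of an increasing
    # subsequence of length k + 1 seen so far; tails stays sorted.
    tails = []
    for x in seq:
        i = bisect_left(tails, x)  # first position whose value is >= x
        if i == len(tails):
            tails.append(x)        # x extends the longest subsequence found
        else:
            tails[i] = x           # x gives a smaller tail for length i + 1
    return len(tails)
```

Each element costs one binary search, which gives O(n log n) overall. Using `bisect_right` instead would count non-decreasing subsequences.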

7 months, 1 week ago @ youtube.com
The RMQ and LCA problems. Part 1

Segment tree. The RSQ (range sum query) problem. The LCA (least common ancestor) and RMQ (range minimum query) problems. Solving RMQ with a sparse table. Reducing LCA to RMQ (the Farach-Colton and Bender algorithm). Reducing RMQ to LCA. The LA (level ancestors) problem. More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt

7 months, 1 week ago @ youtube.com
Minimum spanning trees. Kruskal's and Prim's algorithms. Disjoint-set systems.

Minimum-weight spanning trees. The lemma on the minimum edge in a cut. Kruskal's and Prim's algorithms. The DSU (disjoint set union) structure. Forest-based implementation. Vertex ranks, the rank heuristic. Logarithmic bound on rank in terms of the number of elements. The path compression heuristic. Bound on the amortized cost of operations (without proof). More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt
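
The forest-based DSU with the rank and path compression heuristics listed above could look roughly like the following Python sketch. The class and method names are assumptions for illustration and are not taken from the lecture.

```python
class DSU:
    """Disjoint set union over elements 0..n-1, stored as a forest."""

    def __init__(self, n):
        self.parent = list(range(n))  # every element starts as its own root
        self.rank = [0] * n           # upper bound on the height of each tree

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every visited node at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets of a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Rank heuristic: hang the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True
```

In Kruskal's algorithm, an edge is added to the spanning tree exactly when `union` on its endpoints returns `True`.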

7 months, 1 week ago @ youtube.com
Splay trees. Breadth-first search. Depth-first search. Topological sorting and acyclicity checking.

Splay trees. The splay operation: zig, zig-zig, and zig-zag steps. Implementing insertion, deletion, merge, and split for splay trees. Depth-first search. Topological sorting. More about admission to the School of Data Analysis from Yandex Academy: https://clck.ru/geqRt

7 months, 1 week ago @ youtube.com
ML Trainings
last post 1 day, 8 hours ago
ML System Design - System Deployment

Course page: https://ods.ai/tracks/ml-system-design-22

All supplementary materials are in the block on the course page: https://ods.ai/tracks/ml-system-design-22/blocks/b8e07cd1-70ea-4509-aa62-de7897e05052

Course Fest: https://ods.ai/events/course_season_autumn_22

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 day, 8 hours ago @ youtube.com
DRL Course | Introduction to Neural Networks. Deep Cross-Entropy Method

Deep Reinforcement Learning course: https://ods.ai/tracks/drlcourse22

Course season: https://ods.ai/events/course_season_autumn_22

In the second lecture:

The concepts of a neuron, activation functions, and neural networks are introduced.

The neural-network approach to regression and classification problems is briefly outlined.

Cybenko's theorem on the approximation of continuous functions by neural networks is presented.

A modification of the cross-entropy method that uses neural networks is described, for solving reinforcement learning problems with infinite state and action spaces.

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

2 days, 8 hours ago @ youtube.com
Ivan Glukhov | Process mining today: an overview of current solutions in the context of import substitution

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "ML in Industry":

https://ods.ai/tracks/sibfest3-ml-in-industry

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 3 days ago @ youtube.com
Egor Nekrut | Improving the 3D Geometry of Building Facades

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "Computer Vision":

https://ods.ai/tracks/sibfest3-computer-vision

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 3 days ago @ youtube.com
Anna Marshalova | Aspect extraction from the texts of scientific papers

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "Student Projects":

https://ods.ai/tracks/sibfest3-student-projects

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 3 days ago @ youtube.com
Olga Tikhobaeva | Extracting semantic relations from the texts of scientific papers

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "Student Projects":

https://ods.ai/tracks/sibfest3-student-projects

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 4 days ago @ youtube.com
Anton Legchenko | Improving speech recognition without labeling, registration, or SMS

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "NLP":

https://ods.ai/tracks/sibfest3-nlp

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 4 days ago @ youtube.com
Alina Kocheva | Model Serving in K8S

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "MLOps":

https://ods.ai/tracks/sibfest3-mlops

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 4 days ago @ youtube.com
Valentin Mamedov | Recommending maintenance procedures to operators

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "ML in Industry":

https://ods.ai/tracks/sibfest3-ml-in-industry

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 5 days ago @ youtube.com
Alexey Sinadsky | CL Thymus APRE: traffic analysis with zero prior knowledge

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "ML & Security":

https://ods.ai/tracks/sibfest3-ml-and-security

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 5 days ago @ youtube.com
Anton Voronov | Finding our place in the storm

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "Career":

https://ods.ai/tracks/sibfest3-career

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 5 days ago @ youtube.com
Pavel Petrochenko | Present and future of Smart Completion tools

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "Code Mining":

https://ods.ai/tracks/sibfest3-code-mining

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 6 days ago @ youtube.com
Alexander Kalinin | Research Manifest: how to "sell" risks in ML so that it doesn't hurt badly later

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "ML in Industry":

https://ods.ai/tracks/sibfest3-ml-in-industry

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 6 days ago @ youtube.com
Alexander Migutsky | Security research on biometric devices

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "ML & Security":

https://ods.ai/tracks/sibfest3-ml-and-security

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 week, 6 days ago @ youtube.com
Dmitry Chudakov | Super Resolution of orthophotos and dancing with photogrammetry

Data Fest Siberia 3 & Halloween 2022:

https://ods.ai/tracks/groups/data-fest-siberia-3-halloween

Track "Computer Vision":

https://ods.ai/tracks/sibfest3-computer-vision

Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

2 weeks ago @ youtube.com
Primer
last post 3 months, 2 weeks ago
How many people might ever exist, calculated

You can get 50% off What We Owe The Future and drive sales to local independent bookstores by using the promotion code PRIMER50 when buying from the following website: https://bookshop.org/books/what-we-owe-the-future/9781541618626 I made this video in partnership with the Forethought Foundation for Global Priorities Research, where the author Will MacAskill serves as director. Their goal is to help the book reach more people, and I’m very aligned with that goal. The more we can work together to think about our future, the better! Source links:

https://ourworldindata.org/longtermism

https://www.prb.org/articles/how-many-people-have-ever-lived-on-earth/

https://ourworldindata.org/world-popul…

3 months, 2 weeks ago @ youtube.com
How To Catch A Cheater With Math

Try catching cheaters yourself: https://primerlearning.org/ Support these videos on Patreon: https://www.patreon.com/primerlearning

Plush blobs and other stuff: https://store.dftba.com/collections/primer Binomial probability example (the whole section on Khan Academy may be helpful)

https://www.khanacademy.org/math/statistics-probability/random-variables-stats-library/binomial-random-variables/v/probability-of-making-2-shots-in-6-attempts For discussion and updates

- Discord: https://discord.gg/NbruaNW

- Twitter: @primerlearning

- Reddit: r/primerlearning Made with Unity and Manim

https://github.com/Helpsypoo/PrimerUnity

https://www.manim.community/ Made possible by support through Patreon:…

5 months, 1 week ago @ youtube.com
Can you catch the cheaters?

Play at primerlearning.org

Or on Google Play: https://play.google.com/store/apps/details?id=com.Primer.CatchtheCheaters For discussion and updates

- Discord: https://discord.gg/NbruaNW

- Twitter: @primerlearning

- Reddit: r/primerlearning Plush blobs and other merch: https://store.dftba.com/collections/primer

Support these videos on Patreon: https://www.patreon.com/primerlearning Made with Unity

https://github.com/Helpsypoo/PrimerUnity Made possible by support through Patreon:

Anthony Eufemio

Jon Mundle

Spline

Zachariah Richard Fournier

Vladimir Duchenchuk

Roy & BreAnna Steves

Shayn Osborn

Jeremy

Guguke

Anders Fjeldvær

Luc Cedric R.

Erik Broeders

Kairui Wang

Sean Barker

Eric Helps

Stevie Hr…

7 months, 3 weeks ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
last post 2 days ago
#342 – Todd Howard: Skyrim, Elder Scrolls 6, Fallout, and Starfield

Todd Howard is a legendary video game designer at Bethesda Game Studios.

He led the development of the Elder Scrolls and Fallout series and the upcoming game Starfield.

Please support this podcast by checking out our sponsors:
– Shopify: https://shopify.com/lex to get free trial
– Eight Sleep: https://www.eightsleep.com/lex to get special savings
– InsideTracker: https://insidetracker.com/lex to get 20% off
– LMNT: https://drinkLMNT.com/lex to get free sample pack

EPISODE LINKS:
Bethesda: https://bethesda.net
Bethesda Game Studios: https://bethesdagamestudios.com
Creation Club: https://creationclub.bethesda.net

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: ht…

2 days ago @ lexfridman.com
#341 – Guido van Rossum: Python and the Future of Programming

Guido van Rossum is the creator of the Python programming language.

Please support this podcast by checking out our sponsors:
– GiveDirectly: https://givedirectly.org/lex to get gift matched up to $1000
– Eight Sleep: https://www.eightsleep.com/lex to get special savings
– Fundrise: https://fundrise.com/lex
– InsideTracker: https://insidetracker.com/lex to get 20% off
– Athletic Greens: https://athleticgreens.com/lex to get 1 month of fish oil

EPISODE LINKS:
Guido’s Twitter: https://twitter.com/gvanrossum
Guido’s Website: https://gvanrossum.github.io/
Python’s Website: https://python.org

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://s…

5 days, 1 hour ago @ lexfridman.com
#340 – Chris Tarbell: FBI Agent Who Took Down Silk Road

Chris Tarbell is a former FBI special agent and cybercrime investigation specialist who brought down Ross Ulbricht and Silk Road, and Hector Monsegur (aka Sabu) of LulzSec and Anonymous.

Please support this podcast by checking out our sponsors:
– True Classic Tees: https://trueclassictees.com/lex and use code LEX to get 25% off
– InsideTracker: https://insidetracker.com/lex to get 20% off
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
– BetterHelp: https://betterhelp.com/lex to get 10% off
– Blinkist: https://blinkist.com/lex to get 25% off premium

EPISODE LINKS:
Hacker And The Fed podcast: https://podcasts.apple.com/podcast/hacker-and-the-fed/id1649541362
Naxo: https://naxo.com/w…

1 week, 2 days ago @ lexfridman.com
#339 – Climate Change Debate: Bjørn Lomborg and Andrew Revkin

Bjørn Lomborg is author of “False Alarm”.

Andrew Revkin is a climate journalist (21 years at NY Times).

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://www.eightsleep.com/lex to get special savings
– Linode: https://linode.com/lex to get $100 free credit
– InsideTracker: https://insidetracker.com/lex to get 20% off
– Onnit: https://lexfridman.com/onnit to get up to 10% off

EPISODE LINKS:
Andrew’s Twitter: https://twitter.com/Revkin
Andrew’s Substack: https://revkin.substack.com
Andrew’s Linktree: https://linktr.ee/revkin
Bjørn’s Twitter: https://twitter.com/BjornLomborg
Bjørn’s Website: https://lomborg.com
Andrew’s Books:
The Human Planet: https://amzn.to/3MRuLUY
The Bur…

1 week, 6 days ago @ lexfridman.com
#338 – Chamath Palihapitiya: Money, Success, Startups, Energy, Poker & Happiness

Chamath Palihapitiya is a venture capitalist, engineer, CEO of Social Capital, and co-host of the All-In Podcast.

Please support this podcast by checking out our sponsors:
– Bambee: https://bambee.com and use code LEX to get free HR audit
– InsideTracker: https://insidetracker.com/lex to get 20% off
– NetSuite: http://netsuite.com/lex to get free product tour
– SimpliSafe: https://simplisafe.com/lex
– Indeed: https://indeed.com/lex to get $75 credit

EPISODE LINKS:
Chamath’s Twitter: https://twitter.com/chamath
Chamath’s LinkedIn: https://linkedin.com/in/chamath
Chamath’s Substack: https://chamathreads.substack.com
Social Capital (website): https://socialcapital.com
All-In Podcast (podcast): https://yo…

2 weeks, 2 days ago @ lexfridman.com
#337 – Destiny: Politics, Free Speech, Controversy, Sex, War, and Relationships

Steven Bonnell, aka Destiny, is a progressive political commentator and a live streamer on YouTube.

Melina Goransson is a live streamer on Twitch.

Please support this podcast by checking out our sponsors:
– True Classic Tees: https://trueclassictees.com/lex and use code LEX to get 25% off
– Athletic Greens: https://athleticgreens.com/lex to get 1 month of fish oil
– MasterClass: https://masterclass.com/lex to get 15% off
– Blinkist: https://blinkist.com/lex to get 25% off premium

EPISODE LINKS:
Destiny’s YouTube: https://youtube.com/destiny
Destiny’s Subreddit: https://reddit.com/r/Destiny
Destiny’s Website: https://destiny.gg
Destiny’s Instagram: https://instagram.com/destiny
Melina’s Twitch: https:…

2 weeks, 5 days ago @ lexfridman.com
#336 – Ben Shapiro: Politics, Kanye, Trump, Biden, Hitler, Extremism, and War

Ben Shapiro is a conservative political commentator, host of The Ben Shapiro Show, co-founder of The Daily Wire, and author of The Authoritarian Moment and other books.

Please support this podcast by checking out our sponsors:
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
– Policygenius: https://www.policygenius.com/
– BetterHelp: https://betterhelp.com/lex to get 10% off
– InsideTracker: https://insidetracker.com/lex to get 20% off

EPISODE LINKS:
Ben’s Twitter: https://twitter.com/benshapiro
Ben’s Instagram: https://instagram.com/officialbenshapiro
Daily Wire: https://dailywire.com
Ben’s Books:
The Authoritarian Moment: https://amzn.to/3T3RRJv
Facts (Still) Don’t Care About Your Fe…

3 weeks, 3 days ago @ lexfridman.com
#335 – Fiona Hill: Vladimir Putin and Donald Trump

Fiona Hill is a presidential advisor and foreign policy expert specializing in Russia.

Please support this podcast by checking out our sponsors:
– Mizzen+Main: https://mizzenandmain.com and use code LEX to get $35 off
– Calm: https://calm.com/lex to get 40% off premium
– Athletic Greens: https://athleticgreens.com/lex to get 1 month of fish oil
– LMNT: https://drinkLMNT.com/lex to get free sample pack

EPISODE LINKS:
Fiona’s Books:
There Is Nothing for You Here: https://amzn.to/3TR0nN9
Mr. Putin: Operative in the Kremlin: https://amzn.to/3WiGU9F

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.…

3 weeks, 6 days ago @ lexfridman.com
#334 – Abbas Amanat: Iran Protests, Mahsa Amini, History, CIA & Nuclear Weapons

Abbas Amanat is a historian at Yale specializing in the modern history of Iran.

1. Iran: https://amzn.to/3zzLWVA
2. Apocalyptic Islam and Iranian Shi’ism: https://amzn.to/3h66fU0

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips

SUPPORT & CONNECT:
– Check out the sponsors above, it’s the best way to support this podcast
– Support on Patreon: https://www.patreon.com/lexfridman
– Twitter: https://twitter.com/lexfridman
– Instagram: https://www.instagram.com/lexfridman
– LinkedIn: https:…

4 weeks, 1 day ago @ lexfridman.com
#333 – Andrej Karpathy: Tesla AI, Self-Driving, Optimus, Aliens, and AGI

Andrej Karpathy is a legendary AI researcher, engineer, and educator.

He’s the former director of AI at Tesla, a founding member of OpenAI, and an educator at Stanford.

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://www.eightsleep.com/lex to get special savings
– BetterHelp: https://betterhelp.com/lex to get 10% off
– Fundrise: https://fundrise.com/lex
– Athletic Greens: https://athleticgreens.com/lex to get 1 month of fish oil

EPISODE LINKS:
Andrej’s Twitter: http://twitter.com/karpathy
Andrej’s YouTube: http://youtube.com/c/AndrejKarpathy
Andrej’s Website: http://karpathy.ai
Andrej’s Google Scholar: http://scholar.google.com/citations?user=l8WuQJgAAAAJ
Books mentio…

1 month ago @ lexfridman.com
#332 – Kanye ‘Ye’ West

Ye is a legendary artist, producer, and designer.

Please support this podcast by checking out our sponsors:
– Shopify: https://shopify.com/lex to get 14-day free trial
– InsideTracker: https://insidetracker.com/lex to get 20% off
– Indeed: https://indeed.com/lex to get $75 credit
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
– Athletic Greens: https://athleticgreens.com/lex to get 1 month of fish oil

EPISODE LINKS:
Ye’s Twitter: https://twitter.com/kanyewest
Ye’s Instagram: https://instagram.com/kanyewest
Ye’s Website: https://kanyewest.com

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https…

1 month, 1 week ago @ lexfridman.com
#331 – Balaji Srinivasan: How to Fix Government, Twitter, Science, and the FDA

Balaji Srinivasan is an angel investor, tech founder, philosopher, and author of The Network State: How to Start a New Country.

He was formerly the CTO of Coinbase and General Partner at Andreessen Horowitz.

Please support this podcast by checking out our sponsors:
– Policygenius: https://www.policygenius.com/
– Blinkist: https://blinkist.com/lex to get 25% off premium
– Notion: https://notion.com
– Onnit: https://lexfridman.com/onnit to get up to 10% off

EPISODE LINKS:
Balaji’s Twitter: https://twitter.com/balajis
Balaji’s Website: https://balajis.com
Books:
1. The Network State: https://thenetworkstate.com

On some podcast players you should be able to click the timestamp to jump to that time.

1 month, 1 week ago @ lexfridman.com
#330 – Hikaru Nakamura: Chess, Magnus, Kasparov, and the Psychology of Greatness

Hikaru Nakamura is a chess super grandmaster and is currently the #1 ranked blitz chess player in the world.

He is also one of the top chess streamers on Twitch and YouTube.

Please support this podcast by checking out our sponsors:
– Mizzen+Main: https://mizzenandmain.com and use code LEX to get $35 off
– InsideTracker: https://insidetracker.com/lex to get 20% off
– NetSuite: http://netsuite.com/lex to get free product tour
– SimpliSafe: https://simplisafe.com/lex

EPISODE LINKS:
Hikaru’s Twitch: https://twitch.tv/gmhikaru
Hikaru’s YouTube: https://www.youtube.com/GMHikaru
Hikaru’s Twitter: https://twitter.com/GMHikaru
Hikaru’s Instagram: https://instagram.com/gmhikaru
Hikaru’s Website: https://hikaru…

1 month, 2 weeks ago @ lexfridman.com
#329 – Kate Darling: Social Robots, Ethics, Privacy and the Future of MIT

Kate Darling is a researcher at MIT Media Lab interested in human-robot interaction and robot ethics.

Please support this podcast by checking out our sponsors:
– True Classic Tees: https://trueclassictees.com/lex and use code LEX to get 25% off
– Shopify: https://shopify.com/lex to get 14-day free trial
– Linode: https://linode.com/lex to get $100 free credit
– InsideTracker: https://insidetracker.com/lex to get 20% off
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free

EPISODE LINKS:
Kate’s Twitter: http://twitter.com/grok_
Kate’s Website: http://katedarling.org
Kate’s Instagram: http://www.instagram.com/grok_
The New Breed (book): https://amzn.to/3ExhBuf
Creativity without Law (book): …

1 month, 2 weeks ago @ lexfridman.com
#328 – John Danaher: Submission Grappling, ADCC, Animal Combat, and Knives

John Danaher is one of the greatest coaches and minds in martial arts history.

Please support this podcast by checking out our sponsors:
– Audible: https://audible.com/lex to get 30-day free trial
– Calm: https://calm.com/lex to get 40% off premium
– Indeed: https://indeed.com/lex to get $75 credit
– MasterClass: https://masterclass.com/lex to get 15% off
– Eight Sleep: https://www.eightsleep.com/lex to get special savings

EPISODE LINKS:
John’s Instagram: https://instagram.com/danaherjohn
Watch full matches at FloGrappling: https://flograppling.com

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfrid…

1 month, 3 weeks ago @ lexfridman.com
Microsoft Research Podcast
last post 7 months, 3 weeks ago
135 - Just Tech: Centering Community-Driven Innovation at the Margins Episode 3 with Dr. Sasha Costanza-Chock

In “Just Tech: Centering Community-Driven Innovation at the Margins,” Senior Principal Researcher Mary L. Gray explores how technology and community intertwine and the role technology can play in supporting community-driven innovation and community-based organizations.

Dr. Gray and her team are working to bring computer science, engineering, social science, and communities together to boost societal resilience in ongoing work with Project Resolve.

She’ll talk with organizers, academics, technology leaders, and activists to understand how to develop tools and frameworks of support alongside members of these communities.

They also discuss how critical thinkers and makers from social movements…

7 months, 3 weeks ago @ blubrry.com
134 - Just Tech: Centering Community-Driven Innovation at the Margins episode 2 with Dr. Tawanna Dillahunt, Zachary Rowe, and Joanna Velazquez

In “Just Tech: Centering Community-Driven Innovation at the Margins,” Senior Principal Researcher Mary Gray explores how technology and community intertwine and the role technology can play in supporting community-driven innovation and community-based organizations.

Dr. Gray and her team are working to bring computer science, engineering, social science, and community together to boost societal resilience in ongoing work with Project Resolve.

She’ll talk with organizers, academics, technology leaders, and activists to understand how to develop tools and frameworks of support alongside members of these communities.

In this episode of the series, Dr. Gray talks with Dr. Tawanna Dillahunt, Ass…

8 months ago @ blubrry.com
133 - Just Tech: Centering Community-Driven Innovation at the Margins episode 1 with Desmond Patton and Mary Gray

In “Just Tech: Centering Community-Driven Innovation at the Margins,” Senior Principal Researcher Mary Gray explores how technology and community intertwine and the role technology can play in supporting community-driven innovation and community-based organizations.

Dr. Gray and her team are working to bring computer science, engineering, social science, and community together to boost societal resilience in ongoing work with Project Resolve.

She’ll talk with organizers, academics, technology leaders, and activists to understand how to develop tools and frameworks of support alongside members of these communities.

Together, they explore Patton’s learnings about the challenges of using AI in…

8 months, 1 week ago @ blubrry.com
Data Skeptic
last post 3 days, 3 hours ago
Your Mouse Reveals Your Gender and Age

Today, he talks to us about his recent work on behavioural profiling via mouse tracking.

He outlined the techniques used for tracking human behavior on the web (eye gaze measurement and mouse tracking) and contrasted the two techniques.

Luis then talked about his research, where he and his co-authors tracked a sample of users’ mouse movements after obtaining consent.

Luis spoke about the possibility of using the tracked user attention to determine whether an advertiser pays for an ad.

The positives and negatives of accurate mouse tracking for targeted ads were also discussed on the show.

Luis discussed efforts to curtail the privacy concerns with mouse tracking.

3 days, 3 hours ago @ dataskeptic.com
Measuring Web Search Behavior
Measuring Web Search Behavior Measuring Web Search Behavior

Measuring Web Search BehaviorIt’s a double guest show today.

Aleksandra Urman and Mykola Makhortykh join us to discuss their work on the comparative analysis of web search behavior using web tracking data.

Aleksandra is a postdoctoral researcher at the University of Zurich, Switzerland, where she works with a social computing group.

Mykola, on the other hand, is a lecturer at the University of Bern, working at the Institute of Media and Communication Studies.

Furthermore, Mykola and Aleksandra discussed some of the takeaways for search engines from the results of their analysis.

1 week, 3 days ago @ dataskeptic.com
StrategyQA and Big Bench

Last time, she spoke about annotator bias in language models and how it affects the robustness of NLP models.

Mor spoke about the StrategyQA dataset, a question-answering benchmark for testing the ability of models to perform implicit reasoning.

StrategyQA is one of the challenging tasks in the Big Bench benchmark, a collaborative benchmark for measuring the capabilities of large language models.

The construction of Big Bench was led by Google and involved contributions from over 400 researchers in the NLP community.

In closing, she gave her take on whether the trajectory in language models will lead to AGI.

1 week, 6 days ago @ dataskeptic.com
Ad Blockers Effect on News Consumption

On the show today, we speak with Shunyao Yan, an Assistant Professor in Marketing at Leavey School of Business, Santa Clara University.

She joins us to discuss her recent work on the effect of ad blockers on news consumption.

Her study investigated the changes in the behavior of users that adopt ad blockers.

She specifically spoke about the characterization of users who are most likely to adopt ad blockers.

She also explained how news platforms can still generate revenues despite the adoption of ad blockers — a development news publishers have acknowledged.

2 weeks, 3 days ago @ dataskeptic.com
Your Consent is Worth 75 Euros a Year

His research interest lies in privacy and data protection for IoT, usability, and Human-Computer Interactions.

He also questioned the legalities of website targeting activities.

Victor then spoke about the Transparency and Consent Framework (TCF), which is used to communicate users’ consent and aggregate the data for targeted advertising.

He discussed the efforts of agencies such as the Belgian Data Protection Authority to audit websites for ad targeting.

Wrapping up, Victor discussed future opportunities for research in the field.

3 weeks, 3 days ago @ dataskeptic.com
Automated Email Generation for Targeted Attacks

Avisha’s research studies phishing attacks and natural language models in biomedical applications.

She explained why the phishing email problem is still widely unsolved.

She also discussed possible ways one can identify phishing emails and take caution.

She discussed the possibility of applying generative language models for cognitive therapies as well.

She discussed ways researchers are handling this black box problem.

1 month ago @ dataskeptic.com
Tribal Marketing

On today’s episode, we are joined by Peter Gloor, a Research Scientist at the MIT Center for Collective Intelligence.

Peter then focuses on his research and his self-made tool called the tribe finder.

The tribe finder model was trained to classify a large array of people online into tribes.

He discussed at length how the tribe finder model scrapes tweets and the yardstick he used for the tribe classifications.

Peter explained how he validated the accuracy of his tribe finder model.

1 month, 1 week ago @ dataskeptic.com
Nano-targetted Facebook Ads

He discussed how he got the data for his work using the Facebook API.

Speaking of Facebook ads, Ángel discussed what it takes to run a Facebook campaign.

He then discussed how Facebook profiles users based on their activities on the platform.

He spoke about the limits that big social media companies such as Facebook and LinkedIn have in place.

Ángel also spoke on the actions of other big companies such as Google in this regard.

1 month, 2 weeks ago @ dataskeptic.com
Debiasing GPT-3 Job Ads

Today, Conrad Borchers, a Ph.D. student in Human-Computer Interaction, joins us to discuss his work on debiasing large generative language models, particularly GPT-3.

Speaking of the importance of debiasing large language models, Conrad began with retrospective questions.

He also spoke about methods to effectively fine-tune GPT-3 models for eliminating bias.

He explained what prompt engineering was and how to potentially de-bias language models with prompts.

He discussed social practices that can help expedite the improvement of generative language models in the future.

1 month, 3 weeks ago @ dataskeptic.com
ML Ops in Production

Moses Guttman from Clear ML joins us to share insights about how organizations leveraging machine learning keep their programs on track.

While many parallels exist between the software development life cycle (SDLC) and the machine learning development life cycle, successful deployments of ML in production have demonstrated that a unique set of tools is required.

Moses and I discuss the emergence of ML Ops, success stories, and how modern teams leverage tools like Clear ML’s open source solution to maximize the value of ML in the organization.

1 month, 3 weeks ago @ dataskeptic.com
Ad Network Tomography

He has recently been working on the compliance side of data sharing auditing as well.

Professor Rishab mainly focuses on internet measurement and how data sharing affects users’ experience.

Maaz started with how he got into data sharing as a field.

He gave further insight into data sharing, and how advertisers exploit the data for targeted ads across multiple websites.

Given that the data sharing operations are mostly private, Maaz and Rishab discussed the need to have more transparency.

1 month, 4 weeks ago @ dataskeptic.com
First Party Tracking Cookies

Shaoor, whose research interest lies around the development and evaluation of privacy-preserving technologies, joins us to discuss his recent publication titled, COOKIEGRAPH: Measuring and Countering First-Party Tracking Cookies.

Shaoor began the conversation with an overview of privacy-preserving technologies.

According to him, the field of privacy-preserving technologies is ever evolving and there is a lot more awareness about it today than ever before.

Shaoor discussed the reaction of advertisers to this development, one of which includes migrating to first-party cookies.

Shaoor discussed the model prediction accuracy.

2 months ago @ dataskeptic.com
The Harms of Targeted Weight Loss Ads

Today, we are joined by Liza Gak, a Ph.D. student at UC Berkeley.

Liza’s research centers on human-computer interaction (HCI), social computing, and how people are harmed online.

Liza explained how she grouped and coded the qualitative data using the inductive iterative approach.

She spoke about her findings, reiterating how weight loss ads target the vulnerable.

She also explained how ad distribution platforms can play a role in ameliorating the harm ads cause to users.

2 months, 1 week ago @ dataskeptic.com
Podcast Advertising

Today, we are joined by Rob Walch, the VP of Podcast Relations at Libsyn.

Libsyn is a popular podcast hosting platform, where the Data Skeptic podcast is hosted.

He then explained how podcasters can monetize their podcasts using host-read ads.

Rob then discussed how to engage the podcast audience using surveys.

He also explained why iOS has 5 times more podcast listeners than Android even though there are 5 times more Android phones than iPhones.

2 months, 2 weeks ago @ dataskeptic.com
Fairness in e-Commerce Search

On the show today, we are joined by Abhisek Dash and Saptarshi Ghosh.

Fairness and Interpretability Issues in E-commerce Search through Smart Speakers.

Abhisek started by giving some background on what fairness in machine learning is.

He also explained what it means to audit machine learning systems.

They both discussed some concerning discrepancies in search results between Amazon smart speakers and the desktop website.

2 months, 3 weeks ago @ dataskeptic.com
SuperDataScience
last post 2 days, 5 hours ago
631: Data Analytics Career Orientation

Interview success, funny memes about data, and stakeholder management: Jon Krohn speaks with Luke Barousse, a full-time YouTuber who produces content to help aspiring data scientists.

First, Jon and his guest go underwat…

2 days, 5 hours ago @ soundcloud.com
630: Resilient Machine Learning

Jon Krohn sits with Dr. Dan Shiebler at the Open Data Science Conference (ODSC) to dive into the critical components of building resilient machine learning.

Additional materials: www.superdatascience.com/630 Interested…

6 days, 5 hours ago @ soundcloud.com
629: Software for Efficient Data Science

Has the term developer advocacy ever left you scratching your head?

This week data science developer advocate for JetBrains, Dr. Jodie Burchell, joins Jon Krohn to shed light on her responsibilities and why it's a role y…

1 week, 2 days ago @ soundcloud.com
628: The Critical Human Element of Successful A.I. Deployments

On this episode of Five-Minute Friday, Jon Krohn speaks from the Open Data Science Conference (ODSC).

There, he sits down with author and data scientist Keith McCormick to discuss the conference’s key trend: learning the…

1 week, 6 days ago @ soundcloud.com
627: AutoML: Automated Machine Learning

Jon Krohn speaks with Erin LeDell, H2O.ai’s Chief Machine Learning Scientist.

They investigate how AutoML supercharges the data science process, the importance of admissible machine learning for an equitable data-driven …

2 weeks, 2 days ago @ soundcloud.com
Subword Tokenization with Byte-Pair Encoding | SDS 626

Word tokenization, character tokenization and subword tokenization go head-to-head this week as Jon Krohn delivers a mini-bootcamp on the NLP-related process.

Additional materials: www.superdatascience.com/626 Interest…

2 weeks, 6 days ago @ soundcloud.com
Analyzing Blockchain Data and Cryptocurrencies | SDS 625

Chainalysis' Director of Research, Kim Grauer, joins Jon Krohn to explore the state of economic-data analysis on the blockchain.

This episode is brought to you by Datalore (datalore.online/SDS), the collaborative data sc…

3 weeks, 2 days ago @ soundcloud.com
Imagen Video: Incredible Text-to-Video Generation | SDS 624

On this week’s Five-Minute Friday, Jon Krohn investigates Imagen Video, Google’s latest model for making video art out of text prompts.

Recently published, this text-to-video converter now competes against already strong…

3 weeks, 6 days ago @ soundcloud.com
Data Analyst, Data Scientist, and Data Engineer Career Paths | SDS 623

Jon Krohn speaks with Shashank Kalanithi, the man who makes a sport out of YouTube and data analytics out of sports.

Listen in as he talks about how he got started producing YouTube videos on data science, the essential …

1 month ago @ soundcloud.com
Burnout: Causes and Solutions | SDS 622

Is burnout on the horizon for you and your team?

Christina Maslach, author of the new book "The Burnout Challenge," joins Jon Krohn to help us identify the common signs of looming burnout while steering us in a healthier…

1 month ago @ soundcloud.com
Blockchains and Cryptocurrencies: Analytics and Data Applications | SDS 621

Cryptocurrency and blockchain take center stage this week as we welcome Chief Economist at Chainalysis, Philip Gradwell, to discuss the data science applications in this exciting field.

This episode is brought to you by…

1 month, 1 week ago @ soundcloud.com
OpenAI Whisper: General-Purpose Speech Recognition | SDS 620

What’s your secret to superb audio recognition?

Whisper it.

We mean that literally—Whisper is the latest in OpenAI’s growing suite of models aimed to benefit humanity.

On this episode of Five-Minute Friday, host Jon Kroh…

1 month, 1 week ago @ soundcloud.com
Tools for Deploying Data Models into Production | SDS 619

Jon Krohn speaks with Erik Bernhardsson, the man who invented Spotify’s original music recommendation system.

They address the different ways to interview a data science candidate, how to deploy a data model into the clo…

1 month, 2 weeks ago @ soundcloud.com
The Joy of Atelic Activities | SDS 618

Telic and atelic activities take center stage this week as Jon Krohn contemplates how our daily actions contribute to our overall sense of fulfillment.

Additional materials: www.superdatascience.com/618 Interested in s…

1 month, 2 weeks ago @ soundcloud.com
Causal Modeling and Sequence Data | SDS 617

Dr. Sean Taylor, Co-Founder and Chief Scientist of Motif Analytics, joins Jon Krohn this week for yet another perspective on causal modeling.

Tune in for a great conversation that covers large-scale causal experimentatio…

1 month, 3 weeks ago @ soundcloud.com
Data Science at Home
last post 1 week, 2 days ago
Autonomous cars cannot drive. Here is why. (Ep. 210)

If you think that the problem of self-driving cars has been solved, think twice.

As a matter of fact, the problem of self-driving cars cannot be solved with the technical solutions that companies are currently considering.

Whoever tells you they have solved the problem of driving a vehicle fully autonomously is lying.

Check it out at https://arcticwolf.com/datascience

Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve.

We provide solutions in AI/ML, Fintech, Healthcare/RWE, and Predictive maintenance.

1 week, 2 days ago @ datascienceathome.com
Evolution of data platforms (Ep. 209)

Let’s look at the history of data platforms.

Shall I switch to the latest architecture?

Our Sponsors

Explore the Complex World of Regulations.

Check it out at https://arcticwolf.com/datascience

Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve.

We provide solutions in AI/ML, Fintech, Healthcare/RWE, and Predictive maintenance.

3 weeks ago @ datascienceathome.com
[RB] Is studying AI in academia a waste of time? (Ep. 208)

Companies and other business entities are actively involved in defining data products and applied research every year.

Academia has always played a role in creating new methods and solutions/algorithms in the fields of machine learning and artificial intelligence.

Is studying AI in academia a waste of time?

Our Sponsors

Ready to advance your career in data science?

University of Cincinnati Online offers nationally recognized educational programs in business analytics and information systems.

4 weeks ago @ datascienceathome.com
Private machine learning done right (Ep. 207)

There are many solutions to private machine learning.

I am with Daniel Huynh, CEO of Mithril Security, a graduate from Ecole Polytechnique with a specialisation in AI and data science.

He has written articles on homomorphic encryption, including the CKKS Explained series (https://blog.openmined.org/ckks-explained-part-1-simple-encoding-and-decoding/).

He is now focusing on Confidential Computing at Mithril Security and has written extensive articles on the topic: https://blog.mithrilsecurity.io/.

In this show we speak about confidential computing, SGX, and private machine learning.

References

1 month, 1 week ago @ datascienceathome.com
Edge AI for applications in military and space (Ep. 206)

Our Sponsors

Ready to advance your career in data science?

The University of Cincinnati Online offers nationally recognized educational programs in business analytics and information systems.

Predictive Analytics Today named UC the No.1 MS Data Science school in the country, and UC is nationally recognized, with a proven track record of placing students at high-profile companies such as Google, Amazon, and P&G.

Discover more about the University of Cincinnati’s 100% online master’s degree programs at online.uc.edu/obais

Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve.

We provide AI/ML, Fint…

1 month, 2 weeks ago @ datascienceathome.com
[RB] What are generalist agents and why they can change the AI game (Ep. 205)

That deep learning alone is not sufficient to achieve artificial general intelligence is an increasingly accepted statement.

Generalist agents have great properties that can overcome some of the limitations of single-task deep learning models.

Be aware we are still far from AGI, though.

So what are generalist agents?

References

https://arxiv.org/pdf/2205.06175

1 month, 2 weeks ago @ datascienceathome.com
LIDAR, cameras and autonomous vehicles (Ep. 204)

How does an autonomous vehicle see?

In this episode I speak about LIDAR, high resolution cameras and some machine learning methods adapted to a minimal number of sensors.

Our Sponsors

Ready to advance your career in data science?

The University of Cincinnati Online offers nationally recognized educational programs in business analytics and information systems.

Predictive Analytics Today named UC the No.1 MS Data Science school in the country, and UC is nationally recognized, with a proven track record of placing students at high-profile companies such as Google, Amazon, and P&G.

2 months ago @ datascienceathome.com
Predicting Out Of Memory Kill events with Machine Learning (Ep. 203)

Can we use machine learning to predict and eventually detect out of memory kills from the operating system?

This time we have something for you: if you want to help us shape the data science leaders of the future, we have created the Data Science at Home Ambassador program.

Ambassadors are volunteers who are passionate about data science and want to give back to our growing community of data science professionals and enthusiasts.

Regardless of your application, whether it is a video streaming application or any other communication type of application, or a fintech application, or energy, or whatever, this memo…

2 months, 1 week ago @ datascienceathome.com
Is studying AI in academia a waste of time? (Ep. 202)

Companies and other business entities are actively involved in defining data products and applied research every year.

Academia has always played a role in creating new methods and solutions/algorithms in the fields of machine learning and artificial intelligence.

However, there is doubt about how powerful and effective such research efforts are.

Is studying AI in academia a waste of time?

Check it out at https://arcticwolf.com/datascience

Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve.

2 months, 2 weeks ago @ datascienceathome.com
Zero-Cost Proxies: How to find the best neural network without training (Ep. 201)

Neural networks are becoming massive monsters that are hard to train (without the “regular” 12 last-generation GPUs).

Is there a way to skip that?

Let me introduce you to Zero-Cost proxies.

References

2 months, 3 weeks ago @ datascienceathome.com
Online learning is better than batch, right? Wrong! (Ep. 200)

In this episode, I speak about online machine learning systems and why blindly choosing such a paradigm can lead to unpredictable and expensive outcomes.

Also, in this episode, I have to deal with an intruder 🙂

Links

Birman, K.; Joseph, T. (1987). “Exploiting virtual synchrony in distributed systems”. Proceedings of the Eleventh ACM Symposium on Operating Systems Principles – SOSP ’87. S2CID 7739589.

2 months, 3 weeks ago @ datascienceathome.com
What are generalist agents and why they can change the AI game (Ep. 199)

June 3, 2022 podcast

That deep learning alone is not sufficient to achieve artificial general intelligence is an increasingly accepted statement.

Generalist agents have great properties that can overcome some of the limitations of single-task deep learning models.

Be aware, we are still far from AGI, though.

So what are generalist agents?

References

https://arxiv.org/pdf/2205.06175

6 months ago @ datascienceathome.com
Streaming data with ease. With Chip Kent from Deephaven Data Labs (Ep. 198)

6 months, 1 week ago @ datascienceathome.com
Learning from data to create personalized experiences with Matt Swalley from Omneky (Ep. 197)

May 16, 2022 podcast

In this episode I speak with Matt Swalley, Chief Business Officer of Omneky, an AI platform that generates, analyzes and optimizes personalized ad creatives at scale.

We speak about the way AI is used for generating customized recommendations and creating experiences with data aggregation and analytics.

respecting the privacy of individuals.

Links

Grow your business with personalized ads: https://www.omneky.com/

Data Science at Home Podcast (Live): https://www.twitch.tv/datascienceathome

6 months, 2 weeks ago @ datascienceathome.com
State of Artificial Intelligence 2022 (Ep. 196)

May 16, 2022 podcast

Let’s take a break and think about the state of AI in 2022.

In this episode I summarize the long report from the Stanford Institute for Human-Centered Artificial Intelligence (HAI).

Enjoy!

If you want a new interactive experience, I am scheduling hands-on sessions on Twitch.

Feel free to drop by when there is a live session and interact with me.

I’ll see you there!

References

https://spectrum.ieee.org/artificial-intelligence-index

https://www.twitch.tv/datascienceathome

6 months, 2 weeks ago @ datascienceathome.com