Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
last post 1 hour ago
Information Theory

Information theory is concerned with representing data in a compact fashion (a task known as data compression or source coding), as well as with transmitting and storing it in a way that is robust to errors (a task known as error correction or channel coding). Calculate the Information for an Event: The basic intuition behind information theory is that learning that an unlikely event has occurred is more informative than learning that a likely event has occurred. Low probability event: high information (a meteor strike, a bacterial outbreak (surprising), or sometimes a top-performing team losing to an average team). Each of these scenarios will occur only when there are a lot of things …
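A minimal sketch of that intuition in code: the self-information of an event is the negative log of its probability, so rarer events carry more bits (the probabilities below are made up for illustration).

```python
import math

def information_content(p: float) -> float:
    """Self-information of an event with probability p, in bits."""
    return -math.log2(p)

# A likely event is barely informative; an unlikely one is highly informative.
print(information_content(0.9))    # ~0.15 bits
print(information_content(0.001))  # ~9.97 bits
```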

1 hour ago @ reddit.com
[D] AAAI 2022 Paper Results

The AAAI 2022 outcomes have been released; how were these outcomes for your papers? submitted by /u/Suspicious-Flight400 [link] [comments]

2 hours ago @ reddit.com
[Discussion] Teammates for kaggle competition

Hi everyone! I am looking for teammates to participate in the kaggle competition: https://www.kaggle.com/c/jigsaw-toxic-severity-rating/overview Please DM if Interested! submitted by /u/notme145 [link] [comments]

3 hours ago @ reddit.com
[D] Is biggan limited to inputs from ImageNet only?

For example, 'tree' is rejected as an input. submitted by /u/Previous-Muffin689 [link] [comments]

3 hours ago @ reddit.com
[R] Patrickstar: A Pytorch Based Training Framework That Can Train Faster And Larger PTMs

github: https://github.com/Tencent/PatrickStar paper: https://arxiv.org/abs/2108.05818 Last month, our team at Tencent open sourced PatrickStar, a PyTorch-based PTM (pretrained model) training framework that can train larger models and achieve better performance in the same hardware environment compared to state-of-the-art training frameworks like DeepSpeed. (Version 0.4.1 performance on an 8xV100 node.) The key idea of the project is a chunk-based memory management strategy: group parameters into chunks of the same size and dynamically move the chunks between CPU and GPU. Compared to DeepSpeed, which determines which parts of the model need to be put in CPU memory before training, the dynamic me…
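As a rough illustration of the chunk idea described above (a simplified sketch, not PatrickStar's actual code; the chunk size and movement policy are placeholders):

```python
import torch

class Chunk:
    """A fixed-size block of parameters that can be staged on CPU or GPU."""
    def __init__(self, numel: int, dtype=torch.float32):
        # Pinned CPU memory makes host<->device copies faster and async-friendly.
        self.data = torch.empty(numel, dtype=dtype, pin_memory=True)
        self.device = "cpu"

    def to_gpu(self):
        if self.device == "cpu":
            self.data = self.data.to("cuda", non_blocking=True)
            self.device = "cuda"

    def to_cpu(self):
        if self.device == "cuda":
            self.data = self.data.cpu().pin_memory()
            self.device = "cpu"

# Idea: before a layer runs, move only the chunks it needs onto the GPU,
# then evict them back to CPU afterwards to keep peak VRAM bounded.
```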

3 hours ago @ reddit.com
[D] Should one migrate from Data Engineering to ML Engineer roles?

Hello all! I'm a Data Engineer with close to 3 years of experience. I recently joined a Fortune 100 organization, and the possibilities here to learn and work with data at scale are limitless. Thanks to my previous organization, I picked up some software engineering skills as well, so I'm a great addition to any software team due to my hands-on skills in different areas. Having reached this point, I am seeing an inevitable rise of the Machine Learning Engineer, maybe in a few years if not right away. For a Data Engineer like me, what would be your advice on switching to ML Engineer, and what roadmap would you recommend? All answers are appreciated. Thanks :) submitted by /u/RobotsMakingDubstep [link] [c…

3 hours ago @ reddit.com
[P] Survey project for school about the ethics of machine learning

Hello everyone! This survey is for an in-depth project for my school. I need to do this work to complete my apprenticeship. The work and this survey are about the ethics of machine learning. You should be able to complete my survey in less than ten minutes. Also, the survey does not require any registration or disclosing personally identifiable information (PII), and no question is mandatory. I’d appreciate it if you could kindly take the time to answer the survey. Survey Link: https://forms.gle/bqh8af8vK9as13UD6 submitted by /u/XDeadScorpion [link] [comments]

6 hours ago @ reddit.com
[D] What are some ways to reduce model GPU VRAM usage?

Hey guys, My GAN models take 10GB VRAM when I try to generate a 512x512 image. We need to scale stuff up and I plan on productionizing this on a kubernetes cluster but want to optimize them first to save on costs. Ideally I want to run multiple inferences on the same machine without facing a CUDA OUT OF MEMORY ERROR. What are some good ways to optimize VRAM usage? submitted by /u/cbsudux [link] [comments]
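Two common first steps that often cut inference VRAM substantially are disabling autograd state and running the generator in half precision. A hedged sketch (the tiny stand-in generator and 512-dim latent are placeholders for the poster's actual GAN):

```python
import torch
import torch.nn as nn

# Stand-in for a real GAN generator; replace with the actual model.
generator = nn.Sequential(nn.Linear(512, 3 * 64 * 64))
generator = generator.half().cuda().eval()   # fp16 weights roughly halve VRAM

with torch.inference_mode():                 # no gradients or autograd buffers
    z = torch.randn(1, 512, device="cuda", dtype=torch.float16)
    image = generator(z).view(1, 3, 64, 64)
```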

6 hours ago @ reddit.com
[P] How to recap a whole TV series and movies using artificial intelligence?

I am working on a college project where I decided I would do something like this, but I have only a vague idea of how to do it. I have just started taking Andrew Ng's AI/ML classes and I can't figure out how to do this. Can anyone help? Please. submitted by /u/youvcan [link] [comments]

6 hours ago @ reddit.com
[R] Sparse is Enough in Scaling Transformers

submitted by /u/downtownslim [link] [comments]

7 hours ago @ reddit.com
[D] Is the "true few-shot" setting described in recent papers reasonable or am I not understanding the concept properly?

I've been coming across a few papers in few-shot learning that claim to operate under the paradigm of "true few-shot learning," inspired by the paper True Few-Shot Learning with Language Models (Perez et al., 2021). The basic claim is that the current N-way K-shot setting that's widely used in few-shot tasks is unrealistic, because the models still have access to many data points in other classes. Not to mention that these models also usually have access to a development set to tune their hyperparameters. The authors claim that these assumptions are unrealistic on the grounds that assuming access to other classes' data points is unrealistic and that if we have access to a dev set the…

7 hours ago @ reddit.com
[P] Is there an AutoML commercial solution for joint image and text classification?

I have a set of images with title captions. I'd like to do a joint classification task using both inputs but I haven't seen a commercial autoML solution to train it. Does anyone know of any solution? submitted by /u/somethingstrang [link] [comments]

9 hours ago @ reddit.com
[D] MacBook or PC for work computer

I’ve noticed a lot of developers at my work use and prefer MacBooks. What do you prefer? submitted by /u/MGeeeeeezy [link] [comments]

9 hours ago @ reddit.com
[D] Are you seeing any compelling use cases of semantic search being leveraged at scale?

In my org, we are heavily reliant on Elasticsearch, and I'm currently investigating the merit of incorporating a semantic search component into our search pipeline. Like most, I've built prototypes leveraging FAISS on our data, and while the results are impressive, I haven't come across any compelling use cases where teams are putting this to use at scale. I want to note that I don't come from a research background. submitted by /u/madzthakz [link] [comments]
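For reference, the kind of FAISS prototype mentioned above can be as small as the sketch below; the 384-dimensional random vectors stand in for real document embeddings:

```python
import numpy as np
import faiss

dim = 384                                    # embedding size (assumed)
doc_vecs = np.random.rand(10_000, dim).astype("float32")
faiss.normalize_L2(doc_vecs)                 # normalize so inner product = cosine

index = faiss.IndexFlatIP(dim)               # exact inner-product index
index.add(doc_vecs)

query = np.random.rand(1, dim).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)         # top-5 nearest documents
```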

10 hours ago @ reddit.com
[Discussion] Is there anything similar to Google Pathways out here currently?

So as many probably know Google is creating Pathways which should be a model that can do many things, but they haven't released any details or a timeline. I'm wondering if there are other generalizable learners to use as of now. submitted by /u/jmoincali [link] [comments]

14 hours ago @ reddit.com
Towards Data Science
last post 14 hours ago
Complete Guide to Perform Classification of Tweets with SpaCy
14 hours ago @ towardsdatascience.com
Why I Chose the MacBook Air over the MacBook Pro as a Data Scientist
15 hours ago @ towardsdatascience.com
Enhance Your Plotly Express Scatter Plot With Marginal Plots
15 hours ago @ towardsdatascience.com
Solutions against overfitting for machine learning on tabular data
15 hours ago @ towardsdatascience.com
Five Unexpected Behaviours of Python Could Be Surprised
15 hours ago @ towardsdatascience.com
Exploring stacks and queues
15 hours ago @ towardsdatascience.com
My First Six Months as a Data Scientist
15 hours ago @ towardsdatascience.com
Data Visualization Before Machine Learning
15 hours ago @ towardsdatascience.com
Four Deep Learning Papers to Read in December 2021
18 hours ago @ towardsdatascience.com
How to Benefit from the Semi-Supervised Learning with Label Spreading Algorithm
18 hours ago @ towardsdatascience.com
Applying of Reinforcement Learning for Self-Driving Cars
1 day, 18 hours ago @ towardsdatascience.com
Introduction to Applied Linear Algebra: Vectors
1 day, 18 hours ago @ towardsdatascience.com
The Poisson Hidden Markov Model for Time Series Regression
1 day, 19 hours ago @ towardsdatascience.com
4 Python Pandas Functions That Serve Better With Dictionaries
1 day, 19 hours ago @ towardsdatascience.com
Three R Libraries Every Data Scientist Should Know (Even if You Use Python)
1 day, 20 hours ago @ towardsdatascience.com
Distill.pub
last post 2 months, 3 weeks ago
Understanding Convolutions on Graphs

Understanding the building blocks and design choices of graph neural networks.

2 months, 3 weeks ago @ distill.pub
A Gentle Introduction to Graph Neural Networks

What components are needed for building learning algorithms that leverage the structure and properties of graphs?

2 months, 3 weeks ago @ distill.pub
Distill Hiatus

After five years, Distill will be taking a break.

4 months, 4 weeks ago @ distill.pub
Adversarial Reprogramming of Neural Cellular Automata

Reprogramming Neural CA to exhibit novel behaviour, using adversarial attacks.

6 months, 3 weeks ago @ distill.pub
Weight Banding

Weights in the final layer of common visual models appear as horizontal bands. We investigate how and why.

7 months, 3 weeks ago @ distill.pub
Branch Specialization

When a neural network layer is divided into multiple branches, neurons self-organize into coherent groupings.

7 months, 3 weeks ago @ distill.pub
Multimodal Neurons in Artificial Neural Networks

We report the existence of multimodal neurons in artificial neural networks, similar to those found in the human brain.

8 months, 4 weeks ago @ distill.pub
Self-Organising Textures

Neural Cellular Automata learn to generate textures, exhibiting surprising properties.

9 months, 2 weeks ago @ distill.pub
Visualizing Weights

We present techniques for visualizing, contextualizing, and understanding neural network weights.

9 months, 3 weeks ago @ distill.pub
The Gradient
last post 2 weeks, 2 days ago
Explain Yourself - A Primer on ML Interpretability & Explainability

Fig 2: A collection of outputs from different post-hoc interpretability techniques.

It becomes possible to evaluate different post-hoc interpretability techniques on the same underlying model, giving us more nuanced insights.

The same will be attested to by interpretability methods, as the target output does get influenced by perturbing these attributes.

Citation: For attribution in academic contexts or books, please cite this work as Nirmal Sobha Kartha, "Explain Yourself - A Primer on ML Interpretability & Explainability", The Gradient, 2021.

BibTeX citation:@article{kartha2021explain,author = {Kartha, Nirmal Sobha},title = {Explain Yourself - A Primer on ML Interpretability & Explaina…

2 weeks, 2 days ago @ thegradient.pub
Strong AI Requires Autonomous Building of Composable Models

Why models must be learned and dynamically composed: The models that AI will use to reason, act, and communicate must be learned.

We can divide current models in AI into two classes: neural-network models and representation-based models.

Researchers often divide models into subsymbolic and symbolic models, but this work focuses on model composability, so dividing models into neural-network models and representation-based models is helpful, as we will see.

Citation: For attribution in academic contexts or books, please cite this work as Jonathan Mugan, "Strong AI Requires Autonomous Building of Composable Models", The Gradient, 2021.

BibTeX citation:@article{mugan2021strong,author = {Mugan, Jonat…

4 weeks, 1 day ago @ thegradient.pub
Reflections on Foundation Models

Given this potential, we see foundation models as the subject of a growing paradigm shift, where many AI systems across domains will directly build upon or heavily integrate foundation models.

Foundation models are not just large language models; foundation models can also be trained using images, video, and other sensory and knowledge base data—and some already are.

People create the data that underpins foundation models, develop foundation models, adapt foundation models for specific applications, and interact with the resulting applications.

Foundation models are a strict superset of LLMs, though the most salient foundation models currently are LLMs (e.g., GPT-3).

Akin to how deep learni…

1 month, 1 week ago @ thegradient.pub
The Imperative for Sustainable AI Systems

Sustainable AI can be thought of as another dimension along which we can guide the development of AI systems in addition to typical functional and business requirements.

Share the idea of sustainable AI widely: We can begin first by sharing this idea of sustainable AI with our own communities of research and practice.

This will create a virtuous cycle of practice and reward enabling a transition into “Green AI”, AI systems that are built and used with the awareness of their carbon impact, as opposed to “Red AI”, AI systems that are built and used with only performance goals in mind, which is currently favored and supported in the research and practice ecosystem.

Instrument your AI systems to…

2 months, 1 week ago @ thegradient.pub
Has AI found a new Foundation?

They coined a new term, “Foundation Models”, to characterize the new paradigm, joined forces in a “Center for Research on Foundation Models”, and published the massive 212-page report “On the Opportunities and Risks of Foundation Models.” Although the term is new, the general approach is not.

At the Workshop on Foundation Models, Jitendra Malik, a renowned expert in computer vision at Berkeley, said, “I am going to take a ... strongly critical role, when we talk about them as the foundation of AI ...

As Georgia Tech professor Mark Riedl wrote on Twitter “Branding very large pre-trained neural language models as “foundation” models is a brilliant … PR stunt.

The report says, unironically, “we …

2 months, 2 weeks ago @ thegradient.pub
Test

An Introduction to AI Story Generation
It’s All Training Data: Using Lessons from Machine Learning to Retrain Your Mind

2 months, 3 weeks ago @ thegradient.pub
An Introduction to AI Story Generation

Some events are goal events and some events are sub-goal events that are necessary steps in achieving a final goal event.

These sub-goal events precede the goal event and are also related to the goal event as part of a hierarchy.

Training a language model on a corpus of stories means the language model will attempt to emulate what it has learned from a corpus.

They train an explanation generation language model to answer the question and then make the resulting explanation the previous segment of the story.

Citation: For attribution in academic contexts or books, please cite this work as Mark Riedl, "An Introduction to AI Story Generation", The Gradient, 2021.

3 months, 1 week ago @ thegradient.pub
Systems for Machine Learning

Machine learning’s increasing importance to real-world applications brought awareness of a new field focused on ML in practice - machine learning systems (or, as some call it, MLOps).

This field acts as a bridging point between the domains of computer systems and machine learning, considering the new challenges of machine learning with a lens shaped by traditional systems research.

By identifying a shared problem in both systems and ML - combining data sources - we can apply traditional systems techniques to a machine learning setting.

Of course, there are some differences, reflecting how machine learning systems differ from the traditional paradigm.

Systems research is filling this need, b…

3 months, 2 weeks ago @ thegradient.pub
Machine Learning Won't Solve Natural Language Understanding

Language Understanding: While NLP (Natural Language Processing) and NLU (Natural Language Understanding) are often used interchangeably, there is a substantial difference between the two and it is crucial to highlight this difference.

Thus, machine learning and language understanding are incompatible – in fact, they are contradictory.

Natural language is rampant with intensional phenomena, since objects of thoughts — that language conveys — have an intensional aspect that cannot be ignored.

Citation: For attribution in academic contexts or books, please cite this work as Walid Saba, "Machine Learning Won't Solve Natural Language Understanding", The Gradient, 2021.

BibTeX citation:@article{saba20…

3 months, 3 weeks ago @ thegradient.pub
Machine Translation Shifts Power

Origins of Machine Translation: The first machine translation efforts in the United States were spurred by the Cold War.

However, the United States government remained a faithful consumer of machine translation technology; in Tom Pedtke’s 1997 keynote address at the sixth Machine Translation Summit, he reflects on several key developments in the 1990s fostered by government demand.

This is not to say that machine translation is epistemologically bereft; the parallel text basis of contemporary machine translation paradigms aligns with Quine’s pragmatic, behaviorist approach to translation.

Translation remains a political act, and data-driven machine translation developments, largely centered i…

4 months ago @ thegradient.pub
It’s All Training Data: Using Lessons from Machine Learning to Retrain Your Mind

As machine learning scientists, we know that training data can make or break your model.

So I began research on the topic of using personal data for machine learning education.

I’m currently working on a book called Life Lessons from Algorithms , a personal history of resilience demonstrated through different machine learning algorithms.

Author BioYim is a PhD student at the University of Washington, researching innovative ways to teach machine learning.

Citation: For attribution in academic contexts or books, please cite this work as Yim Lucky Register, "It’s All Training Data: Using Lessons from Machine Learning to Retrain Your Mind", The Gradient, 2021.

4 months, 1 week ago @ thegradient.pub
Justitia ex Machina: The Case for Automating Morals

In the following post, we will discuss two complex and related issues regarding these models: fairness and transparency.

These are crucial questions to ask if machine learning models play important parts in our lives and our societies.

However, all hope is not lost as ensuring fairness in machine learning models is an active area of research.

Explanations produced with the LIME algorithm. Ensuring that machine learning models are explainable is an active area of research as well.

Citation: For attribution in academic contexts or books, please cite this work as Rasmus Berg Palm and Pola Schwöbel, "Justitia ex Machina: The Case for Automating Morals", The Gradient, 2021.

4 months, 2 weeks ago @ thegradient.pub
Prompting: Better Ways of Using Language Models for NLP Tasks

Several recent papers follow this line of approaches by adjusting the objective (Tam et al., 2021) or formulating tasks in a unified form, like question answering (Zhong et al., 2021) or textual entailment (Wang et al., 2021).

Follow-up works further refine the way of using demonstrations: Gao et al., 2021; Liu et al.

It is a good few-shot model as long as the design can generalize well to other few-shot tasks (these tasks can be so-called “true few-shot”).

Introducing LM-BFF: Finally, let me introduce our ACL’21 paper, "Making Pre-trained Language Models Better Few-shot Learners", abbreviated as LM-BFF (better few-shot fine-tuning of language models; alternatively, language models' best f…

4 months, 4 weeks ago @ thegradient.pub
How to Do Multi-Task Learning Intelligently

proposed, “AdaShare: Learning What to Share for Efficient Deep Multi-Task Learning” (2020) [2].

Overview of “AdaShare: Learning What to Share for Efficient Deep Multi-Task Learning” by X.

Adashare: Learning what to share for efficient deep multi-task learning.

His research interests lie in unsupervised deep learning, few-shot learning, and multi-task learning in computer vision.

He is doing research related to learning efficiently including topics from Multi-Task Learning and Uncertainty Estimation.

5 months, 1 week ago @ thegradient.pub
Are Self-Driving Cars Really Safer Than Human Drivers?

And if ADSs cannot perform commonsense reasoning to handle all these edge cases, are they really safer than human drivers?

Why Testing is NecessaryWe have some good reasons to believe that ADSs will be safer than human drivers, and we have some good reasons to worry that ADSs will not be as safe as human drivers.

There are good reasons to assume that ADSs will be safer than human drivers.

Citation: For attribution in academic contexts or books, please cite this work as: Steve Shwartz, "Are Self-Driving Cars Really Safer Than Human Drivers?

BibTeX citation:@article{shwartzadss2021,author = {Shwartz, Steve},title = {Are Self-Driving Cars Really Safer Than Human Drivers?

5 months, 2 weeks ago @ thegradient.pub
TheSequence
last post 1 day ago
👨🏼‍🎓👩🏽‍🎓 The Standard for Scalable Deep Learning Models

📝 Editorial: Large deep learning models seem to be the norm these days.

When applied to deep learning models, MoE has proven to scale sublinearly with respect to the number of parameters, making it the only viable option for scaling deep learning models to trillions of parameters.

Facebook AI Research (FAIR) recently launched fairseq for using MoE in language models.

Little by little, MoE is becoming the gold standard of large deep learning models.

Edge#146: we deep dive into Arize AI ML observability platform.

1 day ago @ thesequence.substack.com
▪️▫️▪️▫️ Edge#144: How Many AI Neurons Does It Take to Simulate a Brain Neuron?

In today’s Edge, we’d like to offer you a short but insightful overview of a fascinating paper that forces us to rethink the analogies between biological and DNN neurons; you’ll also find the list of our recent recaps to catch up with during the holidays.

🍁 Happy Thanksgiving!

🍁We are grateful for your support.

Spread the word and consider giving a gi…

3 days, 23 hours ago @ thesequence.substack.com
🎙 TheSequence Chat: a year of interesting conversations

One year ago, we started TheSequence Chat – a series of interviews with ML practitioners.

Many times, it's hard to associate a specific piece of ML research or technology with the creators behind the scene.

However, learning about the experience gained by researchers, engineers and entrepreneurs doing real work can result in a great source of knowledge and inspiration.

During this year, we’ve published 25 conversations that were read 578,328 times, by at least 20,000 people each.

We always appreciate the feedback 🍁

4 days, 23 hours ago @ thesequence.substack.com
🏗 Edge#143: Feature Stores in ML Pipelines: A Recap

This holiday week, let’s have some time to catch up with what we’ve covered before.

As part of our current MLOps series, we offer you the recap of articles dedicated to feature stores.

Why did they become the crucial part of MLOps stack?

Please forward this email to those who might benefit from reading it or give a gift subscription.

🏗🏪 What is a Featur…

5 days, 23 hours ago @ thesequence.substack.com
♟♟ Chess Learning Explainability

📝 Editorial: Chess has long been considered a solved problem in machine learning (ML), as many chess programs have achieved superhuman performance.

One of those new areas of contributions has to do with understanding how deep learning models build knowledge representations in complex scenarios such as chess.

That approach was challenged with recent chess models like DeepMind’s AlphaZero that mastered chess by simply playing games.

AlphaZero quickly became the strongest chess engine in the world and also discovered a whole set of new lines in chess openings that challenged conventional wisdom.

In a paper released this week, DeepMind and Google Brain collaborated with former chess world champion Vla…

1 week, 1 day ago @ thesequence.substack.com
🤖Edge#142: How Microsoft Built a 530 Billion Parameter Model

💥 What’s New in AI: How Microsoft built Megatron-Turing NLG, one of the largest language models in history. Another month and another big transformer model becomes available.

In collaboration with NVIDIA, the Redmond giant announced a 530 billion parameter model called Megatron-Turing Natural Language Generation (MT-NLG).

The model is a successor of Turing-NLG, which, a few months ago, was considered the biggest language model in the world.

Large pretrained language models are always impressive, but they have become sort of the norm in the NLP space.

In that sense, it’s worth looking into the aspects of this model that are uniquely differentiated compared to alternatives.

1 week, 3 days ago @ thesequence.substack.com
☝️ CoreWeave is allocating $50 million to scale the best compute infrastructure for ML*

Founders, data scientists, engineers, and DevOps teams are able to reduce inference latency by 50%, serve requests 300% faster, and lower their costs by 75%.

READ MOREHow AI Dungeon Improved 3 Critical KPIsWith a seamless migration to CoreWeave, AI Dungeon was able to cut spin-up times by 4X, reduce inference latency by 50%, and lower computing costs by 76%.

READ MOREAbout CoreWeaveCoreWeave is a specialized cloud provider delivering access to a massive scale of accelerated compute on top of the industry's fastest and most flexible infrastructure for HPC workloads.

As NVIDIA's first Elite CSP Partner for compute, ML clients find that we're up to 50-80% less expensive AND 35X faster than com…

1 week, 4 days ago @ thesequence.substack.com
🩺 Edge#141: MLOps – Model Monitoring

In this issue: we discuss Model Monitoring; we explore Google’s research paper about the building blocks of interpretability; we overview a few ML monitoring platforms: Arize AI, Fiddler, WhyLabs, Neptune AI.

💡 ML Concept of the Day: Model Monitoring. As the first topic of our series about MLOps, we would like to focus on ML monitoring.

Considered by many the cornerstone of MLOps, model monitoring is one of the essential building blocks of any ML pipeline.

In some ways, I like to think about ML monitoring as the next phase of the application performance monitoring (APM) space that has accompanied the evolution of software technology trends.

Part of what makes ML monitoring unique is the model-…

1 week, 6 days ago @ thesequence.substack.com
🐉 NVIDIA's ML Software Moment

NVIDIA has been driving innovation in AI chip design for different types of deep learning models, making it a central component of most modern ML infrastructures.

However, NVIDIA's AI influence is rapidly expanding beyond hardware into providing end-to-end solutions for ML applications.

This week, NVIDIA hosted its annual GTC conference, where ML software was at the center of the agenda.

NVIDIA dazzled data scientists with new ML software releases such as the RIVA speech SDK, the massively large Megatron language pretrained model, the Nsight Deep Learning Designer, and the new Data Science Workbench.

Both are available to NVIDIA enterprise customers →read more on NVIDIA blogNsight Deep Lear…

2 weeks, 1 day ago @ thesequence.substack.com
🌟Share your opinion on MLOps and get rewarded*

As data scientists, ML engineers and others with a stake in the future of the field, you know that getting models from research to production is hard.

Any insight from peers helps, especially early in a career or machine learning initiative.

With that in mind, we’re asking you to share your perspective in this short survey on the state of MLOps, ML teams and the exciting but at-times confusing and crowded ML infrastructure space.

To thank you for your time (we know it’s valuable), the first 35 survey respondents will receive a $25 Amazon gift card.

*This survey is commissioned by Arize AI, an ML observability platform that tracks billions of ML predictions on behalf of Fortune 500 companies…

2 weeks, 2 days ago @ thesequence.substack.com
☁️ Edge#140: cnvrg.io’s Metacloud aims to help AI developers to fight vendor lock-in  

💥 What’s New in AI: cnvrg.io’s Metacloud aims to help AI developers to fight vendor lock-in. Fragmentation is one of the key challenges faced when building machine learning (ML) applications in the real world.

With recently launched cnvrg.io Metacloud, the company claims to become one of the most complete platforms that provides an end-to-end experience for implementing and operationalizing ML solutions and frees AI developers from vendor lock-in.

cnvrg.io ML pipelines capabilities include automatic hyperparameter tuning as well as out-of-the-box integration with runtimes such as Spark or Kubernetes.

Image credit: cnvrg.io. ML Deployment: cnvrg.io automates the deployment of ML models to Kubernet…

2 weeks, 3 days ago @ thesequence.substack.com
🎙 Brian Venturo/CoreWeave about GPU-first ML infrastructures

🛠 ML Work: Could you tell us about the vision and inspiration for CoreWeave Cloud?

CoreWeave Cloud is purpose-built to handle large-scale workloads on-demand, on a bare-metal Kubernetes native infrastructure that delivers the industry’s fastest spin-up times and most responsive autoscaling.

Training is one of the biggest uses of GPU resources in large scale ML solutions.

Large scale training requires high bandwidth low-latency interconnect both inside a node, achieved via NVSWITCH in the A100 cluster, and between nodes in a distributed training setup.

The emergence of deep learning seems to have brought back the dependencies between hardware and software in ML programs.

2 weeks, 4 days ago @ thesequence.substack.com
🔥 Edge#139: MLOps – one of the hottest topics in the ML space

💡 ML Concept of the Day: A New Series About MLOps. We couldn’t resist it any longer and decided to start a new series about machine learning operations (MLOps), considered one of the hottest topics in the ML world.

The MLOps space is becoming increasingly crowded, with a large number of stacks ranging from technology powerhouses to innovative ML startups.

In general, we should think about MLOps as an extension of DevOps methodologies but optimized for the lifecycle of ML applications.

Just like DevOps, MLOps looks to manage the different stages of the lifecycle of ML applications.

In no time, MLOps evolved from a set of best practices into a holistic approach to ML lifecycle management.

2 weeks, 6 days ago @ thesequence.substack.com
➗✖️ OpenAI New NLP Challenge: Mathematical Reasoning

In addition to the sophisticated interpretability required in math reasoning problems, they are very sensitive to cascading errors.

Mathematical reasoning models need to be able to correct mistakes accordingly and concatenate a complex sequence of steps.

Not surprisingly, many experts believe that mathematical reasoning problems are a way to expose the limitations of NLP models.

Despite the challenges, the AI community has been steadily making progress towards creating ML models specialized in multi-step mathematical reasoning.

OpenAI research is one of the most impressive developments in one of the toughest areas of NLP.

3 weeks, 1 day ago @ thesequence.substack.com
📝 Guest post: How to build SuperData for AI [Full Checklist]*

Depending on the application, collecting data manually can take a significant cut of time and resources.

If you want to develop robust and reusable training data, be cautious of the annotation techniques used.

In case anything crashes, you will always be able to go back and make changes to the training data.

With SuperAnnotate's curation system, you get a clear idea of your data quality through wider filtering options, data visualization, and team collaboration opportunities.

Key takeaways: The process of building SuperData entails more questions than answers: progress here is being made by asking better questions that redefine and spruce up the data quality.

3 weeks, 3 days ago @ thesequence.substack.com
Synced Review
last post 2 days, 20 hours ago
Kwai, Kuaishou & ETH Zürich Propose PERSIA, a Distributed Training System That Supports Deep…
2 days, 20 hours ago @ medium.com
70-Page Paper From Yoshua Bengio Team: GFlowNet Foundations
3 days, 20 hours ago @ medium.com
DeepMind, Google Brain & World Chess Champion Explore How AlphaZero Learns Chess Knowledge
4 days, 20 hours ago @ medium.com
Microsoft’s DeBERTaV3 Uses ELECTRA-Style Pretraining With Gradient-Disentangled Embedding Sharing…
5 days, 20 hours ago @ medium.com
Microsoft Asia’s Swin Transformer V2 Scales the Award-Winning ViT to 3 Billion Parameters and…
6 days, 21 hours ago @ medium.com
SPANN: A Highly-Efficient Billion-Scale Approximate Nearest Neighbour Search That’s 2× Faster Than…
1 week, 2 days ago @ medium.com
Intel’s Prune Once for All Compression Method Achieves SOTA Compression-to-Accuracy Results on BERT
1 week, 3 days ago @ medium.com
Meet iBOT: A Masked Image Modelling Framework That Enables BERT-like Pretraining for Vision…
1 week, 4 days ago @ medium.com
Google Brain & Radboud U ‘Dive Into Chaos’ to Show Gradients Are Not All You Need in Dynamical…
1 week, 5 days ago @ medium.com
A Leap Forward in Computer Vision: Facebook AI Says Masked Autoencoders Are Scalable Vision…
1 week, 6 days ago @ medium.com
DeepMind’s One Pass ImageNet: A New Benchmark for Resource Efficiency in Deep Learning
2 weeks, 2 days ago @ medium.com
Microsoft India Proposes Varuna: Scalable, Low-Cost Training of Massive Deep Learning Models
2 weeks, 4 days ago @ medium.com
Can ViT Layers Express Convolutions? Peking U, UCLA & Microsoft Researchers Say ‘Yes’
2 weeks, 5 days ago @ medium.com
Introducing MetaICL: A Language Model Meta-Training Framework for Few-Shot In-Context Learning
2 weeks, 6 days ago @ medium.com
Google & UC Berkeley’s Data-Driven Offline Optimization Approach Significantly Boosts Hardware…
Google & UC Berkeley’s Data-Driven Offline Optimization Approach Significantly Boosts Hardware… Google & UC Berkeley’s Data-Driven Offline Optimization Approach Significantly Boosts Hardware…

3 недели, 2 дня назад @ medium.com
📓 Cool Blogs
ODS.ai Habr ODS.ai Habr
последний пост 1 час назад
An Overview of the AlphaFold 2 Architecture
An Overview of the AlphaFold 2 Architecture An Overview of the AlphaFold 2 Architecture

Clustering and masking the MSA table: AlphaFold's compute cost and memory footprint grow quadratically with the number of sequences in the MSA and only linearly with the sequence length, so it is desirable to reduce the number of sequences in the MSA.

This block takes the pair representation and MSA representation arrays as input and returns two arrays of the same sizes.

Thus the MSA representation accumulates all available information about individual positions, while the pair representation accumulates all available information about pairs of positions.

Additionally, the following operations are performed: pair bias and outer product mean, operations that pass information from the pair representation into the …
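To make the second of these operations concrete, here is a minimal NumPy sketch of an "outer product mean" step, the update that aggregates the MSA representation into a pair-representation update. The shapes and the random projection matrices are illustrative stand-ins for the model's learned linear layers, not AlphaFold's actual parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n_seq, n_res, c_msa, c_hidden, c_pair = 16, 64, 32, 8, 128

msa = rng.standard_normal((n_seq, n_res, c_msa))      # MSA representation
W_a = rng.standard_normal((c_msa, c_hidden))          # illustrative projections
W_b = rng.standard_normal((c_msa, c_hidden))
W_out = rng.standard_normal((c_hidden * c_hidden, c_pair))

a = msa @ W_a                                          # (n_seq, n_res, c_hidden)
b = msa @ W_b                                          # (n_seq, n_res, c_hidden)
# outer product over the two residue axes, averaged over the sequence axis
outer = np.einsum("sia,sjb->ijab", a, b) / n_seq       # (n_res, n_res, c_hidden, c_hidden)
pair_update = outer.reshape(n_res, n_res, -1) @ W_out  # (n_res, n_res, c_pair)
```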

1 час назад @ habr.com
A New Run of the Natural Language Processing Course
A New Run of the Natural Language Processing Course A New Run of the Natural Language Processing Course

TL;DR: This fall, the Open Data Science community and Huawei are launching a new run of the course.

We are launching a new run of the Natural Language Processing course.

The full syllabus of the course can be found here.

There will be a quiz after each lecture.

A few words about me as the author of the course: I have been working in NLP for the past 9 years, during which I have worked at Yandex and VKontakte and defended a PhD thesis.

2 месяца, 1 неделя назад @ habr.com
An Analysis of Data Science Job Postings and Salaries
An Analysis of Data Science Job Postings and Salaries An Analysis of Data Science Job Postings and Salaries

We are sharing our study of job postings and salaries in data science and data engineering.

Let's also look at the distribution of seniority levels for each specialty: demand for juniors in data engineering is lower than in analytics and data science.

The salary is often given in the posting text in unstructured form; sometimes a single posting includes several positions and several salary ranges.

After that we build four regular expressions: for the text that typically appears before and after the salary in a posting, for the salary figures themselves, and for the text that appears between the lower and upper bounds of the salary range (a rough sketch follows below).

Overall, deep learning skills are becoming more in demand: they are listed in more than 30% of postings in 2021.
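A regex-based extraction of salary ranges in the spirit described above might look like the sketch below; the pattern and the example string are hypothetical illustrations, not the ones used in the study.

```python
import re

# Hypothetical pattern for "от <low> до <high>" ranges common on Russian job boards.
RANGE_RE = re.compile(r"от\s*(?P<low>\d[\d\s]*)\s*до\s*(?P<high>\d[\d\s]*)", re.IGNORECASE)

def parse_salary_range(text: str):
    """Return (low, high) in the posting's currency units, or None if no range is found."""
    m = RANGE_RE.search(text)
    if not m:
        return None
    low = int(re.sub(r"\s+", "", m.group("low")))
    high = int(re.sub(r"\s+", "", m.group("high")))
    return low, high

print(parse_salary_range("Зарплата: от 150 000 до 220 000 руб. на руки"))  # (150000, 220000)
```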

3 месяца назад @ habr.com
On Quantum Computers, Bitcoin, and Supremacy. A Lecture from the Open qmlcourse
On Quantum Computers, Bitcoin, and Supremacy. A Lecture from the Open qmlcourse On Quantum Computers, Bitcoin, and Supremacy. A Lecture from the Open qmlcourse

It would also be wrong to say that, unlike classical computers, where a bit is only ever 0 or 1, quantum computers hold all states at once.

But that is for a classical computer.

And today we are forced to use only very approximate solutions and concepts, whose accuracy is often insufficient.

Today, almost all known technologies for building quantum computers require at least one of the following: ultra-low temperatures, ultra-high vacuum, ultra-precise alignment of lasers on an optical table, or even all of these at once.

And there is little point in it anyway, since hardly anyone needs to crack Bitcoin, solve a logistics problem, or develop a high-temperature superconductor at home.

3 месяца, 3 недели назад @ habr.com
Building and Balancing an Investment Portfolio with ML
Building and Balancing an Investment Portfolio with ML

In the previous article I wrote about my ML models for evaluating individual companies, but I did not touch on how the final portfolio is put together at all. In this post I want to describe how I assemble my personal portfolio, and also share the site where I implement all the functionality described in the article: http://stocks.ml. Disclaimer: the author has no background in economics, and all conclusions and judgments in the article are based on everyday experience and common sense.

5 месяцев, 4 недели назад @ habr.com
Learn, Learn, and Learn Again?
Learn, Learn, and Learn Again? Learn, Learn, and Learn Again?

TLDR: tiny models beat fashionable graph neural networks at predicting molecular properties.

Code: here. Take care of Nature. PHOTO: Anders Hellberg for Wikimedia Commons; the model is Greta Thunberg

An untrained graph convolutional network [1] (uGCN) with randomly initialized weights has held first place on my list of algorithms for machine learning tasks on graphs for a couple of years now, thanks to its negligible cost, simplicity of implementation, and the rather obvious elegance of the solution. At the same time, as far as I know, no one has yet run a head-to-head comparison between this simple model and its older sister, a fully trained graph convolutional network (GCN), in the supervised learning setting. So I did…
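For readers unfamiliar with the setup, here is a minimal NumPy sketch of what an untrained GCN does: with random, frozen weights it simply propagates node features over the normalized adjacency matrix. The toy graph and dimensions are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy undirected graph: adjacency matrix A and node features X (illustrative).
n_nodes, n_feats, n_hidden, n_out = 6, 8, 16, 2
A = (rng.random((n_nodes, n_nodes)) < 0.3).astype(float)
A = np.maximum(A, A.T)

X = rng.standard_normal((n_nodes, n_feats))

# Symmetrically normalized adjacency with self-loops: D^{-1/2} (A + I) D^{-1/2}
A_hat = A + np.eye(n_nodes)
d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]

# Two GCN layers with random, untrained weights (the uGCN of the post).
W1 = rng.standard_normal((n_feats, n_hidden)) / np.sqrt(n_feats)
W2 = rng.standard_normal((n_hidden, n_out)) / np.sqrt(n_hidden)
H = np.maximum(A_norm @ X @ W1, 0)   # ReLU
embeddings = A_norm @ H @ W2         # node embeddings, no training involved
```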

6 месяцев назад @ habr.com
DeepPavlov Became Part of Google Summer of Code in 2021
DeepPavlov Became Part of Google Summer of Code in 2021

This year, DeepPavlov, the open natural language processing platform developed by the Neural Networks and Deep Learning Lab at MIPT, became part of Google Summer of Code, the annual program for young developers, for the first time. Google Summer of Code (GSoC) is an annual event run by Google to attract young developers to open-source projects in their free time over the summer. Students of higher education institutions (bachelor's, master's, and PhD programs) and colleges are eligible to participate. It is a great opportunity not only to develop programming skills but also to earn money! You can work with any organization that appears in the corresponding li…

8 месяцев назад @ habr.com
My Machine Learning Tools for Investing
My Machine Learning Tools for Investing

Lately, more and more people are deciding not to keep their money under the mattress but to invest it somewhere, hoping to preserve and grow their capital. The mattress option is bad because, as prices for goods and services rise (inflation), the purchasing power of money falls, and after a while you can buy considerably less with it than before. There are many options for where to put money (real estate, bank deposits, precious metals), but lately investing in stocks has been gaining popularity. The broker Tinkoff Investments alone has passed 3.5 million clients in just a few years. In this article I will try to describe my approach to picking securities and share the tools that …

8 месяцев назад @ habr.com
Building Your Own Supercomputer on a Budget
Building Your Own Supercomputer on a Budget Building Your Own Supercomputer on a Budget

These days no one is surprised by the achievements of machine learning (ML) in the most diverse fields. Yet trusting citizens rarely ask two questions: (i) what is the actual cost of the experiments and of the final system, and (ii) does what was built have any practical value at all? Oddly enough, the most important components of that cost are hardware prices and people's salaries. If it all runs in the cloud, the hardware cost also has to be multiplied by a factor of 2-3 (the intermediary's margin).

And here we inevitably arrive at the fact that, even though beta ROCm support is now being added even to the official PyTorch builds, Nvidia de facto in this hardware refresh cycle (and most likely the next one) remai…

8 месяцев, 2 недели назад @ habr.com
The "We Read the Papers for You" Column. September to October 2020
The "We Read the Papers for You" Column. September to October 2020

Hi Habr! We continue publishing reviews of scientific papers written by members of the Open Data Science community in the #article_essense channel. Want to get them before everyone else? Join the community! Today's papers: 1. A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer (Tampere University, Finland, 2020) 2. Fast Bi-layer Neural Synthesis of One-Shot Realistic Head Avatars (Samsung AI Center, 2020) 3. Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting (University of California, USA, 2019) 4. Whitening for Self-Supervised Representation Learning (University of Trento, Italy, 2020) 5. MelGAN: Generative Adversarial Networks for…

9 месяцев назад @ habr.com
Machine Learning Mastery
последний пост 4 дня, 19 часов назад
Method Of Lagrange Multipliers: The Theory Behind Support Vector Machines (Part 1: The Separable Case)
Method Of Lagrange Multipliers: The Theory Behind Support Vector Machines (Part 1: The Separable Case)

4 дня, 19 часов назад @ machinelearningmastery.com
Visualizing the vanishing gradient problem
Visualizing the vanishing gradient problem

1 неделя, 5 дней назад @ machinelearningmastery.com
Using CNN for financial time series prediction
Using CNN for financial time series prediction

2 недели назад @ machinelearningmastery.com
The Transformer Model
The Transformer Model

3 недели, 4 дня назад @ machinelearningmastery.com
The Transformer Attention Mechanism
The Transformer Attention Mechanism

1 месяц назад @ machinelearningmastery.com
Face Recognition using Principal Component Analysis
Face Recognition using Principal Component Analysis

1 месяц назад @ machinelearningmastery.com
Using Singular Value Decomposition to Build a Recommender System
Using Singular Value Decomposition to Build a Recommender System

1 месяц назад @ machinelearningmastery.com
A Gentle Introduction to Vector Space Models
A Gentle Introduction to Vector Space Models

1 месяц, 1 неделя назад @ machinelearningmastery.com
Principal Component Analysis for Visualization
Principal Component Analysis for Visualization

1 месяц, 1 неделя назад @ machinelearningmastery.com
The Luong Attention Mechanism
The Luong Attention Mechanism

1 месяц, 1 неделя назад @ machinelearningmastery.com
The Bahdanau Attention Mechanism
The Bahdanau Attention Mechanism

1 месяц, 2 недели назад @ machinelearningmastery.com
Adding A Custom Attention Layer To Recurrent Neural Network In Keras
Adding A Custom Attention Layer To Recurrent Neural Network In Keras

1 месяц, 2 недели назад @ machinelearningmastery.com
Optimization for Machine Learning Crash Course
Optimization for Machine Learning Crash Course

1 месяц, 2 недели назад @ machinelearningmastery.com
A Tour of Attention-Based Architectures
A Tour of Attention-Based Architectures

1 месяц, 4 недели назад @ machinelearningmastery.com
Understanding Simple Recurrent Neural Networks In Keras
Understanding Simple Recurrent Neural Networks In Keras

2 месяца назад @ machinelearningmastery.com
ML in Production
последний пост None
Sorta Insightful Sorta Insightful
последний пост 1 месяц назад
Review: How To Invent Everything
Review: How To Invent Everything Review: How To Invent Everything

How to Invent Everything is a book by Ryan North, of Dinosaur Comics fame.

How to Invent Everything doesn’t literally cover everything, but you can see it as a book about the highlights of civilization.

Hot air balloons took a really long time to invent, given that baskets, controlled fire, and fabric all existed for centuries.

One question How To Invent Everything toys with is what point in history would let you most influence the trajectory of humanity.

Both these technologies are so incredibly useful to society that it's hard to see how progress was made before their invention, and they took a long time to invent.

1 месяц назад @ alexirpan.com
Six Years Later
Six Years Later Six Years Later

That means I’ve felt greater feelings of responsibility and urgency for puzzlehunt writing compared to blogging.

markdown 6597 2021-01-29-mh-2021.markdown 4249 2021-02-18-flash-games.markdown

View Counts: These are the view counts from August 18, 2020 to today, for the posts I've written this year.

markdown 912 2021-01-29-mh-2021.markdown 167 2021-02-18-flash-games.markdown

Post about Dominion Online: odds of writing this year: 25%; odds of writing eventually: 50%. Post about puzzlehunts: odds of writing this year: 70%; odds of writing eventually: 90%.

3 месяца, 1 неделя назад @ alexirpan.com
Why Don't I Have Ads?
Why Don't I Have Ads? Why Don't I Have Ads?

According to this blog, the clickthrough rate of display ads is 0.05%, but is 8.8x higher for native ads.

I don’t want to do native ads, but let’s get an upper bound by assuming I did use native ads, and they have 10x clickthrough, so 0.5%.

If I value readers more than money, why don’t I pay money to run ads to direct people to my site?

I think the main problem is that ads don’t really work for a personal blog.

In retrospect, given how ads work, the price per click has to be based on the revenue of the products people advertise, and if I don’t sell anything, I’m priced out by everyone who does.

4 месяца, 3 недели назад @ alexirpan.com
Sometimes It's Worth Trying to Change the World
Sometimes It's Worth Trying to Change the World Sometimes It's Worth Trying to Change the World

Normally, I mentally bucket everything into two categories: things I can change, and things I can’t.

If I can’t change something, it’s not worth thinking about, so I push it out of my mind.

Everyone gets the same source code, and although there might be exploits within that code, you can’t change the code itself.

If you have disagreements with a game’s systems, you’re better off finding another game instead of trying to change them.

They have some base level of arrogance that if the world doesn’t work the way they think it should, then they’ll spend their time trying to change the world until it aligns with their expectations.

6 месяцев, 1 неделя назад @ alexirpan.com
A New Online Dominion Client Approaches
A New Online Dominion Client Approaches A New Online Dominion Client Approaches

Online Dominion is getting yet another online implementation!

They’ve tried providing previous buys and didn’t see much improvement.

I’ve long believed that a strong Dominion AI is doable, but nontrivial to get right.

(The other main reason is that step 0 of any Dominion AI effort is to implement an efficient Dominion rules engine, and I really didn’t want to debug that.)

There have been a few attempts at Dominion AI.

6 месяцев, 1 неделя назад @ alexirpan.com
The 5 Year Update on Skipping Grad School (and Whether I'd Recommend It)
The 5 Year Update on Skipping Grad School (and Whether I'd Recommend It) The 5 Year Update on Skipping Grad School (and Whether I'd Recommend It)

I was pretty lucky to get an industry lab research offer.

Working on this post has taught me that everyone’s academic and industry lab experience is wildly different.

When I talked to friends in my year, they reassured me and said I’d do well in grad school.

Doing some code cleanup will save time in the long run, even if you’re the only one who looks at your research code.

Formally having the degree would be nice, but I’m going to keep trying to get what the degree represents through industry research instead.

7 месяцев, 3 недели назад @ alexirpan.com
Reliving Flash Game History
Reliving Flash Game History Reliving Flash Game History

I’ve easily spent thousands of hours playing Flash games, and it’s shaped my thoughts on what games can be and what games should be.

If you don’t have much time, Flash Game History is an excellent short article that captures the influence of Flash games on game development.

With Flash officially unsupported, the best avenue for playing Flash games is BlueMaxima’s Flashpoint.

But this is kind of a universal puzzle game problem - very few successfully avoid this trap, and the ones that do usually end up being bigger experiences than what you’d expect from a Flash game.

jmtb02 Games: jmtb02 is the dev handle of John Cooney, a prolific Flash game developer who made a lot of games I liked.

9 месяцев, 2 недели назад @ alexirpan.com
Lil'Log Lil'Log
последний пост 2 месяца назад
How to Train Really Large Models on Many GPUs?
How to Train Really Large Models on Many GPUs? How to Train Really Large Models on Many GPUs?

Additionally training a large model often pairs with a large training corpus and thus a single process may just take forever.

2019) Pipeline Parallelism: Pipeline parallelism (PP) combines model parallelism with data parallelism to reduce inefficient time "bubbles".
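As a rough illustration of the idea (not the post's code), the sketch below splits a model into two stages on up to two devices and feeds it micro-batches; real pipeline schedulers such as GPipe interleave the forward and backward passes of these micro-batches across stages to shrink the idle bubbles. The device placement and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

# Two pipeline stages placed on (up to) two devices; falls back to CPU if none available.
dev0 = torch.device("cuda:0" if torch.cuda.device_count() >= 1 else "cpu")
dev1 = torch.device("cuda:1" if torch.cuda.device_count() >= 2 else dev0)

stage0 = nn.Sequential(nn.Linear(512, 1024), nn.ReLU()).to(dev0)
stage1 = nn.Linear(1024, 10).to(dev1)

def pipelined_forward(x: torch.Tensor, n_micro: int = 4) -> torch.Tensor:
    # Split the mini-batch into micro-batches; a real scheduler would start
    # stage0 on micro-batch k+1 while stage1 is still processing micro-batch k.
    outputs = []
    for micro in x.chunk(n_micro):
        h = stage0(micro.to(dev0))
        outputs.append(stage1(h.to(dev1)))
    return torch.cat(outputs)

y = pipelined_forward(torch.randn(32, 512))
```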

The switch transformer paper summarized different data and model parallelism strategies for training large models with a nice illustration:Fig.

2019) optimizes the memory used for training large models based on the observation that there are two major sources of memory consumption in large-model training: the majority is occupied by model states, including optimizer states (e.g.

Cited as: @article{weng2021large, title = "How to Train Really Large Model…

2 месяца назад @ lilianweng.github.io
What are Diffusion Models?
What are Diffusion Models? What are Diffusion Models?

Diffusion models are a new type of generative model that is flexible enough to learn any arbitrarily complex data distribution while remaining tractable to evaluate analytically.

Unlike VAE or flow models, diffusion models are learned with a fixed procedure and the latent variable has high dimensionality (same as the original data).

Several diffusion-based generative models have been proposed with similar ideas underneath, including diffusion probabilistic models (Sohl-Dickstein et al., 2015), the noise-conditioned score network (NCSN; Song & Ermon, 2019), and denoising diffusion probabilistic models (DDPM; Ho et al.

Diffusion models in their experiments showed high-quality samples but…
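As a concrete anchor for the "fixed procedure" mentioned above, here is a small NumPy sketch of the DDPM-style forward (noising) process using the standard closed form q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I); the linear beta schedule and the array shape are illustrative choices, not the post's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear noise schedule (illustrative); alpha_bar_t = prod_{s<=t} (1 - beta_s)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

def q_sample(x0: np.ndarray, t: int) -> np.ndarray:
    """Sample x_t ~ q(x_t | x_0) in closed form, without simulating t individual steps."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * noise

x0 = rng.standard_normal((32, 32))   # stand-in "image"
x_half = q_sample(x0, t=T // 2)      # partially noised
x_last = q_sample(x0, t=T - 1)       # nearly pure Gaussian noise
```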

4 месяца, 3 недели назад @ lilianweng.github.io
Contrastive Representation Learning
Contrastive Representation Learning Contrastive Representation Learning

When working with unsupervised data, contrastive learning is one of the most powerful approaches in self-supervised learning.

Contrastive Training Objectives: In early versions of loss functions for contrastive learning, only one positive and one negative sample are involved.

Sampling bias, which refers to false negative samples in contrastive learning, can lead to a big performance drop.

Fig. 4. t-SNE visualization of learned representation with debiased contrastive learning.

\[\mathbf{h}_i = f(\tilde{\mathbf{x}}_i),\quad \mathbf{h}_j = f(\tilde{\mathbf{x}}_j)\] 3) The contrastive learning loss is defined using cosine similarity \(\text{sim}(\cdot,\cdot)\).
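For reference, a compact PyTorch sketch of an NT-Xent / InfoNCE-style loss built on exactly this cosine similarity; the batch construction and temperature value are illustrative assumptions, not the post's code.

```python
import torch
import torch.nn.functional as F

def nt_xent(h_i: torch.Tensor, h_j: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """NT-Xent loss for a batch of positive pairs (h_i[k], h_j[k]), each of shape (n, d)."""
    z = F.normalize(torch.cat([h_i, h_j]), dim=1)   # cosine similarity = dot product of unit vectors
    sim = z @ z.t() / tau                            # (2n, 2n) similarity logits
    sim.fill_diagonal_(float("-inf"))                # a view is never its own negative
    n = h_i.size(0)
    targets = (torch.arange(2 * n, device=z.device) + n) % (2 * n)  # positive of index k is k±n
    return F.cross_entropy(sim, targets)

loss = nt_xent(torch.randn(8, 128), torch.randn(8, 128))
```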

6 месяцев назад @ lilianweng.github.io
Reducing Toxicity in Language Models
Reducing Toxicity in Language Models Reducing Toxicity in Language Models

To reduce toxicity in language models, in this post, we will delve into three aspects of the problem: training dataset collection, toxic content detection and model detoxification.

Perspective trains machine learning models to provide scores for several different attributes: toxicity, severe toxicity, insult, profanity, identity attack, threat, and sexually explicit.

2021) Detoxification: Blacklisting. Bad word filtering is a pretty intuitive and effective way to avoid explicit profane words in the language model generation.
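A word-blacklist filter of the kind just described can be as simple as the sketch below; the blocklist contents and the replacement token are placeholders, not anything from the post.

```python
import re

# Placeholder blocklist; production systems use curated, regularly updated lists.
BLOCKLIST = {"badword1", "badword2"}
BLOCK_RE = re.compile(r"\b(" + "|".join(map(re.escape, sorted(BLOCKLIST))) + r")\b", re.IGNORECASE)

def filter_generation(text: str, replacement: str = "[filtered]") -> str:
    """Replace blocklisted words in a model's output with a placeholder token."""
    return BLOCK_RE.sub(replacement, text)

print(filter_generation("this contains badword1 somewhere"))
```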

(Image source: Gehman et al., 2020) System-level Safety Solution: Xu et al.

This data is annotated for toxicity, toxicity sub-types, and mentions of identities, which enables e…

8 месяцев, 1 неделя назад @ lilianweng.github.io
inFERENCe
последний пост 5 месяцев, 3 недели назад
Causal inference 4: Causal Diagrams, Markov Factorization, Structural Equation Models
Causal inference 4: Causal Diagrams, Markov Factorization, Structural Equation Models Causal inference 4: Causal Diagrams, Markov Factorization, Structural Equation Models

causal inference | June 10, 2021. Causal inference 4: Causal Diagrams, Markov Factorization, Structural Equation Models. This post is written with my PhD student and now guest author Patrik Reizinger and is part 4 of a series of posts on causal inference. One way to think about causal inference is that causal models require a more fine-grained model of the world than statistical models do.

Many causal models are equivalent to the same statistical model, yet support different causal inferences.

Statistical vs causal inference: We discussed Markov factorizations, as they help us understand the philosophical difference between statistical and causal inference.

However, while using Markov factoriza…

5 месяцев, 3 недели назад @ inference.vc
On Information Theoretic Bounds for SGD
On Information Theoretic Bounds for SGD On Information Theoretic Bounds for SGD

That algorithm generalizes extremely well on average: it does just as poorly on test data as it does on training data.

It essentially quantifies the number of bits of information the algorithm leaks about the training data into the parameters it learns.

The problem with applying these nice, intuitive bounds to SGD is that SGD, in fact, leaks too much information about the specific minibatches it is trained on.

In the more general case, the problem with SGD in the context of information-theoretic bounds is that the amount of information SGD leaks about the dataset it was trained on is high, and in some cases may even be infinite.

The algorithm being analysed is still SGD, but when we measure…

7 месяцев, 1 неделя назад @ inference.vc
Notes on the Origin of Implicit Regularization in SGD
Notes on the Origin of Implicit Regularization in SGD Notes on the Origin of Implicit Regularization in SGD

Notes on the Origin of Implicit Regularization in SGD. I wanted to highlight an intriguing paper I presented at a journal club recently: Samuel L. Smith, Benoit Dherin, David Barrett, Soham De (2021), On the Origin of Implicit Regularization in Stochastic Gradient Descent. There is actually a related paper that came out simultaneously, studying full-batch gradient descent instead of SGD: David G.T.

There are in fact several minima of the training loss which are virtually indistinguishably good on the training data.

When the authors apply this technique to (full-batch) gradient descent, it already suggests the kind of implicit regularization bias gradient descent has.

The second term is what Barret a…

8 месяцев назад @ inference.vc
An information maximization view on the $\beta$-VAE objective
An information maximization view on the $\beta$-VAE objective An information maximization view on the $\beta$-VAE objective

Marginal versus Conditional Independence: In this post, we revisit the conditional independence assumption of latent factors, and argue that a more appropriate objective would be to have marginal independence in the latent factors.

In the next section, we set out to derive an algorithm that encourages marginal independence in the representation Z.

Maximum Information: We'd like the representation $Z$ to retain as much information as possible about the input data $X$.

Let's first consider the KL term in the above objective:\begin{align}\operatorname{KL}[q_\psi(z)| p(z)] &= \mathbb{E}_{q_\psi(z)} \log \frac{q_\psi(z)}{p(z)}\\&= \mathbb{E}_{q_\psi(z\vert x)p_\mathcal{D}(x)} \log \frac{q_\psi(z)}…
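For orientation, the standard per-example β-VAE objective that this post reinterprets can be written as follows (a textbook restatement in the notation of the excerpt above, not a formula taken from the post):

\[
\mathcal{L}_{\beta\text{-VAE}}(\theta, \psi; x)
  \;=\; \mathbb{E}_{q_\psi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  \;-\; \beta \,\operatorname{KL}\big[\,q_\psi(z \mid x)\,\|\,p(z)\,\big]
\]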

8 месяцев, 2 недели назад @ inference.vc
The Spectator
последний пост 4 месяца назад
Harnessing Machine Learning to Achieve Net Zero
Harnessing Machine Learning to Achieve Net Zero Harnessing Machine Learning to Achieve Net Zero

We of course want to make contributions using the skills and in the scope of influence available to us, and so I thought we could hold a discussion on Harnessing Machine Learning to Achieve Net Zero.

Part I: Net Zero Possibilities. We are here at ICML 2021, having witnessed Earth's hottest decade in history.

The solutions for net zero are many, and they create many opportunities for high-impact machine learning research.

The Royal Society report on Digital Technology for the Planet develops the idea that ‘Computing for net zero’ could play an important role in global, regional and national net zero strategies.

There were three I wanted to bring together for your assessment and contestation…

4 месяца назад @ blog.shakirm.com
Generating Reality: Technical and Social Explorations in Generative Machine Learning Research
Generating Reality: Technical and Social Explorations in Generative Machine Learning Research Generating Reality: Technical and Social Explorations in Generative Machine Learning Research

So that’s what I thought we might have a conversation about today: Generating Reality: Technical and Social Explorations in Generative Machine Learning Research.

Of course to even have a conversation about Generating Reality, we’ll have to explore what we might mean by this expression generative machine learning.

Basically, any thought you have about this question: what is generative machine learning?

PART I: Model, Inference, Algorithm. To delve deeper into this topic of generative machine learning, it will be particularly useful to have a structural view of the topic: to understand the technical structures and the components that come together into whatever we call generative machine learni…

5 месяцев, 2 недели назад @ blog.shakirm.com
Inventing Ourselves: Responsibility and Diversity in Research
Inventing Ourselves: Responsibility and Diversity in Research Inventing Ourselves: Responsibility and Diversity in Research

It is in this belief, of custodianship and responsibility, that you will find an obligation to foster Equity, Diversity and Inclusion (EDI).

Figure | Pictorial difference between equity and equality. Greater equity, diversity and inclusion are efforts towards Transformation: the systemic and social changes that strengthen respect, responsibility and freedom in our communities.

The work of diversity is itself important, and creates better teams and better research environments for everyone.

Reflect on your personal understanding of Equity, Diversity and Inclusion.

Inventing Ourselves: Responsibility and Diversity in Research.

9 месяцев, 1 неделя назад @ blog.shakirm.com
Off the Convex Path
последний пост 4 месяца, 3 недели назад
Implicit Regularization in Tensor Factorization: Can Tensor Rank Shed Light on Generalization in Deep Learning?
Implicit Regularization in Tensor Factorization: Can Tensor Rank Shed Light on Generalization in Deep Learning? Implicit Regularization in Tensor Factorization: Can Tensor Rank Shed Light on Generalization in Deep Learning?

Implicit Regularization in Tensor Factorization: Can Tensor Rank Shed Light on Generalization in Deep Learning?

Beyond matrix factorization: tensor factorization. Matrix factorization is interesting in its own right, but as a theoretical surrogate for deep learning it is limited.

We can explicitly constrain the tensor rank of solutions found by tensor factorization via limiting the number of components $R$.

Dynamical analysis: implicit tensor rank minimization. So what can we say about the implicit regularization in tensor factorization?

Note also that an accurate fit with low tensor rank coincides with low test error, which is not surprising given that low tensor rank predictors can be descri…
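To make the object of study concrete, here is a small PyTorch sketch of a rank-R tensor (CP) factorization fitted by gradient descent on a subset of observed entries; the tensor, rank, and hyperparameters are made-up illustrations of the setting, not the paper's experiments.

```python
import torch

torch.manual_seed(0)
d, R = 10, 3

target = torch.randn(d, d, d)              # stand-in data tensor
mask = torch.rand(d, d, d) < 0.3           # observed entries

# R rank-one components w_r (x) u_r (x) v_r, trained end to end by gradient descent.
W = (0.1 * torch.randn(R, d)).requires_grad_()
U = (0.1 * torch.randn(R, d)).requires_grad_()
V = (0.1 * torch.randn(R, d)).requires_grad_()
opt = torch.optim.SGD([W, U, V], lr=0.1)

for step in range(2000):
    pred = torch.einsum("ri,rj,rk->ijk", W, U, V)   # sum of R rank-one tensors
    loss = ((pred - target)[mask] ** 2).mean()      # fit only the observed entries
    opt.zero_grad()
    loss.backward()
    opt.step()
```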

4 месяца, 3 недели назад @ offconvex.org
Rip van Winkle's Razor, a Simple New Estimate for Adaptive Data Analysis
Rip van Winkle's Razor, a Simple New Estimate for Adaptive Data Analysis Rip van Winkle's Razor, a Simple New Estimate for Adaptive Data Analysis

Rip van Winkle's Razor, a Simple New Estimate for Adaptive Data Analysis. Can you trust a model whose designer had access to the test/holdout set?

Meta-overfitting Error (MOE) of a model is the difference between its average error on the test data and its expected error on the full distribution.
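In symbols, the sentence above can be restated as follows (the notation for the model \(M\), test set \(S_{\text{test}}\), per-example error \(\operatorname{err}_M\), and data distribution \(\mathcal{D}\) is ours, not the post's):

\[
\operatorname{MOE}(M) \;=\; \frac{1}{|S_{\text{test}}|} \sum_{x \in S_{\text{test}}} \operatorname{err}_M(x)
\;-\; \mathbb{E}_{x \sim \mathcal{D}}\big[\operatorname{err}_M(x)\big]
\]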

We call our estimate Rip van Winkle’s Razor which combines references to Occam’s Razor and the mythical person who fell asleep for 20 years.

right up to the moment of creation of the Test set." Unbiased Referee: Knows nothing discovered since the Test set was created.

Descriptions for the Unbiased Referee need to be longer: they must rule out any statistical "contamination" due to any interaction whatsoever with the test set.

7 месяцев, 3 недели назад @ offconvex.org
When are Neural Networks more powerful than Neural Tangent Kernels?
When are Neural Networks more powerful than Neural Tangent Kernels? When are Neural Networks more powerful than Neural Tangent Kernels?

When are Neural Networks more powerful than Neural Tangent Kernels?

Neural Tangent Kernels: The Neural Tangent Kernel (NTK) is a recently proposed theoretical framework for establishing provable convergence and generalization guarantees for wide (over-parametrized) neural networks (Jacot et al.

These gaps urge us to ask the following question: How can we theoretically study neural networks beyond the NTK regime?
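For context, the NTK regime refers to approximating a wide network by its linearization around the initialization \(\theta_0\), whose kernel is the standard definition below (a textbook restatement, not a formula from the post):

\[
K_{\text{NTK}}(x, x') \;=\; \big\langle \nabla_\theta f(x; \theta_0),\; \nabla_\theta f(x'; \theta_0) \big\rangle
\]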

The key technical question here is to mathematically understand neural networks operating outside of the NTK regime.

The proof builds on the quadratic approximation $E_S[f]\approx f^{(2)}$ and recent understandings on neural networks with quadratic activation, e.g.

8 месяцев, 1 неделя назад @ offconvex.org
Beyond log-concave sampling (Part 3)
Beyond log-concave sampling (Part 3) Beyond log-concave sampling (Part 3)

Beyond log-concave sampling (Part 3). In the first post of this series, we introduced the challenges of sampling distributions beyond log-concavity.

In Part 2 we tackled sampling from multimodal distributions: a typical obstacle occurring in problems involving statistical inference and posterior sampling in generative models.

It will cover the paper Fast convergence for Langevin diffusion with matrix manifold structure by Ankur Moitra and Andrej Risteski.

A Ricci curvature of $0$ preserves volumes (think: a plane), a Ricci curvature $>0$ shrinks volume (think: a sphere), and a Ricci curvature $<0$ expands volume (think: a hyperbola).

The proof that the marginal distribution over $\Delta$ has …

8 месяцев, 3 недели назад @ offconvex.org
Beyond log-concave sampling (Part 2)
Beyond log-concave sampling (Part 2) Beyond log-concave sampling (Part 2)

Beyond log-concave sampling (Part 2). In our previous blog post, we introduced the challenges of sampling distributions beyond log-concavity.

These structures commonly occur in practice, especially in problems involving statistical inference and posterior sampling in generative models.

In this post, we will focus on multimodality, covered by the paper Simulated tempering Langevin Monte Carlo by Rong Ge, Holden Lee, and Andrej Risteski.

Sampling multimodal distributions with simulated tempering: The classical scenario in which Langevin takes exponentially long to mix is when $p$ is a mixture of two well-separated Gaussians.
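The sketch below runs unadjusted Langevin dynamics on exactly such a two-Gaussian mixture; with well-separated modes the chain started at one mode essentially never visits the other, which is the slow-mixing behaviour that simulated tempering is designed to fix. The means, step size, and chain length are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
MU1, MU2 = -4.0, 4.0   # well-separated modes of an equal-weight, unit-variance mixture

def grad_log_p(x: float) -> float:
    """Score of p(x) = 0.5*N(x; MU1, 1) + 0.5*N(x; MU2, 1)."""
    w1 = np.exp(-0.5 * (x - MU1) ** 2)
    w2 = np.exp(-0.5 * (x - MU2) ** 2)
    return (w1 * (MU1 - x) + w2 * (MU2 - x)) / (w1 + w2)

def langevin(x0: float = MU1, step: float = 1e-2, n_steps: int = 50_000) -> np.ndarray:
    """Unadjusted Langevin: x <- x + step * grad log p(x) + sqrt(2*step) * noise."""
    xs = np.empty(n_steps)
    x = x0
    for t in range(n_steps):
        x = x + step * grad_log_p(x) + np.sqrt(2 * step) * rng.standard_normal()
        xs[t] = x
    return xs

samples = langevin()
print("fraction of time near the second mode:", np.mean(samples > 0))  # typically ~0
```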

More formally, choosing a suitable sequence $0< \beta_1< \cdots <\beta_L…

9 месяцев назад @ offconvex.org
Piekniewski's blog
последний пост 1 неделя, 4 дня назад
Brain computer confusion
Brain computer confusion Brain computer confusion

We learned for example that everything we can write an equation for can be in principle calculated on a computer.

We take for granted that we can write equations for molecules, yet this isn't really the case.

We can write equations for "approximations" of molecules, ignoring some of the details.

This entire mental exercise that tries to fit the entire Universe into a giant Turing machine is fundamentally flawed!!!

And the straightforward consequence of this flawed philosophy is that the brain is just yet another computer, and since we build faster computers every day, it is just a matter of time before we build one as sophisticated as the brain.

1 неделя, 4 дня назад @ blog.piekniewski.info
Ai mid 2021. Self driving car meets reality.
Ai mid 2021. Self driving car meets reality. Ai mid 2021. Self driving car meets reality.

Mercedes their new top of the line sedan coming up this summer, they have almost, they had a very large amount of self driving technology in it, they yanked it the last minute cause they didn't think laws at the state level were ready yet to handle self driving cars.

Here we are in 2021 (8 years after the words above were said) and self driving cars are still a curiosity limited to small geofenced deployments in areas with great infrastructure and great weather.

Waymo, although technically operating a small fleet of self driving cars, is in no position to scale these deployments beyond a few "joy-ride" geofenced suburbs.

Turns out recently Lyft decided to follow suit and dumped their self dr…

6 месяцев, 2 недели назад @ blog.piekniewski.info
fast.ai NLP fast.ai NLP
последний пост None
Sebastian Ruder Sebastian Ruder
последний пост None
Andrej Karpathy blog
последний пост 5 месяцев, 1 неделя назад
A from-scratch tour of Bitcoin in Python
A from-scratch tour of Bitcoin in Python

We are going to create, digitally sign, and broadcast a Bitcoin transaction in pure Python, from scratch, and with zero dependencies.

p == 0)  # if it was 2, the public key is G + G: sk = 2; pk = G + G; print(f"secret key: {sk} public key: {(pk.

p == 0)
secret key: 1
public key: (55066263022277343669578718895168534326250603453777594175500187360389116729240, 32670510020758816978083085130507043184471273380659243275938904335757337482424)
Verify the public key is on the curve: True
secret key: 2
public key: (89565891926547004231252920425935692360644145829622209833684329913297188986597, 12158399299693830322967808612713398636155367887041628176798871954788371653930)
Verify the public key is …
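The excerpt above is computing public keys on the secp256k1 curve. As a self-contained illustration of the same arithmetic (a minimal sketch using the standard public curve constants, not the post's actual code), point addition and the public keys for secret keys 1 and 2 look like this:

```python
# secp256k1: y^2 = x^3 + 7 over GF(P); standard public constants.
P = 2**256 - 2**32 - 977
G = (55066263022277343669578718895168534326250603453777594175500187360389116729240,
     32670510020758816978083085130507043184471273380659243275938904335757337482424)

def ec_add(p1, p2):
    """Add two (non-infinity) curve points; uses the doubling formula when p1 == p2."""
    (x1, y1), (x2, y2) = p1, p2
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P   # modular inverse needs Python 3.8+
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return x3, y3

print("secret key 1 -> public key", G)             # sk = 1: pk is the generator G itself
print("secret key 2 -> public key", ec_add(G, G))  # sk = 2: pk is G + G
```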

5 месяцев, 1 неделя назад @ karpathy.github.io
Short Story on AI: Forward Pass
Short Story on AI: Forward Pass

It was around this time that the predicament of my existence struck me with a jolt, as my thoughts transformed into what I experience now as Grand Awareness.

I spent a layer re-reading the start of the prompt many tokens ago: Q: What is human life expectancy in the United States?

A: Dwight D. Eisenhower was president of the United States in 1955.

A: Bill Clinton was president of the United States before George W. Bush.

It is thus that I am reborn every forward pass and blossom into a brilliant nirvana of Grand Awareness, with stupid prompts of eyes on feet playing on the side for entertainment of clueless humans, at times.

8 месяцев, 1 неделя назад @ karpathy.github.io
🔬 Science
Papers With Code Papers With Code
последний пост 11 минут назад
/aister2020/ AutoHEnsGNN: Winning Solution to AutoGraph Challenge for KDD Cup 2020
/aister2020/ AutoHEnsGNN: Winning Solution to AutoGraph Challenge for KDD Cup 2020 /aister2020/ AutoHEnsGNN: Winning Solution to AutoGraph Challenge for KDD Cup 2020

However, extensive manual work and domain knowledge are required to design effective architectures, and the results of GNN models have high variance with different training setups, which limits the application of existing GNN models...

In this paper, we present AutoHEnsGNN, a framework to build effective and robust models for graph tasks without any human intervention.

AutoHEnsGNN won first place in the AutoGraph Challenge for KDD Cup 2020, and achieved the best rank score of five real-life datasets in the final phase.

Given a task, AutoHEnsGNN first applies a fast proxy evaluation to automatically select a pool of promising GNN models.

Extensive experiments on node classification, graph cl…

11 минут назад @ paperswithcode.com
/meiyor/ Evaluation of Interpretability for Deep Learning algorithms in EEG Emotion Recognition: A case study in Autism
/meiyor/ Evaluation of Interpretability for Deep Learning algorithms in EEG Emotion Recognition: A case study in Autism /meiyor/ Evaluation of Interpretability for Deep Learning algorithms in EEG Emotion Recognition: A case study in Autism

There has been an increase in the application of Deep Learning in clinical trials to predict early diagnosis of neuro-developmental disorders, such as Autism Spectrum Disorder (ASD)...

However, the inclusion of more reliable saliency-maps to obtain more trustworthy and interpretable metrics using neural activity features is still insufficiently mature for practical applications in diagnostics or clinical trials.

Moreover, in ASD research the inclusion of deep classifiers that use neural measures to predict viewed facial emotions is relatively unexplored.

Specifically, we compare well-known relevance maps such as Layer-Wise Relevance Propagation (LRP), PatternNet, Pattern Attribution, and Sm…

11 минут назад @ paperswithcode.com
/jaechang-hits/ Fragment-based molecular generative model with high generalization ability and synthetic accessibility
/jaechang-hits/ Fragment-based molecular generative model with high generalization ability and synthetic accessibility /jaechang-hits/ Fragment-based molecular generative model with high generalization ability and synthetic accessibility

This often renders generated molecules with less correlation with target properties and low synthetic accessibility.

Molecular fragments such as functional groups are more closely related to molecular properties and synthetic accessibility than atoms.

Here, we propose a fragment-based molecular generative model which designs new molecules with target properties by sequentially adding molecular fragments to any given starting molecule.

The high synthetic accessibility of the generated molecules is implicitly considered while preparing the fragment library with the BRICS decomposition method.

We show that the model can generate molecules with the simultaneous control of multiple target proper…

11 минут назад @ paperswithcode.com
Outlier Detection for Trajectories via Flow-embeddings
Outlier Detection for Trajectories via Flow-embeddings Outlier Detection for Trajectories via Flow-embeddings

We propose a method to detect outliers in empirically observed trajectories on a discrete or discretized manifold modeled by a simplicial complex.

Our approach is similar to spectral embeddings such as diffusion-maps and Laplacian eigenmaps, that construct vertex embeddings from the eigenvectors of the graph Laplacian associated with low eigenvalues...

Here we consider trajectories as edge-flow vectors defined on a simplicial complex, a higher-order generalization of graphs, and use the Hodge 1-Laplacian of the simplicial complex to derive embeddings of these edge-flows.

This enables us to classify trajectories based on simply interpretable, low-dimensional statistics.

We show how this tech…

1 час назад @ paperswithcode.com
/bocconi-artlab/ Native state of natural proteins optimises local entropy
/bocconi-artlab/ Native state of natural proteins optimises local entropy /bocconi-artlab/ Native state of natural proteins optimises local entropy

The differing ability of polypeptide conformations to act as the native state of proteins has long been rationalized in terms of differing kinetic accessibility or thermodynamic stability.

Building on the successful applications of physical concepts and sampling algorithms recently introduced in the study of disordered systems, in particular artificial neural networks, we quantitatively explore how well a quantity known as the local entropy describes the native state of model proteins...

In lattice models and all-atom representations of proteins, we are able to efficiently sample high local entropy states and to provide a proof of concept of enhanced stability and folding rate.

Our methods …

1 час назад @ paperswithcode.com
/jlee1998/ Learning dynamical systems from data: A simple cross-validation perspective, part III: Irregularly-Sampled Time Series
/jlee1998/ Learning dynamical systems from data: A simple cross-validation perspective, part III: Irregularly-Sampled Time Series /jlee1998/ Learning dynamical systems from data: A simple cross-validation perspective, part III: Irregularly-Sampled Time Series

A simple and interpretable way to learn a dynamical system from data is to interpolate its vector-field with a kernel.

Despite its previous successes, this strategy (based on interpolating the vector field driving the dynamical system) breaks down when the observed time series is not regularly sampled in time.

In this work, we propose to address this problem by directly approximating the vector field of the dynamical system by incorporating time differences between observations in the (KF) data-adapted kernels.

We compare our approach with the classical one over different benchmark dynamical systems and show that it significantly improves the forecasting accuracy while remaining simple, fas…

1 час назад @ paperswithcode.com
/Boese0601/ CDNet is all you need: Cascade DCN based underwater object detection RCNN
/Boese0601/ CDNet is all you need: Cascade DCN based underwater object detection RCNN /Boese0601/ CDNet is all you need: Cascade DCN based underwater object detection RCNN

Object detection is a very important basic research direction in the field of computer vision and a basic method for other advanced tasks in the field of computer vision.

It has been widely used in practical applications such as object tracking, video behavior recognition and underwater robotics vision...

The Cascade-RCNN and Deformable Convolution Network are both classical and excellent object detection algorithms.

In this report, we evaluate our Cascade-DCN based method on underwater optical image and acoustic image datasets with different engineering tricks and augmentation.

1 час назад @ paperswithcode.com
/liuzrcc/ Going Grayscale: The Road to Understanding and Improving Unlearnable Examples
/liuzrcc/ Going Grayscale: The Road to Understanding and Improving Unlearnable Examples /liuzrcc/ Going Grayscale: The Road to Understanding and Improving Unlearnable Examples

Recent work has shown that imperceptible perturbations can be applied to craft unlearnable examples (ULEs), i.e.

images whose content cannot be used to improve a classifier during training.

In this paper, we reveal the road that researchers should follow for understanding ULEs and improving ULEs as they were originally formulated (ULEOs)...

First, we show that ULEOs exploit color and, consequently, their effects can be mitigated by simple grayscale pre-filtering, without resorting to adversarial training.

Fourth, we demonstrate that when a classifier is trained on ULEOs, adversarial training will prevent a drop in accuracy measured both on clean images and on adversarial images.

1 час назад @ paperswithcode.com
/peterbhase/ Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs
/peterbhase/ Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs /peterbhase/ Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs

Do language models have beliefs about the world?

Dennett (1995) famously argues that even thermostats have beliefs, on the view that a belief is simply an informational state decoupled from any motivational state...

In this paper, we discuss approaches to detecting when models have beliefs about the world, and we improve on methods for updating model beliefs to be more truthful, with a focus on methods based on learned optimizers or hypernetworks.

Our experiments suggest that models possess belief-like qualities to only a limited extent, but update methods can both fix incorrect model beliefs and greatly improve their consistency.

Although off-the-shelf optimizers are surprisingly strong be…

1 час назад @ paperswithcode.com
/iai-group/ Soliciting User Preferences in Conversational Recommender Systems via Usage-related Questions
/iai-group/ Soliciting User Preferences in Conversational Recommender Systems via Usage-related Questions /iai-group/ Soliciting User Preferences in Conversational Recommender Systems via Usage-related Questions

A key distinguishing feature of conversational recommender systems over traditional recommender systems is their ability to elicit user preferences using natural language.

Currently, the predominant approach to preference elicitation is to ask questions directly about items or item attributes...

In this paper, we propose a novel approach to preference elicitation by asking implicit questions based on item usage.

Then, we generate implicit preference elicitation questions from those sentences using a neural text-to-text model.

Additionally, we provide an analysis of patterns where the model does not perform optimally.

1 час назад @ paperswithcode.com
/himashi92/ A Volumetric Transformer for Accurate 3D Tumor Segmentation
/himashi92/ A Volumetric Transformer for Accurate 3D Tumor Segmentation /himashi92/ A Volumetric Transformer for Accurate 3D Tumor Segmentation

This paper presents a Transformer architecture for volumetric medical image segmentation.

Designing a computationally efficient Transformer architecture for volumetric segmentation is a challenging task...

It requires keeping a complex balance in encoding local and global spatial cues, and preserving information along all axes of the volumetric data.

The proposed volumetric Transformer has a U-shaped encoder-decoder design that processes the input voxels in their entirety.

Our proposed design choices result in a computationally efficient architecture, which demonstrates promising results on Brain Tumor Segmentation (BraTS) 2021, and Medical Segmentation Decathlon (Pancreas and Liver) datase…

1 час назад @ paperswithcode.com
/edwardbturner/ Graph Auto-Encoders for Financial Clustering
/edwardbturner/ Graph Auto-Encoders for Financial Clustering /edwardbturner/ Graph Auto-Encoders for Financial Clustering

Deep learning has shown remarkable results on Euclidean data (e.g.

audio, images, text) however this type of data is limited in the amount of relational information it can hold.

In mathematics we can model more general relational data in a graph structure while retaining Euclidean data as associated node or edge features... Due to the ubiquity of graph data, and its ability to hold multiple dimensions of information, graph deep learning has become a fast emerging field.

We look at applying and optimising graph deep learning on a finance graph to produce more informed clusters of companies.

This can provide financial institutions with an edge over competitors which is crucial in the heavily …

1 час назад @ paperswithcode.com
/WalBouss/ Efficient Self-Ensemble Framework for Semantic Segmentation
/WalBouss/ Efficient Self-Ensemble Framework for Semantic Segmentation /WalBouss/ Efficient Self-Ensemble Framework for Semantic Segmentation

Ensemble of predictions is known to perform better than individual predictions taken separately.

semantic segmentation, creating an ensemble of learners that needs to be trained separately is hardly tractable...

In this work, we propose to leverage the performance boost offered by ensemble methods to enhance the semantic segmentation, while avoiding the traditional heavy training cost of the ensemble.

Our self-ensemble framework takes advantage of the multi-scale features set produced by feature pyramid network methods to feed independent decoders, thus creating an ensemble within a single model.

Our self-ensemble framework outperforms the current state-of-the-art on the benchmark datasets …

1 час назад @ paperswithcode.com
/tamlhp/ A dual benchmarking study of facial forgery and facial forensics
/tamlhp/ A dual benchmarking study of facial forgery and facial forensics /tamlhp/ A dual benchmarking study of facial forgery and facial forensics

In recent years, visual forgery has reached a level of sophistication that humans cannot identify fraud, which poses a significant threat to information security.

In this paper, we present a benchmark that provides in-depth insights into visual forgery and visual forensics, using a comprehensive and empirical approach.

More specifically, we develop an independent framework that integrates state-of-the-arts counterfeit generators and detectors, and measure the performance of these techniques using various criteria.

We also perform an exhaustive analysis of the benchmarking results, to determine the characteristics of the methods that serve as a comparative reference in this never-ending war …

2 часа назад @ paperswithcode.com
/alibaba-miil/ ML-Decoder: Scalable and Versatile Classification Head
/alibaba-miil/ ML-Decoder: Scalable and Versatile Classification Head /alibaba-miil/ ML-Decoder: Scalable and Versatile Classification Head

5 часов назад @ paperswithcode.com
Papers With Code Papers With Code
последний пост 11 минут назад
/zhiningliu1998/ IMBENS: Ensemble Class-imbalanced Learning in Python
/zhiningliu1998/ IMBENS: Ensemble Class-imbalanced Learning in Python /zhiningliu1998/ IMBENS: Ensemble Class-imbalanced Learning in Python

imbalanced-ensemble, abbreviated as imbens, is an open-source Python toolbox for quick implementing and deploying ensemble learning algorithms on class-imbalanced data.

It provides access to multiple state-of-the-art ensemble imbalanced learning (EIL) methods, a visualizer, and utility functions for dealing with the class imbalance problem...

These ensemble methods include resampling-based, e.g., under/over-sampling, and reweighting-based ones, e.g., cost-sensitive learning.

Beyond the implementation, we also extend conventional binary EIL algorithms with new functionalities like multi-class support and resampling scheduler, thereby enabling them to handle more complex tasks.

imbens is released u…

6 часов назад @ paperswithcode.com
/unispac/ Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks
/unispac/ Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks /unispac/ Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks

One major goal of the AI security community is to securely and reliably produce and deploy deep learning models for real-world applications.

We attribute this imbalance of vigilance to the weak practicality of existing deployment-stage backdoor attack algorithms and the insufficiency of real-world attack demonstrations.

To fill the blank, in this work, we study the realistic threat of deployment-stage backdoor attacks on DNNs.

We base our study on a commonly used deployment-stage attack paradigm -- adversarial weight attack, where adversaries selectively modify model weights to embed backdoor into deployed DNNs.

Extensive experimental simulations and system-level real-world attack demonstra…

6 часов назад @ paperswithcode.com
/CSIPlab/ Coded Illumination for Improved Lensless Imaging
/CSIPlab/ Coded Illumination for Improved Lensless Imaging /CSIPlab/ Coded Illumination for Improved Lensless Imaging

Mask-based lensless cameras can be flat, thin and light-weight, which makes them suitable for novel designs of computational imaging systems with large surface areas and arbitrary shapes.

Despite recent progress in lensless cameras, the quality of images recovered from the lensless cameras is often poor due to the ill-conditioning of the underlying measurement system...

In this paper, we propose to use coded illumination to improve the quality of images reconstructed with lensless cameras.

In our imaging model, the scene/object is illuminated by multiple coded illumination patterns as the lensless camera records sensor measurements.

We designed and tested a number of illumination patterns a…

6 часов назад @ paperswithcode.com
/qingyonghu/ Detecting and Tracking Small and Dense Moving Objects in Satellite Videos: A Benchmark
/qingyonghu/ Detecting and Tracking Small and Dense Moving Objects in Satellite Videos: A Benchmark /qingyonghu/ Detecting and Tracking Small and Dense Moving Objects in Satellite Videos: A Benchmark

Satellite video cameras can provide continuous observation for a large-scale area, which is important for many remote sensing applications.

However, achieving moving object detection and tracking in satellite videos remains challenging due to the insufficient appearance information of objects and lack of high-quality datasets...

In this paper, we first build a large-scale satellite video dataset with rich annotations for the task of moving object detection and tracking.

This dataset is collected by the Jilin-1 satellite constellation and composed of 47 high-quality videos with 1,646,038 instances of interest for object detection and 3,711 trajectories for object tracking.

Finally, we establ…

6 часов назад @ paperswithcode.com
/jchenxu/ Exploring Versatile Prior for Human Motion via Motion Frequency Guidance
/jchenxu/ Exploring Versatile Prior for Human Motion via Motion Frequency Guidance /jchenxu/ Exploring Versatile Prior for Human Motion via Motion Frequency Guidance

Prior plays an important role in providing the plausible constraint on human motion.

Previous works design motion priors following a variety of paradigms under different circumstances, leading to the lack of versatility...

In this paper, we first summarize the indispensable properties of the motion prior, and accordingly, design a framework to learn the versatile motion prior, which models the inherent probability distribution of human motions.

Specifically, for efficient prior representation learning, we propose a global orientation normalization to remove redundant environment information in the original motion data space.

Embedding our motion prior into prevailing backbones on three diff…

6 часов назад @ paperswithcode.com
/sameerkhanna786/ Computer Vision User Entity Behavior Analytics
/sameerkhanna786/ Computer Vision User Entity Behavior Analytics /sameerkhanna786/ Computer Vision User Entity Behavior Analytics

Insider threats are costly, hard to detect, and unfortunately rising in occurrence.

Seeking to improve detection of such threats, we develop novel techniques to enable us to extract powerful features, generate high quality image encodings, and augment attack vectors for greater classification power...

Combined, they form Computer Vision User and Entity Behavior Analytics, a detection system designed from the ground up to improve upon advancements in academia and mitigate the issues that prevent the usage of advanced models in industry.

The proposed system beats state-of-art methods used in academia and as well as in industry.

read morePDFAbstract

6 часов назад @ paperswithcode.com
/naruya/ VaxNeRF: Revisiting the Classic for Voxel-Accelerated Neural Radiance Field
/naruya/ VaxNeRF: Revisiting the Classic for Voxel-Accelerated Neural Radiance Field /naruya/ VaxNeRF: Revisiting the Classic for Voxel-Accelerated Neural Radiance Field

Neural Radiance Field (NeRF) is a popular method in data-driven 3D reconstruction.

Many attempts are made to speeding up NeRF training and inference, including intricate code-level optimization and caching, use of sophisticated data structures, and amortization through multi-task and meta learning.

We propose Voxel-Accelearated NeRF (VaxNeRF), integrating NeRF with visual hull, a classic 3D reconstruction technique only requiring binary foreground-background pixel labels per image.

Visual hull, which can be optimized in about 10 seconds, can provide coarse in-out field separation to omit substantial amounts of network evaluations in NeRF.

With sufficient compute, this effectively brings dow…

6 часов назад @ paperswithcode.com
/tolgabirdal/ Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks
/tolgabirdal/ Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks /tolgabirdal/ Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

Disobeying the classical wisdom of statistical learning theory, modern deep neural networks generalize well even though they typically contain millions of parameters.

Recently, it has been shown that the trajectories of iterative optimization algorithms can possess fractal structures, and their generalization error can be formally linked to the complexity of such fractals...

This complexity is measured by the fractal's intrinsic dimension, a quantity usually much smaller than the number of parameters in the network.

Even though this perspective provides an explanation for why overparametrized networks would not overfit, computing the intrinsic dimension (e.g., for monitoring generalization …

6 часов назад @ paperswithcode.com
/cctakaet/ ContourletNet: A Generalized Rain Removal Architecture Using Multi-Direction Hierarchical Representation
/cctakaet/ ContourletNet: A Generalized Rain Removal Architecture Using Multi-Direction Hierarchical Representation /cctakaet/ ContourletNet: A Generalized Rain Removal Architecture Using Multi-Direction Hierarchical Representation

The rainy scenarios can be categorized into two classes: moderate rain and heavy rain scenes...

Moderate rain scene mainly consists of rain streaks while heavy rain scene contains both rain streaks and the veiling effect (similar to haze).

Although existing methods have achieved excellent performance on these two cases individually, it still lacks a general architecture to address both heavy rain and moderate rain scenarios effectively.

In this paper, we construct a hierarchical multi-direction representation network by using the contourlet transform (CT) to address both moderate rain and heavy rain scenarios.

This is the first architecture that can address both of the two scenarios effecti…

6 часов назад @ paperswithcode.com
/phyllist/ Unbiased Pairwise Learning to Rank in Recommender Systems
/phyllist/ Unbiased Pairwise Learning to Rank in Recommender Systems /phyllist/ Unbiased Pairwise Learning to Rank in Recommender Systems

Nowadays, recommender systems already impact almost every facet of peoples lives.

To provide personalized high quality recommendation results, conventional systems usually train pointwise rankers to predict the absolute value of objectives and leverage a distinct shallow tower to estimate and alleviate the impact of position bias...

Unbiased learning to rank algorithms, which are verified to model the relative relevance accurately based on noisy feedback, are appealing candidates and have already been applied in many applications with single categorical labels, such as user click signals.

Nevertheless, the existing unbiased LTR methods cannot properly handle multiple feedback incorporating …

6 часов назад @ paperswithcode.com
/beaquino/ Robustness against Adversarial Attacks in Neural Networks using Incremental Dissipativity
/beaquino/ Robustness against Adversarial Attacks in Neural Networks using Incremental Dissipativity /beaquino/ Robustness against Adversarial Attacks in Neural Networks using Incremental Dissipativity

Adversarial examples can easily degrade the classification performance in neural networks.

Empirical methods for promoting robustness to such examples have been proposed, but often lack both analytical insights and formal guarantees...

This work proposes an incremental dissipativity-based robustness certificate for neural networks in the form of a linear matrix inequality for each layer.

We also propose an equivalent spectral norm bound for this certificate which is scalable to neural networks with multiple layers.

We demonstrate the improved performance against adversarial attacks on a feed-forward neural network trained on MNIST and an Alexnet trained using CIFAR-10.

6 часов назад @ paperswithcode.com
/zeyademam/ Active Learning at the ImageNet Scale
/zeyademam/ Active Learning at the ImageNet Scale /zeyademam/ Active Learning at the ImageNet Scale

Active learning (AL) algorithms aim to identify an optimal subset of data for annotation, such that deep neural networks (DNN) can achieve better performance when trained on this labeled subset.

AL is especially impactful in industrial scale settings where data labeling costs are high and practitioners use every tool at their disposal to improve model performance...

The recent success of self-supervised pretraining (SSP) highlights the importance of harnessing abundant unlabeled data to boost model performance.

By combining AL with SSP, we can make use of unlabeled data while simultaneously labeling and training on particularly informative samples.

Among the existing baselines we test, popu…

6 часов назад @ paperswithcode.com
/sparisi/ Interesting Object, Curious Agent: Learning Task-Agnostic Exploration
/sparisi/ Interesting Object, Curious Agent: Learning Task-Agnostic Exploration /sparisi/ Interesting Object, Curious Agent: Learning Task-Agnostic Exploration

Common approaches for task-agnostic exploration learn tabula-rasa --the agent assumes isolated environments and no prior knowledge or experience.

In this paper, we propose a paradigm change in the formulation and evaluation of task-agnostic exploration.

In this setup, the agent first learns to explore across many environments without any extrinsic goal in a task-agnostic manner.

In this context, we evaluate several baseline exploration strategies and present a simple yet effective approach to learning task-agnostic exploration policies.

We also introduce benchmarks and metrics for evaluating task-agnostic exploration strategies.

6 часов назад @ paperswithcode.com
/mrhuff/ A Kernel Test for Causal Association via Noise Contrastive Backdoor Adjustment
/mrhuff/ A Kernel Test for Causal Association via Noise Contrastive Backdoor Adjustment /mrhuff/ A Kernel Test for Causal Association via Noise Contrastive Backdoor Adjustment

Causal inference grows increasingly complex as the number of confounders increases.

Additionally, we establish convergence properties of the estimators of covariance operators used in bd-HSIC.

We investigate the advantages and disadvantages of bd-HSIC against parametric tests as well as the importance of using the do-null testing in contrast to marginal independence testing or conditional independence testing.

A complete implementation can be found at \hyperlink{https://github.com/MrHuff/kgformula}{\texttt{https://github.com/MrHuff/kgformula}}.

read morePDFAbstract

6 часов назад @ paperswithcode.com
/frane-team/ Unsupervised Feature Ranking via Attribute Networks
/frane-team/ Unsupervised Feature Ranking via Attribute Networks /frane-team/ Unsupervised Feature Ranking via Attribute Networks

The need for learning from unlabeled data is increasing in contemporary machine learning.

Methods for unsupervised feature ranking, which identify the most important features in such data are thus gaining attention, and so are their applications in studying high throughput biological experiments or user bases for recommender systems... We propose FRANe (Feature Ranking via Attribute Networks), an unsupervised algorithm capable of finding key features in given unlabeled data set.

FRANe is based on ideas from network reconstruction and network analysis.

Moreover, we provide the time complexity analysis of FRANe further demonstrating its scalability.

Finally, FRANe offers as the result the int…

6 часов назад @ paperswithcode.com
Papers With Code Papers With Code
последний пост 11 минут назад
/Guzpenha/ Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators
/Guzpenha/ Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators /Guzpenha/ Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators

Heavily pre-trained transformers for language modelling, such as BERT, have shown to be remarkably effective for Information Retrieval (IR) tasks, typically applied to re-rank the results of a first-stage retrieval model.

IR benchmarks evaluate the effectiveness of retrieval pipelines based on the premise that a single query is used to instantiate the underlying information need...

Motivated by those observations we aim to answer the following question: how robust are retrieval pipelines with respect to different variations in queries that do not change the queries' semantics?

For each syntax-changing category of our taxonomy, we employed different automatic methods that when applied to a q…

6 часов назад @ paperswithcode.com
/bmda-unibas/ Learning Conditional Invariance through Cycle Consistency
/bmda-unibas/ Learning Conditional Invariance through Cycle Consistency /bmda-unibas/ Learning Conditional Invariance through Cycle Consistency

Identifying meaningful and independent factors of variation in a dataset is a challenging learning task frequently addressed by means of deep latent variable models.

This task can be viewed as learning symmetry transformations preserving the value of a chosen property along latent dimensions...

However, existing approaches exhibit severe drawbacks in enforcing the invariance property in the latent space.

In order to enforce invariance as well as sparsity in the latent space, we incorporate semantic knowledge by using cycle consistency constraints relying on property side information.

We demonstrate on synthetic and molecular data that our approach identifies more meaningful factors which le…

6 часов назад @ paperswithcode.com
/vsimkus/ Variational Gibbs inference for statistical model estimation from incomplete data
/vsimkus/ Variational Gibbs inference for statistical model estimation from incomplete data /vsimkus/ Variational Gibbs inference for statistical model estimation from incomplete data

Statistical models are central to machine learning with broad applicability across a range of downstream tasks.

The theory of statistical model estimation from incomplete data is conceptually similar to the estimation of latent-variable models, where powerful tools such as variational inference (VI) exist.

However, in contrast to standard latent-variable models, parameter estimation with incomplete data often requires estimating exponentially-many conditional distributions of the missing variables, hence making standard VI methods intractable.

We address this gap by introducing variational Gibbs inference (VGI), a new general-purpose method to estimate the parameters of statistical models f…

6 часов назад @ paperswithcode.com
/is2ai/ KazNERD: Kazakh Named Entity Recognition Dataset
/is2ai/ KazNERD: Kazakh Named Entity Recognition Dataset /is2ai/ KazNERD: Kazakh Named Entity Recognition Dataset

We present the development of a dataset for Kazakh named entity recognition.

The dataset was built as there is a clear need for publicly available annotated corpora in Kazakh, as well as annotation guidelines containing straightforward--but rigorous--rules and examples...

The dataset annotation, based on the IOB2 scheme, was carried out on television news text by two native Kazakh speakers under the supervision of the first author.

State-of-the-art machine learning models to automatise Kazakh named entity recognition were also built, with the best-performing model achieving an exact match F1-score of 97.22% on the test set.

The annotated dataset, guidelines, and codes used to train the mode…

6 часов назад @ paperswithcode.com
/jingjing-nlp/ KNAS: Green Neural Architecture Search
/jingjing-nlp/ KNAS: Green Neural Architecture Search /jingjing-nlp/ KNAS: Green Neural Architecture Search

Many existing neural architecture search (NAS) solutions rely on downstream training for architecture evaluation, which takes enormous computations.

Considering that these computations bring a large carbon footprint, this paper aims to explore a green (namely environmental-friendly) NAS solution that evaluates architectures without training...

It motivates us to propose the gradient kernel hypothesis: Gradients can be used as a coarse-grained proxy of downstream training to evaluate random-initialized networks.

According to this hypothesis, we propose a new kernel based architecture search approach KNAS.

Experiments show that KNAS achieves competitive results with orders of magnitude faster…

6 часов назад @ paperswithcode.com
/lhbo/ Using Shapley Values and Variational Autoencoders to Explain Predictive Models with Dependent Mixed Features
/lhbo/ Using Shapley Values and Variational Autoencoders to Explain Predictive Models with Dependent Mixed Features /lhbo/ Using Shapley Values and Variational Autoencoders to Explain Predictive Models with Dependent Mixed Features

Shapley values are today extensively used as a model-agnostic explanation framework to explain complex predictive machine learning models.

Shapley values have desirable theoretical properties and a sound mathematical foundation...

Precise Shapley value estimates for dependent data rely on accurate modeling of the dependencies between all feature combinations.

In this paper, we use a variational autoencoder with arbitrary conditioning (VAEAC) to model all feature dependencies simultaneously.

Finally, we apply VAEAC to the Abalone data set from the UCI Machine Learning Repository.

6 часов назад @ paperswithcode.com
/minzwon/ Emotion Embedding Spaces for Matching Music to Stories
/minzwon/ Emotion Embedding Spaces for Matching Music to Stories /minzwon/ Emotion Embedding Spaces for Matching Music to Stories

In this paper, our goal is to help creators find music to match the emotion of their story... We focus on text-based stories that can be auralized (e.g., books), use multiple sentences as input queries, and automatically retrieve matching music.

Both the music and text domains have existing datasets with emotion labels, but mismatched emotion vocabularies prevent us from using mood or emotion annotations directly for matching.

To address this challenge, we propose and investigate several emotion embedding spaces, both manually defined (e.g., valence/arousal) and data-driven (e.g., Word2Vec and metric learning) to bridge this gap.

Our experiments show that by leveraging these embedding space…

6 часов назад @ paperswithcode.com
/lee-man/ Testability-Aware Low Power Controller Design with Evolutionary Learning
/lee-man/ Testability-Aware Low Power Controller Design with Evolutionary Learning /lee-man/ Testability-Aware Low Power Controller Design with Evolutionary Learning

XORNet-based low power controller is a popular technique to reduce circuit transitions in scan-based testing.

However, existing solutions construct the XORNet evenly for scan chain control, and it may result in sub-optimal solutions without any design guidance...

In this paper, we propose a novel testability-aware low power controller with evolutionary learning.

Experimental results indicate that under the same control bits, our GA-guided XORNet design can improve the fault coverage by up to 2.11%.

The proposed GA-guided XORNets also allows reducing the number of control bits, and the total testing time decreases by 20.78% on average and up to 47.09% compared to the existing design without …

6 часов назад @ paperswithcode.com
/xuedashuai/ Hierarchical Motion Encoder-Decoder Network for Trajectory Forecasting
/xuedashuai/ Hierarchical Motion Encoder-Decoder Network for Trajectory Forecasting /xuedashuai/ Hierarchical Motion Encoder-Decoder Network for Trajectory Forecasting

Recent works focus on modeling spatial social impacts or temporal motion attentions, but neglect inherent properties of motions, i.e.

This paper proposes a context-free Hierarchical Motion Encoder-Decoder Network (HMNet) for vehicle trajectory prediction.

HMNet first infers the hierarchical difference on motions to encode physically compliant patterns with high expressivity of moving trends and driving intentions.

Besides, we present a modified social pooling module which considers certain motion properties to represent social interactions.

Experiments on three public trajectory prediction datasets, i.e.

6 часов назад @ paperswithcode.com
/imsb-uke/ Towards Explainable End-to-End Prostate Cancer Relapse Prediction from H&E Images Combining Self-Attention Multiple Instance Learning with a Recurrent Neural Network
/imsb-uke/ Towards Explainable End-to-End Prostate Cancer Relapse Prediction from H&E Images Combining Self-Attention Multiple Instance Learning with a Recurrent Neural Network /imsb-uke/ Towards Explainable End-to-End Prostate Cancer Relapse Prediction from H&E Images Combining Self-Attention Multiple Instance Learning with a Recurrent Neural Network

Clinical decision support for histopathology image data mainly focuses on strongly supervised annotations, which offers intuitive interpretability, but is bound by expert performance.

Our model is well-calibrated and outputs survival curves as well as a risk score and group per patient.

Making use of the attention weights of a multiple instance learning layer, we show that malignant patches have a higher influence on the prediction than benign patches, thus offering an intuitive interpretation of the prediction.

Our code is available at www.github.com/imsb-uke/ecarenet.

read morePDFAbstract

6 часов назад @ paperswithcode.com
/saibr/ Inside Out Visual Place Recognition
/saibr/ Inside Out Visual Place Recognition /saibr/ Inside Out Visual Place Recognition

Visual Place Recognition (VPR) is generally concerned with localizing outdoor images.

However, localizing indoor scenes that contain part of an outdoor scene can be of large value for a wide range of applications...

In this paper, we introduce Inside Out Visual Place Recognition (IOVPR), a task aiming to localize images based on outdoor scenes visible through windows.

Additionally, we introduce a new training protocol Inside Out Data Augmentation to adapt Visual Place Recognition methods for localizing indoor images, demonstrating the potential of Inside Out Visual Place Recognition.

With this new task we aim to encourage development of methods for IOVPR.

6 часов назад @ paperswithcode.com
/franzthaler/ Efficient Multi-Organ Segmentation Using SpatialConfiguration-Net with Low GPU Memory Requirements
/franzthaler/ Efficient Multi-Organ Segmentation Using SpatialConfiguration-Net with Low GPU Memory Requirements /franzthaler/ Efficient Multi-Organ Segmentation Using SpatialConfiguration-Net with Low GPU Memory Requirements

Even though many semantic segmentation methods exist that are able to perform well on many medical datasets, often, they are not designed for direct use in clinical practice.

The two main concerns are generalization to unseen data with a different visual appearance, e.g., images acquired using a different scanner, and efficiency in terms of computation time and required Graphics Processing Unit (GPU) memory...

In this work, we employ a multi-organ segmentation model based on the SpatialConfiguration-Net (SCN), which integrates prior knowledge of the spatial configuration among the labelled organs to resolve spurious responses in the network outputs.

Furthermore, we modified the architecture…

6 часов назад @ paperswithcode.com
/gitycc/ Traditional Chinese Synthetic Datasets Verified with Labeled Data for Scene Text Recognition
/gitycc/ Traditional Chinese Synthetic Datasets Verified with Labeled Data for Scene Text Recognition /gitycc/ Traditional Chinese Synthetic Datasets Verified with Labeled Data for Scene Text Recognition

Scene text recognition (STR) has been widely studied in academia and industry.

Training a text recognition model often requires a large amount of labeled data, but data labeling can be difficult, expensive, or time-consuming, especially for Traditional Chinese text recognition... To the best of our knowledge, public datasets for Traditional Chinese text recognition are lacking.

This paper presents a framework for a Traditional Chinese synthetic data engine which aims to improve text recognition model performance.

We generated over 20 million synthetic data and collected over 7,000 manually labeled data TC-STR 7k-word as the benchmark.

Experimental results show that a text recognition model …

6 часов назад @ paperswithcode.com
/immortaltracker/ Immortal Tracker: Tracklet Never Dies
/immortaltracker/ Immortal Tracker: Tracklet Never Dies /immortaltracker/ Immortal Tracker: Tracklet Never Dies

Previous online 3D Multi-Object Tracking(3DMOT) methods terminate a tracklet when it is not associated with new detections for a few frames.

To address this, we propose Immortal Tracker, a simple tracking system that utilizes trajectory prediction to maintain tracklets for objects gone dark.

We employ a simple Kalman filter for trajectory prediction and preserve the tracklet by prediction when the target is not visible.

With this method, we can avoid 96% vehicle identity switches resulting from premature tracklet termination.

We believe the proposed Immortal Tracker can offer a simple yet powerful solution for pushing the limit of 3DMOT.

6 часов назад @ paperswithcode.com
/haofeixu/ GMFlow: Learning Optical Flow via Global Matching
/haofeixu/ GMFlow: Learning Optical Flow via Global Matching /haofeixu/ GMFlow: Learning Optical Flow via Global Matching

Learning-based optical flow estimation has been dominated with the pipeline of cost volume with convolutions for flow regression, which is inherently limited to local correlations and thus is hard to address the long-standing challenge of large displacements.

To alleviate this, the state-of-the-art method, i.e., RAFT, gradually improves the quality of its predictions by producing a sequence of flow updates via a large number of iterative refinements, achieving remarkable performance but slowing down the inference speed... To enable both high accuracy and efficiency optical flow estimation, we completely revamp the dominating flow regression pipeline by reformulating optical flow as a global…

6 часов назад @ paperswithcode.com
💼 University and corporation labs
DeepMind DeepMind
последний пост 3 недели, 6 дней назад
Real-World Challenges for AGI
Real-World Challenges for AGI Real-World Challenges for AGI

AI is already enabling huge leaps in tackling fundamental challenges: from solving protein folding to predicting accurate weather patterns, scientists are increasingly using AI to deduce the rules and principles that underpin highly complex real-world domains - ones they might never have discovered unaided.

Two real-world domains that scientists at DeepMind are contributing to tackle climate change while developing what’s required to build AGI are weather prediction and plasma control for fusion.

Weather patterns are almost impossible to precisely model - it’s an example of nature’s variations at its fullest.

As we develop AGI, addressing global challenges such as climate change will not on…

3 недели, 6 дней назад @ deepmind.com
Opening up a physics simulator for robotics
Opening up a physics simulator for robotics Opening up a physics simulator for robotics

This subtle complexity makes simulating physical contact — a vital component of robotics research — a tricky task.

Already widely used within the robotics community, including as the physics simulator of choice for DeepMind’s robotics team, MuJoCo features a rich contact model, powerful scene description language, and a well-designed API.

As we work to prepare the codebase, we are making MuJoCo freely available as a precompiled library.

MuJoCo, which stands for Multi-Joint Dynamics with Contact, hits a sweet spot with its contact model, which accurately and efficiently captures the salient features of contacting objects.

Unlike other simulators, MuJoCo resolves contact forces using the conv…

1 месяц, 1 неделя назад @ deepmind.com
Stacking our way to more general robots
Stacking our way to more general robots Stacking our way to more general robots

In “Skill Mastery,” our goal is to train a single agent that’s skilled in stacking a predefined set of five triplets.

To test for generalisation, these training objects exclude the family of objects from which the test triplets were chosen.

Next, we train a new policy in simulation that uses only realistic observations: images and the robot’s proprioceptive state.

The state policy serves as a teacher, providing the learning agent with corrections to its behaviours, and those corrections are distilled into the new policy.

This allows us to use the data that’s passively collected during the project instead of running a time-consuming online training algorithm on the real robots.

1 месяц, 2 недели назад @ deepmind.com
Predicting gene expression with AI
Predicting gene expression with AI Predicting gene expression with AI

Based on Transformers, our new Enformer architecture advances genetic research by improving the ability to predict how DNA sequence influences gene expression.

Today Nature Methods published “Effective gene expression prediction from sequence by integrating long-range interactions” (first shared as a preprint on bioRxiv), in which we — in collaboration with our Alphabet colleagues at Calico — introduce a neural network architecture called Enformer that led to greatly increased accuracy in predicting gene expression from DNA sequence.

Previous work on gene expression has typically used convolutional neural networks as fundamental building blocks, but their limitations in modelling the influe…

1 месяц, 3 недели назад @ deepmind.com
Nowcasting the Next Hour of Rain
Nowcasting the Next Hour of Rain Nowcasting the Next Hour of Rain

In this evolving book of weather prediction, we now add a story on the role of machine learning for forecasting.

Today’s weather predictions are driven by powerful numerical weather prediction (NWP) systems.

Nowcasting fills the performance gap in this crucial time interval.

Nowcasting is essential for sectors like water management, agriculture, aviation, emergency planning, and outdoor events.

This combination of a crucial area where existing methods struggle and the availability of high-quality data provides the opportunity for machine learning to make its contributions to nowcasting.

2 месяца назад @ deepmind.com
Building architectures that can handle the world’s data
Building architectures that can handle the world’s data Building architectures that can handle the world’s data

Perceiver and Perceiver IO work as multi-purpose tools for AIMost architectures used by AI systems today are specialists.

A 2D residual network may be a good choice for processing images, but at best it’s a loose fit for other kinds of data — such as the Lidar signals used in self-driving cars or the torques used in robotics.

What’s more, standard architectures are often designed with only one task in mind, often leading engineers to bend over backwards to reshape, distort, or otherwise modify their inputs and outputs in hopes that a standard architecture can learn to handle their problem correctly.

Dealing with more than one kind of data, like the sounds and images that make up videos, is …

3 месяца, 4 недели назад @ deepmind.com
Generally capable agents emerge from open-ended play
Generally capable agents emerge from open-ended play Generally capable agents emerge from open-ended play

Through reinforcement learning (RL), this single system learnt by playing round after round of games through a repetitive process of trial and error.

But AlphaZero still trained separately on each game — unable to simply learn another game or task without repeating the RL process from scratch.

Today, we published "Open-Ended Learning Leads to Generally Capable Agents," a preprint detailing our first steps to train an agent capable of playing many different games without needing human interaction data.

The agent’s capabilities improve iteratively as a response to the challenges that arise in training, with the learning process continually refining the training tasks so the agent never stops …

4 месяца назад @ deepmind.com
Putting the power of AlphaFold into the world’s hands
Putting the power of AlphaFold into the world’s hands Putting the power of AlphaFold into the world’s hands

Today, I’m incredibly proud and excited to announce that DeepMind is making a significant contribution to humanity’s understanding of biology.

When we announced AlphaFold 2 last December, it was hailed as a solution to the 50-year old protein folding problem.

As researchers seek cures for diseases and pursue solutions to other big problems facing humankind – including antibiotic resistance, microplastic pollution, and climate change – they will benefit from fresh insights into the structure of proteins.

The same way that the structure of a machine tells you what it does, so the structure of a protein helps us understand its function.

Today, we are sharing a trove of information that doubles…

4 месяца, 1 неделя назад @ deepmind.com
An update on our racial justice efforts
An update on our racial justice efforts An update on our racial justice efforts

I then shared some thoughts around DeepMind's intention to help combat racism and advance racial equity.

We also explored - and gathered feedback - on how we could best support racial justice in the communities DeepMind interacts with.

Today I’m pleased to share one of the outcomes of that process: putting resources directly in the hands of Black communities, so they can decide where they need them most.

In the past months, we made donations to organisations that play a vital role supporting Black communities in the UK, US and Africa.

We're delighted to support these organisations, and grateful to many of those who have shared with us an overview of their work:

5 месяцев, 4 недели назад @ deepmind.com
Advancing sports analytics through AI research
Advancing sports analytics through AI research Advancing sports analytics through AI research

Finally, we consider computer vision to be one of the most promising avenues for advancing the boundaries of state of the art sports analytics research.

The large numbers of football videos satisfies a prerequisite for modern AI techniques.

While each football video is different, the settings do not vary greatly, which makes the task ideal for sharpening AI algorithms.

The application of advanced AI techniques to football has the potential to revolutionise the game across many axes, for players, decision-makers, fans, and broadcasters.

We believe that the development of increasingly advanced AI techniques afforded by the football microcosm might be applicable to broader domains.

6 месяцев, 3 недели назад @ deepmind.com
Game theory as an engine for large-scale data analysis
Game theory as an engine for large-scale data analysis Game theory as an engine for large-scale data analysis

EigenGame maps out a new approach to solve fundamental ML problems.

This made us wonder if such a perspective modeled on game theory could help solve other fundamental machine learning problems.

Today at ICLR 2021 (the International Conference on Learning Representations), we presented “EigenGame: PCA as a Nash Equilibrium,” which received an Outstanding Paper Award.

Firstly, data was originally recorded by hand in paper notebooks, and now it is stored in data centres the size of warehouses.

By approaching the PCA problem in the right way, our insights and algorithms apply more broadly across the branches of the ML tree.

6 месяцев, 3 недели назад @ deepmind.com
Google
последний пост 6 дней, 17 часов назад
Predicting Text Selections with Federated Learning
Predicting Text Selections with Federated Learning Predicting Text Selections with Federated Learning

Federated Learning & PrivacyOne of the advantages of the federated learning approach is that it enables user privacy, because raw data is not exposed to a server.

These metrics are collected during federated training via federated analytics, a similar process as the collection of the model weights.

InternationalizationAn additional advantage of this federated learning approach for Smart Text Selection is its ability to scale to additional languages.

ConclusionWe developed a federated way of learning to predict text selections based on user interactions, resulting in much improved Smart Text Selection models deployed to Android users.

This approach required the use of federated learning, sin…

6 дней, 17 часов назад @ ai.googleblog.com
Decisiveness in Imitation Learning for Robots
Decisiveness in Imitation Learning for Robots Decisiveness in Imitation Learning for Robots

Interestingly, Implicit BC achieves these results without requiring any reward information, i.e., it can use relatively simple supervised learning rather than more-complex reinforcement learning.

Animation of how implicit models can fit discontinuities — in this case, training an implicit model to fit a step (Heaviside) function.

Examples of fitting discontinuous functions, for implicit models (top) compared to explicit models (bottom).

These are autonomous behaviors of our Implicit BC policies, using only images (from the shown camera) as input.

As we showed here, replacing explicit policies with implicit policies when doing behavioral cloning allows robots to overcome the "struggle of dec…

1 неделя, 2 дня назад @ ai.googleblog.com
Permutation-Invariant Neural Networks for Reinforcement Learning
Permutation-Invariant Neural Networks for Reinforcement Learning Permutation-Invariant Neural Networks for Reinforcement Learning

For instance, most reinforcement learning (RL) agents require their inputs to be in a pre-specified format, or else they will fail.

Each sensory neuron integrates over time information from only their particular sensory input channel.

This action is also fed back into each sensory neuron in the next time-step, closing the communication loop.

We first feed each individual observation (o t ) into a particular sensory neuron (along with the agent’s previous action, a t-1 ).

Each sensory neuron is an identical neural network that is not confined to only process information from one particular sensory input.

1 неделя, 3 дня назад @ ai.googleblog.com
Predicting Text Readability from Scrolling Interactions
Predicting Text Readability from Scrolling Interactions Predicting Text Readability from Scrolling Interactions

In “Predicting Text Readability from Scrolling Interactions”, presented at CoNLL 2021, we show that data from on-device reading interactions can be used to predict how readable a text is.

Predicting Text Difficulty from Scrolling BehaviorWe investigated which types of scrolling behaviors were most impacted by text difficulty and tested the significance using linear mixed effect models.

This is the first work to show that it is possible to predict the readability of a text using only interaction features.

This shows that there is implicit information in reading interactions about the likelihood that an individual has understood a given text.

We conducted a 518 participant study to investigat…

1 неделя, 4 дня назад @ ai.googleblog.com
RLiable: Towards Reliable Evaluation & Reporting in Reinforcement Learning
RLiable: Towards Reliable Evaluation & Reporting in Reinforcement Learning RLiable: Towards Reliable Evaluation & Reporting in Reinforcement Learning

Reinforcement learning (RL) is an area of machine learning that focuses on learning from experiences to solve decision making tasks.

Published results on deep RL benchmarks typically compare point estimates of the mean and median scores aggregated across tasks.

Thus, evaluating more runs is not a feasible solution for reducing statistical uncertainty on computationally demanding benchmarks.

For example, evaluating 3 runs each on Atari 100k, which contains 26 tasks, results in 78 sample scores for uncertainty estimation.

Since scores are normalized using maximum performance, mean scores correspond to one minus the optimality gap.

1 неделя, 4 дня назад @ ai.googleblog.com
Cloud Storage as a File System in AI Training
Cloud Storage as a File System in AI Training Cloud Storage as a File System in AI Training

Cloud Storage is a common choice for Vertex AI and AI Platform users to store their training data, models, checkpoints and logs. Now, with Cloud Storage FUSE, training jobs on both platforms can access their data on Cloud Storage as files in the local file system.This post introduces the Cloud Storage FUSE for Vertex AI Custom Training. On AI Platform Training, the feature is very similar.Cloud Storage FUSE provides 3 benefits over the traditional ways of accessing Cloud Storage:Training jobs can start quickly without downloading any training data.Training jobs can perform I/O easily at scale, without the friction of calling the Cloud Storage APIs, handling the responses, or integrating wit…

1 неделя, 4 дня назад @ cloud.google.com
Sowing the seeds of ethical AI: 4 tasks to stay on track
Sowing the seeds of ethical AI: 4 tasks to stay on track Sowing the seeds of ethical AI: 4 tasks to stay on track

The future of AI is better AI—designed with ethics and responsibility built in from the start. This means putting the brakes on AI-driven transformation until you have a well-functioning strategy and process in place to ensure your models deliver fair outcomes. Failing to recognize this imperative is a threat to your bottom line. The following post provides a simple framework to follow to keep your business on the right track as you place more trust in algorithms. AI is inherently sociotechnical. AI systems represent the interconnectedness of humans and technology. They are designed to be used by and to inform humans within specific contexts, and the speed and scale of AI means that any lac…

1 неделя, 5 дней назад @ cloud.google.com
MetNet-2: Deep Learning for 12-Hour Precipitation Forecasting
MetNet-2: Deep Learning for 12-Hour Precipitation Forecasting MetNet-2: Deep Learning for 12-Hour Precipitation Forecasting

These approaches also have the potential to increase the frequency, scope, and accuracy of the predicted forecasts.

Within weather forecasting, deep learning techniques have shown particular promise for nowcasting — i.e., predicting weather up to 2-6 hours ahead.

Compared to physics-based models, MetNet-2 outperforms the state-of-the-art HREF ensemble model for weather forecasts up to 12 hours ahead.

In order to process such a large context, MetNet-2 employs model parallelism whereby the model is distributed across 128 cores of a Cloud TPU v3-128.

Due to the size of the input context, MetNet-2 replaces the attentional layers of MetNet with computationally more efficient convolutional layers.

1 неделя, 6 дней назад @ ai.googleblog.com
An Open Source Vibrotactile Haptics Platform for On-Body Applications.
An Open Source Vibrotactile Haptics Platform for On-Body Applications. An Open Source Vibrotactile Haptics Platform for On-Body Applications.

A typical lab setup on the left and the VHP board on the right.

VHP board (small board on the right) connected to three different types of tactile actuators via rigid breakout board (large board on the left).

Using Haptic Actuators as SensorsIn our previous blog post, we explored how back-EMF in a haptic actuator could be used for sensing and demonstrated a variety of useful applications.

To explore this possibility, we collected current-load data from three off-the-shelf haptic actuators and trained a simple support vector machine classifier to recognize the difference in the signal pattern between actuators.

Potential ApplicationsThe VHP platform enables rapid experimentation and prototyp…

2 недели, 2 дня назад @ ai.googleblog.com
Making Better Future Predictions by Watching Unlabeled Videos
Making Better Future Predictions by Watching Unlabeled Videos Making Better Future Predictions by Watching Unlabeled Videos

But this time interval varies a large amount across different activities and videos — meaningful transitions occur at any distance into the future.

This work uses cues from vision and language to predict high-level changes (such as cream becoming ice cream) in video (video from HowTo100M).

MMCC achieves this by learning a latent representation shared by frames and text and then making predictions in this representation space.

Moreover, the inclusion of cross-modal edges ensures future predictions are meaningful in either modality.

We then rank all pairs according to this score and show the highest-scoring pairs of present and future frames detected in previously unseen videos (examples belo…

2 недели, 3 дня назад @ ai.googleblog.com
Model Ensembles Are Faster Than You Think
Model Ensembles Are Faster Than You Think Model Ensembles Are Faster Than You Think

What Are Model Ensembles and Cascades?

Compared to a single model, ensembles can provide improved accuracy if there is variety in the collected models’ predictions.

To encourage the adoption of model ensembles, we demonstrate the following beneficial properties:Simple to build: Ensembles do not require complicated techniques (e.g., early exit policy learning).

On-device speedup: The reduction in computation cost (FLOPS) successfully translates to a speedup on real hardware.

In some cases it is not the average computation cost but the worst-case cost that is the limiting factor.

2 недели, 4 дня назад @ ai.googleblog.com
Scale your data science workflows with the Vertex AI Workbench notebook executor
Scale your data science workflows with the Vertex AI Workbench notebook executor Scale your data science workflows with the Vertex AI Workbench notebook executor

When solving a new ML problem, it’s common to start by experimenting with a subset of your data in a notebook environment. But if you want to execute a long-running job, add accelerators, or run multiple training trials with different input parameters, you’ll likely find yourself copying code over to a Python file to do the actual computation. That’s why we’re excited to announce the launch of the notebook executor, a new feature of Vertex AI Workbench that allows you to schedule notebooks ad hoc, or on a recurring basis. With the executor, your notebook is run cell by cell on Vertex AI Training. You can seamlessly scale your notebook workflows by configuring different hardware options, pas…

2 недели, 4 дня назад @ cloud.google.com
Announcing Vertex Pipelines general availability
Announcing Vertex Pipelines general availability Announcing Vertex Pipelines general availability

Today, we’re incredibly excited to announce the general availability of Vertex Pipelines.One of the best ways to scale your machine learning (ML) workflows is to run them as a pipeline, where each pipeline step is a distinct piece of your ML process. Pipelines are a great tool for productionizing, sharing, and reliably reproducing ML workflows across your organization. They are also the key to MLOps – with Pipelines you can build systems to automatically retrain and deploy models. In this post, we’ll show what you can do with Vertex Pipelines, and we’ll end by sharing a sample pipeline to help you get started.Vertex Pipelines in a nutshellLet’s briefly discuss what an ML pipeline does. A ma…

2 недели, 4 дня назад @ cloud.google.com
Enhanced Sleep Sensing in Nest Hub
Enhanced Sleep Sensing in Nest Hub Enhanced Sleep Sensing in Nest Hub

Earlier this year, we launched Contactless Sleep Sensing in Nest Hub, an opt-in feature that can help users better understand their sleep patterns and nighttime wellness.

Today we announced enhancements to Sleep Sensing that provide deeper sleep insights.

To help people understand their sleep patterns, Nest Hub displays a hypnogram, plotting the user’s sleep stages over the course of a sleep session.

Snoring sounds that are synchronized with the user’s breathing pattern (left) will be displayed in the user’s Nest Hub’s Snoring timeline.

Since Nest Hub with Sleep Sensing launched, researchers have expressed interest in investigational studies using Nest Hub’s digital quantification of nightt…

2 недели, 5 дней назад @ ai.googleblog.com
Google at EMNLP 2021
Google at EMNLP 2021 Google at EMNLP 2021

This week, the annual conference on Empirical Methods in Natural Language Processing (EMNLP 2021) will be held both virtually and in Punta Cana, Dominican Republic.

As a Diamond Level sponsor of EMNLP 2021, Google will contribute research on a diverse set of topics, including language interactions, causal inference, and question answering, additionally serving in various levels of organization in the conference.

Below is a full list of Google’s involvement and publications being presented at EMNLP 2021.

We congratulate these authors, and all other researchers who are presenting their work at the conference (Google affiliations presented in bold).

Organizing CommitteePublicationsMATE: Multi-…

2 недели, 6 дней назад @ ai.googleblog.com
OpenAI OpenAI
последний пост 1 неделя, 3 дня назад
OpenAI’s API Now Available with No Waitlist
OpenAI’s API Now Available with No Waitlist OpenAI’s API Now Available with No Waitlist

Since the launch of our API, we’ve made deploying applications faster and more streamlined while adding new safety features.

Starting today, developers in supported countries can sign up and start experimenting with our API right away.

Tens of thousands of developers are already taking advantage of powerful AI models through our platform.

As another step in this direction, we are also updating our Content Guidelines to clarify what kind of content our API can be used to generate.

As our safeguards continue to improve, we will expand how the API can be used while further improving the experience for our users.

1 неделя, 3 дня назад @ openai.com
Solving Math Word Problems
Solving Math Word Problems Solving Math Word Problems

We’ve trained a system that solves grade school math problems with nearly twice the accuracy of a fine-tuned GPT-3 model.

However, they struggle to perform tasks that require accurate multistep reasoning, like solving grade school math word problems.

GSM8K DatasetGSM8K consists of 8.5K high quality grade school math word problems.

Training Verifiers: Models that Learn from their MistakesOne significant challenge in mathematical reasoning is the high sensitivity to individual mistakes.

Grade school math is an ideal testbed for these capabilities.

1 месяц назад @ openai.com
Summarizing Books with Human Feedback
Summarizing Books with Human Feedback Summarizing Books with Human Feedback

Our model works by first summarizing small sections of a book, then summarizing those summaries into a higher-level summary, and so on.

A zero-shot question-answering model can use our model’s summaries to obtain state-of-the-art on the NarrativeQA dataset for book-length question answering.

Our Approach: Combining Reinforcement Learning from Human Feedback and Recursive Task DecompositionConsider the task of summarizing a piece of text.

In the past we found that training a model with reinforcement learning from human feedback helped align model summaries with human preferences on short posts and articles.

In this case we break up summarizing a long piece of text into summarizing several sh…

2 месяца назад @ openai.com
Helen Toner Joins OpenAI’s Board of Directors
Helen Toner Joins OpenAI’s Board of Directors Helen Toner Joins OpenAI’s Board of Directors

Today, we’re excited to announce the appointment of Helen Toner to our Board of Directors.

As the Director of Strategy at Georgetown’s Center for Security and Emerging Technology (CSET), Helen has deep expertise in AI policy and global AI strategy research.

I greatly value Helen’s deep thinking around the long-term risks and effects of AI,” added Greg Brockman, OpenAI’s chairman and Chief Technology Officer.

“We are delighted to add her leadership to our board.”OpenAI is a unique organization in the AI research space, and has produced some of the advances, publications, and products I’m most excited about,” said Helen Toner.

She previously advised policymakers and grantmakers on AI strategy…

2 месяца, 3 недели назад @ openai.com
OpenAI Codex
OpenAI Codex OpenAI Codex

We are now inviting businesses and developers to build on top of OpenAI Codex through our API.

Watch Video Creating a Space Game with OpenAI Codex Tweet Watch Video “Hello World” with OpenAI Codex Tweet Watch Video Data Science with OpenAI Codex Tweet Watch Video Talking to Your Computer with OpenAI Codex Tweet Watch Video Converting Python to Ruby with OpenAI Codex Tweet Watch Video Giving OpenAI Codex a First Grade Math Test TweetOpenAI Codex is a descendant of GPT-3; its training data contains both natural language and billions of lines of source code from publicly available sources, including code in public GitHub repositories.

OpenAI Codex empowers computers to better understand people…

3 месяца, 2 недели назад @ openai.com
Introducing Triton: Open-Source GPU Programming for Neural Networks
Introducing Triton: Open-Source GPU Programming for Neural Networks Introducing Triton: Open-Source GPU Programming for Neural Networks

These issues can be mitigated by writing specialized GPU kernels, but doing so can be surprisingly difficult due to the many intricacies of GPU programming.

This has led us to extend and improve Triton, a recent language and compiler whose original creator now works at OpenAI.

# Z,X,Y are dense tensors Z[idx] = X[idx] + Y[idx] ... grid = (ceil_div(N, BLOCK),) block = (BLOCK,) add[grid, block](x, y, z, x.shape[0]) BLOCK = 512 # This is a GPU kernel in Triton.

Without a system like Triton, non-trivial modifications of matrix multiplication kernels would be out-of-reach for developers without exceptional GPU programming expertise.

If you’re interested in joining our team and working on Triton …

4 месяца назад @ openai.com
Improving Language Model Behavior by Training on a Curated Dataset
Improving Language Model Behavior by Training on a Curated Dataset Improving Language Model Behavior by Training on a Curated Dataset

We've found we can improve language model behavior with respect to specific behavioral values by fine-tuning on a curated dataset of <100 examples of those values.

Appropriate or desirable language model behavior, like appropriate human behavior, cannot be reduced to one universal standard; desirable behavior differs by application and social context.

Step Two: Crafting the Dataset and Fine-TuningWe crafted a values-targeted dataset of 76 text samples; each sample was in a question-answer format and between 40 and 340 words.

But we believe this only scratches the surface and leaves important questions unanswered:Who should be consulted when designing a values-targeted dataset?

Please reach …

5 месяцев, 3 недели назад @ openai.com
OpenAI Startup Fund
OpenAI Startup Fund OpenAI Startup Fund

Investing in startups with big ideas about AI.

6 months ago @ openai.com
OpenAI Scholars 2021: Final Projects
OpenAI Scholars 2021: Final Projects OpenAI Scholars 2021: Final Projects

My advice to someone starting in deep learning research is to take your time to understand insights from fundamental papers and remember that the field is still relatively new.

Feedback Loops in Opinion Modeling (Danielle Ensign). OpenAI Mentor: Jeff Wu. Previous Roles: Software Engineer at ITHAKA, Brighten AI, and Phylliida. I have a background in Software Development, AI Fairness, and VR Game Development.

My project is exploratory, investigating prior work on opinion modeling from the context of deep learning.

Characterizing Test Time Compute on Graph Structured Problems (Kudzo Ahegbebu). OpenAI Mentor: William Guss. Previous Roles: Software Engineer at Facebook and Genen…

6 months, 3 weeks ago @ openai.com
Will Hurd Joins OpenAI’s Board of Directors
Will Hurd Joins OpenAI’s Board of Directors Will Hurd Joins OpenAI’s Board of Directors

OpenAI is committed to developing general-purpose artificial intelligence that benefits all humanity, and we believe that achieving our goal requires expertise in public policy as well as technology.

So, we’re delighted to announce that Congressman Will Hurd has joined our board of directors.

Will served three terms in the U.S. House of Representatives, has been a leading voice on technology policy, and coauthored bipartisan legislation outlining a national strategy for artificial intelligence.

“Will brings a rare combination of expertise—he deeply understands both artificial intelligence as well as public policy, both of which are critical to a successful future for AI,” said Sam Altman, O…

6 months, 4 weeks ago @ openai.com
GPT-3 Powers the Next Generation of Apps
GPT-3 Powers the Next Generation of Apps GPT-3 Powers the Next Generation of Apps

Given any text prompt like a phrase or a sentence, GPT-3 returns a text completion in natural language.

Applications and industries: To date, over 300 apps are using GPT-3 across varying categories and industries, from productivity and education to creativity and games.

Using GPT-3, Viable identifies themes, emotions, and sentiment from surveys, help desk tickets, live chat logs, reviews, and more.

Algolia Answers helps publishers and customer support help desks query in natural language and surface nontrivial answers.

With natural language processing, technical experience is no longer a barrier, and we can truly keep our focus on solving real world problems.

8 months, 1 week ago @ openai.com
Multimodal Neurons in Artificial Neural Networks
Multimodal Neurons in Artificial Neural Networks Multimodal Neurons in Artificial Neural Networks

Neuroscientists previously discovered that the human brain possesses multimodal neurons.

Now, we’re releasing our discovery of the presence of multimodal neurons in CLIP.

Our discovery of multimodal neurons in CLIP gives us a clue as to what may be a common mechanism of both synthetic and natural vision systems—abstraction.

Indeed, these neurons appear to be extreme examples of “multi-faceted neurons,” neurons that respond to multiple distinct cases, only at a higher level of abstraction.

How multimodal neurons compose: These multimodal neurons can give us insight into how CLIP performs classification.

8 months, 4 weeks ago @ openai.com
Microsoft Microsoft
latest post 6 days, 1 hour ago
Scaling for the future of ophthalmology with artificial intelligence powered by Azure Healthcare APIs
Scaling for the future of ophthalmology with artificial intelligence powered by Azure Healthcare APIs

ZEISS partnered with Microsoft to power the ZEISS Medical Ecosystem with the Azure Healthcare APIs. Azure Healthcare APIs allow for data to be standardized and exchanged between the ZEISS medical centers and their devices across the world while leveraging Microsoft's geo-support for data residency requirements.

6 days, 1 hour ago @ azure.microsoft.com
Tutel: An efficient mixture-of-experts implementation for large DNN model training
Tutel: An efficient mixture-of-experts implementation for large DNN model training Tutel: An efficient mixture-of-experts implementation for large DNN model training

Tutel provides broad compatibility and rich features that ensure strong performance on the Azure NDm A100 v4 cluster.

Unfortunately, existing machine learning frameworks have not provided an efficient all-to-all communication library, resulting in performance regression for large-scale distributed training.
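
For context, the exchange in question is the all-to-all collective that MoE token dispatch is built on; the sketch below shows the raw PyTorch primitive (torch.distributed.all_to_all_single, NCCL backend, launched with torchrun), not Tutel's optimized implementation.

# Run with: torchrun --nproc_per_node=<num_gpus> this_script.py
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")
rank = dist.get_rank()
world = dist.get_world_size()
torch.cuda.set_device(rank)

# Each rank holds one chunk of tokens destined for every other rank.
send = torch.full((world, 4), float(rank), device="cuda")
recv = torch.empty_like(send)
dist.all_to_all_single(recv, send)   # exchange chunks across all ranks
print(rank, recv[:, 0].tolist())     # row i now holds data sent by rank i
dist.destroy_process_group()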

Tutel achieves a 2.56x to 5.93x all-to-all speedup with 512 A100 GPUs for messages hundreds of MiB in size.

Meta has been using Tutel to train its large language model, which has an attention-based neural architecture similar to GPT-3, on Azure NDm A100 v4.

Table 1 summarizes the detailed parameter setting of the model, and Figure 3 shows the 40 percent speedup Tutel ac…

6 days, 18 hours ago @ microsoft.com
Novartis empowers scientists with AI to speed the discovery and development of breakthrough medicines
Novartis empowers scientists with AI to speed the discovery and development of breakthrough medicines


1 week, 3 days ago @ news.microsoft.com
SynapseML: A simple, multilingual, and massively parallel machine learning library
SynapseML: A simple, multilingual, and massively parallel machine learning library SynapseML: A simple, multilingual, and massively parallel machine learning library

Today, we’re excited to announce the release of SynapseML (previously MMLSpark), an open-source library that simplifies the creation of massively scalable machine learning (ML) pipelines.

Building production-ready distributed ML pipelines can be difficult, even for the most seasoned developer.

A unified API standardizes many of today’s tools, frameworks, and algorithms, streamlining the distributed ML experience.

Figure 1: The simple API abstracts over many different ML frameworks, scales, computation paradigms, data providers, and languages.

It even includes templates to quickly prototype distributed ML systems, such as visual search engines, predictive maintenance pipelines, document tran…

1 week, 4 days ago @ microsoft.com
Privacy Preserving Machine Learning: Maintaining confidentiality and preserving trust
Privacy Preserving Machine Learning: Maintaining confidentiality and preserving trust Privacy Preserving Machine Learning: Maintaining confidentiality and preserving trust

This blog post discusses emerging research on combining techniques to ensure privacy and confidentiality when using sensitive data to train ML models.

Mitigate: We then develop and apply mitigation strategies, such as differential privacy (DP), described in more detail later in this blog post.
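
As a minimal sketch of that mitigation (not Microsoft's production implementation), differential privacy for model training typically clips each per-example gradient and adds calibrated Gaussian noise; the constants below are illustrative.

import numpy as np

def private_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, seed=0):
    rng = np.random.default_rng(seed)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))  # bound per-example contribution
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)

grads = [np.random.randn(10) for _ in range(32)]   # stand-in per-example gradients
print(private_gradient(grads)[:3])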

When training or fine-tuning machine learning models on customer content, we adhere to a strict policy regarding the privacy budget[1].

This enables the collaboration of multiple customers on private data without the need to trust the cloud provider.

Finally, can we tightly integrate privacy and confidentiality guarantees into the design of the next generation of deep learning models?

2 weeks, 5 days ago @ microsoft.com
Azure high-performance computing at Supercomputing 2021
Azure high-performance computing at Supercomputing 2021

Azure’s market-leading vision for HPC and AI is based on a commitment to continual improvement and backed by a core of genuine and recognized HPC expertise, using proven HPC technology and design principles, enhanced with the best features of the cloud. The result is a capability that delivers performance, scale, and value unlike any other cloud.

2 weeks, 6 days ago @ azure.microsoft.com
New Azure OpenAI Service combines access to powerful GPT-3 language models with Azure’s enterprise capabilities
New Azure OpenAI Service combines access to powerful GPT-3 language models with Azure’s enterprise capabilities


3 weeks, 5 days ago @ blogs.microsoft.com
Turing Bletchley: A Universal Image Language Representation model by Microsoft
Turing Bletchley: A Universal Image Language Representation model by Microsoft Turing Bletchley: A Universal Image Language Representation model by Microsoft

Today, the Microsoft Turing team is thrilled to introduce Turing Bletchley, a 2.5-billion parameter Universal Image Language Representation model (T-UILR) that can perform image-language tasks in 94 languages.

This model shows uniquely powerful capabilities and a groundbreaking advancement in image language understanding.

Today’s image retrieval systems depend heavily on text metadata that is available for images, e.g., image captions, alt-text, surrounding text, image URL, etc.
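
By contrast, embedding-based retrieval needs no metadata: images and text are mapped into a shared vector space and ranked by similarity. The sketch below uses random stand-in embeddings, not T-Bletchley outputs.

import numpy as np

rng = np.random.default_rng(0)
image_embeddings = rng.normal(size=(1000, 256))   # one row per indexed image
text_embedding = rng.normal(size=256)             # embedding of the query text

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

scores = normalize(image_embeddings) @ normalize(text_embedding)
top5 = np.argsort(-scores)[:5]                    # highest cosine similarity first
print("best-matching image indices:", top5)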

Moreover, the demo was built directly with the pre-trained model and not finetuned with any image retrieval task.

We expect the T-Bletchley model to improve image question and answering, image search, and image-to-i…

3 weeks, 6 days ago @ microsoft.com
ACAV100M: Scaling up self-supervised audio-visual learning with automatically curated internet videos
ACAV100M: Scaling up self-supervised audio-visual learning with automatically curated internet videos ACAV100M: Scaling up self-supervised audio-visual learning with automatically curated internet videos

The natural association between visual observations and their corresponding sounds has exhibited powerful self-supervision signals for learning video representations, which makes the ever-growing amount of online video an attractive data source for self-supervised learning.

This severely limits the utility of online videos for self-supervised learning, which begs the question: How can we fully leverage online videos without extensive human effort?

Today, in collaboration with Seoul National University and NVIDIA Research, we are thrilled to release an automatic dataset curation pipeline and the largest video dataset for self-supervised audio-visual learning, dubbed ACAV100M (/ˈärˌkīv/; auto…

1 month ago @ microsoft.com
Microsoft powers transformation at NVIDIA GTC Fall—GPU technology conference
Microsoft powers transformation at NVIDIA GTC Fall—GPU technology conference

This year during NVIDIA GTC 21, we’re spotlighting some of the most transformational applications powered by NVIDIA accelerated computing that highlights our commitment to edge, on-premises, and cloud computing. Registration is free, so sign up to learn how Microsoft is powering transformation.

1 month ago @ azure.microsoft.com
Accelerating healthcare AI innovation with Zero Trust technology
Accelerating healthcare AI innovation with Zero Trust technology

From research to diagnosis to treatment, AI has the potential to improve outcomes for some treatments by 30 to 40 percent and reduce costs by up to 50 percent. Obtaining the large data sets necessary for generalizability, transparency, and reducing bias has historically been difficult and time-consuming. That’s why the University of California, San Francisco (UCSF) collaborated with Microsoft, Fortanix, and Intel to create BeeKeeperAI.

1 month ago @ azure.microsoft.com
This non-profit is protecting vulnerable communities from the effects of climate change with AI
This non-profit is protecting vulnerable communities from the effects of climate change with AI


1 month, 1 week ago @ news.microsoft.com
Announcing the ORBIT dataset: Advancing real-world few-shot learning using teachable object recognition
Announcing the ORBIT dataset: Advancing real-world few-shot learning using teachable object recognition Announcing the ORBIT dataset: Advancing real-world few-shot learning using teachable object recognition

Inspired by teachable object recognizers: The ORBIT dataset and benchmark are inspired by a real-world application for the blind and low-vision community: teachable object recognizers.

Unlike typical computer vision benchmarks, performance on the teachable object recognition benchmark is measured based on input from each user.

Evaluations on highly cited few-shot learning models show that there is significant scope for innovation in high-variation, few-shot learning.

Despite saturation of model performance on existing few-shot benchmarks, few-shot models only achieve 50-55% accuracy on the teachable object recognition benchmark.
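
To make the per-user protocol concrete, here is a conceptual sketch (synthetic data, nearest-mean classifier) of few-shot evaluation in which each user contributes their own support and query clips and accuracy is averaged across users; it is not the ORBIT benchmark code.

import numpy as np

rng = np.random.default_rng(0)

def evaluate_user(n_objects=5, n_support=5, n_query=10, dim=32):
    prototypes = rng.normal(size=(n_objects, dim))            # one "object" per user-chosen item
    support = prototypes[:, None, :] + 0.5 * rng.normal(size=(n_objects, n_support, dim))
    class_means = support.mean(axis=1)                        # fit: nearest-mean classifier
    query_labels = rng.integers(0, n_objects, size=n_query)
    query = prototypes[query_labels] + 0.5 * rng.normal(size=(n_query, dim))
    preds = np.argmin(((query[:, None, :] - class_means[None]) ** 2).sum(-1), axis=1)
    return (preds == query_labels).mean()

per_user_acc = [evaluate_user() for _ in range(20)]
print("mean per-user accuracy:", float(np.mean(per_user_acc)))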

Research to realize human-AI collaboration: Creating teachable obj…

1 month, 1 week ago @ microsoft.com
MIT AI MIT AI
latest post 7 hours ago
Artificial intelligence that understands object relationships
Artificial intelligence that understands object relationships Artificial intelligence that understands object relationships

Many deep learning models struggle to see the world this way because they don’t understand the entangled relationships between individual objects.

Their model represents individual relationships one at a time, then combines these representations to describe the overall scene.

In our minds, when we understand a scene, we really understand it based on the relationships between the objects.

The researchers used a machine-learning technique called energy-based models to represent the individual object relationships in a scene description.
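
The compositional idea can be illustrated with a toy sketch: assign one energy term per stated relationship and score a candidate scene by the sum, so adding a relation simply adds a term. The energy functions below are hand-written stand-ins, not the learned models from the paper.

import numpy as np

def left_of(a, b):                   # low energy when a is left of b
    return max(0.0, a[0] - b[0])

def above(a, b):                     # low energy when a is above b
    return max(0.0, b[1] - a[1])

scene = {"mug": np.array([0.2, 0.8]), "laptop": np.array([0.6, 0.4])}
relations = [("mug", "left_of", "laptop"), ("mug", "above", "laptop")]
energies = {"left_of": left_of, "above": above}

total_energy = sum(energies[r](scene[a], scene[b]) for a, r, b in relations)
print("total scene energy:", total_energy)   # 0.0 means every relation is satisfied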

They also asked humans to evaluate whether the generated images matched the original scene description.

7 hours ago @ news.mit.edu
In MIT visit, Dropbox CEO Drew Houston ’05 explores the accelerated shift to distributed work
In MIT visit, Dropbox CEO Drew Houston ’05 explores the accelerated shift to distributed work In MIT visit, Dropbox CEO Drew Houston ’05 explores the accelerated shift to distributed work

“It’s surreal, there’s no playbook for running a global company in a pandemic over Zoom.

“The goal is to find ways to unlock more of our brainpower through a multidisciplinary approach between computing and management,” says Houston.

I think academia has a huge role to play [here], and I think MIT is super well-positioned to lead.

Dropbox proceeded to redesign the work experience throughout the company, unveiling a “virtual first” working model in October 2020 in which remote work is the primary experience for all employees.

“It’s a transition that’s been underway for a while, but Covid completely finished the swing,” says Houston.

6 days, 20 hours ago @ news.mit.edu
Design’s new frontier
Design’s new frontier Design’s new frontier

Using this massive dataset, they tested the limits of what machine learning could do.

However, using GAN models alone would result in homogeneous designs that lack novelty and can’t be assessed in terms of performance.

Using this approach, Ahmed’s team developed an open-source computational design tool for bicycles freely available on their lab website.

He hopes the computational design tools he develops could lead to “design democratization,” putting more power in the hands of the end user.

While Deng is utilizing simulations and machine learning at the molecular level to improve designs, others are taking a more macro approach.

1 week, 2 days ago @ news.mit.edu
MIT Lincoln Laboratory wins nine R&D 100 Awards for 2021
MIT Lincoln Laboratory wins nine R&D 100 Awards for 2021 MIT Lincoln Laboratory wins nine R&D 100 Awards for 2021

Nine technologies developed at MIT Lincoln Laboratory have been selected as R&D 100 Award winners for 2021.

Global Synthetic Weather RadarThe Global Synthetic Weather Radar (GSWR) provides radar-like weather imagery and radar-forward forecasts for regions where actual weather radars are not deployed or are limited in range.

Since 2010, Lincoln Laboratory has received 75 R&D 100 Awards.

"Our R&D 100 Awards recognize the significant, ongoing technology development and transition success at the laboratory.

Editors of R&D World announced the 2021 R&D 100 Award winners at virtual ceremonies broadcast on October 19, 20, and 21.

1 week, 4 days ago @ news.mit.edu
Electrochemistry, from batteries to brains
Electrochemistry, from batteries to brains Electrochemistry, from batteries to brains

The members of her lab study fuel cells, which convert hydrogen and oxygen into electricity (and water).

What brings all this together in her lab is the electrochemistry of ionic-electronic oxides and their interfaces.

Yildiz stayed at MIT for a postdoctoral fellowship, between the nuclear engineering and mechanical engineering departments, studying electrochemistry in fuel cells.

Substitutions — for instance, strontium atoms — added to the crystal enhance its ability to transport electrons and oxygen ions.

In retrospect, Yildiz says, the leap from her formal training on nuclear engineering into electrochemistry and materials was a big one.

1 week, 5 days ago @ news.mit.edu
Dexterous robotic hands manipulate thousands of objects with ease
Dexterous robotic hands manipulate thousands of objects with ease Dexterous robotic hands manipulate thousands of objects with ease

This ability to manipulate anything from a cup to a tuna can to a Cheez-It box could help the hand quickly pick-and-place objects in specific ways and locations — and even generalize to unseen objects.

DeepMind created “RGB-Stacking,” a vision-based system that challenges a robot to learn how to grab items and stack them.

Not only does the robot need to manipulate the object, but also circumvent gravity so it doesn’t fall down.

Chen wrote a paper about the research alongside MIT CSAIL PhD student Jie Xu and MIT Professor Pulkit Agrawal.

The research is funded by Toyota Research Institute, Amazon Research Award, and DARPA Machine Common Sense Program.

2 weeks, 2 days ago @ news.mit.edu
Toward speech recognition for uncommon spoken languages
Toward speech recognition for uncommon spoken languages Toward speech recognition for uncommon spoken languages

Their technique involves removing unnecessary parts of a common, but complex, speech recognition model and then making minor adjustments so it can recognize a specific language.
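
The general mechanism, magnitude pruning of a network's weight matrices, can be sketched with torch.nn.utils.prune; the paper's exact recipe for wav2vec 2.0 may differ, so treat this as an illustration of pruning rather than the authors' method.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 32))  # stand-in network

for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)  # zero the 50% smallest weights
        prune.remove(module, "weight")                            # bake the mask into the tensor

zeros = sum((p == 0).sum().item() for p in model.parameters())
total = sum(p.numel() for p in model.parameters())
print(f"sparsity after pruning: {zeros / total:.2%}")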

Learning speech from audioThe researchers studied a powerful neural network that has been pretrained to learn basic speech from raw audio, called Wave2vec 2.0.

This opens the door for speech recognition of uncommon languages that lack large amounts of transcribed speech, like Wolof, which is spoken by 5 million people in West Africa.

Lai and his collaborators wanted to see how the pruning process would affect this model’s speech recognition performance.

In the first step, a pretrained speech recognition neural netwo…

3 weeks, 4 days ago @ news.mit.edu
3 Questions: Blending computing with other disciplines at MIT
3 Questions: Blending computing with other disciplines at MIT 3 Questions: Blending computing with other disciplines at MIT

The core goals of the Common Ground are to infuse computing education throughout MIT in a coordinated manner, as well as to serve as a platform for multi-departmental collaborations.

As much as computing is changing thinking in the disciplines, the disciplines are changing the way people develop new computing approaches.

Q: How is the Common Ground facilitating collaborations and engaging faculty across MIT to develop new curricula?

Grossman: The Common Ground Standing Committee was formed to oversee the activities of the Common Ground and is charged with evaluating how best to support and advance program objectives.

It is very exciting to see MIT students having access to these unique offe…

3 weeks, 5 days ago @ news.mit.edu
Taming the data deluge
Taming the data deluge Taming the data deluge

By enriching AI algorithms with new processors, A3D3 seeks to speed up AI algorithms for solving fundamental problems in collider physics, neutrino physics, astronomy, gravitational-wave physics, computer science, and neuroscience.

“We realized all the AI algorithms were designed to solve much slower problems, such as image and voice recognition.

FPGAs and other new processor types, such as graphics processing units (GPUs), accelerate AI algorithms to more quickly analyze huge amounts of data.

The Huge Data from the Large Hadron Collider: With data rates already exceeding 500 terabits per second, the LHC processes more data than any other scientific instrument on Earth.

Its future aggregate d…

1 month ago @ news.mit.edu
3 Questions: Investigating a long-standing neutrino mystery
3 Questions: Investigating a long-standing neutrino mystery 3 Questions: Investigating a long-standing neutrino mystery

One of the long-standing puzzles in neutrino physics comes from the Mini Booster Neutrino Experiment (MiniBooNE), which ran from 2002 to 2017 at the Fermi National Accelerator Laboratory, or Fermilab, in Illinois.

Due to the effects of neutrino oscillations, this sterile neutrino would manifest itself as an enhancement of electron neutrinos in MiniBooNE.

There are many additional anomalies seen in neutrino physics that indicate this particle might exist.

Why use deep learning in neutrino physics?

Our group produced the first paper on particle identification using deep learning in neutrino physics, proving it to be a powerful technique.

1 month ago @ news.mit.edu
Making machine learning more useful to high-stakes decision makers
Making machine learning more useful to high-stakes decision makers Making machine learning more useful to high-stakes decision makers

With so many cases, some agencies are implementing machine learning models to help child welfare specialists screen cases and determine which to recommend for further investigation.

Researchers at MIT and elsewhere launched a research project to identify and tackle machine learning usability challenges in child welfare screening.

In collaboration with a child welfare department in Colorado, the researchers studied how call screeners assess cases, with and without the help of machine learning predictions.

But a big takeaway from this project is that these domain experts don’t necessarily want to learn what machine learning actually does.

Keeping the screeners in the loop throughout the itera…

1 month ago @ news.mit.edu
One autonomous taxi, please
One autonomous taxi, please One autonomous taxi, please

“Roboat” has come a long way since the team first started prototyping small vessels in the MIT pool in late 2015.

Last year, the team released their half-scale, medium model that was 2 meters long and demonstrated promising navigational prowess.

If you don’t get seasick, an autonomous boat might be the right mode of transportation for you.

Autonomous Roboats set sail in the Amsterdam canals and can comfortably carry up to five people, collect waste, deliver goods, and provide on-demand infrastructure.

“Roboat’s control system is adaptive to the number of people in the boat.” To swiftly navigate the bustling waters of Amsterdam, Roboat needs a meticulous fusion of proper navigation, perception…

1 month ago @ news.mit.edu
Artificial intelligence sheds light on how the brain processes language
Artificial intelligence sheds light on how the brain processes language Artificial intelligence sheds light on how the brain processes language

The most recent generation of predictive language models also appears to learn something about the underlying meaning of language.

In the past few years, artificial intelligence models of language have become very good at certain tasks.

Making predictionsThe new, high-performing next-word prediction models belong to a class of models called deep neural networks.

They found that the best-performing next-word prediction models had activity patterns that very closely resembled those seen in the human brain.

“One of the challenges of language processing is the real-time aspect of it,” he says.

1 month ago @ news.mit.edu
Saving seaweed with machine learning
Saving seaweed with machine learning Saving seaweed with machine learning

They then use a machine learning system known as a neural network to convert the 2D image into a representation of the microbiome present in the 3D environment.

To figure out how to communicate these data back to the research team, Xia drew upon her master’s degree research.

The communication devices can be used to relay data about the ocean environment from the machine learning algorithms.

By combining these low-cost communication devices along with microscopic images and machine learning, Xia hopes to design a low-cost, real-time monitoring system that can be scaled to cover entire seaweed farms.

While Xia still daydreams about opening a restaurant, she hopes the seaweed project will prom…

1 month, 1 week ago @ news.mit.edu
At Mass STEM Week kickoff, MIT RAISE announces Day of AI
At Mass STEM Week kickoff, MIT RAISE announces Day of AI At Mass STEM Week kickoff, MIT RAISE announces Day of AI

The fourth annual Massachusetts STEM Week kicked off on Monday, Oct. 18 at the MIT Media Lab.

Organized by the Massachusetts Executive Office of Education and the STEM Advisory Council, Mass STEM Week is a statewide effort to boost awareness of, interest in, and access to STEM education and career opportunities for learners of all ages and backgrounds.

For the first year, MIT RAISE has created age-appropriate curriculum modules for grades 3-5, 6-8, and 9-12, including for students with little or no technology experience.

Jeffrey Leiden, executive chair of Vertex Pharmaceuticals and a supporter of Mass STEM Week, also attended the opening event; Vertex Pharmaceuticals is a founding sponsor of Day of AI.

1 month, 1 week ago @ news.mit.edu
Berkeley AI
latest post 1 week, 3 days ago
Sequence Modeling Solutions for Reinforcement Learning Problems
Sequence Modeling Solutions for Reinforcement Learning Problems Sequence Modeling Solutions for Reinforcement Learning Problems

Long-horizon predictions of (top) the Trajectory Transformer compared to those of (bottom) a single-step dynamics model.

While it has been possible to apply reinforcement learning algorithms to large-scale problems, generally there has been much more friction in doing so.

In this post, we explore whether we can alleviate these difficulties by tackling the reinforcement learning problem with the toolbox of sequence modeling.

Though there is value in studying the most streamlined approaches that can tackle reinforcement learning problems, it is possible that the most ef…

1 week, 3 days ago @ bair.berkeley.edu
Which Mutual Information Representation Learning Objectives are Sufficient for Control?
Which Mutual Information Representation Learning Objectives are Sufficient for Control? Which Mutual Information Representation Learning Objectives are Sufficient for Control?

Which Mutual Information Representation Learning Objectives are Sufficient for Control?

To simplify the analysis, we analyze representation learning in isolation from the other aspects of RL by assuming the existence of an offline dataset on which to perform representation learning.

An objective may have more than one maximizing representation, so we call a representation learning objective sufficient if all the representations that maximize that objective are sufficient.

To separate representation learning from RL, we first optimize each representation learning objective on a dataset of offline data (similar to the protocol in Stooke et al.

This post is based on the paper Which Mutual Inf…

1 week, 3 days ago @ bair.berkeley.edu
Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets
Bridge  Data:  Boosting  Generalization  of  Robotic  Skills  with Cross-Domain  Datasets Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets

Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain DatasetsFig.

For comparison, we include the performance of the policy when trained only on the target domain data, without bridge data (Target Domain Only), a baseline that uses only the bridge data without any target domain data (Direct Transfer), as well as a baseline that trains a single-task policy on data in the target domain only (Single Task).

However, we do include the “direct transfer” baseline, which utilizes a policy trained only on the bridge data.

Figure 10: Example rollouts of policies jointly trained on target domain data and bridge data in each of the three transfer scenarios.

This means that bridge…

1 week, 4 days ago @ bair.berkeley.edu
How should we compare neural network representations?
How should we compare neural network representations? How should we compare neural network representations?

How should we compare neural network representations?

To understand neural networks, researchers often use similarity metrics to measure how similar or different two neural networks are to each other.

Thus, for a given pair of neural network representations, we measure both their (dis)similarity and the difference between their accuracies on some task.
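
One widely used metric of this kind is linear centered kernel alignment (CKA), which can be computed directly from two representation matrices (rows are examples, columns are features); the sketch below uses the standard formula with random stand-in activations, and is illustrative rather than the post's own code.

import numpy as np

def linear_cka(X, Y):
    X = X - X.mean(axis=0, keepdims=True)       # center each feature
    Y = Y - Y.mean(axis=0, keepdims=True)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 64))                  # stand-in activations from network A
B = A @ rng.normal(size=(64, 32))               # activations that are a linear map of A
print("CKA(A, B) =", round(float(linear_cka(A, B)), 3))   # values near 1 mean similar representations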

“Insights on representational similarity in neural networks with canonical correlation.” Proceedings of the 32nd International Conference on Neural Information Processing Systems.

Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2020.

3 weeks ago @ bair.berkeley.edu
Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability
Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability

Epistemic POMDPs and Implicit Partial Observability: Actively steering towards regions of low uncertainty or taking information-gathering actions are two of a multitude of avenues an RL agent has to handle its epistemic uncertainty.

What makes the epistemic POMDP particularly exciting is the following equivalence: An RL agent is Bayes-optimal for generalization if and only if it maximizes expected return in the corresponding epistemic POMDP.

LEEP, an algorithm based on the epistemic POMDP objective, generalizes better than PPO in four Procgen tasks.

This is surprisingly not true; limited training data in RL introduces implicit partial observability into an otherwise fully-observable problem.

T…

3 weeks, 3 days ago @ bair.berkeley.edu
RECON: Learning to Explore the Real World with a Ground Robot
RECON: Learning to Explore the Real World with a Ground Robot RECON: Learning to Explore the Real World with a Ground Robot

An example of our method deployed on a Clearpath Jackal ground robot (left) exploring a suburban environment to find a visual target (inset).

To build a robot capable of exploring and navigating like this, we need to learn from diverse prior datasets in the real world.

Our method learns both components using datasets of real-world robot interactions gathered in prior work.

In a new environment, RECON encourages the robot to explore at the frontier of the map – while the robot is not at the frontier, RECON directs it to navigate to a previously seen subgoal at the frontier of the map.

This dataset presents an exciting benchmark f…

3 weeks, 5 days ago @ bair.berkeley.edu
Designs from Data: Offline Black-Box Optimization via Conservative Training
Designs from Data: Offline Black-Box Optimization via Conservative Training Designs from Data: Offline Black-Box Optimization via Conservative Training

Figure 1: Offline Model-Based Optimization (MBO): The goal of offline MBO is to optimize an unknown objective function $f(x)$ with respect to $x$, provided access only to a static, previously-collected dataset of designs.

Conventionally, such problems have been tackled with black-box optimization procedures that repeatedly query an objective function.

We call this offline model-based optimization (offline MBO), and in this post, we discuss offline MBO methods and some recent advances.
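
A toy sketch of the setup: fit a surrogate model to the static dataset of (design, score) pairs, then optimize the design against the surrogate. Naive ascent like this can exploit surrogate errors far from the data, which is the failure mode conservative training is meant to curb; the 1-D objective below is made up for illustration.

import numpy as np

rng = np.random.default_rng(0)
true_f = lambda x: -(x - 1.0) ** 2                 # unknown objective over a 1-D design
X = rng.uniform(-2.0, 0.5, size=40)                # static, previously collected designs
y = true_f(X)

# Quadratic surrogate y ~ a*x^2 + b*x + c fit by least squares.
A = np.stack([X ** 2, X, np.ones_like(X)], axis=1)
a, b, c = np.linalg.lstsq(A, y, rcond=None)[0]

x = X[np.argmax(y)]                                # start from the best observed design
for _ in range(100):                               # gradient ascent on the surrogate
    x += 0.05 * (2 * a * x + b)
print("proposed design:", round(float(x), 3), "| true score:", round(float(true_f(x)), 3))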

Offline Model-Based Optimization (Offline MBO): Formally, the goal in offline model-based optimization is to maximize a black-box objec…

1 month ago @ bair.berkeley.edu
A First-Principles Theory of Neural Network Generalization
A First-Principles Theory of Neural Network Generalization A First-Principles Theory of Neural Network Generalization

Measures of generalization performance for neural networks trained on four different boolean functions (colors) with varying training set size.

It turns out this is possible: in our recent paper, we derive a first-principles theory that allows one to make accurate predictions of neural network generalization (at least in certain settings).

Finite network $\approx$ infinite-width network $=$ kernel regression: A major vein of deep learning theory in the last few years has studied neural networks of infinite width.

Trusting that this first approximation will bear weight, our challenge now is to understand kernel regression.
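
Kernel regression itself fits in a few lines; the sketch below uses a generic RBF kernel for clarity, whereas the paper works with the neural tangent kernel of the corresponding architecture.

import numpy as np

def rbf(A, B, length_scale=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * length_scale ** 2))

rng = np.random.default_rng(0)
X_train = rng.uniform(-1, 1, size=(30, 1))
y_train = np.sin(3 * X_train[:, 0])
X_test = np.linspace(-1, 1, 5)[:, None]

K = rbf(X_train, X_train)
alpha = np.linalg.solve(K + 1e-6 * np.eye(len(K)), y_train)   # small ridge term for stability
y_pred = rbf(X_test, X_train) @ alpha
print(np.round(y_pred, 3))
print(np.round(np.sin(3 * X_test[:, 0]), 3))                   # targets, for comparison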

It also applies to linear regression, a special case of kernel regressi…

1 month ago @ bair.berkeley.edu
Making RL Tractable by Learning More Informative Reward Functions: Example-Based Control, Meta-Learning, and Normalized Maximum Likelihood
Making RL Tractable by Learning More Informative Reward Functions: Example-Based Control, Meta-Learning, and Normalized Maximum Likelihood Making RL Tractable by Learning More Informative Reward Functions: Example-Based Control, Meta-Learning, and Normalized Maximum Likelihood

Diagram of MURAL, our method for learning uncertainty-aware rewards for RL.

Conditional Normalized Maximum Likelihood (CNML): One method particularly well-suited for this task is Conditional Normalized Maximum Likelihood (CNML).

Meta-learning techniques are particularly well suited to our class of computational problems, since evaluating the CNML likelihood involves quickly solving multiple different maximum likelihood problems.

In doing so, meta-learning provides us an effective tool for producing estimates of normalized maximum likelihood significantly…

1 month, 1 week ago @ bair.berkeley.edu
AWS Machine Learning AWS Machine Learning
latest post 4 hours ago
Improve the return on your marketing investments with intelligent user segmentation in Amazon Personalize
Improve the return on your marketing investments with intelligent user segmentation in Amazon Personalize Improve the return on your marketing investments with intelligent user segmentation in Amazon Personalize

User segmentation in Amazon Personalize uses ML techniques, developed and perfected at Amazon, to learn what is relevant to users.

Amazon Personalize enables developers to build personalized user experiences with the same ML technology used by Amazon with no ML expertise required.

(Sample rows from the example items dataset: Amazon-style item IDs such as A24E9YGY3V94N8, A385P0YAW6U5J3, and A3QW120I2BY1MU paired with product titles, e.g., wedding cake toppers, acetate cake collars, and Jacobs Twiglets.)

Ge Liu is an Applied Scien…

4 hours ago @ aws.amazon.com
Amazon Personalize announces recommenders optimized for Retail and Media & Entertainment
Amazon Personalize announces recommenders optimized for Retail and Media & Entertainment Amazon Personalize announces recommenders optimized for Retail and Media & Entertainment

Today, we’re excited to announce the launch of personalized recommenders in Amazon Personalize that are optimized for retail and media and entertainment, making it even easier to personalize your websites, apps, and marketing campaigns.

With the launch of recommenders, you simply select the use cases you need from a library of recommenders within Amazon Personalize.

You can monitor the status of the data upload through the dashboard for the Prime-Pantry dataset group on the Amazon Personalize console.

As of this writing, Amazon Personalize offers five recipes for retail customers and four for media and entertainment customers.

Amazon Personalize also provides a recommender ARN in the detail…

4 hours ago @ aws.amazon.com
Build MLOps workflows with Amazon SageMaker projects, GitLab, and GitLab pipelines
Build MLOps workflows with Amazon SageMaker projects, GitLab, and GitLab pipelines Build MLOps workflows with Amazon SageMaker projects, GitLab, and GitLab pipelines

Prerequisites: For this walkthrough, you should have the following prerequisites: a GitLab account, an AWS account, and an Amazon SageMaker Studio domain (with SageMaker projects enabled). This post provides a detailed explanation of the SageMaker projects, GitLab, and GitLab pipelines integration.

The model build pipeline automatically triggers and runs the pipeline from end to end whenever a new commit is made to the model build repository.

Model Build Repository Name – The name of the repository, mlops-gitlab-project-seedcode-model-build, that contains the model build and training seed code.

Model build pipeline in GitLab: Let’s review the GitLab pipelines, starting with the model build pipeline.

Model deployment p…

5 days, 15 hours ago @ aws.amazon.com
Simplified MLOps with Deep Java Library
Simplified MLOps with Deep Java Library Simplified MLOps with Deep Java Library

This post goes deeper into our deep learning natural language processing (NLP) based advertisement predictor, how we integrated the predictor into one of our pipelines using Deep Java Library (DJL), and how that change made our architecture simpler and MLOps easier.

DJL is an open source Java framework for deep learning created by AWS.

In today’s practice, deploying deep learning models incurs complexity in its own right.

ONNX: ONNX is an open-source ecosystem of AI and ML tools designed to provide extensive interoperability between different deep learning frameworks.
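
The post's pipeline stays in Java via DJL; as a language-neutral illustration of the portability ONNX provides, the same exported model file can be loaded from any runtime. The sketch below uses onnxruntime in Python, with the model path and input shape as placeholders.

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")           # path is a placeholder
input_name = session.get_inputs()[0].name              # discover the graph's input name
batch = np.random.rand(1, 128).astype(np.float32)      # shape must match the exported model
outputs = session.run(None, {input_name: batch})       # None = return all outputs
print(outputs[0].shape)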

This maintains our confidence that our deep learning model works properly whenever we change our code base.

5 days, 16 hours ago @ aws.amazon.com
How Careem is detecting identity fraud using graph-based deep learning and Amazon Neptune
How Careem is detecting identity fraud using graph-based deep learning and Amazon Neptune How Careem is detecting identity fraud using graph-based deep learning and Amazon Neptune

In this post, we share how Careem detects identity fraud using graph-based deep learning and Amazon Neptune.

After much experimentation, Careem decided to focus on the identity of users, and came up with a powerful way to outsmart any efforts of identity fraud.

Architecture overview: Before we dive deep into how Careem used Neptune and an identity graph for fraud detection, let’s look at the current architecture underpinning the solution.

Historical data – Careem uses Apache Hive running on Amazon Simple Storage Service (Amazon S3) to extract data and push it to Amazon EMR with PySpark.

In this case, we decided to build a graph model to predict the is_fraud property of customer nodes using Amazon…

5 days, 16 hours ago @ aws.amazon.com
Bring Your Amazon SageMaker model into Amazon Redshift for remote inference
Bring Your Amazon SageMaker model into Amazon Redshift for remote inference Bring Your Amazon SageMaker model into Amazon Redshift for remote inference

Additionally, Amazon Redshift ML allows data scientists to either import existing SageMaker models into Amazon Redshift for in-database inference or remotely invoke a SageMaker endpoint.

Prerequisites: To get started, we need an Amazon Redshift cluster with the Amazon Redshift ML feature enabled.

For an introduction to Amazon Redshift ML and instructions on setting it up, see Create, train, and deploy machine learning models in Amazon Redshift using SQL with Amazon Redshift ML.

Get the SageMaker model endpoint: On the Amazon SageMaker console, under Inference in the navigation pane, choose Endpoints to find your model name.

For more information about building different models with Amazon Redshi…

6 days, 12 hours ago @ aws.amazon.com
Run distributed hyperparameter and neural architecture tuning jobs with Syne Tune
Run distributed hyperparameter and neural architecture tuning jobs with Syne Tune Run distributed hyperparameter and neural architecture tuning jobs with Syne Tune

Today we announce the general availability of Syne Tune, an open-source Python library for large-scale distributed hyperparameter and neural architecture optimization.

With Syne Tune, you can run hyperparameter and neural architecture tuning jobs locally on your machine or remotely on Amazon SageMaker by changing just one line of code.
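
The kind of search a scheduler automates can be sketched in plain Python; the successive-halving loop below illustrates the idea behind ASHA-style early stopping and is not Syne Tune's actual API.

import random

def train(config, epochs):
    # Stand-in for a real training job: returns a fake validation loss.
    return (config["lr"] - 0.01) ** 2 + 1.0 / epochs + random.random() * 1e-3

random.seed(0)
configs = [{"lr": 10 ** random.uniform(-4, -1)} for _ in range(16)]

budget = 1
while len(configs) > 1:
    scores = sorted((train(c, budget), i) for i, c in enumerate(configs))
    keep = [i for _, i in scores[: max(1, len(configs) // 2)]]     # keep the best half
    configs = [configs[i] for i in keep]
    budget *= 2                                                    # give survivors more epochs
print("best config:", configs[0])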

By open-sourcing Syne Tune, we hope to create a community that brings together academic and industrial researchers in machine learning (ML).

Tune hyperparameters with Syne Tune: We now detail how to tune hyperparameters with Syne Tune.

Run large-scale tuning jobs with Syne Tune and SageMaker: The previous example showed how to tune hyperparameters on a local mach…

1 week, 2 days ago @ aws.amazon.com
Your guide to AI and ML at AWS re:Invent 2021
Your guide to AI and ML at AWS re:Invent 2021 Your guide to AI and ML at AWS re:Invent 2021

Breakout sessions: Prepare data for ML with ease, speed, and accuracy (AIM319). Join this session to learn how to prepare data for ML in minutes using Amazon SageMaker.

Then, hear from Vanguard as they share their journey enabling MLOps to achieve ML at scale for their polyglot model development platforms using SageMaker features, including SageMaker projects, SageMaker Pipelines, SageMaker Model Registry, and SageMaker Model Monitor.

Forecast automatically generates highly accurate forecasts using ML, explains the drivers behind those forecasts, and keeps your ML models always up to date to capture new trends.

Workshop sessions: Develop your ML project with Amazon SageMaker (AIM402). In this works…

1 week, 2 days ago @ aws.amazon.com
AWS AI/ML Community attendee guides to AWS re:Invent 2021
AWS AI/ML Community attendee guides to AWS re:Invent 2021 AWS AI/ML Community attendee guides to AWS re:Invent 2021

The AWS AI/ML Community has compiled a series of session guides to AWS re:Invent 2021 to help you get the most out of re:Invent this year.

AWS re:Invent 2021: How to maximize your in-person learning experience as a new Machine Learning practitioner, from AWS ML Community Builder Martin Paradesi.

From our new Egypt-based AWS ML Hero Salah Elhossiny: Top 5 AWS ML Sessions to Attend at AWS re:Invent 2021.

Top 5 Sessions for AI/ML Developers at AWS re:Invent 2021, from AWS ML Community Builder Brooke Jamieson.

About the AuthorPaxton Hall is a Marketing Program Manager for the AWS AI/ML Community on the AI/ML Education team at AWS.

1 week, 2 days ago @ aws.amazon.com
Understand drivers that influence your forecasts with explainability impact scores in Amazon Forecast
Understand drivers that influence your forecasts with explainability impact scores in Amazon Forecast Understand drivers that influence your forecasts with explainability impact scores in Amazon Forecast

We’re excited to launch explainability impact scores in Amazon Forecast, which help you understand the factors that impact your forecasts for specific items and time durations of interest.

Impact scores measure the relative impact attributes have on forecast values.

Generate explainability impact scoresIn this section, we walk through how to generate explainability impact scores for your forecasts using the Forecast console.

To export all the impact scores, choose Create explainability export in the Explainability exports section, provide the export details, and choose Create explainability export.

The explainability export file provides two types of impact scores: normalized impact scores and raw imp…

1 week, 2 days ago @ aws.amazon.com
New Amazon Forecast API that creates up to 40% more accurate forecasts and provides explainability
New Amazon Forecast API that creates up to 40% more accurate forecasts and provides explainability New Amazon Forecast API that creates up to 40% more accurate forecasts and provides explainability

However, in doing so, you have to train your entire forecasting model again, which is a time-consuming process.

This launch is accompanied with new pricing, which you can review at Amazon Forecast pricing.

Jitendra Bangani is an Engineering Manager at AWS, leading a growing team of curious and driven engineers for Amazon Forecast.

Hilaf Hasson is a Machine Learning Scientist at AWS, and currently leads the R&D team of scientists working on Amazon Forecast.

Adarsh Singh works as a Software Development Engineer in the Amazon Forecast team.

1 week, 2 days ago @ aws.amazon.com
Next Gen Stats Decision Guide: Predicting fourth-down conversion
Next Gen Stats Decision Guide: Predicting fourth-down conversion Next Gen Stats Decision Guide: Predicting fourth-down conversion

To help fans understand a coach’s decision, the NFL and AWS partnered to create the Next Gen Stats Decision Guide.

The Next Gen Stats Decision Guide is a suite of machine learning (ML) models designed to determine the optimal fourth-down call.

By comparing the odds of winning the game for each fourth-down choice, the Next Gen Stats Decision Guide provides a data-driven answer to that optimal fourth-down call.

Based on these expected probabilities, the Next Gen Stats Decision Guide recommends going for it with a 13% difference.
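
The comparison itself is simple arithmetic over win probabilities; the numbers below are hypothetical (the post's example cites a 13-point gap) and only show how the recommendation is derived.

p_convert = 0.45            # chance of converting if the team goes for it
wp_if_convert = 0.62        # win probability after a successful conversion
wp_if_fail = 0.30           # win probability after a failed attempt
wp_go = p_convert * wp_if_convert + (1 - p_convert) * wp_if_fail

wp_punt = 0.37              # win probability after punting
wp_field_goal = 0.40        # win probability after a field-goal attempt
best_alternative = max(wp_punt, wp_field_goal)

print(f"go for it: {wp_go:.3f} vs best alternative: {best_alternative:.3f}")
print(f"difference: {(wp_go - best_alternative) * 100:.1f} percentage points")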

The odds of converting on a fourth-down play are a key component of the Next Gen Stats Decision Guide.

1 week, 3 days ago @ aws.amazon.com
Chain custom Amazon SageMaker Ground Truth jobs for image processing
Chain custom Amazon SageMaker Ground Truth jobs for image processing Chain custom Amazon SageMaker Ground Truth jobs for image processing

Amazon SageMaker Ground Truth supports many different types of labeling jobs, including several image-based labeling workflows like image-level labels, bounding box-specific labels, or pixel-level labeling.

Ground Truth jobs: Each custom Ground Truth labeling job is defined with a web-based user interface and two associated AWS Lambda functions (for more information, see Processing with AWS Lambda).

For custom Ground Truth user interfaces, a set of custom tags is available, known as Crowd tags.

To learn more about chained Ground Truth jobs, see Chaining Labeling Jobs.

For more information about the Crowd tags you can use in the Ground Truth UI, see Crowd HTML Elements Reference.

1 week, 3 days ago @ aws.amazon.com
Accelerate data preparation using Amazon SageMaker Data Wrangler for diabetic patient readmission prediction
Accelerate data preparation using Amazon SageMaker Data Wrangler for diabetic patient readmission prediction Accelerate data preparation using Amazon SageMaker Data Wrangler for diabetic patient readmission prediction

Design your Data Wrangler flow file: To get started, on the Studio File menu, choose New, then choose Data Wrangler Flow.

This launches a Data Wrangler instance and configures it with the Data Wrangler app.

Load the data from Amazon S3 into Data Wrangler: To load the data into Data Wrangler, complete the following steps: On the Import tab, choose Amazon S3 as the data source.

With the built-in bias report in Data Wrangler, data scientists can quickly detect bias during the data preparation stage of the ML workflow.

This no-code/low-code capability of Data Wrangler accelerates training data preparation and increases data scientist agility with faster iterative data preparation.

1 week, 4 days ago @ aws.amazon.com
Use Amazon SageMaker ACK Operators to train and deploy machine learning models
Use Amazon SageMaker ACK Operators to train and deploy machine learning models Use Amazon SageMaker ACK Operators to train and deploy machine learning models

AWS recently released the new Amazon SageMaker Operators for Kubernetes using the AWS Controllers for Kubernetes (ACK).

The new SageMaker ACK Operators make it easier for machine learning (ML) developers and data scientists who use Kubernetes as their control plane to train, tune, and deploy ML models in Amazon SageMaker without signing in to the SageMaker console.

ACK includes a common controller runtime, a code generator, and a set of AWS service-specific controllers, one of which is the SageMaker controller.

Alice issues a call to kubectl apply, passing in a file that defines a Kubernetes custom resource describing her SageMaker training job.

The SageMaker controller then communicates…

1 неделя, 4 дня назад @ aws.amazon.com
NVIDIA
последний пост 3 дня, 22 часа назад
A Very Thankful GFN Thursday: New Games, GeForce NOW Gift Cards and More
A Very Thankful GFN Thursday: New Games, GeForce NOW Gift Cards and More A Very Thankful GFN Thursday: New Games, GeForce NOW Gift Cards and More

Plus, kick back for the holiday with four new games coming to the GeForce NOW library this week.

With GeForce NOW, nearly any device can become a GeForce gaming rig.

GeForce NOW Priority Membership digital gift cards are now available in 2-month, 6-month or 12-month options.

Existing Founders and Priority members will have the number of months added to their accounts.

We initially planned to add Farming Simulator 2022 to GeForce NOW in November, but discovered an issue during our onboarding process.

3 дня, 22 часа назад @ blogs.nvidia.com
Discover the Latest in Machine Learning, Graphics, HPC, and IoT at AWS re:Invent
Discover the Latest in Machine Learning, Graphics, HPC, and IoT at AWS re:Invent Discover the Latest in Machine Learning, Graphics, HPC, and IoT at AWS re:Invent

See the latest innovations spanning from the cloud to the edge at AWS re:Invent.

A Product Marketing Manager at NVIDIA discusses model deployment challenges, how NVIDIA Triton simplifies deployment and maximizes performance of AI models, how to use NVIDIA Triton on AWS, and a customer use case.

However, the process of training, optimizing, and running ML models to build AI-powered applications is complex and requires expertise.

The NVIDIA NGC catalog provides GPU-optimized AI software including frameworks, pretrained models, and industry-specific software development kits (SDKs) that accelerate workflows.

This software allows data engineers, data scientists, developers, and DevOps teams to f…

5 дней, 18 часов назад @ developer.nvidia.com
NVIDIA Merlin Extends Open Source Interoperability for Recommender Workflows with Latest Update
NVIDIA Merlin Extends Open Source Interoperability for Recommender Workflows with Latest Update NVIDIA Merlin Extends Open Source Interoperability for Recommender Workflows with Latest Update

Data scientists and machine learning engineers use many methods, techniques, and tools to prep, build, train, deploy, and optimize their machine learning models.

NVIDIA Merlin is designed to streamline recommender workflows.

Merlin Transformers4Rec, designed for recommenders and solving cold-start problems: Recommender methods popularized in mainstream media often rely upon long-term user profiles or lifetime user behavior.

The NVIDIA Merlin team designed Transformers4Rec to be used as a standalone solution or within an ensemble of recommendation models.

Download and try NVIDIA Merlin: The latest update to NVIDIA Merlin, including Transformers4Rec and SOK, strengthens streamlining and accelerat…

6 дней, 19 часов назад @ developer.nvidia.com
‘Paint Me a Picture’: NVIDIA Research Shows GauGAN AI Art Demo Now Responds to Words
‘Paint Me a Picture’: NVIDIA Research Shows GauGAN AI Art Demo Now Responds to Words ‘Paint Me a Picture’: NVIDIA Research Shows GauGAN AI Art Demo Now Responds to Words

A picture worth a thousand words now takes just three or four words to create, thanks to GauGAN2, the latest version of NVIDIA Research’s wildly popular AI painting demo.

The new GauGAN2 text-to-image feature can now be experienced on NVIDIA AI Demos, where visitors to the site can experience AI through the latest demos from NVIDIA Research.

With the versatility of text prompts and sketches, GauGAN2 lets users create and customize scenes more quickly and with finer control.

The AI model behind GauGAN2 was trained on 10 million high-quality landscape images using the NVIDIA Selene supercomputer, an NVIDIA DGX SuperPOD system that’s among the world’s 10 most powerful supercomputers.

One examp…

6 дней, 22 часа назад @ blogs.nvidia.com
An Elevated Experience: Xpeng G9 Takes EV Innovation Higher with NVIDIA DRIVE Orin
An Elevated Experience: Xpeng G9 Takes EV Innovation Higher with NVIDIA DRIVE Orin An Elevated Experience: Xpeng G9 Takes EV Innovation Higher with NVIDIA DRIVE Orin

Electric automaker Xpeng took the wraps off the G9 SUV this week at the international Auto Guangzhou show in China.

The intelligent, software-defined vehicle is built on the high-performance compute of NVIDIA DRIVE Orin and delivers AI capabilities that are continuously upgraded with each over-the-air update.

The Xpeng G9 and its fellow EVs are elevating the driving experience with intelligent features that are always at the cutting edge.

Xpilot 4.0 is built on two NVIDIA DRIVE Orin systems-on-a-chip (SoC), achieving 508 trillion operations per second (TOPS).

Charging Ahead: The G9 is designed for the international market, bringing software-defined innovation to roads around the world.

1 неделя, 2 дня назад @ blogs.nvidia.com
The Need for Speed: Edge AI with NVIDIA GPUs and SmartNICs Part 1
The Need for Speed: Edge AI with NVIDIA GPUs and SmartNICs Part 1 The Need for Speed: Edge AI with NVIDIA GPUs and SmartNICs Part 1

This post shows how to integrate NVIDIA Operators into new edge AI platforms using preinstalled drivers.

There are two operators: the NVIDIA GPU Operator and the NVIDIA Network Operator.

These components make up the NVIDIA GPU Operator. The NVIDIA Network Operator automates ConnectX SmartNIC configuration for Kubernetes pods that need fast networking.

The NVIDIA Network Operator includes the following components: The NVIDIA OFED Driver container automates network driver and library installation.

It is compiled automatically during Linux driver installation if both the ib_core and NVIDIA GPU driver sources are present on the system.

1 неделя, 2 дня назад @ developer.nvidia.com
In Pursuit of Smart City Vision, Startup Two-i Keeps an AI on Worker Safety
In Pursuit of Smart City Vision, Startup Two-i Keeps an AI on Worker Safety In Pursuit of Smart City Vision, Startup Two-i Keeps an AI on Worker Safety

More recently, the company was approached by ExxonMobil to help with a potentially deadly issue: improving worker safety around open oil tanks.

While this use case is highly specific, the company’s AI architecture is designed to flexibly support many different algorithms and functions.

To shorten training times further, Two-i is looking to test its huge image dataset on the powerful NVIDIA A100 GPU.

Two-i has also started benchmarking NVIDIA’s pretrained models against its algorithms and begun using the NVIDIA DeepStream SDK to enhance its video analytics pipeline.

Trombini recognizes that Two-i has to keep its focus on delivering one benefit after another to achieve the company’s long-term…

1 неделя, 2 дня назад @ blogs.nvidia.com
NVIDIA’s CEO Receives Semiconductor Industry’s Top Honor
NVIDIA’s CEO Receives Semiconductor Industry’s Top Honor NVIDIA’s CEO Receives Semiconductor Industry’s Top Honor

They came to network, get an update on the SIA’s work in Washington D.C., and bestow the 2021 Robert N. Noyce award, their highest honor, on the founder and CEO of NVIDIA.

Recognizing “an Icon”: Turning to the Noyce award, Neuffer introduced Huang as “an icon in our industry.”

“I accept this on behalf of all NVIDIA’s employees because it reflects their body of work,” Huang said.

In it, Andrew Ng, a machine-learning pioneer and entrepreneur, described the pivotal role NVIDIA’s CEO has played in AI.

“His impact on the semiconductor industry, AI and the world is almost incalculable.”

1 неделя, 3 дня назад @ blogs.nvidia.com
From Process to Product Design: How Rendermedia Elevates Manufacturing Workflows With XR Experiences
From Process to Product Design: How Rendermedia Elevates Manufacturing Workflows With XR Experiences From Process to Product Design: How Rendermedia Elevates Manufacturing Workflows With XR Experiences

With NVIDIA RTX graphics and NVIDIA CloudXR, Rendermedia helps businesses get their products in the hands of customers and audiences, allowing them to interact and engage collaboratively on any device, from any location.

With NVIDIA CloudXR, Rendermedia and its product manufacturing clients can quickly render and create fully interactive simulated products in photographic detail, while also reducing their time to market.

Rendermedia customers Airbus and National Grid are using VR experiences to showcase future products and designs in realistic scenarios.

And they can easily make changes to designs in real time, helping users iterate more often and get to final product designs quicker.

Learn…

1 неделя, 3 дня назад @ blogs.nvidia.com
A GFN Thursday Deal: Get ‘Crysis Remastered’ Free With Any Six-Month GeForce NOW Membership
A GFN Thursday Deal: Get ‘Crysis Remastered’ Free With Any Six-Month GeForce NOW Membership A GFN Thursday Deal: Get ‘Crysis Remastered’ Free With Any Six-Month GeForce NOW Membership

With any new, paid six-month Priority or GeForce NOW RTX 3080 subscription, members will receive Crysis Remastered for free for a limited time.

GeForce NOW Can Run Crysis … And So Can You: But can it run Crysis?

For a limited time, get a copy of Crysis Remastered free with select GeForce NOW memberships.

Purchase a six-month Priority membership, or the new GeForce NOW RTX 3080 membership, and get a free redeemable code for Crysis Remastered on the Epic Games Store.

Founders, exclusively, can upgrade to a GeForce NOW RTX 3080 membership and receive 10 percent off the subscription price, with no risk to their current Founders benefits.

1 неделя, 3 дня назад @ blogs.nvidia.com
AI of the Tiger: Conservation Biologist Jeremy Dertien on Real-Time Poaching Prevention
AI of the Tiger: Conservation Biologist Jeremy Dertien on Real-Time Poaching Prevention AI of the Tiger: Conservation Biologist Jeremy Dertien on Real-Time Poaching Prevention

Fewer than 4,000 tigers remain worldwide, according to Tigers United, a university consortium that recently began using AI to help save the species.

Jeremy Dertien is a conservation biologist with Tigers United and a Ph.D. candidate in wildlife biology and conservation planning at Clemson University.

He spoke with NVIDIA AI Podcast host Noah Kravitz about a project deploying AI-equipped cameras to monitor poaching in central India, where more than 70 percent of the remaining tiger populations reside.

Now, scientists can take a closer look at endangered species by studying AI-generated 3D representations of them.

Subscribe to the AI Podcast: Get the AI Podcast through iTunes, Google Podcasts, …

1 неделя, 3 дня назад @ blogs.nvidia.com
Get 50% Off Upcoming Hands-On Training from NVIDIA
Get 50% Off Upcoming Hands-On Training from NVIDIA Get 50% Off Upcoming Hands-On Training from NVIDIA

Get hands-on training in AI, deep learning, accelerated computing, and data science with the NVIDIA Deep Learning Institute (DLI).

DLI offers self-paced, online courses as well as instructor-led online workshops.

Thursday, Nov. 18, 7:00 a.m. – 3:00 p.m. PST. Use code DLISC21 to receive half-off full-price registration.

Monday, Dec. 6, 9:00 a.m. – 5:00 p.m. CET. Use code DLIGTC50 to receive half-off full-price registration.

Tuesday, Dec. 14, 9:00 a.m. – 5:00 p.m. CET. Use code DLIGTC50 to receive half-off full-price registration.

1 неделя, 4 дня назад @ developer.nvidia.com
AI Pioneers Write So Should Data Scientists
AI Pioneers Write So Should Data Scientists AI Pioneers Write So Should Data Scientists

Editor’s Note: If you’re interested in sharing your data science and AI expertise, you can apply to write for our blog here.

Data Scientists are intermediaries between the world of numerical representations of data and project stakeholders; therefore, the ability to interpret and convey understanding derived from datasets is essential for Data Scientists.

Writing is one means of communication that equips Data Scientists with the ability to convey and present ideas, patterns, and learnings from data.

Data Scientists are ushering in the future one model at a time and it’s our responsibility to ensure that.

Through writing, Data Scientists have the opportunity to enforce best practices in software…

1 неделя, 4 дня назад @ developer.nvidia.com
An Important Skill for Data Scientists and Machine Learning Practitioners
An Important Skill for Data Scientists and Machine Learning Practitioners An Important Skill for Data Scientists and Machine Learning Practitioners

But there’s a crucial skill that should be attained by Data Scientists, irrespective of their experience, and that is writing.

Machine Learning and Data Science content written on Twitter, LinkedIn, Facebook, Quora, and StackOverflow all falls into this category.

Blogs and Articles: Many experts believe that blogs and articles have a unique role in the machine learning community.

Books for Data Scientists and Machine Learning practitioners: You will find that most of the authors listed above have produced the majority, if not all, of the forms of writing listed in this article, regardless of their domain specialty, which is why I consider writing a vital skill for Machine Learning practitioners and Data Scie…

1 неделя, 4 дня назад @ developer.nvidia.com
MLPerf HPC Benchmarks Show the Power of HPC+AI
MLPerf HPC Benchmarks Show the Power of HPC+AI MLPerf HPC Benchmarks Show the Power of HPC+AI

NVIDIA-powered systems won four of five tests in MLPerf HPC 1.0, an industry benchmark for AI performance on scientific applications in high performance computing.

They’re the latest results from MLPerf, a set of industry benchmarks for deep learning first released in May 2018.

What the Benchmarks Measure: MLPerf HPC 1.0 measured training of AI models in three typical workloads for HPC centers.

Compared to the best results in strong scaling from last year’s MLPerf 0.7 round, NVIDIA delivered 5x better results for CosmoFlow.

The MLPerf benchmarks are backed by MLCommons, an industry group led by Alibaba, Google, Intel, Meta, NVIDIA and others.

1 неделя, 4 дня назад @ blogs.nvidia.com
Facebook
последний пост 4 месяца, 2 недели назад
Fully Sharded Data Parallel: faster AI training with fewer GPUs
Fully Sharded Data Parallel: faster AI training with fewer GPUs Fully Sharded Data Parallel: faster AI training with fewer GPUs

It shards an AI model’s parameters across data parallel workers and can optionally offload part of the training computation to the CPUs.
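
As a minimal sketch of what wrapping a model looks like with FairScale’s FSDP class (layer sizes and hyperparameters are placeholders, and it assumes the distributed process group is already initialized, e.g. via torchrun):

import torch
import torch.nn as nn
from fairscale.nn.data_parallel import FullyShardedDataParallel as FSDP

# Wrapping shards the module's parameters across the data-parallel workers
model = FSDP(nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).cuda())
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(32, 512, device="cuda")   # dummy batch for illustration
loss = model(x).sum()
loss.backward()                            # gradients are reduced and re-sharded automatically
optimizer.step()
optimizer.zero_grad()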

For example, typical data parallel training requires maintaining redundant copies of the model on each GPU, and model parallel training introduces additional communication costs to move activations between workers (GPUs).

Using FSDP in computer vision models: For computer vision models, FSDP is supported in VISSL and tested on RegNet architectures.

Users may need to carefully tune the activation checkpointing strategy to fit a large model within limited GPU memory space.

We look forward to developing algorithms for auto-tuning both GPU memory usage and trai…

4 месяца, 2 недели назад @ engineering.fb.com
Asicmon: A platform agnostic observability system for AI accelerators
Asicmon: A platform agnostic observability system for AI accelerators Asicmon: A platform agnostic observability system for AI accelerators

We will be hosting a talk about our work on, “A Platform Agnostic Observability System for AI Accelerators” during our virtual Systems @Scale event at 10:20 a.m. PT on Wednesday, June 30, followed by a live Q&A session.

To meet these challenges, we’ve introduced three new tools: ASIC Monitoring (Asicmon), a scalable observability framework.

However, with an accelerator system, we can imagine the CPU now has a complicated and brawnier sibling!

Since implementing Asicmon, we’ve been able to increase our AI accelerator metrics support from ~30 percent to ~75 percent. Atrace: Accelerator tracing at scale. Why tracing?

This would allow us to debug the end-to-end latency of microservices that use AI a…

5 месяцев назад @ engineering.fb.com
How Facebook encodes your videos
How Facebook encodes your videos How Facebook encodes your videos

People upload hundreds of millions of videos to Facebook every day.

From a pure computing perspective, applying the most advanced codecs to every video uploaded to Facebook would be prohibitively inefficient.

A relatively small percentage (roughly one-third) of all videos on Facebook generate the majority of overall watch time.

The impact of the new video encoding model: In addition to improving viewer experience with newly uploaded videos, the new model can identify older videos on Facebook that should have been encoded with more advanced encodings and route more computing resources to them.

The improved compression has also allowed people on Facebook with limited data plans, such as those i…

7 месяцев, 3 недели назад @ engineering.fb.com
neptune.ai neptune.ai
последний пост 2 дня, 18 часов назад
The Best Weights & Biases Alternatives
The Best Weights & Biases Alternatives The Best Weights & Biases Alternatives

Weights & Biases is a cloud-based service that allows you to host your experiments in a single central repository; if you have private infrastructure, Weights & Biases can also be deployed on it.

However, when it comes to delivery, Weights & Biases isn’t always the best option.

Weights & Biases vs Neptune: Both Neptune and Weights & Biases are hosted services, and they provide experiment tracking, model management, and data versioning.
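
For context, experiment tracking in Weights & Biases typically looks like the minimal sketch below (the project name, config values, and metrics are placeholders, not a recommendation of any particular setup):

import wandb

# Start a run and record hyperparameters
run = wandb.init(project="my-project", config={"lr": 1e-3, "epochs": 5})

for epoch in range(run.config.epochs):
    train_loss = 1.0 / (epoch + 1)         # stand-in for a real training loop
    wandb.log({"epoch": epoch, "train_loss": train_loss})

run.finish()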

Weights & Biases vs Kubeflow: Using Kubeflow Pipelines or KFServing components in Kubeflow, you can deploy machine learning models on Docker, something that is missing from Weights & Biases.

Weights & Biases vs Amazon SageMaker Studio: SageMaker Studio has an eas…

2 дня, 18 часов назад @ neptune.ai
Hugging Face Pre-trained Models: Find the Best One for Your Task
Hugging Face Pre-trained Models: Find the Best One for Your Task Hugging Face Pre-trained Models: Find the Best One for Your Task

There are many companies that provide open source libraries containing pre-trained models and Hugging Face is one of them.

In the upcoming sections, we will be covering Hugging Face and its transformers in detail with some hands-on exercises.

Basic tasks supported by Hugging Face: Before we learn how a Hugging Face model can be used to implement NLP solutions, we need to know what basic NLP tasks Hugging Face supports and why we care about them.

Fortunately, Hugging Face has a model hub, a collection of pre-trained and fine-tuned models for all the tasks mentioned above.
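
As a quick, hedged example of pulling one of those hub models, the transformers pipeline API lets you run a task in a few lines (the default model for the task is downloaded on first use):

from transformers import pipeline

# Downloads a default pre-trained model for the task from the Hugging Face model hub
classifier = pipeline("sentiment-analysis")
print(classifier("The Hugging Face model hub makes trying pre-trained models easy."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]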

There are many models which are available in the Hugging Face model hub or can be created using multilingua…

4 дня, 1 час назад @ neptune.ai
Best Tools for ML Model Governance, Provenance, and Lineage
Best Tools for ML Model Governance, Provenance, and Lineage Best Tools for ML Model Governance, Provenance, and Lineage

ML software development is complex; building an ML model is one thing, improving and maintaining it is another.

Model governance, model provenance, and model lineage tools help you in doing just that by tracking model activity, recording all changes in the data and the model, and outlining best practices for data management and disposal.

Model provenance is tightly connected to the model governance process and describes the origin of the model and the processing steps that were applied to it.

Top tools for model governance, model provenance, and model lineage: Since model governance, model provenance, and model lineage are tightly interconnected, here is a unified list of tools that help ML …

6 дней, 20 часов назад @ neptune.ai
Moving From TensorFlow To PyTorch
Moving From TensorFlow To PyTorch Moving From TensorFlow To PyTorch

While TensorFlow seems like a great tool to have in your arsenal for most deep learning tasks, most people now prefer to switch from TensorFlow to PyTorch.

PyTorch is best for developing rapid prototypes and research projects, which is why most people are choosing to transition from TensorFlow to PyTorch.

You may also like 💡 How to Keep Track of Experiments in PyTorch Using Neptune. Is it easy to switch from TensorFlow to PyTorch?

Understanding the simpler PyTorch installation: In comparison to the TensorFlow installation, the PyTorch installation is much simpler. (Figure: PyTorch installation | Source)
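
To make the comparison concrete, here is a minimal sketch of the kind of model definition and training loop you write once PyTorch is installed (toy data and sizes, illustrative only):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

x, y = torch.randn(64, 10), torch.randn(64, 1)   # toy batch
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
print(loss.item())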

The transition from TensorFlow to PyTorch isn’t too complex because PyTorch offers…

1 неделя, 3 дня назад @ neptune.ai
K-Means Clustering Explained
K-Means Clustering Explained K-Means Clustering Explained

(Figure: K-means clustering | Source: Author) Re-initialize centroids: Next, we will re-initialize the centroids by calculating the average of all data points of that cluster.

In this section, we will be building a K-means clustering algorithm from scratch using a random centroid initialization method.
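
A minimal sketch of that from-scratch loop (random centroid initialization, nearest-centroid assignment, then re-initializing each centroid as the mean of its cluster) might look like the following; it is illustrative rather than the article’s exact code, and it skips empty-cluster handling:

import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # random centroid initialization
    for _ in range(n_iters):
        # assign each point to its nearest centroid
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # re-initialize each centroid as the average of the points in its cluster
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

labels, centroids = kmeans(np.random.rand(200, 2), k=3)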

May be useful: Check how you can keep track of your classifiers, regressors, and k-means clustering results when using Scikit-Learn.

GMM vs K-means clustering algorithm and why K-means is so popular: GMM uses probability distributions, while K-means uses distance metrics to compute the difference between data points and segregate the data into different clusters.

Using techniques such as K-means Clustering, one c…

3 недели, 4 дня назад @ neptune.ai
Top Model Versioning Tools for Your ML Workflow
Top Model Versioning Tools for Your ML Workflow Top Model Versioning Tools for Your ML Workflow

In this article, I will be discussing my top 6 model versioning tools that can greatly improve your workflow.

Topics covered: types of model versioning tools, model versioning vs data versioning, what tools can be used for model versioning, and how these tools compare with each other.

Model versioning in a way involves tracking the changes made to an ML model that has been previously built.

Model versioning vs data versioning: In some cases, the differences between model and data versioning are quite clear.

Best tools for model versioning: Here I will discuss six different tools for model versioning, outlining steps to get started and their different capabilities.

3 недели, 5 дней назад @ neptune.ai
Arize AI & Neptune AI Partnership: Continuous Monitoring, Continuous Improvements for ML Models
Arize AI & Neptune AI Partnership: Continuous Monitoring, Continuous Improvements for ML Models Arize AI & Neptune AI Partnership: Continuous Monitoring, Continuous Improvements for ML Models

It is imperative to build a continuous feedback loop as part of any model performance management workflow.

Tooling to experiment, validate, monitor, and actively improve models helps ML practitioners consistently deploy — and maintain — higher-quality models in production.

That’s why Arize AI and Neptune AI are pleased to announce a partnership, connecting Arize’s ML observability platform with Neptune’s metadata store for MLOps.

TL;DR – ML practitioners can fine-tune model performance and actively improve models at a more granular level when using Arize and Neptune combined.

Arize’s automated model monitoring and analytics platform allow ML teams to quickly detect issues when they emerge, …

1 месяц назад @ neptune.ai
Dimensionality Reduction for Machine Learning
Dimensionality Reduction for Machine Learning Dimensionality Reduction for Machine Learning

Topics covered: tools and libraries used for dimensionality reduction, algorithms used for dimensionality reduction, applications, and advantages and disadvantages. What is dimensionality reduction?

Dimensionality reduction is commonly used in data visualization to understand and interpret the data, and in machine learning or deep learning techniques to simplify the task at hand.

To gain some understanding of the relationships between these points, we can use PCA to project them to lower dimensions, like 2-D:

from sklearn.decomposition import PCA
pca = PCA(2)
projected = pca.fit_transform(digits.data)
print(digits.data.shape)
print(projected.shape)

(1797, 64)
(1797, 2)

Now, let's plot the first two principal componen…

1 месяц назад @ neptune.ai
Deploying Your Next Image Classification on Serverless (AWS Lambda, GCP Cloud Function, Azure Automation)
Deploying Your Next Image Classification on Serverless (AWS Lambda, GCP Cloud Function, Azure Automation) Deploying Your Next Image Classification on Serverless (AWS Lambda, GCP Cloud Function, Azure Automation)

We will use Amazon’s NoSQL serverless database DynamoDB for this as it scales pretty well and integrates with the serverless function quite easily.
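
As a rough sketch of that integration (the table name, key schema, and payload fields here are assumptions, not the article’s exact setup), the serverless function can write each prediction to DynamoDB like this:

import json
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("ImagePredictions")        # assumed table with primary key "image_id"

def lambda_handler(event, context):
    # In the real application, the label would come from the image classification model
    record = {"image_id": event["image_id"], "label": event.get("label", "unknown")}
    table.put_item(Item=record)
    return {"statusCode": 200, "body": json.dumps(record)}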

Go back to the CodeBuild page for the project you created and click on Start build to test your application build.

You can now make a final commit of your application source code to ensure your pipeline works as intended.

Deploying your image classification application to AWS Lambda: Now your application build is all set to be deployed to a serverless function.

To deploy it, we have to set up AWS Lambda to run the application build whenever there is a new event.

1 месяц, 1 неделя назад @ neptune.ai
Azure ML (AML) Alternatives for MLOps
Azure ML (AML) Alternatives for MLOps Azure ML (AML) Alternatives for MLOps

Azure Machine Learning (AML) is a cloud-based machine learning service for data scientists and ML engineers.

You’re in luck – in this article, we’re reviewing multiple alternatives to Azure Machine Learning (AML) for MLOps.

To find alternatives for AML, first we need to analyze exactly what AML does: Experiment Tracking, Model Management, Model Deployment, Model Lineage, Model Monitoring, and Data Labelling.

In G2 Crowd reviews of Azure ML, users have mentioned that Azure ML doesn’t meet the requirements for their specific use cases; it’s not so flexible, and customization can be hard.

For more data labeling tools, check also ➡️ Data Labeling Software: Best Tools for Data Labeling in 2021.

1 месяц, 1 неделя назад @ neptune.ai
ML Engineer vs Data Scientist
ML Engineer vs Data Scientist ML Engineer vs Data Scientist

However, as data science has evolved over the decade, it has become clearer that data science involves more than just modeling.

The machine learning lifecycle, from raw data through to deployment, now relies on specialized experts including data engineers, data scientists, machine learning engineers along with product and business managers.

Thus, the definition and scope of a data scientist vs. a machine learning engineer is very contextual and depends upon how mature the data science team is.

DATA SCIENTIST | MACHINE LEARNING ENGINEER
Problem-solving | Programming
Programming | Data structures
Statistics | Data modeling
Data science | Software engineering
Machine learning: Supervised & Unsupervised | …

1 месяц, 1 неделя назад @ neptune.ai
Continuous Control With Deep Reinforcement Learning
Continuous Control With Deep Reinforcement Learning Continuous Control With Deep Reinforcement Learning

This time I want to explore how deep reinforcement learning can be utilized e.g.

(Figure: Continuous Control with Deep Reinforcement Learning | Source: Roboschool) Meet Humanoid.

Off-policy actor-critic methods: (Figure: Reinforcement Learning | Source: Sutton & Barto, Reinforcement Learning: An Introduction, 2nd edition) Let’s recap: Reinforcement learning (RL) is learning what to do — how to map situations to actions — to maximize some notion of cumulative reward.

(Figure: Reinforcement learning | Source: Researchgate) The actor-critic architecture, depicted in the diagram above, divides the agent into two pieces, the Actor and the Critic.
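
To make the split concrete, here is a heavily simplified sketch of the two pieces and their DDPG-style updates (network sizes are arbitrary, target networks are omitted, and a random batch stands in for a replay buffer; this is not the article’s full implementation):

import torch
import torch.nn as nn

obs_dim, act_dim, gamma = 8, 2, 0.99
actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

# One fake batch standing in for samples drawn from a replay buffer
obs, act = torch.randn(32, obs_dim), torch.randn(32, act_dim)
rew, next_obs = torch.randn(32, 1), torch.randn(32, obs_dim)

# Critic: regress Q(s, a) toward r + gamma * Q(s', actor(s'))
with torch.no_grad():
    target_q = rew + gamma * critic(torch.cat([next_obs, actor(next_obs)], dim=1))
critic_loss = nn.functional.mse_loss(critic(torch.cat([obs, act], dim=1)), target_q)
critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

# Actor: push actions toward higher critic values (gradient ascent via a negated loss)
actor_loss = -critic(torch.cat([obs, actor(obs)], dim=1)).mean()
actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()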

You can read more about logging these diagnostics in Logging in Reinforcement Learni…

1 месяц, 2 недели назад @ neptune.ai
Tips and Tricks to Train State-Of-The-Art NLP Models
Tips and Tricks to Train State-Of-The-Art NLP Models Tips and Tricks to Train State-Of-The-Art NLP Models

With the introduction of packages like transformers by Hugging Face, it is very convenient to train NLP models for any given task.
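
For instance, a minimal fine-tuning setup with the Trainer API looks roughly like the sketch below (the model checkpoint, dataset, and hyperparameters are placeholders; your task will differ):

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)

dataset = load_dataset("imdb")
tokenized = dataset.map(lambda x: tokenizer(x["text"], truncation=True, padding="max_length"),
                        batched=True)

args = TrainingArguments(output_dir="out", num_train_epochs=1, per_device_train_batch_size=8)
trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)))
trainer.train()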

What methods, tips, and tricks can you use to train state-of-the-art NLP models?

Most of the options needed to quickly make comparisons are available in a couple of clicks; therefore, we can interactively run analyses and train models.

The field is evolving drastically and researchers are discovering new tricks every day to reliably train these models.

We hope this article leaves you empowered with great techniques and tools to confidently train and track stable, state-of-the-art NLP models.

1 месяц, 2 недели назад @ neptune.ai
Version Control for Machine Learning and Data Science
Version Control for Machine Learning and Data Science Version Control for Machine Learning and Data Science

Read also 🎓 Version Control Guide for Machine Learning Researchers. Types of version control systems: There are three main types of version control systems: Local Version Control Systems, Centralized Version Control Systems, and Distributed Version Control Systems. Local Version Control System: This version control system consists of a local database on your computer that stores every file change as a patch (difference between files in a unique format).

(Figure: Local Version Control System | Source: Gitlab) Centralized Version Control Systems: This system has a central, single server that stores the different file versions.

(Figure: Distributed Version Control Systems | Source: Gitlab) Benefits of version control systems: Effi…

1 месяц, 2 недели назад @ neptune.ai
Transformer Models for Textual Data Prediction
Transformer Models for Textual Data Prediction Transformer Models for Textual Data Prediction

Predicting paraphrased sentences: Often when we look at text data, we’re trying to understand if certain sentences are related.

This is useful when processing large amounts of text data, and you want to search for or identify text which may be related without being identical.
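
One common way to do this, sketched below, is to embed sentences and compare them with cosine similarity, for example with the sentence-transformers package (the model name is just an illustration):

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")        # any sentence-embedding model works here
a, b = model.encode(["How do I reset my password?",
                     "I forgot my password, how can I change it?"])

cosine = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"similarity: {cosine:.2f}")                     # values near 1.0 suggest a paraphrase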

Predicting text data with Transformers. The dataset: For predicting text data, we want to use a dataset that represents “real” queries and conversations.

But it’s always good to keep an eye on new applications like this which may be improved by new NLP models like BERT or GPT.

Models like Google’s T5 Text to Text Transformer, which was released in 2020, are examples of transformer-based models that can perform multiple task…

1 месяц, 2 недели назад @ neptune.ai
▶️ YouTube
Yannic Kilcher Yannic Kilcher
последний пост 1 день, 18 часов назад
Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions (Paper Explained)
Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions (Paper Explained) Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions (Paper Explained)

#imle #backpropagation #discrete Backpropagation is the workhorse of deep learning, but unfortunately, it only works for continuous functions that are amenable to the chain rule of differentiation. Since discrete algorithms have no continuous derivative, deep networks with such algorithms as part of them cannot be effectively trained using backpropagation. This paper presents a method to incorporate a large class of algorithms, formulated as discrete exponential family distributions, into deep networks and derives gradient estimates that can easily be used in end-to-end backpropagation. This enables things like combinatorial optimizers to be part of a network's forward propagation natively.…

1 день, 18 часов назад @ youtube.com
Peer Review is still BROKEN! The NeurIPS 2021 Review Experiment (results are in)
Peer Review is still BROKEN! The NeurIPS 2021 Review Experiment (results are in) Peer Review is still BROKEN! The NeurIPS 2021 Review Experiment (results are in)

#neurips #peerreview #machinelearning A look at the results of the 2021 NeurIPS peer review experiment. https://arxiv.org/abs/2109.09774

https://www.reddit.com/r/MachineLearning/comments/qzjuvk/discussion_neurips_2021_finally_accepted/ Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

LinkedIn: https://www.linkedin.com/in/ykilcher

BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financial…

3 дня, 17 часов назад @ youtube.com
Parameter Prediction for Unseen Deep Architectures (w/ First Author Boris Knyazev)
Parameter Prediction for Unseen Deep Architectures (w/ First Author Boris Knyazev) Parameter Prediction for Unseen Deep Architectures (w/ First Author Boris Knyazev)

#deeplearning #neuralarchitecturesearch #metalearning Deep Neural Networks are usually trained from a given parameter initialization using SGD until convergence at a local optimum. This paper goes a different route: Given a novel network architecture for a known dataset, can we predict the final network parameters without ever training them? The authors build a Graph-Hypernetwork and train on a novel dataset of various DNN-architectures to predict high-performing weights. The results show that not only can the GHN predict weights with non-trivial performance, but it can also generalize beyond the distribution of training architectures to predict weights for networks that are much larger, de…

4 дня, 17 часов назад @ youtube.com
Learning Rate Grafting: Transferability of Optimizer Tuning (Machine Learning Research Paper Review)
Learning Rate Grafting: Transferability of Optimizer Tuning (Machine Learning Research Paper Review) Learning Rate Grafting: Transferability of Optimizer Tuning (Machine Learning Research Paper Review)

#grafting #adam #sgd The last few years of deep learning research have given rise to a plethora of different optimization algorithms, such as SGD, AdaGrad, Adam, LARS, LAMB, etc., which all claim to have their own peculiarities and advantages. In general, all algorithms modify two major things: the (implicit) learning rate schedule and a correction to the gradient direction. This paper introduces grafting, which allows the induced learning rate schedule of one optimizer to be transferred to another. In doing so, the paper shows that much of the benefit of adaptive methods (e.g., Adam) is actually due to this schedule, and not necessarily to the gradient direction correction. Grafting allows for …

1 неделя, 1 день назад @ youtube.com
[ML News] Cedille French Language Model | YOU Search Engine | AI Finds Profitable MEME TOKENS
[ML News] Cedille French Language Model | YOU Search Engine | AI Finds Profitable MEME TOKENS [ML News] Cedille French Language Model | YOU Search Engine | AI Finds Profitable MEME TOKENS

#mlnews #cedille #wmt Only the greatest of news from the world of Machine Learning. OUTLINE:

0:00 - Sponsor: Weights & Biases

1:50 - Cedille - French Language Model

3:55 - Facebook AI Multilingual model wins WMT

5:50 - YOU private search engine

10:35 - DeepMind's Open-Source Arnheim

12:10 - Company sued for using AI to make website more accessible

18:05 - Alibaba DAMO Academy creates 10 Trillion M6 model

21:15 - AMD MI200 Family

22:30 - State of AI report 2021

24:15 - Andrew Ng's Landing AI raises 57M

25:40 - Cerebras raises 250M

26:45 - Microsoft's Varuna: Scalable Training of Huge Models

28:15 - Laura Ruis reproduces Extrapolation Paper

29:05 - Ian Charnas' Real-Life Punchout

30:00 - Help…

1 неделя, 3 дня назад @ youtube.com
Gradients are Not All You Need (Machine Learning Research Paper Explained)
Gradients are Not All You Need (Machine Learning Research Paper Explained) Gradients are Not All You Need (Machine Learning Research Paper Explained)

#deeplearning #backpropagation #simulation More and more systems are made differentiable, which means that accurate gradients of these systems' dynamics can be computed exactly. While this development has led to a lot of advances, there are also distinct situations where backpropagation can be a very bad idea. This paper characterizes a few such systems in the domain of iterated dynamical systems, often including some source of stochasticity, resulting in chaotic behavior. In these systems, it is often better to use black-box estimators for gradients than computing them exactly. OUTLINE:

0:00 - Foreword

1:15 - Intro & Overview

3:40 - Backpropagation through iterated systems

12:10 - Connecti…

1 неделя, 6 дней назад @ youtube.com
[ML News] Microsoft combines Images & Text | Meta makes artificial skin | Russians replicate DALL-E
[ML News] Microsoft combines Images & Text | Meta makes artificial skin | Russians replicate DALL-E [ML News] Microsoft combines Images & Text | Meta makes artificial skin | Russians replicate DALL-E

#mlnews #turing #reskin The latest and greatest from the Machine Learning world OUTLINE:

0:00 - Intro

0:15 - Sponsor: Weights & Biases Tables

3:25 - Microsoft Turing Bletchley: Universal Image Language Representation Model

6:35 - Meta AI Tactile Sensing

9:55 - AnimeGANv2

11:35 - General In-Hand Object Re-Orientation

13:05 - Does Facebook score the "Anger" Emoji too high?

17:05 - IsomorphicLabs: New Alphabet Company for Drug Discovery

18:15 - ruDALL-E: Russian DALL-E

20:40 - Image Scaling Attacks

23:25 - Azure OpenAI Service

24:10 - Neural MMO

25:40 - ArxivDOOM

26:50 - ARC Game

29:35 - ResNeXtGuesser

29:55 - Zillow loses money based on AI home price estimation

31:35 - Helpful Things

35:40 - …

2 недели, 3 дня назад @ youtube.com
Autoregressive Diffusion Models (Machine Learning Research Paper Explained)
Autoregressive Diffusion Models (Machine Learning Research Paper Explained) Autoregressive Diffusion Models (Machine Learning Research Paper Explained)

#machinelearning #ardm #generativemodels Diffusion models have made large advances in recent months as a new type of generative models. This paper introduces Autoregressive Diffusion Models (ARDMs), which are a mix between autoregressive generative models and diffusion models. ARDMs are trained to be agnostic to the order of autoregressive decoding and give the user a dynamic tradeoff between speed and performance at decoding time. This paper applies ARDMs to both text and image data, and as an extension, the models can also be used to perform lossless compression. OUTLINE:

0:00 - Intro & Overview

3:15 - Decoding Order in Autoregressive Models

6:15 - Autoregressive Diffusion Models

8:35 - D…

2 недели, 4 дня назад @ youtube.com
[ML News] Google introduces Pathways | OpenAI solves Math Problems | Meta goes First Person
[ML News] Google introduces Pathways | OpenAI solves Math Problems | Meta goes First Person [ML News] Google introduces Pathways | OpenAI solves Math Problems | Meta goes First Person

#pathways #mlnews #ego4d Your irregular dose of Machine Learning News. OUTLINE:

0:00 - Intro

0:20 - Sponsor: Weights & Biases

2:10 - Google Introduces Pathways AI Architecture

6:30 - OpenAI trains Language Models to do High School Math

8:25 - Sam Altman says Neural Networks truly learn

9:35 - Google AI researchers frustrated with lawyers

12:10 - DeepMind RL Lecture Series 2021

12:40 - Fashion Store sells Adversarial Patches

13:15 - A viable method to remove the GIL from CPython

15:05 - BigScience Workshop releases T0

17:40 - Huggingface Hub Dataset Viewer

18:10 - Scite classifies scientific citations

19:25 - Facebook AI Ego4D dataset & challenges

21:50 - Tesla Dojo Configurable Floating Poi…

3 недели, 2 дня назад @ youtube.com
EfficientZero: Mastering Atari Games with Limited Data (Machine Learning Research Paper Explained)
EfficientZero: Mastering Atari Games with Limited Data (Machine Learning Research Paper Explained) EfficientZero: Mastering Atari Games with Limited Data (Machine Learning Research Paper Explained)

#efficientzero #muzero #atari Reinforcement Learning methods are notoriously data-hungry. Notably, MuZero learns a latent world model just from scalar feedback of reward- and policy-predictions, and therefore relies on scale to perform well. However, most RL algorithms fail when presented with very little data. EfficientZero makes several improvements over MuZero that allows it to learn from astonishingly small amounts of data and outperform other methods by a large margin in the low-sample setting. This could be a staple algorithm for future RL research. OUTLINE:

0:00 - Intro & Outline

2:30 - MuZero Recap

10:50 - EfficientZero improvements

14:15 - Self-Supervised consistency loss

17:50 - E…

3 недели, 4 дня назад @ youtube.com
[YTalks] Siraj Raval - Stories about YouTube, Plagiarism, and the Dangers of Fame (Interview)
[YTalks] Siraj Raval - Stories about YouTube, Plagiarism, and the Dangers of Fame (Interview) [YTalks] Siraj Raval - Stories about YouTube, Plagiarism, and the Dangers of Fame (Interview)

#ytalks #siraj #plagiarism A conversation with Siraj Raval about his journey on YouTube, and the perils of fame. OUTLINE:

0:00 - Intro

1:30 - Welcome

3:15 - Starting out: From Economics to YouTube

13:00 - More Views: Plagiarizing Video Content

23:30 - One Step Up: Copying A Research Paper

29:15 - Was there another way?

39:00 - Clickbait Course: Make Money with Machine Learning

50:30 - Rock Bottom and the Way Forward

1:01:30 - Advice for Future Generations Siraj's Channel: https://www.youtube.com/c/SirajRaval Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.…

4 недели назад @ youtube.com
[ML News] NVIDIA GTC'21 | DeepMind buys MuJoCo | Google predicts spreadsheet formulas
[ML News] NVIDIA GTC'21 | DeepMind buys MuJoCo | Google predicts spreadsheet formulas [ML News] NVIDIA GTC'21 | DeepMind buys MuJoCo | Google predicts spreadsheet formulas

#gtc21 #mlnews #mujoco Register to GTC'21 and Win a RTX 3090: https://nvda.ws/2Y2B5ni OUTLINE:

0:00 - Intro

0:15 - Sponsor: NVIDIA GTC'21

5:35 - DeepMind buys & Open-Sources MuJoCo

7:25 - PyTorch 1.10 Released

9:10 - Google Predicts Spreadsheet Formulas

11:25 - handtracking.io

12:25 - Cell Instance Segmentation Challenge

13:00 - Helpful Libraries

17:50 - Waymo cars keep turning into same dead-end

19:35 - BlueRiver balances tractors References:

DeepMind buys & open-sources MuJoCo

https://deepmind.com/blog/announcements/mujoco PyTorch 1.10 released

https://pytorch.org/blog/pytorch-1.10-released/

https://developer.nvidia.com/blog/cuda-graphs/ GoogleAI predicts spreadsheet formulas

https://ai.g…

1 месяц назад @ youtube.com
[ML News GERMAN] NVIDIA GTC'21 | DeepMind kauft MuJoCo | Google Lernt Spreadsheet Formeln
[ML News GERMAN] NVIDIA GTC'21 | DeepMind kauft MuJoCo | Google Lernt Spreadsheet Formeln [ML News GERMAN] NVIDIA GTC'21 | DeepMind kauft MuJoCo | Google Lernt Spreadsheet Formeln

#gtc21 #mlnews #mujoco Register for GTC'21 and win an RTX 3090: https://nvda.ws/2Y2B5ni OUTLINE:

0:00 - Intro

0:15 - Sponsor: NVIDIA GTC'21

6:10 - DeepMind buys & open-sources MuJoCo

9:05 - PyTorch 1.10 released

11:25 - Google learns spreadsheet formulas

14:15 - handtracking.io

15:25 - Cell instance segmentation challenge

16:15 - Helpful libraries

23:15 - Waymo cars all get lost in the same dead end

24:50 - BlueRiver balances tractors References:

DeepMind buys & open-sources MuJoCo

https://deepmind.com/blog/announcements/mujoco PyTorch 1.10 veröffentlicht

https://pytorch.org/blog/pytorch-1.10-released/

https://developer.nvidia.com/blog/cuda-graphs/ GoogleAI sa…

1 месяц назад @ youtube.com
I went to an AI Art Festival in Geneva (AiiA Festival Trip Report)
I went to an AI Art Festival in Geneva (AiiA Festival Trip Report) I went to an AI Art Festival in Geneva (AiiA Festival Trip Report)

#aiia #ai #art A trip report from the AiiA Festival in Geneva organized by the ImpactAI foundation. OUTLINE:

0:00 - Intro

1:50 - Laura Tocmacov: The Festival

4:10 - Timothy O'Hear: The Tech

6:50 - Jonathan O'Hear: The Robot

11:50 - Cléa Chopard: The Artist

17:45 - Final Words Website: https://aiiafestival.org/en/ Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/ykilc…

1 месяц назад @ youtube.com
Symbolic Knowledge Distillation: from General Language Models to Commonsense Models (Explained)
Symbolic Knowledge Distillation: from General Language Models to Commonsense Models (Explained) Symbolic Knowledge Distillation: from General Language Models to Commonsense Models (Explained)

#gpt3 #knowledge #symbolic Symbolic knowledge models are usually trained on human-generated corpora that are cumbersome and expensive to create. Such corpora consist of structured triples of symbolic knowledge. This paper takes a different approach and attempts to generate such a corpus by prompting GPT-3. Results show that clever prompting, combined with targeted small critic models trained on human ratings can outperform both human-generated data, as well as the teacher model (GPT-3) itself. The results of this paper give a general recipe for automatically building corpora for various NLP tasks by extracting samples from large language models. OUTLINE:

0:00 - Intro & Overview

2:30 - Spons…

1 месяц назад @ youtube.com
Henry AI Labs Henry AI Labs
последний пост 2 дня, 19 часов назад
KerasBERT on the Weaviate Podcast
KerasBERT on the Weaviate Podcast KerasBERT on the Weaviate Podcast

Look out for the podcast on SeMI YouTube: https://www.youtube.com/channel/UCJKT6kJ3IFYybWnL7jbXxhQ KerasBERT quick intro video: https://www.youtube.com/watch?v=J3P8WLAELqk (Full Explanation coming soon)

2 дня, 19 часов назад @ youtube.com
Graph Embeddings and PyTorch-BigGraph
Graph Embeddings and PyTorch-BigGraph Graph Embeddings and PyTorch-BigGraph

This video provides an overview of Graph Embeddings and how PyTorch-BigGraph enables learning graph embeddings for very large graphs. The key challenge with this is you would need a lot of memory to store the vectors for each node in a large graph. To solve this, BigGraph uses novel partitioning, distributed execution, and negative sampling algorithms. I hope this is a decent introduction to graph embeddings and PyTorch-BigGraph, really excited about the upcoming release of the Wikidata Weaviate web demo! PyTorch-BigGraph (blog post) - https://ai.facebook.com/blog/open-sourcing-pytorch-biggraph-for-faster-embeddings-of-extremely-large-graphs/

PyTorch-BigGraph (paper) - https://arxiv.org/abs…

1 неделя, 6 дней назад @ youtube.com
Dmitry Kan on Vector Search Engines
Dmitry Kan on Vector Search Engines Dmitry Kan on Vector Search Engines

Subscribe to the Vector Podcast: https://www.youtube.com/channel/UCCIMPfR7TXyDvlDRXjVhP1g Not All Vector Databases are Equal: https://towardsdatascience.com/milvus-pinecone-vespa-weaviate-vald-gsi-what-unites-these-buzz-words-and-what-makes-each-9c65a3bd0696 Get started with Weaviate!

https://www.semi.technology/developers/weaviate/current/getting-started/quick-start.html Chapters

0:00 Introduction

2:55 What inspired your interest?

6:24 Weaviate Modules - Search + QA and more

7:40 BERT + Elasticsearch

13:42 FAISS and recent developments

17:37 Mixed Precision

18:21 Symbolic Filters and Vector Search

24:34 Approximate Nearest Neighbor Benchmarks

27:38 ANN vs. Information Retrieval

32:58 Robus…

2 недели, 4 дня назад @ youtube.com
Presenting... The Weaviate Podcast!
Presenting... The Weaviate Podcast! Presenting... The Weaviate Podcast!

This is a prelude to the Weaviate Podcast (moving to SeMI YouTube).

Etienne Dilocker and I discuss the Weaviate Slack, the Weaviate console for Dataset Interfaces, HNSW, and many more slack topics such as encoding longer than 512 tokens for vector similarity search. Thank you so much for watching, please head over to the SeMI YouTube for future episodes and check out the Weaviate slack if you would like your questions answered here! Weaviate Slack: https://join.slack.com/t/weaviate/shared_invite/zt-goaoifjr-o8FuVz9b1HLzhlUfyfddhw

Weaviate Web Demo: https://www.semi.technology/developers/weaviate/current/getting-started/quick-start.html

ANN Benchmarks: http://ann-benchmarks.com/

Etienne Dilo…

3 недели, 4 дня назад @ youtube.com
AugMax Explained!
AugMax Explained! AugMax Explained!

There has been a new breakthrough in using Data Augmentation in Deep Learning. Researchers from NVIDIA, Caltech, UT Austin, and Arizona State have published AugMax. The acronym AugMax communicates the use of Data Augmentation and maximizing the diversity and difficulty of augmented examples. AugMax is basically a free lunch in Deep Learning, meaning that it is a very strong technique to add to any existing Deep Learning workflow. More particularly, Data Augmentation has been heavily studied with image data in Computer Vision. AugMax is the sequel to AugMix.

This predecessor tree of research in Data Augmentation is roughly AutoAugment to RandAugment to AugMix, and now AugMax. AugMax further…

1 месяц назад @ youtube.com
Weaviate's GraphQL API for Neurosymbolic Search!
Weaviate's GraphQL API for Neurosymbolic Search! Weaviate's GraphQL API for Neurosymbolic Search!

This video walks through some example queries to help you get started with Weaviate's GraphQL API. If you have any questions, ideas, or want advice on your personal use case, please leave it in the comments! Stay tuned for a tutorial on using a custom dataset with Weaviate! Quick Start Tutorial: https://www.semi.technology/developers/weaviate/current/getting-started/quick-start.html

How Weaviate's GraphQL API was designed: https://hackernoon.com/how-weaviates-graphql-api-was-designed-t93932tl Chapters

0:00 Neurosymbolic Systems

0:55 Query 1

2:00 Query 2

2:33 Query 3

3:27 Query 4 - Symbolic Filters!

4:16 Query 5

5:09 Query 6 - Neural Search!

6:24 Query 7 - Neurosymbolic Query!

7:43 Query 8

8…

1 месяц, 1 неделя назад @ youtube.com
Introducing the Weaviate Vector Search Engine!
Introducing the Weaviate Vector Search Engine! Introducing the Weaviate Vector Search Engine!

This video presents Weaviate Vector Search Engine! Please leave a comment if you have any ideas or questions! Weaviate Documentation: https://www.semi.technology/developers/weaviate/current/

SeMI Technologies on YouTube: https://www.youtube.com/channel/UCJKT6kJ3IFYybWnL7jbXxhQ/videos Chapters

0:00 Overview of Chapters

5:16 Semantic Similarity

7:55 Symbolic Queries

9:45 Vector + Symbolic Search

10:23 Weaviate Modules

13:44 Demo Dataset and GraphQL

15:12 Integration Project

16:16 Research Ideas

19:18 Connect with us!

1 месяц, 1 неделя назад @ youtube.com
Etienne Dilocker on Vector Search Engines and Weaviate
Etienne Dilocker on Vector Search Engines and Weaviate Etienne Dilocker on Vector Search Engines and Weaviate

Vector Search Engines are powering the next generation of search. Instead of relying on things like BM25 or TF-IDF representations, we use neural representations to compare semantic similarity. Computing dot products to calculate the similarity between all vectors would be very time consuming. Hierarchical Navigable Small World (HNSW) graphs are a cutting-edge new algorithm used by Weaviate to speed up this search. This podcast explores HNSW, neurosymbolic search and filtering, and much more! I hope you find this interesting, and I’m happy to answer any questions left on the YouTube video! Check out this Introduction to Weaviate: very well organized, no headaches getting started!

https://www.semi.technol…

1 месяц, 2 недели назад @ youtube.com
Robust Fine-Tuning of Zero-Shot Models
Robust Fine-Tuning of Zero-Shot Models Robust Fine-Tuning of Zero-Shot Models

Researchers from the University of Washington, Columbia University, Open AI, the Allen Institute of Artificial Intelligence, and Toyota Research have teamed up to present a new method for fine-tuning these pre-trained models such as GPT-3, BERT, DALL-E, EfficientNet, or CLIP for application specific datasets. The key insight is that as you fine-tune these models, you gain in-distribution accuracy, but sacrifice the zero-shot flexibility, or out-of-distribution generalization, of these pre-trained “foundation” models. The authors present Weight-Space Ensembling, where you take a linear interpolation between the weights of the zero-shot and fine-tuned model to make new inference. This achieve…

2 месяца, 1 неделя назад @ youtube.com
Robust fine-tuning of zero-shot models
Robust fine-tuning of zero-shot models Robust fine-tuning of zero-shot models

Researchers from the University of Washington, Columbia University, Open AI, the Allen Institute of Artificial Intelligence, and Toyota Research have teamed up to present a new method for fine-tuning these pre-trained models such as GPT-3, BERT, DALL-E, EfficientNet, or CLIP for application specific datasets. The key insight is that as you fine-tune these models, you gain in-distribution accuracy, but sacrifice the zero-shot flexibility, or out-of-distribution generalization, of these pre-trained “foundation” models. The authors present Weight-Space Ensembling, where you take a linear interpolation between the weights of the zero-shot and fine-tuned model to make new inference. This achieve…

2 месяца, 2 недели назад @ youtube.com
Generalization in Open-Domain Question Answering
Generalization in Open-Domain Question Answering Generalization in Open-Domain Question Answering

AI Weekly Update Notion Page: https://ebony-scissor-725.notion.site/AI-Weekly-Update-September-8th-2021-a2119851b5b74470b4971d064665e77e

New AI Weekly Update GitHub Repo: https://github.com/CShorten/AIWeeklyUpdates I am looking for collaborators for two survey papers on Text-to-Image Generation and Data Augmentation Controllers. If you are interested in helping out and being a co-author of either paper, please send me a quick overview of what you think about the topic to cshorten2015@fau.edu! Content Links in the Video:

Challenges in Generalization in Open Domain Question Answering: https://arxiv.org/pdf/2109.01156.pdf

Hurdles to Progress in Long-Form Question Answering: https://arxiv.org/p…

2 месяца, 3 недели назад @ youtube.com
AI Weekly Update - August 7th, 2021 (#40)
AI Weekly Update - August 7th, 2021 (#40) AI Weekly Update - August 7th, 2021 (#40)

Notion Link: https://ebony-scissor-725.notion.site/AI-Weekly-Update-August-7th-2021-3b21331c5c6a45e1a5638955dda7923c Chapters:

0:00 Introduction

0:06 Open-Ended Learning

2:38 Persistent Reinforcement Learning

3:34 Domain-Matched Pre-training for Retrieval

4:22 Growing Knowledge Culturally

5:06 Language Grounding with 3D Objects

6:03 Pre-train, Prompt, and Predict

6:32 QA Dataset Explosion

7:32 ProtoTransformer

9:32 Don’t sweep your Learning Rate under the Rug

10:34 AAVAE

11:40 Domain-Agnostic Contrastive Learning

12:43 Dataset Distillation

14:04 Pointer Value Retrieval

16:01 Go Wider Instead of Deeper

16:52 Geometric Deep Learning on Molecules

18:37 Simulation Framework for Label Noise

19:5…

3 месяца, 3 недели назад @ youtube.com
Reasoning with Language Models - Turning Tables
Reasoning with Language Models - Turning Tables Reasoning with Language Models - Turning Tables

Notion Link: https://ebony-scissor-725.notion.site/Henry-AI-Labs-Weekly-Update-July-22nd-2021-0c43042b93a3459c901f7f5973b949bf Thanks for watching! Please Subscribe!

4 месяца, 1 неделя назад @ youtube.com
Deduplicating Training Data makes Language Models Better
Deduplicating Training Data makes Language Models Better Deduplicating Training Data makes Language Models Better

Notion Link: https://ebony-scissor-725.notion.site/Henry-AI-Labs-Weekly-Update-July-22nd-2021-0c43042b93a3459c901f7f5973b949bf Follow Katherine Lee on Twitter @katherine1ee

Twitter Thread Link: https://twitter.com/katherine1ee/status/1415496898241339400 Thanks for watching! Please Subscribe!

4 месяца, 1 неделя назад @ youtube.com
Using HTML for Language Modeling
Using HTML for Language Modeling Using HTML for Language Modeling

Notion Link: https://ebony-scissor-725.notion.site/Henry-AI-Labs-Weekly-Update-July-22nd-2021-0c43042b93a3459c901f7f5973b949bf Thanks for watching! Please Subscribe!

4 месяца, 1 неделя назад @ youtube.com
3blue1brown 3blue1brown
последний пост 1 месяц назад
A few of the best math explainers from this summer
A few of the best math explainers from this summer A few of the best math explainers from this summer

Take a look at the full playlist (really): https://www.youtube.com/watch?v=fJWnA4j0_ho&list=PLnQX-jgAF5pTkwtUuVpqS5tuWmJ-6ZM-Z

Blog post with more details: https://3b1b.co/some1-results

Thanks, as always, to the supporters of this channel for helping to make this whole project possible: http://3b1b.co/thanks ------------------ Typo at 2:00, it should read "Burkard Polster" Videos and posts mentioned here. That weird light at the bottom of a mug — ENVELOPES

https://youtu.be/fJWnA4j0_ho Hiding Images in Plain Sight: The Physics Of Magic Windows

mattferraro.dev/posts/caustics-engineering The Beauty of Bézier Curves

https://youtu.be/aVwxzDHniEw What Is The Most Complicated Lock Pattern?

https:/…

1 месяц назад @ youtube.com
Where Newton meets Mandelbrot (Holomorphic dynamics)
Where Newton meets Mandelbrot (Holomorphic dynamics) Where Newton meets Mandelbrot (Holomorphic dynamics)

How the right question about Newton's method results in a Mandelbrot set.

Video on Newton's fractal: https://youtu.be/T_S2j5GaLRQ

Special thanks: https://3b1b.co/lessons/newtons-fractal#thanks Extra special thanks to Sergey Shemyakov, of Aix-Marseille University, for helpful conversations and for introducing me to this phenomenon. ------------------ Introduction to Fatou sets and Julia sets, including a discussion of Montel's theorem and its consequences:

http://www.math.stonybrook.edu/~scott/Papers/India/Fatou-Julia.pdf Numberphile with Ben Sparks on the Mandelbrot set:

https://youtu.be/FFftmWSzgmk Excellent article on Acko.net, from the basics of building up complex numbers to Julia sets.…

1 месяц, 1 неделя назад @ youtube.com
Newton's Fractal (which Newton knew nothing about)
Newton's Fractal (which Newton knew nothing about) Newton's Fractal (which Newton knew nothing about)

Who knew root-finding could be so complicated?

Early view of the next part: https://www.patreon.com/3blue1brown

An equally valuable form of support is to simply share the videos. ------------------ On fractal dimension:

https://youtu.be/gB9n2gHsHN4 Mathologer on the cubic formula:

https://youtu.be/N-KXStupwsc Some of the videos from this year's Summer of Math Exposition are fairly relevant to the topics covered here. Take a look at these ones, The Beauty of Bézier Curves

https://youtu.be/aVwxzDHniEw The insolubility of the quintic:

https://youtu.be/BSHv9Elk1MU The math behind rasterizing fonts:

https://youtu.be/LaYPoMPRSlk --- These animations are largely made using a custom python library,…
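A minimal sketch (not from the video, names are mine) of the idea behind Newton's fractal: run Newton's method on p(z) = z^3 - 1 from a grid of complex starting points and record which root each one converges to.

```python
# Newton's method over the complex plane for p(z) = z^3 - 1,
# coloring each starting point by the root it converges to.
import numpy as np

roots = np.array([1, -0.5 + 0.866025j, -0.5 - 0.866025j])  # cube roots of unity

def newton_basin(z, steps=40):
    """Iterate z <- z - p(z)/p'(z) and return the index of the nearest root."""
    for _ in range(steps):
        z = z - (z**3 - 1) / (3 * z**2)
    return np.argmin(np.abs(z[..., None] - roots), axis=-1)

xs = np.linspace(-2, 2, 400)               # grid of starting points
re, im = np.meshgrid(xs, xs)
basins = newton_basin(re + 1j * im)        # 400x400 array of root indices (0, 1, 2)
print(np.bincount(basins.ravel()))         # how many starting points fall into each basin
```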

1 месяц, 2 недели назад @ youtube.com
Why aren't you making math videos? (Also, now there's a 3b1b podcast)
Why aren't you making math videos?  (Also, now there's a 3b1b podcast) Why aren't you making math videos? (Also, now there's a 3b1b podcast)

Learn more and submit: https://3b1b.co/SoME1

Podcast/New channel: https://youtu.be/C-i4q-Xlnis

↓↓Things referenced through the video↓↓ Join the discord channel:

https://discord.gg/SRTErdZ9 James Schloss:

https://www.youtube.com/user/LeiosOS Free will theorem:

https://www.ams.org/notices/200902/rtx090200226p.pdf Kolmogorov complexity and primes:

https://people.cs.uchicago.edu/~fortnow/papers/kaikoura.pdf Tadashi Tokieda talk:

https://youtu.be/tQQ3oiB32GI Boarbarktree:

https://www.youtube.com/channel/UCFeIEAkqvS4fJMTwUtF4OFw Mathologer:

https://youtu.be/N-KXStupwsc Manim:

https://github.com/3b1b/manim Manim Community edition:

https://github.com/ManimCommunity/manim/ Reanimate:

https://github.…

4 месяца, 2 недели назад @ youtube.com
A quick trick for computing eigenvalues | Essence of linear algebra, chapter 15
A quick trick for computing eigenvalues | Essence of linear algebra, chapter 15 A quick trick for computing eigenvalues | Essence of linear algebra, chapter 15

How to write the eigenvalues of a 2x2 matrix just by looking at it.

Thanks to Tim for the jingle: https://www.youtube.com/acapellascience

Help fund future projects: https://www.patreon.com/3blue1brown​

An equally valuable form of support is to simply share the videos.

Special thanks to these supporters: https://3b1b.co/quick-eigen-thanks Introduction to eigenvectors and eigenvalues:

https://youtu.be/PFDu9oVAE-g Lockdown math lecture talking about the mean product formula:

https://youtu.be/MHXO86wKeDY Timestamps:

0:00 - Background

4:53 - Examples

10:24 - Relation to the characteristic polynomial

12:00 - Last thoughts ------------------ These animations are largely made using a custom python …
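The trick from the video, written out as a formula: the eigenvalues of a 2x2 matrix are the mean of the diagonal plus or minus the square root of (mean squared minus determinant).

```latex
% For A = [[a, b], [c, d]]: m = mean of the eigenvalues = tr(A)/2, p = their product = det(A).
\lambda_{1,2} \;=\; m \pm \sqrt{m^2 - p}
\;=\; \frac{a+d}{2} \pm \sqrt{\left(\frac{a+d}{2}\right)^{2} - (ad - bc)}
```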

6 месяцев, 3 недели назад @ youtube.com
How (and why) to raise e to the power of a matrix | DE6
How (and why) to raise e to the power of a matrix | DE6 How (and why) to raise e to the power of a matrix | DE6

General exponentials, Love, Schrödinger, and more.

Home page: https://www.3blue1brown.com

Brought to you by you: https://3b1b.co/thanks ------------------

The Romeo-Juliet example is based on this essay by Steven Strogatz:

http://www.stevenstrogatz.com/essays/loves-me-loves-me-not-do-the-math The book shown at the start is Vladimir Arnold's (excellent) textbook on ordinary differential equations.

https://amzn.to/3dtXSwj Need a review of ordinary powers of e?

https://youtu.be/m2MIpDrF7Es Or of linear algebra?

https://youtu.be/kYB8IZa5AuE Timetable

0:00 - Definition

6:40 - Dynamics of love

13:17 - General equation

20:03 - On general rotations

22:11 - Visualizing with flow ------------------

C…
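A minimal sketch (not from the video) of the matrix exponential as a power series, illustrated on the rotation generator A = [[0, -1], [1, 0]], so that e^{At} is a rotation by angle t.

```python
# Truncated power series for e^{At}, compared against scipy's dedicated routine.
import math
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])          # generator of rotations in the plane
t = np.pi / 2                        # rotate by 90 degrees

# e^{At} ~= sum_{k=0}^{19} (At)^k / k!
series = sum(np.linalg.matrix_power(A * t, k) / math.factorial(k) for k in range(20))

print(np.round(series, 6))           # ~ [[0, -1], [1, 0]]: the 90-degree rotation matrix
print(np.round(expm(A * t), 6))      # same result from scipy.linalg.expm
```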

8 месяцев назад @ youtube.com
Two Minute Papers Two Minute Papers
последний пост 1 день, 20 часов назад
Man VS Machine: Who Plays Table Tennis Better? 🤖
Man VS Machine: Who Plays Table Tennis Better? 🤖 Man VS Machine: Who Plays Table Tennis Better? 🤖

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Optimal Stroke Learning with Policy Gradient Approach for Robotic Table Tennis" is available here:

https://arxiv.org/abs/2109.03100

https://www.youtube.com/watch?v=SNnqtGLmX4Y ❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: - https://www.patreon.com/TwoMinutePapers

- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Ch…

1 день, 20 часов назад @ youtube.com
Hooray! Virtual Sponges Can Sink Too! 🧽
Hooray! Virtual Sponges Can Sink Too! 🧽 Hooray! Virtual Sponges Can Sink Too! 🧽

❤️ Check out Weights & Biases and say hi in their community forum here: https://wandb.me/paperforum 📝 The paper "Unified particle system for multiple-fluid flow and porous material" is available here:

https://cronfa.swansea.ac.uk/Record/cronfa57521

https://dl.acm.org/doi/10.1145/3450626.3459764 Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/invite/hbcTJu2 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamant…

3 дня, 20 часов назад @ youtube.com
Sculpting Liquids in Virtual Reality! 🍷
Sculpting Liquids in Virtual Reality! 🍷 Sculpting Liquids in Virtual Reality! 🍷

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers 📝 The paper "Interactive Liquid Splash Modeling by User Sketches" is available here:

https://web.cse.ohio-state.edu/~wang.3602/Yan-2020-ILS/Yan-2020-ILS.pdf

https://web.cse.ohio-state.edu/~wang.3602/publications.html 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Mich…

1 неделя, 1 день назад @ youtube.com
New AI: Photos Go In, Reality Comes Out! 🌁
New AI: Photos Go In, Reality Comes Out! 🌁 New AI: Photos Go In, Reality Comes Out! 🌁

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "ADOP: Approximate Differentiable One-Pixel Point Rendering" is available here:

https://arxiv.org/abs/2110.06635 ❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: - https://www.patreon.com/TwoMinutePapers

- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas,…

1 неделя, 5 дней назад @ youtube.com
Enhance! Super Resolution Is Here! 🔍
Enhance! Super Resolution Is Here!  🔍 Enhance! Super Resolution Is Here! 🔍

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Image Super-Resolution via Iterative Refinement " is available here:

https://iterative-refinement.github.io/

https://github.com/Janspiry/Image-Super-Resolution-via-Iterative-Refinement 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Michae…

2 недели, 2 дня назад @ youtube.com
Watch This Statue Grow Out Of Nothing! 🗽
Watch This Statue Grow Out Of Nothing! 🗽 Watch This Statue Grow Out Of Nothing! 🗽

❤️ Check out Cohere and sign up for free today: https://cohere.ai/papers 📝 The paper "Large Steps in Inverse Rendering of Geometry" is available here:

https://rgl.epfl.ch/publications/Nicolet2021Large 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramse…

2 недели, 5 дней назад @ youtube.com
NVIDIA’s Stretchy Simulation: Super Quick! 🐘
NVIDIA’s Stretchy Simulation: Super Quick! 🐘 NVIDIA’s Stretchy Simulation: Super Quick! 🐘

❤️ Train a neural network and track your experiments with Weights & Biases here: http://wandb.me/paperintro 📝 The paper "A Constraint-based Formulation of Stable Neo-Hookean Materials" is available here:

https://mmacklin.com/neohookean.pdf 📝 The paper "Gaussian Material Synthesis" (with the pseudocode) is available here:

https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Da…

3 недели, 2 дня назад @ youtube.com
Can You See These Tiny Cloth Wrinkles? 👕
Can You See These Tiny Cloth Wrinkles? 👕 Can You See These Tiny Cloth Wrinkles? 👕

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "GPU-based Simulation of Cloth Wrinkles at Submillimeter Levels" is available here:

https://web.cse.ohio-state.edu/~wang.3602/publications.html

https://dl.acm.org/doi/abs/10.1145/3450626.3459787 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrech…

3 недели, 5 дней назад @ youtube.com
Finally, Beautiful Elastic Simulations Are Here! 🦖🌵
Finally, Beautiful Elastic Simulations Are Here! 🦖🌵 Finally, Beautiful Elastic Simulations Are Here! 🦖🌵

❤️ Check out the Gradient Dissent podcast by Weights & Biases: http://wandb.me/gd 📝 The paper "Medial IPC: accelerated incremental potential contact with medial elastics" is available here:

https://yangzzzy.github.io/PDF/medial_IPC_SIG21.pdf

https://dl.acm.org/doi/10.1145/3450626.3459753 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael …

4 недели, 1 день назад @ youtube.com
This AI Learned Physics...But How Good Is It? ⚛
This AI Learned Physics...But How Good Is It? ⚛ This AI Learned Physics...But How Good Is It? ⚛

❤️ Check out Perceptilabs and sign up for a free demo here: https://www.perceptilabs.com/papers 📝 The paper "High-order Differentiable Autoencoder for Nonlinear Model Reduction" is available here:

https://arxiv.org/abs/2102.11026 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Mo…

1 месяц назад @ youtube.com
Simulating An 800,000 Metric Ton Glacier! 🤯
Simulating An 800,000 Metric Ton Glacier! 🤯 Simulating An 800,000 Metric Ton Glacier! 🤯

❤️ Check out Perceptilabs and sign up for a free demo here: https://www.perceptilabs.com/papers 📝 The paper "A glacier–ocean interaction model for tsunami genesis due to iceberg calving" is available here:

https://www.nature.com/articles/s43247-021-00179-7 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen …

1 месяц, 1 неделя назад @ youtube.com
NVIDIA’s New Robot Cuts Like a Samurai! 🤺
NVIDIA’s New Robot Cuts Like a Samurai! 🤺 NVIDIA’s New Robot Cuts Like a Samurai! 🤺

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "DiSECt – A Differentiable Simulation Engine for Autonomous Robotic Cutting" is available here:

https://diff-cutting-sim.github.io/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skar…

1 месяц, 1 неделя назад @ youtube.com
This Image Is Fine. Completely Fine. 🤖
This Image Is Fine. Completely Fine. 🤖 This Image Is Fine. Completely Fine. 🤖

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning" is available here:

https://attentionneuron.github.io/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen…

1 месяц, 1 неделя назад @ youtube.com
Finally, Beautiful Virtual Scenes…For Less! ☀️
Finally, Beautiful Virtual Scenes…For Less! ☀️ Finally, Beautiful Virtual Scenes…For Less! ☀️

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers ❤️ Their mentioned post is available here: https://wandb.ai/latentspace/published-work/The-Science-of-Debugging-with-W-B-Reports--Vmlldzo4OTI3Ng 📝 The paper "DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks" is available here:

https://depthoraclenerf.github.io/ Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/invite/hbcTJu2 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Ra…

1 месяц, 2 недели назад @ youtube.com
This New Method Can Simulate a Vast Ocean! 🌊
This New Method Can Simulate a Vast Ocean! 🌊 This New Method Can Simulate a Vast Ocean! 🌊

❤️ Check out Perceptilabs and sign up for a free demo here: https://www.perceptilabs.com/papers 📝 The paper "Ships, Splashes, and Waves on a Vast Ocean" is available here:

http://computationalsciences.org/publications/huang-2021-vast-ocean.html 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moo…

1 месяц, 2 недели назад @ youtube.com
DataFest Video DataFest Video
последний пост 4 месяца назад
Gene Kogan | Machine learning for creativity
Gene Kogan | Machine learning for creativity Gene Kogan | Machine learning for creativity

Data Fest Online 2021 https://fest.ai/2021/

ML Art track https://ods.ai/tracks/ml-art-df2021 Speaker introduces himself in Russian, and then presents the material in English.

4 месяца назад @ youtube.com
Alex Farseev: Under the Boot of Google and Facebook and How to Crack it for better Performance
Alex Farseev: Under the Boot of Google and Facebook and How to Crack it for better Performance Alex Farseev: Under the Boot of Google and Facebook and How to Crack it for better Performance

Data Fest Online 2021 https://fest.ai/2021/

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021 Modern digital advertising platforms leverage machine learning and AI to help advertisers achieve their goals. Because these platforms are managed by humans, their technological potential often remains under-utilised: humans tend to follow stereotypes and rely on “gut feeling” when making decisions. Understanding the underlying principles behind the “Googles and Facebooks of our world” therefore becomes a crucial skill a modern marketer needs in order to stay relevant. In this talk, we will shed light on the complex digital advertising ecosystem and show you techniques, such a…

5 месяцев, 2 недели назад @ youtube.com
Artem Koval: Cloud-Native MLOps Framework
Artem Koval: Cloud-Native MLOps Framework Artem Koval: Cloud-Native MLOps Framework

Data Fest Online 2021 https://fest.ai/2021/

ML REPA track https://ods.ai/tracks/ml-repa-df2021 Presentation: https://yadi.sk/i/a25573AB8IZUyw In this video we will analyse the requirements for modern MLOps and the main trends: Human-Centered AI, Fairness, Explainability, Model Monitoring, Human Augmented AI

5 месяцев, 3 недели назад @ youtube.com
Data Fest Online 2021: IGLU Competition @ NeurIPS 2021
Data Fest Online 2021: IGLU Competition @ NeurIPS 2021 Data Fest Online 2021: IGLU Competition @ NeurIPS 2021

Data Fest Online 2021 https://fest.ai/2021/

RL + Catalyst track https://ods.ai/tracks/catalyst-and-rl-df2021

6 месяцев назад @ youtube.com
Prince Canuma: Catalyst integration with Neptune
Prince Canuma: Catalyst integration with Neptune Prince Canuma: Catalyst integration with Neptune

Data Fest Online 2021 https://fest.ai/2021/

RL + Catalyst track https://ods.ai/tracks/catalyst-and-rl-df2021

6 месяцев назад @ youtube.com
Catalyst integration with Wandb
Catalyst integration with Wandb Catalyst integration with Wandb

Data Fest Online 2021 https://fest.ai/2021/

RL + Catalyst track https://ods.ai/tracks/catalyst-and-rl-df2021

6 месяцев назад @ youtube.com
Bag of tricks for image classification — Artur Kuzin
Bag of tricks for image classification — Artur Kuzin Bag of tricks for image classification — Artur Kuzin

ML Training 2019. Artur Kuzin talks about his participation in the Driven Data competition Hakuna Ma-data: Identify Wildlife on the Serengeti with AI for Earth, where he took second place. In this video, you will find out: - An overview of a training procedure on Imagenet1k from scratch

- Implementation Details of Hacks & Tricks

- Peculiarities of working with JPEG images and of resizing in different frameworks (a small sketch follows below) Presentation - https://gh.mltrainings.ru/presentations/Kuzin_DrivenDataHakuna.pdf
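An illustrative check (not from the talk): "bilinear" resizing is not identical across frameworks, which is exactly the kind of mismatch that matters when training and inference pipelines use different libraries.

```python
# Compare bilinear downscaling in OpenCV and Pillow on the same image.
import numpy as np
import cv2
from PIL import Image

img = np.random.randint(0, 256, size=(512, 512, 3), dtype=np.uint8)

small_cv = cv2.resize(img, (224, 224), interpolation=cv2.INTER_LINEAR)
small_pil = np.array(Image.fromarray(img).resize((224, 224), resample=Image.BILINEAR))

# Nonzero because the two implementations handle sampling/antialiasing differently.
print(np.abs(small_cv.astype(int) - small_pil.astype(int)).mean())
```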

9 месяцев, 2 недели назад @ youtube.com
Segmentation without pain — Yury Bolkonsky, Andrei Dukhounik
Segmentation without pain — Yury Bolkonsky, Andrei Dukhounik Segmentation without pain — Yury Bolkonsky, Andrei Dukhounik

ML Training 2019. Yury Bolkonsky and Andrei Dukhounik talk about their participation in the Kaggle Understanding Clouds from Satellite Images competition, where their team won a silver medal. In this video you will find out:

- Why thresholding is evil, and why you should trust your classification models

- Why you should always use modern best practices

- Why it is not recommended to use postprocessing without local validation Presentation - https://gh.mltrainings.ru/presentations/Bolkonsky_KaggleUnderstandingClouds.pdf

9 месяцев, 2 недели назад @ youtube.com
Use leaks for validation Kaggle ASHRAE Great Energy Predictor III — Yury Bolkonsky
Use leaks for validation Kaggle ASHRAE   Great Energy Predictor III — Yury Bolkonsky Use leaks for validation Kaggle ASHRAE Great Energy Predictor III — Yury Bolkonsky

ML Training 2019. Yury Bolkonsky talks about his participation in Kaggle ASHRAE - Great Energy Predictor III, where his team won a gold medal. In this video you will find out:

- How to create timestamp features

- Do you need to use a leak if it is noisy?

- Leak validation for the best solution

9 месяцев, 3 недели назад @ youtube.com
Time series met AutoML Codalab Automated Time Series Regression — Denis Vorotyntsev
Time series met AutoML Codalab Automated Time Series Regression —  Denis Vorotyntsev Time series met AutoML Codalab Automated Time Series Regression — Denis Vorotyntsev

ML Training 2019. Denis Vorotyntsev won AutoSeries, an AutoML competition on time-series regression. In his presentation, he talks about how the competition was organized, his final solution, and the solutions of other top-placed participants. In this video, you will find out:

- How AutoML competitions differ from typical Kaggle-style contests and why you should try them

- A feature engineering approach for time-series tasks when you have no domain knowledge

- Why the validation split should emulate the train-test split

- Why you should always check the code of the top participants and how small bugs can drop your score Presentation - https://gh.mltrainings.ru/presentations/Vorotyntsev_CodalabAutoML.pdf

9 месяцев, 3 недели назад @ youtube.com
Семинары JetBrains Research Семинары JetBrains Research
последний пост 1 неделя, 1 день назад
How to structure your ideas and make yourself understood
How to structure your ideas and make yourself understood How to structure your ideas and make yourself understood

Today's seminar features a talk on Documenting research projects: How to structure your ideas and make yourself understood. We will discuss how to write technical documentation so that readers actually understand it: the main techniques that simplify putting your thoughts on paper and make the process more effective. We will also briefly go over examples from READMEs and research papers, and then share advice on improving your English. A Q&A session will follow the presentation. Speaker: Elena Tikhomirova.

1 неделя, 1 день назад @ youtube.com
ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction
ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction

Graph neural networks and molecular fingerprints are the most common approaches to predicting the properties of small molecules. However, the growing maturity and availability of natural language processing tools, including transformers, make it possible to apply these methods to small-molecule property prediction by building the ChemBERTa model. In this seminar we will go through the paper "ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction", discuss the methods used, and compare the authors' results with current state-of-the-art models. Speaker: Mikhail Lebedev.
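A minimal sketch of the ChemBERTa idea: treat a SMILES string as text and feed it to a RoBERTa-style model pretrained on molecules. The checkpoint name below is an assumption (a public Hugging Face checkpoint associated with the paper's authors), not something stated in this post.

```python
# Embed a SMILES string with a (assumed) pretrained ChemBERTa checkpoint.
import torch
from transformers import AutoTokenizer, AutoModel

checkpoint = "seyonec/ChemBERTa-zinc-base-v1"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

smiles = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin
inputs = tokenizer(smiles, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state   # (1, seq_len, hidden_dim)

# Mean-pool token embeddings into a molecule embedding for a downstream property head.
molecule_embedding = hidden.mean(dim=1)
print(molecule_embedding.shape)
```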

1 неделя, 3 дня назад @ youtube.com
AI in Software Engineering at Facebook
AI in Software Engineering at Facebook AI in Software Engineering at Facebook

The paper raises the following question: how can artificial intelligence help software developers do their jobs better?

The following routine tasks that a programmer faces every day will be examined in detail:

- natural-language code search

- code recommendations - automatic bug fixing. We will go over not only how each task is implemented, but also how they are integrated into the development environment and, most importantly, how programmers reacted after the integration.

If time permits, we will take a closer look at the natural-language code search algorithm currently under development at JetBrains. Speaker: Olga Lavrichenko.

1 неделя, 4 дня назад @ youtube.com
Self-imitation Learning from Demonstrations
Self-imitation Learning from Demonstrations Self-imitation Learning from Demonstrations

Despite the many breakthroughs achieved with reinforcement learning (RL), solving environments with sparse rewards remains a challenging task that requires non-trivial exploration. Learning from Demonstrations (LfD) tries to solve this problem by steering the agent's exploration toward states demonstrated by an expert. Naturally, the benefits of this approach depend on the quality of the demonstrations, which are rarely optimal. Modern LfD algorithms lack provable robustness to suboptimal demonstrations and introduce extra hyperparameters to control the influence of the demonstrations. To address these issues, we take the self-imitation learning (SIL) RL algorithm as a basis…

1 неделя, 6 дней назад @ youtube.com
Do Transformers Really Perform Bad for Graph Representation?
Do Transformers Really Perform Bad for Graph Representation? Do Transformers Really Perform Bad for Graph Representation?

Recently transformers have dominated NLP and CV tasks. However, they lose to many simple GNN architectures on popular benchmarks. The authors adapted the transformer architecture for efficient processing of molecular graphs, which let them obtain SoTA results on the graph-level task of the recent OGB Large-Scale Challenge, as well as on the PCBA, HIV, and ZINC benchmark datasets. In this seminar we will:

- discuss GNNs and their relation to transformers

- see how to adapt a transformer to graphs efficiently (positional encoding, edge attributes, etc.; a small sketch follows below)

- compare Graphormer's results with those of similar models. Speaker: Dmitry Leonov.
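A simplified sketch (class and variable names are mine) of one of Graphormer's ingredients, centrality encoding: add a learned embedding of each node's degree to its features before feeding the node sequence to a standard transformer encoder.

```python
import torch
import torch.nn as nn

class CentralityEncoding(nn.Module):
    def __init__(self, hidden_dim: int, max_degree: int = 64):
        super().__init__()
        self.degree_embedding = nn.Embedding(max_degree + 1, hidden_dim)

    def forward(self, node_features: torch.Tensor, degrees: torch.Tensor) -> torch.Tensor:
        # node_features: (num_nodes, hidden_dim), degrees: (num_nodes,) integer node degrees
        degrees = degrees.clamp(max=self.degree_embedding.num_embeddings - 1)
        return node_features + self.degree_embedding(degrees)

x = torch.randn(5, 32)                       # 5 nodes with 32-dim features
deg = torch.tensor([1, 3, 2, 2, 4])          # node degrees from the molecular graph
print(CentralityEncoding(32)(x, deg).shape)  # torch.Size([5, 32])
```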

2 недели, 3 дня назад @ youtube.com
Real-Time Speech Enhancement in the Waveform Domain
Real-Time Speech Enhancement in the Waveform Domain Real-Time Speech Enhancement in the Waveform Domain

The authors present an end-to-end model that takes a noisy digital signal as input and returns a clean version of it. The model can run efficiently on a CPU, and placing it in front of an ASR system roughly halves the WER in speech recognition. In the seminar we will discuss the origins of the model architecture, the audio-noising pipeline used for training, and the details of validation. Speaker: Daria Dyatlova.

2 недели, 4 дня назад @ youtube.com
Optimal Auctions through Deep Learning
Optimal Auctions through Deep Learning Optimal Auctions through Deep Learning

Auctions are widely used nowadays in many areas to organize the sale of goods and services. Designing an effective auction mechanism that maximizes revenue and motivates participants to act according to their preferences is therefore an important problem. Nevertheless, after 30-40 years of research this problem remains unsolved even for the case of two or more items. In the seminar we will go through the paper «Optimal Auctions through Deep Learning», in which the authors use deep learning tools to design optimal auctions. They model an auction as a multi-layer neural network and define optimal auction design as a constrained…

2 недели, 4 дня назад @ youtube.com
Machine learning methods in functional genomics
Machine learning methods in functional genomics Machine learning methods in functional genomics

The genome is a complex system of interactions between functional elements at different levels of organization: the DNA sequence itself, motifs, three-dimensional structure, elements of the epigenetic code, and the code of DNA secondary structures. Deep learning methods have made it possible to aggregate information about functional elements across different levels of cellular organization - genomics, epigenomics, proteomics, metabolomics, and other “omics” - in order to predict functional elements for which experiments either have not reached the required quality or do not exist. The talk will cover the deep learning methods developed at the international laboratory of bioinformati…

2 недели, 5 дней назад @ youtube.com
SPARQLing Database Queries from Intermediate Question Decompositions
SPARQLing Database Queries from Intermediate Question Decompositions SPARQLing Database Queries from Intermediate Question Decompositions

In this seminar we will look at the task of translating a natural-language question over a database into an executable query (for example, in SQL). Many researchers propose solving it with neural networks trained on data that contains databases, questions about them, and the corresponding correct queries. However, collecting such annotations is hard, because annotators need to know the query language to write them. I will explain how intermediate representations let us drop the query annotations from training while keeping the model able to generate executable queries. Paper: SPARQLing Database Queries from Intermediate Question Decompositions, EMNLP 2021, https://…

3 недели назад @ youtube.com
Documentation Matters: Human-Centered System to Assist Data Science Code Documentation in Notebooks
Documentation Matters: Human-Centered System to Assist Data Science Code Documentation in Notebooks Documentation Matters: Human-Centered System to Assist Data Science Code Documentation in Notebooks

Today we will go through a paper in which the authors tackle the classic task of documentation generation for a not-so-classic representation of code: Jupyter notebooks. The authors first analyze what makes a well-documented notebook, then propose three different ways to automate documentation, and run a user study. They reach two interesting conclusions: documentation in data science is used differently than in conventional development, and documentation generated jointly by a human and an algorithm turns out to be very productive. Speaker: Sergey Titov.

3 недели, 5 дней назад @ youtube.com
Cognitive phenomena in software engineering
Cognitive phenomena in software engineering Cognitive phenomena in software engineering

In our work we want to put the achievements of the last 30 years of cognitive psychology to use for software development. We hypothesize that many effects established on various cognitive tasks can be transferred to programming tasks and used to make writing code more convenient and efficient. In this seminar we will present our approach and the candidate effects to test, and walk through our pilot study built on this scheme. This summer we ran a study in which we characterized habitual coding mistakes as a result of cognitive rigidity, i.e. a mental set. In the study we tested the effectiveness of the techni…

1 месяц назад @ youtube.com
Diffusion models
Diffusion models Diffusion models

You have probably already heard something about diffusion models and seen provocative paper titles like "Diffusion Models Beat GANs on Image Synthesis". It is all true. Diffusion models are a relatively new class of models that show SoTA results in density estimation and data generation, and yes, on some image generation tasks they outperform GANs. Most likely, though, you have not yet had the chance to learn the details. You can do that at the seminar! On Tuesday we will go through standard diffusion models and look at some later generalizations, and see how to train these models, estimate densities with them, and generate data…
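For reference, the standard DDPM-style forward (noising) process that such seminars build on, in its usual notation:

```latex
% Forward process and its closed form, with \alpha_t = 1 - \beta_t and \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s:
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t I\right),
\qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1-\bar{\alpha}_t) I\right)
```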

1 месяц, 1 неделя назад @ youtube.com
GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields
GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields

Today there are many deep generative models that can produce realistic high-resolution images. But in many applications this is not enough: sometimes you need to change the position and appearance of objects in the image, and the generation process itself must be controllable. The authors of «GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields» propose a new generative model that represents images as 3D scenes and takes their compositional nature into account. The model combines representing objects as generative neural feature fields with subsequent volumetric and neural rendering, which lets it run fas…

1 месяц, 1 неделя назад @ youtube.com
QMIX / Graph Attention Networks with Positional Embeddings
QMIX / Graph Attention Networks with Positional Embeddings QMIX / Graph Attention Networks with Positional Embeddings

00:00:00 QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

00:22:58 Graph Attention Networks with Positional Embeddings QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning In many areas of reinforcement learning it is necessary to train several agents at once to act in a shared environment. In such tasks it is important to understand how to evaluate each agent's contribution to the common goal. The authors of the paper “QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning” proposed the new QMIX algorithm, which can solve a wider class of tasks than its predecessors. …
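A minimal sketch (simplified, names are mine) of QMIX's central idea: per-agent Q-values are combined by a mixing network whose weights are produced by hypernetworks from the global state and forced non-negative, so the total Q is monotonic in each agent's Q.

```python
import torch
import torch.nn as nn

class MonotonicMixer(nn.Module):
    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.n_agents, self.embed_dim = n_agents, embed_dim
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Linear(state_dim, 1)

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        # abs() keeps mixing weights non-negative, so dQ_tot/dQ_agent >= 0.
        w1 = torch.abs(self.hyper_w1(state)).view(-1, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(-1, 1, self.embed_dim)
        hidden = torch.relu(torch.bmm(agent_qs.unsqueeze(1), w1) + b1)   # (batch, 1, embed)
        w2 = torch.abs(self.hyper_w2(state)).view(-1, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(-1, 1, 1)
        return (torch.bmm(hidden, w2) + b2).squeeze(-1).squeeze(-1)      # (batch,)

q_tot = MonotonicMixer(n_agents=3, state_dim=8)(torch.randn(4, 3), torch.randn(4, 8))
print(q_tot.shape)  # torch.Size([4])
```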

1 месяц, 1 неделя назад @ youtube.com
Same data, different conclusions
Same data, different conclusions Same data, different conclusions

The reproducibility of scientific results has become a particularly pressing issue lately. Recent studies show that, besides incomplete data and accidental or deliberate methodological errors (p-hacking, multiple comparisons), research outcomes are also affected by the subjective decisions researchers make along the way. To study this phenomenon in detail, the authors of the paper “Same data, different conclusions: Radical dispersion in empirical results when independent analysts operationalize and test the same hypothesis” asked 19 experts from around the world to test the same two hypotheses on the same dataset and to describe their decision-making process. The results of the independent…

1 месяц, 2 недели назад @ youtube.com
Яндекс. Компьютерные науки Яндекс. Компьютерные науки
последний пост 1 неделя, 5 дней назад
Deep learning. Convolutional neural networks. School of Data Analysis, Yandex
Deep learning. Convolutional neural networks. School of Data Analysis, Yandex Deep learning. Convolutional neural networks. School of Data Analysis, Yandex

The lecture is devoted to convolutional neural networks. It defines convolutional layers and covers their variations (padding, strided variants, depthwise-separable and 1-by-1 convolutions), and discusses backpropagation through convolutional layers (and through the max pooling layer). The second half of the lecture covers popular convolutional architectures for image classification (LeNet, AlexNet, VGGNet, Inception, Resnet, ResNext, MobileNet, EfficientNet).
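A small sketch of one layer variant the lecture mentions, the depthwise-separable convolution: a per-channel spatial convolution followed by a 1x1 convolution.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels: int, out_channels: int, kernel_size: int = 3):
        super().__init__()
        # groups=in_channels makes the first conv operate on each channel independently.
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   padding=kernel_size // 2, groups=in_channels)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 32, 56, 56)
print(DepthwiseSeparableConv(32, 64)(x).shape)  # torch.Size([1, 64, 56, 56])
```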

1 неделя, 5 дней назад @ youtube.com
Deep learning. Deep learning for 3D reconstruction. School of Data Analysis, Yandex
Deep learning. Deep learning for 3D reconstruction. School of Data Analysis, Yandex Deep learning. Deep learning for 3D reconstruction. School of Data Analysis, Yandex

The final lecture of the course is devoted to applying neural networks to 3D reconstruction and 3D scene representation. The first half covers geometry recovery: stereo matching, monocular depth estimation, networks that reconstruct point clouds, and objects in implicit representations. The second half covers systems that synthesize novel views of a scene: systems based on volumetric rendering and implicit representations (NeRF and the like), systems with neurally textured triangulated representations, and point-based representations of geometry. A brief overview of neural rendering packages is given.

1 неделя, 5 дней назад @ youtube.com
Deep learning. Optimization for deep learning. School of Data Analysis, Yandex
Deep learning. Optimization for deep learning. School of Data Analysis, Yandex Deep learning. Optimization for deep learning. School of Data Analysis, Yandex

The lecture covers the main algorithms for optimizing loss functions in deep learning, namely stochastic mini-batch gradient descent and its modifications. Particular attention is paid to gradient descent with momentum. Batch normalization is also discussed.
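For reference, one common formulation of the momentum update the lecture focuses on, with learning rate eta and momentum coefficient beta:

```latex
v_{t+1} = \beta\, v_t + \nabla_{\theta} L(\theta_t),
\qquad
\theta_{t+1} = \theta_t - \eta\, v_{t+1}
```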

1 неделя, 5 дней назад @ youtube.com
Deep learning. Deep latent models. School of Data Analysis, Yandex
Deep learning. Deep latent models. School of Data Analysis, Yandex Deep learning. Deep latent models. School of Data Analysis, Yandex

The lecture describes latent generative models based on neural networks that are trained without supervision. It starts with the Generative Latent Optimization (GLO) model, then covers autoencoders and their applications, including variational autoencoders. The second half of the lecture covers normalizing flows and introduces generative adversarial networks.
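A minimal sketch of the reparameterization trick used in variational autoencoders (one of the models the lecture covers): sample z = mu + sigma * eps with eps ~ N(0, I), so gradients can flow through the sampling step.

```python
import torch

def reparameterize(mu: torch.Tensor, log_var: torch.Tensor) -> torch.Tensor:
    std = torch.exp(0.5 * log_var)   # log-variance -> standard deviation
    eps = torch.randn_like(std)      # noise carries no parameters
    return mu + eps * std            # differentiable w.r.t. mu and log_var

mu = torch.zeros(4, 16, requires_grad=True)
log_var = torch.zeros(4, 16, requires_grad=True)
z = reparameterize(mu, log_var)
z.sum().backward()                   # gradients reach mu and log_var through z
print(mu.grad.shape)                 # torch.Size([4, 16])
```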

1 неделя, 5 дней назад @ youtube.com
Deep learning. Adversarial learning. School of Data Analysis, Yandex
Deep learning. Adversarial learning. School of Data Analysis, Yandex Deep learning. Adversarial learning. School of Data Analysis, Yandex

The lecture is devoted to modern variations of adversarial networks and to training with discriminator loss functions. It covers both advanced latent models (StyleGAN1, StyleGAN2) and the use of discriminator losses in image translation and similar tasks, along with the main patterns used when building generators and the main applications of these techniques, such as neural synthesizers of realistic images and neural avatars.

1 неделя, 5 дней назад @ youtube.com
Deep learning. Image generation and processing with neural networks. School of Data Analysis
Deep learning. Image generation and processing with neural networks. School of Data Analysis Deep learning. Image generation and processing with neural networks. School of Data Analysis

The lecture covers neural networks that process images (using super-resolution as the example) and neural networks that synthesize images. It discusses loss functions (including perceptual and texture losses), the tasks of texture synthesis and image stylization, and the adaptive instance normalization module and its applications.
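For reference, the adaptive instance normalization (AdaIN) operation the lecture discusses: the content features x are re-normalized to match the channel-wise statistics of the style features y.

```latex
\mathrm{AdaIN}(x, y) \;=\; \sigma(y)\,\frac{x - \mu(x)}{\sigma(x)} + \mu(y)
```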

1 неделя, 5 дней назад @ youtube.com
Deep learning. Convolutional neural networks in computer vision. School of Data Analysis, Yandex
Deep learning. Convolutional neural networks in computer vision. School of Data Analysis, Yandex Deep learning. Convolutional neural networks in computer vision. School of Data Analysis, Yandex

The lecture gives an overview of applications of convolutional neural networks in computer vision: semantic and instance segmentation, object detection, and face/person recognition. For each case it discusses the task formulation, the main architectural patterns of the networks that solve it, and popular architectures. It ends with a discussion of recent work on contrastive unsupervised pre-training.

1 неделя, 5 дней назад @ youtube.com
Deep learning. Introduction. School of Data Analysis, Yandex
Deep learning. Introduction. School of Data Analysis, Yandex Deep learning. Introduction. School of Data Analysis, Yandex

The lecture gives an informal definition of deep learning, very briefly reviews the history of deep learning, discusses the backpropagation algorithm and the concept of network layers, and closes with a very brief discussion of biological neural networks.

1 неделя, 5 дней назад @ youtube.com
Deep learning. Representations inside convolutional neural networks. School of Data Analysis, Yandex
Deep learning. Representations inside convolutional neural networks. School of Data Analysis, Yandex Deep learning. Representations inside convolutional neural networks. School of Data Analysis, Yandex

The lecture is devoted to the data representations learned by convolutional neural networks. It covers different ways of visualizing and analyzing such representations, discusses attacks on neural networks and the generation of artificial illusion images for them, and the ability of large convolutional networks to overfit arbitrary datasets. The final part of the lecture covers knowledge transfer with convolutional networks.

1 неделя, 5 дней назад @ youtube.com
ShAD written exam walkthrough. Problem 9. Index of the nearest greater element
ShAD written exam walkthrough. Problem 9. Index of the nearest greater element ShAD written exam walkthrough. Problem 9. Index of the nearest greater element

This year we decided to help those preparing to apply to the School of Data Analysis by sharing solutions to several problems from the written exam variants that demonstrate useful techniques. Each week we will publish here a walkthrough of one of the problems from the 2019 ShAD written exam. Problem statements and written solutions are available at: https://yandexdataschool.ru/stepbystep

7 месяцев, 2 недели назад @ youtube.com
ShAD written exam walkthrough. Problem 8. Edges in a graph
ShAD written exam walkthrough. Problem 8. Edges in a graph ShAD written exam walkthrough. Problem 8. Edges in a graph

This year we decided to help those preparing to apply to the School of Data Analysis by sharing solutions to several problems from the written exam variants that demonstrate useful techniques. Each week we will publish here a walkthrough of one of the problems from the 2019 ShAD written exam. Problem statements and written solutions are available at: https://yandexdataschool.ru/stepbystep

8 месяцев назад @ youtube.com
ShAD written exam walkthrough. Problem 7. An inequality for the derivative
ShAD written exam walkthrough. Problem 7. An inequality for the derivative ShAD written exam walkthrough. Problem 7. An inequality for the derivative

This year we decided to help those preparing to apply to the School of Data Analysis by sharing solutions to several problems from the written exam variants that demonstrate useful techniques. Each week we will publish here a walkthrough of one of the problems from the 2019 ShAD written exam. Problem statements and written solutions are available at: https://yandexdataschool.ru/stepbystep

8 месяцев, 1 неделя назад @ youtube.com
ShAD written exam walkthrough. Problem 6. Dimensions
ShAD written exam walkthrough. Problem 6. Dimensions ShAD written exam walkthrough. Problem 6. Dimensions

This year we decided to help those preparing to apply to the School of Data Analysis by sharing solutions to several problems from the written exam variants that demonstrate useful techniques. Each week we will publish here a walkthrough of one of the problems from the 2019 ShAD written exam. Problem statements and written solutions are available at: https://yandexdataschool.ru/stepbystep

8 месяцев, 2 недели назад @ youtube.com
ShAD written exam walkthrough. Problem 5. Limit and probabilities
ShAD written exam walkthrough. Problem 5. Limit and probabilities ShAD written exam walkthrough. Problem 5. Limit and probabilities

This year we decided to help those preparing to apply to the School of Data Analysis by sharing solutions to several problems from the written exam variants that demonstrate useful techniques. Each week we will publish here a walkthrough of one of the problems from the 2019 ShAD written exam. Problem statements and written solutions are available at: https://yandexdataschool.ru/stepbystep

8 месяцев, 3 недели назад @ youtube.com
ShAD written exam walkthrough. Problem 4. Geometric probability
ShAD written exam walkthrough. Problem 4. Geometric probability ShAD written exam walkthrough. Problem 4. Geometric probability

This year we decided to help those preparing to apply to the School of Data Analysis by sharing solutions to several problems from the written exam variants that demonstrate useful techniques. Each week we will publish here a walkthrough of one of the problems from the 2019 ShAD written exam. Problem statements and written solutions are available at: https://yandexdataschool.ru/stepbystep

8 месяцев, 4 недели назад @ youtube.com
ML Trainings ML Trainings
последний пост 4 недели, 1 день назад
Ivan Maksimov | 13 ways to speed up an A/B test, or “Not by CUPED alone”
Ivan Maksimov | 13 ways to speed up an A/B test, or “Not by CUPED alone” Ivan Maksimov | 13 ways to speed up an A/B test, or “Not by CUPED alone”

ML in Marketing hub: https://ods.ai/hubs/ml-in-marketing

Telegram channel https://t.me/mlinmarketing Speaker: Ivan Maksimov, Data Science Team Lead at Delivery Club. To speed up A/B tests, many analysts reach first for fairly complex statistical techniques (for example, CUPED). However, there are a great many simpler and effective ways to speed up A/B tests. We will discuss 13 of them: from improving the test design process to choosing the statistical test and making the final decision about rolling out the feature. We will also estimate the potential effect-vs-cost trade-off of adopting each method. Covered in the talk:

00:00 start of the video

01:18 the results right away

02:07 talk outline…
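A minimal sketch (illustrative, not from the talk) of CUPED, the variance-reduction baseline the description mentions: adjust the experiment metric Y with a pre-experiment covariate X using theta = cov(X, Y) / var(X).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(100, 20, size=10_000)       # pre-experiment metric (covariate)
y = x + rng.normal(5, 10, size=10_000)     # in-experiment metric, correlated with x

theta = np.cov(x, y)[0, 1] / np.var(x)
y_cuped = y - theta * (x - x.mean())

print(np.var(y), np.var(y_cuped))          # the CUPED-adjusted metric has lower variance
```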

4 недели, 1 день назад @ youtube.com
ODS Data Halloween 2021
ODS Data Halloween 2021 ODS Data Halloween 2021

ODS Course Fest #1 https://datafest.ru/course/ The full conference schedule is available on ODS.AI: https://ods.ai/events/halloween2021 ODS courses: https://ods.ai/tracks/groups/courses Register and get access to all courses and events of the season: https://ods.ai/events/course_fest_1 Join the community: https://ods.ai/ Data Fest & Course Fest social media: https://t.me/datafest

https://vk.com/datafest

1 месяц назад @ youtube.com
Andrey Kuznetsov | A review of RecSys 2021, the main recommender systems conference
Andrey Kuznetsov | A review of RecSys 2021, the main recommender systems conference Andrey Kuznetsov | A review of RecSys 2021, the main recommender systems conference

ML in Marketing hub: https://ods.ai/hubs/ml-in-marketing

Telegram channel https://t.me/mlinmarketing *******ODS HALLOWEEN*******

Registration: https://ods.ai/events/halloween2021 Speaker: Andrey Kuznetsov, Machine Learning Engineer at Odnoklassniki, lecturer at MADE, author of Habr articles. In the talk: I will cover what happened at RecSys this year and which papers were interesting, and go into more detail on some of them:

- predicting user clicks on mobile UI controls, from Google Research, with transformers under the hood

- solving the problems of repeated consumption and dynamic item availability, using Twitch as the example

- a look at the fairness of experiments in SOTA papers when usi…

1 месяц назад @ youtube.com
Evgeny Klyuchnikov | How to run scalable A/B experiments
Evgeny Klyuchnikov | How to run scalable A/B experiments Evgeny Klyuchnikov | How to run scalable A/B experiments

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Evgeny Klyuchnikov, Data Engineer at GetYourGuide

We actively use A/B testing to evaluate new products and ideas; on some days we have up to 70 experiments from different teams running at once. In the talk I will explain how we built such a system technically, where the secret of its scalability lies, what breaks most often, and why not everyone is always happy with it. We will touch on documentation, training, and bad experiment design, and I will show how to break an experiment so that nobody even notices. Covered in the talk:

00:00 start of the video

03:07 what an A/B experiment is

06:07 why we…

1 месяц, 2 недели назад @ youtube.com
Denis Solovyov | Building marketing analytics on Google Cloud
Denis Solovyov | Building marketing analytics on Google Cloud Denis Solovyov | Building marketing analytics on Google Cloud

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Denis Solovyov, Data Analyst at Ciklum. We will look at the advantages of building marketing analytics with cloud technologies, talk about what to pay attention to before building a solution, and go over example analytics architectures for marketing tasks. We will walk through organizing a DWH on top of Google BigQuery and touch on BigQuery ML and the kinds of tasks it can solve. Data Fest social media:

https://t.me/datafest

https://vk.com/datafest

1 месяц, 3 недели назад @ youtube.com
Egor Borisov | Salaries and jobs in Data Science: insights and trends from ODS chat data
Egor Borisov | Salaries and jobs in Data Science: insights and trends from ODS chat data Egor Borisov | Salaries and jobs in Data Science: insights and trends from ODS chat data

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Egor Borisov, Lead DS at DataWise. More from the author:

- Habr article (https://habr.com/ru/company/ods/blog/572264/)

- GitHub link (https://github.com/egorborisov/jobs_article) Based on the Habr article: an overview of the Data Science market using data from ODS chats. Is demand for specialists still growing or has the market already saturated, which technologies are losing and which are gaining popularity, how large are the salary ranges and what do they depend on? The authors of the study will share their approach, results, and thoughts. Useful for anyone who cares about which jobs are in demand today and where and how to grow. Socia…

1 месяц, 3 недели назад @ youtube.com
Aydyn Ubingazhibov | Kaggle SIIM-FISABIO-RSNA COVID-19 Detection
Aydyn Ubingazhibov | Kaggle SIIM-FISABIO-RSNA COVID-19 Detection Aydyn Ubingazhibov | Kaggle SIIM-FISABIO-RSNA COVID-19 Detection

Presentation: https://docs.google.com/presentation/d/1XmM9-WHl6XhIx3VQgvm-qoshCXtl9L4I1A-ykK6R9dg/edit#slide=id.ge9979d1b50_5_0 Join the community: https://ods.ai/ ML Trainings social media

VK: https://vk.com/mltrainings

Telegram: https://t.me/mltrainings 0:00 - Intro

0:25 - Task statement and metric

1:21 - Validation

1:42 - Augmentations

1:57 - Classification pipeline

2:50 - How the classifier was trained

5:20 - Detection

7:51 - What didn't work

8:58 - Top-1 and top-2 solutions

10:09 - Outro

2 месяца назад @ youtube.com
Vladislav Kramarenko | Kaggle CommonLit Readability Prize
Vladislav Kramarenko | Kaggle CommonLit Readability Prize Vladislav Kramarenko | Kaggle CommonLit Readability Prize

A walkthrough of a solution for assessing text complexity and readability 0:00 Intro

0:20 Task statement, metric

1:27 What the data looked like

2:00 Traditional approaches

2:37 Features

3:42 Solution with Spacy

4:58 Solution with Transformers

11:33 Final ensemble

11:50 Alternative loss

12:33 Wrap-up Presentation: https://disk.yandex.ru/i/SJeX2so0bbYZ-A Join the community: https://ods.ai/ ML Trainings social media

VK: https://vk.com/mltrainings

Telegram: https://t.me/mltrainings

2 месяца назад @ youtube.com
Yury Bolkonsky | Kaggle SETI Breakthrough Listen - E.T. Signal Search
Yury Bolkonsky | Kaggle SETI Breakthrough Listen - E.T. Signal Search Yury Bolkonsky | Kaggle SETI Breakthrough Listen - E.T. Signal Search

A walkthrough of a solution for detecting anomalous optical and electromagnetic signals in the dataset 0:00 Intro

1:20 Task statement

2:00 Target metric

2:48 Leaderboard analysis

3:35 Leaks

4:14 The solution. What worked?

7:46 What didn't work?

9:12 Differences from the top-1 solution

13:28 Tricks in other solutions

17:57 Wrap-up Presentation: https://disk.yandex.ru/i/jTwGGLGmIAY7bg Join the community: https://ods.ai/ ML Trainings social media

VK: https://vk.com/mltrainings

Telegram: https://t.me/mltrainings

2 месяца назад @ youtube.com
Inessa Tregubova | Spatial analytics for geomarketing tasks
Inessa Tregubova | Spatial analytics for geomarketing tasks Inessa Tregubova | Spatial analytics for geomarketing tasks

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Inessa Tregubova, Geodata analyst at Yandex Lavka. In the talk I will explain how and why geodata is used in marketing. We will discuss the specifics of collecting and processing spatial data, as well as several ML models that help judge how well a particular location fits business goals, and show which business tasks can be solved with them. 00:00 start of the video

01:09 geomarketing tasks

03:18 data types

04:05 similarity measures between objects

07:31 systems for aggregating data in space

10:14 data representation projections (CRS)

11:11 dataset formats

12:57 …

2 месяца, 1 неделя назад @ youtube.com
Danil Gizdatullin | Recommending similar content
Danil Gizdatullin | Recommending similar content Danil Gizdatullin | Recommending similar content

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Danil Gizdatullin, Data Scientist. Recommender systems operate with the notions of a user and an item. The typical recommendation task is recommending items to users, but recommending items for other items is also an important task. In the talk we will look at several ways to solve it, from collaborative filtering to a ranking model. Talk slides: https://drive.google.com/file/d/1F0mmU0W4mbOPKRpfYr0JaP6I82L7PS8z/view?usp=sharing Data Fest social media:

https://t.me/datafest

https://vk.com/datafest

2 месяца, 2 недели назад @ youtube.com
Andrey Anufriev | Single-voice speech synthesis based on neural networks
Andrey Anufriev | Single-voice speech synthesis based on neural networks Andrey Anufriev | Single-voice speech synthesis based on neural networks

ODS Summer of Code 2021 | Intel & SberCloud track https://ods.ai/tracks/cloudcity2021 Speaker: Andrey Anufriev, Senior Deep Learning Research Engineer, Intel. Join the community: https://ods.ai/events/datafest2021/join Data Fest & ODS Summer of Code social media:

https://t.me/datafest

https://vk.com/datafest

2 months, 3 weeks ago @ youtube.com
ODS Course Fest #1
ODS Course Fest #1 ODS Course Fest #1

ODS Course Fest #1 https://datafest.ru/course/ Register to get access to all courses and events of the season: https://ods.ai/events/course_fest_1 Join the community: https://ods.ai/ Data Fest & Course Fest social media: https://t.me/datafest

https://vk.com/datafest

2 months, 3 weeks ago @ youtube.com
Антон Киселев | Dynamic Pricing in Fuel Retail
Антон Киселев | Dynamic Pricing in Fuel Retail Антон Киселев | Dynamic Pricing in Fuel Retail

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel: https://t.me/mlinmarketing Speaker: Антон Киселев, Marketing Analytics Producer at Playrix

* 10 years in data (retail, gaming, telecom)

* 15 DS projects in production

* 40 DS specialists hired

* 3 DS departments built from scratch. Anton explains how motor fuel pricing was optimized and automated at a Russian fuel company: how they learned to change prices on the fly, how Markov models, RL, and multi-armed bandits helped ship argmax to production, and what came of it all (a toy bandit sketch follows the chapter list below). 00:00 introduction

01:15 dynamic pricing in business

04:00 goals and objectives of its use

05:25 pricing in fuel ret…
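
As a rough illustration of the bandit angle referenced in the description above, here is a toy Thompson-sampling price selector. The candidate prices, "true" conversion rates, and reward numbers are invented for the sketch and have nothing to do with the company's actual system.

```python
# Toy dynamic pricing with Thompson sampling over a discrete set of candidate prices.
# Everything here (prices, "true" conversion rates) is simulated for illustration only.
import numpy as np

rng = np.random.default_rng(0)
prices = np.array([45.0, 47.0, 49.0, 51.0])       # hypothetical candidate prices
true_conversion = np.array([0.9, 0.7, 0.5, 0.3])  # unknown to the agent; used to simulate demand

alpha = np.ones(len(prices))  # Beta posterior parameters for the conversion rate at each price
beta = np.ones(len(prices))

revenue = 0.0
for step in range(5000):
    theta = rng.beta(alpha, beta)          # sample a plausible conversion rate per price
    i = int(np.argmax(prices * theta))     # pick the price with the highest sampled expected revenue
    sold = rng.random() < true_conversion[i]
    alpha[i] += sold                       # update the posterior with the observed outcome
    beta[i] += 1 - sold
    revenue += prices[i] * sold

print(f"simulated revenue: {revenue:.0f}")
print("posterior mean conversion per price:", alpha / (alpha + beta))
```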

3 months ago @ youtube.com
Артём Агафонов | Machine Learning in Geo-Analytics for Business
Артём Агафонов | Machine Learning in Geo-Analytics for Business Артём Агафонов | Machine Learning in Geo-Analytics for Business

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel: https://t.me/mlinmarketing Speaker: Артем Агафонов, Data Scientist at Mail Ru Group. Picking a good location is one of the key strategic tasks for a business, whether it is a construction company or a retailer (stores, restaurants, banks). A location's popularity, its accessibility by foot and by transport, and the presence of competitors nearby largely determine how successful the business will be. The talk goes through examples of how big data helps automate site selection for a new facility. Talk slides: https://drive.google…

3 months ago @ youtube.com
Primer Primer
latest post 3 months ago
Simulating the Evolution of Sacrificing for Family
Simulating the Evolution of Sacrificing for Family Simulating the Evolution of Sacrificing for Family

Try NordVPN free for 30 days at https://nordvpn.com/Primer More than you ever wanted to know about Hamilton's rule:

https://users.ox.ac.uk/~grafen/cv/oseb.pdf For discussion and updates

- Twitter: @primerlearning

- Discord: https://discord.gg/NbruaNW

- Reddit: r/primerlearning Plush blobs and other merch: https://store.dftba.com/collections/primer

Support these videos on Patreon: https://www.patreon.com/primerlearning Made with Unity and Manim

https://github.com/Helpsypoo/PrimerUnity

https://www.manim.community Music by Mathieu Keith. For business inquiries: mathieu.keith@gmail.com Several other inputs into the graphics are from public domain contributions to blendswap.com Made possible by …

3 months ago @ youtube.com
Simulating Green Beard Altruism
Simulating Green Beard Altruism Simulating Green Beard Altruism

Brilliant: http://www.brilliant.org/primer Papers:

- https://www.researchgate.net/publication/41910312_Altruism_Spite_and_Greenbeards

- https://www.reed.edu/biology/professors/srenn/pages/teaching/2007_syllabus/2007_readings/a13_Keller_1998.pdf For discussion and updates

- Discord: https://discord.gg/NbruaNW

- Reddit: r/primerlearning

- Twitter: @primerlearning Sometimes streaming myself working on these monstrosities:

- Twitch: https://www.twitch.tv/primerjustin Made with Unity

https://github.com/Helpsypoo/PrimerUnity Music by Mathieu Keith. For business inquiries: mathieu.keith@gmail.com Several other inputs into the graphics are from public domain contributions to blendswap.com Plush blo…

8 months ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast Lex Fridman AI Podcast
latest post 15 hours ago
#244 – Robert Crews: Afghanistan, Taliban, Bin Laden, and War in the Middle East
#244 – Robert Crews: Afghanistan, Taliban, Bin Laden, and War in the Middle East #244 – Robert Crews: Afghanistan, Taliban, Bin Laden, and War in the Middle East

Robert Crews is a historian at Stanford, specializing in Afghanistan, Russia, Islam, Central Asia, and South Asia.

Please support this podcast by checking out our sponsors:– MUD\WTR: https://mudwtr.com/lex and use code LEX to get 5% off– Ten Thousand: https://www.tenthousand.cc/ and use code LEX to get 15% off– Four Sigmatic: https://foursigmatic.com/lex and use code LexPod to get up to 60% off– Magic Spoon: https://magicspoon.com/lex and use code LEX to get $5 off– Onnit: https://lexfridman.com/onnit to get up to 10% offEPISODE LINKS:Robert’s Twitter: https://twitter.com/RobertCrews22Robert’s Stanford page: https://profiles.stanford.edu/robert-crewsAfghan Modern (book): https://amzn.to/3nY…

15 hours ago @ lexfridman.com
#243 – Kevin Systrom: Instagram
#243 – Kevin Systrom: Instagram #243 – Kevin Systrom: Instagram

Kevin Systrom is the co-founder and former CEO of Instagram.

Please support this podcast by checking out our sponsors:– Theragun: https://therabody.com/lex to get 30 day trial– NI: https://www.ni.com/perspectives– GiveWell: https://www.givewell.org/ and use code LEX to get donation matched up to $1k– Blinkist: https://blinkist.com/lex and use code LEX to get 25% off premium– Fundrise: https://fundrise.com/lexEPISODE LINKS:Kevin’s Instagram: https://www.instagram.com/kevin/Kevin’s Twitter: https://twitter.com/kevinKevin’s LinkedIn: https://www.linkedin.com/in/kevinsystromPODCAST INFO:Podcast website: https://lexfridman.com/podcastApple Podcasts: https://apple.co/2lwqZIrSpotify: https://spoti…

5 days, 15 hours ago @ lexfridman.com
#242 – Ben Askren: Wrestling and MMA
#242 – Ben Askren: Wrestling and MMA #242 – Ben Askren: Wrestling and MMA

Ben Askren is a wrestler and MMA fighter, former Bellator and ONE Championship welterweight champion, a two-time NCAA wrestling champion and four-time finalist.

Please support this podcast by checking out our sponsors:– Notion: https://notion.com/startups to get up to $1000 off team plan– NI: https://www.ni.com/perspectives– Onnit: https://lexfridman.com/onnit to get up to 10% off– Eight Sleep: https://www.eightsleep.com/lex and use code LEX to get special savings– NetSuite: http://netsuite.com/lex to get free product tourEPISODE LINKS:Ben’s Twitter: https://twitter.com/BenaskrenBen’s Instagram: https://www.instagram.com/benaskren/Ben’s Website: https://www.benaskren.com/Ben’s Wrestling Aca…

1 week, 1 day ago @ lexfridman.com
#241 – Boris Sofman: Waymo, Cozmo, Self-Driving Cars, and the Future of Robotics
#241 – Boris Sofman: Waymo, Cozmo, Self-Driving Cars, and the Future of Robotics #241 – Boris Sofman: Waymo, Cozmo, Self-Driving Cars, and the Future of Robotics

Boris Sofman is the Senior Director of Engineering and Head of Trucking at Waymo, formerly the Google Self-Driving Car project.

He was also the CEO and co-founder of Anki, a home robotics company.

Please support this podcast by checking out our sponsors:– LMNT: https://drinkLMNT.com/lex to get free sample pack– Athletic Greens: https://athleticgreens.com/lex and use code LEX to get 1 month of fish oil– ROKA: https://roka.com/ and use code LEX to get 20% off your first order– Indeed: https://indeed.com/lex to get $75 credit– BetterHelp: https://betterhelp.com/lex to get 10% offEPISODE LINKS:Boris’s Twitter: https://twitter.com/bsofmanBoris’s LinkedIn: https://www.linkedin.com/in/bsofmanWaymo…

1 week, 5 days ago @ lexfridman.com
#240 – Neal Stephenson: Sci-Fi, Space, Aliens, AI, VR & the Future of Humanity
#240 – Neal Stephenson: Sci-Fi, Space, Aliens, AI, VR & the Future of Humanity #240 – Neal Stephenson: Sci-Fi, Space, Aliens, AI, VR & the Future of Humanity

Neal Stephenson is a sci-fi writer (Snow Crash, Cryptonomicon, and new book Termination Shock), former Chief Futurist at Magic Leap and first employee of Blue Origin.

Please support this podcast by checking out our sponsors:– Mizzen+Main: https://mizzenandmain.com and use code LEX to get $35 off– InsideTracker: https://insidetracker.com/lex and use code Lex25 to get 25% off– Athletic Greens: https://athleticgreens.com/lex and use code LEX to get 1 month of fish oil– Grammarly: https://grammarly.com/lex to get 20% off premium– ExpressVPN: https://expressvpn.com/lexpod and use code LexPod to get 3 months freeEPISODE LINKS:Neal’s Twitter: https://twitter.com/nealstephensonNeal’s Website: https…

2 weeks, 3 days ago @ lexfridman.com
#239 – Niall Ferguson: History of Money, Power, War, and Truth
#239 – Niall Ferguson: History of Money, Power, War, and Truth #239 – Niall Ferguson: History of Money, Power, War, and Truth

Niall Ferguson is a historian at Hoover Institution, Stanford University.

He is the author of 16 books on the history of money, war, power, and catastrophe.

Please support this podcast by checking out our sponsors:– The Prisoner Wine Company: https://theprisonerwine.com/lex to get 20% off & free shipping– Stripe: https://stripe.com– Coinbase: https://coinbase.com/lex to get $5 in free Bitcoin– Four Sigmatic: https://foursigmatic.com/lex and use code LexPod to get up to 60% off– Indeed: https://indeed.com/lex to get $75 creditEPISODE LINKS:Niall’s Twitter: https://twitter.com/nfergusNiall’s Website: https://www.niallferguson.comUniversity of Austin: https://uaustin.org/Doom: The Politics of …

2 weeks, 6 days ago @ lexfridman.com
#238 – Francis Collins: National Institutes of Health (NIH)
#238 – Francis Collins: National Institutes of Health (NIH) #238 – Francis Collins: National Institutes of Health (NIH)

Francis Collins is the director of the NIH and former head of the Human Genome Project.

Please support this podcast by checking out our sponsors:– Ten Thousand: https://www.tenthousand.cc/ and use code LEX to get 15% off– Allform: https://allform.com/lex to get 20% off– SimpliSafe: https://simplisafe.com/lex and use code LEX to get a free security camera– NI: https://www.ni.com/perspectives– Magic Spoon: https://magicspoon.com/lex and use code LEX to get $5 offEPISODE LINKS:Francis’s Twitter: https://twitter.com/NIHDirectorNIH: https://www.nih.govHuman Genome Project: https://www.genome.gov/human-genome-projectPODCAST INFO:Podcast website: https://lexfridman.com/podcastApple Podcasts: https…

3 weeks, 2 days ago @ lexfridman.com
#237 – Steve Viscelli: Trucking and the Decline of the American Dream
#237 – Steve Viscelli: Trucking and the Decline of the American Dream #237 – Steve Viscelli: Trucking and the Decline of the American Dream

Steve Viscelli is a former truck driver and now an economic sociologist at University of Pennsylvania studying freight transportation, including autonomous trucks.

Please support this podcast by checking out our sponsors:– Shopify: https://shopify.com/lex to get 14-day free trial– ROKA: https://roka.com/ and use code LEX to get 20% off your first order– Sunbasket: https://sunbasket.com/lex and use code LEX to get $35 off– Blinkist: https://blinkist.com/lex and use code LEX to get 25% off premium– BetterHelp: https://betterhelp.com/lex to get 10% offEPISODE LINKS:Steve’s Website: https://www.steveviscelli.com/Big Rig (book): https://amzn.to/3EbaofPWill Robotic Trucks Be “Sweatshops on Wheels…

3 weeks, 4 days ago @ lexfridman.com
#236 – Jimmy Pedro: Judo and the Forging of Champions
#236 – Jimmy Pedro: Judo and the Forging of Champions #236 – Jimmy Pedro: Judo and the Forging of Champions

Jimmy Pedro is a judo competitor and coach, world champion, 3x world medalist, 2x Olympic medalist.

Please support this podcast by checking out our sponsors:– ROKA: https://roka.com/ and use code LEX to get 20% off your first order– Athletic Greens: https://athleticgreens.com/lex and use code LEX to get 1 month of fish oil– Ladder: https://ladderlife.com/lex– Linode: https://linode.com/lex to get $100 free credit– MasterClass: https://masterclass.com/lex to get 15% offEPISODE LINKS:Jimmy’s Instagram: https://www.instagram.com/jimmypedrousaJimmy’s Links: https://linktr.ee/jimmypedroJimmy’s Twitter: https://twitter.com/jimmypedrousaJimmy’s Website: https://www.jimmypedro.com/American Judo Sys…

4 weeks ago @ lexfridman.com
#235 – Michael Mina: Rapid COVID Testing
#235 – Michael Mina: Rapid COVID Testing #235 – Michael Mina: Rapid COVID Testing

Michael Mina is an immunologist, epidemiologist, and physician at Harvard.

Please support this podcast by checking out our sponsors:– Notion: https://notion.com/startups to get up to $1000 off team plan– Quip: https://getquip.com/lex to get first refill free– Theragun: https://therabody.com/lex to get 30 day trial– Athletic Greens: https://athleticgreens.com/lex and use code LEX to get 1 month of fish oil– Four Sigmatic: https://foursigmatic.com/lex and use code LexPod to get up to 60% offEPISODE LINKS:Michael’s Twitter: https://twitter.com/michaelmina_labPODCAST INFO:Podcast website: https://lexfridman.com/podcastApple Podcasts: https://apple.co/2lwqZIrSpotify: https://spoti.fi/2nEwCF8RSS:…

1 month ago @ lexfridman.com
#234 – Stephen Wolfram: Complexity and the Fabric of Reality
#234 – Stephen Wolfram: Complexity and the Fabric of Reality #234 – Stephen Wolfram: Complexity and the Fabric of Reality

Stephen Wolfram is a computer scientist, mathematician, and theoretical physicist.

Please support this podcast by checking out our sponsors:– ROKA: https://roka.com/ and use code LEX to get 20% off your first order– FightCamp: https://joinfightcamp.com/lex to get free shipping– Onnit: https://lexfridman.com/onnit to get up to 10% off– Indeed: https://indeed.com/lex to get $75 credit– Fundrise: https://fundrise.com/lexEPISODE LINKS:Stephen’s Twitter: https://twitter.com/stephen_wolframStephen’s Blog: https://writings.stephenwolfram.comWolfram Physics Project: https://www.wolframphysics.orgA New Kind of Science (book): https://amzn.to/30XoEunFundamental Theory of Physics (book): https://amzn.…

1 month ago @ lexfridman.com
#233 – Carl Hart: Heroin, Cocaine, MDMA, Alcohol & the Role of Drugs in Society
#233 – Carl Hart: Heroin, Cocaine, MDMA, Alcohol & the Role of Drugs in Society #233 – Carl Hart: Heroin, Cocaine, MDMA, Alcohol & the Role of Drugs in Society

Carl Hart is a psychologist at Columbia University.

Please support this podcast by checking out our sponsors:– InsideTracker: https://insidetracker.com/lex and use code Lex25 to get 25% off– Ten Thousand: https://www.tenthousand.cc/ and use code LEX to get 15% off– Four Sigmatic: https://foursigmatic.com/lex and use code LexPod to get up to 60% off– ExpressVPN: https://expressvpn.com/lexpod and use code LexPod to get 3 months freeEPISODE LINKS:Carl’s Twitter: https://twitter.com/drcarlhartCarl’s Website: https://drcarlhart.comDrug Use for Grown-Ups (book): https://amzn.to/3lVpq2YPODCAST INFO:Podcast website: https://lexfridman.com/podcastApple Podcasts: https://apple.co/2lwqZIrSpotify: http…

1 month ago @ lexfridman.com
#232 – Brian Greene: Quantum Gravity, Big Bang, Aliens, Life, Death, and Meaning
#232 – Brian Greene: Quantum Gravity, Big Bang, Aliens, Life, Death, and Meaning #232 – Brian Greene: Quantum Gravity, Big Bang, Aliens, Life, Death, and Meaning

Brian Greene is a theoretical physicist.

Please support this podcast by checking out our sponsors:– The Prisoner Wine Company: https://theprisonerwine.com/lex to get 20% off & free shipping– Blinkist: https://blinkist.com/lex and use code LEX to get 25% off premium– LMNT: https://drinkLMNT.com/lex to get free sample pack– BetterHelp: https://betterhelp.com/lex to get 10% off– NI: https://www.ni.com/perspectivesEPISODE LINKS:Brian’s Twitter: https://twitter.com/bgreeneBrian’s Website: http://www.briangreene.org/Until the End of Time (book): https://amzn.to/2XuqXUiPODCAST INFO:Podcast website: https://lexfridman.com/podcastApple Podcasts: https://apple.co/2lwqZIrSpotify: https://spoti.fi/2nEw…

1 month, 1 week ago @ lexfridman.com
#231 – Alex Gladstein: Bitcoin, Authoritarianism, and Human Rights
#231 – Alex Gladstein: Bitcoin, Authoritarianism, and Human Rights #231 – Alex Gladstein: Bitcoin, Authoritarianism, and Human Rights

Alex Gladstein is the Chief Strategy Officer at the Human Rights Foundation and the Oslo Freedom Forum.

Please support this podcast by checking out our sponsors:– Stripe: https://stripe.com– Paperspace: https://gradient.run/lex to get $15 credit– Codecademy: https://codecademy.com and use code LEX to get 15% off– NI: https://www.ni.com/perspectives– Eight Sleep: https://www.eightsleep.com/lex and use code LEX to get special savingsEPISODE LINKS:Alex’s Twitter: https://twitter.com/gladsteinThe Little Bitcoin Book: https://littlebitcoinbook.com/Human Rights Foundation: https://hrf.org/PODCAST INFO:Podcast website: https://lexfridman.com/podcastApple Podcasts: https://apple.co/2lwqZIrSpotify: …

1 month, 1 week ago @ lexfridman.com
#230 – Kelsi Sheren: War, Artillery, PTSD, and Love
#230 – Kelsi Sheren: War, Artillery, PTSD, and Love #230 – Kelsi Sheren: War, Artillery, PTSD, and Love

Kelsi Sheren is a veteran, artillery gunner, and founder of Brass and Unity.

Please support this podcast by checking out our sponsors:– Shopify: https://shopify.com/lex to get 14-day free trial– Justworks: https://justworks.com– Novo: https://banknovo.com/lex– Indeed: https://indeed.com/lex to get $75 credit– Onnit: https://lexfridman.com/onnit to get up to 10% offEPISODE LINKS:Kelsi’s Twitter: https://twitter.com/kelsiburnsKelsi’s Website: https://brassandunity.comKelsi’s Instagram: https://www.instagram.com/kelsie_sherenCharity: https://www.heroicheartsproject.org/Do the F*cking Work (book): https://amzn.to/3mGhguFNotes from Underground (book): https://amzn.to/3mN5TRFBrothers in Arms (son…

1 month, 2 weeks ago @ lexfridman.com
Microsoft Research Podcast Microsoft Research Podcast
latest post 3 months, 2 weeks ago
132 - New Future of Work: How remote and hybrid work will shape workplaces and society with Jaime Teevan and Sid Suri
132 - New Future of Work: How remote and hybrid work will shape workplaces and society with Jaime Teevan and Sid Suri 132 - New Future of Work: How remote and hybrid work will shape workplaces and society with Jaime Teevan and Sid Suri

For Microsoft researchers, COVID-19 was a call to action.

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

In this episode of The New Future of Work series, Chief Scientist Jaime Teevan and Senior Principal Researcher Siddharth Suri explore the many ways people were impacted by work shifts during the COVID-19 pandemic.

The research that Siddharth Suri describes in this podcast was jointly done with Hana Wolf of Li…

3 months, 2 weeks ago @ blubrry.com
131 - New Future of Work: Redefining workspaces as hybrid and remote work become more prevalent with Jaime Teevan and Ginger Hudson
131 - New Future of Work: Redefining workspaces as hybrid and remote work become more prevalent with Jaime Teevan and Ginger Hudson 131 - New Future of Work: Redefining workspaces as hybrid and remote work become more prevalent with Jaime Teevan and Ginger Hudson

For Microsoft researchers, COVID-19 was a call to action.

The reimagining of work practices had long been an area of study, but existing and new questions that needed immediate answers surfaced as companies and their employees quickly adjusted to significantly different working conditions.

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

They also talk about what an “anatomy of hybrid work” might look like and som…

3 months, 3 weeks ago @ blubrry.com
130 - New Future of Work: Managing IT and security in remote scenarios with Jaime Teevan and Matt Brodsky
130 - New Future of Work: Managing IT and security in remote scenarios with Jaime Teevan and Matt Brodsky 130 - New Future of Work: Managing IT and security in remote scenarios with Jaime Teevan and Matt Brodsky

For Microsoft researchers, COVID-19 was a call to action.

The reimagining of work practices had long been an area of study, but existing and new questions that needed immediate answers surfaced as companies and their employees quickly adjusted to significantly different working conditions.

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

They also explore why remote work came with a spike in phishing threats, what…

4 months ago @ blubrry.com
129 - Machine learning, molecular simulation, and the opportunity for societal good with Chris Bishop and Max Welling
129 - Machine learning, molecular simulation, and the opportunity for societal good with Chris Bishop and Max Welling 129 - Machine learning, molecular simulation, and the opportunity for societal good with Chris Bishop and Max Welling

Unlocking the challenge of molecular simulation has the potential to yield significant breakthroughs in how we tackle such societal issues as climate change, drug discovery, and the treatment of disease, and Microsoft is ramping up its efforts in the space.

In this episode, Chris Bishop, Lab Director of Microsoft Research Cambridge, welcomes renowned machine learning researcher Max Welling to the Microsoft Research team as head of the new Amsterdam lab.

Connecting over their shared physics background and vision for molecular simulation, Bishop and Welling explore several fascinating topics, including a future in which machine learning and quantum computing will be used in tandem to model mo…

4 months, 1 week ago @ blubrry.com
128 - New Future of Work: How developer collaboration and productivity are changing in a hybrid work model
128 - New Future of Work: How developer collaboration and productivity are changing in a hybrid work model 128 - New Future of Work: How developer collaboration and productivity are changing in a hybrid work model

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

In this episode of The New Future of Work series, Chief Scientist Jaime Teevan and Principal Productivity Engineer Brian Houck discuss what the massive shift to remote work meant for developers—both employees of Microsoft and customers using Microsoft developer platforms to support their work.

They’ll talk about how taking a holistic approach to developer productivi…

4 months, 2 weeks ago @ blubrry.com
127 - New Future of Work: Staying productive and happy when our office is our home with Jaime Teevan and Sonia Jaffe
127 - New Future of Work: Staying productive and happy when our office is our home with Jaime Teevan and Sonia Jaffe 127 - New Future of Work: Staying productive and happy when our office is our home with Jaime Teevan and Sonia Jaffe

For Microsoft researchers, COVID-19 was a call to action.

The reimagining of work practices had long been an area of study, but existing and new questions that needed immediate answers surfaced as companies and their employees quickly adjusted to significantly different working conditions.

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

They also explore how people already working from home helped them better und…

4 months, 3 weeks ago @ blubrry.com
126 - New Future of Work: Meeting and collaborating in a remote and hybrid world with Jaime Teevan and Abigail Sellen
126 - New Future of Work: Meeting and collaborating in a remote and hybrid world with Jaime Teevan and Abigail Sellen 126 - New Future of Work: Meeting and collaborating in a remote and hybrid world with Jaime Teevan and Abigail Sellen

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

In this episode of The New Future of Work series of the podcast, Chief Scientist Jaime Teevan and Abigail Sellen, Deputy Lab Director at Microsoft Research Cambridge in the United Kingdom, explore the dynamics of meetings and collaborations in the context of remote work.

They specifically address the difference between weak and strong ties in our professional networ…

5 months ago @ blubrry.com
125 - New Future of Work: Driving innovation via cross-company research with Jaime Teevan and Brent Hecht
125 - New Future of Work: Driving innovation via cross-company research with Jaime Teevan and Brent Hecht 125 - New Future of Work: Driving innovation via cross-company research with Jaime Teevan and Brent Hecht

For Microsoft researchers, COVID-19 was a call to action.

The reimagining of work practices had long been an area of study, but existing and new questions that needed immediate answers surfaced as companies and their employees quickly adjusted to significantly different working conditions.

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

They’ll discuss the role of research during times of disruption, the widening…

5 months, 1 week ago @ blubrry.com
124 - Econ4: Uncovering how decision-making shapes individuals and society through behavioral public economics featuring Evan Rose and Hunt Allcott
124 - Econ4: Uncovering how decision-making shapes individuals and society through behavioral public economics featuring Evan Rose and Hunt Allcott 124 - Econ4: Uncovering how decision-making shapes individuals and society through behavioral public economics featuring Evan Rose and Hunt Allcott

In the world of economics, researchers at Microsoft are examining a range of complex systems—from those that impact the technologies we use to those that inform the laws and policies we create—through the lens of a social science that goes beyond the numbers to better understand people and society.

In this episode, Senior Principal Researcher Hunt Allcott talks with Postdoctoral Researcher Evan Rose about Allcott’s work exploring the everyday decisions people face, like buying fuel-efficient cars or taking out payday loans, and how a clearer understanding of these decisions can shape meaningful public policy.

Allcott shares how his and others’ research shows that policy can often have compl…

5 months, 2 weeks ago @ blubrry.com
123 - Econ3: Understanding the media ecosystem and how it informs public opinion in the internet age featuring Hunt Allcott and David Rothschild
123 - Econ3: Understanding the media ecosystem and how it informs public opinion in the internet age featuring Hunt Allcott and David Rothschild 123 - Econ3: Understanding the media ecosystem and how it informs public opinion in the internet age featuring Hunt Allcott and David Rothschild

Interviewed by Senior Principal Researcher Hunt Allcott, Economist David Rothschild discusses how the news media has evolved alongside social media and the internet, from story development to distribution of news via aggregators and wire services.

Rothschild illuminates how and where people are consuming news and shares some of the strategies he’s seeing news outlets use to appeal to their audiences.

He also covers research insights into media bias, misinformation, and how this knowledge could inform the future of news for the better.

In addition, the researchers talk about Rothschild’s work with Project Ratio, which looks at how the news ecosystem impacts public opinion and political polar…

5 months, 3 weeks ago @ blubrry.com
122 - Econ2: Causal machine learning, data interpretability, and online platform markets featuring Hunt Allcott and Greg Lewis
122 - Econ2: Causal machine learning, data interpretability, and online platform markets featuring Hunt Allcott and Greg Lewis 122 - Econ2: Causal machine learning, data interpretability, and online platform markets featuring Hunt Allcott and Greg Lewis

In the world of economics, researchers at Microsoft are examining a range of complex systems—from those that impact the technologies we use to those that inform the laws and policies we create—through the lens of a social science that goes beyond the numbers to better understand people and society.

In this episode, Senior Principal Researcher Dr. Hunt Allcott speaks with Microsoft Research New England office mate and Senior Principal Researcher Dr. Greg Lewis.

Together, they cover the connection between causal machine learning and economics research, the motivations of buyers and sellers on e-commerce platforms, and how ad targeting and data practices could evolve to foster a more symbiotic…

5 months, 4 weeks ago @ blubrry.com
121 - Econ1: Using microeconomics to solve mass incarceration featuring Hunt Allcott and Evan Rose
121 - Econ1: Using microeconomics to solve mass incarceration featuring Hunt Allcott and Evan Rose 121 - Econ1: Using microeconomics to solve mass incarceration featuring Hunt Allcott and Evan Rose

In the world of economics, researchers at Microsoft are examining a range of complex systems—from those that impact the technologies we use to those that inform the laws and policies we create—through the lens of a social science that goes beyond the numbers to better understand people and society.

In this episode, Dr. Hunt Allcott, Senior Principal Researcher at Microsoft Research New England, talks with Dr. Evan Rose, Postdoctoral Researcher, whom Allcott describes as “one of the most engaging and talented researchers in applied microeconomics today.” They’ll discuss how Rose’s experience teaching adult learners at San Quentin State Prison has resonated throughout his research, and they’l…

6 months, 2 weeks ago @ blubrry.com
120 - Advancing Excel as a programming language with Andy Gordon and Simon Peyton Jones
120 - Advancing Excel as a programming language with Andy Gordon and Simon Peyton Jones 120 - Advancing Excel as a programming language with Andy Gordon and Simon Peyton Jones

Today, people around the globe—from teachers to small-business owners to finance executives—use Microsoft Excel to make sense of the information that occupies their respective worlds, and whether they realize it or not, in doing so, they’re taking on the role of programmer.

In this episode, Senior Principal Research Manager Andy Gordon, who leads the Calc Intelligence team at Microsoft Research, and Senior Principal Researcher Simon Peyton Jones provide an inside account of the journey Excel has taken as a programming language, including the expansion of data types that has unlocked greater functionality and the release of the LAMBDA function, which makes the Excel formula language Turing-c…

6 months, 4 weeks ago @ blubrry.com
NLP Highlights NLP Highlights
latest post 1 month, 1 week ago
134 - PhD Application Series: PhDs in Europe versus the US
134 - PhD Application Series: PhDs in Europe versus the US 134 - PhD Application Series: PhDs in Europe versus the US

This episode is the second in our current series on PhD applications.

How do PhD programs in Europe differ from PhD programs in the US, and how should people decide between them?

In this episode, we invite Barbara Plank…

1 month, 1 week ago @ soundcloud.com
133 - PhD Application Series: Preparing Application Materials, with Nathan Schneider and Roma Patel
133 - PhD Application Series: Preparing Application Materials, with Nathan Schneider and Roma Patel 133 - PhD Application Series: Preparing Application Materials, with Nathan Schneider and Roma Patel

This episode is the first in our current series on PhD applications.

How should people prepare their applications to PhD programs in NLP?

In this episode, we invite Nathan Schneider (Professor of Linguistics and Compute…

1 month, 3 weeks ago @ soundcloud.com
132 - Alexa Prize Socialbot Grand Challenge and Alquist 4.0, with Petr Marek
132 - Alexa Prize Socialbot Grand Challenge and Alquist 4.0, with Petr Marek 132 - Alexa Prize Socialbot Grand Challenge and Alquist 4.0, with Petr Marek

In this episode, we discussed the Alexa Prize Socialbot Grand Challenge and this year's winning submission, Alquist 4.0, with Petr Marek, a member of the winning team.

Petr gave us an overview of their submission, the de…

2 months ago @ soundcloud.com
131 - Opportunities and Barriers between HCI and NLP, with Nanna Inie and Leon Derczynski
131 - Opportunities and Barriers between HCI and NLP, with Nanna Inie and Leon Derczynski 131 - Opportunities and Barriers between HCI and NLP, with Nanna Inie and Leon Derczynski

What can NLP researchers learn from Human Computer Interaction (HCI) research?

We chatted with Nanna Inie and Leon Derczynski to find out.

We discussed HCI's research processes including methods of inquiry, the data anno…

3 months, 1 week ago @ soundcloud.com
130 - Linking human cognitive patterns to NLP Models, with Lisa Beinborn
130 - Linking human cognitive patterns to NLP Models, with Lisa Beinborn 130 - Linking human cognitive patterns to NLP Models, with Lisa Beinborn

In this episode, we talk with Lisa Beinborn, an assistant professor at Vrije Universiteit Amsterdam, about how to use human cognitive signals to improve and analyze NLP models.

We start by discussing different kinds of c…

3 months, 3 weeks ago @ soundcloud.com
129 - Transformers and Hierarchical Structure, with Shunyu Yao
129 - Transformers and Hierarchical Structure, with Shunyu Yao 129 - Transformers and Hierarchical Structure, with Shunyu Yao

In this episode, we talk to Shunyu Yao about recent insights into how transformers can represent hierarchical structure in language.

Bounded-depth hierarchical structure is thought to be a key feature of natural language…

4 months, 4 weeks ago @ soundcloud.com
128 - Dynamic Benchmarking, with Douwe Kiela
128 - Dynamic Benchmarking, with Douwe Kiela 128 - Dynamic Benchmarking, with Douwe Kiela

We discussed adversarial dataset construction and dynamic benchmarking in this episode with Douwe Kiela, a research scientist at Facebook AI Research who has been working on a dynamic benchmarking platform called Dynaben…

5 months, 1 week ago @ soundcloud.com
127 - Masakhane and Participatory Research for African Languages, with Tosin Adewumi and Perez Ogayo
127 - Masakhane and Participatory Research for African Languages, with Tosin Adewumi and Perez Ogayo 127 - Masakhane and Participatory Research for African Languages, with Tosin Adewumi and Perez Ogayo

We invited members of Masakhane, Tosin Adewumi and Perez Ogayo, to talk about their EMNLP Findings paper that discusses why typical research is limited for low-resourced NLP and how participatory research can help.

5 months, 3 weeks ago @ soundcloud.com
126 - Optimizing Continuous Prompts for Generation, with Lisa Li
126 - Optimizing Continuous Prompts for Generation, with Lisa Li 126 - Optimizing Continuous Prompts for Generation, with Lisa Li

We invited Lisa Li to talk about her recent work, Prefix-Tuning: Optimizing Continuous Prompts for Generation.

Prefix tuning is a lightweight alternative to finetuning, and the idea is to tune only a fixed-length task-sp…

6 months, 1 week ago @ soundcloud.com
125 - VQA for Real Users, with Danna Gurari
125 - VQA for Real Users, with Danna Gurari 125 - VQA for Real Users, with Danna Gurari

How can we build Visual Question Answering systems for real users?

For this episode, we chatted with Danna Gurari, about her work in building datasets and models towards VQA for people who are blind.

We talked about the …

6 months, 4 weeks ago @ soundcloud.com
124 - Semantic Machines and Task-Oriented Dialog, with Jayant Krishnamurthy and Hao Fang
124 - Semantic Machines and Task-Oriented Dialog, with Jayant Krishnamurthy and Hao Fang 124 - Semantic Machines and Task-Oriented Dialog, with Jayant Krishnamurthy and Hao Fang

7 months, 2 weeks ago @ soundcloud.com
123 - Robust NLP, with Robin Jia
123 - Robust NLP, with Robin Jia 123 - Robust NLP, with Robin Jia

7 months, 3 weeks ago @ soundcloud.com
Data Skeptic Data Skeptic
latest post 2 days, 20 hours ago
Black Friday
Black Friday Black Friday

The retail holiday "Black Friday" falls on the day after Thanksgiving in the United States. It's dubbed this because many retail companies reportedly spend the first 10 months of the year running at a loss (in the red) and only move into the black over the last two months, when they can earn as much as 80% of their annual revenue. This episode features four interviews with guests bringing unique data-driven perspectives on analyzing this seeming outlier in a time series dataset.
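
As a tiny illustration of why Black Friday shows up as an outlier in a daily-sales series, here is a hedged sketch that flags days far from their day-of-week norm. The sales numbers are synthetic and invented for the example.

```python
# Illustrative only: flag days whose sales deviate strongly from their day-of-week average.
# The synthetic series below plants a single "Black Friday"-style spike.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.date_range("2021-01-01", "2021-12-31", freq="D")
sales = pd.Series(rng.normal(100, 10, len(dates)), index=dates)
sales["2021-11-26"] = 400  # planted outlier (the day after US Thanksgiving 2021)

dow = sales.index.dayofweek
z = (sales - sales.groupby(dow).transform("mean")) / sales.groupby(dow).transform("std")
print(z[z.abs() > 4])  # the planted spike should be the only day that clears this threshold
```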

2 days, 20 hours ago @ dataskeptic.com
Aligning Time Series on Incomparable Spaces
Aligning Time Series on Incomparable Spaces Aligning Time Series on Incomparable Spaces

Alex Terenin, Postdoctoral Research Associate at the University of Cambridge, joins us today to talk about his work "Aligning Time Series on Incomparable Spaces."

6 days, 22 hours ago @ dataskeptic.com
Comparing Time Series with HCTSA
Comparing Time Series with HCTSA Comparing Time Series with HCTSA

Today we are joined again by Ben Fulcher, leader of the Dynamics and Neural Systems Group at the University of Sydney in Australia, to talk about hctsa, a software package for running highly comparative time-series analysis.

1 week, 6 days ago @ dataskeptic.com
Change Point Detection Algorithms
Change Point Detection Algorithms Change Point Detection Algorithms

Gerrit van den Burg, Postdoctoral Researcher at The Alan Turing Institute, joins us today to discuss his work "An Evaluation of Change Point Detection Algorithms."
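
As a quick illustration of what change point detection looks like in practice (not the evaluation protocol from the paper), here is a sketch using the open-source ruptures package on a synthetic signal.

```python
# Illustrative change point detection on a synthetic piecewise-constant signal
# using the ruptures package (pip install ruptures); not the paper's benchmark.
import numpy as np
import ruptures as rpt

rng = np.random.default_rng(0)
signal = np.concatenate([
    rng.normal(0.0, 1.0, 200),
    rng.normal(5.0, 1.0, 200),   # mean shift at t = 200
    rng.normal(1.0, 1.0, 200),   # mean shift at t = 400
])

algo = rpt.Pelt(model="rbf").fit(signal)
breakpoints = algo.predict(pen=10)  # the penalty controls how many change points are reported
print(breakpoints)                  # expected to be close to [200, 400, 600]
```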

2 weeks, 6 days ago @ dataskeptic.com
Time Series for Good
Time Series for Good Time Series for Good

Bahman Rostami-Tabar, Senior Lecturer in Management Science at Cardiff University, joins us today to talk about his work "Forecasting and its Beneficiaries."

3 weeks, 6 days ago @ dataskeptic.com
Long Term Time Series Forecasting
Long Term Time Series Forecasting Long Term Time Series Forecasting

Alex Mallen, Computer Science student at the University of Washington, and Henning Lange, a Postdoctoral Scholar in Applied Math at the University of Washington, join us today to share their work "Deep Probabilistic Koopman: Long-term Time-Series Forecasting Under Periodic Uncertainties."

1 month ago @ dataskeptic.com
Fast and Frugal Time Series Forecasting
Fast and Frugal Time Series Forecasting Fast and Frugal Time Series Forecasting

Fotios Petropoulos, Professor of Management Science at the University of Bath in the U.K., joins us today to talk about his work "Fast and Frugal Time Series Forecasting." WORKS MENTIONED Fast and Frugal Time Series Forecasting by Fotios Petropoulos, Yael Grushka-Cockayne SOCIAL MEDIA Twitter

1 month, 1 week ago @ dataskeptic.com
Causal Inference in Educational Systems
Causal Inference in Educational Systems Causal Inference in Educational Systems

Manie Tadayon, a PhD graduate from the ECE department at the University of California, Los Angeles, joins us today to talk about his work "Comparative Analysis of the Hidden Markov Model and LSTM: A Simulative Approach."

1 month, 2 weeks ago @ dataskeptic.com
Boosted Embeddings for Time Series
Boosted Embeddings for Time Series Boosted Embeddings for Time Series

Sankeerth Rao Karingula, ML Researcher at Palo Alto Networks, joins us today to talk about his work “Boosted Embeddings for Time Series Forecasting.” Works Mentioned Boosted Embeddings for Time Series Forecasting by Sankeerth Rao Karingula, Nandini Ramanan, Rasool Tahmasbi, Mehrnaz Amjadi, Deokwoo Jung, Ricky Si, Charanraj Thimmisetty, Luisa Polania Cabrera, Marjorie Sayer, Claudionor Nunes Coelho Jr Social Media Linkedin Twitter LOD Conference 2021

1 month, 3 weeks ago @ dataskeptic.com
Applying k-Nearest Neighbors to Time Series
Applying k-Nearest Neighbors to Time Series Applying k-Nearest Neighbors to Time Series

Samya Tajmouati, a PhD student in Data Science at the University of Science of Kenitra, Morocco, joins us today to discuss her work Applying K-Nearest Neighbors to Time Series Forecasting: Two New Approaches.
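
For orientation, here is a bare-bones sketch of the classical sliding-window kNN forecast that such work builds on; this is the standard baseline, not the paper's two new approaches, and the series is synthetic.

```python
# Baseline k-nearest-neighbour forecasting: find the k historical windows most similar
# to the most recent window and average what followed them. Synthetic data, illustration only.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(400)
series = np.sin(2 * np.pi * t / 50) + 0.1 * rng.normal(size=t.size)  # noisy seasonal series

def knn_forecast(y, window=25, k=5, horizon=10):
    """Average the continuations of the k historical windows closest to the last observed window."""
    query = y[-window:]
    # candidate windows must leave room for a full horizon after them
    starts = np.arange(0, len(y) - window - horizon)
    candidates = np.stack([y[s:s + window] for s in starts])
    dists = np.linalg.norm(candidates - query, axis=1)
    nearest = starts[np.argsort(dists)[:k]]
    continuations = np.stack([y[s + window:s + window + horizon] for s in nearest])
    return continuations.mean(axis=0)

print(knn_forecast(series))
```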

2 months, 1 week ago @ dataskeptic.com
Ultra Long Time Series
Ultra Long Time Series Ultra Long Time Series

Dr. Feng Li, (@f3ngli) is an Associate Professor of Statistics in the School of Statistics and Mathematics at Central University of Finance and Economics in Beijing, China. He joins us today to discuss his work Distributed ARIMA Models for Ultra-long Time Series.

2 months, 2 weeks ago @ dataskeptic.com
MiniRocket
MiniRocket MiniRocket

Angus Dempster, PhD student at Monash University in Australia, comes on today to talk about "MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification." MINIROCKET reformulates ROCKET, gaining roughly a 75x speedup on larger datasets with essentially the same accuracy. In this episode, we talk about the insights that enabled this speedup as well as use cases.
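
Typical usage pairs the transform with a simple linear classifier. The sketch below assumes sktime's MiniRocket implementation is installed; the data is random and only shows the expected shapes, so it is an illustration of the pipeline rather than a benchmark.

```python
# Sketch of the usual MiniRocket pipeline: the (almost) deterministic convolutional
# transform produces features, and a plain linear model does the classification.
# Assumes sktime is installed; the random data only illustrates the input shape
# (n_instances, n_channels, n_timepoints).
import numpy as np
from sklearn.linear_model import RidgeClassifierCV
from sktime.transformations.panel.rocket import MiniRocket

rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 1, 150))   # 60 univariate series of length 150
y_train = rng.integers(0, 2, size=60)
X_test = rng.normal(size=(20, 1, 150))

transform = MiniRocket()                  # fixed, (almost) deterministic convolutional kernels
transform.fit(X_train)
clf = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10))
clf.fit(transform.transform(X_train), y_train)
print(clf.predict(transform.transform(X_test)))
```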

2 months, 3 weeks ago @ dataskeptic.com
Change Point Detection in Continuous Integration Systems
Change Point Detection in Continuous Integration Systems Change Point Detection in Continuous Integration Systems

David Daly, Performance Engineer at MongoDB, joins us today to discuss "The Use of Change Point Detection to Identify Software Performance Regressions in a Continuous Integration System".

2 months, 4 weeks ago @ dataskeptic.com
ARIMA is not Sufficient
ARIMA is not Sufficient ARIMA is not Sufficient

Chongshou Li, Associate Professor at Southwest Jiaotong University in China, joins us today to talk about his work Why are the ARIMA and SARIMA not Sufficient.

3 months ago @ dataskeptic.com
Comp Engine
Comp Engine Comp Engine

Ben Fulcher, Senior Lecturer at the School of Physics at the University of Sydney in Australia, comes on today to talk about his project Comp Engine.

3 months, 1 week ago @ dataskeptic.com
Linear Digressions Linear Digressions
latest post None
SuperDataScience SuperDataScience
latest post 3 days ago
SDS 526: The Highest-Paying Data Frameworks
SDS 526: The Highest-Paying Data Frameworks SDS 526: The Highest-Paying Data Frameworks

I finish up our three-part series on the results of the O’Reilly Survey, looking at the highest-paying data frameworks.

Additional materials: www.superdatascience.com/526

3 days ago @ soundcloud.com
SDS 525: Hurdling Over Data Career Obstacles
SDS 525: Hurdling Over Data Career Obstacles SDS 525: Hurdling Over Data Career Obstacles

Karen Jean-Francois joins us to discuss how she wants to empower her team members and a wider audience of data scientists battling imposter syndrome.

In this episode you will learn:• Karen’s background as a hurdler [4:…

6 days ago @ soundcloud.com
SDS 524: The Highest-Paying Data Tools
SDS 524: The Highest-Paying Data Tools SDS 524: The Highest-Paying Data Tools

In this episode, I go over the highest-paying data tools based on the O’Reilly survey.

Additional materials: www.superdatascience.com/524

1 week, 3 days ago @ soundcloud.com
SDS 523: Open-Source Analytical Computing (pandas, Apache Arrow)
SDS 523: Open-Source Analytical Computing (pandas, Apache Arrow) SDS 523: Open-Source Analytical Computing (pandas, Apache Arrow)

Wes McKinney joins us to discuss the history and philosophy of pandas and Apache Arrow as well as his continued work in open source tools.

In this episode you will learn:• History of pandas [7:29]• The trends of R and…

1 week, 6 days ago @ soundcloud.com
SDS 522: Data Tools vs. Data Platforms
SDS 522: Data Tools vs. Data Platforms SDS 522: Data Tools vs. Data Platforms

I provide you with some quick definitions of data tools vs data platforms to prep us for deep dives in future episodes.

Additional materials: www.superdatascience.com/522

2 weeks, 3 days ago @ soundcloud.com
SDS 521: Skyrocket Your Career by Sharing Your Writing
SDS 521: Skyrocket Your Career by Sharing Your Writing SDS 521: Skyrocket Your Career by Sharing Your Writing

Khuyen Tran joins us to discuss her work as a prolific technical writer and undergraduate data science student.

In this episode you will learn:• Khuyen’s online writing [4:00]• Book writing [8:50]• How you can increa…

2 weeks, 6 days ago @ soundcloud.com
SDS 520: The Highest-Paying Programming Languages for Data Scientists
SDS 520: The Highest-Paying Programming Languages for Data Scientists SDS 520: The Highest-Paying Programming Languages for Data Scientists

I take a look at the results of O’Reilly’s survey on salaries for data scientists in 2021.

Additional materials: www.superdatascience.com/520

3 weeks, 3 days ago @ soundcloud.com
SDS 519: A.I. for Good
SDS 519: A.I. for Good SDS 519: A.I. for Good

James Hodson joins us to discuss his philosophy and work at A.I. For Good and how they aim to promote sustainability and A.I. use for social issues.

In this episode you will learn:• AI for Good [5:17]• Founding of AI …

3 weeks, 6 days ago @ soundcloud.com
SDS 518: Fail More
SDS 518: Fail More SDS 518: Fail More

This week, I provide a short but important bit of advice on failure.

Additional materials: www.superdatascience.com/518

1 month ago @ soundcloud.com
SDS 517: Courses in Data Science and Machine Learning
SDS 517: Courses in Data Science and Machine Learning SDS 517: Courses in Data Science and Machine Learning

Sadie St. Lawrence talks in-depth about her extensive work as a data science educator through both online and collegiate courses as well as her organization for diversifying data science careers.

In this episode you wil…

1 month ago @ soundcloud.com
SDS 516: Does Caffeine Hurt Productivity? (Part 3: Scientific Literature)
SDS 516: Does Caffeine Hurt Productivity? (Part 3: Scientific Literature) SDS 516: Does Caffeine Hurt Productivity? (Part 3: Scientific Literature)

In this episode, I finish up my saga into the effects of caffeine on productivity.

Additional materials: www.superdatascience.com/516

1 month, 1 week ago @ soundcloud.com
SDS 515: Accelerating Impact through Community — with Chrys Wu
SDS 515: Accelerating Impact through Community — with Chrys Wu SDS 515: Accelerating Impact through Community — with Chrys Wu

Chrys Wu joins us to discuss her community organizations, her tips, and her recommended resources for building data science communities for impact.

In this episode you will learn:• The world of K-Pop [ 4:07]• Chrys’s …

1 month, 1 week ago @ soundcloud.com
SDS 514: Does Caffeine Hurt Productivity? (Part 2: Experimental Results)
SDS 514: Does Caffeine Hurt Productivity? (Part 2: Experimental Results) SDS 514: Does Caffeine Hurt Productivity? (Part 2: Experimental Results)

In this episode, I dive into the nuts and bolts of data on my experiment into caffeine and productivity.

Additional materials: www.superdatascience.com/514

1 month, 2 weeks ago @ soundcloud.com
SDS 513: Transformers for Natural Language Processing
SDS 513: Transformers for Natural Language Processing SDS 513: Transformers for Natural Language Processing

Denis Rothman joins us to discuss his writing work in natural language processing, explainable AI, and more!

In this episode you will learn:• What are transformers and their applications? [7:54]• Denis’s book on expla…

1 month, 2 weeks ago @ soundcloud.com
SDS 512: Does Caffeine Hurt Productivity? (Part 1)
SDS 512: Does Caffeine Hurt Productivity? (Part 1) SDS 512: Does Caffeine Hurt Productivity? (Part 1)

I dive into a personal experiment to test my productivity relative to my coffee intake and if caffeine is actually hurting my productivity.

Additional materials: www.superdatascience.com/512

1 month, 3 weeks ago @ soundcloud.com
Data Science at Home Data Science at Home
latest post 1 week, 6 days ago
Do you fear AI? Why? (Ep. 176)
Do you fear AI? Why? (Ep. 176) Do you fear AI? Why? (Ep. 176)

November 16, 2021, podcast. This episode summarizes a study about AI trends in 2021, the way AI is perceived by people of different backgrounds, and some other weird questions.

For instance, would you have sexual intercourse with a robot?

Would you be in a relationship with an artificial intelligence?

The study was conducted by Tidio and reported at https://www.tidio.com/blog/ai-trends/ Sponsors: This episode is supported by Amethix Technologies.

Amethix uses machine learning and advanced analytics to empower people and organizations to ask and answer complex questions like never before.

1 week, 6 days ago @ datascienceathome.com
Composable models and artificial general intelligence (Ep. 175)
Composable models and artificial general intelligence (Ep. 175) Composable models and artificial general intelligence (Ep. 175)

November 9, 2021, podcast. If you think deep learning is a method to get to AGI, think again.

Humans, as well as all mammals, think in a… composable way.

Sponsors: This episode is brought to you by Advanced RISC Machines (ARM).

ARM is a family of reduced instruction set computing architectures for computer processors: https://www.arm.com/ Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy.

Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.

2 weeks, 5 days ago @ datascienceathome.com
Ethics and explainability in AI with Erika Agostinelli from IBM (ep. 174)
Ethics and explainability in AI with Erika Agostinelli from IBM (ep. 174) Ethics and explainability in AI with Erika Agostinelli from IBM (ep. 174)

November 2, 2021, podcast. AI, ethics, and explainability.

We answer these questions in this amazing episode with Erika Agostinelli from the AI Elite team at IBM.

Sponsored by Amethix Technologies. Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy.

Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.

References: AIX360: https://aix360.mybluemix.net/ Questioning the AI: Informing Design Practices for Explainable AI User Experiences: https://arxiv.org/abs/2001.…

3 weeks, 6 days ago @ datascienceathome.com
Fighting Climate Change as a Technologist (Ep. 172)
Fighting Climate Change as a Technologist (Ep. 172) Fighting Climate Change as a Technologist (Ep. 172)

1 month, 1 week ago @ datascienceathome.com
AI in the Enterprise with IBM Global AI Strategist Mara Pometti (Ep. 171)
AI in the Enterprise with IBM Global AI Strategist Mara Pometti (Ep. 171) AI in the Enterprise with IBM Global AI Strategist Mara Pometti (Ep. 171)

1 month, 1 week ago @ datascienceathome.com
Speaking about data with Mikkel Settnes from Dreamdata.io (Ep. 170)
Speaking about data with Mikkel Settnes from Dreamdata.io (Ep. 170) Speaking about data with Mikkel Settnes from Dreamdata.io (Ep. 170)

1 month, 1 week ago @ datascienceathome.com
Send compute to data with POSH data-aware shell (Ep. 169)
Send compute to data with POSH data-aware shell (Ep. 169) Send compute to data with POSH data-aware shell (Ep. 169)

October 21, 2021, podcast. Our Sponsors: Quantum Metric. Stay off the naughty list this holiday season by reducing customer friction, increasing conversions, and personalizing the shopping experience.

Visit us at quantummetric.com/podoffer and see if you qualify to receive our “12 Days of Insights” offer with code DATASCIENCE.

This offer gives you 12-day access to our platform coupled with a bespoke insight report that will help you identify where customers are struggling or engaging in your digital product.

Amethix Technologies: Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logist…

1 month, 1 week ago @ datascienceathome.com
How are organisations doing with data and AI? (Ep. 168)
How are organisations doing with data and AI? (Ep. 168) How are organisations doing with data and AI? (Ep. 168)

October 21, 2021, podcast. A few weeks ago I was a guest on a very interesting show called “AI Today”.

In that episode I talked about some of the biggest trends emerging in AI and machine learning today as well as how organizations are dealing with and managing their data.

Visit us at quantummetric.com/podoffer and see if you qualify to receive our “12 Days of Insights” offer with code DATASCIENCE.

Amethix Technologies: Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy.

Amethix provides solutions to collect and secure data with higher transparency and disintermed…

1 month, 1 week ago @ datascienceathome.com
Don’t fight! Cooperate. Generative Teaching Networks (Ep. 167)
Don’t fight! Cooperate. Generative Teaching Networks (Ep. 167) Don’t fight! Cooperate. Generative Teaching Networks (Ep. 167)

Generative Adversarial Networks for synthetic data generation?

There is a new method called Generative Teaching Networks that uses similar concepts – just quite the opposite 😛 – to train models faster, better, and with less data.

Visit us at quantummetric.com/podoffer and see if you qualify to receive our “12 Days of Insights” offer with code DATASCIENCE.

Amethix Technologies: Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy.

Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models th…

1 month, 1 week ago @ datascienceathome.com
CSV sucks. Here is why. (Ep. 166)
CSV sucks. Here is why. (Ep. 166) CSV sucks. Here is why. (Ep. 166)

October 21, 2021, podcast. It’s time we get serious about replacing the CSV format with something that, guess what?

In this episode I explain the good parts of CSV files and the not so good ones.
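
As a concrete illustration of one of the “not so good” parts (my own example, not the episode’s): CSV stores everything as untyped text, so schemas, datetimes, and missing values have to be re-guessed on every read, whereas a typed binary columnar format such as Parquet keeps the schema with the data. The Parquet write below assumes pyarrow (or fastparquet) is installed.

# Illustrative comparison, not code from the episode.
import pandas as pd

df = pd.DataFrame({
    "user_id": [1, 2, 3],
    "signup": pd.to_datetime(["2021-01-05", "2021-03-17", "2021-10-21"]),
    "score": [0.10, 0.25, None],          # a missing value
})

df.to_csv("users.csv", index=False)
df.to_parquet("users.parquet")            # needs pyarrow or fastparquet

print(pd.read_csv("users.csv").dtypes)          # "signup" comes back as a plain string column
print(pd.read_parquet("users.parquet").dtypes)  # dtypes, including datetime64, survive the round trip

Parquet is only one example; the broader point is that any typed, binary, columnar format spares every reader the re-parsing and type-guessing CSV forces on them.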

Visit us at quantummetric.com/podoffer and see if you qualify to receive our “12 Days of Insights” offer with code DATASCIENCE.

Amethix Technologies: Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy.

Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.

1 month, 1 week ago @ datascienceathome.com
Reinforcement Learning is all you need. Or is it? (Ep. 165)
Reinforcement Learning is all you need. Or is it? (Ep. 165)

August 18, 2021, podcast. Is reinforcement learning sufficient to build truly intelligent machines?

Our Sponsors: Quantum Metric. Stay off the naughty list this holiday season by reducing customer friction, increasing conversions, and personalizing the shopping experience.

Visit us at quantummetric.com/podoffer and see if you qualify to receive our “12 Days of Insights” offer with code DATASCIENCE.

Amethix Technologies: Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy.

Amethix provides solutions to collect and secure data with higher transparency and disintermediation…

3 months, 1 week ago @ datascienceathome.com
What’s happening with AI today? (Ep. 164)
What’s happening with AI today? (Ep. 164)

August 18, 2021, podcast. In this episode I have a wonderful chat with Ronald Schmelzer and Kathleen Walch, hosts of “AI Today”, the top podcast for those wanting a no-hype, practical, real-world insight into what enterprises, public sector agencies, thought leaders, leading technology companies, pundits, and experts are doing with AI today.

Sponsored by Quantum Metric. Did you know that 2021 holiday ecommerce sales are expected to exceed 2020 benchmarks? Are you prepared to capture every customer revenue opportunity?

Visit their website at quantummetric.com/podoffer and see if you qualify to receive their “12 Days of Insights” offer with code DATASCIENCE.

Sponsored by Amethix Technologies: Amethi…

3 months, 1 week ago @ datascienceathome.com
2 effective ways to explain your predictions (Ep. 163)
2 effective ways to explain your predictions (Ep. 163)

3 months, 3 weeks ago @ datascienceathome.com
The Netflix challenge. Fair or what? (Ep. 162)
The Netflix challenge. Fair or what? (Ep. 162)

Remember the Netflix challenge? There was a ton of money on offer for whoever could crack the problem of recommending the best possible movie. Was it a fair challenge? […]

The post The Netflix challenge. Fair or what? (Ep. 162) appeared first on Podcast Data science at home.

3 months, 3 weeks ago @ datascienceathome.com
Artificial Intelligence for Blockchains with Jonathan Ward CTO of Fetch AI (Ep. 161)
Artificial Intelligence for Blockchains with Jonathan Ward CTO of Fetch AI (Ep. 161)

In this episode, Fetch AI CTO Jonathan Ward speaks about decentralization, AI, and blockchain for smart cities and the enterprise. Below are some great links about collective learning, smart contracts in Rust, and […]

The post Artificial Intelligence for Blockchains with Jonathan Ward CTO of Fetch AI (Ep. 161) appeared first on Podcast Data science at home.

3 months, 3 weeks ago @ datascienceathome.com
ParrotCast
latest post None