Very ML
Our SOTA roundup of ML news. Some of it is read by LeCun.
DataTau
last post 4 hours ago
NPD and Perfect Store Execution: Cracking the Tough Nut with ShelfWatch

4 hours ago @ datatau.net
Case of Orange Juice and The Importance of In-Store Execution

1 day, 23 hours ago @ datatau.net
30 Helpful Python Snippets That You Can Learn in 30 Seconds or Less

1 day, 23 hours ago @ datatau.net
Facebook AI RegNet Models Outperform EfficientNet Models, Run 5x Faster on GPUs

2 days, 19 hours ago @ datatau.net
DarwinAI Open-Sources COVID-Net as Medical Imaging in COVID-19 Diagnosis Debate Continues

3 days, 14 hours ago @ datatau.net
3D medical image segmentation with PyTorch

3 days, 14 hours ago @ datatau.net
Use Graphs to identify Social Media Influencers!

3 days, 15 hours ago @ datatau.net
Looking for remote work in Data Science? Try dsremote.work

3 days, 18 hours ago @ datatau.net
Three useful measures to evaluate your machine learning system

3 days, 18 hours ago @ datatau.net
Pandas tips I wish I knew before

4 days, 4 hours ago @ datatau.net
Fei-Fei Li Proposes AI-Assisted Elder Care Solution at Stanford-Hosted Virtual Conference on COVID‑19 and AI

4 days, 10 hours ago @ datatau.net
Setting up reproducible Python environments for Data Science

4 days, 14 hours ago @ datatau.net
Google DeepMind ‘Agent 57’ Beats Human Baselines Across Atari Games Suite

4 days, 19 hours ago @ datatau.net
Basket Analysis with Python

4 days, 20 hours ago @ datatau.net
Apache Kafka: Life after the honeymoon period!

4 days, 21 hours ago @ datatau.net
/r/MachineLearning
last post 10 minutes ago
[D] - Blog on summarizing NLP Papers

10 minutes ago @ reddit.com
[P] Greetings From The Past - Displaying random talking faces of WWII (Eastern Front) casualties generated from real photos and first-order-model

24 minutes ago @ reddit.com
[R] Joint Geodesic and Euclidean Convolutions on 3D Meshes ; Models and Code Publicly Available

43 minutes ago @ reddit.com
[Project] If gpt-2 read erotica, what would be its take on the Holy scriptures?

56 minutes ago @ reddit.com
[Project] Pre-trained model

1 hour ago @ reddit.com
[D] Deciding One Tensor Input Vs Multiple Tensors Input for Neural Net

1 hour ago @ reddit.com
[P] Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs

3 hours ago @ reddit.com
[D] ML "advanced" learning

3 hours ago @ reddit.com
[R] Introducing MIDAS: A New Baseline for Anomaly Detection in Graphs

4 hours ago @ reddit.com
[P] Dive into Deep Learning: An interactive deep learning book with code, math, and discussions, based on the NumPy interface.

10 hours ago @ reddit.com
[D] What is the current state-of-the-art for handling class imbalance in NLP classification problems?

13 hours ago @ reddit.com
[D] Making models that model relationships which are not dependent on all the features being correct or present

14 hours ago @ reddit.com
[Project] Anyone here want to get paid for a small project? Get some google ai open source up and running, NLP/tensorflow/question answering

16 hours ago @ reddit.com
[D] Machine Learning - WAYR (What Are You Reading) - Week 85

16 hours ago @ reddit.com
[DISCUSSION] What do I need to know before starting my long-term project of making a self learning (digital) AI?

16 hours ago @ reddit.com
Towards Data Science
last post 7 hours ago
All about Feature Scaling

Feature scaling in machine learning is one of the most critical steps during the pre-processing of data before creating a machine learning model.

Thus feature scaling is needed to put every feature on the same footing, without any one feature being given upfront importance.

Another reason feature scaling is applied is that some algorithms, such as gradient descent in neural networks, converge much faster with feature scaling than without it.

Deep learning requires feature scaling for faster convergence, and thus it is vital to decide which feature scaling to use.

Still, like most other machine learning steps, feature scaling too is a trial and error process, not a single silver bullet.
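The trade-off described above can be made concrete with a short sketch (not from the article; it assumes scikit-learn is available and uses made-up numbers) comparing the two most common scalers:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Two features on very different scales; without scaling, the second
# column would dominate any distance- or gradient-based algorithm.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

X_std = StandardScaler().fit_transform(X)  # zero mean, unit variance per column
X_mm = MinMaxScaler().fit_transform(X)     # each column rescaled to [0, 1]
```

Which of the two works better for a given model is exactly the trial-and-error question raised above; standardization is the usual default for gradient-descent-trained models.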

7 hours ago @ towardsdatascience.com
Create a Free Linux Virtual Machine on your Computer for Data Science Projects using VirtualBox…

In the examples in this article, I’ll explain how to load the free Linux OS, Ubuntu Server, onto a virtual machine using Windows as my host machine operating system.

Like a virtual machine in the cloud, it is even possible to set up SSH making it easy to remote in!

After installing VirtualBox and downloading Ubuntu, it is time to create a virtual machine and get Ubuntu installed.

Select Create a virtual hard disk now as the Hard disk option so the VM has dedicated storage space.

The Create Virtual Hard Disk editor displays if that Hard disk option was selected; select a file location.

7 hours ago @ towardsdatascience.com
Scrape Tabular Data with Python

One of the bottlenecks in executing a machine learning project is assembling the dataset.

The ways of assembling data vary a lot with the type of data, and scraping tabular datasets from the web is one of the most typical sources.

I have been writing about machine learning techniques for a while using the NBA players’ stats as the raw data.

One of the most frequently asked questions to me is whether I could share the data because people would like to play with it.

So, in this post, I would like to share with you how to scrape tabular data from the web with Python.
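As a hedged sketch of one common approach (the article's exact code is not shown here), pandas can pull HTML tables directly; the tiny inline table below stands in for a real stats page:

```python
from io import StringIO

import pandas as pd

html = """<table>
  <tr><th>Player</th><th>PTS</th></tr>
  <tr><td>A</td><td>25</td></tr>
  <tr><td>B</td><td>30</td></tr>
</table>"""

# read_html returns one DataFrame per <table> element found in the input
df = pd.read_html(StringIO(html))[0]
```

For a live page you would pass the URL instead of the inline string; read_html requires an HTML parser such as lxml to be installed.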

7 hours ago @ towardsdatascience.com
5 Papers on CNNs Every Data Scientist Should Read

In this article, we introduce 5 papers on CNNs that represent both novel approaches and baselines in the field.

CNNpred: CNN-based stock market prediction using a diverse set of variables — In this paper, researchers from the University of Tehran introduce their CNN-based framework, CNNpred, for feature extraction and predictive analysis in stock market data.

Their framework has been implemented to predict the next day’s direction of movement for various stock market indices.

Mask R-CNN — One of the highest-rated CNN papers on Papers With Code, Mask R-CNN achieved a SOTA (state of the art) rating for the Instance Segmentation on Cityscapes te…

8 hours ago @ towardsdatascience.com
What is PyTorch?

PyTorch emphasizes flexibility and allows deep learning models to be expressed in idiomatic Python.

Installation with Anaconda: conda install pytorch torchvision -c pytorch
Installation with pip: pip3 install torch torchvision
If you have any problems with installation, find out more about the different ways to install PyTorch here.

Even if you are not familiar with Numpy, PyTorch is written in such an intuitive way that you can learn it in seconds.

Whether we decide to use GPU or CPU, PyTorch makes it easy for us to switch between the two:

cpu = torch.device("cpu")
gpu = torch.device("cuda:0")  # GPU 0

# Create tensor with CPU
x = torch.ones(3, 3, device=cpu)
print("CPU:", x.device)

x = torch.ones(3, 3, device=gpu)
print("GPU:…

8 hours ago @ towardsdatascience.com
Data Science Reading List for April 2020

This month's Data Science reading list.

We've got a lot of followers in Data Science and not enough leaders.

Game Theory: An Introduction. I love Game Theory.

But if you're a huge Game Theory nerd like I am, this is awesome.

One reviewer said it best: "Steve Tadelis's Game Theory is an amazing textbook that combines accessibility and rigor with many nice applications." Definitely worth checking out if Game Theory has ever piqued your interest and you have a strong math background.

8 hours ago @ towardsdatascience.com
Visualizing COVID-19 Data Beautifully in Python (in 5 Minutes or Less!!)

Matplotlib may be the de facto data visualization library for Python, but it’s not always the prettiest.

In this post, we’ll explore how to turn a drab, default Matplotlib graph into a beautiful data visualization.

We’ll explore COVID-19 data to see how the virus has spread throughout different countries.

Let's load in our data. We'll be using data from this wonderful GitHub repository, which auto-updates the data daily.

We'll load our data into a pandas DataFrame directly from the URL so that it updates automatically for us every day.
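A minimal sketch of the "drab default to styled" step, with made-up numbers standing in for the repository's real case counts (the repository URL is not reproduced here):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative numbers only, not real case counts
df = pd.DataFrame({"Italy": [3, 9, 17, 41], "Spain": [1, 2, 13, 32]},
                  index=pd.date_range("2020-03-01", periods=4))

fig, ax = plt.subplots(figsize=(8, 4))
df.plot(ax=ax, linewidth=2)
ax.set_title("Confirmed cases over time (illustrative data)")
ax.spines["top"].set_visible(False)    # soften the default boxy frame
ax.spines["right"].set_visible(False)
fig.savefig("cases.png", dpi=150)
```

The same spine/title tweaks apply unchanged once the DataFrame is loaded from the live CSV instead.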

8 hours ago @ towardsdatascience.com
Small family: Small dataset

Most of us, who are not part of a large corporation, need to find our way of working with relatively small data sizes.

We witnessed extreme overfitting, caused by the model’s attachment to very particular image styles from the small dataset.

However, a small dataset not only rendered the detection of patterns more difficult, it also hindered creativity and the ability to produce new images.

As a result, it was almost impossible to render accurate results using such a small dataset.

The approach led us to satisfactory results: a well-formed, yet distorted, image of a perfect family.

8 hours ago @ towardsdatascience.com
Beginners Guide to Transition from SAS to Python

Once you have access to SAS and Python, the last thing you would need is to install pandas for Python.

SAS provides a powerful procedure — PROC IMPORT to do this.

PROC MEANS is a great utility that provides SAS users an easy way to generate this information.

I have often used DROP and KEEP statements inside SAS DATA step to drop and keep certain columns in datasets.

titanic_outer.to_csv('titanic_outer.csv')

Conclusion: I hope this guide provides a good starting point for SAS programmers to try their hand at data analytics using Python, and for Python programmers to use SAS.
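The SAS utilities named above have close pandas analogues; a hedged sketch with an illustrative toy frame (the column names are ours, not the article's Titanic data):

```python
import pandas as pd

df = pd.DataFrame({"age": [22, 38, 26],
                   "fare": [7.25, 71.28, 7.92],
                   "cabin": [None, "C85", None]})

summary = df[["age", "fare"]].describe()  # rough analogue of PROC MEANS
kept = df[["age", "fare"]]                # like KEEP age fare; in a DATA step
dropped = df.drop(columns=["cabin"])      # like DROP cabin;
```

Reading the CSV in the first place (pd.read_csv) plays the role of PROC IMPORT.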

8 hours ago @ towardsdatascience.com
Using Machine Learning to Fight the Coronavirus

Machine Learning has been a potent technology that has allowed for advances in healthcare, robotics, data mining, cybersecurity, etc.

Thermal imaging falls under the field of thermal imaging science.

Machine learning has become ubiquitous as the forefront method for pattern recognition, so it should be possible to plug datasets into specific machine learning algorithms and achieve systematic classification of individuals with a fever (and possibly the virus).

This would all be based on a machine learning algorithm processing that input data in real time.

Some sample algorithms which may be useful for this problem include Naive Bayesian machine learning (probabilistic), a Support Vector Machi…

8 hours ago @ towardsdatascience.com
Scientific Computing with Python

Numpy ("NUMerical PYthon") is one of the most powerful math libraries for Python.

The ndarray class allows instantiating multidimensional vectors and processing them with native Python instructions in order to simplify scientific calculations.

a = np.linspace(0, 1)  # by default divides the interval into 50 equidistant points
b = np.linspace(0, 1, num=10)

# np.full(shape, fill_value, dtype=None)
np.full((5, 5), 3, np.int)
np.full((2, 4), fill_value=3+2j, dtype=np.complex)

We can use np.ones as an equivalent of np.full.
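A runnable recap of the calls in the excerpt; note that the np.int and np.complex aliases it uses were removed in later NumPy releases, so this sketch substitutes the plain Python types:

```python
import numpy as np

a = np.linspace(0, 1)               # 50 equidistant points by default
b = np.linspace(0, 1, num=10)

c = np.full((5, 5), 3, dtype=int)
d = np.full((2, 4), fill_value=3 + 2j, dtype=complex)
e = np.ones((5, 5), dtype=int) * 3  # np.ones as an equivalent of np.full
```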

8 hours ago @ towardsdatascience.com
The Text Must Flow

Training a generative text model on Frank Herbert's Dune. Dune is the story of a feudal society in the far distant future.

I have been exploring generative modeling and wanted to try something out with text!

Generative modeling allows for the model to create novel configurations of the training data it learns from.

A recent example of the power of generative modeling is StyleGAN — check out this video of it in action.

‘Baron The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of The Baron Of …

9 hours ago @ towardsdatascience.com
Tutorial: Plotting in R for Python Refugees

Let’s start out with the data set.

ggplot(midwest, aes(x=area, y=poptotal)) +
  geom_point() +
  geom_smooth(method="lm")

We can also add layers that alter the graph.

ggplot(midwest, aes(x=area, y=poptotal)) +
  geom_point() +
  geom_smooth(method="lm") +
  coord_cartesian(xlim=c(0, 0.1), ylim=c(0, 100000))

Besides adding geoms and zooming, the + operator also allows for labelling titles and axes.

ggplot(midwest, aes(x=area, y=poptotal)) +
  geom_point() +
  geom_smooth(method="lm") +
  coord_cartesian(xlim=c(0, 0.1), ylim=c(0, 100000)) +
  ggtitle("Area Vs Population", subtitle="From midwest dataset") +
  xlab("Area") +
  ylab("Population")

Each layer has its own parameters: for example, we can specify colors and sizes, …

11 hours ago @ towardsdatascience.com
There won’t be an AI winter this time

Machine learning isn't a "Skynet or bust" proposition. Disclaimer: my observations are influenced by the fact that I work on Cortex, an open source machine learning deployment platform.

Every few weeks, a new article predicting an imminent AI winter gets circulated.

Production machine learning isn’t limited to tech giants, either.

Machine learning isn’t a bet anymoreThe reason that hype cycles were able to crash AI investment in previous decades was that AI, and by extension machine learning, were essentially bets.

Journalists who predict the Singularity will be here by Christmas might be wildly wrong, but they won’t be causing an AI winter.

14 hours ago @ towardsdatascience.com
Extracting Coefficients of OpenCV Face Detection DNN model

The latest OpenCV includes a Deep Neural Network (DNN) module, which comes with a nice pre-trained face detection convolutional neural network (CNN).

The reason is mainly that Caffe has changed quite a lot since the model was trained, making the pre-trained model incompatible with the latest Caffe version.

Extract coefficients for all neural network layers inside OpenCV DNN face detector.

Create a new neural network using the latest Caffe, insert the extracted coefficients into it, and save the new model.

Load parameters into the new Caffe: once you save the model using the new Caffe, it is compatible with the new Caffe from then on.

14 hours ago @ towardsdatascience.com
Becoming Human
last post 2 days, 19 hours ago
What is machine learning and how your business can benefit from it

Jobs in machine learning: fintech seems to be taking advantage of this technology a lot, as machine learning is widely used in financial data modeling.

Strategy to implement if you decide to incorporate machine learning into your business: the reason behind machine learning's popularity is that it enables big data processing.

Using discovered data, machine learning enables systems to learn how to deal with time-intensive documentation and data entry tasks.

#2 Follow the machine learning adoption stages: when incorporating machine learning into the strategy, companies go through the stages of data collection, pattern discovery & outcome expectation, and the application o…

2 days, 19 hours ago @ becominghuman.ai
Using artificial intelligence to detect COVID-19

The artificial intelligence tool could help states develop more specific COVID-19 disease prevention strategies and improve public health.

We can use Artificial Intelligence algorithms to detect the disease using automatic X-ray analysis to support radiologists.

To test the performance of Artificial Intelligence on the detection of COVID-19, I used the database made available by the team of Dr Joseph Cohen.

The database — which contains 152 observations — was constructed from COVID-19 cases with chest x-ray or CT images.

The database contains 127 observations with COVID-19 disease, 4 ARDS, 2 Pneumocystis, 6 Streptococcus and 2 No Finding.

2 days, 19 hours ago @ becominghuman.ai
The Logic of Digital Memories

So, LSTMs keep adding memory gates that control when memory is saved from one iteration to the next.

- First, a sigmoid layer called the “input gate layer” decides which values are to be updated.

The old cell state, Ct−1, should be updated into the new cell state, Ct.

It refers to letting the gate layers look at the cell state (Koenker et al, 2001).

Another variant would be to use a combination of both coupled forget and input gates.
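The gate walk-through above can be sketched as a single LSTM cell update in NumPy (weights are random placeholders, not trained values; f, i, g, o follow the usual forget/input/candidate/output naming):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
Wf, Wi, Wg, Wo = (rng.standard_normal((n_hid, n_in + n_hid)) for _ in range(4))

x = rng.standard_normal(n_in)  # current input
h_prev = np.zeros(n_hid)       # previous hidden state
c_prev = np.zeros(n_hid)       # previous cell state C_{t-1}

z = np.concatenate([x, h_prev])
f = sigmoid(Wf @ z)               # forget gate: what to keep from C_{t-1}
i = sigmoid(Wi @ z)               # "input gate layer": which values to update
g = np.tanh(Wg @ z)               # candidate values
c = f * c_prev + i * g            # old cell state updated into C_t
h = sigmoid(Wo @ z) * np.tanh(c)  # output gate produces the new hidden state
```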

2 days, 19 hours ago @ becominghuman.ai
The Scientific War against COVID-19: A Congregation of Cutting-Edge Breakthroughs and Research

Also known as protein vaccines; as part of these vaccines, recombinant antigens are injected into the human body.

mRNA vaccines: mRNA vaccines are similar in that they help the immune system recognize a particular antigen and generate immunity to it.

Moderna Therapeutics is a company currently working on generating these mRNA vaccines for COVID.

However, early safety tests indicate that mRNA vaccines may induce a large number of side effects in patients.

CRISPR (which stands for clustered regularly interspaced short palindromic repeats), is part of the bacterial immune system encoded into their genes.

3 days, 21 hours ago @ becominghuman.ai
The Science of Convolutional Networks

There are three hyperparameters that control the size of the output volume, namely the depth, stride and zero-padding.

Pooling layer: the pooling layer helps to reduce the number of parameters in the network to avoid overfitting.

As can be seen in the figure above, the pooling layer downsamples the volume independently in each depth slice of the input volume.

Imagine that the Inception layer only performs 3x3 convolutions; in other words, 256 x 256 x 3 x 3 convolutions have to be performed (roughly 589,000 multiply-accumulate operations).

This is basically identical to performing a convolution with strides in parallel with a simple pooling layer.
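The hyperparameters mentioned above combine in the standard output-size formula (W - F + 2P) / S + 1; a small illustrative helper (not from the article):

```python
def conv_output_size(w, f, stride=1, padding=0):
    """Spatial output size for a w x w input with f x f filters: (w - f + 2p) // s + 1."""
    return (w - f + 2 * padding) // stride + 1

same = conv_output_size(224, 3, stride=1, padding=1)  # 3x3 with pad 1 preserves 224
half = conv_output_size(224, 2, stride=2)             # 2x2 stride-2 pooling halves it
```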

3 days, 21 hours ago @ becominghuman.ai
The Future of AI Will Look Like Those Who Build It

And at IVOW, home of AIGrrls, we believe that effective fusion of AI and Cultural Intelligence will help diminish bias in algorithmic identification and train AI software to be much more inclusive.

Jennifer Bonine is the CEO of PinkLion AI, a breakout company that brings AI to the world’s App teams.

For her part, civic technologist Nina Bianchi sported a flowing jacket featuring the phrase “The future is female” in Morse code print.

Referring to the controversies around Clearview AI, Raluca emphasized the responsibility to build ethical software, and to ensure you are mindful of the tools you use at every level.

AIGrrls is a networking event powered by IVOW AI, a US tech company focused on …

3 days, 21 hours ago @ becominghuman.ai
Understanding Anchors(backbone of object detection) using YOLO

YOLO has three detection layers; so what happens at each of these detection layers?

And if your images are resized to 416 x 416, the ground-truth labels change accordingly.

Remember, the ground-truth bounding boxes are in 416 x 416 coordinates, while the model prediction we use here is on a 13 x 13 grid.

How do we compare a bounding box in 416 x 416 coordinates to one represented on the 13 x 13 grid?

The cell in the 13 x 13 grid with the highest IoU is assigned that bounding box.
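The 416-to-13 comparison above boils down to a stride of 416 / 13 = 32 pixels per grid cell; a hedged sketch of assigning a box center to its cell (the function name is ours, not the article's):

```python
def grid_cell(cx, cy, img_size=416, grid=13):
    """Map a box center in pixel coordinates to its (col, row) on the grid."""
    stride = img_size // grid  # 32 pixels per cell
    return int(cx // stride), int(cy // stride)

cell = grid_cell(208, 100)  # center (208, 100) falls in column 6, row 3
```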

4 days, 20 hours ago @ becominghuman.ai
Demystifying Artificial Intelligence

Here's a summary of our insightful discussion, where Dr. Monett helped us separate fact from fiction in AI.

Artificial Intelligence (AI) is the same as Machine Learning or Deep Learning.

Artificial General Intelligence (AGI) or human like intelligence is already here.

But ethics and moral judgment could to some extent be put into AI programs and help humans in decision making.

Two surveys on explainable AI worth reading: "Explainable Artificial Intelligence: A Survey" (https://academia.edu/36711495/Explainable_Artificial_Intelligence_A_Survey) and "A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI" (https://arxiv.org/abs/1907.07374). #8 Fact or Fiction?

4 days, 20 hours ago @ becominghuman.ai
What is the Best Subtitling Software Available?

What is the best subtitling software?

Actually — it’s VideoTranslator — our product is the best subtitling software!

Accuracy in AI transcription — How good is AI transcription really?

Generally subtitling software uses generic AI.

Using WebVTT for SEOIs this really a factor in deciding what subtitling software to use?

4 days, 20 hours ago @ becominghuman.ai
Basics of Networking: The Neural Way

A function that transforms the values or states the conditions for the decision of the output neuron is known as an activation function.

The networks often go by different names: deep feedforward networks, feedforward neural networks, or multi-layer perceptrons (MLP).

Adding hidden layers can allow the neural network to make more complex decisions, but more on that, and on how neural networks learn through a process known as backpropagation, later.

The last fully-connected layer is called the “output layer” and in classification settings it represents the class scores (Koenker et al, 2001).

Convolutional networks: in the past, neural networks for complex tasks were too inefficient …
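A minimal sketch of the feedforward pass described above (weights are random placeholders; the shapes are made up for illustration):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)  # a common activation function

rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((5, 3)), np.zeros(5)  # hidden layer
W2, b2 = rng.standard_normal((2, 5)), np.zeros(2)  # fully-connected output layer

x = rng.standard_normal(3)
hidden = relu(W1 @ x + b1)
scores = W2 @ hidden + b2  # the "output layer" class scores
```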

5 days, 19 hours ago @ becominghuman.ai
How We Can Use Social Media To Predict COVID Contact Tracing

To prevent infection, we don’t need to predict the future with absolute certainty, we just need a possibility.

In this case, we don't need to be as completely accurate as with Pre-Crime.

But of course, we haven’t accounted for changes in behavior due to the existence of COVID-19, and that’s where the Contact Tracing apps come in.

With the Contact Tracing apps, we have a large data pool of each user’s movements during the coronavirus situation.

5 days, 19 hours ago @ becominghuman.ai
Why Culture Should be Felt in Artificial Intelligence

The thoughtful conversations focusing on artificial intelligence (AI) and culture were part of our AIGrrls community lab.

Focusing on speculative explainability, Carrie talked about how we can evolve current UI toolkits to communicate and explain automation and AI to humans.

Carrie led the first workshop on speculative explainability in summer 2019 at NYU’s Interactive Telecommunications Program with artists, designers, musicians, and technologists.

Kathy is also Founding Friend and AI Ambassador of a pioneering women’s dataset challenge led by IVOW in collaboration with AI Commons.

JoDell Seaman of PinkLion emphasized that great innovations in artificial intelligence are happening in place…

5 days, 19 hours ago @ becominghuman.ai
How can I trust you?

If you are more interested in the practical stuff, you may skip to the Trust Score section.

Enter: Trust Score.

trust score = ĥ / h. The explanation above was meant to be the intuition behind the trust score.

The code snippet above lays down the canonical code for computing the trust score, and getting ĥ — which will also use the PCA-encoded test features.

Summary of computing the trust score: First, we get the α-high-density set (where probable outliers are filtered out) by using Algorithm 1.
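The summary above can be sketched in code. Below is a minimal, hypothetical NumPy version, assuming Euclidean distances and a k-nearest-neighbor radius filter as a stand-in for Algorithm 1; the function names and parameter defaults are illustrative, not from the article:

```python
import numpy as np

def alpha_high_density_set(points, alpha=0.1, k=3):
    # Stand-in for Algorithm 1: drop the alpha fraction of points whose
    # distance to their k-th nearest neighbor is largest (probable outliers).
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    kth = np.sort(d, axis=1)[:, min(k, len(points) - 1)]
    return points[kth <= np.quantile(kth, 1 - alpha)]

def trust_score(train_X, train_y, x, predicted_class, alpha=0.1):
    # trust score = ĥ / h: the distance from x to the nearest class other
    # than the predicted one, divided by the distance to the predicted class.
    dist = {}
    for c in np.unique(train_y):
        dense = alpha_high_density_set(train_X[train_y == c], alpha)
        dist[c] = np.min(np.linalg.norm(dense - x, axis=1))
    h = dist[predicted_class]
    h_hat = min(v for c, v in dist.items() if c != predicted_class)
    return h_hat / max(h, 1e-12)
```

A score well above 1 means the test point sits much closer to its predicted class than to any other class, so the prediction is more likely to be trustworthy.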

6 days, 21 hours ago @ becominghuman.ai
When our immunity to Corona virus increases, will our resistance to Artificial Intelligence…

What we are witnessing is a spectacular collaboration of individual and collective human intelligence with Artificial intelligence.

The following are brief highlights of how individual, collective, and artificial intelligence have helped our fight for survival against the coronavirus.

Collective Intelligence: In the past few days, we have also witnessed the enormous potential of collective human intelligence augmented by A.I.

Organisations like the World Health Organisation are compiling all published research on the coronavirus into a unified database.

As our immunity to the coronavirus hopefully increases, our resistance to Artificial Intelligence is likely to decrease.

6 days, 21 hours ago @ becominghuman.ai
Industry Use-Cases of AI

With this whopping increase, there is going to be a higher demand for artificial intelligence training and AI certificate programs that can help in creating an Artificial Intelligence expert.

This requires the skillful implementation of different AI concepts in the current business process which can only be achieved by a proficient artificial intelligence expert.

We provide AI certification, and the AI certificate program has been curated per industry standards.

This AI training is an online program, and you will get a blend of both classroom and practical learning.

Upon successful completion of this program, you can successfully apply for different AI jobs.

6 days, 21 hours ago @ becominghuman.ai
Distill.pub
latest post 2 weeks, 6 days ago
Visualizing Neural Networks with the Grand Tour

By focusing on linear dimensionality reduction, we show how to visualize many dynamic phenomena in neural networks.

2 weeks, 6 days ago @ distill.pub
Zoom In: An Introduction to Circuits

By studying the connections between neurons, we can find meaningful algorithms in the weights of neural networks.

3 weeks, 5 days ago @ distill.pub
Growing Neural Cellular Automata

Training an end-to-end differentiable, self-organising cellular automata model of morphogenesis, able to both grow and regenerate specific patterns.

1 month, 3 weeks ago @ distill.pub
Visualizing the Impact of Feature Attribution Baselines

Exploring the baseline input hyperparameter, and how it impacts interpretations of neural network behavior.

2 months, 3 weeks ago @ distill.pub
Computing Receptive Fields of Convolutional Neural Networks

Detailed derivations and open-source code to analyze the receptive fields of convnets.

5 months ago @ distill.pub
The Paths Perspective on Value Learning

A closer look at how Temporal Difference Learning merges paths of experience for greater statistical efficiency

6 months, 1 week ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Robust Feature Leakage

8 months ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Discussion and Author Responses
8 months ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarial Example Researchers Need to Expand What is Meant by 'Robustness'

The main hypothesis in Ilyas et al. (2019) happens to be a special case of a more general principle that is commonly accepted in the robustness to distributional shift literature.

8 months ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Learning from Incorrectly Labeled Data

Section 3.2 of Ilyas et al. (2019) shows that training a model on only adversarial errors leads to non-trivial generalization on the original test set. We show that these experiments are a specific case of learning from errors.

8 months ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarial Examples are Just Bugs, Too

Refining the source of adversarial examples

8 months ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarially Robust Neural Style Transfer

An experiment showing adversarial robustness makes neural style transfer work on a non-VGG architecture

8 months ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Two Examples of Useful, Non-Robust Features

8 months ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features'

Six comments from the community and responses from the original authors

8 months ago @ distill.pub
Open Questions about Generative Adversarial Networks

What we'd like to find out about GANs that we don't know yet.

12 months ago @ distill.pub
The Gradient
latest post 1 day, 15 hours ago
A Speech-To-Text Practitioner’s Criticisms of Industry and Academia

This is a follow-up article to our article on building speech-to-text (STT) models, Towards an ImageNet Moment for Speech-to-Text.

In the first article we mostly focused on the practical aspects of building STT models.

Criticisms of the Industry: In general, the majority of STT papers we have read were written by researchers from the industry (e.g.

Most criticisms of STT papers and solutions can be attributed to either the "academic" part or the "industry" part of the researchers’ background.

The majority of modern STT papers usually just heavily overfit on the LibriSpeech ASR corpus (LibriSpeech) with increasingly more extravagant methods.

1 day, 15 hours ago @ thegradient.pub
Towards an ImageNet Moment for Speech-to-Text

Speech-to-text (STT), also known as automated-speech-recognition (ASR), has a long history and has made amazing progress over the past decade.

Introduction: Following the success and the democratization (the so-called "ImageNet moment", i.e.

This piece will describe our pursuit of an ImageNet moment for STT, which has so far not been found, particularly in the context of the Russian language.

(i) is easy to estimate just by looking at the model's performance during the first 20-25% of its epochs.

Another problem in speech is that ASR papers usually train for 50-500 epochs on the full LibriSpeech dataset.

1 week, 1 day ago @ thegradient.pub
Quantifying Independently Reproducible Machine Learning

My investigation into reproducible ML has also relied on personal notes and records hosted on Mendeley and GitHub.

http://phdcomics.com/comics/archive.php?comicid=1689
Our aversion to using or asking for the authors' code is more than a fear of working with undocumented research-grade code.

What Makes a ML Paper Reproducible?

The biggest factor is that we cannot take all of our assumptions about so-called reproducible ML at face value.

At the same time, our process and systems must result in reproducible work that does not lead us astray.

1 month, 4 weeks ago @ thegradient.pub
GPT-2 and the Nature of Intelligence

-- The AI system GPT-2, in a December 2019 interview with The Economist, "An artificial intelligence predicts the future"

Innateness, empiricism, and recent developments in deep learning: Consider two classic hypotheses about the development of language and cognition.

Consider GPT-2, an AI system that was recently featured in The New Yorker and interviewed by The Economist.

The popular blog Slate Star Codex featured it, too, in a post entitled "GPT-2 as a step towards General Intelligence".

Compared to any previous system for generating natural language, GPT-2 has a number of remarkable strengths.

I speak fluent English. If you run your experiments at talktotransformer.com, you will quickly learn th…

2 months, 1 week ago @ thegradient.pub
The Economics of AI Today

Every day we hear claims that Artificial Intelligence (AI) systems are about to transform the economy, creating mass unemployment and vast monopolies.

In September 2017, a group of distinguished economists gathered in Toronto to set out a research agenda for the Economics of Artificial Intelligence (AI).

Previous editions of the Economics of AI conference included papers about the impact of AI in sectors such as media or health-care.

Lack of diversity in the AI research workforce, and the increasing influence of the private sector in setting AI research (and ethical) agendas as part of the industrialization of AI research suggest that this could be a problem, but the evidence base is lackin…

2 months, 2 weeks ago @ thegradient.pub
Is NeurIPS Getting Too Big?

NeurIPS 2019, the latest incarnation of the Neural Information Processing Systems conference, wrapped up just over a week ago.

No, that's a keynote at #NeurIPS2019 pic.twitter.com/nJjONGzJww — Jevgenij Gamper (@brutforcimag), December 11, 2019. NeurIPS poster session: too crowded.

Lots of Posters/Talks/Topics: The other primary purpose of any research conference is to inform attendees of new research and inspire new ideas.

NeurIPS 2019, Vancouver, Canada: Got the visa 3 weeks before. :(

2019 NeurIPS was last week in Vancouver.

3 months, 2 weeks ago @ thegradient.pub
An Epidemic of AI Misinformation

Unfortunately, the problem of overhyped AI extends beyond the media itself.

General AI still seems like it might be a couple of decades away, sixty years after the first optimistic projections were issued.

Hundreds of deep learning for radiology companies have been spawned in the meantime, but thus far no actual radiologists have been replaced, and the best guess is that deep learning can augment radiologists, but not, in the near term, replace them.

The net consequences could, in the end, debilitate the field, paradoxically inducing an AI winter after initially helping stimulate public interest.

If an AI system is allegedly better than humans, then which humans, and how much better?

4 months, 1 week ago @ thegradient.pub
Introduction to Artificial Life for People who Like AI

Artificial Life is often shortened to ALife.

NEAT was awarded the 2017 International Society for Artificial Life Award for Outstanding Paper of the Decade.

First, I think we are seeing the first signs of the next AI winter, a period when people lose confidence in AI research and funding dries up.

Art ALife: “Edge of Chaos: Artificial Life based interactive art installation” by Vasilija Abramovic and Ruairi Glynn. Have you heard about the edge of chaos?

She was recently elected to the board of the International Society for Artificial Life.

4 months, 1 week ago @ thegradient.pub
How Machine Learning Can Help Unlock the World of Ancient Japan

However, these models were unable to achieve strong performance on Kuzushiji recognition.

This was due to inadequate understanding of Japanese historical literature in the optical character recognition (OCR) community and the lack of high quality standardized datasets.

There are several reasons why Kuzushiji recognition is challenging:Capturing both local and global context is important.

This is one reason why conventional sequence models do not have the capability to work well with many Kuzushiji documents.

However there are many other types of Kuzushiji text that a person might want to transcribe.

4 months, 2 weeks ago @ thegradient.pub
Gaussian Processes, not quite for dummies

Note: if all k components are independent Gaussian random variables, then $X$ must be multivariate Gaussian (because the sum of independent Gaussian random variables is always Gaussian).

Higher-dimensional Gaussians: 5D Gaussian. Now we can consider a higher-dimensional Gaussian, starting from 5D, so the covariance matrix is now 5x5.

We then take K and add $I\sigma_y^2$ for the final covariance matrix to factor in noise -- more on this later.

This means in principle, we can calculate this covariance matrix for any real-valued $x_1$ and $x_2$ by simply plugging them in.
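As a concrete sketch of that "plugging in" step (a minimal example assuming a squared-exponential kernel; the length scale and noise level here are arbitrary choices, not values from the article):

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0):
    # Squared-exponential kernel: k(a, b) = exp(-(a - b)^2 / (2 * l^2))
    return np.exp(-((x1 - x2) ** 2) / (2 * length_scale ** 2))

x = np.array([-1.0, 0.0, 1.0, 2.5])          # any real-valued inputs
sigma_y = 0.1                                 # observation-noise std (assumed)
K = rbf_kernel(x[:, None], x[None, :])        # 4x4 covariance matrix
K_noisy = K + np.eye(len(x)) * sigma_y ** 2   # K + I * sigma_y^2, as described
```

The resulting matrix is symmetric with ones on the diagonal (before noise), and adding $I\sigma_y^2$ only lifts the diagonal, which is exactly the noise term mentioned above.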

Gaussian Process, textbook definition: From the above derivation, you can view a Gaussian process as a generalization of multivariate G…

4 months, 3 weeks ago @ thegradient.pub
Evaluation Metrics for Language Modeling

Counterintuitively, having more metrics actually makes it harder to compare language models, especially as indicators of how well a language model will perform on a specific downstream task are often unreliable.

Despite the presence of these downstream evaluation benchmarks, traditional intrinsic metrics are, nevertheless, extremely useful during the process of training the language model itself.

Proof: let P be the distribution of the underlying language and Q be the distribution learned by a language model.
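Spelled out, the key step of that proof is that the cross-entropy of Q under P decomposes into the entropy of P plus a non-negative KL term:

```latex
H(P, Q) \;=\; -\sum_{x} P(x)\log Q(x)
       \;=\; H(P) + D_{\mathrm{KL}}(P \,\|\, Q)
       \;\ge\; H(P),
```

with equality exactly when Q = P; the model's perplexity is then $2^{H(P,Q)}$ (for base-2 logarithms), so it can never fall below the perplexity of the underlying language itself.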

The performance of N-gram language models does not improve much as N goes above 4, whereas the performance of neural language models continues improving over time.

In less than two years,…

5 months, 2 weeks ago @ thegradient.pub
The State of Machine Learning Frameworks in 2019

Since deep learning regained prominence in 2012, many machine learning frameworks have clamored to become the new favorite among researchers and industry practitioners.

It is perhaps underappreciated how much machine learning frameworks shape ML research.

Machine learning research itself is also in a massive state of flux.

Most of us don't work on machine learning software for the money or to assist in our company's strategic plans.

We work in machine learning because we care - about advancing machine learning research, about democratizing AI, or maybe just about building cool stuff.

5 months, 4 weeks ago @ thegradient.pub
The #BenderRule: On Naming the Languages We Study and Why It Matters

This has led to a digital divide in the field of NLP between high resource and low resource languages.

High resource languages constitute a short list starting with English, (Mandarin) Chinese, Arabic, and French.

And yet, the field of NLP is caught in a negative feedback loop that hinders the expansion of the languages we work on.

Work on languages other than English is often considered “language specific” and thus reviewed as less important than equivalent work on English.

Many NLP systems for Chinese, Japanese, Thai and other languages have to start with the problem of word tokenization.

6 months, 3 weeks ago @ thegradient.pub
NLP's Clever Hans Moment has Arrived

However, the model doesn't care about this impossibility and identifies the correct warrant with 71 percent accuracy.

Coming back to the paper, the authors point to a (again, depressingly) large amount of recent work reporting Clever Hans effects in NLP datasets.

For a broader view on this topic, also see Ana Marasović's article on NLP's Generalization Problem.

The growing number of papers finding cases of the Clever Hans effect raises important questions for NLP research, the most obvious one being how the effect can be prevented.

If not much, the dataset may provide unintended non-content cues, such as sentence length or distribution of function words.

7 months, 1 week ago @ thegradient.pub
Introducing Retrospectives: 'Real Talk' for your Past Papers

What the community still lacks, though, are incentives for publicly documenting our real thoughts and feelings about our past papers.

Today, we’re launching ML Retrospectives, a website for hosting reflections and critiques of researchers’ own past papers that we’re calling retrospectives.

With the clearing of this emotional weight it became easier to look at my past papers.

ML Retrospectives is a platform for hosting retrospectives: documents where researchers write honestly about their past papers.

While a venue for critiquing other people’s papers might also be valuable, we wanted to focus on normalizing sharing drawbacks of your own past papers.

7 months, 2 weeks ago @ thegradient.pub
🔬 Science
arXiv.org
latest post 7 hours ago
Long-term prediction of chaotic systems with recurrent neural networks. (arXiv:2004.01258v1 [cs.LG])

Donate to arXiv: Please join the Simons Foundation and our generous member organizations in supporting arXiv during our giving campaign, September 23-27.

100% of your contribution will fund improvements and new initiatives to benefit arXiv's global scientific community.

7 hours ago @ arxiv.org
Towards Relevance and Sequence Modeling in Language Recognition. (arXiv:2004.01221v1 [eess.AS])

7 hours ago @ arxiv.org
Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models. (arXiv:2004.01215v1 [cs.LG])

7 hours ago @ arxiv.org
The data-driven physical-based equations discovery using evolutionary approach. (arXiv:2004.01680v1 [cs.NE])

7 hours ago @ arxiv.org
Benchmarking Deep Spiking Neural Networks on Neuromorphic Hardware. (arXiv:2004.01656v1 [cs.NE])

7 hours ago @ arxiv.org
Generating Similarity Map in COVID-19 Transmission Dynamics with Topological Autoencoder. (arXiv:2004.01481v1 [physics.soc-ph])

7 hours ago @ arxiv.org
Neural Architecture Generator Optimization. (arXiv:2004.01395v1 [cs.LG])

7 hours ago @ arxiv.org
Does Comma Selection Help To Cope With Local Optima. (arXiv:2004.01274v1 [cs.NE])

7 hours ago @ arxiv.org
Under the Hood of Neural Networks: Characterizing Learned Representations by Functional Neuron Populations and Network Ablations. (arXiv:2004.01254v1 [cs.NE])

7 hours ago @ arxiv.org
Predicting the outputs of finite networks trained with noisy gradients. (arXiv:2004.01190v1 [stat.ML])

7 hours ago @ arxiv.org
Embedding Expansion: Augmentation in Embedding Space for Deep Metric Learning. (arXiv:2003.02546v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Learning to Shadow Hand-drawn Sketches. (arXiv:2002.11812v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Rotation, Translation, and Cropping for Zero-Shot Generalization. (arXiv:2001.09908v2 [cs.LG] UPDATED)

7 hours ago @ arxiv.org
RoutedFusion: Learning Real-time Depth Map Fusion. (arXiv:2001.04388v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
In Defense of Grid Features for Visual Question Answering. (arXiv:2001.03615v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
arXiv.org
latest post 7 hours ago
Improving Image Autoencoder Embeddings with Perceptual Loss. (arXiv:2001.03444v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
UrbanLoco: A Full Sensor Suite Dataset for Mapping and Localization in Urban Scenes. (arXiv:1912.09513v2 [cs.RO] UPDATED)

7 hours ago @ arxiv.org
C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds. (arXiv:1912.07009v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
LatticeNet: Fast Point Cloud Segmentation Using Permutohedral Lattices. (arXiv:1912.05905v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation. (arXiv:1912.04573v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Learning a Neural 3D Texture Space from 2D Exemplars. (arXiv:1912.04158v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Learning to Super Resolve Intensity Images from Events. (arXiv:1912.01196v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
G-TAD: Sub-Graph Localization for Temporal Action Detection. (arXiv:1911.11462v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Unsupervised Representation Learning for Gaze Estimation. (arXiv:1911.06939v4 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Real-world adversarial attack on MTCNN face detection system. (arXiv:1910.06261v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Self-Paced Deep Regression Forests for Facial Age Estimation. (arXiv:1910.03244v4 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Multi-path Learning for Object Pose Estimation Across Domains. (arXiv:1908.00151v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Batch-Shaping for Learning Conditional Channel Gated Networks. (arXiv:1907.06627v4 [cs.LG] UPDATED)

7 hours ago @ arxiv.org
What Makes Training Multi-Modal Classification Networks Hard?. (arXiv:1905.12681v5 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Rethinking Classification and Localization for Object Detection. (arXiv:1904.06493v4 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Papers With Code
latest post 7 hours ago
Supervised Raw Video Denoising with a Benchmark Dataset on Dynamic Scenes

In recent years, the supervised learning strategy for real noisy image denoising has been emerging and has achieved promising results.

In this way, we construct a dataset with 55 groups of noisy-clean videos with ISO values ranging from 1600 to 25600.

Correspondingly, we propose a raw video denoising network (RViDeNet) by exploring the temporal, spatial, and channel correlations of video frames.

Since the raw video has Bayer patterns, we pack it into four sub-sequences, i.e., RGBG sequences, which are denoised by the proposed RViDeNet separately and finally fused into a clean video.

Experimental results demonstrate that our method outperforms state-of-the-art video and raw image denoising alg…
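The Bayer packing step described here can be sketched in a few lines of numpy; the RGGB layout and the plane order below are illustrative assumptions (the camera's actual pattern may differ):

```python
import numpy as np

def pack_bayer(raw):
    """Pack a single-channel Bayer frame (H, W) into four half-resolution
    planes (4, H/2, W/2), one per sensor site in each 2x2 cell.
    Assumes an RGGB layout; the real pattern depends on the camera."""
    return np.stack([
        raw[0::2, 0::2],  # R
        raw[0::2, 1::2],  # G1
        raw[1::2, 0::2],  # G2
        raw[1::2, 1::2],  # B
    ])

def unpack_bayer(planes):
    """Invert pack_bayer: interleave the four planes back into (H, W)."""
    _, h, w = planes.shape
    raw = np.empty((2 * h, 2 * w), dtype=planes.dtype)
    raw[0::2, 0::2] = planes[0]
    raw[0::2, 1::2] = planes[1]
    raw[1::2, 0::2] = planes[2]
    raw[1::2, 1::2] = planes[3]
    return raw

frame = np.arange(16, dtype=np.float32).reshape(4, 4)
packed = pack_bayer(frame)
assert packed.shape == (4, 2, 2)
assert np.array_equal(unpack_bayer(packed), frame)
```

Each half-resolution plane then holds samples of a single color site, which is what lets a network treat the four sub-sequences as aligned channels.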

7 hours ago @ paperswithcode.com
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Analysis and a New Strategy

Unfortunately, current methods are mostly developed for high-level vision tasks (e.g., classification) and few are studied for low-level vision tasks (e.g., image restoration)...

In this paper, we provide a comprehensive analysis of the existing augmentation methods applied to the super-resolution task.

We find that methods that discard or manipulate pixels or features too aggressively hamper image restoration, where spatial relationships are very important.

By doing so, the model can understand "how much", instead of blindly learning to apply super-resolution to every given pixel.

We also show that our method improves other low-level vision tasks, such as denoising and compression ar…

7 hours ago @ paperswithcode.com
Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

20 hours ago @ paperswithcode.com
Inverting Gradients -- How easy is it to break privacy in federated learning?

The idea of federated learning is to collaboratively train a neural network on a server.

Each user receives the current weights of the network and in turns sends parameter updates (gradients) based on local data...

This protocol has been designed not only to train neural networks data-efficiently, but also to provide privacy benefits for users, as their input data remains on device and only parameter gradients are shared.

In this paper we show that sharing parameter gradients is by no means secure: By exploiting a cosine similarity loss along with optimization methods from adversarial attacks, we are able to faithfully reconstruct images at high resolution from the knowledge of their parame…
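The leak the paper exploits can be seen in miniature with a single linear unit and a squared loss; this toy setup is an illustrative assumption, not the paper's deep-network attack (which optimizes a cosine similarity loss between gradients):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=8)          # shared model weights (known to the server)
x_private = rng.normal(size=8)  # client's private input (never shared)
y = 1.0                         # client's private label

# The client computes and shares the gradient of a squared loss (w.x - y)^2:
residual = w @ x_private - y
g_shared = 2.0 * residual * x_private  # d/dw (w.x - y)^2

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# For this single linear unit the gradient is a scalar multiple of the
# input, so its direction alone reveals the private input up to sign/scale.
x_recovered = g_shared / np.linalg.norm(g_shared)
print(abs(cosine(x_recovered, x_private)))  # ~1.0: the direction is fully leaked
```

For deep networks the relation is no longer a simple rescaling, which is why the paper matches gradient directions by optimizing a dummy input instead.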

1 day, 21 hours ago @ paperswithcode.com
DeepLPF: Deep Local Parametric Filters for Image Enhancement

Beyond global adjustments, professional image editing programs provide local adjustment tools operating on specific parts of an image... Options include parametric (graduated, radial filters) and unconstrained brush tools.

These highly expressive tools enable a diverse set of local image enhancements.

State-of-the-art automated image enhancement approaches typically focus on learning pixel-level or global enhancements.

In this paper, we introduce a novel approach to automatically enhance images using learned spatially local filters of three different types (Elliptical Filter, Graduated Filter, Polynomial Filter).

We introduce a deep neural network, dubbed Deep Local Parametric Filters (Deep…
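A graduated filter of the kind listed above can be sketched as a spatial mask; this is an illustrative toy version (the linear ramp, row-wise geometry, and parameters are assumptions, not DeepLPF's learned formulation):

```python
import numpy as np

def graduated_filter_mask(h, w, y0, y1):
    """Illustrative graduated filter: an adjustment strength that is full
    above row y0, ramps down linearly, and vanishes below row y1,
    like a graduated ND filter in photo editors. Assumes y0 < y1."""
    rows = np.arange(h, dtype=np.float64)
    strength = np.clip((y1 - rows) / (y1 - y0), 0.0, 1.0)
    return np.repeat(strength[:, None], w, axis=1)

mask = graduated_filter_mask(4, 3, 1, 3)
# rows go from full strength (1.0) at the top to none (0.0) at the bottom
```

A local enhancement would then blend an adjusted image with the original using this mask as a per-pixel weight.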

1 day, 21 hours ago @ paperswithcode.com
Heterogeneous Network Representation Learning: Survey, Benchmark, Evaluation, and Beyond

Since real-world objects and their interactions are often multi-modal and multi-typed, heterogeneous networks have been widely used as a more powerful, realistic, and generic superclass of traditional homogeneous networks (graphs).

Meanwhile, representation learning (a.k.a. embedding) has recently been intensively studied and shown effective for various network mining and analytical tasks...

Since there has already been a broad body of heterogeneous network embedding (HNE) algorithms but no dedicated survey, as the first contribution of this work, we pioneer in providing a unified paradigm for the systematic categorization and analysis over the merits of various existing HNE algorithms.

Moreo…

1 day, 21 hours ago @ paperswithcode.com
Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image

Humans perceive the 3D world as a set of distinct objects that are characterized by various low-level (geometry, reflectance) and high-level (connectivity, adjacency, symmetry) properties.

Recent methods based on convolutional neural networks (CNNs) demonstrated impressive progress in 3D reconstruction, even when using a single 2D image as input...

However, the majority of these methods focuses on recovering the local 3D geometry of an object without considering its part-based decomposition or relations between parts.

We address this challenging problem by proposing a novel formulation that allows us to jointly recover the geometry of a 3D object as a set of primitives as well as their latent …

1 day, 21 hours ago @ paperswithcode.com
Tracking Objects as Points

Tracking has traditionally been the art of following interest points through space and time.

In this paper, we present a simultaneous detection and tracking algorithm that is simpler, faster, and more accurate than the state of the art.

It achieves 67.3% MOTA on the MOT17 challenge at 22 FPS and 89.4% MOTA on the KITTI tracking benchmark at 15 FPS, setting a new state of the art on both datasets.

CenterTrack is easily extended to monocular 3D tracking by regressing additional 3D attributes.

Using monocular video input, it achieves 28.3% AMOTA@0.2 on the newly released nuScenes 3D tracking benchmark, substantially outperforming the monocular baseline on this benchmark while running a…
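The MOTA scores quoted here combine three error types into one number; a minimal sketch of the standard definition (the example counts are made up for illustration):

```python
def mota(fn, fp, idsw, num_gt):
    """Multiple Object Tracking Accuracy:
    MOTA = 1 - (FN + FP + IDSW) / total ground-truth objects,
    where FN = misses, FP = false positives, IDSW = identity switches."""
    return 1.0 - (fn + fp + idsw) / num_gt

# e.g. 100 GT boxes, 10 misses, 5 false positives, 2 identity switches
score = mota(10, 5, 2, 100)  # 0.83
```

Note that MOTA can go negative when the tracker produces more errors than there are ground-truth objects.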

1 day, 21 hours ago @ paperswithcode.com
Monocular Camera Localization in Prior LiDAR Maps with 2D-3D Line Correspondences

Light-weight camera localization in existing maps is essential for vision-based navigation.

Currently, visual and visual-inertial odometry (VO&VIO) techniques are well-developed for state estimation but with inevitable accumulated drifts and pose jumps upon loop closure... To overcome these problems, we propose an efficient monocular camera localization method in prior LiDAR maps using directly estimated 2D-3D line correspondences.

With the pose prediction from VIO, we can efficiently obtain coarse 2D-3D line correspondences.

After that, the camera poses and 2D-3D correspondences are iteratively optimized by minimizing the projection error of correspondences and rejecting outliers.

The exp…

2 days, 13 hours ago @ paperswithcode.com
Revisiting Pose-Normalization for Fine-Grained Few-Shot Recognition

Few-shot, fine-grained classification requires a model to learn subtle, fine-grained distinctions between different classes (e.g., birds) based on a few images alone.

This requires a remarkable degree of invariance to pose, articulation and background... A solution is to use pose-normalized representations: first localize semantic parts in each image, and then describe images by characterizing the appearance of each part.

While such representations are out of favor for fully supervised classification, we show that they are extremely effective for few-shot fine-grained classification.

With a minimal increase in model capacity, pose normalization improves accuracy between 10 and 20 percentage…

2 days, 13 hours ago @ paperswithcode.com
DeepSIBA: Chemical Structure-based Inference of Biological Alterations

Predicting whether a chemical structure shares a desired biological effect can have a significant impact for in-silico compound screening in early drug discovery.

The proposed model was able to learn new representations from chemical structures and identify structurally dissimilar compounds that affect similar biological processes with high precision.

Additionally, by utilizing deep ensembles to estimate uncertainty, we were able to provide reliable and accurate predictions for chemical structures that are very different from the ones used during training.

Finally, we present a novel inference approach, where the trained models are used to estimate the signaling pathways affected by a compo…

2 days, 13 hours ago @ paperswithcode.com
Adversarial Learning for Personalized Tag Recommendation

We have recently seen great progress in image classification due to the success of deep convolutional neural networks and the availability of large-scale datasets.

Most of the existing work focuses on single-label image classification...

In this paper, we address the problem of personalized tag recommendation and propose an end-to-end deep network which can be trained on large-scale datasets.

A joint training of user-preference and visual encoding allows the network to efficiently integrate the visual preference with tagging behavior for a better user recommendation.

In addition, we propose the use of adversarial learning, which enforces the network to predict tags resembling user-generated…

2 days, 13 hours ago @ paperswithcode.com
Background Matting: The World is Your Green Screen

Most existing matting methods require a green screen background or a manually created trimap to produce a good matte... Automatic, trimap-free methods are appearing, but are not of comparable quality.

In our trimap free approach, we ask the user to take an additional photo of the background without the subject at the time of capture.

We first train a matting network with supervised loss on ground truth data with synthetic composites.

To bridge the domain gap to real imagery with no labeling, we train another matting network guided by the first network and by a discriminator that judges the quality of composites.

We demonstrate results on a wide variety of photos and videos and show signific…
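The synthetic composites used in the supervised stage follow the standard matting equation, I = alpha * F + (1 - alpha) * B; a minimal numpy sketch (the tiny images here are illustrative):

```python
import numpy as np

def composite(fg, bg, alpha):
    """Standard matting equation: I = alpha * F + (1 - alpha) * B.
    fg, bg: (H, W, 3) float images; alpha: (H, W) matte in [0, 1]."""
    return alpha[..., None] * fg + (1.0 - alpha[..., None]) * bg

fg = np.ones((2, 2, 3))                      # white foreground
bg = np.zeros((2, 2, 3))                     # black background
alpha = np.array([[1.0, 0.5], [0.0, 0.25]])  # per-pixel opacity
img = composite(fg, bg, alpha)
```

Matting is the inverse problem: given I (and here a clean photo of B), recover alpha and F.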

2 days, 13 hours ago @ paperswithcode.com
PaStaNet: Toward Human Activity Knowledge Engine

…from image to activity concepts, which may encounter a performance bottleneck due to the huge gap.

In light of this, we propose a new path: infer human part states first and then reason out the activities based on part-level semantics... Human Body Part States (PaSta) are fine-grained action semantic tokens, e.g. …, which can compose the activities and help us step toward a human activity knowledge engine.

To fully utilize the power of PaSta, we build a large-scale knowledge base PaStaNet, which contains 7M+ PaSta annotations.

Promoted by PaStaNet, our method achieves significant improvements, e.g.

2 days, 13 hours ago @ paperswithcode.com
Scene-Adaptive Video Frame Interpolation via Meta-Learning

Video frame interpolation is a challenging problem because there are different scenarios for each video depending on the variety of foreground and background motion, frame rate, and occlusion.

It is therefore difficult for a single network with fixed parameters to generalize across different videos...

Ideally, one could have a different network for each scenario, but this is computationally infeasible for practical applications.

We first show the benefits of 'test-time adaptation' through simple fine-tuning of a network, then we greatly improve its efficiency by incorporating meta-learning.

Finally, we show that our meta-learning framework can be easily employed to any video frame interpola…

2 days, 13 hours ago @ paperswithcode.com
📝 Cool Blogs
ODS.ai Habr
latest post 1 week ago
The Spread of a Spherical Horse in a Vacuum Across Russia

Greetings from ODS. We took up tutu.ru's idea of working with their dataset of passenger traffic across Russia. And while Milfgard's post offers a huge table of conclusions and popular-science commentary, we want to tell you what is under the hood.

What, yet another post about COVID-19? Yes and no. We were interested in this specifically from the standpoint of mathematical methods and of working with an interesting dataset. Before you see the pretty pictures and charts below the cut, I have to say a few things: any modeling is a very complex process, with an incredible number of IFs and SUPPOSEs inside. We will talk about them.

Those who worked on this article are not epidemiologists or virologists. We are just a group of graph-theory enthusiasts practicing me…

1 week ago @ habr.com
"Reading Papers for You" Column: January - February 2020

Hi, Habr! We continue publishing reviews of research papers by members of the Open Data Science community from the #article_essense channel. Want to get them before everyone else? Join the community!

Reviews of 11 papers on Computer Vision, Natural Language Processing, Reinforcement Learning, and other topics are presented. Read more →

2 weeks, 2 days ago @ habr.com
Tuning the Loss Function for a Neural Network on Seismic Survey Data

In the previous article we described an experiment to determine the minimum amount of manually labeled slices needed to train a neural network on seismic survey data. Today we continue that topic by choosing the most suitable loss function. We examine two basic classes of functions, Binary Cross Entropy and Intersection over Union, in six variants with parameter tuning, as well as combinations of functions from different classes. We also consider regularizing the loss function. Spoiler: we managed to substantially improve the network's prediction quality. Read more →
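The two loss classes compared in the article can be sketched in a few lines of numpy; the soft IoU formulation, the alpha weighting, and the epsilon values below are illustrative assumptions, not the article's exact variants:

```python
import numpy as np

def bce_loss(p, t, eps=1e-7):
    """Binary cross entropy, averaged over pixels (p: predicted
    probabilities, t: binary ground-truth mask)."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(t * np.log(p) + (1 - t) * np.log(1 - p)))

def soft_iou_loss(p, t, eps=1e-7):
    """1 - soft Jaccard index, with intersection/union on probabilities."""
    inter = np.sum(p * t)
    union = np.sum(p) + np.sum(t) - inter
    return float(1 - (inter + eps) / (union + eps))

def combined_loss(p, t, alpha=0.5):
    """Weighted combination of the two loss classes."""
    return alpha * bce_loss(p, t) + (1 - alpha) * soft_iou_loss(p, t)

t = np.array([[1.0, 1.0], [0.0, 0.0]])      # ground-truth mask
good = np.array([[0.9, 0.8], [0.1, 0.2]])   # confident, mostly correct
bad = np.array([[0.2, 0.1], [0.9, 0.8]])    # mostly wrong
assert combined_loss(good, t) < combined_loss(bad, t)
```

BCE scores each pixel independently, while the IoU term judges region overlap, which is why blending them is a natural experiment for segmentation.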

1 month, 2 weeks ago @ habr.com
An Open "Deep Learning in NLP" Course from the Creators of DeepPavlov, Based on cs224n

Hi everyone!

If you have a question about the course, check the Q&A section below.

Introduction

My name is Alexey Klokov, and I want to tell you about the launch of a great Natural Language Processing course, run once again by the MIPT folks behind DeepPavlov, an open library for conversational AI developed in the MIPT Laboratory of Neural Systems and Deep Learning. My thanks to them and to Moryshka for permission to cover this topic in our ODS blog on Habr. So, let's go! Read more →

2 months ago @ habr.com
"Reading Papers for You" Column: October - December 2019

Hi, Habr! We continue publishing reviews of research papers by members of the Open Data Science community from the #article_essense channel. Want to get them before everyone else? Join the community!

Today's papers: Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring (Facebook, 2019)

Implicit Discriminator in Variational Autoencoder (Indian Institute of Technology Ropar, 2019)

Self-training with Noisy Student improves ImageNet classification (Google Research, Carnegie Mellon University, 2019)

Momentum Contrast for Unsupervised Visual Representation Learning (Facebook, 2019)

Benchmarking Neural Network Robustness to Common Corruptions and …

2 months, 1 week ago @ habr.com
SVM: An Explanation from Scratch and a Python Implementation. A Detailed Walkthrough of Support Vector Machines

Hello to everyone who has chosen the path of the ML samurai!

Introduction:

In this article we look at the Support Vector Machine (SVM) for the classification task. We present the core idea of the algorithm, derive its weight updates, and walk through a simple from-scratch implementation. Using an example dataset, we demonstrate how the written algorithm works on linearly separable/non-separable data and visualize training/prediction. We also discuss the algorithm's pros and cons and its modifications. Figure 1. Photo of an iris flower from open sources. Read more →
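As a taste of such a from-scratch implementation, here is a minimal numpy sketch of a linear SVM trained by sub-gradient descent on the regularized hinge loss; the hyperparameters and toy data are illustrative, not taken from the article:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM trained by sub-gradient descent on the regularized hinge
    loss: lam*||w||^2 + mean(max(0, 1 - y*(X.w + b))), labels y in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1  # only margin violators contribute a sub-gradient
        grad_w = 2 * lam * w - (y[viol][:, None] * X[viol]).sum(axis=0) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w, b, X):
    return np.sign(X @ w + b)

# Toy linearly separable data
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = train_linear_svm(X, y)
assert (predict(w, b, X) == y).all()
```

The regularization weight lam controls the trade-off between margin width and training errors; kernelized variants replace the dot products with a kernel function.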

2 months, 2 weeks ago @ habr.com
TensorRT 6.x.x.x: High-Performance Inference for Deep Learning Models (Object Detection and Segmentation)

It only hurts the first time! Hi everyone! Dear friends, in this article I want to share my experience using TensorRT and RetinaNet based on the github.com/aidonchuk/retinanet-examples repository (a fork of NVIDIA's official repo that lets you start using optimized models in production in the shortest time). Scrolling through messages in the ods.ai community channels, I keep running into questions about using TensorRT, and the questions mostly repeat, so I decided to write as complete a guide as possible to fast inference based on TensorRT, RetinaNet, Unet, and Docker. Read more →

2 months, 2 weeks ago @ habr.com
The Lacmus Project: How Computer Vision Helps Save Lost People

Hi everyone! You may already know about the Machine Learning for Social Good initiative (#ml4sg) of the Open Data Science community, within which enthusiasts apply machine learning methods free of charge to solve socially significant problems. We, the Lacmus project team (#proj_rescuer_la), work on deploying modern deep learning solutions to search for people lost outside populated areas: in forests, fields, and so on. Read more →

2 months, 3 weeks ago @ habr.com
Experiments with Neural Networks on Seismic Survey Data

Interpreting seismic survey data is difficult because each task requires an individual approach, since every such dataset is unique. Manual processing takes significant effort, and the result often contains human-factor errors. Using neural networks for interpretation can substantially reduce manual labor, but the uniqueness of the data limits how much of this work can be automated. This article describes an experiment analyzing the applicability of neural networks for automating the extraction of geological layers in 2D images, using fully labeled data from the North Sea as an example. Figure 1. Conducting aqu…

2 months, 3 weeks ago @ habr.com
Making PyTorch and C++ Friends: Using TorchScript

About a year ago the PyTorch developers introduced TorchScript, a tool that, with a couple of lines of code and a few mouse clicks, turns a Python pipeline into a standalone artifact that can be embedded in a C++ system. Below I share my experience using it and try to describe the pitfalls along the way. I pay special attention to building the project on Windows, because although ML research is usually done on Ubuntu, the final solution is often (surprise!) required on Windows. Code examples for exporting a model, and for a C++ project that uses the model, can be found in the GitHub repository. Read more →

3 months, 3 weeks ago @ habr.com
On Structural Modeling of Organizational Change

75%, or 3 out of 4: that is how the Boston Consulting Group estimates the share of IT projects that died for non-technical reasons. For two consecutive editions now, the Project Management Body of Knowledge (PMBOK) has singled out stakeholder management processes as a separate knowledge area under the lucky number 13 and strongly recommends accounting for: 1. the connections between stakeholders,

2. centers of influence, and 3. the culture of communication, in order to raise the chances of success.

One question remains: how long will engineers keep judging stakeholders by guesswork? PHOTO: Sharif Hamza for Dazed & Confuzed, model Lupita Nyong'o. In light of the recent unconditional victory of Russian mathematics over the question of chromatic numbers, let us consider a scenario for applying the rapidly gaining pop…

4 months ago @ habr.com
How I Tackled the data-like Machine Learning Competition

Hi, Habr. A competition by Tinkoff and McKinsey took place recently. It ran in two stages: the first was a qualifier in Kaggle format, i.e., you submit predictions and receive a quality score; whoever has the best score wins. The second was an onsite hackathon in Moscow for the top 20 teams from the first stage. In this article I will talk about the qualifying stage, where I managed to take first place and win a MacBook. The team on the leaderboard was called "deti Lyoshi" ("Lyosha's kids"). The competition ran from September 19 to October 12. I started solving it exactly one week before the end and worked almost full-time.

A brief description of the competition:

Last summer, stories (as in Instagram) appeared in the Tinkoff banking app. On s…

4 months, 1 week ago @ habr.com
"Reading Papers for You" Column: July - September 2019

Hi, Habr! We continue publishing reviews of research papers by members of the Open Data Science community from the #article_essense channel. Want to get them before everyone else? Join the community!

Today's papers: Layer rotation: a surprisingly powerful indicator of generalization in deep networks? (Université catholique de Louvain, Belgium, 2018)

Parameter-Efficient Transfer Learning for NLP (Google Research, Jagiellonian University, 2019) RoBERTa: A Robustly Optimized BERT Pretraining Approach (University of Washington, Facebook AI, 2019)

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (Google Research, 2019)

How the Brain Transitions from Conscious to Subliminal Percept…

5 months, 2 weeks ago @ habr.com
"Reading Papers for You" Column: January - June 2019

Hi, Habr! We continue publishing reviews of research papers by members of the Open Data Science community from the #article_essense channel. Want to get them before everyone else? Join the community!

Today's papers: Neural Ordinary Differential Equations (University of Toronto, 2018)

Semi-Unsupervised Learning with Deep Generative Models: Clustering and Classifying using Ultra-Sparse Labels (University of Oxford, The Alan Turing Institute, London, 2019)

Uncovering and Mitigating Algorithmic Bias through Learned Latent Structure (Massachusetts Institute of Technology, Harvard University, 2019)

Deep reinforcement learning from human preferences (OpenAI, DeepMind, 2017)

Exploring Randomly Wired Neural …

5 months, 3 weeks ago @ habr.com
Building a Dataset for Meter Recognition with Yandex.Toloka

About two years ago, having accidentally turned on the TV, I saw an interesting segment on the "Vesti" news program. It reported that the Moscow Department of Information Technology was building a neural network that would read water meter values from photographs. In the segment, the host asked city residents to help the project and send photos of their meters to the mos.ru portal so the neural network could be trained on them. If you are a Moscow city department, airing a spot on a federal channel and asking people to send in meter images is not a big problem. But what do you do if you are a small startup and cannot advertise on TV? How do you get 50,000 meter images in that case? Read …

5 months, 4 weeks ago @ habr.com
inFERENCe
last post 4 months, 3 weeks ago
Meta-Learning Millions of Hyper-parameters using the Implicit Function Theorem

November 14, 2019. Meta-Learning Millions of Hyper-parameters using the Implicit Function Theorem. Last night on the train I read this nice paper by David Duvenaud and colleagues.

Implicit Function Theorem. Many, though not all, meta-learning or hyperparameter optimization problems can be stated as nested optimization problems.

Using a finite truncation of the Neumann series, one can approximate the inverse Hessian in the following way: $$\left[\frac{\partial^2 \mathcal{L}_T}{\partial \theta \partial \theta}\right]^{-1} \approx \sum_{i=0}^{j} \left(I - \frac{\partial^2 \mathcal{L}_T}{\partial \theta \partial \theta}\right)^i.$$

Most crucially, methods based on implicit gradients assume that your le…
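The truncated Neumann series can be checked numerically. A toy sketch (the 2×2 "Hessian" below is invented; the series $\sum_i (I - H)^i$ converges to $H^{-1}$ whenever the eigenvalues of $H$ lie in $(0, 2)$):

```python
import numpy as np

# Hypothetical toy "Hessian": symmetric, eigenvalues in (0, 2),
# so sum_{i>=0} (I - H)^i converges to H^{-1}.
H = np.array([[1.0, 0.2],
              [0.2, 0.8]])
v = np.array([1.0, -1.0])

def neumann_inv_hvp(H, v, j=50):
    """Approximate H^{-1} v by the truncated Neumann series
    sum_{i=0}^{j} (I - H)^i v, using only matrix-vector products."""
    term = v.copy()   # (I - H)^0 v
    acc = v.copy()
    for _ in range(j):
        term = term - H @ term   # multiply by (I - H)
        acc = acc + term
    return acc

approx = neumann_inv_hvp(H, v)
exact = np.linalg.solve(H, v)
print(np.allclose(approx, exact, atol=1e-6))  # True
```

In the hyperparameter-optimization setting the same recursion is applied to Hessian-vector products of the training loss, so the inverse Hessian is never formed explicitly.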

4 months, 3 weeks ago @ inference.vc
The secular Bayesian: Using belief distributions without really believing

October 31, 2019. The secular Bayesian: Using belief distributions without really believing. The religious Bayesian: My parents didn't raise me in a religious tradition.

The secular Bayesian. Over the years I came to terms with my Bayesian heritage, and I now live my life as a secular Bayesian.

This choice is the real reason why the resulting update rule will end up very Bayes-rule like, as we will see later.

Rationality. Now that we have an update rule which satisfies our desiderata, can we say if it's actually a good or useful update rule?

So, not only is this update rule the only update rule that satisfies the desired properties, it is also optimal under this particular definition of optimality/ra…

5 months, 1 week ago @ inference.vc
Exponentially Growing Learning Rate? Implications of Scale Invariance induced by Batch Normalization

October 25, 2019. Exponentially Growing Learning Rate?

Implications of Scale Invariance induced by Batch Normalization. Yesterday I read this intriguing paper about the mind-boggling fact that it is possible to use an exponentially growing learning-rate schedule when training neural networks with batch normalization: Zhiyuan Li and Sanjeev Arora (2019), An Exponential Learning Rate Schedule for Deep Learning. The paper provides both theoretical insights and an empirical demonstration of this remarkable property.

So imagine doing vanilla gradient descent (no momentum, no weight decay, fixed learning rate) on such a loss surface.

However, the weight vector won't completely blow up to infinity, because th…

5 months, 1 week ago @ inference.vc
On Marginal Likelihood and Cross-Validation

The marginal likelihood and cross-validation. To discuss the connection between marginal likelihoods and (Bayesian) cross-validation, let's first define what is what.

For each of these permutations we can decompose the marginal likelihood as a product of conditionals, or equivalently we can write the log marginal likelihood as a sum of logs of the same conditionals.

So, the sum of all the terms in this matrix gives the marginal likelihood times 6 (as there are 6 columns).

This observation gives a really good motivation for using the marginal likelihood, and also gives a new perspective on how it works.

Calculating the marginal likelihood amounts to evaluating the average predictive score on al…
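The decomposition can be made concrete with a toy conjugate model (a Beta(1,1)–Bernoulli coin, chosen here purely for illustration): the log marginal likelihood is a sum of log posterior-predictive conditionals, and every ordering of the data yields the same sum.

```python
import math
from itertools import permutations

data = [1, 0, 1, 1]  # four coin flips

def log_pred(x, heads, tails):
    # Posterior predictive P(x | counts seen so far) under a Beta(1,1) prior.
    p_head = (heads + 1) / (heads + tails + 2)
    return math.log(p_head if x == 1 else 1.0 - p_head)

def log_marginal(order):
    # log p(D) = sum_i log p(x_i | x_{<i}) for one ordering of the data.
    heads = tails = 0
    total = 0.0
    for x in order:
        total += log_pred(x, heads, tails)
        heads += (x == 1)
        tails += (x == 0)
    return total

vals = {round(log_marginal(p), 12) for p in permutations(data)}
print(len(vals) == 1)  # True: the decomposition is order-independent
```

Each term is the model's predictive score on a held-out point given the data seen so far, which is what links the marginal likelihood to cross-validation.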

5 months, 3 weeks ago @ inference.vc
Notes on iMAML: Meta-Learning with Implicit Gradients

September 19, 2019. Notes on iMAML: Meta-Learning with Implicit Gradients. This week I read this cool new paper on meta-learning: it takes a slightly different approach compared to its predecessors, based on some observations about differentiating the optima of regularized optimization.

Let me illustrate what that dependence looks like:In the figure above, let's say that we would like to minimise an objective function $f(\theta)$.

Rather than deterministically finding a particular local minimum, SGD samples different minima: when run with different random seeds it will find different minima.

The meta-learning objective now depends on $\theta_0$ in two different ways:as we change the anchor $\theta_0$,…

6 months, 2 weeks ago @ inference.vc
Invariant Risk Minimization: An Information Theoretic View

July 19, 2019. Invariant Risk Minimization: An Information Theoretic View. I finally got around to reading this new paper by Arjovsky et al.

Here, I will describe the main idea and then provide an information theoretic view on the same topic.

$Y \perp\mkern-13mu\perp E\vert X_1, W$: The observable $X_1$ and latent $W$ shield the label $Y$ from the influence of the environment.

Say we have a parametric family of functions $f(y\vert \phi(x); \theta)$ for predicting $y$ from $\phi(x)$.

The conditional information can be approximated as follows: \begin{align}I[Y, E \vert \phi(x)] \approx \min_\theta \mathbb{E}_{x,y}\,\ell(f(y\vert \phi(x); \theta)) - \mathbb{E}_e \min_{\theta_e} \mathbb{E}_{x,y\vert e}\,\ell(f(y\vert \phi(x); \theta_e))\end{align}

8 months, 3 weeks ago @ inference.vc
ICML Highlight: Contrastive Divergence for Combining Variational Inference and MCMC

Ruiz and Titsias (2019), A Contrastive Divergence for Combining Variational Inference and MCMC. Background: the principle of minimal improvement. First, some background on why I found this paper particularly interesting.

Using such an improvement operator, you can define an objective function for policies by measuring the extent to which the operator changes a policy.

In the case of AlphaGo Zero, the improvement operator is Monte Carlo Tree Search (MCTS).

The paper I'm talking about uses a very similar argument to come up with a contrastive divergence for variational inference, where the improvement operator is an MCMC step.

Combining VI with MCMC. The two dominant ways of performing inference in latent var…

9 months, 4 weeks ago @ inference.vc
Notes on the Limitations of the Empirical Fisher Approximation

June 6, 2019. Notes on the Limitations of the Empirical Fisher Approximation. This post is a short note on an excellent recent paper on empirical Fisher information matrices: Kunstner, Balles and Hennig (2019), Limitations of the Empirical Fisher Approximation. I was debating with myself whether I should write a post about this, because it's a superbly written paper that you should probably read in full.

There isn't a whole lot of novelty in the paper, but it is a great discussion paper that provides a concise overview of the Fisher information, the empirical Fisher matrix, and their connections to generalized Gauss-Newton methods.

The third shows the gradients corrected by the empirical Fisher instea…

10 months ago @ inference.vc
Perceptual Straightening of Natural Videos

May 30, 2019. Perceptual Straightening of Natural Videos. Video is an interesting domain for unsupervised, or self-supervised, representation learning.

So, for example, straight trajectories have an almost $0$ probability under a high-dimensional Brownian motion or Ornstein–Uhlenbeck (OU) process.

Results and Summary. The main result of the paper, as expected, is that natural video sequences indeed appear to be mapped to straight trajectories in representation space.

For one, the paper assumes a Gaussian observation noise in representation space, and I wonder how robust the analysis would be to assuming heavy-tailed noise.

Similarly, our very definition of straightness and angles relies on the…

10 months, 1 week ago @ inference.vc
DeepSets: Modeling Permutation Invariance

February 7, 2019. DeepSets: Modeling Permutation Invariance. Guest post by [Fabian Fuchs](https://twitter.com/FabianFuchsML), [Ed Wagstaff](https://github.com/edwag), and [Martin Engelcke](https://twitter.com/martinengelcke). One of my favourite recent innovations in neural network architectures is Deep Sets.

In such a situation, the invariance property we can exploit is permutation invariance.

To give a short, intuitive explanation for permutation invariance, this is what a permutation invariant function with three inputs would look like: $f(a, b, c) = f(a, c, b) = f(b, a, c) = \dots$.

The Deep Sets Architecture (Sum-Decomposition). Having established that there is a need for permutat…
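A minimal numerical sketch of the sum-decomposition idea, $f(X) = \rho(\sum_i \phi(x_i))$; the random weight matrices below are stand-ins for learned networks, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(3, 8))  # per-element embedding phi: R^3 -> R^8
W_rho = rng.normal(size=(8,))    # readout rho: R^8 -> R

def f(X):
    """Permutation-invariant set function over the rows of X."""
    phi = np.tanh(X @ W_phi)   # embed each element independently
    pooled = phi.sum(axis=0)   # sum pooling discards the ordering
    return float(np.tanh(pooled) @ W_rho)

X = rng.normal(size=(5, 3))         # a "set" of 5 elements
X_perm = X[rng.permutation(5)]      # the same set, reordered
print(np.isclose(f(X), f(X_perm)))  # True: order does not matter
```

Because the only interaction between elements is a sum, any reordering of the rows leaves the output unchanged by construction.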

1 year, 1 month ago @ inference.vc
Causal Inference 3: Counterfactuals

You hopefully know enough about causal inference by now to know that $p(🎓\vert 🧔=0)$ is certainly not the quantity we seek.

Counterfactual queries. To finally explain counterfactuals, I have to step beyond causal graphs and introduce another concept: structural equation models.

Structural Equation Models. A causal graph encodes which variables have a direct causal effect on any given node; we call these the causal parents of the node.

$f_1$ computes $x$ from its causal parent $u$, and $f_2$ computes $a$ from its causal parents $x$ and $v$.

The structural equation model (SEM) entails the causal graph, in that you can reconstruct the causal graph by looking at the inputs of each function.

1 year, 2 months ago @ inference.vc
Causal Inference 2: Illustrating Interventions via a Toy Example

Consequently, the joint distribution of the data alone is insufficient to predict behaviour under interventions.

Finally, you can use various causal discovery techniques to try to identify the causal diagram from the data itself.

Theoretically, recovering the full causal graph from data alone is impossible in the general case.

Summary. We have seen that modeling the joint distribution can only get you so far: if you want to predict the effect of interventions, i.e. calculate $p(y\vert do(x))$-like quantities, you have to add a causal graph to your analysis.
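The gap between conditioning and intervening is easy to simulate. In this made-up linear system (z confounds x and y), $E[y \mid x = 1]$ and $E[y \mid do(x = 1)]$ disagree:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
z = rng.normal(size=n)                        # confounder: z -> x and z -> y
x_obs = z + rng.normal(size=n)                # x "listens" to z
y_obs = x_obs + 2.0 * z + rng.normal(size=n)

# Observational E[y | x ~ 1]: conditioning mixes in z's influence,
# because seeing x = 1 is evidence about z.
mask = np.abs(x_obs - 1.0) < 0.05
cond_mean = y_obs[mask].mean()                # ~2.0 analytically

# Interventional E[y | do(x=1)]: replace x's mechanism with the constant 1,
# severing the z -> x edge; z keeps its own distribution.
y_do = 1.0 + 2.0 * z + rng.normal(size=n)
do_mean = y_do.mean()                         # ~1.0 analytically

print(cond_mean, do_mean)
```

The two answers differ precisely because the intervention mutilates the graph while conditioning does not.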

1 year, 2 months ago @ inference.vc
Online Bayesian Deep Learning in Production at Tencent

These applications include active learning, reinforcement learning and online/continual learning.

When I recently read a paper by Tencent, I was surprised to learn that an online Bayesian deep learning algorithm is apparently deployed in production to power click-through-rate prediction in their ad system.

Assumed Density Filtering. The method relies on the approximate Bayesian online-learning technique often referred to as assumed density filtering.

forward propagation: In Bayesian deep learning, we maintain a distribution $q(w)$ over neural network weights, and each value $w$ defines a conditional probability $p(y\vert x, w)$.


1 year, 4 months ago @ inference.vc
👻Halloween Special: Critical reviews of the worst NIPS 2018 papers.

posts on machine learning, statistics, opinions on things I'm reading in the space

1 year, 5 months ago @ inference.vc
The Blessings of Multiple Causes: Causal Inference when you Can't Measure Confounders

September 7, 2018. The Blessings of Multiple Causes: Causal Inference when you Can't Measure Confounders. Happy back-to-school time, everyone!

In this case, the size of the kidney stone is a confounder variable.

Let's look at how this differs from the non-causal association you would measure between treatment and outcome (i.e.

there may be confounders, but all confounders causally influence at least two of the cause variables.

It identifies just enough about the causal structure (the substitute confounder variable) to then be able to make causal inferences of a certain type.

1 year, 7 months ago @ inference.vc
The Spectator
last post 1 month, 1 week ago
Queer Exceptionalism in Science

Read in 5 mins (800 words). Today's queer scientist is exceptional.

Role of the Queer Scientist. For queer people to hold a recognised role in scientific life requires an acknowledgement that to be queer has consequences.

Challenges Facing Queer Scientists. For the queer scientist, every encounter involves a conscious act of deliberation, risk assessment, and effort, well before any effort of research is begun.

For queer scientists, every new encounter—with a colleague, supervisor, possible letter-writer, examiner, moderator, student, interviewer, acquaintance, or future-friend—sets up a stressful coming-out scene.

To be queer in science is to ask to belong and to be safe.

1 month, 1 week ago @ blog.shakirm.com
Machinery of Grace

The machinery of grace is always simple.

The machines I'm thinking of are machines with intelligence, machines that learn.

Dialogues that lead to co-design and inclusion in the mission of developing intelligent machines with grace.

Firstly, to celebrate our progress in machine learning, but one that must now be balanced using a new critical practice.

If we are successful in making global AI truly global, and I believe we can be, we set ourselves on the path to realising that intelligent machinery of grace.

4 months, 3 weeks ago @ blog.shakirm.com
A New Consciousness of Inclusion in Machine Learning

On LGBT Freedoms and our Support for Machine Learning in Africa. This is an exploration of my thinking and my personal views.

The choice of these host countries has fomented concerns throughout our machine learning community: how can we as a community committed to inclusion in every form consider hosting our conferences in countries like these that are far from inclusive?

A politics of location, and an ethics of inclusion is growing healthily within our machine learning community.

But I too am an out and proud gay machine learning scientist.

My hope is that we will always continue to experiment with the ways in which we organise and support our global machine learning community.

9 months, 3 weeks ago @ blog.shakirm.com
Racialised Lives and the Life Beyond

The Black woman is racialised, and so too is the White man, as is every person we have ever known, and so the cycle of our racialised lives lives on.

About two-and-a-half years ago, I was part of creating a new organisation called the Deep Learning Indaba, as one attempt to engage with these questions.

The grassroots are those groups within our institutions, like our LGBT resource group within DeepMind, and those outside movements, like the Deep Learning Indaba.

I see the leadership of the Deep Learning Indaba as such a collective.

But I think we show the power of political love today, in this room, with our memory, with our energy, and in the celebration of progress that has brought us her…

10 months ago @ blog.shakirm.com
Talk: How Do We Support Under-represented Groups To Put Themselves Forward?

As you think of this question, consider the journey that is taken by the under-represented groups we might have in mind.

Journeys like mine are our struggle credentials.

This room is filled with struggle credentials.

Struggle credentials play too much of a role in our present.

It is the under-represented groups that must eventually be put forward.

1 year, 5 months ago @ blog.shakirm.com
Machine Learning Trick of the Day (8): Instrumental Thinking

The instrumental variables idea is conceptually simple: we introduce new observed variables z, called instrumental variables, into our model; figure 1 (right).

And this is the trick: instrumental variables are a special subset of the data we already have, but they allow us to remove the effect of confounders.

Our problem is to learn a linear value function of features $\phi(x)$ (when in state $x$) with parameters $\theta$, so that $V(x) = \phi(x)^\top \theta$.

But this probabilistic viewpoint through instrumental variables means that we can think of alternative ways of extending this view.

Like every trick in this series, the instrumental variables give us an alternative way to think about existing problems.
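The trick can be sketched with two-stage least squares (2SLS) on simulated data; the coefficients and noise scales below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
u = rng.normal(size=n)                  # unobserved confounder
z = rng.normal(size=n)                  # instrument: moves x, not y directly
x = z + u + 0.1 * rng.normal(size=n)
y = 2.0 * x + 3.0 * u + 0.1 * rng.normal(size=n)   # true causal effect: 2.0

# Naive regression of y on x is biased by the confounder u.
beta_ols = (x @ y) / (x @ x)            # ~3.5, not 2.0

# Stage 1: project x onto the instrument z.
x_hat = ((z @ x) / (z @ z)) * z
# Stage 2: regress y on the projected x; the confounded variation is gone.
beta_iv = (x_hat @ y) / (x_hat @ x_hat)

print(beta_ols, beta_iv)
```

Only the variation in x that is explained by z survives stage 1, and that variation is uncorrelated with u, which is why the second-stage estimate recovers the causal coefficient.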

1 year, 5 months ago @ blog.shakirm.com
Decolonising Artificial Intelligence

Read in 6 mins · 1297 words. The Artificial Intelligence we believe to be global is far from it.

Inevitably, a call will be made to decolonise artificial intelligence.

The call for decolonisation in artificial intelligence is yet to reach its full volume.

Kai-Fu Lee, The Real Threat of Artificial Intelligence, June 2017. We immediately recognise the colonial nature of this possible future.

The only AI that empowers and works for the benefit of humanity is a truly global AI.

1 year, 5 months ago @ blog.shakirm.com
The Price of Transformation

The price of transformation is ours to pay.

Transformation cannot be separated from my other pillars, for they require transformation to succeed.

The price of transformation cannot be paid in this way.

We must all confront the question: What is the price of transformation?

We need to convince ourselves that the price of transformation is something we are willing to pay, and that we should pay.

1 year, 6 months ago @ blog.shakirm.com
Machine Learning Trick of the Day (7): Density Ratio Trick

The same is true if we want to compare probability densities: either through a density difference or a density ratio.

Density ratios are ubiquitous in machine learning, and will be our focus.

Density Ratio Estimation. The central task in the above five statistical quantities is to efficiently compute the ratio $p(x)/q(x)$.

This is where the density ratio trick (or, formally, density ratio estimation) enters: it tells us to construct a binary classifier that distinguishes between samples from the two distributions.

This final derivation says that the problem of density ratio estimation is equivalent to that of binary classification.
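The equivalence is easy to verify in closed form when both densities are known: with the Bayes-optimal classifier $D(x) = p(x)/(p(x)+q(x))$ (equal class priors), the ratio $D/(1-D)$ recovers $p/q$ exactly. The two Gaussians below are arbitrary choices for illustration:

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def bayes_classifier(x):
    # Optimal P(class = "from p" | x) with balanced samples from p and q.
    p = gauss_pdf(x, 0.0, 1.0)
    q = gauss_pdf(x, 1.0, 1.5)
    return p / (p + q)

x = np.linspace(-2.0, 2.0, 9)
D = bayes_classifier(x)
ratio_from_classifier = D / (1.0 - D)
true_ratio = gauss_pdf(x, 0.0, 1.0) / gauss_pdf(x, 1.0, 1.5)
print(np.allclose(ratio_from_classifier, true_ratio))  # True
```

In practice D is a learned classifier rather than the Bayes-optimal one, so the recovered ratio is only as good as the classifier.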

2 years, 2 months ago @ blog.shakirm.com
Cognitive Machine Learning (2): Uncertain Thoughts

These types of thinking are secondary levels of thinking: thinking about thinking.

Like the primary colours, our primary thoughts are those that are the basis of our cognition.

Secondary colours use the primary colours as their basis, and similarly, secondary thoughts are thoughts about our primary thoughts.

Our memories, decisions and attitudes are amongst our primary thoughts, and for each we have secondary thoughts—metacognitive confidence assessments—that guide our behaviours.

Again, we can make such assessments in two ways: about the decisions we are still to make, a prospective decision confidence; and decisions we have already made, a retrospective decision confidence.

3 years ago @ blog.shakirm.com
大トロ
last post 2 weeks, 5 days ago
Neuroevolution of Self-Interpretable Agents

Agents with a self-attention “bottleneck” not only can solve these tasks from pixel inputs with only 4000 parameters, but they are also better at generalization.

Redirecting to attentionagent.github.io, where the article resides.

2 weeks, 5 days ago @ blog.otoro.net
Learning to Predict Without Looking Ahead

Rather than hardcoding forward prediction, we try to get agents to learn that they need to predict the future.

Redirecting to learningtopredict.github.io, where the article resides.

5 months, 1 week ago @ blog.otoro.net
Weight Agnostic Neural Networks

We search for neural network architectures that can already perform various tasks even when they use random weight values.

Redirecting to weightagnostic.github.io, where the article resides.

9 months, 4 weeks ago @ blog.otoro.net
Learning Latent Dynamics for Planning from Pixels

PlaNet learns a world model from image inputs only and successfully leverages it for planning in latent space.

Redirecting to planetrl.github.io, where the article resides.

1 year, 1 month ago @ blog.otoro.net
Reinforcement Learning for Improving Agent Design

Little dude rewarded for having little legs.

Redirecting to designrl.github.io, where the article resides.

1 year, 5 months ago @ blog.otoro.net
World Models Experiments

In this article I will give step-by-step instructions for reproducing the experiments in the World Models article (pdf).

For general discussion about the World Models article, there are already some good discussion threads here in the GitHub issues page of the interactive article.

World Models (pdf)A Visual Guide to Evolution StrategiesEvolving Stable StrategiesBelow is optionalMixture Density NetworksMixture Density Networks with TensorFlowRead tutorials on Variational Autoencoders if you are not familiar with them.

I used OS X for inference, but trained the models using Google Cloud VMs.

You should update your git repo with these new models using git add doomrnn/tf_models/*.js…

1 year, 10 months ago @ blog.otoro.net
World Models

Can agents learn inside of their own dreams?

Redirecting to worldmodels.github.io, where the article resides.

2 years ago @ blog.otoro.net
Evolving Stable Strategies

for i in range(solver.popsize):
    # init the agent with a solution
    agent = Agent(solutions[i])
    # rollout env with this agent
    fitlist[i] = rollout(agent, env)
# give scores results back to ES solver

One way to convert a deterministic policy into a stochastic one is to make its output random.

Robot arm grasping task using a stochastic policy.

The Minitaur model in pybullet is designed to mimic the real physical Minitaur.

After making the ball smaller, CMA-ES was able to find a stochastic policy that can walk and balance the ball at the same time.

2 years, 4 months ago @ blog.otoro.net
A Visual Guide to Evolution Strategies

In this post I explain how evolution strategies (ES) work with the aid of a few visual examples.

OpenAI published a paper called Evolution Strategies as a Scalable Alternative to Reinforcement Learning where they showed that evolution strategies, while being less data efficient than RL, offer many benefits.

Schaffer-2D Function · Rastrigin-2D Function. Although there are many definitions of evolution strategies, we can define an evolution strategy as an algorithm that provides the user with a set of candidate solutions to evaluate a problem.

Let’s visualise the scheme one more time, on the entire search process on both problems: Because CMA-ES can adapt both its mean and covariance matrix using info…
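The ask/evaluate/tell loop shared by these methods can be sketched with a plain Gaussian ES (fixed σ and elite averaging; unlike CMA-ES it does not adapt the covariance). The toy objective below is invented:

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(x):
    # Toy objective to maximise; optimum at (3, 2).
    return -np.sum((x - np.array([3.0, 2.0])) ** 2)

mu = np.zeros(2)            # mean of the search distribution
sigma, popsize, elite = 0.5, 32, 8
for _ in range(200):
    pop = mu + sigma * rng.normal(size=(popsize, 2))   # "ask" for candidates
    scores = np.array([fitness(p) for p in pop])       # user evaluates them
    best = pop[np.argsort(scores)[-elite:]]            # "tell" the results
    mu = best.mean(axis=0)                             # move toward the elite

print(mu)  # close to [3, 2]
```

The solver only ever asks for candidates and receives fitness scores back, which is exactly the interface the definition above describes.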

2 years, 5 months ago @ blog.otoro.net
Teaching Machines to Draw

In this work, we investigate an alternative to traditional pixel image modelling approaches, and propose a generative model for vector images.

For example, we can subtract the latent vector of an encoded pig head from the latent vector of a full pig, to arrive at a vector that represents the concept of a body.

As we saw earlier, a model trained to draw pigs can be made to draw pig-like trucks if given an input sketch of a truck.

Exploring the latent space between different objects can potentially enable creative designers to find interesting intersections and relationships between different drawings:Exploring the latent space between cats and buses, elephants and pigs, and various owls.

In …

2 years, 10 months ago @ blog.otoro.net
Recurrent Neural Network Tutorial for Artists

In particular, the experiments in the post help visualise the internals of a recurrent neural network trained to generate handwriting.

Recurrent Neural Network for Handwriting. We have pre-trained a recurrent neural network model to perform the handwriting task described in the previous section.

var x, y; var dx, dy; var pen; var prev_pen; var rnn_state; var pdf; var temperature = 0.65; var screen_width = window.

get_pdf(rnn_state); [dx, dy, pen] = Model.

I haven’t personally used keras.js, and I found it fun to just write the handwriting model from scratch in Javascript.

3 years, 3 months ago @ blog.otoro.net
Hypernetworks

In our paper, we use HyperNetworks to explore a middle ground - to enforce a relaxed version of weight-tying.

The more exciting work is in the second part of my paper where we apply Hypernetworks to Recurrent Networks.

Dynamic Hypernetworks. As mentioned in the Introduction, we also tried to apply Hypernetworks to Recurrent Networks, and I feel this is the main contribution of the research.

Our approach is to put a small LSTM cell (called the HyperLSTM cell) inside a large LSTM cell (the main LSTM).

For our implementation of Dynamic Hypernetworks, we made it so that we can just plug our HyperLSTM cell into any TensorFlow code written to use tf.nn.rnn_cell objects, since the HyperLSTM inherite…

3 years, 6 months ago @ blog.otoro.net
Generating Large Images from Latent Vectors - Part Two

Random gaussian latent vectors were generated from numpy.random and fed into the generative network to obtain these images.

Our generator can produce large random images of digits using random gaussian vectors as input.

Unlike the previous model though, the generated images do not necessarily have to look exactly like the set of training images.

All the generator has to do is to create a set of new images that share the same classification labels of the set of training images.

Description of the Generator Network. The generator used in the previous model uses 4 large fully connected layers of 128 nodes each.

3 years, 10 months ago @ blog.otoro.net
Neural Network Evolution Playground with Backprop NEAT

This demo will attempt to use a genetic algorithm to produce efficient, but atypical neural network structures to classify datasets borrowed from TensorFlow Playground.

People started experimenting with different neural network configurations, such as how many neural network layers are actually needed to fit a certain data set, or what initial features should be used for another data set.

In addition to weight-search, Deep Learning research has also produced many powerful neural network architectures that are important building blocks.

Evolving Neural Network TopologyNeuroevolution of Augmenting Topologies (NEAT) is a method that can evolve new types of neural networks based on genetic algo…

3 years, 11 months ago @ blog.otoro.net
Interactive Abstract Pattern Generation Javascript Demo

Interactive Javascript Demo for Abstract Pattern Generation.

Although some code was previously available in Javascript, it wasn't general enough to use as a tool for a digital artist.

Karpathy’s recurrent.js library makes it really easy to implement highly customised neural networks in JS, and adopts a computational graph type of method similar to modern libraries.

In addition, the user is able to specify the size and depth of the generator neural network.

The depth and size of the network, and also the image resolution of the output can all be customised in the web app.

3 years, 11 months ago @ blog.otoro.net
The Unofficial Google Data Science Blog
last post 4 months ago
Humans-in-the-loop forecasting: integrating data science and business planning

Figure 1: A Google data center. As an example, consider Google's forecasting and planning for data center capacity.

In particular, the data scientist must take responsibility for stakeholders approving the “best” forecast from all available information sources.

It required investments from our data science team to re-think our statistical forecasting approach to make it easier to compare against customer forecasts.

It also owns Google's internal time series forecasting platform described in an earlier blog post.

But looking through the blogosphere, some go further and posit that “platformization” of forecasting and “forecasting as a service” can turn anyone into a data scientist at the push …

4 months ago @ unofficialgoogledatascience.com
Estimating the prevalence of rare events — theory and practice

$$S(v_1) = S(v_2) \implies \frac{q(v_1)}{p(v_1)} = \frac{q(v_2)}{p(v_2)}$$ The ratio between the importance distribution and the target distribution is thus a function of $S(v)$: $$\frac{q(v)}{p(v)} = \frac{\tilde{q}(S(v))}{\tilde{p}(S(v))}$$ where $\tilde{p}$ and $\tilde{q}$ are the PMFs of $S(v)$ under the target distribution and the importance distribution respectively.

In our case when the events are rare and the probability of high conditional prevalence rate is small under the target distribution, the difference between the methods is minor.

We also discuss how to choose $q$ with respect to the conditional prevalence rate $g(S(v))=\mathbb{E}_p\left[f(V)|S(V)=S(v)\right]$.

Conclusion In this post, we…

7 months, 1 week ago @ unofficialgoogledatascience.com
Misadventures in experiments for growth

In summary, classic experimentation is applicable to fledgling products but in a much more limited way than to established products.

For our music example, we imagined that EDM users don't approximate the target population for some experiments.

The behavior of this single user appears in our data as a large number of impressions with conversions.

A word on growth hackingOf particular concern in growth hacking is the focus on influencers for pushing growth.


11 months, 3 weeks ago @ unofficialgoogledatascience.com
Crawling the internet: data science within a large engineering system

When queries arrive, the search system matches the inferred meaning of the query to web pages on the basis of these snapshots.

This measure of web page value is on a meaningful linear scale, such that our freshness metric (a weighted average) has an intuitive interpretation.

A global constraint on how much compute and network resources Google itself is willing to dedicate to crawling web pages.

In some regimes (and in practice for Google Search), a greedy algorithm would devote more recrawl resources to high-value pages, as lower-value pages would commonly starve.

We can use this function to sort the web pages, and then determine which web pages should be scheduled for immediate crawl.
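The greedy scheduling idea above can be sketched in a few lines (page names, values, costs, and the budget are invented for illustration):

```python
# Toy sketch of greedy recrawl scheduling under a global crawl budget.
pages = [
    {"url": "https://example.com/home", "value": 9.0, "crawl_cost": 3.0},
    {"url": "https://example.com/news", "value": 8.0, "crawl_cost": 1.0},
    {"url": "https://example.com/archive", "value": 1.0, "crawl_cost": 2.0},
    {"url": "https://example.com/blog", "value": 5.0, "crawl_cost": 1.0},
]

budget = 5.0  # total crawl resources available

# Greedy: sort by value per unit cost, schedule until the budget runs out.
scheduled = []
remaining = budget
for page in sorted(pages, key=lambda p: p["value"] / p["crawl_cost"], reverse=True):
    if page["crawl_cost"] <= remaining:
        scheduled.append(page["url"])
        remaining -= page["crawl_cost"]

print(scheduled)
```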

1 year, 8 months ago @ unofficialgoogledatascience.com
Compliance bias in mobile experiments

The differences between the distribution of users experiencing the treatment and the population are likely to be a key factor here.

Compliance Bias A central issue in this application is that users assigned treatment sometimes do not actually experience the treatment at $T_{\mathrm{measure}}$, and furthermore this set of users is not random.

Here, we can draw a direct analogy to Compliance Bias, which is primarily described in literature on the analysis of medical studies.

Propensity scoring within the treatment. Fig 5: Estimated probability of experiencing the treatment in the treatment group.
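A minimal sketch of estimating such propensities, assuming we simply stratify the treatment group by a covariate bucket and compute the empirical exposure rate (the buckets and data are hypothetical):

```python
from collections import defaultdict

# Toy sketch: probability of actually experiencing the treatment, per covariate bucket.
assigned = [
    # (covariate bucket, experienced treatment?)
    ("recent", True), ("recent", True), ("recent", False), ("recent", True),
    ("stale", False), ("stale", False), ("stale", True), ("stale", False),
]

counts = defaultdict(lambda: [0, 0])  # bucket -> [experienced, total]
for bucket, experienced in assigned:
    counts[bucket][1] += 1
    if experienced:
        counts[bucket][0] += 1

propensity = {b: e / t for b, (e, t) in counts.items()}
print(propensity)  # e.g. "recent" users are far more likely to get the update
```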

Here, we ignore any control group, and analyze the treatment group as a self-contained observationa…

2 years ago @ unofficialgoogledatascience.com
Designing A/B tests in a collaboration network

Our model considers two aspects of network effects: Homophily, or similarity within the network: users collaborating in a network tend to behave similarly.


The network topology itself is the actual collaboration network we observe for GCP. When users are connected in a network, their treatment assignments can generate network effects through their interactions.

In other words, for the three methods of randomization (uniform random component, uniform random project, stratified random component) we simulate confidence intervals for A/A tests, i.e.

Conclusion Designing randomized experiments on a network of users is more ch…

2 years, 2 months ago @ unofficialgoogledatascience.com
Unintentional data

The Future of Data Analysis. Avalanche of questions: the role of the data scientist amid unintentional data. Is it relevant to our goals?

In the world of big, unintentional data there are many discoveries to be had which have no bearing on the organization’s goals.

Democratization of analysis: quantity has a quality all its own Just as dealing with unintentional data shapes the role of the data scientists in their organization, it also shapes the day to day practice of data analysis.

Understanding the goals of the organization as well as guiding principles for extracting value from data are both critical for success in this environment. Thankfully, not only have modern data analysis tools made da…

2 years, 5 months ago @ unofficialgoogledatascience.com
Fitting Bayesian structural time series with the bsts R package

When fitting bsts models that contain a regression component, extra arguments captured by ... are passed to the SpikeSlabPrior function from the BoomSpikeSlab package.

# Fit a bsts model with expected model size 1, the default.

model2 <- bsts(iclaimsNSA ~ ., state.specification = ss, niter = 1000, data = initial.claims)
# Fit a bsts model with expected model size 5, to include more coefficients.

(a) (b)Figure 10: Regression coefficients for the (a) plain logistic regression model and (b) time series logistic regression model under equivalent spike and slab priors.

These are a widely useful class of time series models, known in various literatures as "structural time series," "state space mod…

2 years, 8 months ago @ unofficialgoogledatascience.com
Our quest for robust time series forecasting at scale

The demand for time series forecasting at Google grew rapidly along with the company over its first decade.

That is, for an attempt to develop methods and tools that would facilitate accurate large-scale time series forecasting at Google.


But like our approach, Prophet aims to be an automatic, robust forecasting tool. Lastly, "forecasting" for us did not mean anomaly detection.

by Eric Tassone and Farzan Rohani. Time series forecasting enjoys a rich and luminous history, and today it is an essential element of most any business operation.

2 years, 11 months ago @ unofficialgoogledatascience.com
Attributing a deep network’s prediction to its input features

For concreteness, let us focus on a deep network that performs object recognition.

Deep networks have multiple layers of logic and coefficients, combined using nonlinear activation functions.

Application to other networks. Our paper also includes the application of integrated gradients to other networks (none of these networks were trained by us).

There is also work (such as this) on architecting deep networks in ways that allow us to understand the internal representations of these networks.

Overall, we hope that deep networks lose their reputation for being impenetrable black-boxes which perform black magic.
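As a rough illustration of the integrated gradients attribution rule, here is a numeric sketch on a toy differentiable function standing in for a network’s class score (the function, baseline, and step count are made up; real use differentiates the network itself):

```python
import numpy as np

# f(x) = x0^2 + 3*x1 stands in for a network's class score.
# Attribution_i = (x_i - x'_i) * integral over alpha of df/dx_i at x' + alpha*(x - x').

def f(x):
    return x[0] ** 2 + 3.0 * x[1]

def grad_f(x):
    return np.array([2.0 * x[0], 3.0])

def integrated_gradients(x, baseline, steps=1000):
    alphas = (np.arange(steps) + 0.5) / steps  # midpoint rule
    grads = np.mean([grad_f(baseline + a * (x - baseline)) for a in alphas], axis=0)
    return (x - baseline) * grads

x = np.array([2.0, 1.0])
baseline = np.zeros(2)
attr = integrated_gradients(x, baseline)
print(attr)  # ≈ [4.0, 3.0]
# Completeness: attributions sum to f(x) - f(baseline).
print(attr.sum(), f(x) - f(baseline))
```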

3 years ago @ unofficialgoogledatascience.com
Causality in machine learning

An obvious attempt to fix this is to upweight randomized data in training, or even train the model solely on the randomized data.
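A toy sketch of the upweighting idea, assuming a one-parameter linear model fit by weighted least squares (the data, the confounding, and the weight of 100 are all synthetic choices for illustration):

```python
import numpy as np

# Fit a 1-D linear model where observational data is confounded but a small
# randomized slice is unbiased; upweight the randomized rows.
rng = np.random.default_rng(0)

# Randomized data reflects the true causal effect: y = 2 * x + noise.
x_rand = rng.normal(size=200)
y_rand = 2.0 * x_rand + rng.normal(scale=0.1, size=200)

# Observational data: a confounder inflates the apparent slope to ~3.
x_obs = rng.normal(size=2000)
y_obs = 3.0 * x_obs + rng.normal(scale=0.1, size=2000)

def weighted_slope(x, y, w):
    # Weighted least-squares slope (no intercept) for y ~ slope * x.
    return np.sum(w * x * y) / np.sum(w * x * x)

x = np.concatenate([x_rand, x_obs])
y = np.concatenate([y_rand, y_obs])

w_plain = np.ones_like(x)
w_up = np.concatenate([np.full(200, 100.0), np.ones(2000)])  # upweight randomized

slope_plain = weighted_slope(x, y, w_plain)
slope_up = weighted_slope(x, y, w_up)
print(slope_plain, slope_up)  # upweighting pulls the estimate toward the causal 2
```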

As we observed at the start of this post, standard machine learning techniques don’t distinguish between randomized and observational data the way statistical models do.

Conclusion In this post we described how some randomized data may be applied both to check and improve the accuracy of a machine learning system trained largely on observational data.

Indeed, machine learning generally lacks the vocabulary to capture the distinction between observational data and randomized data that statistics finds crucial.

Rather, the focus of this post is on combining observa…

3 years, 2 months ago @ unofficialgoogledatascience.com
Practical advice for analysis of large, complex data sets

Some people seemed to be naturally good at doing this kind of high quality data analysis.

Process: Separate Validation, Description, and Evaluation. Validation or Initial Data Analysis: Do I believe the data is self-consistent, that it was collected correctly, and that it represents what I think it does?

I think about exploratory data analysis as having three interrelated stages. By separating these phases, you can more easily reach agreement with others.

Acknowledge and count your filtering Almost every large data analysis starts by filtering the data in various stages.


3 years, 5 months ago @ unofficialgoogledatascience.com
Statistics for Google Sheets

Introduction: Statistics for Google Sheets is an add-on for Google Sheets that brings elementary statistical analysis tools to spreadsheet users.

The goal of the Statistics app is to “democratize data science” by putting elementary statistics capabilities in the hands of anyone with a Google account.

If you look closely at the boxplots you can see that returns following down days have slightly greater variation than returns following up days.

Finally, you can use logistic regression to see how a previous day’s return affects the probability of the next day’s return being positive.

Statistics for Google Sheets gives analysts and students the tools to conduct elementary statistical analyses in …

3 years, 6 months ago @ unofficialgoogledatascience.com
Next generation tools for data science

Introduction: That MapReduce was the solution for writing data processing pipelines scalable to hundreds of terabytes (or more) is evidenced by its massive uptake.

Widely used in medicine for count data, the MH estimator and its generalizations are ubiquitous within data science at Google.

.filter(lambda x: x != header)

Beam/Dataflow’s sweet spot: streaming processing Streaming processing is an ever-increasingly important topic for data science.

3 years, 7 months ago @ unofficialgoogledatascience.com
Mind Your Units

The perils of incorrect units Is the idea of 'minding our units' just some esoteric issue, or can this actually hurt us in practice?

How do we mind our units in analyses at Google?

The above simulation already hints at one of our approaches to incorporating the group structure in some analyses at Google.

Regardless of how you do it, do remember to mind your units.


3 years, 8 months ago @ unofficialgoogledatascience.com
Andrej Karpathy
last post 11 months, 2 weeks ago
A Recipe for Training Neural Networks

A few weeks ago I posted a tweet on “the most common neural net mistakes”, listing a few common gotchas related to training neural nets.

1) Neural net training is a leaky abstractionIt is allegedly easy to get started with training neural nets.

This is just a start when it comes to training neural nets.

As a result, (and this is reeaally difficult to over-emphasize) a “fast and furious” approach to training neural networks does not work and only leads to suffering.

focus on training loss) and then regularize it appropriately (give up some training loss to improve the validation loss).

11 months, 2 weeks ago @ karpathy.github.io
(started posting on Medium instead)

The current state of this blog (with the last post 2 years ago) makes it look like I’ve disappeared.

I’ve certainly become less active on blogs since I’ve joined Tesla, but whenever I do get a chance to post something I have recently been defaulting to doing it on Medium because it is much faster and easier.

I still plan to come back here for longer posts if I get any time, but I’ll default to Medium for everything short-medium in length.

TLDRHave a look at my Medium blog.

2 years, 2 months ago @ karpathy.github.io
A Survival Guide to a PhD

Unlike the undergraduate guide, this one was much more difficult to write because there is significantly more variation in how one can traverse the PhD experience.

You can go one way (PhD -> anywhere else) but not the other (anywhere else -> PhD -> academia/research; it is statistically less likely).

The adviser is an extremely important person who will exercise a lot of influence over your PhD experience.

During your PhD you’ll get to acquire this sense yourself.

It’s usually a painful exercise for me to look through some of my early PhD paper drafts because they are quite terrible.

3 years, 7 months ago @ karpathy.github.io
Deep Reinforcement Learning: Pong from Pixels

This is a long overdue blog post on Reinforcement Learning (RL).

From left to right: Deep Q Learning network playing ATARI, AlphaGo, Berkeley robot stacking Legos, physically-simulated quadruped leaping over terrain.

Policy network.

For example, suppose we compute \(R_t\) for all of the 20,000 actions in the batch of 100 Pong game rollouts above.

The total number of episodes was approximately 8,000 so the algorithm played roughly 200,000 Pong games (quite a lot isn’t it!)
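The discounted returns \(R_t\) mentioned above can be computed with a single backward pass over an episode’s rewards; a minimal sketch (the game-boundary reset is Pong-specific, and the rewards below are toy values):

```python
# Sketch of computing discounted returns R_t for a rollout of rewards.
def discount_rewards(rewards, gamma=0.99):
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        if rewards[t] != 0:
            running = 0.0  # Pong-specific: reset the sum at game boundaries
        running = running * gamma + rewards[t]
        returns[t] = running
    return returns

# A toy Pong-like episode: mostly zero rewards, then a +1 when a point is won.
rewards = [0.0, 0.0, 0.0, 1.0]
print(discount_rewards(rewards, gamma=0.5))  # [0.125, 0.25, 0.5, 1.0]
```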

3 years, 10 months ago @ karpathy.github.io
Short Story on AI: A Cognitive Discontinuity.

Another great source of good reputation for Visceral were the large number of famous interventions carried out by autonomous Visceral agents.

The list went on and on - one month ago an autonomous Visceral agent recognized a remote drone attack.

He was running the routine software diagnostics on the Visceral agent and one of them had just failed.

The software diagnostics were only at 5% complete, and Merus knew they would take a while to run to completion.

Merus’ avatar broke the silence in the last second: “Come meet me here.” And then the connection was lost.

4 years, 4 months ago @ karpathy.github.io
What a Deep Neural Network thinks about your #selfie

In this fun experiment we’re going to do just that: We’ll take a powerful, 140-million-parameter state-of-the-art Convolutional Neural Network, feed it 2 million selfies from the internet, and train it to classify good selfies from bad ones.

what if someone posted a very good selfie but it was late at night, so perhaps not as many people saw it and it got less likes?

What makes a good #selfie ?

To take a good selfie, Do: Be female.

Also, with some relief, it seems that the best selfies do not seem to be the ones that show the most skin.

4 years, 5 months ago @ karpathy.github.io
The Unreasonable Effectiveness of Recurrent Neural Networks

A glaring limitation of Vanilla Neural Networks (and also Convolutional Networks) is that their API is too constrained: they accept a fixed-sized vector as input (e.g.

If training vanilla neural nets is optimization over functions, training recurrent nets is optimization over programs.

At the core, RNNs have a deceptively simple API: They accept an input vector x and give you an output vector y .
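That step API can be sketched with a vanilla RNN in a few lines (sizes, initialization scale, and the omission of bias terms are arbitrary simplifications):

```python
import numpy as np

# Minimal sketch of the RNN API described above: step() takes an input vector x,
# updates the hidden state h, and emits an output vector y.
rng = np.random.default_rng(42)

class VanillaRNN:
    def __init__(self, input_size, hidden_size, output_size):
        self.W_xh = rng.normal(scale=0.01, size=(hidden_size, input_size))
        self.W_hh = rng.normal(scale=0.01, size=(hidden_size, hidden_size))
        self.W_hy = rng.normal(scale=0.01, size=(output_size, hidden_size))
        self.h = np.zeros(hidden_size)

    def step(self, x):
        # h_t = tanh(W_hh h_{t-1} + W_xh x_t);  y_t = W_hy h_t
        self.h = np.tanh(self.W_hh @ self.h + self.W_xh @ x)
        return self.W_hy @ self.h

rnn = VanillaRNN(input_size=4, hidden_size=8, output_size=3)
y = rnn.step(np.ones(4))
print(y.shape)  # (3,)
```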

Fun with RNNsAll 5 example character models below were trained with the code I’m releasing on Github.

These models have about 10 million parameters, which is still on the lower end for RNN models.

4 years, 10 months ago @ karpathy.github.io
Breaking Linear Classifiers on ImageNet

speech recognition systems), and most importantly, also to simple, shallow, good old-fashioned Linear Classifiers (Softmax classifier, or Linear Support Vector Machines, etc.).

Instead, let's fool a linear classifier, and let's also keep with the theme of breaking models on images because they are fun to look at.

With input images of size 64x64x3 and 1000 ImageNet classes we therefore have 64x64x3x1000 = 12.3 million weights (a beefy linear model!).
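For a linear model the fooling procedure is especially transparent: the gradient of a class score with respect to the input is just that class's weight vector, so nudging the image along the wrong class's weights (and away from the true class's) flips the prediction. A toy sketch with synthetic weights, not ImageNet-scale:

```python
import numpy as np

# Toy linear classifier: class scores s_c = w_c . x (weights and "image" are synthetic).
rng = np.random.default_rng(0)
num_classes, dim = 5, 100
W = rng.normal(size=(num_classes, dim))
x = rng.normal(size=dim)

true_class = int(np.argmax(W @ x))
wrong_class = (true_class + 1) % num_classes

# Ascend the wrong class's score and descend the true class's score.
eps = 1.0
x_adv = x + eps * (W[wrong_class] - W[true_class])

print(int(np.argmax(W @ x)), int(np.argmax(W @ x_adv)))
```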

We can then visualize each of the learned weights by reshaping them as images: Example linear classifiers for a few ImageNet classes.

Linear classifier with lower regularization (which leads to more noisy class weights) is easier to fool (top).

5 years ago @ karpathy.github.io
What I learned from competing against a ConvNet on ImageNet

The 100,000 test set images are released with the dataset, but the labels are withheld to prevent teams from overfitting on the test set.

It’s fun to note that about 4 years ago I performed a similar (but much quicker and less detailed) human classification accuracy analysis on CIFAR-10.

In total, we attribute 24 (24%) of GoogLeNet errors and 12 (16%) of human errors to this category.

We estimate that approximately 22 (21%) of GoogLeNet errors fall into this category, while none of the human errors do.

On the other hand, a large majority of human errors come from fine-grained categories and class unawareness.

5 years, 7 months ago @ karpathy.github.io
Quantifying Productivity

The tracking script currently records active window titles (at a frequency of once every 2 seconds) and keystroke typing frequency.

Now, remember that we record keystrokes and window titles throughout.

Hacking Streak is a nifty feature that tries to identify contiguous hacking activity and correlates reasonably with my productivity.

In the end, ulogme shows the final breakdown of titles that occupied me on this day: The final breakdown of active window titles.

The holy grail here is still not implemented: What are the correlates of my productivity?

5 years, 8 months ago @ karpathy.github.io
Off the Convex Path
last post 6 months ago
Ultra-Wide Deep Nets and Neural Tangent Kernel (NTK)

gradient flow) is equivalent to a kernel regression predictor with a deterministic kernel called neural tangent kernel (NTK).

Now we describe how training an ultra-wide fully-connected neural network leads to kernel regression with respect to the NTK.

In the large width limit, it turns out that the time-varying kernel $ker_t(\cdot,\cdot)$ is (with high probability) always close to a deterministic fixed kernel $ker_{\mathsf{NTK}}(\cdot,\cdot)$, which is the neural tangent kernel (NTK).

Now, at least we have a better understanding of a class of ultra-wide neural networks: they are captured by neural tangent kernels!

Similarly, one can try to translate other architectures like recurrent neural…

6 months ago @ offconvex.org
Understanding implicit regularization in deep learning by analyzing trajectories of gradient descent

Sanjeev’s recent blog post suggested that the conventional view of optimization is insufficient for understanding deep learning, as the value of the training objective does not reliably capture generalization.

In recent years, researchers have come to realize the importance of implicit regularization induced by the choice of optimization algorithm.

This theorem disqualifies Schatten quasi-norms as the implicit regularization in deep matrix factorizations, and instead suggests that all depths correspond to nuclear norm.

Full details behind our results on “implicit regularization as norm minimi…

9 months ago @ offconvex.org
Landscape Connectivity of Low Cost Solutions for Multilayer Nets

A big mystery about deep learning is how, in a highly nonconvex loss landscape, gradient descent often finds near-optimal solutions (those with training cost almost zero) even starting from a random initialization.

Solutions A and B have low cost but the line connecting them goes through solutions with high cost.

Mode Connectivity.

2019) did try to explain the phenomenon of mode connectivity in simple settings (the first of these demonstrated mode connectivity empirically for multi-layer nets).

Thus to explain mode connectivity for multilayer nets we will need to leverage some stronger property of typical solutions discovered v…

9 months, 3 weeks ago @ offconvex.org
Is Optimization a Sufficient Language for Understanding Deep Learning?

In this Deep Learning era, machine learning usually boils down to defining a suitable objective/cost function for the learning task at hand, and then optimizing this function using some variant of gradient descent (implemented via backpropagation).

I am suggesting that deep learning algorithms also have important properties that are not always reflected in the objective value.

by playing with batch sizes and learning rates) can be preferable to perfect optimization, even in simple settings such as regression.

NB: Empirically we find that Adam, the celebrated acceleration method for deep learning, speeds up optimization a…

10 months, 1 week ago @ offconvex.org
Contrastive Unsupervised Learning of Semantic Representations: A Theoretical Framework

Semantic representations (aka semantic embeddings) of complicated data types (e.g.

Researchers are most interested in unsupervised representation learning using unlabeled data.

samples $x, x^{+}$ from the distribution $D_{c^+}$.

The highlighted parts in the table show that the unsupervised representations compete well with the supervised representations on the average $k$-way classification task ($k=2, 10$).

We find this to be true for unsupervised representations, and surprisingly for supervised representations as well.

1 year ago @ offconvex.org
The search for biologically plausible neural computation: A similarity-based approach

By re-ordering the variables and introducing a new variable, ${\bf W} \in \mathbb{R}^{k\times n}$, we obtain: To prove the second identity, find the optimal ${\bf W}$ by taking the derivative of the expression on the right with respect to ${\bf W}$ and setting it to zero, and then substitute the optimal ${\bf W}$ back into the expression.

The price paid for this simplification is the appearance of the minimax optimization problem in variables, ${\bf W}$ and ${\bf M}$.

Variables ${\bf W}$ and ${\bf M}$ are represented by the weights of synapses in feedforward and lateral connections respectively.

In neuroscience, learning rules (2.7) for ${\bf W}$ and ${\bf M}$ are called Hebbian and anti-Hebbian r…

1 year, 4 months ago @ offconvex.org
Understanding optimization in deep learning by analyzing trajectories of gradient descent

Neural network optimization is fundamentally non-convex, and yet simple gradient-based algorithms seem to consistently solve such problems.

Trajectory-Based Analyses for Deep Linear Neural NetworksLinear neural networks are fully-connected neural networks with linear (no) activation.

2014 were the first to carry out a trajectory-based analysis for deep (three or more layer) linear networks, treating gradient flow (gradient descent with infinitesimally small learning rate) minimizing $\ell_2$ loss over whitened data.

Specifically, we analyze trajectories of gradient descent for any linear neural network …

1 year, 5 months ago @ offconvex.org
Simple and efficient semantic embeddings for rare words, n-grams, and language features

Distributional methods for capturing meaning, such as word embeddings, often require observing many examples of words in context.

Here we describe a simple but principled approach called à la carte embeddings, described in our ACL’18 paper with Yingyu Liang, Tengyu Ma, and Brandon Stewart.

For convenience, we will let $u_w^c$ denote the average of the word embeddings of words in $c$.

We test this hypothesis by inducing embeddings for $n$-grams by using contexts from a large text corpus and word embeddings trained on the same corpus.
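The induction step can be sketched as follows, assuming a linear transform A learned by least squares from common words' (context average, embedding) pairs; all vectors here are random toy data, not trained embeddings:

```python
import numpy as np

# Sketch of the à la carte idea: induce an embedding for a rare word or n-gram
# by averaging the embeddings of the words in its contexts (u_w^c above) and
# then applying a learned linear transform A.
rng = np.random.default_rng(0)
d = 8

# Pretend corpus-trained word embeddings and, for each word, the average of
# its context words' embeddings (random stand-ins for illustration).
vocab_embeddings = rng.normal(size=(50, d))   # v_w for common words
context_averages = rng.normal(size=(50, d))   # u_w^c for the same words

# Learn A by least squares so that u_w^c @ A ≈ v_w over common words.
A, *_ = np.linalg.lstsq(context_averages, vocab_embeddings, rcond=None)

# Induce an embedding for an unseen n-gram from its context average.
u_new = rng.normal(size=d)
v_new = u_new @ A
print(v_new.shape)  # (8,)
```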

The à la carte code is available here, allowing you to re-create the resu…

1 year, 6 months ago @ offconvex.org
When Recurrent Models Don't Need to be Recurrent

In the last few years, deep learning practitioners have proposed a litany of different sequence models.

Feed-forward models can offer improvements in training stability and speed, while recurrent models are strictly more expressive.

At the outset, recurrent models appear to be a strictly more flexible and expressive model class than feed-forward models.

Feed-forward models make translations using only $k$ words of the sentence, whereas recurrent models can leverage the entire sentence.

Feed-forward models are limited to the past $k$ samples, whereas recurrent models can use the entire history.

1 year, 8 months ago @ offconvex.org
Deep-learning-free Text and Sentence Embedding, Part 2

This post continues Sanjeev’s post and describes further attempts to construct elementary and interpretable text embeddings.

Even better, it is much faster to compute, since it uses pretrained (GloVe) word vectors and simple linear algebra.

Note that DisC embeddings leverage classic Bag-of-n-Gram information as well as the power of word embeddings.

The new theorem follows from considering an LSTM that uses random vectors as word embeddings and computes the DisC embedding in one pass over the text.

Sample code for constructing and evaluating DisC embeddings is available, as well as solvers for recreating the sparse recovery results for wo…

1 year, 9 months ago @ offconvex.org
Machine Learning Mastery
last post 17 hours ago
10 Clustering Algorithms With Python

There are many clustering algorithms to choose from and no single best clustering algorithm for all cases.

How to implement, fit, and use top clustering algorithms in Python with the scikit-learn machine learning library.

Tutorial Overview: This tutorial is divided into three parts; they are: Clustering; Clustering Algorithms; Examples of Clustering Algorithms (Library Installation, Clustering Dataset, Affinity Propagation, Agglomerative Clustering, BIRCH, DBSCAN, K-Means, Mini-Batch K-Means, Mean Shift, OPTICS, Spectral Clustering, Gaussian Mixture Model). Clustering: Cluster analysis, or clustering, is an unsupervised machine learning task.

Examples of Clustering AlgorithmsIn this section, we will review how t…
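The fit/predict pattern the tutorial repeats for each algorithm looks roughly like this, shown here with k-means on a synthetic dataset (dataset and model parameters are arbitrary choices, not from the tutorial):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic dataset with 3 well-separated clusters.
X, _ = make_blobs(n_samples=200, centers=3, random_state=1)

# Fit the clustering model and assign a cluster to each example.
model = KMeans(n_clusters=3, n_init=10, random_state=1)
labels = model.fit_predict(X)
print(len(set(labels)))  # number of distinct clusters assigned
```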

17 hours ago @ machinelearningmastery.com
What Is Argmax in Machine Learning?

In this tutorial, you will discover the argmax function and how it is used in machine learning.

Argmax can be implemented manually, although the argmax() NumPy function is preferred in practice.

The argmax function is used throughout the field of mathematics and machine learning.

# numpy implementation of argmax
from numpy import argmax
# define vector
vector = [0.4, 0.5, 0.1]
# get argmax
result = argmax(vector)
print('arg max of %s: %d' % (vector, result))
Running the example prints a…
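A manual implementation, for comparison with the NumPy version (a sketch of the idea, not code taken from the tutorial):

```python
# Manual argmax for a 1-D sequence: index of the largest value,
# first occurrence wins on ties (matching numpy's convention).
def argmax_manual(vector):
    best_index = 0
    for i in range(1, len(vector)):
        if vector[i] > vector[best_index]:
            best_index = i
    return best_index

print(argmax_manual([0.4, 0.5, 0.1]))  # 1
```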

3 days, 18 hours ago @ machinelearningmastery.com
Gradient Boosting with Scikit-Learn, XGBoost, LightGBM, and CatBoost

How to evaluate and use gradient boosting with scikit-learn, including gradient boosting machines and the histogram-based algorithm.

Tutorial Overview: This tutorial is divided into five parts; they are: Gradient Boosting Overview; Gradient Boosting With Scikit-Learn (Library Installation, Test Problems, Gradient Boosting, Histogram-Based Gradient Boosting); Gradient Boosting With XGBoost (Library Installation, XGBoost for Classification, XGBoost for Regression); Gradient Boosting With LightGBM (Library Installation, LightGBM for Classification, LightGBM for Regression); Gradient Boosting With CatBoost (Library Installation, CatBoost for Classification, CatBoost for Regression). Gradient Boosting Overview: Gradient bo…
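The scikit-learn portion follows the usual evaluate-with-cross-validation pattern; a minimal sketch on synthetic data (dataset parameters are arbitrary, hyperparameters left at defaults):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic binary classification problem.
X, y = make_classification(n_samples=500, n_features=10, random_state=7)

# Evaluate gradient boosting with 5-fold cross-validation.
model = GradientBoostingClassifier(random_state=7)
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(scores.mean())
```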

5 days, 18 hours ago @ machinelearningmastery.com
How to Calculate Feature Importance With Python

Tutorial Overview: This tutorial is divided into five parts; they are: Feature Importance; Preparation (Check Scikit-Learn Version, Test Datasets); Coefficients as Feature Importance (Linear Regression Feature Importance, Logistic Regression Feature Importance); Decision Tree Feature Importance (CART, Random Forest, XGBoost); Permutation Feature Importance (Permutation Feature Importance for Regression, Permutation Feature Importance for Classification). Feature Importance: Feature importance refers to a class of techniques for assigning scores to the input features of a predictive model that indicate the relative importance of each feature when making a predic…
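The "coefficients as feature importance" idea can be sketched on a synthetic regression problem (dataset parameters below are arbitrary choices for illustration):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Synthetic problem where only 2 of 5 features are informative.
X, y = make_regression(n_samples=300, n_features=5, n_informative=2, random_state=3)

# Fitted coefficient magnitudes score each input feature.
model = LinearRegression().fit(X, y)
importance = [abs(c) for c in model.coef_]
print(importance)  # larger magnitude = more important feature
```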

1 week ago @ machinelearningmastery.com
How to Develop Multi-Output Regression Models with Python

Tutorial Overview: This tutorial is divided into three parts; they are: Problem of Multioutput Regression (Check Scikit-Learn Version, Multioutput Regression Test Problem); Inherently Multioutput Regression Algorithms (Linear Regression, k-Nearest Neighbors, Random Forest, Evaluate Multioutput Regression With Cross-Validation); Wrapper Multioutput Regression Algorithms (Separate Model for Each Output with MultiOutputRegressor, Chained Models for Each Output with RegressorChain). Problem of Multioutput Regression: Regression refers to a predictive modeling problem that involves predicting a numerical value.

Linear Regression for Multiout…
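The "inherently multioutput" idea can be sketched briefly: scikit-learn's LinearRegression accepts a 2D target array directly, so no wrapper is needed. The dataset here is synthetic, not the tutorial's.

```python
# LinearRegression handles multioutput regression natively: pass a target
# matrix with one column per output.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# two output targets per sample
X, y = make_regression(n_samples=200, n_features=10, n_targets=2, random_state=1)
model = LinearRegression().fit(X, y)
yhat = model.predict(X[:1])
print(yhat.shape)  # one row, two predicted values
```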

1 week, 3 days ago @ machinelearningmastery.com
4 Distance Measures for Machine Learning

After completing this tutorial, you will know: the role and importance of distance measures in machine learning algorithms.

Tutorial Overview. This tutorial is divided into five parts: Role of Distance Measures; Hamming Distance; Euclidean Distance; Manhattan Distance (Taxicab or City Block); Minkowski Distance. Role of Distance Measures. Distance measures play an important role in machine learning.

Perhaps the most likely way you will encounter distance measures is when using a specific machine learning algorithm that relies on them at its core.

Perhaps four of the most commonly used distance measures in machine learning are as follows: Hamming Distance, Euclidean Distance, Manhat…
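The four measures can each be written in a few lines of pure Python. A minimal sketch (the example vectors are invented):

```python
# The four common distance measures, implemented directly.
from math import sqrt

def hamming_distance(a, b):
    # fraction of positions that differ (for bit strings / categorical vectors)
    return sum(x != y for x, y in zip(a, b)) / len(a)

def euclidean_distance(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski_distance(a, b, p):
    # p=1 gives Manhattan distance, p=2 gives Euclidean distance
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

row1, row2 = [10, 20, 15, 10, 5], [12, 24, 18, 8, 7]
print(euclidean_distance(row1, row2))
```

Note that Minkowski distance generalizes the other two numeric measures via the order parameter p.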

1 week, 5 days ago @ machinelearningmastery.com
PyTorch Tutorial: How to Develop Deep Learning Models with Python

…layer(X)
X = self.hidden1(X)
X = self.act1(X)
X = self.hidden2(X)
X = self.pool2(X)
# flatten
X = X.…
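The fragments above show the pattern the tutorial teaches: a PyTorch module whose forward() threads the input through named layers. A minimal self-contained sketch of that pattern (layer names and sizes here are illustrative, not the tutorial's exact model):

```python
# A small PyTorch module: define layers in __init__, chain them in forward().
import torch
from torch import nn

class MLP(nn.Module):
    def __init__(self, n_inputs):
        super().__init__()
        self.hidden1 = nn.Linear(n_inputs, 8)
        self.act1 = nn.ReLU()
        self.hidden2 = nn.Linear(8, 1)

    def forward(self, X):
        X = self.hidden1(X)
        X = self.act1(X)
        X = self.hidden2(X)
        return X

model = MLP(10)
out = model(torch.randn(5, 10))
print(out.shape)  # torch.Size([5, 1])
```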

2 weeks ago @ machinelearningmastery.com
Basic Data Cleaning for Machine Learning (That You Must Perform)

In tabular data, there are many different statistical analysis and data visualization techniques you can use to explore your data in order to identify data cleaning operations you may want to perform.

Before jumping to the sophisticated methods, there are some very basic data cleaning operations that you probably should perform on every single machine learning project.

…shape[1]):
    num = len(unique(data[:, i]))
    percentage = float(num) / data.…

From a probabilistic perspective, you can think of duplicate data as adjusting the priors for a class label or data distribution.

Typically, this is not the case and machine learning algorithms will perform better by identifying and re…

2 weeks, 3 days ago @ machinelearningmastery.com
Neural Networks are Function Approximation Algorithms

In this tutorial, you will discover the intuition behind neural networks as function approximation algorithms.

Therefore, function approximation is only a useful tool when the underlying target mapping function is unknown.

The less noise we have in the observations, the crisper the approximation of the mapping function we can make.

Next, we can then pretend to forget that we know what the mapping function is and use a neural network to re-learn or re-discover the mapping function.
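That experiment can be sketched concretely: sample a known mapping, here y = x**2 (an assumed example, not necessarily the tutorial's), and fit a small neural network to re-learn it from the samples alone.

```python
# Fit a small neural network to re-discover a known mapping from samples.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.RandomState(1)
X = rng.uniform(-1, 1, size=(500, 1))
y = X[:, 0] ** 2  # the "unknown" target mapping function

model = MLPRegressor(hidden_layer_sizes=(50,), max_iter=2000, random_state=1)
model.fit(X, y)
print(model.predict([[0.5]]))  # should be close to the true value 0.25
```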

Further reading: Tutorials; Books; Articles. Summary. In this tutorial, you discovered the intuition behind neural networks as function approximation algorithms.

2 weeks, 5 days ago @ machinelearningmastery.com
Imbalanced Multiclass Classification with the E.coli Dataset

The E.coli protein localization sites dataset is a standard dataset for exploring the challenge of imbalanced multiclass classification.

# define the location of the dataset
full_path = 'ecoli.csv'
# load the dataset
X, y = load_dataset(full_path)
# summarize the loaded dataset
print(X.…

…dummy import DummyClassifier
# load the dataset
def load_dataset(full_path):
    # load the dataset as a numpy array
    df = read_csv(full_path, header=None)
    # remove rows for the minority classes
    df = df[df[7] != 'imS']
    df = df[df[7] != 'imL']
    # retrieve numpy array
    data = df.…

…ensemble import RandomForestClassifier
# load the dataset
def load_dataset(full_path):
    # load the dataset…
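A hedged reconstruction of the load_dataset() idea in the excerpt, run on an inline sample instead of the real ecoli.csv (the file path, 8-column layout, and class-label column are assumed from the fragment):

```python
# Load the dataset, drop the very rare minority classes, split into X and y,
# and label-encode the string class labels.
from io import StringIO
from pandas import read_csv
from sklearn.preprocessing import LabelEncoder

sample = StringIO(
    "0.1,0.2,0.3,0.4,0.5,0.6,0.7,cp\n"
    "0.2,0.3,0.4,0.5,0.6,0.7,0.8,im\n"
    "0.3,0.4,0.5,0.6,0.7,0.8,0.9,imS\n"
)

def load_dataset(path_or_buffer):
    df = read_csv(path_or_buffer, header=None)
    # remove rows for the minority classes
    df = df[df[7] != 'imS']
    df = df[df[7] != 'imL']
    data = df.values
    X, y = data[:, :-1].astype('float32'), data[:, -1]
    y = LabelEncoder().fit_transform(y)
    return X, y

X, y = load_dataset(sample)
print(X.shape, y.shape)
```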

3 weeks ago @ machinelearningmastery.com
Imbalanced Multiclass Classification with the Glass Identification Dataset

The glass identification dataset is a standard dataset for exploring the challenge of imbalanced multiclass classification.

In this tutorial, you will discover how to develop and evaluate a model for the imbalanced multiclass glass identification dataset.

# define the location of the dataset
full_path = 'glass.csv'
# load the dataset
X, y = load_dataset(full_path)
# summarize the loaded dataset
print(X.…

…fit_transform(y)
    return X, y
# define the location of the dataset
full_path = 'glass.csv'
# load the dataset
X, y = load_dataset(full_path)
# define model to evaluate
model = RandomForestClassifier(n_estimators=1000)
# fit the model
model.…
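The pattern in the fragment, encoding string class labels with LabelEncoder and fitting a RandomForestClassifier, can be sketched on invented data (n_estimators is reduced from the excerpt's 1000 for speed):

```python
# Encode string class labels as integers, then fit a random forest.
from numpy import array
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder

X = array([[1.51, 13.6], [1.52, 12.8], [1.51, 13.1], [1.53, 13.0]])
labels = array(['build_float', 'vehic_float', 'build_float', 'headlamps'])

# map string labels to integers 0..n_classes-1
y = LabelEncoder().fit_transform(labels)
model = RandomForestClassifier(n_estimators=10, random_state=1)
model.fit(X, y)
print(model.predict(X[:1]))
```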

Further reading: APIs; Dataset. Summary. In this tutor…

3 weeks, 3 days ago @ machinelearningmastery.com
Imbalanced Classification with the Fraudulent Credit Card Transactions Dataset

Identifying fraudulent credit card transactions is a common type of imbalanced binary classification where the focus is on the positive (fraud) class.

In this tutorial, you will discover how to develop and evaluate a model for the imbalanced credit card fraud dataset.

Tutorial Overview. This tutorial is divided into five parts: Credit Card Fraud Dataset; Explore the Dataset; Model Test and Baseline Result; Evaluate Models; Make Predictions on New Data. Credit Card Fraud Dataset. In this project, we will use a standard imbalanced machine learning dataset referred to as the “Credit Card Fraud Detection” dataset.

...
# define model to evaluate
model = KNeighborsClassifier()
# scale, th…
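The "scale, then model" step hinted at in the fragment is usually done with a Pipeline, so that scaling is fit only on the training folds during cross-validation. A minimal sketch on synthetic imbalanced data (not the credit card dataset):

```python
# Wrap StandardScaler and KNeighborsClassifier in a Pipeline and evaluate
# with ROC AUC, a common metric for imbalanced binary classification.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# roughly 95/5 class imbalance
X, y = make_classification(n_samples=500, n_features=10, weights=[0.95], random_state=4)
model = Pipeline([('s', StandardScaler()), ('m', KNeighborsClassifier())])
scores = cross_val_score(model, X, y, scoring='roc_auc', cv=5)
print('Mean ROC AUC: %.3f' % scores.mean())
```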

3 weeks, 5 days ago @ machinelearningmastery.com
Step-By-Step Framework for Imbalanced Classification Projects

Use a Systematic Framework; Detailed Framework for Imbalanced Classification; Select a Metric; Spot Check Algorithms; Spot Check Imbalanced Algorithms; Hyperparameter Tuning. 1.

We can summarize this process as follows: (1) Select a Metric; (2) Spot Check Algorithms; (3) Spot Check Imbalanced Algorithms; (4) Hyperparameter Tuning. This provides a high-level systematic framework to work through an imbalanced classification problem.

Detailed Framework for Imbalanced ClassificationWe can develop a similar low-level framework to systematically work through each step of an imbalanced classification project.

Framework for Spot-Checking Machine Learning Algorithms. We can summarize these suggestions into a framework for testing…
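The spot-checking step can be sketched as a simple loop over candidate models, always including a naive baseline to beat. The models and dataset below are illustrative choices, not the article's prescribed list:

```python
# Spot-check a suite of algorithms against a majority-class baseline,
# scored with F1 on an imbalanced synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, weights=[0.9], random_state=3)
models = {
    'baseline': DummyClassifier(strategy='most_frequent'),
    'logistic': LogisticRegression(max_iter=1000),
    'tree': DecisionTreeClassifier(random_state=3),
}
results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, scoring='f1', cv=5)
    results[name] = scores.mean()
    print('%s: %.3f' % (name, results[name]))
```

Any model that cannot beat the baseline on the chosen metric is discarded before hyperparameter tuning.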

4 weeks ago @ machinelearningmastery.com
💼 University and corporation labs
DeepMind
last post 6 days, 12 hours ago
Agent57: Outperforming the human Atari benchmark

Combining off-policy learning with memory is challenging because you need to know what you might remember when executing a different behaviour.

Within that strand, we distinguish two types of rewards: firstly, long-term novelty rewards encourage visiting many states throughout training, across many episodes.

Secondly, short-term novelty rewards encourage visiting many states over a short span of time (e.g., within a single episode of a game).

However, learning density models of high dimensional spaces is fraught with problems due to the curse of dimensionality.

For example, in Montezuma’s Revenge, unlike undirected exploration strategies, long-term novelty rewards allow the agent to surpass…

6 days, 12 hours ago @ deepmind.com
A new model and dataset for long-range memory

Modelling natural language. Finding machine learning tasks which both drive the development of better memory architectures and push us further towards artificial general intelligence is challenging.

Transferring knowledge. Such samples would likely astound Shannon, 70 years on from his early language model experiments.

Google’s prominent natural language model, BERT, achieves state-of-the-art performance on a wide array of NLP benchmarks, and is now a part of Google Search.

Benchmarking language models. A popular long-range language model benchmark is WikiText-103, which comprises English-language Wikipedia articles and was developed by researchers at Salesforce AI.

As such, we’ve compiled…

1 month, 3 weeks ago @ deepmind.com
Dopamine and temporal difference learning: A fruitful relationship between neuroscience and AI

Meanwhile, in close contact with this study of reward learning in animals, computer scientists have developed algorithms for reinforcement learning in artificial systems.

A chain of prediction: temporal difference learning. Reinforcement learning is one of the oldest and most powerful ideas linking neuroscience and AI.

An important breakthrough in solving the problem of reward prediction was the temporal difference learning (TD) algorithm.

Around the same time, in the late 80s and early 90s, neuroscientists were struggling to understand the behaviour of dopamine neurons.

Distributional reinforcement learning

2 months, 3 weeks ago @ deepmind.com
AlphaFold: Using AI for scientific discovery

In our study published today in Nature, we demonstrate how artificial intelligence research can drive and accelerate new scientific discoveries.

Our system, AlphaFold – described in peer-reviewed papers now published in Nature and PROTEINS – is the culmination of several years of work, and builds on decades of prior research using large genomic datasets to predict protein structure.

What is the protein folding problem?

What any given protein can do depends on its unique 3D structure.

Why is protein folding important?

2 months, 3 weeks ago @ deepmind.com
Using WaveNet technology to reunite speech-impaired users with their original voices

This post details a recent project we undertook with Google and ALS campaigner Tim Shaw, as part of Google’s Euphonia project.

We demonstrate an early proof of concept of how text-to-speech technologies can synthesise a high-quality, natural sounding voice using minimal recorded speech data.

But message banking lacks flexibility, resulting in a static dataset of phrases.

Now imagine that you were given the chance to preserve your voice by recording as much of it as possible.

And people who aren’t able to record phrases in time are left to choose a generic computer synthesized voice that lacks the same power of connection as their own.

3 months, 2 weeks ago @ deepmind.com
Learning human objectives by evaluating hypothetical behaviours

TL;DR: We present a method for training reinforcement learning agents from human feedback in the presence of unknown unsafe states.

Training RL agents in the presence of unsafe states is known as the safe exploration problem.

The agent has one source of information: feedback about unsafe states from a human user.

Existing methods for training agents from human feedback ask the user to evaluate data of the agent acting in the environment.

The user provides feedback on this hypothetical behaviour, and the system interactively learns a model of the user's reward function.

3 months, 3 weeks ago @ deepmind.com
From unlikely start-up to major scientific organisation: Entering our tenth year at DeepMind

Pioneering research, growing impact. A mission this ambitious requires pioneering research on many fronts over many years.

As our research matures, we’ve been finding more opportunities to partner with others for social and commercial impact, often with our colleagues across Alphabet.

Entering our next phase. As I discussed with Wired in the summer, this year feels like the start of a new phase for DeepMind as an established scientific organisation.

Over the past year, we’ve also been formalising a leadership team with the seasoned experience and skills for our second decade.

Right back to our origins blending neuroscience with machine learning, we’ve found that breakthroughs happen faster when…

4 months ago @ deepmind.com
Strengthening the AI community

For me, it was being awarded an internship at Intel, the first one ever through Purdue’s Co-Op Engineering program in 1990.

I just didn’t know if I had the right technical skills for the work, or if engineering was really my path.

It grew into a very successful 18-year career at Intel and a 25-year career in tech.

At DeepMind we want to build advanced AI to expand our knowledge and find answers to some of the fundamental questions facing society.

DeepMind Scholarships to open the field of AI. The DeepMind scholarship programme is one way we seek to broaden participation in science and AI.

4 months, 2 weeks ago @ deepmind.com
Advanced machine learning helps Play Store users discover personalised apps

Candidate generator unbiasing. Our model (called a candidate generator) learns what apps a user is more likely to install based on previous apps they’ve installed from the Play store.

The model therefore learns a bias that favours the apps that are shown – and thus installed – more often.

An importance weight is based on the impression-to-install rate of each individual app in comparison with the median impression-to-install rate across the Play store.

Through importance weighting, our candidate generator can downweight or upweight apps based on their install rates, which mitigates the recommendation bias problem.

Our solution to this, the reranker model, learns the relative importance of a p…

4 months, 2 weeks ago @ deepmind.com
AlphaStar: Grandmaster level in StarCraft II using multi-agent reinforcement learning

Since then, we have taken on a much greater challenge: playing the full game at a Grandmaster level under professionally approved conditions.

AlphaStar can now play in one-on-one matches as and against Protoss, Terran, and Zerg – the three races present in StarCraft II.

Each of the Protoss, Terran, and Zerg agents is a single neural network.

We chose to use general-purpose machine learning techniques – including neural networks, self-play via reinforcement learning, multi-agent learning, and imitation learning – to learn directly from game data.

Using the advances described in our Nature paper, AlphaStar was ranked above 99.8% of active players on Battle.net…

5 months, 1 week ago @ deepmind.com
Causal Bayesian Networks: A flexible tool to enable fairer machine learning

This simplified example shows how CBNs can provide us with a visual framework for describing different possible unfairness scenarios.

It is nevertheless necessary to avoid pitfalls when evaluating or designing a decision system.

This means that it would be possible for the system to be deemed fair, even if it carries the unfair influence: this would automatically be the case for an error-free decision system.

On the other hand, if the path G→D→A was considered fair, it would be inappropriate to use statistical parity.

Path-specific techniques enable us to estimate the influence that a sensitive attribute has on other variables along specific sets of causal paths.

6 months ago @ deepmind.com
DeepMind’s health team joins Google Health

Today, with our healthcare partners, the team is excited to officially join the Google Health family.

It’s remarkable that many frontline clinicians, even in the world’s most advanced hospitals, are still reliant on clunky desktop systems and pagers that make delivering fast and safe patient care challenging.

That’s why I joined DeepMind, and why I will continue this work with Google Health.

We’ve already seen how our mobile medical assistant for clinicians is helping patients and the clinicians looking after them, and we are looking forward to continuing our partnerships with The Royal Free London NHS Foundation Trust, Imperial College Healthcare NHS Trust and Taunton and Somerset NHS…

6 months, 3 weeks ago @ deepmind.com
Episode 8: Demis Hassabis - The interview

Find out more about the themes in this episode:If you know of other resources we should link to, please help other listeners by either replying to us on Twitter (#DMpodcast) or emailing us at podcast@deepmind.com.

You can also use that address to send us questions or feedback on the series.

Credits: Presenter: Hannah Fry; Editor: David Prest; Senior Producer: Louisa Field; Producers: Amy Racs, Dan Hardoon; Binaural Sound: Lucinda Mason-Brown; Music composition: Eleni Shaw (with help from Sander Dieleman and WaveNet).

6 months, 3 weeks ago @ deepmind.com
Episode 7: Towards the future

AI researchers around the world are trying to create a general purpose learning system that can learn to solve a broad range of problems without being taught how.

Koray Kavukcuoglu, DeepMind’s Director of Research, describes the journey to get there, and takes Hannah on a whistle-stop tour of DeepMind’s HQ and its research.

Interviewees: Koray Kavukcuoglu, Director of Research; Trevor Back, Product Manager for DeepMind’s science research; research scientists Raia Hadsell and Murray Shanahan; and DeepMind CEO and co-founder, Demis Hassabis.

6 months, 4 weeks ago @ deepmind.com
Replay in biological and artificial neural networks

The imagination theory makes a different prediction about how replay will look: when you rest on the couch, your brain should replay the sequence "dog, vase, water".

As in previous experiments, fast replay sequences of the objects were evident in the brain recordings.

However, the sequences did not play out in the experienced order (i.e., the scrambled order: spilled water –> vase –> dog).

And to our surprise, during rest they played out in fast sequences that were precisely coordinated with the spontaneous replay sequences mentioned above.

For example, during a dog, vase, water replay sequence, the representation of "water" was preceded by the codes for "home sequence" and "spilled liquid".

7 months ago @ deepmind.com
Facebook
last post 1 week, 2 days ago
Investing in research in the era of COVID-19

As we all continue to monitor the rapidly evolving situation surrounding the COVID-19 outbreak, we want to communicate how Facebook’s research investments and academic relations may be affected.

These investments include conference sponsorships, requests for proposals, our Fellowship program, university lab investments, and sponsored research.

Conference sponsorships. Facebook believes conference sponsorships are essential in helping sustain a vibrant research community.

We will continue to honor all 2020 research conference sponsorships even in the event that the conference cannot physically take place.

University lab investments and consortia. Facebook will fully support all our current unive…

1 week, 2 days ago @ research.fb.com
New pathways for sustainability research through connectivity

As part of Facebook’s investment in research, we plan to invite Verdant Place members to respond to an upcoming request for proposals (RFP) in this area.

This RFP follows the success of our 2019 Verdant Place workshop, which took place at our headquarters in Menlo Park on December 11 and 12.

While many workshop presentations highlighted ongoing research and innovations in sustainability, a number of participants presented sustainability challenges with the hope that connectivity can make a meaningful difference.

Lily Cheng Zedler of Facebook Connectivity presented on the role that rural cooperatives play in meeting the rural internet infrastructure gap.

Leverage connectivity innovations to …

2 weeks, 5 days ago @ research.fb.com
Introducing Facebook’s Gender Disaggregated Displacement Maps

Displacement trends are seldom disaggregated by gender, meaning we do not have displacement data broken down for men and women.

Today, we are announcing the launch of Facebook’s Gender Disaggregated Displacement Maps as part of the Disaster Maps product suite.

With our new Gender Disaggregated Displacement Maps, we can now better understand how the typhoon affected women and men differently.

Using our gender disaggregated displacement data, we can assess how the wildfires affected men and women differently.

Our Gender Disaggregated Displacement Maps are a new tool for understanding these differences, and survey data can be a useful tool to complement these trends.

3 weeks, 5 days ago @ research.fb.com
Announcing the winners of the Distributed Systems research awards

At Facebook, we are performing forward-looking research into the area of distributed systems, applying important techniques from the field at Facebook’s scale and sharing our designs, implementations, insights, and data with the community.

To address fundamental challenges and understand future issues, we launched the Distributed Systems request for proposals at the Symposium on Operating Systems Principles in October 2019 and invited the academic community to respond.

“We are grateful to the research community for engaging so enthusiastically with us, and we look forward to our continued collaboration.” The RFP winners are invited to the Core Systems Faculty Summit in 2020 (time TBD), wher…

1 month, 1 week ago @ research.fb.com
Enforcing our Community Standards: How we track and measure progress

Over the past several years, we have made multiple investments to help us more effectively measure, detect, and remove content that goes against our Community Standards.

Somin’s team focuses on building transparency and accountability for enforcement of Facebook’s Community Standards.

Irina Somin: One of my key areas of focus is the measurement platform that tracks Facebook’s progress in enforcing our Community Guidelines.

We have developed the measurement platform and created a common language and taxonomies of our policies based on Facebook’s Community Standards.

We update our progress every six months in our Community Standards Enforcement Report, with a supporting commentary blog on Fa…

1 month, 1 week ago @ research.fb.com
Accelerating innovations in infrastructure and advancing global connectivity with our partners

High-quality internet access opens new opportunities to make yourself heard, connect with the people and communities you care about, and build new businesses.

In collaboration with our partners, we’re using Terragraph to meet this growing need for reliable, high-speed internet access.

The project includes metro fiber in select cities and connects to the national electrical grid’s fiber network to Kinshasa.

Increasing Wi-Fi connectivity with the Express Wi-Fi platform. With the Express Wi-Fi platform, service providers are able to provide fast, reliable Wi-Fi when and where people need it.

If you’re interested in learning more about how we’re working with our partners to accelerate global conn…

1 month, 1 week ago @ engineering.fb.com
Facebook supports research on misinformation and polarization with $2 million commitment

Our partnerships with outside experts are critical in addressing and understanding social challenges on communication platforms.

“To advance our understanding of how technology impacts people and society, we’ve strengthened our commitment to conducting social science research in partnership with academics globally,” says Umer Farooq, who leads community integrity research at Facebook.

With this in mind, we are launching the 2020 Foundational Integrity Research: Misinformation and Polarization request for proposals.

We will award $2 million in unrestricted gifts to support independent social science research on misinformation and polarization related to social communication technologies.

Thi…

1 month, 1 week ago @ research.fb.com
Summaries of the Content Policy Research Initiative workshops in Dar es Salaam and Rome

The Content Policy Research Initiative (CPRI) was launched last year with the goal of enhancing engagement with the research community around how we develop and enforce our Community Standards.

The two most recent workshops took place in December 2019, in Dar es Salaam, Tanzania, and Rome, Italy.

CPRI workshop in Dar es SalaamAt the CPRI workshop in Dar es Salaam, Facebook hosted 18 external researchers from around East Africa.

This came up with respect to hate speech, polarization, and the likelihood that offline harm could result from online content.

1 month, 2 weeks ago @ research.fb.com
Announcing the winners of the Content Governance request for proposals

It has also sparked a wider conversation on a variety of issues, including online speech, law and technology, digital constitutionalism, multi-stakeholderism, content moderation, content governance, journalism, applied ethics, free expression, digital rights, human rights, tech policy, and other related fields of study.

This ongoing exchange of ideas will influence how Facebook develops its strategies and plans for content governance going forward.

Recognizing the importance of this ongoing conversation, Facebook launched a request for proposals aimed at funding research and advocacy work in the area of online content.

Each proposal was reviewed by a multidisciplinary team at Facebook that …

1 month, 2 weeks ago @ research.fb.com
New privacy-protected Facebook data for independent research on social media’s impact on democracy

In 2018, Facebook began an initiative to support independent academic research on social media’s role in elections and democracy.

That 2019 data set consisted of links that had been shared publicly on Facebook by at least 100 unique Facebook users.

With this data, researchers will be able to understand important aspects of how social media shapes our world.

This new data set, like the data we released before it, is protected by a method known as differential privacy.

This white paper summarizes our learnings on implementing differential privacy and serves as a roadmap for other organizations seeking to implement similar privacy protections.

1 month, 3 weeks ago @ research.fb.com
Fighting Abuse @Scale 2019 recap

Fighting abuse presents unique challenges for large-scale organizations working to keep the people on their platforms safe.

At Fighting Abuse @Scale 2019, engineers, data scientists, product managers, and operations specialists gathered in Menlo Park for a day of technical talks focused on state-of-the art technologies to fight fraud, spam, and abuse on platforms that serve millions or even billions of people.

Our key insight is that sharing patterns can help hosting platforms identify abusive content, while hosting platforms can help sharing platforms prevent the spread of abusive content.

Results demonstrate that working together as an industry can strengthen the capacity to more quickly …

3 months, 3 weeks ago @ engineering.fb.com
CCSM: Scalable statistical anomaly detection to resolve app crashes faster

A contrast set mining algorithm. CSM provides a scalable, robust way to generate human-readable insights on high-dimensional crash data.

For a contrast set X and group G, the support S(X,G) is the percentage of vectors in group G for which the contrast set X is true.

To efficiently traverse the search space of feature combinations, we cast the problem of mining contrast sets as a tree search problem.

However, real world data is often mixed — our crash data contains a mix of categorical, discrete, and continuous data.

The continuous contrast mining algorithm adopts the same tree search framework, with modifications to reason about sets of continuous features.
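The support definition quoted above can be illustrated in a few lines. This is a toy example of the definition only, not Facebook's implementation; the crash data and feature names are invented.

```python
# Support S(X, G): the fraction of vectors in group G for which the
# contrast set X (a conjunction of feature=value conditions) holds.
def support(contrast_set, group):
    # contrast_set: dict of feature -> required value
    # group: list of dicts (feature vectors)
    matches = sum(
        all(vec.get(f) == v for f, v in contrast_set.items()) for vec in group
    )
    return matches / len(group)

crashes = [
    {'os': 'android', 'app_version': '9.0'},
    {'os': 'android', 'app_version': '8.5'},
    {'os': 'ios', 'app_version': '9.0'},
    {'os': 'android', 'app_version': '9.0'},
]
print(support({'os': 'android', 'app_version': '9.0'}, crashes))  # 0.5
```

Contrast set mining then searches for sets X whose support differs significantly between groups (e.g., crashing vs. non-crashing sessions).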

4 months, 1 week ago @ engineering.fb.com
Fast dimensional analysis for root cause analysis at scale

Nikolay Pavlovich Laptev, Fred Lin, Keyur Muzumdar, Mihai-Valentin Curelea. What the research is: A fast dimensional analysis (FDA) framework that automates root cause analysis on structured logs with improved scalability.

When a failure event happens in a large-scale distributed production environment, performing root cause analysis can be challenging.

Our proposed FDA framework combines structured logs from a number of sources and provides a meaningful combination of features.

As we’ve mentioned, the challenges of performing root cause analysis in a large-scale distributed production environment make outage detection and mitigation difficult.

Read the full paper: Fast Dimensional Analysis for Ro…

4 months, 4 weeks ago @ engineering.fb.com
2019 @Scale Conference recap

If you are interested in future events, visit the @Scale website or join the @Scale community.

@Scale 2019: Data Infra. Zanzibar: Google’s consistent, global authorization system. Ruoming Pang, Principal Software Engineer, Google. Determining whether online users are authorized to access digital objects is central to preserving privacy.

6 technical challenges in developing a distributed SQL database. Neha Deodhar, Software Engineer, YugaByte. Neha discusses the experience of developing YugaByte.

@Scale 2019: Security. Leveraging the type system to write secure applications. Shannon Zhu, Software Engineer, Facebook. Shannon discusses ways to extend the type system to eliminate entire classes of security vul…

5 months, 1 week ago @ engineering.fb.com
Video @Scale 2019 recap

At Video @Scale 2019, engineers gathered in San Francisco for a day of technical talks focused on delivering video at scale.

Adopting video at scale. Steven Robertson, Engineer, YouTube. Steven works on streaming video performance at YouTube.

AV1 Panel. Ronald Bultje, Founder, Two Orioles; Yaowu Xu, Principal Software Engineer, Google; Chekib Nouira, Senior Video Systems Engineer, Intel. Panel moderated by Ioannis Katsavounidis.

Contextual video ad safety. Vijaya Chandra, Software Engineering Manager, Facebook; Rose Kanjirathinkal, Research Scientist, Facebook. Vijaya leads video understanding efforts at Facebook.

Video integrity at scale. Sonal Gandhi, Software Engineer, Facebook. Sonal talks about reducing har…

5 months, 3 weeks ago @ engineering.fb.com
Google
last post 2 days, 18 hours ago
Exploring Nature-Inspired Robot Agility

Comparison of policies before and after adaptation on the real robot.

Before adaptation, the robot is prone to falling.

But after adaptation, the policies are able to more consistently execute the desired skills.

2 days, 18 hours ago @ ai.googleblog.com
Announcing the 2020 Image Matching Benchmark and Challenge

Some example images sampled from the Image Matching Challenge dataset, showing different perspectives of the Trevi Fountain.

A 3D reconstruction generated from over 3000 images, including those from the previous figure.

We show point-to-point matches generated by different local feature algorithms.

1 Please note that as of April 2, 2020, CVPR is currently on track, despite the COVID-19 pandemic.

Please see the 2020 Image Matching Challenge website for details.

3 days, 16 hours ago @ ai.googleblog.com
A Step Towards Protecting Patients from Medication Errors

Based on a patient’s medical history and current clinical characteristics, the model ranks the medications a physician is most likely to prescribe.

3 days, 19 hours ago @ ai.googleblog.com
Transform your photo in the style of an iconic artist

From the bold, swirling movement in Vincent van Gogh's paintings, to the surreal, confident brushstrokes of Frida Kahlo, many famous artists have instantly recognizable styles.

Now you can use these styles to transform your own photos.

With Art Transfer, a new feature in the Google Arts & Culture app, you can apply the characteristics of well-known paintings to your own images.

To try it, open the Camera menu in the bottom bar of the Google Arts & Culture app and select “Art Transfer.” After taking or uploading a photo, choose from dozens of masterpieces to transfer that style onto your image.

For more customization, you can use the scissors icon to select which part of the image you want t…

4 days, 3 hours ago @ blog.google
Improving Audio Quality in Duo with WaveNetEQ

WaveNetEQ architecture.

During inference, we "warm up" the autoregressive network by teacher forcing with the most recent audio.

Afterwards, the model is supplied with its own output as input for the next step.

A MEL spectrogram from a longer audio part is used as input for the conditioning network.
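The warm-up-then-generate loop described above can be sketched with a toy autoregressive model (a hypothetical 8-tap linear predictor standing in for the actual WaveNetEQ network; sample counts are illustrative):

```python
import numpy as np

# Toy illustration of the inference scheme described above:
# (1) teacher forcing on the most recent real audio to "warm up" the context,
# (2) free-running generation, feeding the model its own output back in.

rng = np.random.default_rng(0)
weights = 0.1 * rng.normal(size=8)     # stand-in for trained model parameters
recent_audio = rng.normal(size=64)     # samples received just before the loss

def step(history):
    # one autoregressive step: predict the next sample from the last 8
    return float(weights @ history[-8:])

# Teacher forcing: the context contains only ground-truth samples.
context = list(recent_audio)

# Generate 480 samples (10 ms at 48 kHz) to conceal a lost packet.
generated = []
for _ in range(480):
    nxt = step(np.asarray(context))
    generated.append(nxt)
    context.append(nxt)                # model output becomes the next input

print(len(generated))  # 480
```

In the real system the generation is additionally conditioned on longer-range context via the MEL-spectrogram conditioning network.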

4 days, 17 hours ago @ ai.googleblog.com
Exploring New Ways to Support Faculty Research

Give us feedback in our Product Forums

1 week, 3 days ago @ ai.googleblog.com
A Neural Weather Model for Eight-Hour Precipitation Forecasting

The architecture of the neural weather model, MetNet.

The input satellite and radar images first pass through a spatial downsampler to reduce memory consumption.

They are then processed by a convolutional LSTM at 15-minute intervals over the 90 minutes of input data.

Axial attention layers are then used so the network can attend to the entirety of the input images.
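Axial attention makes a full-image receptive field affordable by attending along one spatial axis at a time. A self-contained numpy sketch of a single-head axial pass (identity Q/K/V projections and toy shapes are assumed; this is not MetNet's implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(x, axis):
    """Single-head self-attention along one spatial axis of an (H, W, C) map.
    Identity Q/K/V projections are used for brevity."""
    x = np.moveaxis(x, axis, 0)                 # attended axis first: (L, M, C)
    L, M, C = x.shape
    # attention scores along the attended axis, per position on the other axis
    scores = np.einsum('lmc,nmc->mln', x, x) / np.sqrt(C)   # (M, L, L)
    attn = softmax(scores, axis=-1)
    out = np.einsum('mln,nmc->lmc', attn, x)    # weighted sum over that axis
    return np.moveaxis(out, 0, axis)

# Hypothetical downsampled feature map (16 x 16 grid, 8 channels).
feat = np.random.default_rng(0).normal(size=(16, 16, 8))
out = axial_attention(axial_attention(feat, 0), 1)   # row pass, then column pass
print(out.shape)  # (16, 16, 8)
```

Two passes (rows, then columns) connect every position to every other at cost O(HW(H+W)) instead of O((HW)^2) for full 2D attention.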

1 week, 4 days ago @ ai.googleblog.com
Five things you (maybe) didn't know about AI

AI is being used to help tackle the global climate crisis.

AI offers us the ability to process large volumes of data and uncover patterns—an invaluable aid when it comes to climate change.

One common use case is AI-powered systems that help people regulate the amount of energy they use by turning off the heating and lights when they leave the house.

AI is also helping to model glacier melt and predict rising sea levels so that effective action can be taken.

Researchers are also considering the environmental impact of data centers and AI computing itself by exploring how to develop more energy efficient systems and infrastructures.

1 week, 5 days ago @ blog.google
Massively Scaling Reinforcement Learning with SEED RL

The score of different architectures on the Google Research Football “Hard” task.

We show that by using a larger input resolution and a larger model, the score is improved, and with more training, the model can significantly outperform the built-in AI.

1 week, 6 days ago @ ai.googleblog.com
Visual Transfer Learning for Robotic Manipulation

Affordance-based grasping models trained from scratch can struggle to pick up new objects after 60 minutes of training (left).

With pre-training from visual tasks, our affordance-based grasping models can easily generalize to picking up new objects with less than 10 minutes of training, even when evaluated with different hardware (middle: suction, right: gripper).

2 weeks, 2 days ago @ ai.googleblog.com
Introducing Dreamer: Scalable Reinforcement Learning Using World Models

The three processes of the Dreamer agent.

The world model is learned from past experience.

From predictions of this model, the agent then learns a value network to predict future rewards and an actor network to select actions.

The actor network is used to interact with the environment.

2 weeks, 4 days ago @ ai.googleblog.com
Fast and Easy Infinitely Wide Networks with Neural Tangents

In both plots we compare training of an ensemble of finite neural networks with the infinite-width ensemble of the same architecture.

The empirical mean and variance of the finite ensemble is displayed as a dashed black line between two dotted black lines.

The closed-form mean and variance of the infinite-width ensemble is displayed as a solid colored line inside a filled color region.

In both plots finite- and infinite-width ensembles match very closely and can be hard to distinguish.

Right: Train and test loss with uncertainty over the course of training.
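The closed-form mean and variance exist because an infinitely wide network behaves as a Gaussian process. As an illustration, here is the closed-form infinite-width (NNGP) kernel of a one-hidden-layer ReLU network in plain numpy, used for exact posterior-mean prediction; this is a self-contained sketch, not the Neural Tangents API:

```python
import numpy as np

def relu_nngp_kernel(X1, X2):
    """Covariance of an infinitely wide one-hidden-layer ReLU network
    (first-order arc-cosine kernel), computed in closed form."""
    n1 = np.linalg.norm(X1, axis=1)
    n2 = np.linalg.norm(X2, axis=1)
    cos = np.clip((X1 @ X2.T) / np.outer(n1, n2), -1.0, 1.0)
    theta = np.arccos(cos)
    return np.outer(n1, n2) * (np.sin(theta) + (np.pi - theta) * cos) / (2 * np.pi)

# Exact inference with the infinite-width ensemble = GP regression.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 3))
y_train = np.sin(X_train.sum(axis=1, keepdims=True))
X_test = rng.normal(size=(5, 3))

K = relu_nngp_kernel(X_train, X_train) + 1e-6 * np.eye(20)  # jitter
K_star = relu_nngp_kernel(X_test, X_train)
mean = K_star @ np.linalg.solve(K, y_train)   # closed-form posterior mean
print(mean.shape)  # (5, 1)
```

The Neural Tangents library generalizes this to deep architectures and also provides the NTK, which describes infinite-width networks under gradient-descent training.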

3 weeks, 2 days ago @ ai.googleblog.com
Soli Radar-Based Perception and Interaction in Pixel 4

Left: Presence - Person walking towards the device.

Middle: Reach - Person reaching towards the device.

Right: Swipe - Person swiping over the device.

3 weeks, 3 days ago @ ai.googleblog.com
Real-Time 3D Object Detection on Mobile Devices with MediaPipe

Real-world data annotation for 3D object detection.

Right: 3D bounding boxes are annotated in the 3D world with detected surfaces and point clouds.

Left: Projections of annotated 3D bounding boxes are overlaid on top of video frames making it easy to validate the annotation.

3 weeks, 4 days ago @ ai.googleblog.com
Introducing Cloud AI Platform Pipelines

SDKs: Cloud AI Platform Pipelines supports two SDKs to author ML pipelines: the Kubeflow Pipelines SDK—part of the Kubeflow OSS project—and the TFX SDK.

When choosing the SDK to run your ML pipelines with the AI Platform Pipelines beta, we recommend: the TFX SDK and its templates for E2E ML pipelines based on TensorFlow, with customizable data pre-processing and training code.

Kubeflow Pipelines SDK for fully custom pipelines, or pipelines that use prebuilt KFP components, which support access to a wide range of GCP services.

New Pipelines features: The beta launch of AI Platform Pipelines includes a number of new features, including support for template-based pipeline construction, versioning, and …

3 weeks, 4 days ago @ cloud.google.com
OpenAI
last post 2 months ago
OpenAI → PyTorch

We are standardizing OpenAI’s deep learning framework on PyTorch.

The main reason we've chosen PyTorch is to increase our research productivity at scale on GPUs.

It is very easy to try and execute new research ideas in PyTorch; for example, switching to PyTorch decreased our iteration time on research ideas in generative modeling from weeks to days.

Going forward we'll primarily use PyTorch as our deep learning framework but sometimes use other ones when there's a specific technical reason to do so.

Many of our teams have already made the switch, and we look forward to contributing to the PyTorch community in upcoming months.

2 months ago @ openai.com
OpenAI Five

You play against [OpenAI Five] and you realize it has a playstyle that is different.

It’s doing things that you’ve never done and you’ve never seen.

One key learning that we took is how it was allocating resources.

It’s just allocating resources as efficiently as possible.

[…] If OpenAI does that dynamic switch at 100%, we maybe went from 5% to 10%?

3 months, 3 weeks ago @ openai.com
Deep Double Descent

Many classes of modern deep learning models, including CNNs, ResNets, and transformers, exhibit the previously-observed double descent phenomenon when not using early stopping or regularization.

The model-wise double descent phenomenon can lead to a regime where training on more data hurts.

The double descent phenomenon is most prominent in settings with added label noise; without it, the peak is smaller and easy to miss.

For a given number of optimization steps (fixed y-coordinate), test and train error exhibit model-size double descent.

We leave fully understanding the mechanisms behind double descent in deep neural networks as an important open question.

4 months ago @ openai.com
Procgen Benchmark

We’re releasing Procgen Benchmark, 16 simple-to-use procedurally-generated environments which provide a direct measure of how quickly a reinforcement learning agent learns generalizable skills.

To fulfill this need, we have created Procgen Benchmark.

CoinRun now serves as the inaugural environment in Procgen Benchmark, contributing its diversity to a greater whole.

With Procgen Benchmark, we strive for all of the following: experimental convenience, high diversity within environments, and high diversity across environments.

We've now expanded on those results, conducting our most thorough study of RL generalization to date using all 16 environments in Procgen Benchmark.

4 months ago @ openai.com
Safety Gym

We're releasing Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while training.

Safety Gym: To study constrained RL for safe exploration, we developed a new set of environments and tools called Safety Gym.

Benchmark: To help make Safety Gym useful out of the box, we evaluated some standard RL and constrained RL algorithms on the Safety Gym benchmark suite: PPO, TRPO, Lagrangian-penalized versions of PPO and TRPO, and Constrained Policy Optimization (CPO).
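The Lagrangian-penalized baselines adapt a multiplier so the policy trades return against a cost limit. A toy sketch of the multiplier (dual-ascent) update, with illustrative numbers rather than OpenAI's implementation:

```python
# Toy sketch of the Lagrangian approach used in penalized PPO/TRPO:
# maximize reward - lambda * cost, adapting lambda by dual ascent so that
# expected episode cost approaches the limit d.
# (Illustrative numbers, not the actual Safety Gym training loop.)

d = 25.0            # per-episode cost limit
lam = 0.0           # Lagrange multiplier
lr_lambda = 0.05

def measured_cost(lam):
    # stand-in for policy rollouts: a larger penalty yields a safer policy
    return 60.0 / (1.0 + lam)

for _ in range(200):
    violation = measured_cost(lam) - d
    lam = max(0.0, lam + lr_lambda * violation)   # grow lambda while violating

print(round(measured_cost(lam), 1))  # 25.0 (cost driven to the limit)
```

The multiplier rises while the constraint is violated and relaxes once the policy satisfies it, so the cost settles at the limit rather than at zero.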

There are three things we are most interested in at the moment: improving performance on the current Safety Gym environments.

We also hope that systems l…

4 months, 2 weeks ago @ openai.com
GPT-2: 1.5B Release

As the final model release of GPT-2’s staged release, we’re releasing the largest version (1.5B parameters) of GPT-2 along with code and model weights to facilitate detection of outputs of GPT-2 models.

Our partners at Cornell University surveyed people to assign GPT-2 text a credibility score across model sizes.

People gave the 1.5B model a “credibility score” of 6.91 out of 10.

These results make us more inclined to release the 1.5B model, as the incremental increase in human-perceived credibility relative to 774M seems low.

We acknowledge that we cannot be aware of all threats, and that motivated actors can replicate language models without model release.

5 months ago @ openai.com
Solving Rubik’s Cube with a Robot Hand

We've trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand.

Since May 2017, we've been trying to train a human-like robotic hand to solve the Rubik’s Cube.

Solving a Rubik’s Cube one-handed is a challenging task even for humans, and it takes children several years to gain the dexterity required to master it.

To test the limits of our method, we experiment with a variety of perturbations while the hand is solving the Rubik’s Cube.

Behind the scenes: Rubik’s Cube prototypes. In order to benchmark our progress and make the problem tractable, we built and designed custom versions of cubes as stepping stones towards ultimately solving a regular Rubik’s Cube.

5 months, 3 weeks ago @ openai.com
OpenAI Scholars Spring 2020

The second class of Scholars recently released their projects and presented their work at the 2019 Scholars Demo Day.

While we hope that some of the scholars will join OpenAI, we want this program to improve diversity in the field at large.

For Bay Area participants, we offer an optional desk at the OpenAI office (which our past Scholars have found very valuable).

We look for people who are comfortable writing software (2+ years in software engineering), but no previous machine learning experience is required.

We ask all Scholars to document their experiences studying deep learning to hopefully inspire others to join the field too.

5 months, 3 weeks ago @ openai.com
Fine-Tuning GPT-2 from Human Preferences

We’ve fine-tuned the 774M parameter GPT-2 language model using human feedback for various tasks, successfully matching the preferences of the external human labelers, though those preferences did not always match our own.

Fine-tuning for the stylistic continuation tasks is sample efficient: 5,000 human samples suffice for strong performance according to humans.

However, when combining supervised fine-tuning with human fine-tuning, our models outperform lead-3 on ROUGE scores.

The cost of human data means that volume will always be low, so it is easy to retrain from scratch (or rather, from the GPT-2 starting point) each time.

Looking forward: We’ve demonstrated reward learning from human pref…

6 months, 2 weeks ago @ openai.com
Emergent Tool Use from Multi-Agent Interaction

Through training in our new simulated hide-and-seek environment, agents build a series of six distinct strategies and counterstrategies, some of which we did not know our environment supported.

The self-supervised emergent complexity in this simple environment further suggests that multi-agent co-adaptation may one day produce extremely complex and intelligent behavior.

In this full environment, agents go through two more phases of emergent strategy than in the previous simple environment.

Multi-agent competition vs. intrinsic motivation: In this work we show evidence that agents learn complex strategies and counterstrategies through a self-supervised autocurriculum in hide-and-seek.

Though t…

6 months, 3 weeks ago @ openai.com
Testing Robustness Against Unforeseen Adversaries

Our method yields a new metric, UAR (Unforeseen Attack Robustness), which evaluates the robustness of a single model against an unanticipated attack, and highlights the need to measure performance across a more diverse range of unforeseen attacks.

The field has made progress in hardening models against such attacks; however, robustness against one type of distortion often does not transfer to robustness against attacks unforeseen by designers of the model.

It also yields a new metric, UAR, which assesses the adversarial robustness of models against unforeseen distortion types.

A UAR score near 100 against an unforeseen adversarial attack implies performance comparable to a defense with prio…
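A UAR-style score can be thought of as the evaluated model's accuracy under an attack, summed over distortion sizes and normalized by an adversarially trained reference with prior knowledge of that attack. A hypothetical sketch (illustrative numbers; see the paper for the precise definition):

```python
# Hypothetical sketch of a UAR-style computation: the evaluated model's
# accuracy under an unforeseen attack at several distortion sizes, divided by
# the accuracy of a reference defense trained against that same attack.
# (Illustrative numbers, not the paper's exact calibration procedure.)

def uar(model_accuracies, reference_accuracies):
    assert len(model_accuracies) == len(reference_accuracies)
    return 100.0 * sum(model_accuracies) / sum(reference_accuracies)

# accuracies at increasing distortion sizes eps_1 < ... < eps_4
evaluated = [0.80, 0.55, 0.30, 0.10]
adv_trained_reference = [0.85, 0.70, 0.55, 0.40]

print(round(uar(evaluated, adv_trained_reference), 1))  # 70.0
```

Normalizing by the adversarially trained reference is what makes a score near 100 mean "as robust as a defense that knew the attack in advance."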

7 months, 2 weeks ago @ openai.com
GPT-2: 6-Month Follow-Up

Our research suggests that current ML-based methods only achieve low to mid–90s accuracy, and that fine-tuning the language models decreases accuracy further.

PartnershipsWe’ve partnered with four leading research organizations to analyze both the newly-released 774M parameter GPT-2 model and the unreleased full-size GPT-2 model.

is studying human susceptibility to digital disinformation generated by language models.

Center on Terrorism, Extremism, and Counterterrorism (CTEC) is exploring how GPT-2 could be misused by terrorists and extremists online.

The University of Oregon is developing a series of “bias probes” to analyze bias within GPT-2.

7 months, 2 weeks ago @ openai.com
Learning Day

Before Learning Day, we very rarely saw people grow cross-functionally—for example, employees coming from a software background rarely picked up machine learning (something equally rare in other organizations outside academia).

What we learn on Learning DayThe following are examples of what people learn on a single Learning Day.

Learning Day could turn into a normal working day because people may want to accomplish their main project faster (due to internal or external pressure).

We prevent this by having Learning Day on the same day for every team.

Learning Day beyond RoboticsWe’ve recently expanded Learning Day from a subset of our technical teams to the entire company.

8 months, 1 week ago @ openai.com
Microsoft Invests In and Partners with OpenAI to Support Us Building Beneficial AGI

Microsoft is investing $1 billion in OpenAI to support us building artificial general intelligence (AGI) with widely distributed economic benefits.

We're partnering to develop a hardware and software platform within Microsoft Azure which will scale to AGI.

An AGI working on a problem would be able to see connections across disciplines that no human could.

OpenAI is producing a sequence of increasingly powerful AI technologies, which requires a lot of capital for computational power.

Instead, we intend to license some of our pre-AGI technologies, with Microsoft becoming our preferred partner for commercializing them.

8 months, 2 weeks ago @ openai.com
Why Responsible AI Development Needs Cooperation on Safety

Our analysis shows that industry cooperation on safety will be instrumental in ensuring that AI systems are safe and beneficial, but competitive pressures could lead to a collective action problem, potentially causing AI companies to under-invest in safety.

We hope these strategies will encourage greater cooperation on the safe development of AI and lead to better global outcomes of AI.

Cooperation strategies We've found four strategies that can be used today to improve the likelihood of cooperation on safety norms and standards in AI.

Incentivize adherence to high standards of safety: Commend those that adhere to safety standards; reproach failures to ensure that systems are developed safel…

9 months ago @ openai.com
Microsoft
last post 2 days, 18 hours ago
A conversation with Kevin Scott, author of “Reprogramming the American Dream”

And it was just an incredible thing, given our circumstances, that my mom and dad would’ve done that for me.

He was an awesome, awesome dad, and some of the most important things I learned from him by example.

The really hilarious thing, as he was giving me all of this miserable work, is that I had already decided that I was really, really interested in computers and programming, and I knew that I wanted to go to college.

In your book, you raise the idea that technology such as AI can help people in rural America achieve the American dream.

You just can’t expect them to fully participate in this new emerging economy without this very basic bit of infrastructure.

2 days, 18 hours ago @ blogs.microsoft.com
An interview with Microsoft President Brad Smith

Episode 113 | April 1, 2020Brad Smith is the President of Microsoft and leads a team of more than 1400 employees in 56 countries.

Host: Brad Smith is the President of Microsoft and leads a team of more than 1400 employees in 56 countries.

Brad Smith: I think it’s important to have a word that encompasses more of what we’re really talking about.

Brad Smith: Well, first we, at Microsoft, did develop and publish our six ethical principles in a way that’s sort of remarkable to me.

Brad Smith: They were both really important and, in my view, exciting steps for Microsoft to take.

4 days, 16 hours ago @ microsoft.com
Alt text that informs: Meeting the needs of people who are blind or low vision

Image descriptions are vital to making digital content fully accessible to people who are blind or have low vision.

Is the Person Naked?’ What People with Vision Impairments Want in Image Descriptions,” which was accepted at the ACM CHI Conference on Human Factors in Computing Systems (CHI 2020).

Screen readers present textual content as audio or Braille output and can only present digital images if they have an accompanying alt text description.

The paper’s title stems from one participant’s observation regarding the inadequacy of current AI-based image descriptions.

[Table: website categories surveyed (News, Social Networking, eCommerce, Employment, Dating, Productivity, E-Publication, Event/Scene) against the description content participants wanted, e.g., “People Present” and “Text” …]

5 days, 18 hours ago @ microsoft.com
Finally, progress on regulating facial recognition

In 2018, we urged the tech sector and the public to avoid a commercial race to the bottom on facial recognition technology.

Testing requirementsFirst, the law will accelerate market forces to address the risk of bias in facial recognition technology.

Recent NIST research demonstrated that some facial recognition technologies have encountered higher error rates across different demographic groups.

This training must cover both the capabilities and limitations of the service, as well as how to interpret facial recognition output.

For example, authorities must disclose their use of facial recognition technology to criminal defendants in a timely manner prior to trial.

5 days, 21 hours ago @ blogs.microsoft.com
Extending the reach of care: Tending to patients using chat app technology

Our research included observation of the nurses’ work, interviews with the nurses and doctors, and an examination of the chat messages that best illustrate the work the nurses carry out on WeChat.

When new patients were enrolled, with their consent they were added to existing patient groups instead of creating new groups each time.

After responding to all “@” messages, the nurses skim through the interactions between patients, looking for the most active patients.

When they use the chat app, they are careful to use professional judgment and practices to determine whether a message should be sent via chat groups, one-on-one chat, or moving a conversation offline (see more about contextualizi…

6 days, 18 hours ago @ microsoft.com
New AI tools help writers be more clear, concise and inclusive in Office and across the web

For teams that incorporate AI into productivity tools, one of the most important principles is to keep people at the center of the process.

“So making sure we’re talking to a much broader set of people and hearing everyone’s voice is really important to give people what they truly need,” he said.

“It was like listening for a needle in a haystack, and the fatigue level was really high,” Friedman said.

So the team used AI to offer the most important information upfront and in a much more conversational way.

“We started on this path because we thought inclusive design was an important philosophy that we needed to start living and breathing in product,” Friedman said.

6 days, 20 hours ago @ blogs.microsoft.com
Microsoft Rocketbox avatar library now available for research and academic use

Microsoft Rocketbox can be downloaded from GitHub.

The flexibility of the Rocketbox libraryOriginally developed by Rocketbox Studios GmbH and later supported by Havok, which Microsoft acquired in 2015, the Microsoft Rocketbox avatar library represents extensive work in research and prototyping conducted over a 10-year span.

Avatar embodimentAvatar embodiment is one field of avatar research that the release of the Microsoft Rocketbox library will further enable.

Microsoft Rocketbox a public library,” will be released in the coming weeks.

IEEE VR 2020Apart from the Microsoft Rocketbox release, we also presented our latest research on avatars last week at IEEE VR.

6 days, 20 hours ago @ microsoft.com
Extending the power of Azure AI to Microsoft 365 users

Today, Yusuf Mehdi, Corporate Vice President of Modern Life and Devices, announced the availability of two new subscriptions: Microsoft 365 Personal and Microsoft 365 Family. In his blog, he shared…

1 week ago @ azure.microsoft.com
Microsoft ElectionGuard—enabling voters to verify that their votes are correctly counted webinar

Microsoft Research Webinar Series: Microsoft ElectionGuard—enabling voters to verify that their votes are correctly counted. Microsoft ElectionGuard provides a free, open-source software toolkit, which can be used in new and existing election systems to allow voters to verify that their votes have been accurately counted.

Voters can check for themselves that their votes have been correctly recorded, and anyone—voters, candidates, media, or even casual observers—can verify that the recorded votes have been accurately tallied.

In this webinar with Dr. Josh Benaloh, Senior Principal Cryptographer at Microsoft Research, learn how the ElectionGuard technology works and how anyone can write an indepe…

1 week, 3 days ago @ note.microsoft.com
Microsoft’s AI Transformation, Project Turing and smarter search with Rangan Majumder

Rangan Majumder: Right.

Rangan Majumder: That’s right.

So first was, we’ve got this deep learning search stack, deep learning question answering system, but then we started to build these Turing Neural Language Representation models.

Host: It’s like taking a thoroughbred to a kid’s party…
Rangan Majumder: That’s right.

Rangan Majumder: That’s right and he has this Aether committee that our team is actually involved in.

1 week, 5 days ago @ microsoft.com
Data-driven insights for more effective, personalized care in online mental health interventions

Increases in the occurrence and global effect of mental illness have made the prevention and treatment of mental health problems a public health priority.

This research investigation is part of Project Talia, which follows a human-centered approach for identifying how ML applications can meaningfully assist in the detection, diagnosis, monitoring, and treatment of mental health problems.

The work also examines how the type and frequency of coaching support could be better tailored to each client’s unique mental health and treatment needs.

Clients work through the program at their own pace and time, and they receive regular, personalized feedback messages sent by a trained mental health coach.

More specifically, w…

1 week, 5 days ago @ microsoft.com
Microsoft is expanding the Azure Stack Edge with NVIDIA GPU preview

Today, we’re expanding the Microsoft Azure Stack Edge with NVIDIA T4 Public Preview at NVIDIA’s GPU Technology Conference (GTC).

1 week, 5 days ago @ azure.microsoft.com
MIT AI
last post 3 days, 15 hours ago
Q&A: Markus Buehler on setting coronavirus and AI-inspired proteins to music

Just ask Markus Buehler: The musician and MIT professor develops artificial intelligence models to design new proteins, sometimes by translating them into sound.

We can visualize the protein’s structure and use other computational methods to assess its function by analyzing its stability and the other proteins it binds to in cells.

We represented the physical protein structure, with its entangled chains, as interwoven melodies that form a multi-layered composition.

Translating proteins into sound gives scientists another tool to understand and design proteins.

Through sonification, we can also compare the biochemical processes of its spike protein with previous coronaviruses, like SARS or ME…

3 days, 15 hours ago @ news.mit.edu
Q&A: Markus Buehler on setting SARS-CoV-2 protein and AI-inspired proteins to music

We can visualize the protein’s structure and use other computational methods to assess its function by analyzing its stability and the other proteins it binds to in cells.

A: Its protein spike contains three protein chains folded into an intriguing pattern.

We represented the physical protein structure, with its entangled chains, as interwoven melodies that form a multi-layered composition.

Translating proteins into sound gives scientists another tool to understand and design proteins.

Through sonification, we can also compare the biochemical processes of its spike protein with previous coronaviruses, like SARS or MERS.

3 days, 15 hours ago @ news.mit.edu
Neural networks facilitate optimization in the search for new materials

This culling process would have taken 50 years by conventional analytical methods, they say, but they accomplished it in five weeks.

Instead, Kulik and her team took a small number of different possible materials and used them to teach an advanced machine-learning neural network about the relationship between the materials’ chemical compositions and their physical properties.

That knowledge was then applied to generate suggestions for the next generation of possible materials to be used for the next round of training of the neural network.

Through four successive iterations of this process, the neural network improved significantly each time, until reaching a point where it was clear that f…

1 week, 4 days ago @ news.mit.edu
“Inactive” pill ingredients could raise the dose of your medication

The average medication contains a mix of eight “inactive” ingredients added to pills to make them taste better, last longer, and stabilize the active ingredients within.

But now, in a new twist, MIT researchers have discovered that two other inactive ingredients may actually boost medication strength to the benefit of some patients.

They also outline a method for using machine learning to find other inactive ingredients with untapped therapeutic value.

Machine learning allowed the researchers to quickly make comparisons between millions of drugs and inactive ingredients to identify the additives most likely to have an effect.

Comparing the chemical structures of the 800 “inactive” ingredien…

2 weeks, 5 days ago @ news.mit.edu
Deep learning for mechanical property evaluation

A standard method for testing some of the mechanical properties of materials is to poke them with a sharp point.

This “indentation technique” can provide detailed measurements of how the material responds to the point’s force, as a function of its penetration depth.

“Small” challenges beyond elasticity. “Indentation is a very good method for testing mechanical properties,” Dao says, especially in cases where only small samples are available for testing.

Indentation can be used to determine hardness, but Dao explains that “hardness is only a combination of a material’s elastic and plastic properties.

The work was supported by the Army Research Laboratory, the U.S. Department of Energy, and the…

2 weeks, 6 days ago @ news.mit.edu
The elephant in the server room

“The data never speak for themselves,” says D’Ignazio, referring to the general problem of finding reliable numbers about women’s lives.

In the book, “Data Feminism,” published this month by the MIT Press, the authors use the lens of intersectional feminism to scrutinize how data science reflects the social structures it emerges from.

Who’s benefiting?” Still, the question of who participates in data science is, as the authors write, “the elephant in the server room.” As of 2011, only 26 percent of all undergraduates receiving computer science degrees in the U.S. were women.

People interested in data feminism, the authors state, should also “value multiple forms of knowledge,” including firs…

4 weeks ago @ news.mit.edu
“Doing machine learning the right way”

The work of MIT computer scientist Aleksander Madry is fueled by one core mission: “doing machine learning the right way.” Madry’s research centers largely on making machine learning — a type of artificial intelligence — more accurate, efficient, and robust against errors.

It’s in my DNA.” Getting adversarial. Shortly after joining MIT, Madry found himself swept up in a novel science: machine learning.

“We want machine learning not just as a toy, but as something you can use in, say, an autonomous car, or health care.

“Sometimes we overestimate the power of machine learning, thinking it will be our salvation.

“To do machine learning right, there’s still a lot left to figure out.”

4 weeks, 1 day ago @ news.mit.edu
Showing robots how to do your chores

Training interactive robots may one day be an easy job for everyone, even those without programming expertise.

Roboticists are developing automated robots that can learn new tasks solely by observing humans.

In the workplace, you could train robots like new employees, showing them how to perform many duties.

But the researchers’ robot made no mistakes over several real-world experiments, and only a handful of mistakes over tens of thousands of simulated test runs.

The robot’s observations of 30 human demonstrations for setting the table yielded a probability distribution over 25 different LTL formulas.

1 month ago @ news.mit.edu
A new model of vision

This type of model, known as efficient inverse graphics (EIG), also correlates well with electrical recordings from face-selective regions in the brains of nonhuman primates, suggesting that the primate visual system may be organized in much the same way as the computer model, the researchers say.

“Vision is the functional aspect of the brain that we understand the best, in humans and other animals,” Tenenbaum says.

Their model thus learns to reverse the steps performed by a computer graphics program for generating faces.

Model performance. The researchers found that their model is consistent with data obtained by studying certain regions in the brains of macaque monkeys.

“Their approach merg…

1 month ago @ news.mit.edu
Demystifying the world of deep networks

Despite this, deep networks show good predictive performance, and in fact do better the more parameters they have.

The surprising fact is that no such explicit constraint seems to be needed in training deep networks.

As co-author and MIT postdoc Andrzej Banburski explains, “Understanding convergence in deep networks shows that there are clear directions for improving our algorithms.

There is no magic behind deep networks.

This work suggests ways to improve deep networks, making them more accurate and faster to train.

1 month, 1 week ago @ news.mit.edu
Machine learning picks out hidden vibrations from earthquake data

They do so by tracking seismic waves that are produced naturally by earthquakes or artificially via explosives or underwater air guns.

Specifically generating low-frequency waves would require pumping in enormous amounts of energy.

For these reasons, low-frequency seismic waves have largely gone missing in human-generated seismic data.

Speaking another frequency. A neural network is a set of algorithms modeled loosely after the neural workings of the human brain.

Sun and Demanet adapted a neural network for signal processing, specifically, to recognize patterns in seismic data.

1 month, 1 week ago @ news.mit.edu
To self-drive in the snow, look under the road

But so far even the most high-tech vehicles still fail when it comes to safely navigating in rain and snow.

This is because these weather conditions wreak havoc on the most common approaches for sensing, which usually involve either lidar sensors or cameras.

Specifically, the CSAIL team used a particular form of GPR instrumentation developed at MIT Lincoln Laboratory called localizing ground-penetrating radar, or LGPR.

But its ability to localize in bad weather means that it would couple nicely with lidar and vision approaches.

LGPR maps also take up only about 80 percent of the space used by traditional 2D sensor maps that many companies use for their cars.

1 month, 1 week ago @ news.mit.edu
MIT Solve announces 2020 global challenges

Solve seeks tech-based solutions from social entrepreneurs around the world that address these four challenges.

Finalists will be invited to attend Solve Challenge Finals on Sept. 20 in New York City during U.N. General Assembly week.

At the event, they will pitch their solutions to Solve’s Challenge Leadership Groups, judging panels comprised of industry leaders and MIT faculty.

Solve’s challenge design process collects insights and ideas from industry leaders, MIT faculty, and local community voices alike.

As a marketplace for social impact innovation, Solve’s mission is to solve world challenges.

1 month, 1 week ago @ news.mit.edu
Bringing deep learning to life

Gaby Ecanow loves listening to music, but never considered writing her own until taking 6.S191 (Introduction to Deep Learning).

The course covers the technical foundations of deep learning and its societal implications through lectures and software labs focused on real-world applications.

A branch of machine learning, deep learning harnesses massive data and algorithms modeled loosely on how the brain processes information to make predictions.

Predicting protein behavior is key to designing drug targets, among other clinical applications, and Sledzieski wondered if deep learning could speed up the search for viable protein pairs.

After finishing their undergraduate degrees, they decided to …

1 month, 1 week ago @ news.mit.edu
A human-machine collaboration to defend against cyberattacks

Being a cybersecurity analyst at a large company today is a bit like looking for a needle in a haystack — if that haystack were hurtling toward you at fiber optic speed.

“Most machine learning systems in cybersecurity have been doing anomaly detection,” says Kalyan Veeramachaneni, a co-founder of PatternEx and a principal research scientist at MIT.

Veeramachaneni and Arnaldo knew from their time building tools for machine-learning researchers at MIT that a successful solution would need to seamlessly integrate machine learning with human expertise.

The platform uses machine learning models to go through more than 50 streams of data and identify suspicious behavior.

We do that very efficient…

1 month, 2 weeks ago @ news.mit.edu
Berkeley AI
last post 3 days, 3 hours ago
Robots Learning to Move like Animals

A quadruped robot learning locomotion skills by imitating a dog.

The superior agility seen in animals, as compared to robots, might lead one to wonder: can we create more agile robotic controllers with less effort by directly imitating animals?

Given a reference motion from an animal (e.g., a dog), our framework uses reinforcement learning to train a control policy that enables a robot to imitate the motion in the real world.

1) First, given a reference motion, the motion retargeting stage maps the motion from the original animal’s morphology to the robot’s morphology.

2) Next, the motion imitation stage uses the retargeted reference motion to train a policy for imitating the motion in simulation.

3 days, 3 hours ago @ bair.berkeley.edu
Physically Realistic Attacks on Deep Reinforcement Learning

Physically Realistic Attacks on Deep Reinforcement LearningDeep reinforcement learning (RL) has achieved superhuman performance in problems ranging from data center cooling to video games.

Consequently, it is critical that RL policies are robust: both to naturally occurring distribution shift, and to malicious attacks by adversaries.

We find it is still possible to attack victim policies in this more realistic multi-agent threat model.

To better understand how the adversarial policies exploit their victims, we created “masked” versions of victim policies.

The existence of adversarial policies has significant implications for the training, understanding and evaluation of RL policies.

1 week, 3 days ago @ bair.berkeley.edu
Does On-Policy Data Collection Fix Errors in Off-Policy Reinforcement Learning?


Corrective Feedback and Why it is Absent in ADP. What is corrective feedback, formally?

This enjoys corrective feedback, and we then contrast it with ADP methods, which do not.

One way to prevent this problem is to compute an “optimal” data distribution that provides maximal corrective feedback, and to train Q-functions using this distribution.

More generally, we would like to make a case for analyzing the effects of data distribution more deeply in the context of deep RL algorithms.

3 weeks ago @ bair.berkeley.edu
BADGR: The Berkeley Autonomous Driving Ground Robot

We call our robot learning system BADGR: the Berkeley Autonomous Driving Ground Robot.

The neural network predictive model is trained to predict these future events as accurately as possible.

(4) Planning and Navigating. BADGR predicting which actions lead to bumpy terrain (left) or collisions (right).

For example, the reward function could encourage driving towards a goal while discouraging collisions or driving over bumpy terrain.

BADGR successfully reaches the goal while avoiding collisions and bumpy terrain, while the geometry-based policy is unable to avoid bumpy terrain.
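The reward trade-off described above can be sketched as a simple scalar function. This is an illustrative stand-in, not BADGR's actual implementation: the function name, weights, and event probabilities are all assumptions.

```python
import math

def badgr_style_reward(position, goal, collision_prob, bumpy_prob,
                       w_collision=1.0, w_bumpy=0.5):
    """Toy reward in the spirit of BADGR's planner: encourage progress
    toward the goal while penalizing predicted collisions and bumpiness.
    Weights and probabilities here are illustrative, not from the paper."""
    dist = math.dist(position, goal)  # Euclidean distance to the goal
    return -dist - w_collision * collision_prob - w_bumpy * bumpy_prob

# A candidate action predicted to be safe and smooth outscores a risky one.
safe = badgr_style_reward((1.0, 0.0), (2.0, 0.0), collision_prob=0.0, bumpy_prob=0.1)
risky = badgr_style_reward((1.0, 0.0), (2.0, 0.0), collision_prob=0.9, bumpy_prob=0.8)
```

A planner in this style would evaluate such a reward over the predicted outcomes of many candidate action sequences and execute the best-scoring one.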

3 weeks, 4 days ago @ bair.berkeley.edu
Speeding Up Transformer Training and Inference By Increasing Model Size

Model Training Can Be Slow. In deep learning, using more compute (e.g., increasing model size, dataset size, or training steps) often leads to higher accuracy.

Instead, when training Transformer models on a budget, you want to drastically increase model size but stop training very early.

This phenomenon occurs because larger models converge to lower test error in fewer gradient updates than smaller models.

We also recommend increasing model size, not batch size.

Conclusion. We have shown that increasing Transformer model size can improve the efficiency of training and inference, i.e., one should Train Large, Then Compress.

1 month ago @ bair.berkeley.edu
Large Scale Training at BAIR with Ray Tune

In this blog post, we share our experiences in developing two critical software libraries that many BAIR researchers use to execute large-scale AI experiments: Ray Tune and the Ray Cluster Launcher, both of which now back many popular open-source AI libraries.

Ray Cluster Launcher: a utility for managing resource provisioning and cluster configurations across AWS, GCP, and Kubernetes.

Actor-based Training. Many techniques for hyperparameter optimization require a framework that monitors the metrics of all concurrent training jobs and controls the training execution.

The Ray Tune documentation page for distributed experiments shows you how you can do t…

2 months, 3 weeks ago @ bair.berkeley.edu
Emergent Behavior by Minimizing Chaos

All living organisms carve out environmental niches within which they can maintain relative predictability amidst the ever-increasing entropy around them (1), (2).

In simulated worlds, such as video games, novelty-seeking intrinsic motivation can lead to interesting and meaningful behavior.

In entropic and dynamic environments with undesirable forms of novelty, minimizing surprise (i.e., minimizing novelty) causes agents to naturally seek an equilibrium that can be stably maintained.

Emergent behavior. The SMiRL agent demonstrates meaningful emergent behaviors in a number of different environments.

The agent also learns emergent game playing behavior in th…

3 months, 2 weeks ago @ bair.berkeley.edu
What is My Data Worth?

In the worst-case scenario, adversarial data sources may even degrade model performance via data poisoning attacks.

Hence, the data value should reflect the efficacy of data by assigning high values to data which can notably improve the model’s performance.

With the desiderata above, we now discuss a principled notion of data value and computationally efficient algorithms for data valuation.

Relating these game theoretic concepts to the problem of data valuation, one can think of the players as training data sources, and accordingly, the utility function $U(S)$ as a performance measure of the model trained on the subset S of training data.
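The game-theoretic notion alluded to here is the Shapley value: each data source is paid its average marginal contribution to $U(S)$ over all subsets. A brute-force sketch with a hypothetical additive utility follows; the per-source "worth" numbers are invented for illustration, and real data-valuation work uses far more efficient approximations:

```python
from itertools import combinations
from math import factorial

def shapley_values(players, utility):
    """Exact Shapley values: each player's average marginal contribution
    to utility(S) over all subsets. Exponential in len(players), so this
    brute-force version is for illustration only."""
    n = len(players)
    values = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - 1 - k) / factorial(n)
                values[p] += weight * (utility(set(S) | {p}) - utility(set(S)))
    return values

# Hypothetical per-source accuracy gains; with an additive utility the
# Shapley value of each source recovers exactly its own contribution.
worth = {"a": 0.6, "b": 0.3, "c": 0.1}
vals = shapley_values(list(worth), lambda S: sum(worth[p] for p in S))
```

With a non-additive utility (the realistic case, where sources overlap or interact) the Shapley values would differ from any single marginal contribution, which is exactly why the averaging over subsets matters.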

ConclusionWe hope that our approaches for data va…

3 months, 3 weeks ago @ bair.berkeley.edu
Learning to Imitate Human Demonstrations via CycleGAN

This work presents AVID, a method that allows a robot to learn a task, such as making coffee, directly by watching a human perform the task.

Providing rewards via human videos handles the task definition; however, there is still human cost during the actual learning process.

Thus, we train a CycleGAN where the domains are human and robot images: for training data, we collect demonstrations from the human and random movements from both the human and robot.

Through this, we obtain a CycleGAN that is capable of generating fake robot demonstrations from human demonstrations, as depicted above.

We used a total of 30 human demonstrations for thi…

3 months, 3 weeks ago @ bair.berkeley.edu
Model-Based Reinforcement Learning: Theory and Practice

The natural question to ask after making this distinction is whether to use such a predictive model.

The latter half of this post is based on our recent paper on model-based policy optimization, for which code is available here.

Model-based techniques. Below, model-based algorithms are grouped into four categories to highlight the range of uses of predictive models.

Sampling-based planning. In the fully general case of nonlinear dynamics models, we lose guarantees of local optimality and must resort to sampling action sequences.

The original proposal of such a combination comes from the Dyna algorithm by Sutton, which alternates between model learning, data generation under a model, and policy …
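The Dyna loop described above — act, remember the transition in a model, then update values with model-generated data — can be sketched in tabular form. The toy chain MDP, hyperparameters, and function name below are illustrative assumptions, not taken from the post:

```python
import random

def dyna_q(n_states=5, episodes=30, planning_steps=10,
           alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Minimal tabular Dyna-Q on a toy chain: actions 0 (left) / 1 (right),
    reward 1.0 only on reaching the rightmost state."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(n_states) for a in (0, 1)}
    model = {}  # model learning: (s, a) -> (reward, next state)
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action with random tie-breaking
            a = rng.choice((0, 1)) if rng.random() < eps else \
                max((0, 1), key=lambda x: (Q[(s, x)], rng.random()))
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # direct RL update from the real transition
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, 0)], Q[(s2, 1)]) - Q[(s, a)])
            model[(s, a)] = (r, s2)
            # planning: replay transitions sampled from the learned model
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
                Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, 0)], Q[(ps2, 1)]) - Q[(ps, pa)])
            s = s2
    return Q

Q = dyna_q()  # greedy policy should now move right from every state
```

The planning inner loop is what distinguishes Dyna from plain Q-learning: each real step is amplified by many cheap simulated updates drawn from the learned model.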

3 months, 3 weeks ago @ bair.berkeley.edu
AWS Machine Learning
last post 3 days, 14 hours ago
Deploying machine learning models as serverless APIs

Machine learning (ML) practitioners gather data, design algorithms, run experiments, and evaluate the results.

Depending on latency and memory requirements, AWS Lambda can be an excellent choice for easily deploying ML models.

This post provides an example of how to easily expose your ML model to end-users as a serverless API.

Amazon API Gateway – Provides a REST API for your front end to interface with your deep learning Lambda function.

Upload the file inference.zip to the AWS Lambda IDE. The following screenshot shows the Function code section on the AWS Lambda console.
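A minimal handler of the kind this post describes might look as follows. The request/response body shape and the stand-in predict function are assumptions for illustration; only the (event, context) signature and the statusCode/body response fields follow the standard Lambda proxy-integration convention:

```python
import json

def predict(features):
    """Stand-in for a real model loaded once at cold start (e.g., from S3);
    here it just averages the inputs so the sketch is self-contained."""
    return sum(features) / max(len(features), 1)

def lambda_handler(event, context):
    """Handle an API Gateway (Lambda proxy) request carrying JSON features."""
    body = json.loads(event.get("body") or "{}")
    features = body.get("features", [])
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"prediction": predict(features)}),
    }

# Simulate the JSON payload API Gateway would deliver to the function.
response = lambda_handler({"body": json.dumps({"features": [1, 2, 3]})}, None)
```

Loading the model outside the handler matters in practice: warm invocations then reuse it instead of deserializing it on every request.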

3 days, 14 hours ago @ aws.amazon.com
Reducing player wait time and right sizing compute allocation using Amazon SageMaker RL and Amazon EKS

Even with predictions, optimizing compute resource allocation is non-trivial because it takes substantial time to prepare Amazon EC2 instances.

This post shows how to use Amazon SageMaker RL to apply reinforcement learning (RL) with Amazon EKS, Amazon DynamoDB, AWS Lambda functions, and Amazon API Gateway.

It illustrates the interactions between the player, the matchmaking, and game lobby application with the dedicated game servers hosted by EKS.

The autopilot server returns the safe inference number to the autopilot client that sets the size of game server deployment.

This post assumes that the server farm is an existing live system that operates regardless of the game server autopilot.

3 days, 17 hours ago @ aws.amazon.com
Autodesk optimizes visual similarity search model in Fusion 360 with Amazon SageMaker Debugger

The Autodesk team partnered with AWS to use Amazon SageMaker Debugger to assess how it could improve the model training and debugging process.

Pre-SageMaker Debugger approach. The Autodesk team followed a linear process for training and editing the model before using SageMaker Debugger.

Keeping model training costs under control was one of the drivers for not kicking off parallel training jobs.

Adapting code to use Amazon SageMaker Debugger. Adding SageMaker Debugger to the Autodesk training job was simple.

Post-SageMaker Debugger approach. With SageMaker Debugger, Autodesk kicked off parallel training jobs to run a parameter sweep.

5 days, 12 hours ago @ aws.amazon.com
Pruning machine learning models with Amazon SageMaker Debugger and Amazon SageMaker Experiments

This post demonstrates iterative model pruning with Amazon SageMaker.

For more information, see Amazon SageMaker Debugger – Debug Your Machine Learning Models.

Amazon SageMaker Experiments lets you customize, visualize, and track ML experiments at scale.

For more information, see Amazon SageMaker Experiments – Organize, Track and Compare Your Machine Learning Trainings.

In the context of Amazon SageMaker Debugger, a trial is an object that lets you query tensors for a given training job.
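The pruning step at the heart of such an iterative scheme can be illustrated framework-free. This sketch shows one round of magnitude pruning on a toy weight matrix; in the post's actual workflow the tensors would come from SageMaker Debugger trials, and the model would be retrained between rounds:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (one iteration
    of iterative magnitude pruning). Ties at the threshold may remove a
    few extra weights, which is fine for an illustration."""
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * sparsity)          # how many weights to drop
    threshold = flat[k - 1] if k > 0 else -1.0
    return [[0.0 if abs(w) <= threshold else w for w in row]
            for row in weights]

W = [[0.9, -0.05, 0.3], [-0.8, 0.02, -0.4]]
pruned = magnitude_prune(W, sparsity=0.5)  # drop the 3 smallest of 6 weights
```

Tracking accuracy against sparsity across rounds, as Experiments does at scale, shows how much of the network can be removed before performance degrades.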

5 days, 12 hours ago @ aws.amazon.com
Increasing performance and reducing the cost of MXNet inference using Amazon SageMaker Neo and Amazon Elastic Inference

At re:Invent 2018, AWS introduced Amazon SageMaker Neo and Amazon Elastic Inference, two services that can make models more efficient for deep learning.

This post evaluates deployment options using Amazon SageMaker and Amazon Elastic Inference and the different results you may see if you choose different Amazon EC2 instances.

The notebook shows how to use Amazon SageMaker to fine-tune a pre-trained convolutional neural network model, optimize the model using SageMaker Neo, and deploy the model and evaluate its latency using a variety of methods with SageMaker Neo and EIA.

Using SageMaker Neo on G4 reduced latency nearly six-fold compared to a base G4 instance.

For throughput …

5 days, 13 hours ago @ aws.amazon.com
Analyzing and optimizing Amazon Lex conversations using Dashbot

In this post, we describe how to analyze interactions with an Amazon Lex bot using Dashbot capabilities.

This solution allows you to use your Amazon Lex conversation logs data to capture bot interactions and analyze them using Dashbot service capabilities.

Enabling conversation logs for Amazon Lex. To send your bot interactions to Dashbot, enable text conversation logs for your Amazon Lex bot.

Integrating your Amazon Lex conversation logs with Dashbot. Now that you’re logging your conversations to CloudWatch Logs, configure a subscription to send these messages to Dashbot.

Testing your integration. When the setup is complete, you can test your integration by sending messages to your Amazon Lex bo…

1 week, 4 days ago @ aws.amazon.com
AWS delivers sessions online at NVIDIA GTC Digital

Starting Tuesday, March 24, 2020, NVIDIA GTC Digital is offering courses for you to learn AWS best practices to accomplish your ML goals faster and more easily.

Aditya Bindal, Senior Product Manager, AWS Deep Engine
Indu Thangakrishnan, Software Development Engineer, AWS Deep Engine
S22493: Improve ML Training Performance with Amazon SageMaker Debugger. With Amazon SageMaker Debugger, developers can get complete insights into the training process by automating data capture and analysis from training runs without code changes.

Haibin Lin, Applied Scientist, AWS Deep Engine-Engineering
Lin Yuan, Software Development Engineer, Amazon Deep Learning SDK
S21179: Calculating Surface Traversability Using…

1 week, 5 days ago @ aws.amazon.com
Building a trash sorter with AWS DeepLens

In this blog post, we show you how to build a prototype trash sorter using AWS DeepLens, the AWS deep learning-enabled video camera designed for developers to learn machine learning in a fun, hands-on way.

Prerequisites. To complete this walkthrough, you must have the following prerequisites: an AWS account and an AWS DeepLens device.

For instructions, see Create AWS IoT Devices in an AWS IoT Greengrass Group.

By default, AWS DeepLens blocks traffic on port 8883, which is required for local AWS IoT Greengrass communication.

Because the Raspberry Pi uses Python to interact with AWS IoT, you need to make sure it has the AWS IoT Device SDK for Python.

1 week, 5 days ago @ aws.amazon.com
Making accurate energy consumption predictions with Amazon Forecast

Amazon Forecast is a fully managed service that uses machine learning (ML) to generate highly accurate forecasts, without requiring any prior ML experience.

Power consumption forecast at an aggregate level to better manage supply and demand – As a utility provider, you must balance aggregate supply and demand.

Creating an energy consumption forecast model with ARIMA: Autoregressive integrated moving average (ARIMA) is a classic statistical model for time series.

The input data used is individual energy consumption data.

The following screenshot shows the forecast energy consumption for the ID test.
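Amazon Forecast builds and tunes such models for you, but the idea behind an autoregressive fit can be shown with a toy AR(1) estimated by ordinary least squares (a pure-Python illustration only, not the Amazon Forecast or ARIMA API):

```python
def fit_ar1(series):
    """Fit x[t] ~ phi * x[t-1] by ordinary least squares (no intercept)."""
    num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return num / den

def forecast(last_value, phi, horizon):
    """Iterate the fitted recurrence to produce a multi-step forecast."""
    out = []
    for _ in range(horizon):
        last_value = phi * last_value
        out.append(last_value)
    return out

# Synthetic "energy consumption" decaying geometrically with ratio 0.8
consumption = [100 * 0.8 ** t for t in range(20)]
phi = fit_ar1(consumption)                 # recovers 0.8 on this noiseless data
preds = forecast(consumption[-1], phi, horizon=3)
```

Real ARIMA adds differencing and moving-average terms on top of this autoregressive core, and Forecast selects the orders automatically.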

1 week, 5 days ago @ aws.amazon.com
Investigating performance issues with Amazon CodeGuru Profiler

Amazon CodeGuru (Preview) analyzes your application’s performance characteristics and provides automatic recommendations on how to improve it.

How CodeGuru Profiler shows anomalies: The following use case illustrates how CodeGuru Profiler displays anomalies.

To prevent every developer from having to go through the same investigation effort, CodeGuru Profiler tries to detect the issues automatically.

Conclusion: This post discussed the main use cases in which CodeGuru Profiler can help you find performance issues and opportunities to reduce cost.

About the author: Pierre Marieu is a Software Development Engineer in the Amazon CodeGuru Profiler team in London.

1 week, 6 days ago @ aws.amazon.com
Translating documents with Amazon Translate, AWS Lambda, and the new Batch Translate API

In this blog post, we walk through two solutions for translating documents: a simple approach that translates a batch of documents asynchronously using the new Batch Translation API, and an advanced approach that translates documents synchronously as they arrive, using AWS Lambda and Amazon Translate real-time translation.

Amazon Translate recently introduced asynchronous Batch Translation that enables you to translate a large collection of text or HTML documents.

We use the three text files listed below to review Amazon Translate batch translation.

The Lambda function assumes this role for accessing the required Amazon S3 and Amazon Translate APIs.
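A batch job boils down to one StartTextTranslationJob request. A sketch of its shape (the parameter names match the Amazon Translate API; the bucket paths, job name, and role ARN are hypothetical placeholders):

```python
def batch_translation_job(input_uri, output_uri, role_arn, source, targets):
    """Build the request for translate.start_text_translation_job()."""
    return {
        "JobName": "batch-doc-translation",          # hypothetical job name
        "InputDataConfig": {"S3Uri": input_uri, "ContentType": "text/plain"},
        "OutputDataConfig": {"S3Uri": output_uri},
        "DataAccessRoleArn": role_arn,               # role Translate assumes for S3 access
        "SourceLanguageCode": source,
        "TargetLanguageCodes": targets,
    }

job = batch_translation_job(
    "s3://my-input-bucket/docs/",                    # hypothetical buckets
    "s3://my-output-bucket/translated/",
    "arn:aws:iam::123456789012:role/TranslateBatchRole",
    "en", ["es"],
)
# With boto3: boto3.client("translate").start_text_translation_job(**job)
```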

Conclusion: In this post, we show the implementation …

2 weeks, 2 days ago @ aws.amazon.com
Converting your content to audio for free with Trinity Audio WordPress plugin

Trinity Audio uses Amazon Polly, which provides an audio option with an easy plug-and-play solution.

This audio content solution provides an entirely new way to increase engagement and grow your audience by effortlessly transforming your content into audio.

Shepherd Yaw Morttey, online content manager at Mfidie.com, says that their company “started using the Trinity Audio player shortly after its release on WordPress.

With voice-enabled interaction becoming the norm, having your content available by audio is a no-brainer.” Installing the plugin: To install the plugin, complete the following steps: navigate to the Trinity Audio plugin page on the WordPress website.

For more information, see Trin…

2 weeks, 3 days ago @ aws.amazon.com
Reduce ML inference costs on Amazon SageMaker for PyTorch models using Amazon Elastic Inference

Today, we are excited to announce that you can now use Amazon Elastic Inference to accelerate inference and reduce inference costs for PyTorch models in both Amazon SageMaker and Amazon EC2.

To use Elastic Inference with PyTorch, you have to convert your models into TorchScript format and use the inference API for Elastic Inference.
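The conversion step itself is a one-liner with `torch.jit.trace`. A minimal sketch with a stand-in model (the post benchmarks DenseNet-121; the tiny model and input shape here are arbitrary placeholders):

```python
import torch

# Any eager PyTorch model can be traced; this small model is illustrative only.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
    torch.nn.Linear(8, 10),
).eval()

example = torch.rand(1, 3, 224, 224)      # example input used to record the trace
with torch.no_grad():
    traced = torch.jit.trace(model, example)  # compile eager model to TorchScript
traced.save("model.pt")                    # saved artifact you deploy for inference
```

Tracing records the operations executed on the example input, so models with data-dependent control flow need `torch.jit.script` instead.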

End-to-end inference benchmarking in Amazon SageMaker with Elastic Inference PyTorch: This post walks you through the process of benchmarking Elastic Inference-enabled PyTorch inference latency for DenseNet-121 using an Amazon SageMaker hosted endpoint.

This policy gives permissions to use Amazon Elastic Inference and Amazon SageMaker.

Summary: Amazon Elastic Infere…

2 weeks, 4 days ago @ aws.amazon.com
Building an AI-powered Battlesnake with reinforcement learning on Amazon SageMaker

The SageMaker Battlesnake Starter Pack allows you to focus on developing your AI instead of worrying about the infrastructure surrounding it.

The Amazon SageMaker Battlesnake Starter Pack uses quick-create links in AWS CloudFormation, which provide one-click deployment from the GitHub repo to your AWS Management Console.

Reinforcement learning with Battlesnake: This section reviews the basics of reinforcement learning and how the SageMaker Battlesnake Starter Pack models the Battlesnake environment in an RL framework.

This notebook automatically downloads the trained model artifacts from Amazon S3, packages them, and processes them to update the Amazon SageMaker endpoint.

He cont…

3 weeks, 2 days ago @ aws.amazon.com
Creating a machine learning-powered REST API with Amazon API Gateway mapping templates and Amazon SageMaker

– Defines the data model for the REST request format, and specifies validation and authorization checks to be performed on the received REST request.

– You need sufficient IAM permissions to work with IAM roles, Amazon SageMaker, Amazon S3, and API Gateway.

Step 4: Building an API Gateway endpoint. In this section, you build your REST API.

Additional considerations: This post focused on how to use API Gateway mapping templates to transform requests and responses between formats required by the REST API and model runtime.

Mapping templates can help you avoid using intermediate compute resources between API Gateway and Amazon SageMaker.
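As an illustration of what such a transformation can look like (the field names here are hypothetical), a request mapping template written in API Gateway’s Velocity Template Language can flatten a JSON body into the CSV row a SageMaker endpoint expects:

```
#set($root = $input.path('$'))
$root.age,$root.income,$root.score
```

`$input.path('$')` is the standard API Gateway template variable for the parsed request body; the template output becomes the payload forwarded to the SageMaker runtime, so no Lambda is needed in between.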

3 weeks, 2 days ago @ aws.amazon.com
NVIDIA
latest post 2 days, 14 hours ago
Accelerating WinML and NVIDIA Tensor Cores

NVIDIA Tensor Cores: On NVIDIA RTX hardware, from the Volta architecture forward, the GPU includes Tensor Cores to enable acceleration of some of the heavy lift operations involved with deep learning.

Tensor Cores provide the operation with a boost at the most crucial part of the operation, when the per-block dot products are accumulated.

WinML and Tensor CoresModels that run on Windows Machine Learning (WinML) using ONNX can benefit from Tensor Cores on NVIDIA hardware, but it is not immediately obvious how to make sure that they are in fact used.

There is no switch or button labeled Use Tensor Cores and there are certain constraints by which the model and input data must abide.

To leverage …

2 days, 14 hours ago @ devblogs.nvidia.com
GTC Digital Demo: Assessing Property Damage with AI

The demo shows the workflow from training the deep learning model to inferencing, which ultimately automated the detection of damaged homes.

The deep learning tools within Esri ArcGIS, a geographic information system for working with maps and geographic information maintained by Esri, sped up the process to provide aid to those affected by this disaster.

For this demo, the developers used a client-server architecture, which gives a clean separation of the roles of a Geographic Information System (GIS) Analyst and a Data Scientist.

The GIS Analyst used an NVIDIA Quadro Virtual Data Center Workstation to create, edit and explore spatial data.

The data scientist used the NVIDIA Virtual Comput…

2 days, 16 hours ago @ news.developer.nvidia.com
Speed of Light: SLAC’s Ryan Coffee Talks Ultrafast Science

Ryan Coffee, senior research scientist at SLAC National Accelerator Laboratory at Stanford, blows things up for a living.

Making “Iron Man” Interface Real: AI-Based Virtualitics Demystifies Data Science with VR. Virtualitics, an AI-based analytics platform, is bringing creativity to data science through machine learning and immersive virtualization.

Virtualitics Machine Learning Projects Head Aakash Indurkhya speaks about how VR can be instrumental in the field.

Tune in to the AI Podcast: Get the AI Podcast through iTunes, Google Podcasts, Google Play, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Make …

2 days, 17 hours ago @ blogs.nvidia.com
General Practitioner: Startup Taps AI to Tackle Multiple Healthcare Conditions

The South Korea-based startup is building an extended family of NVIDIA GPU-powered AI products, each addressing diagnoses of different diseases.

The company’s AI algorithms are designed to capitalize on the growing pools of big data associated with a vast number of conditions.

So far, VUNO has products that address bone age assessment, neurodegenerative disorders and diseases with signs visible on chest X-rays or CT scans.

“AI technology has made the impossible possible by elevating the diagnostic capability to the human level in medical imaging,” said Jung.

And with the help of the NVIDIA Inception program for AI startups, VUNO expects to push the envelope further.

3 days, 20 hours ago @ blogs.nvidia.com
Sky-High Performance in an Instance: Quadro Virtual Workstations in the Cloud

Global cloud providers hosting Quadro Virtual Workstations can help them meet the challenge.

Some IT teams can repurpose on-premises GPU resources to support remote workers through Quadro Virtual Workstation software (Quadro vWS) hosted onsite.

These virtual workstations support the same NVIDIA Quadro drivers and features as the physical Quadro GPUs that professional artists and designers run in local workstations.

Virtual Workstations Simply, Quickly, Easily: With Quadro vWS running in the cloud, IT departments can spin up a GPU-accelerated virtual workstation — or multiple systems — easily and in minutes.

Amazon AppStream 2.0 provides two NVIDIA graphics instance families – Graphics Pro ins…

3 days, 21 hours ago @ blogs.nvidia.com
Imagination Meets Innovation: New GeForce RTX SUPER GPUs Power High-Performance RTX Studio Laptops

They’re all powered by new GeForce RTX SUPER GPUs, which deliver faster performance than the original RTX 20 series.

RTX Studio laptops changed that, with manufacturers using NVIDIA GeForce and Quadro RTX GPUs to build a new class of high-performance systems.

In the weeks ahead, HP will launch new precision-engineered RTX Studio laptops.

D5 Render, a new real-time ray-tracing renderer based on DXR, uses RTX GPUs’ real-time ray tracing and rasterization technology.

Diversity in the growing lineup of systems purpose-built for creators means there’s an RTX Studio laptop to keep your work flowing.

4 days, 5 hours ago @ blogs.nvidia.com
Virus War Goes Viral: Folding@Home Gets 1.5 Exaflops to Fight COVID-19

That’s when the research network that Folding@Home manages arguably became the world’s most powerful supercomputer.

In just 10 days, supporters had downloaded the group’s software on hundreds of thousands of home PCs to help crack the code on COVID-19.

“It’s been a pretty amazing experience,” said Greg Bowman, director of Folding@Home, an all-volunteer team of researchers.

Bowman’s also an associate professor at the Washington University School of Medicine, in St. Louis, home to one of 11 labs worldwide that keep the Folding@Home network humming.

The U.S. Department of Energy, which runs both systems, aims to switch on its first exascale system sometime in 2021.

4 days, 21 hours ago @ blogs.nvidia.com
Latest Updates to NVIDIA CUDA-X AI Libraries

Learn what’s new in the latest releases of NVIDIA’s CUDA-X AI libraries and NGC.

For more information on NVIDIA’s developer tools, join live webinars, training, and Connect with the Experts sessions now through GTC Digital.

NVIDIA Collective Communications Library 2.6: NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multi-node collective communication primitives that are performance-optimized for NVIDIA GPUs.

Experimental Python client and server support for the community-standard gRPC inferencing API.

DALI 0.20: NVIDIA Data Loading Library (DALI) is a portable, open-source library for GPU-accelerated decoding and augmentation of images and video in deep learning apps.

5 days, 19 hours ago @ news.developer.nvidia.com
Meet the Researcher: Alan Fern, Machine Learning and Automated Planning for Sequential Decision Making

His research interests span a range of topics in artificial intelligence, including machine learning and automated planning/control.

My academic areas of focus include machine learning and automated planning for sequential decision making.

Some aspects of this work fall under the category of reinforcement learning.

Some of my main projects these days are: explainable AI in the context of reinforcement learning and vision, robust AI, and reinforcement learning and planning for agile biped robot locomotion.

Right now, we can’t always understand the reasoning behind the decisions or recommendations that AI systems make, especially systems that include machine learning components.

5 days, 22 hours ago @ news.developer.nvidia.com
Folding@Home Crowdsources GPU-accelerated exaFLOP Supercomputer for COVID-19 Research

To help tackle COVID-19, the long-running Folding@Home program, a distributed computing project for simulating protein dynamics, hit a breakthrough by achieving more than an exaflop of processing power.

In just a couple of days, nearly 400,000 gamers donated their GPU resources to build up the Folding@Home supercomputer.

With this power, scientists aim to analyze the dynamics of the COVID-19 protein and hopefully gain better insights into potential drug interactions that can disable the virus.

Folding@Home says they were the first distributed computing project to use GPUs for molecular dynamics simulations.

“For certain types of calculations, we’ve seen GPUs give us a 20-30x speedup over th…

6 days, 13 hours ago @ news.developer.nvidia.com
Optimizing VK/VKR and DX12/DXR Applications Using Nsight Graphics: GPU Trace Advanced Mode Metrics

In a GDC 2019 talk, we showed how to apply the top-down P3 performance-triage method for optimizing any DX12 GPU workload using GPU Trace.

Launching an app through GPU Trace: To take a GPU Trace capture with the new advanced mode metrics, launch Nsight Graphics 2020.2.

Launching GPU Trace with Advanced Mode Metrics enabled.

In this version of GPU Trace (2020.2), we recommend setting the frame count to 1 when Advanced Mode Metrics is enabled.

To make sure that GPU Trace has successfully attached itself to the application, you can ALT-Tab back to Nsight Graphics and check that you see the Generate GPU Trace Capture button (Figure 2).

6 days, 15 hours ago @ devblogs.nvidia.com
AI Podcast: Margot Gerritsen’s Got Binders Full of Women in Data Science — and She’s Serious

This week’s AI Podcast guest is a renaissance woman with a special passion for data science.

Margot Gerritsen is senior associate dean for educational affairs and professor of energy resources engineering at Stanford University.

She’s the co-founder and co-director of the organization Women in Data Science (WiDS).

Gerritsen spoke to AI Podcast host Noah Kravitz about WiDS, the projects she’s overseeing at Stanford, and what she’s excited about in the current era of data science: the democratization of data.

Gerritsen sees today’s vast quantities of data, open source code and computational power as a “perfect storm” for groundbreaking analytical work.

6 days, 23 hours ago @ blogs.nvidia.com
No Pain, No Grain: Autodesk VRED Accelerates Design Workflows with AI Denoising and Real-Time Rendering

Autodesk VRED, which was previously limited to CPU, now leverages RTX technology to support the high demands of consumers and provide interactive ray tracing and AI-powered denoising.

Design with Real-Time Rendering, Deliver Real-Time ResultsAutodesk VRED enables users to create digital prototypes so they can gain insight into how vehicles will look and perform.

VRED users can also achieve physically accurate scalable rendering on GPU, and they can connect several NVIDIA RTX Servers to speed up real-time and offline rendering performance.

While many automotive designers are adopting VR for their design workflows, Alstom pushes the graphics envelope further to bring massive train models to l…

6 days, 23 hours ago @ blogs.nvidia.com
Speeding up Deep Learning Inference Using TensorFlow, ONNX, and TensorRT

In this post, we discuss how to create a TensorRT engine using the ONNX workflow and how to run inference from a TensorRT engine.

ResNet ONNX workflow example: In this example, we show how to use the ONNX workflow on two different networks and create a TensorRT engine.

The builder creates an empty network (builder.create_network()) and the ONNX parser parses the ONNX file into the network (parser.parse(model.read())).

Running inference from the TensorRT engine: The TensorRT engine runs inference in the following workflow: allocate buffers for inputs and outputs on the GPU.

After creating the TensorRT engine for the inference, do a similar conversion to what you did for semantic segmentation.

1 week, 2 days ago @ devblogs.nvidia.com
GTC Digital Demo: NVIDIA Tool to Visualize and Interact with Feature Maps

NVIDIA Feature Map Explorer is a new powerful tool that visualizes 4-dimensional image-based feature map data in a fluid and interactive fashion.

It provides users with a rich set of views into feature map data that range from high-level summary to low-level channel slices, as well as detailed statistics information.

NVIDIA Feature Map Explorer offers: rich views of feature map data at different levels.

Flexible comparisons of feature map data across batches and epochs.

Leverages GPU for fast processing of large quantities of feature map data.

1 week, 2 days ago @ news.developer.nvidia.com
Intel AI
latest post 1 week, 2 days ago
Huiying Medical: Helping Combat COVID-19 with AI Technology

The COVID-19 coronavirus, since its initial outbreak in Wuhan, China, has quickly become a global pandemic, as declared by the World Health Organization (WHO).

Image courtesy Huiying Medical.

Huiying Medical’s AI solution is also enabling and accelerating medical research by healthcare professionals on the CT imaging features of the coronavirus.

The Intel AI Builders Program salutes and supports our partner Huiying Medical’s efforts in confronting the challenge with innovation and collaboration.

To learn more, visit Huiying Medical’s solutions page on the Intel AI Builders site.

1 week, 2 days ago @ intel.ai
Maximize CPU Inference Performance with Improved Threads and Memory Management in Intel® Distribution of OpenVINO™ toolkit
Maximize CPU Inference Performance with Improved Threads and Memory Management in Intel® Distribution of OpenVINO™ toolkit

The popularity of convolutional neural network (CNN) models and the ubiquity of CPUs mean that better inference performance can deliver significant gains to a larger number of users than ever before. As multi-core processors become the norm, efficient threading is required to leverage parallelism. This blog post covers recent advances in the Intel® Distribution of …

1 week, 5 days ago @ intel.ai
Simplifying Cloud to Edge AI Deployments with the Intel® Distribution of OpenVINO™ Toolkit, Microsoft Azure, and ONNX Runtime

Our life is frittered away by detail.

Public clouds are naturally attractive to AI developers, offering ample infrastructure resources, integrated development environments and excellent tools and support.

Under the covers, the integration leverages the ONNX Runtime using the OpenVINO toolkit as the Execution Provider.

Train-to-deploy workflow using Azure Machine Learning, Intel Distribution of OpenVINO toolkit and ONNX Runtime.

Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries.

3 weeks, 2 days ago @ intel.ai
Apple Machine Learning Journal
latest post 4 months ago
Apple at NeurIPS 2019

The conference, of which Apple is a Diamond Sponsor, will take place in Vancouver, Canada from December 8th to 14th.

If you’re interested in opportunities to make an impact on Apple products through machine learning research and development, check out our teams at Jobs at Apple.

We propose to evaluate both the generator and the discriminator by deriving corresponding Fisher Score and Fisher Information from the EBM.

In this work, we address this problem by introducing data parameters.

During training, at each iteration, as we update the model parameters, we also update the data parameters.

4 months ago @ machinelearning.apple.com
Apple at Interspeech 2019

Apple is attending Interspeech 2019, the world’s largest conference on the science and technology of spoken language processing.

For Interspeech attendees, join the authors of our accepted papers at our booth to learn more about the great speech research happening at Apple.

If you’re interested in opportunities to make an impact on Apple products through machine learning research and development, check out our teams at Jobs at Apple.

The model can be used to check that text-to-speech (TTS) training speech follows the script and words are pronounced as expected.

Adding more annotated training data for any ML system typically improves accuracy, but only if it provides examples not alrea…

6 months, 3 weeks ago @ machinelearning.apple.com
Language Identification from Very Short Strings

For example, this capability is needed to load the right autocorrection lexicon and the right language model for predictive and multilingual typing.

Neural LID Architecture: We model LID as a character-level sequence labeling problem.

LSTM model sizes, on the other hand, are simply a function of the network parameters.

At Apple, bi-LSTM LID is now used for most tasks that require language identification, like text tagging and other public APIs that are part of the Natural Language framework.

Language Identification from Short Strings.

8 months, 2 weeks ago @ machinelearning.apple.com
Bridging the Domain Gap for Neural Models

This task is called the covariate shift problem, for the case where we have access to labeled data from one domain (source) and unlabeled data from another domain (target).

Unsupervised domain adaptation is an especially attractive alternative when the ground truth labels cannot be obtained easily for the task of interest.

An adversarial learning-based method for domain adaptation at pixel-level would try to translate/synthesize input images from one domain to the other, bringing the input distributions closer.

We can see clear improvements from models trained on source domain only to models trained with the proposed SWD method.

Conclusion: This method of unsupervised domain adaptation helps …

9 months, 3 weeks ago @ machinelearning.apple.com
Optimizing Siri on HomePod in Far‑Field Settings

Unlike Siri on iPhone, which operates close to the user’s mouth, Siri on HomePod must work well in a far-field setting.

Block diagram of the online multichannel signal processing chain on HomePod for Siri.

The RES is designed to suppress nonlinear components of the echo signal that aren’t being modeled by the linear MCEC.

It is obvious that the optimal integration of our speech processing technologies substantially improves the overall WERs across conditions.

A survey of convolutive blind source separation methods, Multichannel Speech Processing Handbook, 2007.

1 year, 4 months ago @ machinelearning.apple.com
Apple at NeurIPS 2018

This December we’ll be in Montreal, Canada, attending the 32nd Conference on Neural Information Processing Systems (NeurIPS).

We’ll have a booth staffed with Machine Learning experts from across Apple who would love to chat with you.

Please drop by if you’re attending the conference.

Apple is dedicated to advancing state-of-the-art machine learning technologies.

If you are interested in applying to specific machine learning positions, please explore opportunities at Machine Learning Jobs At Apple.

1 year, 4 months ago @ machinelearning.apple.com
Can Global Semantic Context Improve Neural Language Models?

In this article, we explore whether we can improve word predictions for the QuickType keyboard using global semantic context.

Can this global semantic context result in better language models?

All neural network solutions to date predict either a word in context or the local context itself, which doesn’t adequately reflect global semantic information.

Conclusion: We set out to assess the potential benefits of incorporating global semantic information into neural language models.

In summary, using bi-LSTM RNNs to train global semantic word embeddings can indeed lead to improved accuracy in neural language modeling.

1 year, 6 months ago @ machinelearning.apple.com
Finding Local Destinations with Siri’s Regionally Specific Language Models for Speech Recognition

The accuracy of automatic speech recognition (ASR) systems has improved phenomenally over recent years, due to the widespread adoption of deep learning techniques.

We decided to improve Siri’s ability to recognize names of local POIs by incorporating knowledge of the user’s location into our speech recognition system.

We’ve been able to significantly improve the accuracy of local POI recognition and understanding by incorporating users’ geolocation information into Siri’s ASR system.

Incremental Language Models for Speech Recognition Using Finite-state Transducers.

Convolutional Neural Networks for Speech Recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, …

1 year, 8 months ago @ machinelearning.apple.com
Personalized Hey Siri

When a user says, “Hey Siri, how is the weather today?” the phone wakes up upon hearing “Hey Siri” and processes the rest of the utterance as a Siri request.

The application of a speaker recognition system involves a two-step process: enrollment and recognition.

User Enrollment: The main design discussion for personalized “Hey Siri” (PHS) revolves around two methods for user enrollment: explicit and implicit.

Improving the Speaker Transform: The speaker transform is the most important part of any speaker recognition system.

At its core, the purpose of the “Hey Siri” feature is to enable users to make Siri requests.

1 year, 11 months ago @ machinelearning.apple.com
Learning with Privacy at Scale

We develop a system architecture that enables learning at scale by leveraging local differential privacy, combined with existing privacy best practices.

In this article, we give an overview of a system architecture that combines differential privacy and privacy best practices to learn from a user population.

Differential privacy [2] provides a mathematically rigorous definition of privacy and is one of the strongest guarantees of privacy available.

In our system, we choose not to collect raw data on the server, which central differential privacy requires; hence, we adopt local differential privacy, a superior form of privacy [3].
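Randomized response is the textbook local-DP mechanism and shows why the server never needs raw values: each device randomizes its own bit before reporting, and the server debiases the aggregate. A toy sketch (an illustration of the principle, not Apple's production mechanism):

```python
import math
import random

def randomize(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it."""
    p_true = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return bit if rng.random() < p_true else 1 - bit

def estimate_mean(reports, epsilon):
    """Debias the noisy reports to recover the population mean."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)

rng = random.Random(0)
truth = [1 if i % 10 < 3 else 0 for i in range(100_000)]   # true mean is 0.3
reports = [randomize(b, epsilon=1.0, rng=rng) for b in truth]
est = estimate_mean(reports, epsilon=1.0)
```

No individual report reveals a user's true bit with confidence, yet the debiased aggregate converges to the true population mean.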

Conclusion: In this article, we have presented a…

2 years, 4 months ago @ machinelearning.apple.com
Uber Engineering
last post 1 month ago
Under the Hood of Uber ATG’s Machine Learning Infrastructure and Versioning Control Platform for Self-Driving Vehicles

A trained model, which requires as input the data set artifact, the model training code, and configuration files governing model training.

Example sequence of events: registering a new data set. Upon user registration of a new data set, the VerCD Data set Service stores the dependency metadata in our database.

Data set service API: The data set service is responsible for tracking the dependencies for building a given data set.

The REST API supports the functions of creating a new data set, reading the metadata for a data set, updating the metadata of a data set, deleting a data set, and getting the artifact locations of the data set (such as in S3 or HDFS).

For instance, the VerCD data set serv…

1 month ago @ eng.uber.com
Building a Backtesting Service to Measure Model Performance at Uber-scale

To better assess the performance of our models, we built a backtesting service for measuring forecast model error rates.

The backtesting service runs in a distributed system, allowing multiple models (>10), many backtesting windows (>20), and models for different cities (>200) to run simultaneously.

Backtesting at scale: Our data science teams regularly create forecast models and statistics to better understand budget spending and project financial performance.

For the purposes of our backtesting service, we chose to leverage two primary backtesting data split mechanisms: backtesting with an expanding window and backtesting with a sliding window. Above, we showcase three windows for each metho…
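The two split mechanisms read as plain index arithmetic over a time series: an expanding window grows the training set while keeping its start fixed, and a sliding window keeps the training size fixed and moves forward. A minimal sketch, not Uber's service code; function names, sizes, and the horizon are illustrative.

```python
def expanding_windows(n, initial_train, horizon):
    """Each window keeps the training start fixed and grows the training set;
    the next `horizon` points are held out for forecast evaluation."""
    splits = []
    end = initial_train
    while end + horizon <= n:
        splits.append((range(0, end), range(end, end + horizon)))
        end += horizon
    return splits

def sliding_windows(n, train_size, horizon):
    """Each window keeps the training size fixed and slides forward."""
    splits = []
    start = 0
    while start + train_size + horizon <= n:
        splits.append((range(start, start + train_size),
                       range(start + train_size, start + train_size + horizon)))
        start += horizon
    return splits

exp = expanding_windows(n=12, initial_train=6, horizon=2)   # 3 windows
sld = sliding_windows(n=12, train_size=6, horizon=2)        # 3 windows
```

Each `(train, test)` pair can then be handed to an independent worker, which is what makes this kind of backtesting easy to parallelize across models, windows, and cities.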

1 month, 3 weeks ago @ eng.uber.com
Uber AI in 2019: Advancing Mobility with Artificial Intelligence

At the forefront of this effort is Uber AI, Uber’s center for advanced artificial intelligence research and platforms.

In this year alone, AI research at Uber has led to significant improvements in demand prediction and more seamless pick-up experiences.

Fostering AI collaboration through open source: In 2019, Uber AI was committed to sharing knowledge and best practices with the broader scientific community through open source projects.

Looking towards 2020: Next year, Uber AI will continue to innovate, collaborate, and contribute to Uber’s platform services through the application of AI across our business.

For more on Uber AI, be sure to check out related articles on the Uber Engineering Blo…

3 months, 2 weeks ago @ eng.uber.com
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data

We in Uber AI Labs investigated the intriguing question of whether we can create learning algorithms that automatically generate training data, learning environments, and curricula to help AI agents rapidly learn.

Increasingly, neural architecture search (NAS) algorithms are being deployed to automate the search for architectures, with great results.

New learners are able to learn on synthetic data faster than on real data (red line vs. blue line in Figure 1).

In our experiments, the estimates come either from training for 128 SGD steps on GTN-generated data or real data.

Then, for each method, the final best architecture according to the estimate is trained a long time on real data.

3 months, 2 weeks ago @ eng.uber.com
Controlling Text Generation with Plug and Play Language Models

This article discusses an alternative approach to controlled text generation, titled the Plug and Play Language Model (PPLM), introduced in a recent paper from Uber AI.

In many ways, language models are like wise but unguided wooly mammoths that lumber wherever they please.

As we will show below, attribute models with only a single layer containing 4,000 parameters perform well at recognizing attributes and guiding generation.

Thus, we use the unmodified language model to ensure the fluency of language is maintained at or near the level of the original language model (in this example, GPT-2-medium).

Multiple attribute models: We may combine multiple attribute models in controlled generation, …

4 months ago @ eng.uber.com
Food Discovery with Uber Eats: Using Graph Learning to Power Recommendations

To this end, we previously developed ML models to better understand queries and to perform multi-objective optimization in the Uber Eats search and recommender system, surfacing better food options.

Graph learning in a nutshell: To best understand how we made our Uber Eats recommendations more accurate, it helps to know the basics of how graph learning works.

For example, to represent an eater in our Uber Eats model we don’t only use order history to inform order suggestions, but also information about what food items are connected to past Uber Eats orders and insights about similar users.

For our Uber Eats use case, we opted for a graph neural network (GNN)-based approach to obtain an …

4 months ago @ eng.uber.com
Uber Goes to NeurIPS 2019

This year, Uber is presenting 11 papers at the NeurIPS 2019 conference in Vancouver, Canada!

Scalable Global Optimization via Local Bayesian Optimization
David Eriksson (Uber AI) · Michael Pearce (Uber AI intern / Warwick University) · Jacob Gardner (Uber AI) · Ryan Turner (Uber AI) · Matthias Poloczek (Uber AI)
ArXiv
December 10 at 4:25 pm, West Ballroom C, NeurIPS Spotlight Talk
December 10 at 5:30 pm, East Exhibition Hall B&C, Poster #9
Bayesian optimization (BO) has recently emerged as a successful technique for the global optimization of black-box functions.

For additional information about our talks and posters, check out the Uber NeurIPS 2019 site.

Interested in the ML research that Uber …

4 months ago @ eng.uber.com
Announcing the 2020 Uber AI Residency

On behalf of Uber, we invite you to join us on our journey as an Uber AI Resident.

Established in 2018, the Uber AI Residency is a 12-month training program for recent college and master’s graduates, professionals who are looking to reinforce their AI skills, and those with quantitative skills and interest in becoming an AI researcher at Uber.

This year’s AI residency program will focus on our self-driving cars project through Uber Advanced Technology Group (ATG).

Open source & publication opportunitiesAcross Uber, we are committed to an open and inclusive research mission that benefits the community at large through both Uber AI and Uber ATG Research.

Learn more about the Uber AI Residency…

4 months, 1 week ago @ eng.uber.com
Get to Know Uber ATG at ICCV, CoRL, and IROS 2019

We hope our approach to sharing will deepen the interactions and collaborations between industry and academia, and will ultimately bring self-driving research communities together.

This year, Uber ATG has five publications accepted at ICCV, two publications accepted at CoRL, and two publications accepted at IROS.

In addition, Raquel Urtasun, Uber ATG Chief Scientist and Head of Uber ATG R&D, will be giving four talks at ICCV.

Please come visit us at ICCV (booth #D-7), IROS, and CoRL to learn more about our lab’s research, discuss the work with our researchers, and hear about career opportunities with Uber ATG.

Learn more about research opportunities with Uber ATG by visiting our careers page.

5 months, 1 week ago @ eng.uber.com
Evolving Michelangelo Model Representation for Flexibility at Scale

To address these issues, we evolved Michelangelo’s use of Spark MLlib, particularly in the areas of model representation, persistence, and online serving.

Its end-to-end support for scheduled Spark-based data ingestion, model training, and evaluation, along with deployment for batch and online model serving, has gained wide acceptance across Uber.

More recently, Michelangelo has evolved to handle more use cases, including serving models trained outside of core Michelangelo.

Michelangelo had specific pipeline model definitions for each supported model type, with an in-house custom protobuf representation of trained models for serving.

It is important to note that Michelangelo online serving …

5 months, 3 weeks ago @ eng.uber.com
Searchable Ground Truth: Querying Uncommon Scenarios in Self-Driving Car Development

We use these traffic scenarios to develop machine learning models that help our self-driving cars safely react to common, and not so common, scenarios that come up in a given operational domain.

These specific scenarios can then be used to train our self-driving cars to safely navigate a traffic situation with bicyclists.

Modeled tables are crucial in making our data useful for training self-driving cars to operate safely.

The ability to query data that replicates traffic scenarios ranging from the everyday to the very rare will help prepare our self-driving cars for any situation.

There is no shortage of work to be done in making the future of self-driving cars a reality.

6 months ago @ eng.uber.com
Science at Uber: Improving Transportation with Artificial Intelligence

In our Science at Uber video series, Uber employees talk about how we apply data science, artificial intelligence, machine learning, and other innovative technologies in our daily work.

Zoubin Ghahramani, Chief Scientist at Uber, spent many years in academia researching artificial intelligence.

Applied to the huge amount of data around transportation, artificial intelligence has the capability to make travel easier and more seamless.

At Uber, deep learning, an area of artificial intelligence research, finds use in multiple applications, including improving our understanding of cities and traffic, helping compute ETAs, and in developing self-driving cars.

Beyond deep learning, however, we al…

6 months, 3 weeks ago @ eng.uber.com
Three Approaches to Scaling Machine Learning with Uber Seattle Engineering

In an effort to constantly optimize our operations, serve our customers, and train our systems to perform better and better, we leverage machine learning (ML).

In addition, we make many of our ML tools open source, sharing them with the community to advance the state of the art.

In this spirit, members of our Seattle Engineering team shared their work at an April 2019 meetup on ML and AI at Uber.

Below, we highlight three different approaches Uber Seattle Engineering is currently working on to improve our ML ecosystem and that of the tech community at large.

Horovod: Distributed Deep Learning on Apache Spark. During his talk, senior software engineer Travis Addair, from the ML Platform team, …

6 months, 3 weeks ago @ eng.uber.com
Science at Uber: Powering Machine Learning at Uber

At Uber, we take advanced research work and use it to solve real world problems.

In our Science at Uber video series, Uber employees talk about how we apply data science, artificial intelligence, machine learning, and other innovative technologies in our daily work.

Machine learning helps Uber make data-driven decisions which not only enable services such as ridesharing, but also financial planning and other core business needs.

Our machine learning platform, Michelangelo, lets teams across the company train, evaluate, and deploy models that help us forecast a wide range of business metrics.

The platform enables our teams to simply, flexibly, and intelli…

6 months, 4 weeks ago @ eng.uber.com
Introducing LCA: Loss Change Allocation for Neural Network Training

In our paper, LCA: Loss Change Allocation for Neural Network Training, to be presented at NeurIPS 2019, we propose a method called Loss Change Allocation (LCA) that provides a rich window into the neural network training process.

Our methods: One way of revealing detailed insights into the neural network training process is to measure how much each trainable parameter of the neural network “learns” at any point in time.

Suppose we are training a network, and during a single training iteration the parameter vector moves from θ_t to θ_{t+1}.

We call this measure Loss Change Allocation (LCA): how much a parameter’s movement at an iteration caused the loss to go up or down.
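As a rough illustration of that bookkeeping, a first-order version of LCA credits each parameter with its gradient times its movement over the step, so the per-parameter credits sum to approximately the total loss change. A toy sketch with a quadratic loss, not the paper's implementation (which integrates along the training path); all names here are mine.

```python
import torch

def lca_step(loss_fn, theta, update):
    """First-order Loss Change Allocation for one training step:
    each parameter i is credited with grad_i * delta_theta_i, and the
    per-parameter credits approximately sum to the total loss change."""
    theta = theta.detach().requires_grad_(True)
    loss = loss_fn(theta)
    (grad,) = torch.autograd.grad(loss, theta)
    delta = update - theta.detach()
    # negative entries: parameters whose movement reduced the loss
    return grad * delta

loss_fn = lambda t: (t ** 2).sum()     # toy quadratic loss
theta = torch.tensor([1.0, -2.0, 0.5])
new_theta = theta - 0.1 * 2 * theta    # one SGD step with lr=0.1 (grad = 2*theta)
lca = lca_step(loss_fn, theta, new_theta)
```

Summing `lca` over parameters and over iterations then decomposes the whole training curve into per-parameter (or per-layer) contributions, which is the window into training the post describes.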

If we track validation LCA as wel…

6 months, 4 weeks ago @ eng.uber.com
✈️ Telegram
DL in NLP
last post 10 hours ago
🔥 A post reviewing a number of recent NLP publications

10 hours ago @ t.me
Automatically Neutralizing Subjective Bias in Text

Pryzant et al.

arxiv.org/abs/1911.09709 A fun new task: neutralizing subjectivity in text. The dataset was mined from Wikipedia edits, and the model is a two-part system: BERT detects subjective expressions and an LSTM rewrites them. I wonder what would happen if this model were applied to my Twitter feed.

10 hours ago @ t.me
[photo]

10 hours ago @ t.me
Emerging Cross-lingual Structure in Pretrained Language Models

Wu, Conneau, et al. [FAIR]

arxiv.org/abs/1911.01464 A paper for those who dislike the SOTA-driven approach. The authors pose concrete questions about multilingual models and try to answer them. Q: Do anchor points (tokens identical in spelling and meaning that are automatically mapped to a single embedding at the preprocessing stage) matter for mBERT pretraining?

A: Only weakly; they add 1-2 points on downstream tasks. Q: How important is model parameter sharing across languages?

A: Critically important: for distant language pairs (En-Ru, En-Zh), downstream quality drops almost to chance level if only half of the parameters are shared. Q: …

15 hours ago @ t.me
Recommended if you are interested in zero-shot multilingual transfer. A version of the paper with the interesting parts highlighted.

15 hours ago @ t.me
[file]

15 hours ago @ t.me

[photo]

15 hours ago @ t.me
Deep Learning Reproducibility with TensorFlow

youtu.be/Ys8ofBeR2kA A good overview of the reproducibility problem in DL and how to address it.

Recommended viewing for everyone: reproducibility matters both in research (the metrics in your paper) and in production (in regression testing, for example).

It is not only relevant to TF; it adapts easily to PyTorch. A few observations of my own:

1. a different random seed can change your metric by as much as 5-10 points (see one of the posts above)

2. if you forget to set even a single seed (python, numpy, cuda): the same effect (even if all the other seeds are set)

3. switching from GPU to CPU: 0.5 points

4. GPU non-determinism: ~0.1 points. I would not be surprised if these numbers actually…
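The seed checklist above (python, numpy, torch CPU/CUDA) can be collapsed into one helper; a minimal PyTorch sketch (the helper name and seed value are mine, and full cuDNN determinism still costs some speed):

```python
import os
import random

import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    """Seed every RNG from the checklist: Python, NumPy, and PyTorch (CPU and CUDA)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)           # CPU RNG
    torch.cuda.manual_seed_all(seed)  # all GPU RNGs (no-op without CUDA)
    os.environ["PYTHONHASHSEED"] = str(seed)
    # Trade speed for determinism in cuDNN kernels:
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)
run_a = torch.rand(3).tolist()
set_seed(42)
run_b = torch.rand(3).tolist()  # identical to run_a
```

Calling this once at the top of every experiment script is what turns "the metric moved by 5 points" into a signal rather than seed noise.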

1 day, 21 hours ago @ t.me
NLP newsletter

3 days, 10 hours ago @ t.me
Yesterday was April 1st, which means the conference at CMU. Papers that caught my eye:

1. A new sorting algorithm that "uses human intelligence to compare the elements of a (possibly) heterogeneous list"

2. "In this paper I (switching to ’I’ to avoid sounding pretentious with ’we’) introduce Artificial General Relativity (AGR) which, when achieved, will allow us to control gravity and spacetime."

3. Proceedings: Audio recording of the conference:

3 days, 20 hours ago @ t.me
[photo]

3 days, 20 hours ago @ t.me
An example of using PyTorch on TPU for NER (the code itself is in the linked GitHub repo; the Colab only contains a small wrapper that calls it).

It still looks a bit rough, but last year setting up the whole environment took more than an hour, and now everything works in a couple of minutes, so I recommend reading the code and running the Colab. Interestingly, Lightning already supports TPU in two lines: 1. Pass num_tpu_cores when creating the Trainer. 2. Do the optimizer step with torch_xla.core.xla_model.optimizer_step(optimizer) instead of the usual optimizer.step(). twitter.com/srush_nlp/status/1233161898268467206

4 days, 10 hours ago @ t.me
I came across a blog post about a paper that is fairly old by now (How Does Batch Normalization Help Optimization?, Santurkar et al., 2018). The post explains the intuition behind the paper quite well, with a bit of math included, just the way you like it. And why this paper matters at all: it shows that our intuition sometimes leads to wrong conclusions and always needs to be checked. blog.paperspace.com/busting-the-myths-about-batch-normalization

4 days, 10 hours ago @ t.me
Deep Learning Memory Usage and Pytorch Optimization Tricks

www.sicara.ai/blog/2019-28-10-deep-learning-memory-usage-and-pytorch-optimization-tricks A good post both for beginners and for those already somewhat immersed in DL. It explains why neural networks (backprop in particular) consume so much memory and how to live with it.

4 days, 10 hours ago @ t.me
Rethinking Batch Normalization in Transformers

Shen et al.

arxiv.org/abs/2003.07845 The authors study normalization in transformers. Their first finding: the variance of in-batch statistics in NLP tasks is orders of magnitude higher than in CV. Consequently the variance is also large in the gradients, so it can affect convergence, both its speed and where it ends up. They then propose a new type of normalization, PowerNorm, and prove that it (like BatchNorm) improves the Lipschitz smoothness of the loss surface. Experiments show a boost that is small in machine translation and noticeable in language modeling. We need more studies like this: the transformer architecture bakes in many widely accepted but under-studied practices…

6 days, 17 hours ago @ t.me
gonzo-обзоры ML-статей
last post 1 day, 10 hours ago
Proving the Lottery Ticket Hypothesis: Pruning is All You Need

Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir

Paper: https://arxiv.org/abs/2002.00585 A follow-up to the previous work: the stronger version of the lottery ticket hypothesis has now received a mathematical proof. For any bounded distribution and any target network with bounded weights, a sufficiently over-parameterized network with random weights contains a subnetwork with roughly the same accuracy as the target network. WITHOUT ANY TRAINING. (Disclaimer: I have not yet verified the derivation of the proof.) The authors distinguish two types of subnetworks: subnetworks with individual weights removed (weight-subnetworks) and subnetworks with entirely removed n…

1 day, 10 hours ago @ t.me
To gauge the scale of the disaster

1 day, 10 hours ago @ t.me
What's Hidden in a Randomly Weighted Neural Network?

Vivek Ramanujan, Mitchell Wortsman, Aniruddha Kembhavi, Ali Farhadi, Mohammad Rastegari

Paper: https://arxiv.org/abs/1911.13299 From the “nature of things” series. Randomly initialized networks contain subnetworks that perform well without any training at all. For example, a randomly initialized Wide ResNet-50 contains a subnetwork with random weights that, _without training_, matches the accuracy of a ResNet-34 _trained_ on ImageNet. The paper proposes an algorithm for finding such subnetworks efficiently. In a sense, it continues the Lottery Ticket Hypothesis story (https://arxiv.org/abs/1803.03635). If you don't remember, that was a…
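The selection step behind such results can be sketched with a score-based mask: the weights stay frozen at their random initialization, and per layer only the top-k fraction of them (by a learned score) participates in the forward pass. This is a toy sketch of the masking step only, not the paper's full training procedure; names, shapes, and keep_frac are mine.

```python
import torch

def masked_linear(x, weight, scores, keep_frac=0.5):
    """One linear layer of a 'hidden' subnetwork: weights are frozen random
    values, and a binary mask keeps only the top-k entries by score."""
    k = int(keep_frac * scores.numel())
    # threshold = k-th largest score; entries at or above it stay in the subnetwork
    threshold = scores.flatten().kthvalue(scores.numel() - k + 1).values
    mask = (scores >= threshold).float()
    return x @ (weight * mask).t()

torch.manual_seed(0)
w = torch.randn(4, 8)   # frozen random weights (never trained)
s = torch.randn(4, 8)   # scores; in the paper these are learned, here random
out = masked_linear(torch.randn(2, 8), w, s, keep_frac=0.25)
```

Training then updates only `scores` (never `weight`), so "finding" the subnetwork replaces learning the weights.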

2 days, 10 hours ago @ t.me
On the question of right and wrong initializations

2 days, 10 hours ago @ t.me
The best part

2 days, 10 hours ago @ t.me
Rethinking the Value of Network Pruning

Zhuang Liu, Mingjie Sun, Tinghui Zhou, Gao Huang, Trevor Darrell

Paper: https://arxiv.org/abs/1810.05270 Code (unofficial): https://github.com/Eric-mingjie/rethinking-network-pruning On rethinking values, and once again on pruning. Classical pruning follows the pipeline "train the model" → "prune" → "fine-tune". This work overturns some widely held beliefs about that process: 1) That it is important to start by training a large over-parameterized network, from which we can then remove some of the redundant parameters without substantially affecting accuracy, and that this is better than training a small model from scratch. 2) And the weights,…

1 week, 1 day ago @ t.me
The authors show the whole difference comes down to the learning rate: in LTH it was too low. For structured pruning, at both low and high LR, the winning ticket does not beat random initialization. For unstructured pruning, the original initialization gives an advantage only at a low LR (and with such a low LR the final quality is worse anyway than with the high LR that is normally used). The authors of the original LTH paper, incidentally, write that they could not find winning tickets at a high LR ("At the higher learning rate, iterative pruning does not find winning tickets, and performance is no better than when the pruned networks are randomly reinitialized."). So it all adds up.

1 week, 1 day ago @ t.me
Example result

1 week, 1 day ago @ t.me
Guided sparsification

1 week, 1 day ago @ t.me
Lottery ticket suboptimality: it's all about the magic bubbles!

1 week, 1 day ago @ t.me
Other well-known BERT distillations

(2019/03) “Distilling Task-Specific Knowledge from BERT into Simple Neural Networks”

Paper: https://arxiv.org/abs/1903.12136 Here BERT is distilled into a single-layer BiLSTM, yielding results comparable to ELMo with a hundred times fewer parameters and 15x less inference time. As the table in the previous post shows, both DistilBERT and TinyBERT beat this result on quality, although the compression and speedup ratios here seem to be higher. (2019/08) “Patient Knowledge Distillation for BERT Model Compression”

Paper: https://arxiv.org/abs/1908.09355

Code: https://github.com/intersun/PKD-for-BERT-Model-Compression They proposed a method called Patient Kn…

1 week, 2 days ago @ t.me
BiLSTM_soft results

1 week, 2 days ago @ t.me
BERT_PKD results

1 week, 2 days ago @ t.me
Results of a distilled BERT with a compressed vocabulary