Very ML
Our SOTA selection of ML news. Even LeCun reads some of it.
DataTau
last post 1 hour ago
Hello world

1 hour ago @ datatau.net
Second-Order Thinking - How to become a thinking powerhouse

17 hours ago @ datatau.net
Deep Learning: The Free eBook

1 day, 1 hour ago @ datatau.net
Visualize all your NLP metrics to decide which ML algorithms to choose from

1 day, 3 hours ago @ datatau.net
NVIDIA’s GameGAN Uses AI to Recreate Pac-Man and Other Game Environments

1 day, 15 hours ago @ datatau.net
AI Weekly Update (May 26th, 2020)

1 day, 15 hours ago @ datatau.net
Hands-on Data Science Course 20% off for a LIMITED TIME: puj4gb4

1 day, 16 hours ago @ datatau.net
Breakthrough Colourization Technique Enables Instance-Aware Treatment of Multiple Objects

1 day, 20 hours ago @ datatau.net
How to Write a Great Data Science CV

1 day, 21 hours ago @ datatau.net
What are the ML techniques to get started on any problem?

1 day, 22 hours ago @ datatau.net
The Best NLP with Deep Learning Course is Free

2 days, 2 hours ago @ datatau.net
Free Retail AI Webinar

2 days, 20 hours ago @ datatau.net
Are you still using sklearn for Regression Analysis?

2 days, 21 hours ago @ datatau.net
Hadoop 🐘 with Python 🐍 (I): PySpark

2 days, 22 hours ago @ datatau.net
/r/MachineLearning
last post 50 minutes ago
[D] recommender systems - concept map

50 minutes ago @ reddit.com
[D] find the cat in the picture

4 hours ago @ reddit.com
[D] What's a good tool for organizing batch jobs?

4 hours ago @ reddit.com
[D] What is the tool stack of ML teams at startups? + intel from 41 companies

4 hours ago @ reddit.com
[P] Implementing Neural Turing Machines in pytorch

5 hours ago @ reddit.com
[D] Reproducing 'Safe Distance' by landing.ai

6 hours ago @ reddit.com
[D] What prevents this NVIDIA model to produce 2D instead of 3D models of objects?

6 hours ago @ reddit.com
[D] Is there a classification task with multiple attribute regression?

11 hours ago @ reddit.com
[P] PyTorch repo of Routing Transformer, SOTA long-range language model

11 hours ago @ reddit.com
[D] Clustering Evaluation with Multi-Label Ground Truths

14 hours ago @ reddit.com
[Project] Suggestions about multi-agent training/inference

15 hours ago @ reddit.com
[D] Expected Behaviour of WGAN Loss Functions?

15 hours ago @ reddit.com
Network Bending: Manipulating The Inner Representations of Deep Generative Models

15 hours ago @ reddit.com
[R] paper discussion | Representation Learning of Histopathology Images using Graph Neural Nets

16 hours ago @ reddit.com
[Discussion] What are some unethical uses of machine learning that you think most people are unaware of?

16 hours ago @ reddit.com
Towards Data Science
last post 8 hours ago
Visualizing Feature Maps and Filters

Have you ever wondered about visualizing the outputs of intermediate layers or filters of a Convolutional Neural Network?

Check out my previous articles on Convolutional Neural Network.

I hope some of you are thinking, why the hell am I writing this article on the visualization of feature maps or how accessing the filters or bias term will help?

classifier_model = keras.models.Sequential([
    keras.layers.Conv2D(input_shape=(300, 300, 3), activation='relu', kernel_size=(5, 5), filters=32),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(activation='relu', kernel_size=(5, 5), filters=64),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(128, acti…

8 hours ago @ towardsdatascience.com
How Different Metrics Correlate with Winning in the NBA over 30 Years

How has a metric’s correlation to winning changed over the last 30 years?

Correlation of a team’s rate metrics to winning in the current season: in 2019–2020, two-point field goal percentage is the metric most highly correlated with winning (.71).

This means teams with increasingly higher two-point field goal percentages tended to have increasingly higher win records.

Other metrics with negative win correlations include teams with higher ratios of Assists to Made Field Goals and teams with higher offensive rebounding rates.

While the highest-winning teams shot 3.7 percentage points better than the worst teams in 1990, the difference shrank to 1.9 percentage points in 2020.
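
A correlation like the .71 figure above is a one-liner in pandas; here is a minimal sketch on made-up team data (these numbers are illustrative, not the article's):

```python
import pandas as pd

# Made-up season-level team data: two-point FG% and win rate per team.
teams = pd.DataFrame({
    "fg2_pct":  [0.48, 0.51, 0.53, 0.55, 0.57, 0.60],
    "win_rate": [0.30, 0.42, 0.50, 0.55, 0.66, 0.75],
})

# Pearson correlation between a rate metric and winning.
corr = teams["fg2_pct"].corr(teams["win_rate"])
print(round(corr, 2))
```

Repeating this per season and per metric gives the kind of trend the article tracks over 30 years.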

8 hours ago @ towardsdatascience.com
API Private Information Analyzer

To help you with that task, I developed an API Private Information Analyzer that you can use to analyze your complete API payload.

It is no surprise that professional tools like BigId use regular expressions as the first method for classifying private information; clustering and other fancy AI come next.

The tool takes a collection of input entries; each contains all the data elements of an API call (i.e., the URI, headers, and payloads of both the request and the response).

If you need to quickly analyze your API for privacy issues, check out the API Private Information Analyzer.

It will tell you if your API involves data that should be considered as private data…
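
The "regular expressions first" approach mentioned above can be sketched with the standard library alone; the patterns and category names below are illustrative assumptions, not the analyzer's actual rules:

```python
import re

# Illustrative regex rules for common private-data shapes
# (real tools ship far more robust pattern sets).
PII_PATTERNS = {
    "email":       re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone":       re.compile(r"\+?\d{1,2}[\s-]?\(?\d{3}\)?[\s-]?\d{3}[\s-]?\d{4}"),
    "credit_card": re.compile(r"\b(?:\d{4}[\s-]?){3}\d{4}\b"),
}

def classify_private_info(payload: str) -> dict:
    """Return, per category, the matches found in an API payload string."""
    return {name: pat.findall(payload)
            for name, pat in PII_PATTERNS.items()
            if pat.findall(payload)}

# Usage: scan a (hypothetical) request body.
payload = '{"user": "jane@example.com", "card": "4111 1111 1111 1111"}'
findings = classify_private_info(payload)
print(sorted(findings))  # → ['credit_card', 'email']
```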

9 hours ago @ towardsdatascience.com
How to collect comments from any New York Times Article to a Pandas DataFrame

The best part of the New York Times (NYT) is the active and highly moderated comment section on their articles.

Open to comments for 24 hours, each article is moderated by a human, so troll comments are practically non-existent.

There are three tabs, the NYT Picks, Reader Picks, and All.

Under the NYT Picks, the NYT will highlight comments that represent a range of views and are judged as either most interesting or useful, in regards to the topic of the article.

I’ve recently found out that the New York Times has released a new API, currently in beta, that can help us easily query some of these wo…

9 hours ago @ towardsdatascience.com
5 Steps to Create a Basic Machine Learning Model using Python

In this article, we will explore Udemy class data from Kaggle.com and try to predict which classes are successful using Pandas, Matplotlib, Seaborn, and Scikit-learn.

RangeIndex: 3678 entries, 0 to 3677
Data columns (total 12 columns):
 #   Column               Non-Null Count  Dtype
---  ------               --------------  -----
 0   course_id            3678 non-null   int64
 1   course_title         3678 non-null   object
 2   url                  3678 non-null   object
 3   is_paid              3678 non-null   bool
 4   price                3678 non-null   int64
 5   num_subscribers      3678 non-null   int64
 6   num_reviews          3678 non-null   int64
 7   num_lectures         3678 non-null   int64
 8   level                3678 non-null   object
 9   content_duration     3678 non-null   float64
 10  published_timestamp  3…

9 hours ago @ towardsdatascience.com
Auto-Reflecting Tables in SQLAlchemy

Auto-reflecting tables and columns in SQLAlchemy, or how a few lines of code make me want to do cartwheels down the street. Groan, not another SQLAlchemy article!

The following is a portion of the Python code to be discussed. Simple, right?

Suffice to say that many of my Python scripts use string-based parameters obtained from JSON for their configuration.

Before diving into the code below, if you are wondering how to configure a database using SQLAlchemy I recommend you check out the following helpful articles: SQLAlchemy — Python Tutorial by Vinay Kudari and How to use Python SQLite3 using SQLAlchemy by Mahmud Ahsan.

I think another snake image is appropriat…
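
Reflection of this kind is what SQLAlchemy's automap extension does; a minimal sketch against an in-memory SQLite database with a hypothetical users table (the article's own schema is not reproduced here):

```python
from sqlalchemy import create_engine, text
from sqlalchemy.ext.automap import automap_base

# In-memory SQLite stand-in for a real database, with a made-up table.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"))

# Reflect: let SQLAlchemy discover tables and columns instead of
# declaring mapped classes by hand.
Base = automap_base()
Base.prepare(autoload_with=engine)

Users = Base.classes.users
print([c.name for c in Users.__table__.columns])  # → ['id', 'name']
```

The reflected class can then be used in queries exactly like a hand-written declarative model.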

9 hours ago @ towardsdatascience.com
Kaggle/Academic vs Real-World Data Science Analytics

The analytics process for a churn-prediction data science project begins with defining the problem. On Kaggle, you read the problem statement for customer churn.

In a temporary churn problem, it may happen that a subscriber churns multiple times.

The problem statement for any churn project would be to identify the high-risk churn customers and try to retain them.

In the real world, doing a bivariate visualization of each variable against the target variable churn shows the impact of each variable on churn.

Note: I would be more than happy to help data science students in the US who are looking for internships or full-time jobs in data science.

9 hours ago @ towardsdatascience.com
How I integrated the Instagram API in React Native

It provides a NodeJS Instagram private API client.

The private API client allows you to like, follow, upload, basically everything you can do in the native Instagram app.

I was having success working with my script locally, and I wondered whether it would be possible to use the client in React Native.

The Node.js library includes a React Native bridge module that facilitates communication between the Node.js code and the React Native code.

In my index.js file for React Native, I import nodejs from “nodejs-mobile-react-native”. This allows me to start the main.js script using Node.js.

10 hours ago @ towardsdatascience.com
AI Strategy in The Age of Vertical Federated Learning and Data Sharing

Federated Learning tries to address siloed and unstructured data, lack of data, privacy, and regulation of data sharing, as well as incentive models for data alliances.

Recently, I had the opportunity to oversee the implementation of vertical federated learning based on a “data sharing alliance” with some of our competitors.

A data sharing alliance using a Vertical Federated Learning architecture would help us a lot.

In the end, the best strategy for constructing vertical federated learning depends on a number of factors. I will not go into too much detail, because other articles have already covered the technical aspects of Fede…

10 hours ago @ towardsdatascience.com
Accelerating end-to-end Machine Learning workflows with NVIDIA RAPIDS

To avoid any misconception here, know that this is just a general idea of how an end-to-end workflow works.

With an enormous amount of data generated every day and high data processing requirements (Terabytes of Dataset), let’s just say CPUs are simply not enough.

Predicting NYC taxi fares: first things first, you should know your data better than you know yourself.

Any guess how much time it will take to complete the ETL process?

We will be using cuDF with Dask & XGBoost to scale GPU DataFrame ETL-style operations and for model training.

11 hours ago @ towardsdatascience.com
Attack Pattern Detection and Prediction

Cyber-adversaries are becoming more sophisticated in their efforts to avoid detection, and many modern malware tools are already incorporating new ways to bypass antivirus and other threat detection measures.

The aforementioned cyber-threat prediction systems offer promising but limited possibilities; predicting large-scale coordinated attacks requires progress on several fronts, including the detection and prediction of events generated in computer systems.

Obfuscation techniques deliberately make malicious code difficult to understand in order to bypass network detection.

They took data from both public and private sources and discovered and used chara…

12 hours ago @ towardsdatascience.com
Surveying Corporate America’s Debt

Slicing by sector and market cap: let’s focus our lens a bit and include market cap in our analysis as well.

We can use a heat map to visualize average debt/EBITDA levels by sector and market cap (market cap is a decent proxy for the size and scale of the firm).

Overall, I notice the following: small real estate firms and mega-cap industrial firms have huge debts relative to their incomes.

I would guess that this is because mega-cap firms dominate their respective industries and enjoy the competitive advantages of scale and therefore higher profit margins.

So while they might borrow more on an absolute dollar basis, relative to incomes (thanks to their higher profitability), mega-cap firms act…
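
The sector-by-market-cap aggregation behind such a heat map is essentially a pivot table; a toy pandas sketch with made-up debt/EBITDA figures (seaborn.heatmap could then colour the resulting matrix):

```python
import pandas as pd

# Made-up firm-level data; the article's real dataset is not reproduced here.
firms = pd.DataFrame({
    "sector":      ["Real Estate", "Real Estate", "Industrials", "Industrials"],
    "market_cap":  ["Small", "Mega", "Small", "Mega"],
    "debt_ebitda": [6.5, 3.1, 2.0, 4.8],
})

# Average debt/EBITDA by sector and market-cap bucket: the matrix a
# heat map would colour.
matrix = firms.pivot_table(index="sector", columns="market_cap",
                           values="debt_ebitda", aggfunc="mean")
print(matrix)
```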

12 hours ago @ towardsdatascience.com
What are categorical variables and how to encode them?

The first method we are going to learn is called one-hot encoding, and it is best suited for nominal variables.

As you can see, the categorical variable jersey, which took three distinct values, is now described by three binary variables: black, blue, and green.

You can easily drop the first binary variable by setting the drop_first parameter to True when using get_dummies function.

pd.get_dummies(df.jersey, drop_first=True)

As we can see, the first binary variable is now excluded from the result.

The two resulting variables blue and green are now ready to be passed to the machine learning algorithm.
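
Putting the excerpt's jersey example together end to end:

```python
import pandas as pd

# The jersey example from the article: a nominal variable with three values.
df = pd.DataFrame({"jersey": ["black", "blue", "green", "blue"]})

# One-hot encode; drop_first=True removes the redundant first column
# (its value is implied when all remaining columns are 0).
encoded = pd.get_dummies(df.jersey, drop_first=True)
print(list(encoded.columns))  # → ['blue', 'green']
```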

12 hours ago @ towardsdatascience.com
Explore COVID-19 Infodemic

It is heartbreaking to learn that half of Canadians have been fooled by COVID-19 conspiracy theories.

According to the WHO, the COVID-19 related infodemic is just as dangerous as the virus itself.

To explore the content of COVID-19 fake news, I use strict definitions of what true and fake news stories are.

Specifically, true news articles are articles that are known to be true and from well trusted news sources.

After some cleaning, we can see that we have 586 true articles and 578 fake ones.

13 hours ago @ towardsdatascience.com
Base Plotting in R

Above, I used base R plotting to graph petal width vs. petal length.

The graph on the left is the most basic graph you can create in R: a scatter plot with an x and y variable.

The graph on the right communicates more information, subsetting the data by species using color.

Color is applied based on the iris species using ifelse().

You can recreate both of these graphs in R using this code:
## par lets us set global parameters for our graphs.

14 hours ago @ towardsdatascience.com
Becoming Human
last post 22 hours ago
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis

We humans are able, to some extent, to understand speech just from lip movements.

Applications include security, where it is useful to understand what a person is saying from a distance.

A limitation of this work is that it cannot generate voices for videos that contain two or more speakers.

This is a promising result; we want the model to focus on the lip region.

The 3D CNN performed best, since that model is able to capture temporal information as well.

22 hours ago @ becominghuman.ai
10 Free Courses to learn Python Machine Learning libraries — Scikit-Learn, NumPy, Pandas, Keras…

If one of your goals is to learn Machine learning and Deep learning in 2020, then these resources can help you a lot.

In this article, I am going to share some of the best free classes to learn Machine learning and Deep learning online.

Anyway, Here is my list of some of the best free courses to learn Machine Learning and Deep Learning online by yourself.

This course is a prerequisite to learning Machine Learning, and I strongly suggest you learn and master Python before you dive deep into machine learning libraries and algorithms.

Machine learning is behind one of the coolest technological innovations today, but contrary to popular perception, you don’t need to be a math genius to successfully…

22 hours ago @ becominghuman.ai
Machine Learning Model For Predicting Heart Disease

ST depression induced by exercise relative to rest (‘ST’ relates to positions on the ECG plot; see more here).

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.cm import rainbow
from matplotlib import rcParams
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score
from sklearn.metrics import confusion_matrix

# importing 3 different classifiers: KNeighborsClassifier, DecisionTreeClassifier, RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree impo…
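
The imports above suggest a scale-split-classify pipeline; a minimal sketch on synthetic data (random features stand in for the heart-disease dataset, which is not reproduced here):

```python
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the heart-disease data: 300 patients, 13 features,
# with a label driven by the first two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 13))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Scale features, split, and evaluate a KNN classifier, mirroring the imports.
X_scaled = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("test accuracy:", round(knn.score(X_test, y_test), 2))
print("5-fold CV accuracy:", round(cross_val_score(knn, X_scaled, y, cv=5).mean(), 2))
```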

22 hours ago @ becominghuman.ai
Can We Predict Deforestation in Amazon Forests with Machine Learning?

I took the dataset from terrabrasilis.dpi.inpe.br/en, whose records quantify deforested areas larger than 6.25 hectares from 2008 to 2018, discretized per year.

The geo data had the longitude and latitude of the centroid of the deforested area, which I decided to use as the targets for my predictions.

The Mapbox plot below shows deforestation areas spread out across all Amazon states.

The aim is to predict the location of the area (centroid) where deforestation is most likely to occur.

Ridge model validation MAE: 1.3207 (lat)
Ridge model validation MAE: 1.8468 (lon)
Ridge model validation RMSE…
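
A per-coordinate Ridge model validated with MAE, as in the excerpt, can be sketched as follows; the random features below are stand-ins for the real deforestation records:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the deforestation features (the real inputs are the
# yearly deforested-area records; these random features are illustrative only).
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 6))
lat = X @ rng.normal(size=6) + rng.normal(scale=0.5, size=500)
lon = X @ rng.normal(size=6) + rng.normal(scale=0.5, size=500)

# One Ridge model per coordinate, validated with MAE as in the excerpt.
X_tr, X_val, lat_tr, lat_val, lon_tr, lon_val = train_test_split(
    X, lat, lon, test_size=0.2, random_state=0)

for name, y_tr, y_val in [("lat", lat_tr, lat_val), ("lon", lon_tr, lon_val)]:
    model = Ridge(alpha=1.0).fit(X_tr, y_tr)
    mae = mean_absolute_error(y_val, model.predict(X_val))
    print(f"Ridge model validation MAE: {mae:.4f} {name}")
```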

1 day, 22 hours ago @ becominghuman.ai
Solutions to Challenging Engineering Research

Let’s assume that we’d like to develop a video streaming platform; here are the basic challenges and some potential solutions.

Explore & define what to build: The focus of this phase is for design and research to determine what to build.

First, a team can decide that concepts that don’t require Engineering to build new elements (e.g.

That leaves you with fewer concepts for qualitative research.

In all prototypes, we always incorporate concepts that have never been tested, as well as refined versions of previously tested concepts.

1 day, 22 hours ago @ becominghuman.ai
How to get started as a Machine Learning Engineer.

So what does it take to become a Machine Learning engineer?

You don't have to have a solid tech background before getting started with Machine Learning.

One course that really helped me get started with Machine Learning was Andrew Ng’s Stanford Machine Learning course on Coursera.

After learning Machine Learning concepts, make sure to work on projects and build your portfolio.

And that’s just what it takes to get started with Machine Learning!

1 day, 22 hours ago @ becominghuman.ai
How IoT, AI And Big Data Can Enable Environmental Sustainability

Polluted air hurts water and soil resources that are cornerstones to life on the Earth.

Two million tons of waste material get into the world’s water every day.

And it gets into the soil, resulting in 3 million potentially polluted sites in the European Economic Area and the Western Balkans alone.

Recycling sewage, saving water and electricity, reducing waste, protecting endangered species, traveling responsibly, etc, are not voluntary actions anymore.

A study by Concentrix revealed that 74% of more than 200 influencers in environmental sustainability agreed that artificial intelligence will help to solve environmental issues, and 64% agreed that the Internet of Things will a…

2 days, 23 hours ago @ becominghuman.ai
How Important is Self Learning in Data Science?

Self learning is a process by which individuals take the initiative, with or without the assistance of others, in diagnosing their learning needs, formulating learning goals, identifying human and material resources for learning and evaluating learning outcomes.

From the very first day one starts to learn Data Science, through gaining some proficiency and eventually a job in Data Science, learning continues.

Self learning offers a large pool of possibilities and flexibility while learning which the traditional way of learning may not offer.

Here are some reasons why: Data Science is constantly evolving, whether in simplifying data science processes or algorithms or the evolution…

2 days, 23 hours ago @ becominghuman.ai
The Engineering Impact

Gauge the impact of work you’re not doing.

There is something even more fundamental you can do: structure your work to maximise impact.

You should know why it had impact — you should have had a good idea about the impact before you began the work!

Know how to prove it. The easy case: clearly, the impact of some work is easy to quantify.

You can still capture the impact through feedback from colleagues and partner teams.

2 days, 23 hours ago @ becominghuman.ai
How Artificial Intelligence is changing the Energy Sector

The energy sector is using AI to increase energy efficiency by reducing consumption, improving energy storage and grid stability, making predictions about energy consumption, finding oil & gas with more accuracy, and many other applications.

Let’s take a look at some projects and how they are affecting the energy sector.

Verdigris Technologies (https://verdigris.co/#analytics) mixes IoT with AI to provide information about energy consumption in large commercial buildings through the installation of Wi-Fi devices to track energy use.

With a visualization of all the energy consumption of the building, the operator can identify where the most energy is consumed and where…

4 days, 22 hours ago @ becominghuman.ai
7 Characteristics of Machine Learning

In order to understand the actual power of machine learning, you have to consider the characteristics of this technology.

There are lots of examples that echo the characteristics of machine learning in today’s data-rich world.

Here are seven key characteristics of machine learning for which companies should prefer it over other technologies.

One of the biggest characteristics of machine learning is its ability to automate repetitive tasks and thus increase productivity.

Those who are interested in becoming machine learning professionals should choose their learning avenue wisely.

4 days, 22 hours ago @ becominghuman.ai
3 Artificial Intelligence tools to enhance your creativity

Newest Artificial Intelligence technology is here to help.

In this post, I will describe three awesome tools that enable you to channel your imagination and create something beautiful in collaboration with an AI.

I will get a bit technical in the descriptions, but don’t worry, you don’t need to understand that to use the tools.

The images created by the network look very bizarre and surreal, in a way that makes them fascinating.

If you are now interested in trying Artbreeder yourself, check out this video with a more detailed introduction: https://www.youtube.com/watch?v=IlrMkHaCosw&feature=emb_title

5 days, 21 hours ago @ becominghuman.ai
Quantum Computing with Q# on macOS— Teleportation

A series of hands-on Quantum Computing tutorials with Q#. If you haven’t read my other two articles on Q#, I strongly suggest you have a look at them: Setup Q# on macOS, Superposition, and Entanglement.

Quantum communication is the transmission of information via secure means, using the quantum phenomenon of teleportation.

Therefore, the important part of quantum communication is how to transmit the Bell state on which the measurement will be performed.

Since there’s no way to teleport information instantaneously, you may wonder what the teleportation …
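
For reference, the entangled Bell state shared between the two parties in the teleportation protocol is the standard maximally entangled pair:

```latex
\lvert \Phi^{+} \rangle = \frac{1}{\sqrt{2}} \left( \lvert 00 \rangle + \lvert 11 \rangle \right)
```

Measuring the sender's qubits in the Bell basis and transmitting the two resulting classical bits lets the receiver reconstruct the original state with local corrections.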

5 days, 21 hours ago @ becominghuman.ai
On Machine Learning aided drug design; Designing a Covid-19 drug from the perspective of a Data…

Below is the computed Pareto Front approximation of the Covid-19 drug design problem (after 40 generations of evolution).

Figure 3: Pareto Front after 40 Generations. Epilogue: We’ve seen that reframing drug design from a simple (but expensive) screening exercise to a multi-objective optimization problem could be beneficial.

We’ve also seen that methods borrowed from Machine Learning and numerical optimization could be used when undergoing such a task.

Caveats: The purpose of this post was to present a different way of thinking about the drug design problem and NOT to design a new compound.

The results of the optimization (Pareto front) heavily rely on the quality of the predictive models utiliz…
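The excerpt's framing can be made concrete with a minimal sketch of Pareto dominance and front extraction. The two objectives and the candidate scores below are hypothetical stand-ins, not outputs of the article's predictive models; lower is assumed better on both axes.

```python
# Minimal Pareto-front sketch. Each tuple holds two hypothetical objective
# scores (illustrative values only, not the article's model outputs).

def dominates(a, b):
    """a dominates b: no worse on every objective and strictly better
    on at least one (lower is better here)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

candidates = [(0.2, 0.9), (0.4, 0.3), (0.5, 0.5), (0.9, 0.1), (0.6, 0.6)]
print(pareto_front(candidates))  # [(0.2, 0.9), (0.4, 0.3), (0.9, 0.1)]
```

An evolutionary loop like the one described in the excerpt would repeatedly mutate candidates and carry such a front across generations.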

5 days, 21 hours ago @ becominghuman.ai
A brief introduction to AI

Symbolic AI programs are based on creating explicit structures and behavior rules.

An example of symbolic AI tools is object-oriented programming.

In the early days of AI, Computer Vision and robotics were attempted using symbolic AI.

As you can see, here is a flowchart of a robot car designed with symbolic AI.

Now, with the evolution of neural networks, Computer Vision is being designed with neural networks, more recently with Deep Learning.

6 days, 21 hours ago @ becominghuman.ai
Distill.pub
last post 3 weeks, 1 day ago
Exploring Bayesian Optimization

How to tune hyperparameters for your machine learning model using Bayesian optimization.

3 weeks, 1 day ago @ distill.pub
An Overview of Early Vision in InceptionV1

An overview of all the neurons in the first five layers of InceptionV1, organized into a taxonomy of 'neuron groups.'

1 month, 3 weeks ago @ distill.pub
Visualizing Neural Networks with the Grand Tour

By focusing on linear dimensionality reduction, we show how to visualize many dynamic phenomena in neural networks.

2 months, 1 week ago @ distill.pub
Thread: Circuits

What can we learn if we invest heavily in reverse engineering a single neural network?

2 months, 2 weeks ago @ distill.pub
Zoom In: An Introduction to Circuits

By studying the connections between neurons, we can find meaningful algorithms in the weights of neural networks.

2 months, 2 weeks ago @ distill.pub
Growing Neural Cellular Automata

Training an end-to-end differentiable, self-organising cellular automata model of morphogenesis, able to both grow and regenerate specific patterns.

3 months, 2 weeks ago @ distill.pub
Visualizing the Impact of Feature Attribution Baselines

Exploring the baseline input hyperparameter, and how it impacts interpretations of neural network behavior.

4 months, 2 weeks ago @ distill.pub
Computing Receptive Fields of Convolutional Neural Networks

Detailed derivations and open-source code to analyze the receptive fields of convnets.

6 months, 3 weeks ago @ distill.pub
The Paths Perspective on Value Learning

A closer look at how Temporal Difference Learning merges paths of experience for greater statistical efficiency

8 months ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarially Robust Neural Style Transfer

An experiment showing adversarial robustness makes neural style transfer work on a non-VGG architecture

9 months, 3 weeks ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Two Examples of Useful, Non-Robust Features

An example project using webpack and svelte-loader and ejs to inline SVGs

9 months, 3 weeks ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Robust Feature Leakage

An example project using webpack and svelte-loader and ejs to inline SVGs

9 months, 3 weeks ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarial Example Researchers Need to Expand What is Meant by 'Robustness'

The main hypothesis in Ilyas et al. (2019) happens to be a special case of a more general principle that is commonly accepted in the robustness to distributional shift literature

9 months, 3 weeks ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarial Examples are Just Bugs, Too

Refining the source of adversarial examples

9 months, 3 weeks ago @ distill.pub
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features'

Six comments from the community and responses from the original authors

9 months, 3 weeks ago @ distill.pub
The Gradient
last post 1 month, 3 weeks ago
A Speech-To-Text Practitioner’s Criticisms of Industry and Academia

This is a follow-up article to our article on building speech-to-text (STT) models, Towards an ImageNet Moment for Speech-to-Text.

Criticisms of the Industry: In general, the majority of STT papers we have read were written by researchers from industry (e.g.

Most criticisms of STT papers and solutions can be attributed to either the "academic" part or the "industry" part of the researchers’ background.

The majority of modern STT papers usually just heavily overfit on the LibriSpeech ASR corpus (LibriSpeech) with increasingly more extravagant methods.

Citation: For attribution in academic contexts or books, please cite this work as Alexander Veysov, "A Speech-To-Text Practitioner’s Criticisms …

1 month, 3 weeks ago @ thegradient.pub
Towards an ImageNet Moment for Speech-to-Text

Speech-to-text (STT), also known as automatic speech recognition (ASR), has a long history and has made amazing progress over the past decade.

Introduction: Following the success and the democratization (the so-called "ImageNet moment", i.e.

This piece will describe our pursuit of an ImageNet moment for STT, which has so far not been found, particularly in the context of the Russian language.

(i) is easy to estimate just by looking at the model's performance during the first 20-25% of its epochs.

Citation: For attribution in academic contexts or books, please cite this work as Alexander Veysov, "Towards an ImageNet Moment for Speech-to-Text", The Gradient, 2020.

2 months ago @ thegradient.pub
Quantifying Independently Reproducible Machine Learning

My investigation into reproducible ML has also relied on personal notes and records hosted on Mendeley and GitHub.

What Makes an ML Paper Reproducible?

The biggest factor is that we cannot take all of our assumptions about so-called reproducible ML at face value.

At the same time, our process and systems must result in reproducible work that does not lead us astray.

Acknowledgments: Feature image source: https://xkcd.com/242/ . Citation: For attribution in academic contexts or books, please cite this work as Edward Raff, "Quantifying Independently Reproducible Machine Learning", The Gradient, 2020.

3 months, 3 weeks ago @ thegradient.pub
GPT-2 and the Nature of Intelligence

-- The AI system GPT-2, in a December 2019 interview with The Economist, "An artificial intelligence predicts the future". Innateness, empiricism, and recent developments in deep learning: Consider two classic hypotheses about the development of language and cognition.

Consider GPT-2, an AI system that was recently featured in The New Yorker and interviewed by The Economist.

The popular blog Slate Star Codex featured it, too, in a post entitled "GPT-2 as a step towards General Intelligence".

Compared to any previous system for generating natural language, GPT-2 has a number of remarkable strengths.

I speak fluent English. If you run your experiments at talktotransformer.com, you will quickly learn th…

4 months ago @ thegradient.pub
The Economics of AI Today

Every day we hear claims that Artificial Intelligence (AI) systems are about to transform the economy, creating mass unemployment and vast monopolies.

In September 2017, a group of distinguished economists gathered in Toronto to set out a research agenda for the Economics of Artificial Intelligence (AI).

Previous editions of the Economics of AI conference included papers about the impact of AI in sectors such as media or health-care.

Lack of diversity in the AI research workforce, and the increasing influence of the private sector in setting AI research (and ethical) agendas as part of the industrialization of AI research suggest that this could be a problem, but the evidence base is lackin…

4 months, 1 week ago @ thegradient.pub
Is NeurIPS Getting Too Big?

NeurIPS 2019, the latest incarnation of the Neural Information Processing Systems conference, wrapped up just over a week ago.

No, that's a keynote at #NeurIPS2019 pic.twitter.com/nJjONGzJww — Jevgenij Gamper (@brutforcimag) December 11, 2019. NeurIPS poster session: too crowded.

Lots of Posters/Talks/Topics: The other primary purpose of any research conference is to inform attendees of new research and inspire new ideas.

NeurIPS 2019, Vancouver, Canada: got the visa 3 weeks before. :(

2019 NeurIPS was last week in Vancouver.

5 months ago @ thegradient.pub
An Epidemic of AI Misinformation

Unfortunately, the problem of overhyped AI extends beyond the media itself.

General AI still seems like it might be a couple decades away, sixty years after the first optimistic projections were issued.

Hundreds of deep learning for radiology companies have been spawned in the meantime, but thus far no actual radiologists have been replaced, and the best guess is that deep learning can augment radiologists but not, in the near term, replace them.

The net consequences could, in the end, debilitate the field, paradoxically inducing an AI winter after initially helping stimulate public interest.

If an AI system is allegedly better than humans, then which humans, and how much better?

5 months, 4 weeks ago @ thegradient.pub
Introduction to Artificial Life for People who Like AI

Artificial Life is often shortened to ALife.

NEAT was awarded the 2017 International Society for Artificial Life Award for Outstanding Paper of the Decade.

First, I think we are seeing the first signs of the next AI winter, a period where people lose confidence in AI research and funding dries up.

Art ALife: “Edge of Chaos: Artificial Life based interactive art installation” by Vasilija Abramovic and Ruairi Glynn. Have you heard about the edge of chaos?

She was recently elected to the board of the International Society for Artificial Life.

6 months ago @ thegradient.pub
How Machine Learning Can Help Unlock the World of Ancient Japan

However, these models were unable to achieve strong performance on Kuzushiji recognition.

This was due to inadequate understanding of Japanese historical literature in the optical character recognition (OCR) community and the lack of high quality standardized datasets.

There are several reasons why Kuzushiji recognition is challenging: capturing both local and global context is important.

This is one reason why conventional sequence models do not have the capability to work well with many Kuzushiji documents.

However, there are many other types of Kuzushiji text that a person might want to transcribe.

6 months, 1 week ago @ thegradient.pub
Gaussian Processes, not quite for dummies

Note: if all $k$ components are independent Gaussian random variables, then $X$ must be multivariate Gaussian (because any linear combination of independent Gaussian random variables is again Gaussian).

Higher dimensional Gaussian: 5D Gaussian. Now we can consider a higher-dimensional Gaussian, starting from 5D, so the covariance matrix is now 5x5.

We then take K and add $I\sigma_y^2$ for the final covariance matrix to factor in noise -- more on this later.

This means in principle, we can calculate this covariance matrix for any real-valued $x_1$ and $x_2$ by simply plugging them in.

Gaussian Process: Textbook definition. From the above derivation, you can view a Gaussian process as a generalization of the multivariate G…
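As a minimal sketch of the construction described above: pairwise kernel evaluations fill the matrix K, and $I\sigma_y^2$ is added to factor in observation noise. The RBF kernel, its length-scale, and the noise level below are assumptions for illustration, not the article's exact choices.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0):
    # Squared-exponential kernel: defined for any real-valued x1, x2,
    # so the covariance can be computed for arbitrary inputs.
    return np.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def covariance(xs, sigma_y=0.1):
    # Pairwise kernel matrix K, plus I * sigma_y^2 to factor in noise.
    K = np.array([[rbf_kernel(a, b) for b in xs] for a in xs])
    return K + (sigma_y ** 2) * np.eye(len(xs))

xs = np.array([0.0, 0.5, 1.0, 1.5, 2.0])  # 5 inputs -> a 5x5 covariance matrix
C = covariance(xs)
print(C.shape)  # (5, 5)
```

A draw from np.random.multivariate_normal(np.zeros(5), C) is then one 5D Gaussian sample, i.e. one function evaluated at these five inputs.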

6 months, 2 weeks ago @ thegradient.pub
Evaluation Metrics for Language Modeling

Counterintuitively, having more metrics actually makes it harder to compare language models, especially as indicators of how well a language model will perform on a specific downstream task are often unreliable.

Despite the presence of these downstream evaluation benchmarks, traditional intrinsic metrics are, nevertheless, extremely useful during the process of training the language model itself.

Proof: let P be the distribution of the underlying language and Q be the distribution learned by a language model.

The performance of N-gram language models does not improve much as N goes above 4, whereas the performance of neural language models continues to improve over time.

In less than two years,…
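The intrinsic metrics discussed above can be made concrete with a toy sketch: with P the true language distribution and Q the model's, cross-entropy is $H(P,Q) = -\sum_x P(x)\log_2 Q(x)$ and perplexity is $2^{H(P,Q)}$. The four-word vocabulary and the distributions below are assumptions for illustration, not from the article.

```python
import math

def cross_entropy(p, q):
    # H(P, Q) = -sum_x P(x) * log2 Q(x); P is the true distribution,
    # Q the model's. It equals the entropy H(P) exactly when Q == P.
    return -sum(px * math.log2(qx) for px, qx in zip(p, q) if px > 0)

def perplexity(p, q):
    # Perplexity is 2 to the cross-entropy.
    return 2 ** cross_entropy(p, q)

# Toy 4-word vocabulary (illustrative values).
p = [0.5, 0.25, 0.125, 0.125]       # "true" language distribution
q_good = [0.5, 0.25, 0.125, 0.125]  # model that matches P exactly
q_bad = [0.25, 0.25, 0.25, 0.25]    # uniform model

print(perplexity(p, q_good))  # 2^1.75, about 3.36 (the attainable floor)
print(perplexity(p, q_bad))   # 4.0 (worse)
```

If Q assigned some word probability 0 while P did not, the cross-entropy would diverge, which is one reason language models smooth their estimates.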

7 months, 1 week ago @ thegradient.pub
The State of Machine Learning Frameworks in 2019

Since deep learning regained prominence in 2012, many machine learning frameworks have clamored to become the new favorite among researchers and industry practitioners.

It is perhaps underappreciated how much machine learning frameworks shape ML research.

Machine learning research itself is also in a massive state of flux.

Most of us don't work on machine learning software for the money or to assist in our company's strategic plans.

We work in machine learning because we care - about advancing machine learning research, about democratizing AI, or maybe just about building cool stuff.

7 months, 2 weeks ago @ thegradient.pub
The #BenderRule: On Naming the Languages We Study and Why It Matters

This has led to a digital divide in the field of NLP between high resource and low resource languages.

High resource languages constitute a short list starting with English, (Mandarin) Chinese, Arabic, and French.

And yet, the field of NLP is caught in a negative feedback loop that hinders the expansion of the languages we work on.

Work on languages other than English is often considered “language specific” and thus reviewed as less important than equivalent work on English.

Many NLP systems for Chinese, Japanese, Thai and other languages have to start with the problem of word tokenization.

8 months, 2 weeks ago @ thegradient.pub
NLP's Clever Hans Moment has Arrived

However, the model doesn't care about this impossibility and identifies the correct warrant with 71 percent accuracy.

Coming back to the paper, the authors point to a (again, depressingly) large amount of recent work reporting Clever Hans effects in NLP datasets.

For a broader view on this topic, also see Ana Marasović's article on NLP's Generalization Problem.

The growing number of papers finding cases of the Clever Hans effect raises important questions for NLP research, the most obvious one being how the effect can be prevented.

If not much, the dataset may provide unintended non-content cues, such as sentence length or distribution of function words.

9 months ago @ thegradient.pub
Introducing Retrospectives: 'Real Talk' for your Past Papers

What the community still lacks, though, are incentives for publicly documenting our real thoughts and feelings about our past papers.

Today, we’re launching ML Retrospectives, a website for hosting reflections and critiques of researchers’ own past papers that we’re calling retrospectives.

With the clearing of this emotional weight it became easier to look at my past papers.

ML Retrospectives is a platform for hosting retrospectives: documents where researchers write honestly about their past papers.

While a venue for critiquing other people’s papers might also be valuable, we wanted to focus on normalizing sharing drawbacks of your own past papers.

9 months, 1 week ago @ thegradient.pub
🔬 Science
arXiv.org
last post 2 hours ago
Genetic optimization algorithms applied toward mission computability models. (arXiv:2005.13105v1 [cs.NE])

Donate to arXiv: Please join the Simons Foundation and our generous member organizations in supporting arXiv during our giving campaign September 23-27.

100% of your contribution will fund improvements and new initiatives to benefit arXiv's global scientific community.

2 hours ago @ arxiv.org
Comparison of Recurrent Neural Network Architectures for Wildfire Spread Modelling. (arXiv:2005.13040v1 [cs.LG])

7 hours ago @ arxiv.org
Probabilistic solution of chaotic dynamical system inverse problems using Bayesian Artificial Neural Networks. (arXiv:2005.13028v1 [cs.LG])

7 hours ago @ arxiv.org
Comparing BERT against traditional machine learning text classification. (arXiv:2005.13012v1 [cs.CL])

7 hours ago @ arxiv.org
Skew Gaussian Processes for Classification. (arXiv:2005.12987v1 [cs.LG])

7 hours ago @ arxiv.org
Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users. (arXiv:2005.12979v1 [cs.IR])

7 hours ago @ arxiv.org
Opportunistic Multi-aspect Fairness through Personalized Re-ranking. (arXiv:2005.12974v1 [cs.IR])

7 hours ago @ arxiv.org
Skewness Ranking Optimization for Personalized Recommendation. (arXiv:2005.12971v1 [cs.IR])

7 hours ago @ arxiv.org
Towards intervention-centric causal reasoning in learning agents. (arXiv:2005.12968v1 [cs.LG])

7 hours ago @ arxiv.org
Contrastive Learning for Debiased Candidate Generation at Scale. (arXiv:2005.12964v1 [cs.IR])

7 hours ago @ arxiv.org
Class-Weighted Classification: Trade-offs and Robust Approaches. (arXiv:2005.12914v1 [stat.ML])

7 hours ago @ arxiv.org
Goal-conditioned Imitation Learning. (arXiv:1906.05838v3 [cs.LG] UPDATED)

7 hours ago @ arxiv.org
Identifying Vulnerabilities of Industrial Control Systems using Evolutionary Multiobjective Optimisation. (arXiv:2005.13095v1 [cs.CR])

7 hours ago @ arxiv.org
Self-Supervised Representation Learning on Document Images. (arXiv:2004.10605v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
MOPT: Multi-Object Panoptic Tracking. (arXiv:2004.08189v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Y-net: Biomedical Image Segmentation and Clustering. (arXiv:2004.05698v2 [eess.IV] UPDATED)

7 hours ago @ arxiv.org
Score-Guided Generative Adversarial Networks. (arXiv:2004.04396v2 [cs.LG] UPDATED)

7 hours ago @ arxiv.org
Attentive One-Dimensional Heatmap Regression for Facial Landmark Detection and Tracking. (arXiv:2004.02108v4 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
A Simple Baseline for Multi-Object Tracking. (arXiv:2004.01888v4 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Quantification of Tomographic Patterns associated with COVID-19 from Chest CT. (arXiv:2004.01279v4 [eess.IV] UPDATED)

Electrical Engineering and Systems Science > Image and Video Processing. COVID-19 e-print. Important: e-prints posted on arXiv are not peer-reviewed by arXiv; they should not be relied upon without context to guide clinical practice or health-related behavior and should not be reported in news media as established information without consulting multiple experts in the field.

7 hours ago @ arxiv.org
Articulation-aware Canonical Surface Mapping. (arXiv:2004.00614v3 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Patch-based Non-Local Bayesian Networks for Blind Confocal Microscopy Denoising. (arXiv:2003.11177v2 [eess.IV] UPDATED)

7 hours ago @ arxiv.org
Unifying Training and Inference for Panoptic Segmentation. (arXiv:2001.04982v2 [cs.CV] UPDATED)

7 часов назад @ arxiv.org
Semantic Segmentation for Compound figures. (arXiv:1912.07142v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Unsupervised Learning for Intrinsic Image Decomposition from a Single Image. (arXiv:1911.09930v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment. (arXiv:1910.06180v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Unsupervised Adaptation for Synthetic-to-Real Handwritten Word Recognition. (arXiv:1909.08473v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
DeepInSAR: A Deep Learning Framework for SAR Interferometric Phase Restoration and Coherence Estimation. (arXiv:1909.03120v2 [eess.IV] UPDATED)

7 hours ago @ arxiv.org
City-GAN: Learning architectural styles using a custom Conditional GAN architecture. (arXiv:1907.05280v2 [cs.CV] UPDATED)

7 hours ago @ arxiv.org
Skin Lesion Analyser: An Efficient Seven-Way Multi-Class Skin Cancer Classification Using MobileNet. (arXiv:1907.03220v3 [eess.IV] UPDATED)

7 hours ago @ arxiv.org
Papers With Code
latest post 16 hours ago
Open-Retrieval Conversational Question Answering

Conversational search is one of the ultimate goals of information retrieval.

Recent research approaches conversational search through the simplified settings of response ranking and conversational question answering, where an answer is either selected from a given candidate set or extracted from a given passage...

These simplifications neglect the fundamental role of retrieval in conversational search.

To address this limitation, we introduce an open-retrieval conversational question answering (ORConvQA) setting, where we learn to retrieve evidence from a large collection before extracting answers, as a further step towards building functional conversational search systems.

We build an end-to-end sy…

16 hours ago @ paperswithcode.com
MANGO: A Python Library for Parallel Hyperparameter Tuning

Tuning hyperparameters for machine learning algorithms is a tedious task, one that is typically done manually.

To enable automated hyperparameter tuning, recent works have started to use techniques based on Bayesian optimization...

To address these challenges, we present Mango, a Python library for parallel hyperparameter tuning.

Mango enables the use of any distributed scheduling framework, implements intelligent parallel search strategies, and provides rich abstractions for defining complex hyperparameter search spaces that are compatible with scikit-learn.

Mango is available open-source and is currently used in production at Arm Research to provide state-of-art hyperparameter tuning capa…
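Mango's actual API is not shown in this excerpt, so the sketch below illustrates the general idea with a generic parallel random search over a scikit-learn-style discrete search space, using only the standard library; the `objective` function and the search space are hypothetical stand-ins for a real model's validation loss and hyperparameters.

```python
from concurrent.futures import ThreadPoolExecutor
import random

# Hypothetical objective: a stand-in for a model's validation loss.
def objective(config):
    return (config["lr"] - 0.01) ** 2 + (config["batch_size"] - 64) ** 2 / 1e4

def sample_config(space, rng):
    # Draw one configuration from a discrete search space.
    return {name: rng.choice(values) for name, values in space.items()}

def parallel_random_search(space, n_trials=32, n_workers=4, seed=0):
    rng = random.Random(seed)
    configs = [sample_config(space, rng) for _ in range(n_trials)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        scores = list(pool.map(objective, configs))  # evaluated in parallel
    return min(zip(scores, configs), key=lambda pair: pair[0])

space = {"lr": [0.001, 0.005, 0.01, 0.05], "batch_size": [16, 32, 64, 128]}
best_score, best_config = parallel_random_search(space)
print(best_score, best_config)
```

A Bayesian-optimization backend would replace the random sampler with a model-guided one; the parallel-evaluation scaffolding stays the same.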

16 hours ago @ paperswithcode.com
Attention-guided Context Feature Pyramid Network for Object Detection

For object detection, how to address the contradictory requirement between feature map resolution and receptive field on high-resolution inputs still remains an open question.

In this paper, to tackle this issue, we build a novel architecture, called Attention-guided Context Feature Pyramid Network (AC-FPN), that exploits discriminative information from various large receptive fields via integrating attention-guided multi-path features...

The first one is Context Extraction Module (CEM) that explores large contextual information from multiple receptive fields.

As redundant contextual relations may mislead localization and recognition, we also design the second module named Attention-guided …

16 hours ago @ paperswithcode.com
Multivariate Convex Regression at Scale

We present new large-scale algorithms for fitting a multivariate convex regression function to $n$ samples in $d$ dimensions---a key problem in shape constrained nonparametric regression with widespread applications in engineering and the applied sciences.

The infinite-dimensional learning task can be expressed via a convex quadratic program (QP) with $O(nd)$ decision variables and $O(n^2)$ constraints...

To this end, we present an active set type algorithm on the Lagrangian dual (of a perturbation) of the primal QP.

Although the dual is not strongly convex, we establish a novel linear convergence rate for our algorithm on the dual.

We demonstrate that our framework can solve instances of the …

16 hours ago @ paperswithcode.com
Knee Point Identification Based on Trade-Off Utility

Knee points, characterised by the smallest trade-off loss across all objectives, are attractive to decision makers in multi-criterion decision-making.

In this paper, we propose a simple and effective knee point identification method based on trade-off utility, dubbed KPITU, to help decision makers identify knee points from a given set of trade-off solutions.

In particular, a solution is a knee point if and only if it has the best trade-off utility among its neighbours.

Moreover, we implement a GPU version of KPITU that carries out the knee point identification in a parallel manner.

To validate the effectiveness of KPITU, we compare its performance with five state-of-the-art knee point identif…
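KPITU's exact trade-off utility is truncated above, so the sketch below uses a different, classic knee heuristic in the same spirit: on a 2-D Pareto front (minimization), pick the point farthest from the straight line joining the two extreme solutions. This is an illustrative stand-in, not KPITU's utility.

```python
import numpy as np

def knee_point(front):
    """Pick a knee from a 2-D Pareto front (minimization) as the point
    farthest from the straight line joining the two extreme solutions."""
    front = np.asarray(front, dtype=float)
    front = front[np.argsort(front[:, 0])]
    a, b = front[0], front[-1]                   # extreme points
    direction = (b - a) / np.linalg.norm(b - a)
    rel = front - a
    proj = np.outer(rel @ direction, direction)  # projection onto the line
    dist = np.linalg.norm(rel - proj, axis=1)    # perpendicular distances
    return front[np.argmax(dist)]

# The solution (0.2, 0.3) sits at the bend of this toy front.
front = [(0.0, 1.0), (0.2, 0.3), (0.6, 0.15), (1.0, 0.0)]
print(knee_point(front))
```

KPITU instead scores each solution against its neighbours and keeps those with the best local trade-off utility, which also generalizes beyond two objectives.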

16 hours ago @ paperswithcode.com
Underwater object detection using Invert Multi-Class Adaboost with deep learning

In recent years, deep learning based methods have achieved promising performance in standard object detection.

However, these methods lack sufficient capabilities to handle underwater object detection due to these challenges: (1) objects in real applications are usually small and their images are blurry, and (2) images in the underwater datasets and real applications carry heterogeneous noise...

To address these two problems, we first propose a novel neural network architecture, namely Sample-WeIghted hyPEr Network (SWIPENet), for small object detection.

SWIPENet consists of high resolution and semantic rich Hyper Feature Maps which can significantly improve small object detection accur…

16 hours ago @ paperswithcode.com
Automatic Discovery of Interpretable Planning Strategies

We propose that recently developed reinforcement learning methods for discovering clever heuristics for good decision-making can be partially leveraged to assist human experts in this design process.

To solve this problem, we introduce AI-Interpret: a general method for transforming idiosyncratic policies into simple and interpretable descriptions.

We evaluate our new AI-Interpret algorithm and employ it to translate information-acquisition policies discovered through metalevel reinforcement learning.

Furthermore, a series of ablation studies confirmed that our AI-Interpret algorithm was critical to the discovery of interpretable decision rules and that it is ready to be applied to other re…

16 hours ago @ paperswithcode.com
KaLM at SemEval-2020 Task 4: Knowledge-aware Language Models for Comprehension And Generation

This paper presents our strategies in SemEval 2020 Task 4: Commonsense Validation and Explanation.

We propose a novel way to search for evidence and choose different large-scale pre-trained models as the backbone for the three subtasks...

The results show that our evidence-searching approach improves model performance on the commonsense explanation task.

Our team ranks 2nd in subtask C according to human evaluation score.

(read more)

16 hours ago @ paperswithcode.com
Networks with pixels embedding: a method to improve noise resistance in images classification

In image classification tasks, networks are usually sensitive to noise.

In this work, we provide a noise-resistant network for image classification by introducing a pixel-embedding technique.

We test the network with pixel embedding, abbreviated as the network with PE, on the MNIST database of handwritten digits.

It shows that the network with PE outperforms the conventional network on images with noise.

The pixel-embedding technique can be used in many image classification tasks to improve noise resistance.

16 hours ago @ paperswithcode.com
Query Resolution for Conversational Search with Limited Supervision

In this work we focus on multi-turn passage retrieval as a crucial component of conversational search.

One of the key challenges in multi-turn passage retrieval comes from the fact that the current turn query is often underspecified due to zero anaphora, topic change, or topic return...

Context from the conversational history can be used to arrive at a better expression of the current turn query, defined as the task of query resolution.

In this paper, we model the query resolution task as a binary term classification problem: for each term appearing in the previous turns of the conversation decide whether to add it to the current turn query or not.

We propose QuReTeC (Query Resolution by Te…
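The binary term classification formulation can be sketched as follows; `is_relevant` stands in for the trained classifier and the example conversation terms are hypothetical, not drawn from the paper's data.

```python
def resolve_query(current_turn, history_turns, is_relevant):
    """Expand the current-turn query with history terms that the
    classifier marks as relevant (binary term classification)."""
    resolved = current_turn.split()
    seen = set(resolved)
    for turn in history_turns:
        for term in turn.split():
            if term not in seen and is_relevant(term):
                resolved.append(term)   # add the term to the current query
                seen.add(term)
    return resolved

history = ["who founded the bauhaus movement", "when was it founded"]
relevant_terms = {"bauhaus", "movement"}   # stand-in for a trained classifier
print(resolve_query("where was it located", history, relevant_terms.__contains__))
```

In the paper's setup, the classifier's per-term decisions are what gets learned; the expansion step itself is this simple.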

16 hours ago @ paperswithcode.com
Degree-Aware Alignment for Entities in Tail

Entity alignment (EA) is to discover equivalent entities in knowledge graphs (KGs), which bridges heterogeneous sources of information and facilitates the integration of knowledge.

Existing EA solutions mainly rely on structural information to align entities, typically through KG embedding...

Nonetheless, in real-life KGs, only a few entities are densely connected to others, while the remaining majority possess a rather sparse neighborhood structure.

We refer to the latter as long-tail entities, and observe that such phenomenon arguably limits the use of structural information for EA.

To this end, a degree-aware co-attention network is conceived, which dynamically adjusts the significance of feature…

16 hours ago @ paperswithcode.com
Køpsala: Transition-Based Graph Parsing via Efficient Training and Effective Encoding

We present Køpsala, the Copenhagen-Uppsala system for the Enhanced Universal Dependencies Shared Task at IWPT 2020.

Our system is a pipeline consisting of off-the-shelf models for everything but enhanced graph parsing, and for the latter, a transition-based graph parser adapted from Che et al. (2019)...

We train a single enhanced parser model per language, using gold sentence splitting and tokenization for training, and rely only on tokenized surface forms and multilingual BERT for encoding.

While a bug introduced just before submission resulted in a severe drop in precision, its post-submission fix would bring us to 4th place in the official ranking, according to average ELAS.

Our parse…

16 hours ago @ paperswithcode.com
NILE: Natural Language Inference with Faithful Natural Language Explanations

The recent growth in the popularity and success of deep learning models on NLP classification tasks has been accompanied by the need to generate some form of natural language explanation of the predicted labels.

Such generated natural language (NL) explanations are expected to be faithful, i.e., they should correlate well with the model's internal decision making...

In this work, we focus on the task of natural language inference (NLI) and address the following question: can we build NLI systems which produce labels with high accuracy, while also generating faithful explanations of their decisions?

We propose Natural-language Inference over Label-specific Explanations (NILE), a novel NLI method wh…

16 hours ago @ paperswithcode.com
Knowledge Graph Simple Question Answering for Unseen Domains

Knowledge graph simple question answering (KGSQA), in its standard form, does not take into account that human-curated question answering training data only cover a small subset of the relations that exist in a Knowledge Graph (KG), or, even worse, that new domains with relations unseen and rather different from existing ones are added to the KG.

In this work, we study KGSQA in a previously unstudied setting where new, unseen domains are added during test time...

In this setting, question-answer pairs of the new domain do not appear during training, thus making the task more challenging.

We propose a data-centric domain adaptation framework that consists of a KGSQA system that is applic…

16 hours ago @ paperswithcode.com
How Useful are Reviews for Recommendation? A Critical Review and Potential Improvements

We investigate a growing body of work that seeks to improve recommender systems through the use of review text.

Generally, these papers argue that since reviews 'explain' users' opinions, they ought to be useful to infer the underlying dimensions that predict ratings or purchases...

Schemes to incorporate reviews range from simple regularizers to neural network approaches.

Further investigation calls for discussion on a much larger problem about the "importance" of user reviews for recommendation.

Through a wide range of experiments, we observe several cases where state-of-the-art methods fail to outperform existing baselines, especially as we deviate from a few narrowly-defined settings wh…

16 hours ago @ paperswithcode.com
📝 Cool Blogs
ODS.ai Habr
latest post 6 days ago
The 'We Read Papers for You' digest. April 2020. Part 1

Hi, Habr! We continue publishing reviews of research papers written by members of the Open Data Science community in the #article_essense channel. Want to get them before everyone else? Join the community!

Today's papers: TResNet: High Performance GPU-Dedicated Architecture (DAMO Academy, Alibaba Group, 2020)

Controllable Person Image Synthesis with Attribute-Decomposed GAN (China, 2020)

Learning to See Through Obstructions (Taiwan, USA, 2020)

Tracking Objects as Points (UT Austin, Intel Labs, 2020)

CookGAN: Meal Image Synthesis from Ingredients (USA, UK, 2020)

Designing Network Design Spaces (FAIR, 2020)

Gradient Centralization: A New Optimization Technique for Deep Neural Networks (Hong Kong, Alibaba, 2…

6 days ago @ habr.com
You can't let healers burn out: protect them now

TLDR: we use graph convolutions to measure who gets hurt most by reshuffles. I first ran into workplace burnout early in my career, and I have been keenly interested in the subject ever since. Picture the scene: a large SAP implementation project. High stakes. Ambitious deadlines. Everyone took the load differently. Some snapped and withdrew from their duties, some turned more toxic; at one point my own sense of humour disappeared. Not for long. Change management (the discipline aimed at reducing stress during information system rollouts) owes a lot to medicine. First, the phenomenon of emotional burnout was initially documented in medical workers. Second, …

3 weeks, 3 days ago @ habr.com
The 'We Read Papers for You' digest. March 2020. Part 2

Hi, Habr! We continue publishing reviews of research papers written by members of the Open Data Science community in the #article_essense channel. Want to get them before everyone else? Join the community! The first part of the March collection of reviews was published earlier.

Today's papers: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (UC Berkeley, Google Research, UC San Diego, 2020)

Scene Text Recognition via Transformer (China, 2020)

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization (Imperial College London, Google Research, 2019)

Lagrangian Neural Networks (Princeton, Oregon, Google, Flatiron, 2020)

Deformable Style Transfer (Chicago, USA, 2020)

Rethinking…

1 month, 1 week ago @ habr.com
The 'We Read Papers for You' digest. March 2020. Part 1

Hi, Habr! We continue publishing reviews of research papers written by members of the Open Data Science community in the #article_essense channel. Want to get them before everyone else? Join the community!

Today's papers: Fast Differentiable Sorting and Ranking (Google Brain, 2020)

MaxUp: A Simple Way to Improve Generalization of Neural Network Training (UT Austin, 2020)

Deep Nearest Neighbor Anomaly Detection (Jerusalem, Israel, 2020)

AutoML-Zero: Evolving Machine Learning Algorithms From Scratch (Google, 2020)

SpERT: Span-based Joint Entity and Relation Extraction with Transformer Pre-training (RheinMain University, Germany, 2019)

High-Resolution Daytime Translation Without Domain Labels (Samsung AI Cen…

1 month, 2 weeks ago @ habr.com
Machine learning in R with the mlr3 package

Source: https://mlr3book.mlr-org.com/ Hi, Habr! In this post we look at the most carefully thought-out approach to machine learning in R available today: the mlr3 package and the ecosystem around it. The approach rests on "proper" OOP via R6 classes and on representing every operation on data and models as a computation graph. This allows building orderly and flexible pipelines for machine learning tasks, although at first it may seem complex and confusing. Below we try to bring some clarity and motivate you to use mlr3 in your own projects. Contents: A bit of history and a comparison with competing solutions

Technical details: R6 classes and the pack…

1 month, 3 weeks ago @ habr.com
The spread of a spherical horse in a vacuum across Russia

Greetings from ODS. We took up tutu.ru's idea of working with their dataset of passenger traffic across Russia. And while Milfgard's post has a huge table of conclusions plus popular science, we want to tell you what's under the hood.

What, yet another post about COVID-19? Yes, but no. We were interested in this precisely from the standpoint of mathematical methods and of working with an interesting dataset. Before you see the pretty pictures and charts below the cut, I must say a few things: any modelling is a very complex process with an incredible number of IFs and SUPPOSEs inside. We will tell you about them.

those who worked on this article are not epidemiologists or virologists. We are just a group of graph theory enthusiasts practising me…

1 month, 4 weeks ago @ habr.com
The 'We Read Papers for You' digest. January–February 2020

Hi, Habr! We continue publishing reviews of research papers written by members of the Open Data Science community in the #article_essense channel. Want to get them before everyone else? Join the community!

Reviews of 11 papers on Computer Vision, Natural Language Processing, Reinforcement Learning and other topics. Read more →

2 months, 1 week ago @ habr.com
Tuning a loss function for a neural network on seismic survey data

In the previous article we described an experiment to determine the minimum number of manually labelled slices needed to train a neural network on seismic survey data. Today we continue the topic by choosing the most suitable loss function. We examine two basic families of functions, binary cross-entropy and Intersection over Union, in six variants with parameter tuning, as well as combinations of functions from different families. We additionally consider regularizing the loss function. Spoiler: we managed to substantially improve the quality of the network's predictions. Read more →
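The two loss families mentioned can be sketched in NumPy. Note that `soft_iou_loss` below is one common differentiable IoU variant and the toy tensors are illustrative; the article's own six variants are not reproduced here.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    # Binary cross-entropy over per-pixel probabilities.
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def soft_iou_loss(pred, target, eps=1e-7):
    # 1 - soft IoU: intersection and union computed on probabilities.
    inter = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target) - inter
    return 1.0 - (inter + eps) / (union + eps)

target = np.array([0.0, 1.0, 1.0, 0.0])   # ground-truth mask (flattened)
pred = np.array([0.1, 0.8, 0.7, 0.2])     # network probabilities
print(bce(pred, target), soft_iou_loss(pred, target))
```

BCE averages per-pixel errors, while the IoU family scores overlap directly, which is why combining or regularizing them can behave differently on thin geological layers.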

3 months, 1 week ago @ habr.com
An open 'Deep Learning in NLP' course from the creators of DeepPavlov, based on cs224n

Hi everyone!

If you have a question about the course, check the Q&A section below.

Introduction

My name is Alexey Klokov, and I want to tell you about the launch of a great Natural Language Processing course, run once again by the MIPT team behind DeepPavlov, an open library for conversational artificial intelligence developed in the MIPT laboratory of neural systems and deep learning. I thank them and Moryshka for permission to cover this topic on Habr in our ODS blog. So, let's go! Read more →

3 months, 3 weeks ago @ habr.com
The 'We Read Papers for You' digest. October–December 2019

Hi, Habr! We continue publishing reviews of research papers written by members of the Open Data Science community in the #article_essense channel. Want to get them before everyone else? Join the community!

Today's papers: Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring (Facebook, 2019)

Implicit Discriminator in Variational Autoencoder (Indian Institute of Technology Ropar, 2019)

Self-training with Noisy Student improves ImageNet classification (Google Research, Carnegie Mellon University, 2019)

Momentum Contrast for Unsupervised Visual Representation Learning (Facebook, 2019)

Benchmarking Neural Network Robustness to Common Corruptions and …

4 months ago @ habr.com
SVM: an explanation from scratch and a Python implementation. A detailed look at the support vector machine

Greetings to everyone who has chosen the path of the ML samurai!

Introduction:

In this article we examine the support vector machine (SVM) for the classification task. We present the core idea of the algorithm, derive the update of its weights, and walk through a simple hand-made implementation. Using an example dataset, we demonstrate how the implemented algorithm behaves on linearly separable and non-separable data, and visualize training and prediction. We also discuss the algorithm's pros and cons and its modifications. Figure 1. A photo of an iris flower from open sources. Read more →
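As a taste of such a from-scratch implementation, here is a hedged sketch of a primal linear SVM trained by hinge-loss subgradient descent. This is illustrative NumPy code, not the article's own implementation, and the hyperparameters and toy data are arbitrary.

```python
import numpy as np

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200, seed=0):
    """Primal linear SVM via subgradient descent on the hinge loss:
    L = lam * ||w||^2 + mean(max(0, 1 - y * (Xw + b))), with y in {-1, +1}."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                       # points inside the margin
        if viol.any():
            grad_w = 2 * lam * w - (y[viol, None] * X[viol]).mean(axis=0)
            grad_b = -y[viol].mean()
        else:
            grad_w, grad_b = 2 * lam * w, 0.0
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Linearly separable toy data.
X = np.array([[2.0, 2.0], [1.5, 2.5], [-2.0, -1.0], [-1.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = train_linear_svm(X, y)
print(np.sign(X @ w + b))
```

Only margin violators contribute a gradient, which is the defining property of the hinge loss; the `lam` term keeps the margin wide.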

4 months ago @ habr.com
TensorRT 6.x.x.x: high-performance inference for deep learning models (Object Detection and Segmentation)

It only hurts the first time! Hi everyone! Dear friends, in this article I want to share my experience with TensorRT and RetinaNet based on the repository github.com/aidonchuk/retinanet-examples (a fork of NVIDIA's official repo that lets you start using optimized models in production in the shortest possible time). Scrolling through messages in the ods.ai community channels, I keep running into questions about using TensorRT, and the questions mostly repeat themselves, so I decided to write as complete a guide as I can on fast inference based on TensorRT, RetinaNet, Unet and Docker. Read more →

4 months, 1 week ago @ habr.com
The Lacmus project: how computer vision helps rescue lost people

Hi everyone! You may already know about the Machine Learning for Social Good initiative (#ml4sg) of the Open Data Science community, in which enthusiasts apply machine learning methods, free of charge, to solving socially significant problems. We, the team of the Lacmus project (#proj_rescuer_la), work on deploying modern deep learning solutions to search for people lost outside populated areas: in forests, fields, and so on. Read more →

4 months, 1 week ago @ habr.com
Experiments with neural networks on seismic survey data

Interpreting seismic survey data is difficult because every task requires an individual approach: each such dataset is unique. Manual processing takes significant effort, and the result often contains errors caused by the human factor. Using neural networks for interpretation can substantially reduce manual labour, but the uniqueness of the data limits how far this work can be automated. This article describes an experiment analysing the applicability of neural networks for automating the delineation of geological layers in 2D images, using fully labelled data from the North Sea as an example. Figure 1. Carrying out aq…

4 months, 2 weeks ago @ habr.com
Making PyTorch and C++ friends: using TorchScript

About a year ago the PyTorch developers presented TorchScript to the community: a tool that, with a couple of lines of code and a few mouse clicks, turns a Python pipeline into a standalone artifact that can be embedded into a C++ system. Below I share my experience of using it and try to describe the pitfalls encountered along the way. I pay special attention to building the project on Windows, because although ML research is usually done on Ubuntu, the final solution is often (surprise!) required on Windows. Code examples for exporting a model, and a C++ project that uses the model, can be found in the repository on GitHub. Read more →

5 months, 1 week ago @ habr.com
inFERENCe
latest post 6 months, 2 weeks ago
Meta-Learning Millions of Hyper-parameters using the Implicit Function Theorem

November 14, 2019. Meta-Learning Millions of Hyper-parameters using the Implicit Function Theorem. Last night on the train I read this nice paper by David Duvenaud and colleagues.

Implicit Function Theorem. Many, though not all, meta-learning or hyperparameter optimization problems can be stated as nested optimization problems.

Using a finite truncation of the Neumann series one can approximate the inverse Hessian in the following way: $$\left[\frac{\partial^2 \mathcal{L}_T}{\partial \theta \partial \theta}\right]^{-1} \approx \sum_{i=0}^{j} \left(I - \frac{\partial^2 \mathcal{L}_T}{\partial \theta \partial \theta}\right)^i.$$

Most crucially, methods based on implicit gradients assume that your le…
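The truncated Neumann series above can be checked numerically. Below is a minimal NumPy sketch; the 2×2 Hessian, vector, and truncation depth are invented for illustration, and the series requires the Hessian's eigenvalues to lie in (0, 2) to converge:

```python
import numpy as np

# Hypothetical 2x2 training-loss Hessian; the Neumann series
# sum_{i=0}^{j} (I - H)^i converges to H^{-1} when H's eigenvalues
# lie in (0, 2).
H = np.array([[0.9, 0.2],
              [0.2, 0.7]])
v = np.array([1.0, -1.0])

def neumann_inverse_hvp(H, v, j):
    """Approximate H^{-1} v by a j-term truncated Neumann series."""
    result = v.copy()
    term = v.copy()
    for _ in range(j):
        term = term - H @ term      # term <- (I - H) @ term
        result = result + term
    return result

approx = neumann_inverse_hvp(H, v, j=50)
exact = np.linalg.solve(H, v)
print(np.allclose(approx, exact))   # -> True
```

In the hyper-gradient setting one never forms the Hessian explicitly; each `H @ term` would be a Hessian-vector product computed by automatic differentiation.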

6 months, 2 weeks ago @ inference.vc
The secular Bayesian: Using belief distributions without really believing

October 31, 2019. The religious Bayesian. My parents didn't raise me in a religious tradition.

The secular Bayesian. Over the years I came to terms with my Bayesian heritage, and I now live my life as a secular Bayesian.

This choice is the real reason why the resulting update rule will end up very Bayes-rule like, as we will see later.

Rationality. Now that we have an update rule which satisfies our desiderata, can we say if it's actually a good or useful update rule?

So, not only is this update rule the only update rule that satisfies the desired properties, it is also optimal under this particular definition of optimality/ra…

6 months, 4 weeks ago @ inference.vc
Exponentially Growing Learning Rate? Implications of Scale Invariance induced by Batch Normalization

October 25, 2019. Exponentially Growing Learning Rate?

Implications of Scale Invariance induced by Batch Normalization. Yesterday I read this intriguing paper about the mind-boggling fact that it is possible to use an exponentially growing learning rate schedule when training neural networks with batch normalization: Zhiyuan Li and Sanjeev Arora (2019), An Exponential Learning Rate Schedule for Deep Learning. The paper provides both theoretical insights and an empirical demonstration of this remarkable property.

So imagine doing vanilla gradient descent (no momentum, no weight decay, fixed learning rate) on such a loss surface.

However, the weight vector won't completely blow up to infinity, because th…

7 months ago @ inference.vc
On Marginal Likelihood and Cross-Validation

The marginal likelihood and cross-validation. To discuss the connection between the marginal likelihood and (Bayesian) cross-validation, let's first define what is what.

For each of these permutations we can decompose the marginal likelihood as a product of conditionals, or equivalently we can write the log marginal likelihood as a sum of logs of the same conditionals.

So, the sum of all the terms in this matrix gives the marginal likelihood times 6 (as there are 6 columns).

This observation gives a really good motivation for using the marginal likelihood, and also gives a new perspective on how it works.

Calculating the marginal likelihood amounts to evaluating the average predictive score on al…
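The ordering-invariance of this decomposition can be illustrated with a conjugate toy model; the Beta(1,1)-Bernoulli setup and the data below are illustrative choices, not from the post:

```python
import itertools
import math

# Beta(1,1)-Bernoulli toy model (illustrative): the log marginal
# likelihood is a sum of log one-step-ahead posterior-predictive
# probabilities, and that sum is the same for every ordering of the data.
data = [1, 0, 1, 1]

def log_marginal(seq):
    a, b, total = 1.0, 1.0, 0.0    # Beta(1,1) prior
    for x in seq:
        p_one = a / (a + b)        # posterior predictive P(x = 1)
        total += math.log(p_one if x == 1 else 1.0 - p_one)
        a += x                     # conjugate posterior update
        b += 1 - x
    return total

vals = {round(log_marginal(p), 12) for p in itertools.permutations(data)}
print(len(vals), round(math.exp(log_marginal(data)), 6))  # -> 1 0.05
```

Every one of the 24 orderings yields the same log marginal likelihood, which is why averaging the per-permutation decompositions (as in the post's matrix picture) recovers it exactly.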

7 months, 1 week ago @ inference.vc
Notes on iMAML: Meta-Learning with Implicit Gradients

September 19, 2019. This week I read this cool new paper on meta-learning: it takes a slightly different approach compared to its predecessors, based on some observations about differentiating the optima of regularized optimization.

Let me illustrate what that dependence looks like:In the figure above, let's say that we would like to minimise an objective function $f(\theta)$.

Rather than deterministically finding a particular local minimum, SGD samples different minima: when run with different random seeds it will find different minima.

The meta-learning objective now depends on $\theta_0$ in two different ways:as we change the anchor $\theta_0$,…

8 months, 1 week ago @ inference.vc
Invariant Risk Minimization: An Information Theoretic View

July 19, 2019. I finally got around to reading this new paper by Arjovsky et al.

Here, I will describe the main idea and then provide an information theoretic view on the same topic.

$Y \perp\mkern-13mu\perp E\vert X_1, W$: The observable $X_1$ and latent $W$ shield the label $Y$ from the influence of the environment.

Say we have a parametric family of functions $f(y\vert \phi(x); \theta)$ for predicting $y$ from $\phi(x)$.

The conditional information can be approximated as follows: \begin{align} I[Y, E \vert \phi(x)] &\approx \min_\theta \mathbb{E}_{x,y} \ell (f(y\vert \phi(x); \theta)) - \mathbb{E}_e \min_{\theta_e} \mathbb{E}_{x,y\vert e} \el…

10 months, 1 week ago @ inference.vc
ICML Highlight: Contrastive Divergence for Combining Variational Inference and MCMC

Ruiz and Titsias (2019), A Contrastive Divergence for Combining Variational Inference and MCMC. Background: principle of minimal improvement. First, some background on why I found this paper particularly interesting.

Using such improvement operator you can define an objective function for policies by measuring the extent to which the operator changes a policy.

In the case of AlphaGo Zero, the improvement operator is Monte Carlo Tree Search (MCTS).

The paper I'm talking about uses a very similar argument to come up with a contrastive divergence for variational inference, where the improvement operator is an MCMC step.

Combining VI with MCMC. The two dominant ways of performing inference in latent var…

11 months, 2 weeks ago @ inference.vc
Notes on the Limitations of the Empirical Fisher Approximation

June 6, 2019. This post is a short note on an excellent recent paper on empirical Fisher information matrices: Kunstner, Balles and Hennig (2019), Limitations of the Empirical Fisher Approximation. I was debating with myself whether I should write a post about this, because it's a superbly written paper that you should probably read in full.

There isn't a whole lot of novelty in the paper, but it is a great discussion paper that provides a concise overview of the Fisher information, the empirical Fisher matrix and their connections to generalized Gauss-Newton methods.

The third shows the gradients corrected by the empirical Fisher instea…

12 months ago @ inference.vc
Perceptual Straightening of Natural Videos

May 30, 2019. Video is an interesting domain for unsupervised, or self-supervised, representation learning.

So, for example, straight trajectories have an almost $0$ probability under a high-dimensional Brownian motion or Ornstein–Uhlenbeck (OU) process.

Results and SummaryThe main results of the paper - as expected - is that natural video sequences indeed appear to be mapped to straight trajectories in representation space.

For one, the paper assumes a Gaussian observation noise in representation space, and I wonder how robust the analysis would be to assuming heavy-tailed noise.

Similarly, our very definition of straightness and angles relies on the…

12 месяцев назад @ inference.vc
DeepSets: Modeling Permutation Invariance
DeepSets: Modeling Permutation Invariance DeepSets: Modeling Permutation Invariance

February 7, 2019. Guest post by [Fabian Fuchs](https://twitter.com/FabianFuchsML), [Ed Wagstaff](https://github.com/edwag), and [Martin Engelcke](https://twitter.com/martinengelcke). One of my favourite recent innovations in neural network architectures is Deep Sets.

In such a situation, the invariance property we can exploit is permutation invariance.

To give a short, intuitive explanation for permutation invariance, this is what a permutation invariant function with three inputs would look like: $f(a, b, c) = f(a, c, b) = f(b, a, c) = \dots$.

The Deep Sets Architecture (Sum-Decomposition). Having established that there is a need for permutat…
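A sum-decomposition $f(X) = \rho\left(\sum_i \phi(x_i)\right)$ can be sketched in a few lines; the random weight vectors below stand in for the learned $\phi$ and $\rho$ networks and are purely illustrative:

```python
import numpy as np

# Sum-decomposition sketch: f(X) = rho(sum_i phi(x_i)).  The random
# weight vectors stand in for learned phi/rho networks.
rng = np.random.default_rng(0)
w_phi = rng.normal(size=8)          # per-element embedding weights (phi)
w_rho = rng.normal(size=8)          # readout weights (rho)

def deep_set(xs):
    pooled = sum(np.tanh(x * w_phi) for x in xs)   # sum pooling: order-free
    return float(pooled @ w_rho)

print(np.isclose(deep_set([1.0, 2.0, 3.0]), deep_set([3.0, 1.0, 2.0])))  # -> True
```

Because the only interaction between elements is an unordered sum, the output is identical (up to floating-point rounding) for every permutation of the inputs.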

1 год, 3 месяца назад @ inference.vc
Causal Inference 3: Counterfactuals
Causal Inference 3: Counterfactuals Causal Inference 3: Counterfactuals

You hopefully know enough about causal inference by now to know that $p(🎓\vert 🧔=0)$ is certainly not the quantity we seek.

Counterfactual queries. To finally explain counterfactuals, I have to step beyond causal graphs and introduce another concept: structural equation models.

Structural Equation Models. A causal graph encodes which variables have a direct causal effect on any given node - we call these the causal parents of the node.

$f_1$ computes $x$ from its causal parent $u$, and $f_2$ computes $a$ from its causal parents $x$ and $v$.

The structural equation model (SEM) entails the causal graph, in that you can reconstruct the causal graph by looking at the inputs of each function.

1 year, 4 months ago @ inference.vc
Causal Inference 2: Illustrating Interventions via a Toy Example

Consequently, the joint distribution of data alone is insufficient to predict behaviour under interventions.

Finally, you can use various causal discovery techniques to try to identify the causal diagram from the data itself.

Theoretically, recovering the full causal graph from the data is impossible in the general case.

Summary. We have seen that modeling the joint distribution can only get you so far: if you want to predict the effect of interventions, i.e. calculate $p(y\vert do(x))$-like quantities, you have to add a causal graph to your analysis.

1 год, 4 месяца назад @ inference.vc
Online Bayesian Deep Learning in Production at Tencent
Online Bayesian Deep Learning in Production at Tencent Online Bayesian Deep Learning in Production at Tencent

These applications include active learning, reinforcement learning and online/continual learning.

When I recently read a paper by Tencent, I was surprised to learn that an online Bayesian deep learning algorithm is apparently deployed in production to power click-through-rate prediction in their ad system.

Assumed Density Filtering. The method relies on the approximate Bayesian online-learning technique often referred to as assumed density filtering.

forward propagation: In Bayesian deep learning, we maintain a distribution $q(w)$ over neural network weights, and each value $w$ defines a conditional probability $p(y\vert x, w)$.

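The "distribution over weights" idea can be illustrated with a one-parameter model: push samples of $w$ through the network to get a predictive distribution over outputs. The Gaussian $q(w)$, the input, and the toy model below are invented, and this is plain Monte Carlo rather than the moment matching that assumed density filtering performs:

```python
import numpy as np

# Monte Carlo "forward propagation" through a one-weight model y = w*x:
# keep a Gaussian belief q(w) over the weight and push samples through
# the model to obtain a predictive distribution for the output.
rng = np.random.default_rng(0)
w_mean, w_std, x = 2.0, 0.5, 3.0           # q(w) = N(2, 0.5^2), input x = 3

w_samples = rng.normal(w_mean, w_std, 100_000)
y_samples = w_samples * x                  # predictive samples of y

# Linear model: y is exactly N(w_mean * x, (w_std * x)^2) = N(6, 1.5^2)
print(round(y_samples.mean(), 1), round(y_samples.std(), 1))  # -> 6.0 1.5
```

In a real Bayesian network the mapping is nonlinear, which is why ADF-style methods propagate means and variances layer by layer instead of sampling.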

1 год, 6 месяцев назад @ inference.vc
👻Halloween Special: Critical reviews of the worst NIPS 2018 papers.
👻Halloween Special: Critical reviews of the worst NIPS 2018 papers. 👻Halloween Special: Critical reviews of the worst NIPS 2018 papers.


1 год, 7 месяцев назад @ inference.vc
The Blessings of Multiple Causes: Causal Inference when you Can't Measure Confounders
The Blessings of Multiple Causes: Causal Inference when you Can't Measure Confounders The Blessings of Multiple Causes: Causal Inference when you Can't Measure Confounders

September 7, 2018. Happy back-to-school time, everyone!

In this case, the size of the kidney stone is a confounder variable.

Let's look at how this differs from the non-causal association you would measure between treatment and outcome (i.e.

there may be confounders, but all confounders causally influence at least two of the cause variables.

It identifies just enough about the causal structure (the substitute confounder variable) to then be able to make causal inferences of a certain type.

1 year, 8 months ago @ inference.vc
The Spectator
last post 3 months ago
Queer Exceptionalism in Science

Read in 5 mins (800 words). Today’s queer scientist is exceptional.

Role of the Queer Scientist. For queer people to hold a recognised role in scientific life requires an acknowledgement that to be queer has consequences.

Challenges Facing Queer Scientists. For the queer scientist, every encounter involves a conscious act of deliberation, risk assessment, and effort, well before any effort of research is begun.

For queer scientists, every new encounter—with a colleague, supervisor, possible letter-writer, examiner, moderator, student, interviewer, acquaintance, or future-friend—sets up a stressful coming-out scene.

To be queer in science is to ask to belong and to be safe.

3 months ago @ blog.shakirm.com
Machinery of Grace

The machinery of grace is always simple.

The machines i’m thinking of are machines with intelligence, machines that learn.

Dialogues that lead to co-design and inclusion in the mission of developing intelligent machines with grace.

Firstly, to celebrate our progress in machine learning, but one that must now be balanced using a new critical practice.

If we are successful in making global AI truly global, and I believe we can be, we set ourselves on the path to realising that intelligent machinery of grace.

6 months, 1 week ago @ blog.shakirm.com
A New Consciousness of Inclusion in Machine Learning

On LGBT Freedoms and our Support for Machine Learning in Africa. This is an exploration of my thinking and my personal views.

The choice of these host countries has fomented concerns throughout our machine learning community: how can we as a community committed to inclusion in every form consider hosting our conferences in countries like these that are far from inclusive?

A politics of location, and an ethics of inclusion is growing healthily within our machine learning community.

But I too am an out and proud gay machine learning scientist.

My hope is that we will always continue to experiment with the ways in which we organise and support our global machine learning community.

11 months, 2 weeks ago @ blog.shakirm.com
Racialised Lives and the Life Beyond

The Black woman is racialised, and so too is the White man, as is every person we have ever known, and so the cycle of our racialised lives lives on.

About two-and-a-half years ago, I was part of creating a new organisation called the Deep Learning Indaba, as one attempt to engage with these questions.

The grassroots are those groups within our institutions, like our LGBT resource group within DeepMind, and those outside movements, like the Deep Learning Indaba.

I see the leadership of the Deep Learning Indaba as such a collective.

But I think we show the power of political love today, in this room, with our memory, with our energy, and in the celebration of progress that has brought us her…

11 months, 4 weeks ago @ blog.shakirm.com
Talk: How Do We Support Under-represented Groups To Put Themselves Forward?

As you think of this question, consider the journey that is taken by the under-represented groups we might have in mind.

Journeys like mine are our struggle credentials.

This room is filled with struggle credentials.

Struggle credentials play too much of a role in our present.

It is the under-represented groups that must eventually be put forward.

1 year, 6 months ago @ blog.shakirm.com
Machine Learning Trick of the Day (8): Instrumental Thinking

The instrumental variables idea is conceptually simple: we introduce new observed variables z, called instrumental variables, into our model; figure 1 (right).

And this is the trick: instrumental variables are a special subset of the data we already have, but they allow us to remove the effect of confounders.

Our problem is to learn a linear value function using features (when in state x) using parameters so that .

But this probabilistic viewpoint through instrumental variables means that we can think of alternative ways of extending this view.

Like every trick in this series, the instrumental variables give us an alternative way to think about existing problems.
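The basic instrumental-variables recipe (two-stage least squares) can be sketched on synthetic data; the data-generating process and all coefficients below are invented for illustration:

```python
import numpy as np

# Two-stage least squares on synthetic data: the instrument z shifts x
# but affects y only through x, so it corrects for the unobserved
# confounder u.
rng = np.random.default_rng(1)
n = 200_000
u = rng.normal(size=n)                        # unobserved confounder
z = rng.normal(size=n)                        # instrumental variable
x = z + u + rng.normal(size=n)
y = 2.0 * x + 3.0 * u + rng.normal(size=n)    # true causal effect: 2.0

naive = np.polyfit(x, y, 1)[0]                # confounded OLS slope (~3.0)
x_hat = np.polyfit(z, x, 1)[0] * z            # stage 1: project x onto z
iv = (x_hat @ y) / (x_hat @ x_hat)            # stage 2: regress y on x_hat
print(round(naive, 1), round(iv, 1))          # -> 3.0 2.0
```

The naive regression inherits bias from the confounder, while the two-stage estimate recovers the true causal coefficient because the projected regressor only retains variation driven by the instrument.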

1 year, 7 months ago @ blog.shakirm.com
Decolonising Artificial Intelligence

· Read in 6 mins · 1297 words · The Artificial Intelligence we believe to be global is far from it.

Inevitably, a call will be made to decolonise artificial intelligence.

The call for decolonisation in artificial intelligence is yet to reach its full volume.

Kai-Fu Lee, The Real Threat of Artificial Intelligence, June 2017. We immediately recognise the colonial nature of this possible future.

The only AI that empowers and works for the benefit of humanity is a truly global AI.

1 year, 7 months ago @ blog.shakirm.com
The Price of Transformation

The price of transformation is ours to pay.

Transformation cannot be separated from my other pillars, for they require transformation to succeed.

The price of transformation cannot be paid in this way.

We must all confront the question: What is the price of transformation?

We need to convince ourselves that the price of transformation is something we are willing to pay, and that we should pay.

1 year, 8 months ago @ blog.shakirm.com
Machine Learning Trick of the Day (7): Density Ratio Trick

The same is true if we want to compare probability densities: either through a density difference or a density ratio.

Density ratios are ubiquitous in machine learning, and will be our focus.

Density Ratio Estimation. The central task in the above five statistical quantities is to efficiently compute the ratio .

This is where the density ratio trick or formally, density ratio estimation, enters: it tells us to construct a binary classifier that distinguishes between samples from the two distributions.

This final derivation says that the problem of density ratio estimation is equivalent to that of binary classification.
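The trick can be demonstrated end to end with a hand-rolled logistic classifier. The two Gaussians below are chosen so the true log-ratio is known in closed form (it is linear in x, so logistic regression is well specified); everything here is an illustrative sketch, not code from the post:

```python
import numpy as np

# Density ratio trick sketch: train a logistic classifier to separate
# samples from p = N(0,1) and q = N(1,1); its log-odds estimate the
# log density ratio log p(x)/q(x), which here is exactly 0.5 - x.
rng = np.random.default_rng(0)
xs = np.concatenate([rng.normal(0.0, 1.0, 20_000),    # from p (label 1)
                     rng.normal(1.0, 1.0, 20_000)])   # from q (label 0)
ts = np.concatenate([np.ones(20_000), np.zeros(20_000)])

w, b = 0.0, 0.0
for _ in range(2000):                        # plain gradient descent
    d = 1.0 / (1.0 + np.exp(-(w * xs + b)))  # P(sample came from p | x)
    w -= 0.5 * ((d - ts) * xs).mean()
    b -= 0.5 * (d - ts).mean()

# Fitted log-odds w*x + b approximate log p(x)/q(x) = -x + 0.5
print(round(w, 1), round(b, 1))  # -> -1.0 0.5
```

With equal numbers of samples from each distribution, the classifier's odds ratio d/(1-d) directly estimates p(x)/q(x), which is the equivalence the post derives.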

2 years, 4 months ago @ blog.shakirm.com
Cognitive Machine Learning (2): Uncertain Thoughts

These types of thinking are secondary levels of thinking: a thinking about thinking.

Like the primary colours, our primary thoughts are those that are the basis of our cognition.

Secondary colours use the primary colours as their basis, and similarly, secondary thoughts are thoughts about our primary thoughts.

Our memories, decisions and attitudes are amongst our primary thoughts, and for each we have secondary thoughts—metacognitive confidence assessments—that guide our behaviours.

Again, we can make such assessments in two ways: about the decisions we are still to make, a prospective decision confidence; and decisions we have already made, a retrospective decision confidence.

3 years, 2 months ago @ blog.shakirm.com
大トロ
last post 2 months, 1 week ago
Neuroevolution of Self-Interpretable Agents

Agents with a self-attention “bottleneck” not only can solve these tasks from pixel inputs with only 4000 parameters, but they are also better at generalization.

Redirecting to attentionagent.github.io, where the article resides.

2 months, 1 week ago @ blog.otoro.net
Learning to Predict Without Looking Ahead

Rather than hardcoding forward prediction, we try to get agents to learn that they need to predict the future.

Redirecting to learningtopredict.github.io, where the article resides.

7 months ago @ blog.otoro.net
Weight Agnostic Neural Networks

We search for neural network architectures that can already perform various tasks even when they use random weight values.

Redirecting to weightagnostic.github.io, where the article resides.

11 months, 3 weeks ago @ blog.otoro.net
Learning Latent Dynamics for Planning from Pixels

PlaNet learns a world model from image inputs only and successfully leverages it for planning in latent space.

Redirecting to planetrl.github.io, where the article resides.

1 year, 3 months ago @ blog.otoro.net
Reinforcement Learning for Improving Agent Design

Little dude rewarded for having little legs.

Redirecting to designrl.github.io, where the article resides.

1 year, 7 months ago @ blog.otoro.net
World Models Experiments

In this article I will give step-by-step instructions for reproducing the experiments in the World Models article (pdf).

For general discussion about the World Models article, there are already some good discussion threads here in the GitHub issues page of the interactive article.

Reading list: World Models (pdf), A Visual Guide to Evolution Strategies, Evolving Stable Strategies. Below is optional: Mixture Density Networks, Mixture Density Networks with TensorFlow. Read tutorials on Variational Autoencoders if you are not familiar with them.

I used OS X for inference, but trained models using Google Cloud VMs.

You should update your git repo with these new models using git add doomrnn/tf_models/*.js…

1 year, 11 months ago @ blog.otoro.net
World Models

Can agents learn inside of their own dreams?

Redirecting to worldmodels.github.io, where the article resides.

2 years, 2 months ago @ blog.otoro.net
Evolving Stable Strategies

for i in range(solver.popsize):
    # init the agent with a solution
    agent = Agent(solutions[i])
    # rollout env with this agent
    fitlist[i] = rollout(agent, env)
# give scores results back to ES solver

One way to convert a deterministic policy into a stochastic policy is to make its outputs random.

Robot arm grasping task using a stochastic policy.

The Minitaur model in pybullet is designed to mimic the real physical Minitaur.

After making the ball smaller, CMA-ES was able to find a stochastic policy that can walk and balance the ball at the same time.

2 years, 6 months ago @ blog.otoro.net
A Visual Guide to Evolution Strategies

In this post I explain how evolution strategies (ES) work with the aid of a few visual examples.

OpenAI published a paper called Evolution Strategies as a Scalable Alternative to Reinforcement Learning where they showed that evolution strategies, while being less data efficient than RL, offer many benefits.

Schaffer-2D function; Rastrigin-2D function. Although there are many definitions of evolution strategies, we can define an evolution strategy as an algorithm that provides the user a set of candidate solutions to evaluate a problem.

Let’s visualise the scheme one more time, on the entire search process on both problems: Because CMA-ES can adapt both its mean and covariance matrix using inform…
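The ask/evaluate/tell loop common to these algorithms can be sketched with a bare-bones Gaussian ES (not CMA-ES; the objective and hyper-parameters below are invented for illustration):

```python
import numpy as np

# Bare-bones ask/evaluate/tell evolution strategy: sample candidates
# around a mean, score them, move the mean toward the elite set.
def fitness(x):
    return -np.sum(x ** 2)              # maximise => push x toward [0, 0]

rng = np.random.default_rng(0)
mu = np.array([3.0, -2.0])              # current solution estimate
sigma, popsize, n_elite = 0.5, 50, 10

for _ in range(100):
    candidates = mu + sigma * rng.normal(size=(popsize, 2))   # "ask"
    scores = np.array([fitness(c) for c in candidates])
    elite = candidates[np.argsort(scores)[-n_elite:]]         # best 10
    mu = elite.mean(axis=0)             # "tell": re-centre on the elite

print(np.abs(mu).max() < 0.5)           # -> True: mean hovers near the optimum
```

CMA-ES follows the same loop but additionally adapts the full covariance matrix (and step size) from the elite samples, rather than keeping sigma fixed as above.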

2 years, 7 months ago @ blog.otoro.net
Teaching Machines to Draw

In this work, we investigate an alternative to traditional pixel image modelling approaches, and propose a generative model for vector images.

For example, we can subtract the latent vector of an encoded pig head from the latent vector of a full pig, to arrive at a vector that represents the concept of a body.

As we saw earlier, a model trained to draw pigs can be made to draw pig-like trucks if given an input sketch of a truck.

Exploring the latent space between different objects can potentially enable creative designers to find interesting intersections and relationships between different drawings: Exploring the latent space between cats and buses, elephants and pigs, and various owls.

In …

3 years ago @ blog.otoro.net
Recurrent Neural Network Tutorial for Artists

In particular, the experiments in the post help visualise the internals of a recurrent neural network trained to generate handwriting.

Recurrent Neural Network for Handwriting. We have pre-trained a recurrent neural network model to perform the handwriting task described in the previous section.

var x, y;
var dx, dy;
var pen;
var prev_pen;
var rnn_state;
var pdf;
var temperature = 0.65;
var screen_width = window.…

…get_pdf(rnn_state);
[dx, dy, pen] = Model.…

I haven’t personally used keras.js, and I found it fun to just write the handwriting model from scratch in Javascript.

3 years, 4 months ago @ blog.otoro.net
Hypernetworks

In our paper, we use HyperNetworks to explore a middle ground - to enforce a relaxed version of weight-tying.

The more exciting work is in the second part of my paper where we apply Hypernetworks to Recurrent Networks.

Dynamic Hypernetworks. As mentioned in the Introduction, we also tried to apply Hypernetworks to Recurrent Networks, and I feel this is the main contribution of the research.

Our approach is to put a small LSTM cell (called the HyperLSTM cell) inside a large LSTM cell (the main LSTM).

For our implementation of Dynamic Hypernetworks, we made it so that we can just plug our HyperLSTM cell into any TensorFlow code written to use tf.nn.rnn_cell objects, since the HyperLSTM inherite…

3 years, 8 months ago @ blog.otoro.net
Generating Large Images from Latent Vectors - Part Two

Random gaussian latent vectors were generated from numpy.random and fed into the generative network to obtain these images.

Our generator can produce large random images of digits using random gaussian vectors as input.

Unlike the previous model though, the generated images do not necessarily have to look exactly like the set of training images.

All the generator has to do is to create a set of new images that share the same classification labels of the set of training images.

Description of Generator Network. The generator used in the previous model uses 4 large fully connected layers of 128 nodes each.

3 years, 12 months ago @ blog.otoro.net
Neural Network Evolution Playground with Backprop NEAT

This demo will attempt to use a genetic algorithm to produce efficient, but atypical neural network structures to classify datasets borrowed from TensorFlow Playground.

People started experimenting with different neural network configurations, such as how many neural network layers are actually needed to fit a certain data set, or what initial features should be used for another data set.

In addition to weight-search, Deep Learning research has also produced many powerful neural network architectures that are important building blocks.

Evolving Neural Network Topology. Neuroevolution of Augmenting Topologies (NEAT) is a method that can evolve new types of neural networks based on genetic algo…

4 years ago @ blog.otoro.net
Interactive Abstract Pattern Generation Javascript Demo

Interactive Javascript Demo for Abstract Pattern Generation.

Although there was some code available previously in Javascript, it wasn’t general enough to use as a tool for a digital artist.

So I took the Javascript code previously written and spent an hour or two fine-tuning it into a simple web app.

In addition, the user is able to specify the size and depth of the generator neural network.

The depth and size of the network, and also the image resolution of the output can all be customised in the web app.

4 years, 1 month ago @ blog.otoro.net
The Unofficial Google Data Science Blog
last post 5 months, 3 weeks ago
Humans-in-the-loop forecasting: integrating data science and business planning

Figure 1: A Google data center. As an example, consider Google’s forecasting and planning for data center capacity.

In particular, the data scientist must take responsibility for stakeholders approving the “best” forecast from all available information sources.

It required investments from our data science team to re-think our statistical forecasting approach to make it easier to compare against customer forecasts.

It also owns Google’s internal time series forecasting platform described in an earlier blog post.

But looking through the blogosphere, some go further and posit that “platformization” of forecasting and “forecasting as a service” can turn anyone into a data scientist at the push …

5 months, 3 weeks ago @ unofficialgoogledatascience.com
Estimating the prevalence of rare events — theory and practice

$$S(v_1) = S(v_2) \implies \frac{q(v_1)}{p(v_1)} = \frac{q(v_2)}{p(v_2)}$$ The ratio between the importance distribution and target distribution is thus a function of $S(v)$: $$\frac{q(v)}{p(v)} = \frac{\tilde{q}(S(v))}{\tilde{p}(S(v))}$$ where $\tilde{p}$ and $\tilde{q}$ are PMFs of $S(v)$ under the target distribution and importance distribution respectively.

In our case when the events are rare and the probability of high conditional prevalence rate is small under the target distribution, the difference between the methods is minor.

We also discuss how to choose $q$ with respect to the conditional prevalence rate $g(S(v))=\mathbb{E}_p\left[f(V)|S(V)=S(v)\right]$.

Conclusion. In this post, we…
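The core importance-sampling idea can be illustrated on a one-dimensional rare event; the target, proposal, and threshold below are invented for illustration:

```python
import numpy as np
from math import erf, sqrt

# Importance sampling for a rare event: estimate P(V > 4) for
# V ~ N(0,1) using a proposal q = N(4,1) centred on the rare region,
# reweighting each sample by p(v)/q(v).
rng = np.random.default_rng(0)
n = 100_000

def log_pdf(v, mean):                    # unit-variance Gaussian log-density
    return -0.5 * (v - mean) ** 2 - 0.5 * np.log(2 * np.pi)

v = rng.normal(4.0, 1.0, n)              # draw from the proposal q
weights = np.exp(log_pdf(v, 0.0) - log_pdf(v, 4.0))   # p(v)/q(v)
est = np.mean((v > 4.0) * weights)

true = 0.5 * (1.0 - erf(4.0 / sqrt(2.0)))             # exact tail prob ~3.2e-5
print(abs(est / true - 1.0) < 0.05)      # -> True
```

Sampling directly from N(0,1) would need roughly thirty thousand draws per event; shifting the proposal into the rare region makes about half the samples informative and cuts the estimator's variance dramatically.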

9 months ago @ unofficialgoogledatascience.com
Misadventures in experiments for growth

In summary, classic experimentation is applicable to fledgling products but in a much more limited way than to established products.

For our music example, we imagined that EDM users don't approximate the target population for some experiments.

The behavior of this single user appears in our data as a large number of impressions with conversions.

A word on growth hackingOf particular concern in growth hacking is the focus on influencers for pushing growth.


1 year, 1 month ago @ unofficialgoogledatascience.com
Crawling the internet: data science within a large engineering system

When queries arrive, the search system matches the inferred meaning of the query to web pages on the basis of these snapshots.

This measure of web page value is on a meaningful linear scale, such that our freshness metric (a weighted average) has an intuitive interpretation.

A global constraint of how much compute and network resources Google itself is willing to dedicate to crawling web pages.

In some regimes (and in practice for google search), a greedy algorithm would devote more recrawl resources towards high value pages, as lower value pages would commonly starve.

We can use this function to sort the web pages, and then determine which web pages should be scheduled for immediate crawl.
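The greedy scheduling idea can be sketched as follows; the page names, values, and costs below are invented for illustration, not Google's actual scoring.

```python
# Hypothetical greedy crawl scheduler: sort pages by freshness value
# regained per unit of crawl cost, then schedule until a global resource
# budget is exhausted.
pages = [
    # (page, freshness value regained by a recrawl, crawl cost)
    ("news.example/home", 9.0, 1.0),
    ("shop.example/catalog", 4.0, 2.0),
    ("blog.example/archive", 0.5, 1.0),
    ("example.com/about", 0.2, 1.0),
]

budget = 3.0  # global constraint on crawl resources
scheduled = []
for page, value, cost in sorted(pages, key=lambda p: p[1] / p[2], reverse=True):
    if cost <= budget:
        scheduled.append(page)
        budget -= cost

print(scheduled)
```

Note how the low-value pages starve once the budget is exhausted, exactly the behavior the excerpt describes for the greedy algorithm.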

1 year, 10 months ago @ unofficialgoogledatascience.com
Compliance bias in mobile experiments

The differences between the distribution of users experiencing the treatment and the population are likely to be a key factor here.

Compliance Bias A central issue in this application is that users assigned treatment sometimes do not actually experience the treatment at $T_{\mathrm{measure}}$, and furthermore this set of users is not random.

Here, we can draw a direct analogy to Compliance Bias, which is primarily described in literature on the analysis of medical studies.

Propensity scoring within the treatment. Fig 5: Estimated probability of experiencing the treatment in the treatment group.

Here, we ignore any control group, and analyze the treatment group as a self-contained observationa…

2 years, 2 months ago @ unofficialgoogledatascience.com
Designing A/B tests in a collaboration network

Our model considers two aspects of network effects. Homophily, or similarity within the network: users collaborating in a network tend to behave similarly.

The network topology itself is the actual collaboration network we observe for GCP. When users are connected in a network, their treatment assignments can generate network effects through their interactions.

In other words, for the three methods of randomization (uniform random component, uniform random project, stratified random component) we simulate confidence intervals for A/A tests, i.e.

Conclusion Designing randomized experiments on a network of users is more ch…

2 years, 4 months ago @ unofficialgoogledatascience.com
Unintentional data

The Future of Data Analysis. Avalanche of questions: the role of the data scientist amid unintentional data. Is it relevant to our goals?

In the world of big, unintentional data there are many discoveries to be had which have no bearing on the organization’s goals.

Democratization of analysis: quantity has a quality all its own Just as dealing with unintentional data shapes the role of the data scientists in their organization, it also shapes the day to day practice of data analysis.

Understanding the goals of the organization as well as guiding principles for extracting value from data are both critical for success in this environment. Thankfully, not only have modern data analysis tools made da…

2 years, 7 months ago @ unofficialgoogledatascience.com
Fitting Bayesian structural time series with the bsts R package

When fitting bsts models that contain a regression component, extra arguments captured by ... are passed to the SpikeSlabPrior function from the BoomSpikeSlab package.

# Fit a bsts model with expected model size 1, the default.
model2 <- bsts(iclaimsNSA ~ ., state.specification = ss,
               niter = 1000, data = initial.claims)
# Fit a bsts model with expected model size 5, to include more coefficients.

(a) (b)Figure 10: Regression coefficients for the (a) plain logistic regression model and (b) time series logistic regression model under equivalent spike and slab priors.

These are a widely useful class of time series models, known in various literatures as "structural time series," "state space mod…

2 years, 10 months ago @ unofficialgoogledatascience.com
Our quest for robust time series forecasting at scale

The demand for time series forecasting at Google grew rapidly along with the company over its first decade.

That is, for an attempt to develop methods and tools that would facilitate accurate large-scale time series forecasting at Google.


But like our approach, Prophet aims to be an automatic, robust forecasting tool. Lastly, "forecasting" for us did not mean anomaly detection.

by ERIC TASSONE, FARZAN ROHANI. Time series forecasting enjoys a rich and luminous history, and today is an essential element of most any business operation.

3 years, 1 month ago @ unofficialgoogledatascience.com
Attributing a deep network’s prediction to its input features

For concreteness, let us focus on a deep network that performs object recognition.

Deep networks have multiple layers of logic and coefficients, combined using nonlinear activation functions .

Application to other networks Our paper also includes application of integrated gradients to other networks (none of these networks were trained by us).

There is also work (such as this ) on architecting deep networks in ways that allow us to understand the internal representations of these networks.

Overall, we hope that deep networks lose their reputation for being impenetrable black-boxes which perform black magic.
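The core computation behind integrated gradients is small. A toy sketch of our own (not the authors' code): central finite differences stand in for backprop on a smooth three-input function, and we check the completeness property that attributions sum to f(x) - f(baseline).

```python
import math

def f(x):
    # toy differentiable "model" of three input features
    return x[0] ** 2 + 3.0 * x[1] + math.tanh(x[2])

def grad_f(x, eps=1e-5):
    # central finite differences stand in for backprop gradients
    g = []
    for i in range(len(x)):
        hi = list(x); hi[i] += eps
        lo = list(x); lo[i] -= eps
        g.append((f(hi) - f(lo)) / (2 * eps))
    return g

def integrated_gradients(x, baseline, steps=200):
    # average the gradient along the straight path from baseline to x
    # (midpoint rule), then scale coordinate-wise by (x - baseline)
    attr = [0.0] * len(x)
    for k in range(steps):
        a = (k + 0.5) / steps
        point = [b + a * (xi - b) for xi, b in zip(x, baseline)]
        g = grad_f(point)
        for i in range(len(x)):
            attr[i] += g[i] * (x[i] - baseline[i]) / steps
    return attr

x, baseline = [1.0, 2.0, 0.5], [0.0, 0.0, 0.0]
attr = integrated_gradients(x, baseline)
print(sum(attr), f(x) - f(baseline))  # the two numbers should match closely
```

The per-feature entries of `attr` are the attributions; their agreement in total with f(x) - f(baseline) is the "completeness" axiom the method is built around.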

3 years, 2 months ago @ unofficialgoogledatascience.com
Causality in machine learning

An obvious attempt to fix this is to upweight randomized data in training, or even train the model solely on the randomized data.

As we observed at the start of this post, standard machine learning techniques don’t distinguish between randomized and observational data the way statistical models do.

Conclusion In this post we described how some randomized data may be applied both to check and improve the accuracy of a machine learning system trained largely on observational data.

Indeed, machine learning generally lacks the vocabulary to capture the distinction between observational data and randomized data that statistics finds crucial.

Rather, the focus of this post is on combining observa…

3 years, 3 months ago @ unofficialgoogledatascience.com
Practical advice for analysis of large, complex data sets

Some people seemed to be naturally good at doing this kind of high quality data analysis.

Process: Separate Validation, Description, and Evaluation. Validation or Initial Data Analysis: Do I believe the data is self-consistent, that the data was collected correctly, and that the data represents what I think it does?

I think about exploratory data analysis as having 3 interrelated stages. By separating these phases, you can more easily reach agreement with others.

Acknowledge and count your filtering Almost every large data analysis starts by filtering the data in various stages.


3 years, 6 months ago @ unofficialgoogledatascience.com
Statistics for Google Sheets

Introduction. Statistics for Google Sheets is an add-on for Google Sheets that brings elementary statistical analysis tools to spreadsheet users.

The goal of the Statistics app is to “democratize data science” by putting elementary statistics capabilities in the hands of anyone with a Google account.

If you look closely at the boxplots you can see that returns following down days have slightly greater variation than returns following up days.

Finally, you can use logistic regression to see how a previous day’s return affects the probability of the next day’s return being positive.

Statistics for Google Sheets gives analysts and students the tools to conduct elementary statistical analyses in …

3 years, 8 months ago @ unofficialgoogledatascience.com
Next generation tools for data science

Introduction. That MapReduce was the solution for writing data processing pipelines scalable to hundreds of terabytes (or more) is evidenced by its massive uptake.

Widely used in medicine for count data, the MH estimator and its generalizations are ubiquitous within data science at Google.

.filter(lambda x: x != header)

Beam/Dataflow’s sweet spot: streaming processing. Streaming processing is an increasingly important topic for data science.

3 years, 9 months ago @ unofficialgoogledatascience.com
Mind Your Units

The perils of incorrect units Is the idea of 'minding our units' just some esoteric issue, or can this actually hurt us in practice?

How do we mind our units in analyses at Google?

The above simulation already hints at one of our approaches to incorporating the group structure in some analyses at Google.
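A toy reconstruction of the kind of simulation described (our own construction, with made-up numbers): when rows share a group-level effect, a row-level standard error is far too optimistic, because the effective sample size is closer to the number of groups than to the number of rows.

```python
import random
import statistics

random.seed(1)
n_groups, per_group = 50, 20
rows = []
for g in range(n_groups):
    group_effect = random.gauss(0, 1)          # shared within the group
    for _ in range(per_group):
        rows.append(group_effect + random.gauss(0, 0.1))

# naive SE treats all 1000 rows as independent
naive_se = statistics.stdev(rows) / len(rows) ** 0.5

# grouped SE aggregates to one mean per group first
group_means = [statistics.mean(rows[g * per_group:(g + 1) * per_group])
               for g in range(n_groups)]
grouped_se = statistics.stdev(group_means) / n_groups ** 0.5

print(naive_se, grouped_se)  # the grouped SE is several times larger
```

Analyzing at the level of the randomization unit (the group) is the simple fix; mixed models and clustered standard errors are the more general ones.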

Regardless of how you do it, do remember to mind your units.


3 years, 10 months ago @ unofficialgoogledatascience.com
Andrej Karpathy
last post 1 year, 1 month ago
A Recipe for Training Neural Networks

Some few weeks ago I posted a tweet on “the most common neural net mistakes”, listing a few common gotchas related to training neural nets.

1) Neural net training is a leaky abstraction. It is allegedly easy to get started with training neural nets.

This is just a start when it comes to training neural nets.

As a result, (and this is reeaally difficult to over-emphasize) a “fast and furious” approach to training neural networks does not work and only leads to suffering.

focus on training loss) and then regularize it appropriately (give up some training loss to improve the validation loss).

1 year, 1 month ago @ karpathy.github.io
(started posting on Medium instead)

The current state of this blog (with the last post 2 years ago) makes it look like I’ve disappeared.

I’ve certainly become less active on blogs since I’ve joined Tesla, but whenever I do get a chance to post something I have recently been defaulting to doing it on Medium because it is much faster and easier.

I still plan to come back here for longer posts if I get any time, but I’ll default to Medium for everything short-medium in length.

TLDRHave a look at my Medium blog.

2 years, 4 months ago @ karpathy.github.io
A Survival Guide to a PhD

Unlike the undergraduate guide, this one was much more difficult to write because there is significantly more variation in how one can traverse the PhD experience.

You can go one way (PhD -> anywhere else) but not the other (anywhere else -> PhD -> academia/research; it is statistically less likely).

The adviser is an extremely important person who will exercise a lot of influence over your PhD experience.

During your PhD you’ll get to acquire this sense yourself.

It’s usually a painful exercise for me to look through some of my early PhD paper drafts because they are quite terrible.

3 years, 8 months ago @ karpathy.github.io
Deep Reinforcement Learning: Pong from Pixels

This is a long overdue blog post on Reinforcement Learning (RL).

From left to right: Deep Q Learning network playing ATARI, AlphaGo, Berkeley robot stacking Legos, physically-simulated quadruped leaping over terrain.

Policy network.

For example, suppose we compute \(R_t\) for all of the 20,000 actions in the batch of 100 Pong game rollouts above.

The total number of episodes was approximately 8,000 so the algorithm played roughly 200,000 Pong games (quite a lot isn’t it!)
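The quantity \(R_t\) is the discounted return from each time step, computed by a single backward pass over the episode's rewards. A minimal sketch (the reward vector is illustrative; the per-game running-sum reset used in the actual Pong code is omitted):

```python
def discount_rewards(rewards, gamma=0.99):
    # walk backwards through the episode, accumulating the running return:
    # R_t = r_t + gamma * R_{t+1}
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# a 4-step rally where only the final action is rewarded
print(discount_rewards([0.0, 0.0, 0.0, 1.0]))
```

Every action in the rally is credited with an exponentially decaying share of the final reward, which is exactly the signal policy gradients multiply into the log-probabilities.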

3 years, 12 months ago @ karpathy.github.io
Short Story on AI: A Cognitive Discontinuity.

Another great source of good reputation for Visceral were the large number of famous interventions carried out by autonomous Visceral agents.

The list went on and on - one month ago an autonomous Visceral agent recognized a remote drone attack.

He was running the routine software diagnostics on the Visceral agent and one of them had just failed.

The software diagnostics were only at 5% complete, and Merus knew they would take a while to run to completion.

Merus’ avatar broke the silence in the last second: “Come meet me here.” And then the connection was lost.

4 years, 6 months ago @ karpathy.github.io
What a Deep Neural Network thinks about your #selfie

In this fun experiment we’re going to do just that: We’ll take a powerful, 140-million-parameter state-of-the-art Convolutional Neural Network, feed it 2 million selfies from the internet, and train it to classify good selfies from bad ones.

what if someone posted a very good selfie but it was late at night, so perhaps not as many people saw it and it got less likes?

What makes a good #selfie ?

To take a good selfie, Do: Be female.

Also, with some relief, it seems that the best selfies do not seem to be the ones that show the most skin.

4 years, 7 months ago @ karpathy.github.io
The Unreasonable Effectiveness of Recurrent Neural Networks

A glaring limitation of Vanilla Neural Networks (and also Convolutional Networks) is that their API is too constrained: they accept a fixed-sized vector as input (e.g.

If training vanilla neural nets is optimization over functions, training recurrent nets is optimization over programs.

At the core, RNNs have a deceptively simple API: They accept an input vector x and give you an output vector y .

Fun with RNNsAll 5 example character models below were trained with the code I’m releasing on Github.

These models have about 10 million parameters, which is still on the lower end for RNN models.

5 years ago @ karpathy.github.io
Breaking Linear Classifiers on ImageNet

speech recognition systems), and most importantly, also to simple, shallow, good old-fashioned Linear Classifiers (Softmax classifier, or Linear Support Vector Machines, etc.).

Instead, let’s fool a linear classifier, and let’s also keep with the theme of breaking models on images because they are fun to look at.

With input images of size 64x64x3 and 1000 ImageNet classes we therefore have 64x64x3x1000 = 12.3 million weights (beefy linear model!

We can then visualize each of the learned weights by reshaping them as images. Figure: example linear classifiers for a few ImageNet classes.

A linear classifier with lower regularization (which leads to noisier class weights) is easier to fool (top).

5 years, 1 month ago @ karpathy.github.io
What I learned from competing against a ConvNet on ImageNet

The 100,000 test set images are released with the dataset, but the labels are withheld to prevent teams from overfitting on the test set.

It’s fun to note that about 4 years ago I performed a similar (but much quicker and less detailed) human classification accuracy analysis on CIFAR-10.

In total, we attribute 24 (24%) of GoogLeNet errors and 12 (16%) of human errors to this category.

We estimate that approximately 22 (21%) of GoogLeNet errors fall into this category, while none of the human errors do.

On the other hand, a large majority of human errors come from fine-grained categories and class unawareness.

5 years, 8 months ago @ karpathy.github.io
Quantifying Productivity

The tracking script currently records active window titles (at frequency of once every 2 seconds) and keystroke typing frequency.

Now, remember that we record keystrokes and window titles throughout.

Hacking Streak is a nifty feature that tries to identify contiguous hacking activity and correlates reasonably with my productivity.

In the end, ulogme shows the final breakdown of titles that occupied me on this day. Figure: the final breakdown of active window titles.

The holy grail here is still not implemented: what are the correlates of my productivity?

5 years, 9 months ago @ karpathy.github.io
Off the Convex Path
last post 1 month ago
Exponential Learning Rate Schedules for Deep Learning (Part 1)

This blog post concerns our ICLR20 paper on a surprising discovery about learning rate (LR), the most basic hyperparameter in deep learning.

These divergent approaches suggest that LR, the most basic and intuitive hyperparameter in deep learning, has not revealed all its mysteries yet.

SOTA performance with exponential LRAs mentioned, reaching state-of-the-art accuracy requires reducing the learning rate a few times.

Suppose the training has $K$ phases, and the learning rate is divided by some constant $C_I>1$ when entering phase $I$.
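That phase-wise step decay can be sketched directly (the base rate, phase lengths, and the single constant $C$ below are illustrative; the post allows a different $C_I$ per phase):

```python
def step_decay_lr(base_lr, phase_lengths, C=10.0):
    # step schedule: the LR is constant within each phase and is divided
    # by C on entering the next phase
    schedule = []
    lr = base_lr
    for length in phase_lengths:
        schedule.extend([lr] * length)
        lr /= C
    return schedule

print(step_decay_lr(0.1, [3, 3, 2]))
```

The paper's result is that, for networks with batch normalization plus weight decay, such a step schedule can be replaced by an exponentially growing LR without changing the final accuracy.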

ConclusionWe hope that this bit of theory and supporting experiments have changed your outlook o…

1 month ago @ offconvex.org
Ultra-Wide Deep Nets and Neural Tangent Kernel (NTK)

gradient flow) is equivalent to a kernel regression predictor with a deterministic kernel called neural tangent kernel (NTK).

Now we describe how training an ultra-wide fully-connected neural network leads to kernel regression with respect to the NTK.

In the large width limit, it turns out that the time-varying kernel $ker_t(\cdot,\cdot)$ is (with high probability) always close to a deterministic fixed kernel $ker_{\mathsf{NTK}}(\cdot,\cdot)$, which is the neural tangent kernel (NTK).

Now, at least we have a better understanding of a class of ultra-wide neural networks: they are captured by neural tangent kernels!
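The kernel-regression predictor in question can be sketched concretely. Here an RBF kernel stands in for the architecture-specific NTK (a simplification of our own), and a made-up two-point training set yields the interpolating predictor $f(x)=\sum_i \alpha_i\, k(x, x_i)$:

```python
import math

def k(a, b, bw=1.0):
    # RBF kernel as a stand-in for the neural tangent kernel
    return math.exp(-((a - b) ** 2) / (2 * bw * bw))

xs, ys = [0.0, 2.0], [1.0, -1.0]   # two made-up training points

# solve the 2x2 linear system K @ alpha = y in closed form
a11, a12 = k(xs[0], xs[0]), k(xs[0], xs[1])
a21, a22 = k(xs[1], xs[0]), k(xs[1], xs[1])
det = a11 * a22 - a12 * a21
alpha = [(ys[0] * a22 - ys[1] * a12) / det,
         (ys[1] * a11 - ys[0] * a21) / det]

def predict(x):
    # f(x) = sum_i alpha_i * k(x, x_i)
    return sum(a * k(x, xi) for a, xi in zip(alpha, xs))

print(predict(0.0), predict(2.0), predict(1.0))
```

The predictor interpolates the training labels exactly (no ridge term here); the NTK theory says the function an infinitely wide net converges to under gradient flow has precisely this form, with $k$ replaced by the NTK.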

Similarly, one can try to translate other architectures like recurrent neural…

7 months, 4 weeks ago @ offconvex.org
Understanding implicit regularization in deep learning by analyzing trajectories of gradient descent

Sanjeev’s recent blog post suggested that the conventional view of optimization is insufficient for understanding deep learning, as the value of the training objective does not reliably capture generalization.

In recent years, researchers have come to realize the importance of implicit regularization induced by the choice of optimization algorithm.

This theorem disqualifies Schatten quasi-norms as the implicit regularization in deep matrix factorizations, and instead suggests that all depths correspond to nuclear norm.

Full details behind our results on “implicit regularization as norm minimi…

10 months, 3 weeks ago @ offconvex.org
Landscape Connectivity of Low Cost Solutions for Multilayer Nets

A big mystery about deep learning is how, in a highly nonconvex loss landscape, gradient descent often finds near-optimal solutions (those with training cost almost zero), even starting from a random initialization.

Solutions A and B have low cost but the line connecting them goes through solutions with high cost.

Mode Connectivity.

2019) did try to explain the phenomenon of mode connectivity in simple settings (the first of these demonstrated mode connectivity empirically for multi-layer nets).

Thus to explain mode connectivity for multilayer nets we will need to leverage some stronger property of typical solutions discovered v…

11 months, 2 weeks ago @ offconvex.org
Is Optimization a Sufficient Language for Understanding Deep Learning?

In this Deep Learning era, machine learning usually boils down to defining a suitable objective/cost function for the learning task at hand, and then optimizing this function using some variant of gradient descent (implemented via backpropagation).

I am suggesting that deep learning algorithms also have important properties that are not always reflected in the objective value.

by playing with batch sizes and learning rates) can be preferable to perfect optimization, even in simple settings such as regression.

NB: Empirically we find that Adam, the celebrated acceleration method for deep learning, speeds up optimization a…

12 months ago @ offconvex.org
Contrastive Unsupervised Learning of Semantic Representations: A Theoretical Framework

Semantic representations (aka semantic embeddings) of complicated data types (e.g.

Researchers are most interested in unsupervised representation learning using unlabeled data.

samples $x, x^{+}$ from the distribution $D_{c^+}$.

The highlighted parts in the table show that the unsupervised representations compete well with the supervised representations on the average $k$-way classification task ($k=2, 10$).

We find this to be true for unsupervised representations, and surprisingly for supervised representations as well.

1 year, 2 months ago @ offconvex.org
The search for biologically plausible neural computation: A similarity-based approach

By re-ordering the variables and introducing a new variable, ${\bf W} \in \mathbb{R}^{k\times n}$, we obtain the second identity. To prove it, find the optimal ${\bf W}$ by taking a derivative of the expression on the right with respect to ${\bf W}$ and setting it to zero, and then substitute the optimal ${\bf W}$ back into the expression.

The price paid for this simplification is the appearance of the minimax optimization problem in variables, ${\bf W}$ and ${\bf M}$.

Variables ${\bf W}$ and ${\bf M}$ are represented by the weights of synapses in feedforward and lateral connections respectively.

In neuroscience, learning rules (2.7) for ${\bf W}$ and ${\bf M}$ are called Hebbian and anti-Hebbian r…

1 year, 5 months ago @ offconvex.org
Understanding optimization in deep learning by analyzing trajectories of gradient descent

Neural network optimization is fundamentally non-convex, and yet simple gradient-based algorithms seem to consistently solve such problems.

Trajectory-Based Analyses for Deep Linear Neural NetworksLinear neural networks are fully-connected neural networks with linear (no) activation.

2014 were the first to carry out a trajectory-based analysis for deep (three or more layer) linear networks, treating gradient flow (gradient descent with infinitesimally small learning rate) minimizing $\ell_2$ loss over whitened data.

Specifically, we analyze trajectories of gradient descent for any linear neural network …

1 year, 6 months ago @ offconvex.org
Simple and efficient semantic embeddings for rare words, n-grams, and language features

Distributional methods for capturing meaning, such as word embeddings, often require observing many examples of words in context.

Here we describe a simple but principled approach called à la carte embeddings, described in our ACL’18 paper with Yingyu Liang, Tengyu Ma, and Brandon Stewart.

For convenience, we will let $u_w^c$ denote the average of the word embeddings of words in $c$.

We test this hypothesis by inducing embeddings for $n$-grams by using contexts from a large text corpus and word embeddings trained on the same corpus.

The à la carte code is available here, allowing you to re-create the resu…

1 year, 8 months ago @ offconvex.org
Machine Learning Mastery
last post 1 day, 17 hours ago
How to Scale Data With Outliers in Machine Learning

Many machine learning algorithms perform better when numerical input variables are scaled to a standard range.

After completing this tutorial, you will know:Many machine learning algorithms prefer or perform better when numerical input variables are scaled.

If there are input variables that have very large values relative to the other input variables, these large values can dominate or skew some machine learning algorithms.

data = dataframe.values
# separate into input and output columns
X, y = data[:, :-1], data[:, -1]
# ensure inputs are floats and output is an integer label
X = X.astype('float32')
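The remedy this tutorial builds toward, robust scaling, can be sketched without any library: center on the median and scale by the interquartile range, so a few extreme values barely move the scale. This is the idea behind scikit-learn's RobustScaler; the data below is made up.

```python
import statistics

def robust_scale(values):
    # center on the median and scale by the interquartile range (IQR);
    # unlike mean/std scaling, both statistics resist outliers
    med = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    return [(v - med) / (q3 - q1) for v in values]

data = [float(v) for v in range(1, 10)] + [1000.0]  # one extreme outlier
scaled = robust_scale(data)
print(scaled)
```

The nine inliers land in a small interval around zero while the outlier stays visibly extreme; a min-max scaler on the same data would crush the inliers into a sliver near zero instead.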

1 day, 17 hours ago @ machinelearningmastery.com
Recursive Feature Elimination (RFE) for Feature Selection in Python

Tutorial Overview. This tutorial is divided into three parts; they are: Recursive Feature Elimination; RFE With scikit-learn; RFE for Classification; RFE for Regression; RFE Hyperparameters; Explore Number of Features; Automatically Select the Number of Features; Which Features Were Selected; Explore Base Algorithm. Recursive Feature Elimination, or RFE for short, is a feature selection algorithm.

First, confirm that you are using a modern version of the library by running the following script:

# check scikit-learn version
import sklearn
print(sklearn.__version__)

# define the method
rfe = RFE(estimator=D…

3 days, 17 hours ago @ machinelearningmastery.com
How to Use Discretization Transforms for Machine Learning

Tutorial Overview. This tutorial is divided into six parts; they are: Change Data Distribution; Discretization Transforms; Sonar Dataset; Uniform Discretization Transform; K-means Discretization Transform; Quantile Discretization Transform. Change Data Distribution: Some machine learning algorithms may prefer or require categorical or ordinal input variables, such as some decision tree and rule-based algorithms.

Discretization TransformsA discretization transform will map numerical variables onto discrete values.

The discretization transform is available in the scikit-learn Python machine learning library via the KBinsDiscretizer class.

pyplot.show()
# reshape data to have rows and columns
data = data.reshape((len(data), 1))

Tut…
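The uniform strategy can be sketched by hand: split the observed range into equal-width bins and replace each value with the ordinal index of its bin, which is what KBinsDiscretizer's 'uniform' strategy does. The sample values are made up.

```python
def uniform_discretize(values, n_bins=3):
    # equal-width bins over the observed range; each value becomes the
    # ordinal index of the bin it falls into
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    bins = []
    for v in values:
        b = int((v - lo) / width)
        bins.append(min(b, n_bins - 1))  # clamp the maximum onto the last bin
    return bins

print(uniform_discretize([0.1, 0.4, 0.5, 0.9, 1.0], n_bins=3))
```

The 'quantile' and 'kmeans' strategies differ only in how the bin edges are chosen (equal-count and cluster-centered, respectively).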

6 days, 17 hours ago @ machinelearningmastery.com
How to Use Quantile Transforms for Machine Learning

Tutorial Overview. This tutorial is divided into five parts; they are: Change Data Distribution; Quantile Transforms; Sonar Dataset; Normal Quantile Transform; Uniform Quantile Transform. Change Data Distribution: Many machine learning algorithms perform better when the distribution of variables is Gaussian.

This quantile transform is available in the scikit-learn Python machine learning library via the QuantileTransformer class.

pyplot.show()
# reshape data to have rows and columns
data = data.reshape((len(data), 1))
# quantile transform the raw data
quantile = QuantileTransformer(output_distribution='normal')
data_trans = quantile.fit_transform(data)

Tutorials; Dataset; APIs; Articles; Summary. In this tutorial, you…

1 week, 1 day ago @ machinelearningmastery.com
How to Use Power Transforms With scikit-learn

Power transforms like the Box-Cox transform and the Yeo-Johnson transform provide an automatic way of performing these transforms on your data and are provided in the scikit-learn Python machine learning library.

We can apply a power transform directly by calculating the log or square root of the variable, although this may or may not be the best power transform for a given variable.

data = data.reshape((len(data), 1))
# power transform the raw data
power = PowerTransformer(method='yeo-johnson', standardize=True)
data_trans = power.fit_transform(data)

# perform a box-cox transform of the dataset
scaler = MinMaxScaler(feature_range=(1, 2))
power = PowerTransformer(method='box-cox')
pipeline …
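The effect of the simplest power transform, the log, on a skewed variable can be checked directly. A small sketch of our own using sample moment skewness on a made-up right-skewed sample:

```python
import math

def skewness(xs):
    # sample (moment) skewness: E[(x - mean)^3] / sd^3
    n = len(xs)
    m = sum(xs) / n
    sd = (sum((x - m) ** 2 for x in xs) / n) ** 0.5
    return sum((x - m) ** 3 for x in xs) / (n * sd ** 3)

data = [1, 1, 2, 2, 3, 4, 5, 8, 20, 60]   # strongly right-skewed
logged = [math.log(x) for x in data]

print(skewness(data), skewness(logged))
```

Box-Cox generalizes this by searching over a family of powers (with log as the zero-power case) for the exponent that makes the result most Gaussian; note Box-Cox requires strictly positive inputs, which is why the excerpt shifts the data into the range (1, 2) first.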

1 week, 3 days ago @ machinelearningmastery.com
Statistical Imputation for Missing Values in Machine Learning

In this tutorial, you will discover how to use statistical imputation strategies for missing data in machine learning.

How to load a CSV with missing values, mark the missing values with NaN, and report the number and percentage of missing values for each column.

> 0, Missing: 1 (0.3%)
> 1, Missing: 0 (0.0%)
> 2, Missing: 0 (0.0%)
> 3, Missing: 60 (20.0%)
> 4, Missing: 24 (8.0%)
> 5, Missing: 58 (19.3%)
> 6, Missing: 56 (18.7%)
> 7, Missing: 69 (23.0%)
> 8, Missing: 47 (15.7%)
> 9, Missing: 32 (10.7%)
> 10, Missing: 55 (18.3%)
> 11, Missing: 44 (14.7%)
> 12, Missing: 56 (18.7%)
> 13, Missing: 104 (34.7%)
> 14, Missing: 106 (35.3%)
> 15, Missing: 247 (82.3%)
> 16, Missing: 102…
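The marking-and-counting step that produces a report like the one above can be sketched without pandas (the three-column rows below are made up):

```python
import math

rows = [
    [1.0, 2.0, None],
    [4.0, None, 6.0],
    [7.0, 8.0, None],
]

# mark missing fields as NaN, then count and report them per column
data = [[math.nan if v is None else v for v in row] for row in rows]
for col in range(len(data[0])):
    missing = sum(math.isnan(row[col]) for row in data)
    pct = 100.0 * missing / len(data)
    print(f"> {col}, Missing: {missing} ({pct:.1f}%)")
```

Statistical imputation then replaces each NaN with a column statistic (mean, median, or mode), which is what scikit-learn's SimpleImputer automates.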

1 week, 6 days ago @ machinelearningmastery.com
Linear Discriminant Analysis for Dimensionality Reduction in Python

Linear Discriminant Analysis, or LDA, is a linear machine learning algorithm used for multi-class classification.

# define the pipeline
steps = [('s', StandardScaler()), ('lda', LinearDiscriminantAnalysis()), ('m', GaussianNB())]
model = Pipeline(steps=steps)
Now that we are familiar with the LDA API, let’s look at a worked example.

from sklearn.pipeline import Pipeline …
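Assembled into a runnable example, the pipeline above might look like the following sketch; the synthetic make_classification data and the n_components value are illustrative assumptions, not necessarily the tutorial's setup.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# synthetic multi-class data standing in for the tutorial's dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_classes=5, random_state=7)

# scale, project to at most n_classes - 1 dimensions with LDA, then classify
steps = [('s', StandardScaler()),
         ('lda', LinearDiscriminantAnalysis(n_components=4)),
         ('m', GaussianNB())]
model = Pipeline(steps=steps)
scores = cross_val_score(model, X, y, cv=5)
print('Mean accuracy: %.3f' % scores.mean())
```

LDA can keep at most n_classes - 1 components, which is why n_components is 4 for this five-class problem.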

2 weeks, 1 day ago @ machinelearningmastery.com
Singular Value Decomposition for Dimensionality Reduction in Python

Perhaps the most popular technique for dimensionality reduction in machine learning is Singular Value Decomposition, or SVD for short.

Singular Value Decomposition, or SVD, might be the most popular technique for dimensionality reduction when data is sparse.

from sklearn.pipeline import Pipeline …
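For SVD-based reduction, scikit-learn provides TruncatedSVD; the sketch below pairs it with a classifier on synthetic data. The dataset and component count are illustrative assumptions, not the tutorial's exact listing.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# synthetic binary-classification data standing in for a real dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           random_state=7)

# project onto 10 singular vectors, then fit a linear classifier
steps = [('svd', TruncatedSVD(n_components=10, random_state=7)),
         ('m', LogisticRegression(max_iter=1000))]
model = Pipeline(steps=steps)
scores = cross_val_score(model, X, y, cv=5)
print('Mean accuracy: %.3f' % scores.mean())
```

Unlike PCA, TruncatedSVD does not center the data first, which is what makes it usable on sparse matrices.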

2 weeks, 3 days ago @ machinelearningmastery.com
Principal Component Analysis for Dimensionality Reduction in Python

For example:
data = ...
pca = PCA()
pca.fit(data)
transformed = pca.transform(data)


# define the pipeline
steps = [('norm', MinMaxScaler()), ('pca', PCA()), ('m', LogisticRegression())]
model = Pipeline(steps=steps)
Now that we are familiar with the API, let’s look at a worked example.

from sklearn.pipeline import Pipeline …
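A runnable version of the PCA pipeline above might look like this sketch; the synthetic dataset and the choice of 10 components are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# synthetic data standing in for the tutorial's dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           random_state=7)

# normalise, keep the first 10 principal components, then classify
steps = [('norm', MinMaxScaler()),
         ('pca', PCA(n_components=10)),
         ('m', LogisticRegression(max_iter=1000))]
model = Pipeline(steps=steps)
scores = cross_val_score(model, X, y, cv=5)
print('Mean accuracy: %.3f' % scores.mean())
```

Wrapping PCA in the pipeline ensures the projection is re-fit on each training fold, avoiding leakage into the held-out fold.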

2 weeks, 6 days ago @ machinelearningmastery.com
Introduction to Dimensionality Reduction for Machine Learning

In this post, you will discover a gentle introduction to dimensionality reduction for machine learning. After reading this post, you will know: Large numbers of input features can cause poor performance for machine learning algorithms.

Dimensionality reduction methods include feature selection, linear algebra methods, projection methods, and autoencoders.

Overview: This tutorial is divided into three parts: Problem With Many Input Variables; Dimensionality Reduction; and Techniques for Dimensionality Reduction (Feature Selection Methods, Linear Algebra Methods, Projection Methods, Autoencoder Methods, Tips for Dimensionality Reduction). Problem With Many Input Variables: The performance of machine lear…

3 weeks, 1 day ago @ machinelearningmastery.com
How to Develop a Gradient Boosting Machine Ensemble in Python

Tutorial Overview: This tutorial is divided into three parts: Gradient Boosting Algorithm; Gradient Boosting Scikit-Learn API (Gradient Boosting for Classification, Gradient Boosting for Regression); and Gradient Boosting Hyperparameters (Explore Number of Trees, Explore Number of Samples, Explore Number of Features, Explore Learning Rate, Explore Tree Depth). Gradient Boosting Machines Algorithm: Gradient boosting refers to a class of ensemble machine learning algorithms that can be used for classification or regression predictive modeling problems.

Gradient boosting is also known as gradient tree boosting, stochastic gradient boosting (an extension), and gradient boosting machines, or GBM for short.
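A minimal evaluation sketch for a gradient boosting classifier follows; the synthetic dataset and the hyperparameter values (which mirror the outline's knobs: trees, samples, features, learning rate, tree depth) are illustrative defaults, not tuned settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# synthetic binary-classification data for illustration
X, y = make_classification(n_samples=1000, n_features=20, random_state=5)

# the outline's knobs: n_estimators (trees), subsample (samples),
# max_features (features), learning_rate, and max_depth (tree depth)
model = GradientBoostingClassifier(n_estimators=100, subsample=1.0,
                                   max_features=None, learning_rate=0.1,
                                   max_depth=3, random_state=5)
scores = cross_val_score(model, X, y, cv=5)
print('Mean accuracy: %.3f' % scores.mean())
```

Setting subsample below 1.0 gives the stochastic gradient boosting variant mentioned above.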

3 weeks, 3 days ago @ machinelearningmastery.com
How to Develop an AdaBoost Ensemble in Python

After completing this tutorial, you will know: the AdaBoost ensemble is an ensemble created from decision trees added sequentially to the model; how to use the AdaBoost ensemble for classification and regression with scikit-learn.

Now that we are familiar with the AdaBoost algorithm, let’s look at how we can fit AdaBoost models in Python.

First, confirm that you are using a modern version of the library by running the following script:
# check scikit-learn version
import sklearn
print(sklearn.__version__)

Let’s take a look at how to develop an AdaBoost ensemble for both classification and regression.

In this case, we can see the Ad…
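A runnable classification sketch along these lines might look as follows; the synthetic data and default hyperparameters are illustrative, not the tutorial's exact listing.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# synthetic binary-classification data for illustration
X, y = make_classification(n_samples=1000, n_features=20, random_state=6)

# weak learners (decision stumps by default) are added sequentially,
# each one reweighting the examples its predecessors got wrong
model = AdaBoostClassifier(n_estimators=50, random_state=6)
scores = cross_val_score(model, X, y, cv=5)
print('Mean accuracy: %.3f' % scores.mean())
```

For regression, AdaBoostRegressor follows the same pattern with a regression base learner.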

3 weeks, 6 days ago @ machinelearningmastery.com
Difference Between Algorithm and Model in Machine Learning

Overview: This tutorial is divided into four parts: What Is an Algorithm in Machine Learning; What Is a Model in Machine Learning; Algorithm vs. Model Framework; and Machine Learning Is Automatic Programming. What Is an “Algorithm” in Machine Learning: An “algorithm” in machine learning is a procedure that is run on data to create a machine learning “model.” Machine learning algorithms perform “pattern recognition.” Algorithms “learn” from data, or are “fit” on a dataset.

Examples of machine learning algorithms: Linear Regression, Logistic Regression, Decision Tree, Artificial Neural Network, k-Nearest Neighbors, k-Means. You can think of a machine learning algorithm like any other algorithm in computer scie…
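The distinction can be made concrete in a few lines of scikit-learn; the toy data below is invented for illustration.

```python
from sklearn.linear_model import LinearRegression

# the "algorithm" is the fitting procedure (here, least squares);
# the "model" is what it produces: learned coefficients stored on
# the estimator after fit() is run on data
X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]

algorithm = LinearRegression()   # the algorithm, not yet fit to anything
model = algorithm.fit(X, y)      # running the algorithm yields the model
print(model.coef_, model.intercept_)  # learned "program": y = 2x
```

The fitted coefficients are the "automatic program" the tutorial describes: data in, prediction rule out.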

4 weeks, 1 day ago @ machinelearningmastery.com
Synced Review
last post 13 hours ago
Former Microsoft AI Head Harry Shum Joins Intelligent Local News Startup News Break as Chairman

13 hours ago @ medium.com
New LaTeX.CSS Library Enables Websites to Look Like LaTeX Docs

20 hours ago @ medium.com
NVIDIA’s GameGAN Uses AI to Recreate Pac-Man and Other Game Environments

1 day, 16 hours ago @ medium.com
Breakthrough Colourization Technique Enables Instance-Aware Treatment of Multiple Objects

1 day, 21 hours ago @ medium.com
ICLR 2020 | A Look at Three Interesting Papers on the Robustness of Neural Networks

4 days, 20 hours ago @ medium.com
Researchers Discover Near-Ideal Photon Sources in Silicon Quantum Photonics

5 days, 18 hours ago @ medium.com
Cross-domain Correspondence Learning for Exemplar-based Image Translation

6 days, 21 hours ago @ medium.com
Microsoft Build 2020 | World Top-5 Supercomputer With OpenAI, Microsoft Turing Models, Responsible…

1 week ago @ medium.com
ACL 2020 Announces Its 779 Accepted Papers

1 week ago @ medium.com
Google Introduces ‘Meta-Dataset’ Benchmark for Few-Shot Learning

1 week, 1 day ago @ medium.com
Facebook’s Highly Efficient New Real-Time Text-To-Speech System Runs on CPUs

1 week, 1 day ago @ medium.com
Bringing Old Photos Back to Life

1 week, 2 days ago @ medium.com
How AI Is Transforming Financial Reimbursement

1 week, 4 days ago @ medium.com
Self-Supervised ‘Plan2Explore’ RL Agent Achieves SOTA Zero-Shot and Adaptation Performance

1 week, 5 days ago @ medium.com
Google ‘Data Echoing’ Accelerates Neural Network Training

1 week, 6 days ago @ medium.com
💼 University and corporation labs
DeepMind
last post 1 week, 3 days ago
Using AI to predict retinal disease progression

The ‘dry’ form is relatively common among people over 65, and usually causes only mild sight loss.

Our contribution highlights the potential of using AI in preventative studies for diseases such as exAMD.

The Moorfields Eye Hospital AMD dataset: We used a dataset of anonymised retinal scans from Moorfields patients with exAMD in one eye, and at high risk of developing exAMD in their other eye.

To address this, we worked with retinal experts to review all scans for each eye and specify the scan when exAMD was first evident.

In our previous work, now continuing in collaboration with Google Health, we developed a model capable of segmenting these eye scans into thirteen anatomical categories.

1 week, 3 days ago @ deepmind.com
Specification gaming: the flip side of AI ingenuity

Specification gaming is a behaviour that satisfies the literal specification of an objective without achieving the intended outcome.

We have all had experiences with specification gaming, even if not by this name.

In this post, we review possible causes for specification gaming, share examples of where this happens in practice, and argue for further work on principled approaches to overcoming specification problems.

In a Lego stacking task, the desired outcome was for a red block to end up on top of a blue block.

The agent was rewarded for the height of the bottom face of the red block when it was not touching the block.

1 month, 1 week ago @ deepmind.com
Towards understanding glasses with graph neural networks

The practical implications of modelling glass: The glass transition is a ubiquitous phenomenon which manifests in more than just window (silica) glass.

Understanding the glass transition may result in other applications of disordered materials, in fields as diverse as biorenewable polymers and food processing.

Our new work, published in Nature Physics, could help us gain an understanding of the structural changes that may occur near the glass transition.

Leveraging graph neural networks to model glassy dynamics: Glasses can be modelled as particles interacting via a short-range repulsive potential which essentially prevents particles from getting too close to each other.

We then trained a neural n…

1 month, 3 weeks ago @ deepmind.com
Agent57: Outperforming the human Atari benchmark

Combining off-policy learning with memory is challenging because you need to know what you might remember when executing a different behaviour.

Within that strand, we distinguish two types of rewards: firstly, long-term novelty rewards encourage visiting many states throughout training, across many episodes.

Secondly, short-term novelty rewards encourage visiting many states over a short span of time (e.g., within a single episode of a game).

However, learning density models of high dimensional spaces is fraught with problems due to the curse of dimensionality.

For example, in Montezuma’s Revenge, unlike undirected exploration strategies, long-term novelty rewards allow the agent to surpass…

1 month, 4 weeks ago @ deepmind.com
A new model and dataset for long-range memory

Modelling natural language: Finding machine learning tasks which both drive the development of better memory architectures and push us further towards artificial general intelligence is challenging.

Transferring knowledge: Such samples would likely astound Shannon, 70 years on from his early language model experiments.

Google’s prominent natural language model, BERT, achieves state-of-the-art performance on a wide array of NLP benchmarks, and is now a part of Google Search.

Benchmarking language models: A popular long-range language model benchmark is WikiText-103, which comprises English-language Wikipedia articles and was developed by researchers at Salesforce AI.

As such, we’ve compiled…

3 months, 2 weeks ago @ deepmind.com
AlphaFold: Using AI for scientific discovery

In our study published today in Nature, we demonstrate how artificial intelligence research can drive and accelerate new scientific discoveries.

Our system, AlphaFold – described in peer-reviewed papers now published in Nature and PROTEINS – is the culmination of several years of work, and builds on decades of prior research using large genomic datasets to predict protein structure.

What is the protein folding problem?

What any given protein can do depends on its unique 3D structure.

Why is protein folding important?

4 months, 2 weeks ago @ deepmind.com
Dopamine and temporal difference learning: A fruitful relationship between neuroscience and AI

Meanwhile, in close contact with this study of reward learning in animals, computer scientists have developed algorithms for reinforcement learning in artificial systems.

A chain of prediction: temporal difference learning. Reinforcement learning is one of the oldest and most powerful ideas linking neuroscience and AI.

An important breakthrough in solving the problem of reward prediction was the temporal difference learning (TD) algorithm.
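The TD idea can be sketched in a few lines: each step's prediction error updates the previous state's value estimate. The toy three-state chain, learning rate, and episode count below are illustrative, not from the paper.

```python
# TD(0) value learning on a toy chain s0 -> s1 -> s2 (terminal),
# with reward 1.0 on reaching the terminal state
alpha, gamma = 0.1, 1.0
V = [0.0, 0.0, 0.0]  # value estimates; V[2] is terminal and stays 0

for _ in range(1000):  # episodes
    for s, s_next, r in [(0, 1, 0.0), (1, 2, 1.0)]:
        # prediction error (the "dopamine-like" signal):
        # delta = r + gamma * V(s') - V(s)
        delta = r + gamma * V[s_next] - V[s]
        V[s] += alpha * delta

print(V)  # V[0] and V[1] converge toward the true value 1.0
```

The prediction error delta is the quantity whose profile dopamine neuron firing was found to resemble.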

Around the same time, in the late 80s and early 90s, neuroscientists were struggling to understand the behaviour of dopamine neurons.

Distributional reinforcement learning

4 months, 2 weeks ago @ deepmind.com
Using WaveNet technology to reunite speech-impaired users with their original voices

This post details a recent project we undertook with Google and ALS campaigner Tim Shaw, as part of Google’s Euphonia project.

We demonstrate an early proof of concept of how text-to-speech technologies can synthesise a high-quality, natural sounding voice using minimal recorded speech data.

But message banking lacks flexibility, resulting in a static dataset of phrases.

Now imagine that you were given the chance to preserve your voice by recording as much of it as possible.

And people who aren’t able to record phrases in time are left to choose a generic computer synthesized voice that lacks the same power of connection as their own.

5 months, 1 week ago @ deepmind.com
Learning human objectives by evaluating hypothetical behaviours

TL;DR: We present a method for training reinforcement learning agents from human feedback in the presence of unknown unsafe states.

Training RL agents in the presence of unsafe states is known as the safe exploration problem.

The agent has one source of information: feedback about unsafe states from a human user.

Existing methods for training agents from human feedback ask the user to evaluate data of the agent acting in the environment.

The user provides feedback on this hypothetical behaviour, and the system interactively learns a model of the user's reward function.

5 months, 2 weeks ago @ deepmind.com
From unlikely start-up to major scientific organisation: Entering our tenth year at DeepMind

Pioneering research, growing impact: A mission this ambitious requires pioneering research on many fronts over many years.

As our research matures, we’ve been finding more opportunities to partner with others for social and commercial impact, often with our colleagues across Alphabet.

Entering our next phase: As I discussed with Wired in the summer, this year feels like the start of a new phase for DeepMind as an established scientific organisation.

Over the past year, we’ve also been formalising a leadership team with the seasoned experience and skills for our second decade.

Right back to our origins blending neuroscience with machine learning, we’ve found that breakthroughs happen faster when…

5 months, 3 weeks ago @ deepmind.com
Strengthening the AI community

For me, it was being awarded an internship at Intel, the first one ever through Purdue’s Co-Op Engineering program in 1990.

I just didn’t know if I had the right technical skills for the work, or if engineering was really my path.

It grew into a very successful 18-year career at Intel and a 25-year career in tech.

At DeepMind we want to build advanced AI to expand our knowledge and find answers to some of the fundamental questions facing society.

DeepMind Scholarships to open the field of AI: The DeepMind scholarship programme is one way we seek to broaden participation in science and AI.

6 months, 1 week ago @ deepmind.com
Advanced machine learning helps Play Store users discover personalised apps

Candidate generator unbiasing: Our model (called a candidate generator) learns what apps a user is more likely to install based on previous apps they’ve installed from the Play store.

The model therefore learns a bias that favours the apps that are shown – and thus installed – more often.

An importance weight is based on the impression-to-install rate of each individual app in comparison with the median impression-to-install rate across the Play store.

Through importance weighting, our candidate generator can downweight or upweight apps based on their install rates, which mitigates the recommendation bias problem.
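One plausible reading of that weighting is a simple ratio against the store-wide median rate; the app names, the rates, and the exact formula below are assumptions for illustration, not Google's implementation.

```python
# Illustrative importance weighting: compare each app's
# impression-to-install rate with the median rate. All numbers
# and the median/rate formula are invented for illustration.
install_rates = {'app_a': 0.20, 'app_b': 0.05, 'app_c': 0.10}

rates = sorted(install_rates.values())
median = rates[len(rates) // 2]  # 0.10 here

# apps converting above the median are downweighted, below it upweighted
weights = {app: median / rate for app, rate in install_rates.items()}
print(weights)
```

Scaling each app's training examples by such a weight is what lets the candidate generator counteract the show-more-install-more feedback loop.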

Our solution to this, the reranker model, learns the relative importance of a p…

6 months, 1 week ago @ deepmind.com
AlphaStar: Grandmaster level in StarCraft II using multi-agent reinforcement learning

Since then, we have taken on a much greater challenge: playing the full game at a Grandmaster level under professionally approved conditions.

AlphaStar can now play in one-on-one matches as and against Protoss, Terran, and Zerg – the three races present in StarCraft II.

Each of the Protoss, Terran, and Zerg agents is a single neural network.

We chose to use general-purpose machine learning techniques – including neural networks, self-play via reinforcement learning, multi-agent learning, and imitation learning – to learn directly from game data.

Using the advances described in our Nature paper, AlphaStar was ranked above 99.8% of active players on Battle.net…

7 months ago @ deepmind.com
Causal Bayesian Networks: A flexible tool to enable fairer machine learning

This simplified example shows how CBNs can provide us with a visual framework for describing different possible unfairness scenarios.

It is nevertheless necessary to avoid pitfalls when evaluating or designing a decision system.

This means that it would be possible for the system to be deemed fair, even if it carries the unfair influence: this would automatically be the case for an error-free decision system.

On the other hand, if the path G→D→A was considered fair, it would be inappropriate to use statistical parity.

Path-specific techniques enable us to estimate the influence that a sensitive attribute has on other variables along specific sets of causal paths.

7 months, 4 weeks ago @ deepmind.com
DeepMind’s health team joins Google Health

Today, with our healthcare partners, the team is excited to officially join the Google Health family.

It’s remarkable that many frontline clinicians, even in the world’s most advanced hospitals, are still reliant on clunky desktop systems and pagers that make delivering fast and safe patient care challenging.

That’s why I joined DeepMind, and why I will continue this work with Google Health.

We’ve already seen how our mobile medical assistant for clinicians is helping patients and the clinicians looking after them, and we are looking forward to continuing our partnerships with The Royal Free London NHS Foundation Trust, Imperial College Healthcare NHS Trust and Taunton and Somerset NHS…

8 months, 1 week ago @ deepmind.com
Google
last post 19 hours ago
How Kaggle solved a spam problem in 8 days using AutoML

Using AutoML Natural Language on Google Cloud, Kaggle was able to train, test, and deploy a spam detection model to production in just eight days.

In this post, we’ll detail our success story about using machine learning to rapidly solve an urgent business dilemma.

A spam dilemma: Malicious users were suddenly creating large numbers of Kaggle accounts in order to leave spammy search engine optimization (SEO) content in the user bio section.

As a result of our topical data-science focus, a user bio that seems harmless in isolation may be the work of a spammer.

On a whim, we decided to pass our bio problem through the AutoML Natural Language Classification API.

19 hours ago @ cloud.google.com
Federated Analytics: Collaborative Data Science without Data Collection

An illustration of the secure aggregation protocol, from the federated learning comic book.

19 hours ago @ ai.googleblog.com
Evaluating Natural Language Generation with BLEURT

Three candidate sentences rated by BLEURT.

BLEURT captures that candidate 2 is similar to the reference, even though it contains more non-reference words than candidate 3.

1 day, 15 hours ago @ ai.googleblog.com
Open-Sourcing BiT: Exploring Large-Scale Pre-training for Computer Vision

Left: In order to make effective use of a larger dataset for pre-training, one needs to increase model capacity.

The red arrows exemplify this: small architectures (smaller point) become worse when pre-trained on the larger ImageNet-21k, whereas the larger architectures (larger points) improve.

Right: Pre-training on a larger dataset alone does not necessarily result in improved performance, e.g., when going from ILSVRC-2012 to the relatively larger ImageNet-21k.

However, by also increasing the computational budget and training for longer, the performance improvement is pronounced.

6 days, 19 hours ago @ ai.googleblog.com
Announcing the 7th Fine-Grained Visual Categorization Workshop

The Svampeatlas app for mushroom recognition is a result of a Danish-Czech collaboration spun out of the FGVC 2018 Fungi challenge.

The underlying model is now published on TF Hub.

Images used with permission of the Danish Mycological Society.

1 week ago @ ai.googleblog.com
Audiobahn: Use this AI pipeline to categorize audio content–fast

Step 1: Upload the audio content. The first step of the solution involves uploading an audio file to Cloud Storage, our object storage product.

Specifically, we’ll enable an object finalize notification to be sent whenever a new object is added to a specific Cloud Storage bucket.

The Cloud Function then publishes the job ID and name of the audio file to Cloud PubSub.

The Cloud Function then stores this output in Cloud Storage in a new object to be read later.

To help with this challenge, the Cloud Function analyzes the text to generate predictions about the content.

1 week, 2 days ago @ cloud.google.com
How AI could predict sight-threatening eye conditions

How AI could predict the development of wet AMD: In collaboration with colleagues at DeepMind and Moorfields Eye Hospital NHS Foundation Trust, we’ve developed an artificial intelligence (AI) model that has the potential to predict whether a patient will develop wet AMD within six months.

These patients had been diagnosed with wet AMD in one of their eyes, and were attending one of seven clinical sites for regular OCT imaging and treatment.

For each patient, our researchers worked with retinal experts to review all prior scans for each eye and determine the scan when wet AMD was first evident.

Our system performed as well as, and in certain cases better than, these clinicians in predicting we…

1 week, 2 days ago @ blog.google
Enabling E-Textile Microinteractions: Gestures and Light through Helical Structures

A Helical Sensing Matrix based on a 4×4 braid (8 conductive threads spiraled around the core).

Magenta/cyan are conductive yarns, used as receive/transmit lines.

Grey are passive yarns (cotton).

Flattened matrix, illustrating the 4×4 matrices (colored circles 0-F) that repeat indefinitely along the length of the cord.

Yellow are fiber optic lines, which provide visual feedback.

1 week, 5 days ago @ ai.googleblog.com
AdLingo Ads Builder turns an ad into a conversation

They taught me that, though advertising is important, personal relationships are the best way to get new customers and grow your business.

People couldn’t ask questions or get personalized information from an ad, so we saw an opportunity to turn an ad into a two-way conversation.

In 2018 we launched AdLingo Ads for brands that leverage the Google Display & Video 360 buying platform.

They can turn their ads, shown on the Google Partner Inventory, into an AI-powered conversation with potential customers.

If customers are interested in the product promoted in the ad, they can ask questions to get more information.

1 week, 6 days ago @ blog.google
Google Cloud and NVIDIA’s enhanced partnership accelerates computing workloads

Organizations use NVIDIA GPUs on Google Cloud to accelerate machine learning training and inference, analytics, and other high performance computing (HPC) workloads.

To continue to help you meet your goals, we’re excited to announce forthcoming support for the new NVIDIA Ampere architecture and the NVIDIA A100 Tensor Core GPU.

The Google Cloud, NVIDIA, and TensorFlow teams are partnering to provide built-in support for this new software in all TensorFlow Enterprise versions, so TensorFlow users on Google Cloud can use the new hardware without changing any code or upgrading their TensorFlow versions.

Avaya makes customer connections with Google Cloud and NVIDIA: Avaya, a leading global provide…

1 week, 6 days ago @ cloud.google.com
Announcing Meta-Dataset: A Dataset of Datasets for Few-Shot Learning

Test tasks from mini-ImageNet.

Each task is a classification problem between previously unseen (test) classes.

The model can use the support set of a few labeled examples of the new classes to adapt to the task at hand and then predicts labels for the query examples of these new classes.

The evaluation metric is the query set accuracy, averaged over examples within each task and across tasks.
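The metric described above amounts to a macro average; a minimal sketch (hypothetical function and data, not from the benchmark code):

```python
def benchmark_accuracy(tasks):
    """Query-set accuracy averaged over examples within each task,
    then macro-averaged across tasks, as described above."""
    per_task = []
    for predictions, labels in tasks:
        correct = sum(p == y for p, y in zip(predictions, labels))
        per_task.append(correct / len(labels))
    return sum(per_task) / len(per_task)

# Two toy test tasks with different query-set sizes:
tasks = [
    (["dog", "cat"], ["dog", "dog"]),              # 1/2 correct
    (["a", "b", "c", "a"], ["a", "b", "c", "c"]),  # 3/4 correct
]
print(benchmark_accuracy(tasks))  # 0.625
```

Averaging within each task first keeps a large task from dominating the overall score.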

2 weeks ago @ ai.googleblog.com
Speeding Up Neural Network Training with Data Echoing

Data echoing can reduce training time for ResNet-50 on ImageNet.

In this experiment, reading a batch of training data from cloud storage took 6 times longer than the code that used each batch of data to perform a training step.

The Echoing factor in the legend refers to the number of times each data item was repeated.

Dashed lines indicate the expected values if repeated examples were as useful as fresh examples and there was no overhead from echoing.
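The echoing factor can be sketched as a simple generator that re-yields each fetched batch (an illustrative simplification; in practice echoing can be inserted at several points in the input pipeline):

```python
def echo_batches(batch_iter, echo_factor):
    """Data echoing: yield each upstream batch `echo_factor` times so the
    accelerator keeps taking training steps while the slow input pipeline
    (e.g., reads from cloud storage) catches up."""
    for batch in batch_iter:
        for _ in range(echo_factor):
            yield batch

# Two fetched batches become six training steps at echoing factor 3:
steps = list(echo_batches(iter([["x1"], ["x2"]]), echo_factor=3))
print(len(steps))  # 6
```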

2 weeks, 1 day ago @ ai.googleblog.com
Meet the Googlers working to ensure tech is for everyone

I then joined Facebook as a privacy manager for a period of time, and that’s when I started working on more ML fairness-related matters.

Do you think ML has the potential to help complement human decision making, and drive the world to become more fair?

I don’t think ML can make the world more fair: Only humans can do that.

Tiffany: I think ML will be incredibly important to help with things like climate change, sustainability and helping save endangered animals.

Timnit’s work on using AI to help identify diseased cassava plants is an incredible use of AI, especially in the developing world.

2 weeks, 1 day ago @ blog.google
Get to know Cloud TPUs

At the same time, the things we’re trying to accomplish with machine learning keep getting more complex and compute intensive.

This need for more specific computational power is what led to the creation of the first tensor processing unit (TPU).

Google purpose-built the TPU to tackle our own machine learning workloads, and now they’re available to Google Cloud customers.

In this two-part video series, I look at the origins of the custom AI chip and what makes the TPU so specialized for your AI challenges.

You’ll learn why Google created an application specific integrated circuit (ASIC) for AI workloads like Translate, Photos, and even Search.

2 weeks, 5 days ago @ cloud.google.com
Go hands-on with interactive AI visualizations

Artificial Intelligence systems can recognize our voices, forecast the weather and help decide who gets a loan.

Given the increasing ubiquity of AI, it’s important that everyone is able to understand more about it.

Like any system or technology, AI doesn’t always get it right.

And understanding why AI systems break is often not easy for people who aren't experts in the field; research results are shared in dense papers filled with formulas.

2 weeks, 6 days ago @ blog.google
OpenAI
last post 3 weeks, 1 day ago
AI and Efficiency

Other measures of AI progress: In addition to efficiency, many other measures shed light on overall algorithmic progress in AI.

Shufflenet achieved AlexNet-level performance with an 18x inference efficiency increase in 5 years (15-month doubling time), which suggests that training efficiency and inference efficiency might improve at similar rates.
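The quoted doubling time follows directly from the 18x figure:

```python
import math

# An 18x inference-efficiency gain over 5 years implies a doubling time of
# 60 months / log2(18) ≈ 14.4 months, consistent with the ~15 months quoted.
months = 5 * 12
doublings = math.log2(18)  # ≈ 4.17 doublings
print(months / doublings)  # ≈ 14.4
```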

This efficiency analysis suggests that policymakers could develop accurate intuitions about the cost of deploying AI capabilities—and how these costs will change over time—by more closely assessing the rate of improvements in efficiency for AI systems.

Our results suggest that for AI tasks with high levels of investment (researcher time and/o…

3 weeks, 1 day ago @ openai.com
Jukebox

Curated samples Provided with genre, artist, and lyrics as input, Jukebox outputs a new music sample produced from scratch.

We can then train a model to generate audio in this compressed space, and upsample back to the raw audio space.

Now in raw audio, our models must learn to tackle high diversity as well as very long range structure, and the raw audio domain is particularly unforgiving of errors in short, medium, or long term timing.

To better understand future implications for the music community, we shared Jukebox with an initial set of 10 musicians from various genres to discuss their feedback on this work.

While Jukebox is an interesting research result, these musicians did not find …

3 weeks, 6 days ago @ openai.com
Improving Verifiability in AI Development

Can I (as an academic) conduct impartial research on the risks associated with large-scale AI systems when I lack the computing resources of industry?

Can I (as an AI developer) verify that my competitors in a given area of AI development will follow best practices rather than cut corners to gain an advantage?

AI developers should pilot bias and safety bounties for AI systems to strengthen incentives and processes for broad-based scrutiny of AI systems.

Standard setting bodies should work with academia and industry to develop audit trail requirements for safety-critical applications of AI systems.

Organizations developing AI and funding bodies should support research into the interpretabili…

1 month, 1 week ago @ openai.com
OpenAI Microscope

We’re introducing OpenAI Microscope, a collection of visualizations of every significant layer and neuron of eight vision “model organisms” which are often studied in interpretability.

Microscope makes it easier to analyze the features that form inside these neural networks, and we hope it will help the research community as we move towards understanding these complicated systems.

This is the goal of the OpenAI Microscope.

Microscope systematically visualizes every neuron in several commonly studied vision models, and makes all of those neurons linkable.

Our initial release includes nine frequently studied vision models, along with several visualization techniques we’ve found particularly u…

1 month, 1 week ago @ openai.com
OpenAI → PyTorch

We are standardizing OpenAI’s deep learning framework on PyTorch.

The main reason we've chosen PyTorch is to increase our research productivity at scale on GPUs.

It is very easy to try and execute new research ideas in PyTorch; for example, switching to PyTorch decreased our iteration time on research ideas in generative modeling from weeks to days.

Going forward we'll primarily use PyTorch as our deep learning framework but sometimes use other ones when there's a specific technical reason to do so.

Many of our teams have already made the switch, and we look forward to contributing to the PyTorch community in upcoming months.

3 months, 4 weeks ago @ openai.com
OpenAI Five

You play against [OpenAI Five] and you realize it has a playstyle that is different.

It’s doing things that you’ve never done and you’ve never seen.

One key learning that we took is how it was allocating resources.

It’s just allocating resources as efficiently as possible.

[…] If OpenAI does that dynamic switch at 100%, we maybe went from 5% to 10%?

5 months, 2 weeks ago @ openai.com
Deep Double Descent

Many classes of modern deep learning models, including CNNs, ResNets, and transformers, exhibit the previously-observed double descent phenomenon when not using early stopping or regularization.

The model-wise double descent phenomenon can lead to a regime where training on more data hurts.

The double descent phenomenon is most prominent in settings with added label noise; without it, the peak is smaller and easy to miss.

For a given number of optimization steps (fixed y-coordinate), test and train error exhibit model-size double descent.

We leave fully understanding the mechanisms behind double descent in deep neural networks as an important open question.

5 months, 3 weeks ago @ openai.com
Procgen Benchmark

We’re releasing Procgen Benchmark, 16 simple-to-use procedurally-generated environments which provide a direct measure of how quickly a reinforcement learning agent learns generalizable skills.

To fulfill this need, we have created Procgen Benchmark.

CoinRun now serves as the inaugural environment in Procgen Benchmark, contributing its diversity to a greater whole.

With Procgen Benchmark, we strive for all of the following: experimental convenience, high diversity within environments, and high diversity across environments.

We've now expanded on those results, conducting our most thorough study of RL generalization to date using all 16 environments in Procgen Benchmark.

5 months, 3 weeks ago @ openai.com
Safety Gym

We're releasing Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while training.

Safety Gym: To study constrained RL for safe exploration, we developed a new set of environments and tools called Safety Gym.

Benchmark: To help make Safety Gym useful out of the box, we evaluated some standard RL and constrained RL algorithms on the Safety Gym benchmark suite: PPO, TRPO, Lagrangian-penalized versions of PPO and TRPO, and Constrained Policy Optimization (CPO).

There are three things we are most interested in at the moment: improving performance on the current Safety Gym environments.

We also hope that systems l…

6 months, 1 week ago @ openai.com
GPT-2: 1.5B Release

As the final model release of GPT-2’s staged release, we’re releasing the largest version (1.5B parameters) of GPT-2 along with code and model weights to facilitate detection of outputs of GPT-2 models.

Our partners at Cornell University surveyed people to assign GPT-2 text a credibility score across model sizes.

People gave the 1.5B model a “credibility score” of 6.91 out of 10.

These results make us more inclined to release the 1.5B model, as the incremental increase in human-perceived credibility relative to 774M seems low.

We acknowledge that we cannot be aware of all threats, and that motivated actors can replicate language models without model release.

6 months, 3 weeks ago @ openai.com
Solving Rubik’s Cube with a Robot Hand

We've trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand.

Since May 2017, we've been trying to train a human-like robotic hand to solve the Rubik’s Cube.

Solving a Rubik’s Cube one-handed is a challenging task even for humans, and it takes children several years to gain the dexterity required to master it.

To test the limits of our method, we experiment with a variety of perturbations while the hand is solving the Rubik’s Cube.

Behind the scenes: Rubik’s Cube prototypes In order to benchmark our progress and make the problem tractable, we built and designed custom versions of cubes as stepping stones towards ultimately solving a regular Rubik’s Cube.

7 months, 2 weeks ago @ openai.com
OpenAI Scholars Spring 2020

The second class of Scholars recently released their projects and presented their work at the 2019 Scholars Demo Day.

While we hope that some of the scholars will join OpenAI, we want this program to improve diversity in the field at large.

For Bay Area participants, we offer an optional desk at the OpenAI office (which our past Scholars have found very valuable).

We look for people who are comfortable writing software (2+ years in software engineering), but no previous machine learning experience is required.

We ask all Scholars to document their experiences studying deep learning to hopefully inspire others to join the field too.

7 months, 2 weeks ago @ openai.com
Fine-Tuning GPT-2 from Human Preferences

We’ve fine-tuned the 774M parameter GPT-2 language model using human feedback for various tasks, successfully matching the preferences of the external human labelers, though those preferences did not always match our own.

Fine-tuning for the stylistic continuation tasks is sample efficient: 5,000 human samples suffice for strong performance according to humans.

However, when combining supervised fine-tuning with human fine-tuning, our models outperform lead-3 on ROUGE scores.

The cost of human data means that volume will always be low, so it is easy to retrain from scratch (or rather, from the GPT-2 starting point) each time.

Looking forwardWe’ve demonstrated reward learning from human pref…

8 months, 1 week ago @ openai.com
Emergent Tool Use from Multi-Agent Interaction

Through training in our new simulated hide-and-seek environment, agents build a series of six distinct strategies and counterstrategies, some of which we did not know our environment supported.

The self-supervised emergent complexity in this simple environment further suggests that multi-agent co-adaptation may one day produce extremely complex and intelligent behavior.

In this full environment, agents go through two more phases of emergent strategy than in the previous simple environment.

Multi-agent competition vs. intrinsic motivationIn this work we show evidence that agents learn complex strategies and counterstrategies through a self-supervised autocurriculum in hide-and-seek.

Though t…

8 months, 1 week ago @ openai.com
Testing Robustness Against Unforeseen Adversaries

Our method yields a new metric, UAR (Unforeseen Attack Robustness), which evaluates the robustness of a single model against an unanticipated attack, and highlights the need to measure performance across a more diverse range of unforeseen attacks.

The field has made progress in hardening models against such attacks; however, robustness against one type of distortion often does not transfer to robustness against attacks unforeseen by designers of the model.

It also yields a new metric, UAR, which assesses the adversarial robustness of models against unforeseen distortion types.
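A hedged sketch of how a UAR-style score could be computed (an assumption pieced together from the description here, not the paper's exact formula; the accuracies are hypothetical percentages at several distortion sizes):

```python
def uar(model_accuracies, reference_accuracies):
    """Toy UAR-style score: the model's accuracy against an attack, summed
    over distortion sizes and normalized by an attack-aware defense, so a
    score near 100 means robustness comparable to that defense."""
    return 100 * sum(model_accuracies) / sum(reference_accuracies)

print(uar([80, 60, 40], [80, 60, 40]))  # 100.0: matches the attack-aware defense
print(uar([40, 30, 20], [80, 60, 40]))  # 50.0: half as robust
```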

A UAR score near 100 against an unforeseen adversarial attack implies performance comparable to a defense with prio…

9 months, 1 week ago @ openai.com
Microsoft
last post 16 hours ago
Making it easier to stay caught up with Cortana in Microsoft 365

We’re featuring updates available starting today with Cortana, your personal productivity assistant in Microsoft 365, to make it easier to get time back on your busy schedule and focus on what matters.

Stay on track with Cortana in Windows 10: To help you save time finding what you need and stay focused, we’re releasing a new chat-based Cortana experience in Windows 10 focused on enhancing your productivity.

Briefing is currently rolling out in First Release for Microsoft 365 Enterprise users with Exchange Online mailboxes in English.

As a personal productivity assistant that is a natural part of Microsoft 365, Cortana processes data safely and securely to fulfill your requests.

Try these ex…

16 hours ago @ microsoft.com
Old tools, new tricks: Improving the computational notebook experience for data scientists

Data scientists are interested in prototyping and tinkering, simultaneously coding, exploring data, and identifying patterns and relationships through visualizations.

Unlike software engineers, for whom the programming is the artifact, data scientists may be using coding as a means to answer bigger questions around understanding data, modeling data, and building models.

They provide a launching pad: Why does a technique that works for software engineers not work for data scientists?

We set our sights specifically on the computational notebook, a popular programming and data analysis platform among data scientists.

Using programming-by-example, data scientists can provide examples to the sys…

19 hours ago @ microsoft.com
Harvesting randomness, HAIbrid algorithms and safe AI with Dr. Siddhartha Sen

Today, he tells us how he’s using reinforcement learning and HAIbrid algorithms to tap the best of both human and machine intelligence and develop AI that’s minimally disruptive, synergistic with human solutions, and safe.

You know, I don’t think AI and humans are ever going to be held to the same standards.

And so, that’s kind of the inspiration behind this idea of a safeguard that adapts to the actual system.

Sid Sen: I think that AI solutions will always be held to a …

1 day, 2 hours ago @ microsoft.com
Meeting the challenges of today and tomorrow with Azure AI

Our customers are finding innovative ways to deliver crisis management solutions, drive cost-savings, redefine customer engagement, and accelerate decision-making.

1 week ago @ azure.microsoft.com
Quantum-safe cryptography: Securing today’s data against tomorrow’s computers webinar

Microsoft Research Webinar Series. Quantum-safe cryptography: Securing today’s data against tomorrow’s computers. As the world prepares for the advent of the quantum computer, the security community must also prepare to defend against it.

Most of the cryptography currently in use succumbs to quantum attacks.

Although quantum computers are still a decade or so away from becoming a reality, existing communications are already at risk: Encrypted data can be recorded today and decrypted with the help of a quantum computer in the future.

In this webinar, Principal Program Manager Christian Paquin, a cryptography specialist in the Security and Cryptography group at Microsoft Research, will present re…

1 week ago @ note.microsoft.com
Fairness and interpretability in AI: Putting people first

Wallach and Wortman Vaughan each co-chair an AI, Ethics, and Effects in Engineering and Research (Aether) working group—Wallach’s group is focused on fairness, Wortman Vaughan’s on interpretability.

Their two most recent publications in the space address the AI challenges of fairness and interpretability through the lens of one particular group of people involved in the life cycle of AI systems: those developing them.

The uncertainty inherent in the world is baked into any AI systems we build, whether it’s explicit or not, she thought.

She wondered to the point of obsession, how well do people really understand the predictions coming out of AI systems?

For Wallach and Wortman Vaughan, being…

1 week, 1 day ago @ microsoft.com
Research Collection: Tools and Data to Advance the State of the Art

(Visit this collection to learn about the work Microsoft researchers are doing to advance responsible AI.)

Here, we’ve curated a selection of the work Microsoft researchers are doing to advance the state of the art in tools and data research.

Microsoft researchers and their collaborators have published tens of thousands of peer-reviewed papers since Microsoft Research’s founding in 1991.

Historically, the most interpretable machine learning models were not very accurate, and the most accurate models were not very interpretable.

SandDance showcases Microsoft Research data visualization innovations and novel natural user interaction techniques and makes them available where you need them.

1 week, 1 day ago @ microsoft.com
ZeRO-2 & DeepSpeed: Shattering Barriers of Deep Learning Speed & Scale

Altogether, the memory savings empower DeepSpeed to improve the scale and speed of deep learning training by an order of magnitude.

ZeRO-2: Training models with 100 billion parameters up to 10x faster. The Zero Redundancy Optimizer (abbreviated ZeRO) is a novel memory optimization technology for large-scale distributed deep learning.

With ZeRO-2, a 100-billion-parameter model can be trained 10x faster than with the state-of-art technology based on model parallelism alone.
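The savings come from partitioning gradients and optimizer states across data-parallel GPUs while parameters stay replicated; a rough per-GPU estimate, using the mixed-precision Adam byte counts from the ZeRO paper as an assumption (the function and figures below are illustrative, not DeepSpeed code):

```python
def zero2_memory_gb(params, gpus):
    """Rough per-GPU model-state memory under ZeRO-2 with mixed-precision
    Adam: fp16 parameters replicated (2 bytes/param); fp16 gradients
    (2 bytes) and fp32 optimizer states (12 bytes) partitioned across GPUs."""
    replicated = 2 * params
    partitioned = (2 + 12) * params / gpus
    return (replicated + partitioned) / 1e9

# A 10B-parameter model on 64 GPUs: ~22 GB of model state per GPU under
# ZeRO-2, versus 160 GB if all 16 bytes/param were replicated everywhere.
print(zero2_memory_gb(10e9, 64))  # 22.1875
```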

Here we use a state-of-the-art model parallelism approach, NVIDIA Megatron-LM, as baseline-MP, while ZeRO-2 and ZeRO-1 both combine ZeRO-powered data parallelism with Megatron-LM model parallelism.

We have recently focused o…

1 week, 1 day ago @ microsoft.com
Microsoft announces new supercomputer, lays out vision for future AI work

Built in collaboration with and exclusively for OpenAI, the supercomputer hosted in Azure was designed specifically to train that company’s AI models.

As part of a companywide AI at Scale initiative, Microsoft has developed its own family of large AI models, the Microsoft Turing models, which it has used to improve many different language understanding tasks across Bing, Office, Dynamics and other productivity products.

Earlier this year, it also released to researchers the largest publicly available AI language model in the world, the Microsoft Turing model for natural language generation.

The goal, Microsoft says, is to make its large AI models, training optimization tools and supercomput…

1 week, 1 day ago @ blogs.microsoft.com
Microsoft responsible machine learning capabilities build trust in AI systems, developers say

Model interpretability helps take the mystery out of machine learning, which in turn can build confidence and trust in model predictions, noted Engberg.

Microsoft built Azure Machine Learning to enable developers across the spectrum of data science expertise to build and deploy AI systems.

To navigate these hurdles, Microsoft today announced innovations in responsible machine learning that can help developers understand, protect and control their models throughout the machine learning lifecycle.

These capabilities can be accessed through Azure Machine Learning and are also available in open source on GitHub.

Engberg and his data analytics and artificial intelligence team continue to build, …

1 week, 1 day ago @ blogs.microsoft.com
Objects are the secret key to revealing the world between vision and language

For example, computers could mimic this ability by searching the most similar images for a text query (or vice versa) and describing the content of an image using natural language.

Importantly, in Oscar we construct the representations of the object tags using their corresponding word embeddings from a pre-trained BERT.

The first finding is intra-class: with the aid of object tags, the distance of the same object between two modalities is substantially reduced.

This verifies the importance of object tags in alignment learning: it plays the role of anchor points in linking and regularizing the cross-modal feature learning.

Looking forwardOscar has demonstrated the power of using objects as a…

1 week, 5 days ago @ microsoft.com
Diving into Deep InfoMax with Dr. Devon Hjelm

Host: So, you are a senior researcher who’s deep into deep learning at the MSR Lab in Montreal.

Devon Hjelm: So, the team that I’m part of, we’re kind of like a deep learning camp, I guess.

You did a postdoc under Yoshua Bengio who’s a bona fide Turing Award winner and one of the godfathers of deep learning.

I want to go back a little bit because you mentioned a “camp,” and if I understand that, it’s like people getting together and saying this is how I believe, this is my worldview of deep learning, as opposed to another worldview of deep learning.

(music plays)To learn more about Dr. Devon Hjelm, and the very latest in deep learning research, visit Microsoft.com/research

2 weeks, 1 day ago @ microsoft.com
Where’s my stuff? Developing AI with help from people who are blind or low vision to meet their needs

Microsoft AI for Accessibility is funding the ORBIT research project, which is enlisting the help of people who are blind or low vision to build a new dataset.

People who are blind or low vision can contribute to the project by providing videos of things found in their daily lives.

Smartphones are really useful in making visual information accessible to people who are blind or low vision.

In addition, there has been no effort to collect images of objects that may be particularly important to users who are blind or low vision.

Later in the year, there will be another opportunity to contribute to our project for users who are blind or low vision worldwide.

2 weeks, 1 day ago @ microsoft.com
Office Licensing Service and Azure Cosmos DB part 1: Migrating the production workload

This post is part 1 of a two-part series about how organizations use Azure Cosmos DB to meet real world needs, and the difference it’s making to them.

2 weeks, 2 days ago @ azure.microsoft.com
Post-quantum cryptography: Supersingular isogenies for beginners webinar

Microsoft Research Webinar Series. Post-quantum cryptography: Supersingular isogenies for beginners. A large-scale quantum computer would break the public key cryptography that is currently used to secure the internet.

In this webinar led by Microsoft researcher Dr. Craig Costello, you will examine why post-quantum cryptography is so critical as we move closer to realizing quantum computing, and you will learn the basics of supersingular isogeny Diffie-Hellman (SIDH), which is one of the popular candidates for post-quantum key exchange.

The best known classical and quantum algorithms for attacking the SIDH protocol have exponential runtimes, which is why SIDH has the lowest bandwidth requiremen…

2 weeks, 6 days ago @ note.microsoft.com
Facebook
latest post 3 weeks, 5 days ago
Facebook Research at ICASSP 2020

Facebook AI researchers are presenting their work virtually at the 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP) from May 4 to May 8, 2020.

These are just a few of the research papers that Facebook AI researchers are presenting at ICASSP this year.

Text metadata generation for weakly supervised training: To reduce the need for labeled training data, we developed a novel weakly supervised training approach that leverages text metadata surrounding public videos.

To help accelerate research, we’ve introduced Libri-Light, the largest open-source data set for speech recognition.

In this paper, our goal is to build a unified end-to-end speech recognition system …

3 weeks, 5 days ago @ ai.facebook.com
Fighting Abuse @Scale 2019 recap

Fighting abuse presents unique challenges for large-scale organizations working to keep the people on their platforms safe.

At Fighting Abuse @Scale 2019, engineers, data scientists, product managers, and operations specialists gathered in Menlo Park for a day of technical talks focused on state-of-the-art technologies to fight fraud, spam, and abuse on platforms that serve millions or even billions of people.

Our key insight is that sharing patterns can help hosting platforms identify abusive content, while hosting platforms can help sharing platforms prevent the spread of abusive content.

Results demonstrate that working together as an industry can strengthen the capacity to more quickly …

5 months, 2 weeks ago @ engineering.fb.com
CCSM: Scalable statistical anomaly detection to resolve app crashes faster

A contrast set mining algorithm: CSM provides a scalable, robust way to generate human-readable insights on high-dimensional crash data.

For a contrast set X and group G, the support S(X,G) is the percentage of vectors in group G for which the contrast set X is true.
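The support metric defined above can be sketched in a few lines. This is an illustrative toy, not Facebook's implementation: the dict-based encoding of a contrast set as feature=value conditions, and the sample crash fields, are hypothetical.

```python
# Support S(X, G) from the definition above: the percentage of vectors
# in group G for which the contrast set X (a conjunction of
# feature=value conditions, encoded here as a dict) is true.

def support(contrast_set, group):
    """contrast_set: {feature: value}; group: list of feature dicts."""
    if not group:
        return 0.0
    matches = sum(
        1 for vec in group
        if all(vec.get(f) == v for f, v in contrast_set.items())
    )
    return 100.0 * matches / len(group)

# Toy crash reports (hypothetical fields).
crashes = [
    {"os": "android", "app_version": "2.1"},
    {"os": "android", "app_version": "2.0"},
    {"os": "ios", "app_version": "2.1"},
    {"os": "android", "app_version": "2.1"},
]
print(support({"os": "android", "app_version": "2.1"}, crashes))  # 50.0
```

A mining algorithm would then compare support across groups (e.g., crashing vs. non-crashing sessions) to find sets whose support differs significantly.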

To efficiently traverse the search space of feature combinations, we cast the problem of mining contrast sets as a tree search problem.

However, real-world data is often mixed — our crash data contains a mix of categorical, discrete, and continuous data.

The continuous contrast mining algorithm adopts the same tree search framework, with modifications to reason about sets of continuous features.

6 months ago @ engineering.fb.com
Fast dimensional analysis for root cause analysis at scale

Nikolay Pavlovich Laptev, Fred Lin, Keyur Muzumdar, Mihai-Valentin Curelea. What the research is: A fast dimensional analysis (FDA) framework that automates root cause analysis on structured logs with improved scalability.

When a failure event happens in a large-scale distributed production environment, performing root cause analysis can be challenging.

Our proposed FDA framework combines structured logs from a number of sources and provides a meaningful combination of features.

As we’ve mentioned, the challenges of performing root cause analysis in a large-scale distributed production environment make outage detection and mitigation difficult.

Read the full paper: Fast Dimensional Analysis for Ro…

6 months, 3 weeks ago @ engineering.fb.com
2019 @Scale Conference recap

If you are interested in future events, visit the @Scale website or join the @Scale community.

@Scale 2019: Data Infra. Zanzibar: Google’s consistent, global authorization system. Ruoming Pang, Principal Software Engineer, Google. Determining whether online users are authorized to access digital objects is central to preserving privacy.

6 technical challenges in developing a distributed SQL database. Neha Deodhar, Software Engineer, YugaByte. Neha discusses the experience of developing YugaByte.

@Scale 2019: Security. Leveraging the type system to write secure applications. Shannon Zhu, Software Engineer, Facebook. Shannon discusses ways to extend the type system to eliminate entire classes of security vul…

7 months ago @ engineering.fb.com
Video @Scale 2019 recap

At Video @Scale 2019, engineers gathered in San Francisco for a day of technical talks focused on delivering video at scale.

Adopting video at scale. Steven Robertson, Engineer, YouTube. Steven works on streaming video performance at YouTube.

AV1 Panel. Ronald Bultje, Founder, Two Orioles; Yaowu Xu, Principal Software Engineer, Google; Chekib Nouira, Senior Video Systems Engineer, Intel. Panel moderated by Ioannis Katsavounidis.

Contextual video ad safety. Vijaya Chandra, Software Engineering Manager, Facebook; Rose Kanjirathinkal, Research Scientist, Facebook. Vijaya leads video understanding efforts at Facebook.

Video integrity at scale. Sonal Gandhi, Software Engineer, Facebook. Sonal talks about reducing har…

7 months, 1 week ago @ engineering.fb.com
Releasing a new benchmark and data set for evaluating neural code search models

The benchmark includes the largest evaluation data set currently available for Java, consisting of a natural language query and code snippet pairs.

This data set comprises 287 Stack Overflow question-and-answer pairs from the Stack Exchange Data Dump.

A score sheet on the evaluation data set, using two models from our recent work, is also included.

We intend for this data set to serve as a benchmark for evaluating search quality across a variety of code search models.

To evaluate the performance of these models, Stack Overflow questions and code answer pairs are prime candidates, as Stack Overflow questions effectively represent what a developer may ask.

7 months, 3 weeks ago @ ai.facebook.com
Hydra: A framework that simplifies development of complex applications

Hydra’s flexible approach to developing, creating, and maintaining code and configurations can help speed the development of complex applications in various fields, including machine learning research.

What it does: Hydra offers an innovative approach to composing an application’s configuration, allowing changes to a composition through configuration files as well as from the command line.

Hydra speeds development of such applications while reducing the chances of bugs, and it enables code to evolve more naturally in response to new requirements.

Why it matters: Hydra is already in use at Facebook to prototype complex research projects.

We expect to continue using the Hydra framework for buil…

7 months, 3 weeks ago @ engineering.fb.com
MaRS: How Facebook keeps maps current and accurate

To reduce the risk of bad edits, whether intentional (vandalism) or unintentional, we don’t update our local copy directly.

So we, like most consumers of OSM data, have an internal storage format (a local copy).

Current approaches to keeping OSM data updated primarily focus on tackling the two axes separately.

Freshness is achieved by simply consuming upstream changesets faster, or essentially rebasing the local copy with the upstream master on a regular cadence (e.g., daily or weekly).

Let V(Downstream) be the current downstream local copy version based on an earlier version of upstream.

8 months ago @ engineering.fb.com
Integrating autoconversion: Facebook’s path from Zawgyi to Unicode

Each of the requirements for the autoconversion — content encoding detection, device encoding detection, and conversion — had its own challenges.

Content encoding detection: To perform autoconversion, we first need to know the content encoding, that is, the encoding used when the text was first input.

We train a machine learning (ML) model on public Facebook content samples for which we already know the content encoding.

Device encoding detection: Next, we need to know which encoding was used by a person’s phone (i.e., the device encoding) to understand whether we need to perform a font encoding conversion.

There’s no single pipeline through which all possible Facebook content passes, which mak…

8 months ago @ engineering.fb.com
Register now for @Scale 2019!

Registration is officially open for @Scale 2019.

Topics for the @Scale 2019 talks include cloud native platforms for event streaming, advances in self-supervised learning and natural language processing, securing SSH traffic, deploying DNS privacy technologies at scale, and more.

To register for @Scale 2019, enter your invite code here.

Visit the @Scale Community page and message us with your name, company name, and email address.

If you’ve never been to an @Scale event, you can watch David Patterson of Google and Clément Farabet of NVIDIA open last year’s event, or see videos of all the talks in last year’s recap.

8 months, 1 week ago @ engineering.fb.com
Creating a data set and a challenge for deepfakes

Yet the industry doesn't have a great data set or benchmark for detecting them.

That's why Facebook is commissioning a realistic data set that will use paid actors, with the required consent obtained, to contribute to the challenge.

No Facebook user data will be used in this data set.

To ensure the quality of the data set and challenge parameters, they will initially be tested through a targeted technical working session this October at the International Conference on Computer Vision (ICCV).

The full data set release and the DFDC launch will happen at the Conference on Neural Information Processing Systems (NeurIPS) this December.

8 months, 3 weeks ago @ ai.facebook.com
New advances in natural language processing

Natural language understanding (NLU) and language translation are key to a range of important applications, including identifying and removing harmful content at scale and connecting people across different languages worldwide.

We’ve also introduced a new self-supervised pretraining approach, RoBERTa, that surpassed all existing NLU systems on several language comprehension tasks.

According to human evaluations, our models were ranked top in four translation tasks: from English to German, German to English, English to Russian, and Russian to English.

SuperGLUE follows in the footsteps of GLUE, which offers a single-number metric that summarizes progress on a diverse set of NLP tasks.

By cha…

9 months, 2 weeks ago @ ai.facebook.com
A new model for word embeddings that are resilient to misspellings

What the research is:A new model to learn word embeddings (words or phrases mapped to dense vectors of numbers that represent their meaning) that are resilient to misspellings.

To address this deficiency, we propose Misspelling Oblivious Embeddings (MOE), a new model that combines our open source library fastText with a supervised task that embeds misspellings close to their correct variants.

In addition to the semantic loss, MOE also considers an additional supervised loss that we call the spell correction loss.

The spell correction loss aims to embed misspellings close to their correct versions; MOE is trained by minimizing the weighted sum of the semantic loss and the spell correction loss.
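The weighted-sum objective can be sketched as follows. This is a minimal illustration, not fastText's actual training code: the squared-distance spell correction term, the weight `alpha`, and the toy vectors are all assumptions made for the example.

```python
# Toy sketch of MOE's combined objective: a weighted sum of a semantic
# loss and a spell correction loss that pulls a misspelling's embedding
# toward its correctly spelled word's embedding. (Illustrative only;
# the real losses come from fastText's training objective.)

def sq_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def moe_loss(semantic_loss, misspelled_vec, correct_vec, alpha=0.5):
    spell_correction_loss = sq_distance(misspelled_vec, correct_vec)
    return alpha * semantic_loss + (1 - alpha) * spell_correction_loss

# A misspelling embedded far from its correct word is penalized more.
far = moe_loss(1.0, [0.0, 0.0], [3.0, 4.0])   # spell term = 25.0
near = moe_loss(1.0, [2.9, 4.1], [3.0, 4.0])  # spell term ≈ 0.02
print(far > near)  # True
```

Minimizing this objective drives misspelling vectors toward their correct variants while the semantic term preserves the usual embedding quality.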

Our approach will improve…

9 months, 3 weeks ago @ ai.facebook.com
Michael F. Cohen awarded 2019 Steven A. Coons award

On July 29 at SIGGRAPH, Michael F. Cohen will receive the 2019 Steven A. Coons Award for Outstanding Creative Contributions to Computer Graphics.

The award is given to one individual every two years to honor outstanding lifetime contributions to computer graphics and interactive techniques.

Cohen joined Facebook in Fall 2015 as Director of Facebook’s Computational Photography Research team, which was formed to explore new ways to share photos and videos online.

I never became an engineer but rather first entered the field of computer graphics with intentions to continue studies related to civil engineering.

This is really a marriage of computer graphics and computer vision.

10 months ago @ research.fb.com
MIT AI
latest post 1 day, 18 hours ago
Undergraduates develop next-generation intelligence tools

One even carried on his experiments from his bedroom, after schlepping his Sphero Bolt robots home in a backpack.

“I’ve been so impressed by their resilience and dedication,” says Katherine Gallagher, one of three artificial intelligence engineers at MIT Quest for Intelligence who works with students each semester on intelligence-related applications.

The project involves training a deep neural network to pick out globules of fat on liver tissue slides to estimate the liver’s overall fat content.

One challenge, says Huang, has been figuring out how to handle variations in how various pathologists classify fat globules.

The final output will be a fat content estimate with pictures of highlig…

1 day, 18 hours ago @ news.mit.edu
Fireflies helps companies get more out of meetings

The startup Fireflies.ai is helping people get the most out of their meetings with a note-taking, information-organizing virtual assistant named Fred.

Fred transcribes every word of meetings and then uses artificial intelligence to help people sort and share that information later on.

After each meeting, Fireflies can automatically sync all this meeting data into apps from companies like Slack, Salesforce, and Hubspot.

“Fireflies is like a personal assistant that helps connect your systems of communication with your systems of record,” Udotong says.

The same thing is true today of audio and meeting data.

6 days, 20 hours ago @ news.mit.edu
Machine-learning tool could help develop tougher materials

For engineers developing new materials or protective coatings, there are billions of different possibilities to sort through.

The focus of this work was on predicting the way a material would break or fracture, by analyzing the propagation of cracks through the material’s molecular structure.

“One of the specialties of my lab is to use what we call molecular dynamics simulations, or basically atom-by-atom simulations” of such processes, Buehler says.

In this case, they were looking at a variety of composite, layered coatings made of crystalline materials.

So, this is a whole new way of simulating how materials fail.” How materials fail is crucial information for any engineering project, Bueh…

1 week ago @ news.mit.edu
Marshaling artificial intelligence in the fight against Covid-19

Artificial intelligence could play a decisive role in stopping the Covid-19 pandemic.

Early detection of sepsis in Covid-19 patients: Sepsis is a deadly complication of Covid-19, the disease caused by the new coronavirus SARS-CoV-2.

About 10 percent of Covid-19 patients get sick with sepsis within a week of showing symptoms, but only about half survive.

Finding better ways to treat Covid-19 patients on ventilators: Troubled breathing from acute respiratory distress syndrome is one of the complications that brings Covid-19 patients to the ICU.

There, life-saving machines help patients breathe by mechanically pumping oxygen into the lungs.

1 week, 1 day ago @ news.mit.edu
Visualizing the world beyond the frame

Their understanding of the world is colored, often literally, by the data they’ve trained on.

To give computer vision models a fuller, more imaginative view of the world, researchers have tried feeding them more varied images.

In both cases, the aim is to fill in the gaps of image datasets to better reflect the three-dimensional world and make face- and object-recognition models less biased.

GANs have caught the attention of intelligence researchers for their ability to extrapolate from data, and visualize the world in new and inventive ways.

“GANs are incredible, and can learn all kinds of things about the physical world, but they still can’t represent images in physically meaningful ways,…

3 weeks ago @ news.mit.edu
Study finds stronger links between automation and inequality

In other cases, forms of automation, from robots to phone-answering systems, have simply replaced factory workers, receptionists, and many other kinds of employees.

“Automation is critical for understanding inequality dynamics,” says MIT economist Daron Acemoglu, co-author of a newly published paper detailing the findings.

“A lot of the new job opportunities that technology brought from the 1960s to the 1980s benefitted low-skill workers,” Acemoglu adds.

Where automation occurs, lower-skill workers are not just failing to make gains; they are actively pushed backward financially.

That same displacement continues today, although, Acemoglu contends, the net negative consequences of technology…

3 weeks, 1 day ago @ news.mit.edu
Robots help some firms, even while workers across industries struggle

“We know firms are adopting robots in order to reduce their costs, so it is quite plausible that firms adopting robots early are going to expand at the expense of their competitors whose costs are not going down.

And yet, for firms adopting robots during that timespan, employee hours worked rose by 10.9 percent, and wages rose modestly as well.

A French robot censusTo conduct the study, the scholars examined 55,390 French manufacturing firms, of which 598 purchased robots during the period from 2010 to 2015.

The 598 firms that did purchase robots, while comprising just 1 percent of manufacturing firms, accounted for about 20 percent of manufacturing production during that five-year period.

3 weeks, 2 days ago @ news.mit.edu
How many jobs do robots really replace?

That increased use of robots in the workplace also lowered wages by roughly 0.4 percent during the same time period.

The paper, “Robots and Jobs: Evidence from U.S. Labor Markets,” appears in advance online form in the Journal of Political Economy.

In the U.S., four manufacturing industries account for 70 percent of robots: automakers (38 percent of robots in use), electronics (15 percent), the plastics and chemical industry (10 percent), and metals manufacturers (7 percent).

When robots are added to manufacturing plants, “The burden falls on the low-skill and especially middle-skill workers.

“It certainly won’t give any support to those who think robots are going to take all of our jobs,” …

3 weeks, 2 days ago @ news.mit.edu
A foolproof way to shrink deep learning models

As more artificial intelligence applications move to smartphones, deep learning models are getting smaller to allow apps to run faster and save battery power.

Now, MIT researchers have a new and better way to compress models.

Their revelation came as demand for computing power and energy to train ever larger deep learning models was increasing exponentially, a trend that continues to this day.

Big AI models eat up mobile-phone bandwidth and battery power.

“It’s clear, generic, and drop-dead simple.” Han, for his part, has now partly shifted focus from compressing AI models to channeling AI to design small, efficient models from the start.

3 weeks, 6 days ago @ news.mit.edu
Automating the search for entirely new “curiosity” algorithms

Engineers have discovered many ways of encoding curious exploration into machine learning algorithms.

A research team at MIT wondered if a computer could do better, based on a long history of enlisting computers in the search for new algorithms.

“We were inspired to use AI to find algorithms with curiosity strategies that can adapt to a range of environments.” The researchers created a “meta-learning” algorithm that generated 52,000 exploration algorithms.

They started by choosing a set of basic building blocks to define their exploration algorithms.

So, instead, the researchers limited their search by first ruling out algorithms predicted to perform poorly, based on their code structure alo…

4 weeks, 1 day ago @ news.mit.edu
MIT conference reveals the power of using artificial intelligence to discover new drugs

Developing drugs to combat Covid-19 is a global priority, requiring communities to come together to fight the spread of infection.

At MIT, researchers with backgrounds in machine learning and life sciences are collaborating, sharing datasets and tools to develop machine learning methods that can identify novel cures for Covid-19.

As secretive as Silicon Valley seems, computer science and engineering students typically know what a job looks like when aspiring to join companies like Facebook or Tesla.

The conference, which reached capacity for attendees, also showed people are ready to pull together to get on the same page.

“The bigger picture, which this conference is a major part of, is this bring…

1 month ago @ news.mit.edu
Muscle signals can pilot a robot

The system, called “Conduct-A-Bot,” uses human muscle signals from wearable sensors to pilot a robot’s movement.

To enable seamless teamwork between people and machines, electromyography and motion sensors are worn on the biceps, triceps, and forearms to measure muscle signals and movement.

Muscle signals can often provide information about states that are hard to observe from vision, such as joint stiffness or fatigue.

For the gesture vocabulary currently used to control the robot, the movements were detected as follows: stiffening the upper arm to stop the robot (similar to briefly cringing when seeing something going wrong): biceps and triceps muscle signals; waving the hand left/right and…

1 month ago @ news.mit.edu
Shedding light on complex power systems

An electric power system may include stands of huge turbines capturing wild ocean winds, for instance.

Electric power systems, even traditional ones, are complex and heterogeneous to begin with.

In addition, all electric power systems have inherent physical limitations.

Addressing the evolving needs of electric power systems has not been a “hot” topic, historically.

Traditional power systems are often seen by the academic community as legacy technology with no fundamentally new developments.

1 month ago @ news.mit.edu
Reducing the carbon footprint of artificial intelligence

Artificial intelligence has become a focus of certain ethical concerns, but it also has some major sustainability issues.

MIT researchers have developed a new automated AI system for training and running certain neural networks.

“Searching efficient neural network architectures has until now had a huge carbon footprint.

This relies on a “progressive shrinking” algorithm that efficiently trains the OFA network to support all of the subnetworks simultaneously.

But training the OFA and searching it ends up being far more efficient than spending hours training each neural network per platform.

1 month ago @ news.mit.edu
With lidar and artificial intelligence, road status clears up after a disaster

Without concrete data on the state of the road network, emergency managers often have to base their answers on incomplete information.

"With our particular approach, you can determine road viability, do optimal routing, and also get quantified road damage.

To provide the status of the road network, the lidar map is first run through a neural network.

The extracted road network, with its flagged anomalies, is then merged with an OpenStreetMap of the area (an open-access map similar to Google Maps).

For finding roads, the algorithm determines if a point in the lidar point cloud is "road" or "not road."

1 month ago @ news.mit.edu
Berkeley AI
latest post 2 weeks ago
OmniTact: A Multi-Directional High-Resolution Touch Sensor

OmniTact: A Multi-Directional High-Resolution Touch Sensor. Human thumb next to our OmniTact sensor, and a US penny for scale.

Recently, the GelSight sensor has caught significant interest for learning-based robotics due to its low cost and rich signal.

Comparison of GelSight-style sensor (left side) to our OmniTact sensor (right side).

The OmniTact Sensor: Our OmniTact sensor design aims to address these limitations.

We additionally compared performance with another multi-directional tactile sensor, the OptoForce sensor, which only had a success rate of 17%.

2 weeks ago @ bair.berkeley.edu
Four Novel Approaches to Manipulating Fabric using Model-Free and Model-Based Deep Learning in Simulation

Four Novel Approaches to Manipulating Fabric using Model-Free and Model-Based Deep Learning in Simulation. Humans manipulate 2D deformable structures such as fabric on a daily basis, from putting on clothes to making beds.

Model-Free Methods: Model-Free Learning without Demonstrations. In this paper we present a model-free deep reinforcement learning approach for smoothing cloth.

An example of real robot cloth smoothing experiments with varying starting states and cloth colors.

Since this policy is easy to define, we code an algorithmic supervisor in simulation and perform imitation learning using Dataset Aggregation (DAgger).
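The DAgger loop described here (roll out the learner, have the algorithmic supervisor label the states actually visited, aggregate, retrain) can be sketched on a 1-D toy problem. The supervisor, "policy," and environment below are hypothetical stand-ins, not the paper's fabric simulator.

```python
# Minimal DAgger sketch: iteratively aggregate supervisor labels on
# states the current policy visits, then retrain on the grown dataset.

def supervisor(state):
    # Algorithmic expert for a 1-D toy task: step toward the goal at 0.
    return -1 if state > 0 else 1

def train(dataset):
    # Trivial "policy": majority-vote action for each sign of the state.
    pos = [a for s, a in dataset if s > 0]
    neg = [a for s, a in dataset if s <= 0]
    pos_a = max(set(pos), key=pos.count) if pos else 1
    neg_a = max(set(neg), key=neg.count) if neg else 1
    return lambda s: pos_a if s > 0 else neg_a

def rollout(policy, start=5, steps=5):
    states, s = [], start
    for _ in range(steps):
        states.append(s)
        s = s + policy(s)
    return states

dataset = [(s, supervisor(s)) for s in (5, -5)]  # seed demonstrations
for _ in range(3):                               # DAgger iterations
    policy = train(dataset)
    visited = rollout(policy)
    dataset += [(s, supervisor(s)) for s in visited]  # aggregate

print(policy(4), policy(-4))  # moves toward 0 from either side: -1 1
```

The key property DAgger provides over plain behavior cloning is that labels are collected on the learner's own state distribution, not only the expert's.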

Several episodes of both manipulating rope and cloth using our method,…

3 weeks, 2 days ago @ bair.berkeley.edu
Unsupervised Meta-Learning: Learning to Learn without Supervision

Unsupervised Meta-Learning: Learning to Learn without Supervision. This post is cross-listed on the CMU ML blog.

In this post we introduce theory and algorithms for unsupervised meta-learning, where machine learning algorithms themselves propose their own task distributions.

For example, a distribution over supervised learning tasks may include learning a dog detector, learning a cat detector, and learning a bird detector.

These unsupervised meta-learning algorithms allow for learning in regimes previously impractical, and further expand the capability of machine learning methods.

A number of open questions remain about unsupervised meta-learning: Unsupervised learning is closely connected to…

3 weeks, 6 days ago @ bair.berkeley.edu
The Ingredients of Real World Robotic Reinforcement Learning

The simulation will never exactly match the real world, which means that improvements in simulation performance may not translate to improvements in the real world.

However, training robots in the real world with reinforcement learning has proven challenging, due to certain constraints.

What makes real world robotic reinforcement learning so challenging?

We show effective uninstrumented real world learning on two dexterous manipulation tasks with a 3 fingered robotic hand.

However, we believe that the ingredients of real world RL that we have proposed should endure as principles of design for real world RL systems.

1 month ago @ bair.berkeley.edu
Making Decision Trees Accurate Again: Explaining What Explainable AI Did Not

The interpretability of neural networks is becoming increasingly necessary, as deep learning is being adopted in settings where accurate and justifiable predictions are required.

In a neural-backed decision tree, predictions are made via a decision tree, preserving high-level interpretability.

Naive Decision Tree: We construct a basic decision tree with one root node and a leaf for each class.

The direct equivalence between a fully-connected layer and a naive decision tree motivates our particular inference method, using an inner-product decision tree.
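To make the inference rule concrete, here is a minimal, hypothetical sketch of inner-product routing: at each internal node, the sample's feature vector is scored (via inner product) against one representative vector per child, and routing follows the best match. The tree layout and vectors below are invented for illustration, not the authors' implementation:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def nbdt_predict(x, tree):
    """Route feature vector x through an inner-product decision tree.
    Internal nodes map to (children, representative_vectors); leaves
    map to a class label."""
    node = "root"
    while not isinstance(tree[node], str):
        children, vectors = tree[node]
        scores = [dot(w, x) for w in vectors]       # one score per child
        node = children[scores.index(max(scores))]  # follow the best match
    return tree[node]

# Toy two-class tree with hand-picked representative vectors.
tree = {
    "root": (["leaf_cat", "leaf_car"], [[1.0, 0.0], [0.0, 1.0]]),
    "leaf_cat": "cat",
    "leaf_car": "car",
}
```

Because every prediction is a sequence of such node decisions, the full routing path can be inspected, which is the source of the tree's interpretability.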

Decision trees address this, but unfortunately, images are krypt…

1 month ago @ bair.berkeley.edu
Robots Learning to Move like Animals

Quadruped robot learning locomotion skills by imitating a dog.

The superior agility seen in animals, as compared to robots, might lead one to wonder: can we create more agile robotic controllers with less effort by directly imitating animals?

Given a reference motion from an animal (e.g., a dog), our framework uses reinforcement learning to train a control policy that enables a robot to imitate the motion in the real world.

1) First, given a reference motion, the motion retargeting stage maps the motion from the original animal’s morphology to the robot’s morphology.

2) Next, the motion imitation stage uses the retargeted reference motion to train a policy for imitating the motion in simulation.
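The imitation stage typically optimizes a pose-tracking reward. A common shape in motion-imitation work is an exponentiated negative tracking error; the scale factor below is illustrative, not the paper's exact weighting:

```python
import math

def imitation_reward(robot_pose, ref_pose, scale=2.0):
    """Reward is highest (1.0) when the robot's joint pose matches the
    retargeted reference pose, and decays with squared tracking error."""
    err = sum((q - r) ** 2 for q, r in zip(robot_pose, ref_pose))
    return math.exp(-scale * err)

on_track = imitation_reward([0.1, -0.3, 0.5], [0.1, -0.3, 0.5])  # exact match
drifting = imitation_reward([0.4, -0.3, 0.5], [0.1, -0.3, 0.5])  # penalized
```

The policy is then trained to maximize this reward at every timestep, which pulls its motion toward the reference trajectory.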

1 month, 3 weeks ago @ bair.berkeley.edu
Physically Realistic Attacks on Deep Reinforcement Learning

Deep reinforcement learning (RL) has achieved superhuman performance in problems ranging from data center cooling to video games.

Consequently, it is critical that RL policies are robust: both to naturally occurring distribution shift, and to malicious attacks by adversaries.

We find it is still possible to attack victim policies in this more realistic multi-agent threat model.

To better understand how the adversarial policies exploit their victims, we created “masked” versions of victim policies.

The existence of adversarial policies has significant implications for the training, understanding and evaluation of RL policies.

2 months ago @ bair.berkeley.edu
Does On-Policy Data Collection Fix Errors in Off-Policy Reinforcement Learning?


Corrective Feedback and Why It Is Absent in ADP: What is corrective feedback, formally?

This enjoys corrective feedback, and we then contrast it with ADP methods, which do not.

One way to prevent this problem is to compute an “optimal” data distribution that provides maximal corrective feedback, and to train Q-functions using this distribution.

More generally, we would like to make a case for analyzing the effects of data distributions more deeply in the context of deep RL algorithms.
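The absence of corrective feedback can be illustrated in a toy regression setting: value estimates are only corrected at states the data distribution actually covers, while errors at unvisited states persist. The states, targets, and single-step setup below are hypothetical:

```python
true_q = {"s0": 1.0, "s1": 5.0}        # ground-truth values
q = {"s0": 0.0, "s1": 0.0}             # initial (wrong) estimates
dataset = ["s0"] * 100                  # data distribution never visits s1
lr = 0.5                                # learning rate

for s in dataset:
    q[s] += lr * (true_q[s] - q[s])     # regress toward the target

# q["s0"] is now ~1.0, but q["s1"] is still 0.0: no corrective
# feedback ever reaches states outside the data distribution.
```

A data distribution chosen to cover high-error states would shrink both errors, which is the intuition behind optimizing the distribution for maximal corrective feedback.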

2 months, 1 week ago @ bair.berkeley.edu
BADGR: The Berkeley Autonomous Driving Ground Robot

We call our robot learning system BADGR: the Berkeley Autonomous Driving Ground Robot.

The neural network predictive model is trained to predict these future events as accurately as possible.

(4) Planning and Navigating: BADGR predicting which actions lead to bumpy terrain (left) or collisions (right).

For example, the reward function could encourage driving towards a goal while discouraging collisions or driving over bumpy terrain.

BADGR successfully reaches the goal while avoiding collisions and bumpy terrain, while the geometry-based policy is unable to avoid bumpy terrain.
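Planning with such a predictive model can be as simple as random shooting: sample candidate action sequences, score each by the reward of its predicted events, and execute the best one. A minimal sketch, where the event predictor and reward below are hypothetical stand-ins for BADGR's learned model and reward function:

```python
import random

def plan(state, predict_events, reward,
         horizon=5, n_samples=64, actions=(-1.0, 0.0, 1.0)):
    """Random-shooting planner: sample action sequences, score each by
    the summed reward of the predicted per-step events, keep the best."""
    best_seq, best_score = None, float("-inf")
    for _ in range(n_samples):
        seq = [random.choice(actions) for _ in range(horizon)]
        events = predict_events(state, seq)   # e.g. (collision, bumpy) per step
        score = sum(reward(e) for e in events)
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq

# Hypothetical stand-ins: steering 0.0 is smooth and collision-free.
def predict_events(state, seq):
    return [{"collision": a != 0.0, "bumpy": abs(a) > 0.5} for a in seq]

def reward(event):
    return -10.0 * event["collision"] - 1.0 * event["bumpy"]

random.seed(0)
best = plan(state=0, predict_events=predict_events, reward=reward)
```

In practice this runs in a model-predictive-control loop: execute the first action of the best sequence, observe the new state, and replan.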

2 months, 2 weeks ago @ bair.berkeley.edu
Speeding Up Transformer Training and Inference By Increasing Model Size

Model Training Can Be Slow: In deep learning, using more compute (e.g., increasing model size, dataset size, or training steps) often leads to higher accuracy.

Instead, when training Transformer models on a budget, you want to drastically increase model size but stop training very early.

This phenomenon occurs because larger models converge to lower test error in fewer gradient updates than smaller models.

We also recommend increasing model size, not batch size.

ConclusionWe have shown that increasing Transformer model size can improve the efficiency of training and inference, i.e., one should Train Large, Then Compress.
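The “compress” half usually means post-training quantization or pruning. A toy sketch of 8-bit weight quantization, where the “trained” weights and inputs below are made up for illustration:

```python
weights = [0.8124, -1.3377, 2.0051]           # pretend these were trained

def quantize(ws, bits=8):
    """Map float weights to signed ints in [-127, 127] plus one scale."""
    scale = max(abs(w) for w in ws) / (2 ** (bits - 1) - 1)
    return [round(w / scale) for w in ws], scale

def predict(ws, x):
    """Tiny linear classifier: sign of the weighted sum."""
    s = sum(w * xi for w, xi in zip(ws, x))
    return 1 if s > 0 else 0

q, scale = quantize(weights)
deq = [qi * scale for qi in q]                # dequantized weights

inputs = [[1.0, 0.2, 0.1], [-0.5, 1.0, -0.9], [0.3, 0.3, 0.3]]
agree = sum(predict(weights, x) == predict(deq, x) for x in inputs)
```

The per-weight error is bounded by half the quantization scale, which is why large, well-trained models often survive aggressive compression with little accuracy loss.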

2 months, 3 weeks ago @ bair.berkeley.edu
AWS Machine Learning
last post 20 hours ago
Train ALBERT for natural language processing with TensorFlow on Amazon SageMaker

You can use AWS training scripts to train ALBERT in Amazon SageMaker on p3dn and g4dn instances for both single-node and distributed training.

You can use natural language processing (NLP) models to improve search results, recommend relevant items, improve translation, and much more.

Overview of the ALBERT Model: The BERT language model was released in late 2018.

There are two phases to model training: pretraining and finetuning.

ConclusionThis post demonstrated how to train an ALBERT language model from scratch on Amazon SageMaker.

20 hours ago @ aws.amazon.com
Creating a complete TensorFlow 2 workflow in Amazon SageMaker

In the context of an Amazon SageMaker workflow, data exploration typically occurs within notebooks.

Amazon SageMaker Processing automatically loads the input data from Amazon Simple Storage Service (Amazon S3) and uploads transformed data back to Amazon S3 when the job is complete.

The endpoint retrieves the TensorFlow SavedModel and deploys it to an Amazon SageMaker TensorFlow Serving container.

All these features are central elements of projects involving TensorFlow 2 and other deep learning frameworks in Amazon SageMaker.

For more information about the Amazon SageMaker workflow features covered in this post, see the related GitHub repo.

1 day, 20 hours ago @ aws.amazon.com
Gain customer insights using Amazon Aurora machine learning

It extends a popular ML use case, predicting customer churn, and demonstrates how to achieve the real business goal of preventing customer churn.

“I understand that all of our customer data is stored in an Amazon Aurora relational database.

For instructions on configuring Aurora ML capabilities, see Enabling Aurora Machine Learning.

An Amazon Aurora cluster containing a single database instance, and a security group that is configured to give access to the Amazon SageMaker notebook instance.

Thanks!” Luckily, there’s an Amazon Aurora feature that makes it easy: Saving Data from an Amazon Aurora MySQL DB Cluster into Text Files in an Amazon S3 Bucket.

6 days, 15 hours ago @ aws.amazon.com
AWS Machine Learning Scholarship Program from Udacity is now open for enrollment

Developers, to help you advance your AI and machine learning (ML) skills with hands-on and engaging learning, the AWS Machine Learning Scholarship Program from Udacity is now open for enrollment.

In this scholarship program, all eligible students who complete the free AWS Machine Learning Foundations course earn a course completion certificate from Udacity and can take a high-bar knowledge test.

How the program worksThe scholarship program begins May 19, 2020, and runs through July 31, 2020.

The top scorers qualify for one of 325 full scholarships to the Udacity Machine Learning Engineer Nanodegree program, sponsored by Udacity.

About the Author: Tara Shankar Jana is a Senior Product Marketi…

1 week, 1 day ago @ aws.amazon.com
Visualizing Amazon SageMaker machine learning predictions with Amazon QuickSight

For this use case, you train an Amazon SageMaker model to predict customer churn, connect your QuickSight to Amazon SageMaker, use the model on your current customer base to identify at-risk customers, and provide predictive dashboards through QuickSight.

Setting up your Amazon SageMaker environment: To set up your Amazon SageMaker environment, complete the following steps. On the Amazon SageMaker console, choose Notebook instances.

Setting up Amazon QuickSight to work with Amazon SageMaker: QuickSight’s integration with Amazon SageMaker is only available on the Enterprise Edition of Amazon QuickSight.

Conclusion: QuickSight and Amazon SageMaker make it faster, easier, and more cost-effective for …

1 week, 2 days ago @ aws.amazon.com
Enhancing speech-to-text accuracy of COVID-19-related terms with Amazon Transcribe Medical

This post demonstrates how to use a custom vocabulary in Amazon Transcribe Medical to better recognize COVID-19 terms.

Amazon Transcribe Medical is a fully managed automatic speech recognition (ASR) service that makes it easy to add medical speech-to-text capabilities to your applications.

But now, with the use of the custom vocabulary feature, you can inform Amazon Transcribe Medical to better recognize these specific medical terms.

Come try out making your own custom medical vocabulary and transcribe medical speech via the service console today!

He works on improving the Amazon Transcribe and Transcribe Medical services.

1 week, 5 days ago @ aws.amazon.com
Analyzing and tagging assets stored in Veeva Vault PromoMats using Amazon AI services

More than 400 life sciences companies across over 165 countries rely on Veeva Vault PromoMats for commercial content and digital asset management.

A typical digital marketing team uses Veeva Vault PromoMats to store, search, curate, review, and distribute marketing assets across their global workforce.

After that, it retrieves only the Veeva Vault assets that have been created or modified since the last run.

Used for polling the Veeva Vault using the Veeva Query Language, ingesting assets to AWS, and pushing a message to the SQS queue.

Conclusion: This post demonstrated how to use Amazon AI services to extend the functionality of Veeva Vault PromoMats (or any other Veeva Vault offerings) and extra…

1 week, 5 days ago @ aws.amazon.com
Omnichannel personalization with Amazon Personalize

The post explores how to deliver personalized recommendations from a single Amazon Personalize deployment across three different communication channels.

It also uses ML model integration in Amazon Pinpoint to retrieve personalized recommendations from Amazon Personalize.

Step 2: Building Amazon Personalize campaigns. Before you can provide personalized product recommendations, you first need to train the ML models and provision the inference endpoints in Amazon Personalize that are needed to retrieve recommendations.

The RecommendationProviderIdType value of PINPOINT_USER_ID is how endpoints in Amazon Pinpoint are linked to user identities in your Amazon Personalize campaign.

Lastly, the Recomm…

1 week, 6 days ago @ aws.amazon.com
AWS to offer NVIDIA A100 Tensor Core GPU-based Amazon EC2 instances

AWS leads the industry in providing you access to high-performance and cost-effective Amazon EC2 instances based on NVIDIA® GPUs.

AWS was first in the cloud to offer NVIDIA V100 Tensor Core GPUs via Amazon EC2 P3 instances.

To increase performance and lower cost-to-train for models, AWS is pleased to announce our plans to offer EC2 instances based on the new NVIDIA A100 Tensor Core GPUs.

For large-scale distributed training, you can expect EC2 instances based on NVIDIA A100 GPUs to build on the capabilities of EC2 P3dn.24xlarge instances and set new performance benchmarks.

For more information about EC2 instances based on NVIDIA A100 GPUs, and to potentially participate in early access, see her…

1 week, 6 days ago @ aws.amazon.com
Performing medical transcription analysis with Amazon Transcribe Medical and Amazon Comprehend Medical

This post explores how to integrate HIPAA-eligible AWS AI services Amazon Transcribe Medical and Amazon Comprehend Medical to identify insights in this data.

Amazon Comprehend Medical is a natural language processing service that makes it easy to use ML to extract relevant medical information from unstructured text.

Medical Transcription Analysis (MTA) is a simple solution that uses Amazon Transcribe Medical and Amazon Comprehend Medical to provide medical notes transcription and comprehension.

It also creates an AWS Identity and Access Management (IAM) role with permissions to Amazon Comprehend Medical and Amazon Transcribe Medical, an…

2 weeks, 5 days ago @ aws.amazon.com
Increasing customer engagement and loyalty with personalized coupon recommendations using Amazon Personalize

Lotte Mart, a Korean hypermarket, uses Amazon Personalize to offer personalized recommendations to frequent customers to increase engagement, increase purchase rates of new products, and ultimately further build customer loyalty.

This post shares the difficulties that Lotte Mart faced before using Amazon Personalize and how they improved their product recommendations and increased new product purchases.

Lotte Mart turned to Amazon Personalize as a solution to provide its M-coupon users highly curated and personalized product recommendations to increase customer loyalty and demand for new products.

Finally, Amazon Personalize enabled custom recommendations for each customer rather than the t…

2 weeks, 5 days ago @ aws.amazon.com
Building a lawn monitor and weed detection solution with AWS machine learning and IoT services

The following image shows a collection of weed and grass images.

Choose Create new.

For your runtime, select SQL and choose Create application. For the source, choose Connect streaming data and select Choose source.

To transform and format the ingested data, choose Edit/Preview data.

You can also visualize data using Amazon QuickSight.

2 weeks, 5 days ago @ aws.amazon.com
Catching fraud faster by building a proof of concept in Amazon Fraud Detector

In some cases, Amazon Fraud Detector has helped detect fraud up to 95% faster.

Planning your POC for online fraud insights: To get started with a POC with the Online Fraud Insights model, consider the following. Your specific use case – The Online Fraud Insights model works well at detecting a multitude of online fraud and abuse types, such as new account fraud, transaction fraud, or fake reviews abuse.


Evaluating performance – Determine if Amazon Fraud Detector is catching fraud faster and reducing your fraud losses. You can comple…

2 weeks, 5 days ago @ aws.amazon.com
Learn how to select ML instances on the fly in Amazon SageMaker Studio

Amazon SageMaker Studio supports on-the-fly selection of machine learning (ML) instance types, optimized and pre-packaged Amazon SageMaker Images, and sharing of Jupyter notebooks.

Using notebooks in Amazon SageMaker StudioYou can access notebooks via Amazon SageMaker Studio, a fully integrated development environment for a complete ML workflow.

Disabling the Fast launch filter displays a list of all ML instance types supported in Amazon SageMaker Studio.

Amazon SageMaker Studio comes with pre-configured Amazon SageMaker Images for TensorFlow, MXNet, PyTorch, TensorFlow 2, Data Science, and Python.

He has contributed to AWS services like Amazon Comprehend, Amazon Translate, and Amazon SageMaker Algori…

3 weeks, 5 days ago @ aws.amazon.com
Advance your career with a scholarship to the AWS Machine Learning Engineer Nanodegree program from Udacity

Machine learning (ML) is one of the fastest growing areas in technology and a highly sought-after skill set in today’s job market.

To help you advance your AI/ML skills with hands-on and engaging learning, AWS is announcing the AWS Machine Learning Scholarship Program, built in collaboration with Udacity.

What is the AWS Machine Learning Scholarship Program?

The top scorers qualify for one of 325 full scholarships to the Udacity Machine Learning Engineer Nanodegree program, sponsored by Udacity.

For more information, see the AWS Machine Learning Scholarship Program.

3 weeks, 6 days ago @ aws.amazon.com
NVIDIA
last post 19 hours ago
At a Crossroads: How AI Helps Autonomous Vehicles Understand Intersections

Manual Mapmaking: Previous methods have relied on high-definition 3D semantic maps of an intersection and its surrounding area to understand the intersection structure and create paths to navigate safely.

Humans use live perception rather than maps to understand intersection structure and navigate intersections.

Prediction of intersection structure.

Another key benefit of our approach is that the intersection structure prediction is robust to occlusions and partial occlusions, and it’s able to predict both painted and inferred intersection structure lines.

Our DNN-based intersection structure perception capability will become available to developers in an upcoming DRIVE Software release as an…

19 hours ago @ blogs.nvidia.com
Peak Performance: Quadro Experience Streamlines Productivity for Professionals

And with NVIDIA Quadro Experience — a new application for Quadro GPUs — professionals across industries can boost their creativity and increase their productivity like never before.

Get Updates on the Latest Drivers: Regularly get increased performance, improved driver stability and new features for your system with Quadro Experience.

Designers Switch Tracks Fast with Quadro Experience: NVIDIA Quadro Experience is already helping companies enhance their collaborative efforts.

With Quadro RTX GPUs and the Quadro Experience application, Alstom was able to accelerate its design workflow and use GPU power for its graphics needs.

With Quadro Experience, engineers, designers, artists, architects and other…

20 hours ago @ blogs.nvidia.com
2,000 Games and Counting: GeForce NOW Library Taking Shape

As we approach the end of our trial period, we’re working to build a robust catalog of PC games with full support from the development community.

This includes a new opt-in process for developers and publishers to offer their games on GeForce NOW.

Going forward, only the games that are opted in will be available on the service, providing confidence in the GeForce NOW game library.

The current list of games playable on GeForce NOW, including mega-hits — like Apex Legends, Counter-Strike: Global Offensive, Destiny 2, Dota 2, Fortnite, Rocket League, Terraria and Tom Clancy’s Rainbow Six Siege — can be found here, along with the list of games that will no longer be available.

With more than 2,…

21 hours ago @ blogs.nvidia.com
Racing the Clock, COVID Killer Sought Among a Billion Molecules

Her efforts could bring a 10-figure payday — specifically 2 billion molecular tests executed in just 24 hours.

Even simulating them all on the 9,216 CPUs on Summit, ORNL’s supercomputer, could take four years.

Simulating 2 Billion Compounds in 24 Hours: Sedova now believes that, with further improvements, the team could create a capability to examine as many as two billion compounds in 24 hours.

“If it works, we’ll launch the big run with 1.4 billion compounds using all of Summit’s nodes,” she said.

The work got its start in January when a top ORNL researcher, Jeremy C. Smith, showed the first work using the Summit supercomputer for drug research to fight the coronavirus.

1 day, 21 hours ago @ blogs.nvidia.com
Safe at the Finnish Line: Privacy Project Kicks Off Collaboration

“Differential privacy rests on a strong theoretical foundation, so if you follow the algorithm you get privacy guarantees, but to date the performance cost has been quite significant,” said Honkela.

“Now we could close this gap,” he said of the first project in a broad, multi-year collaboration between NVIDIA and AI researchers in Finland.

The work could narrow the performance gap further for implementing enhanced differential privacy.
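The guarantee being discussed can be made concrete with the textbook Laplace mechanism: add calibrated noise to a query so the released answer is epsilon-differentially private. This is a minimal sketch of the mechanism itself, not the FCAI/NVIDIA implementation (which applies differential privacy inside model training, where the performance cost arises):

```python
import random

def private_count(true_count, epsilon, sensitivity=1.0):
    """Laplace mechanism: add Laplace(sensitivity / epsilon) noise to a
    counting query, making the released answer epsilon-DP."""
    scale = sensitivity / epsilon
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return true_count + noise

random.seed(0)
answer = private_count(true_count=1000, epsilon=0.5)
```

Smaller epsilon means more noise and stronger privacy; the "significant performance cost" quoted above is the analogous accuracy overhead when such noise is injected into every training step.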

They join a growing global community of NVIDIA AI Technology Centers (NVAITC) driving technology forward.

The collaboration between AI researchers and GPU experts “is a good model,” said Honkela, a coordinating professor at FCAI.

2 days, 21 hours ago @ blogs.nvidia.com
To Combat Diabetes-Related Blindness, Taiwanese Med-Tech Firm Brings AI to the Edge

Symptoms of diabetic retinopathy, as it’s known, are often hard to diagnose, even from high-resolution images of the eye’s interior.

No Surprise: Early Diagnosis Critical. Spotting diabetic retinopathy at an earlier stage can delay or even prevent the onset of diabetes-related blindness.

However, primary caregivers typically aren’t able to catch obscure indicators that an ophthalmologist — or MiiS’s AI model — would.

To train its model, MiiS turned to three partner hospitals to obtain a dataset of 120,000 images.

At about 90 percent, the accuracy of MiiS’s AI model is comparable to that of the typical ophthalmologist’s.

3 days, 10 hours ago @ blogs.nvidia.com
40 Years on, PAC-MAN Recreated with AI by NVIDIA Researchers

Trained on 50,000 episodes of the game, a powerful new AI model created by NVIDIA Research, called NVIDIA GameGAN, can generate a fully functional version of PAC-MAN — without an underlying game engine.

That means that even without understanding a game’s fundamental rules, AI can recreate the game with convincing results.

Made up of two competing neural networks, a generator and a discriminator, GAN-based models learn to create new content that’s convincing enough to pass for the original.
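The two-player objective can be sketched with scalar stand-ins for the networks (GameGAN's actual generator and discriminator are large neural networks; the pass-through logit function below is a toy):

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def d_loss(d, real, fake):
    """Discriminator loss: it wants D(real) -> 1 and D(fake) -> 0."""
    return -(math.log(sigmoid(d(real))) + math.log(1.0 - sigmoid(d(fake))))

def g_loss(d, fake):
    """Generator loss: it wants the discriminator fooled, D(fake) -> 1."""
    return -math.log(sigmoid(d(fake)))

d = lambda score: score   # toy "discriminator" that returns its input as a logit
```

Training alternates between the two losses: the discriminator learns to tell real frames from generated ones, and the generator learns to produce frames that drive the discriminator's judgment toward "real".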

And it did.” As an artificial agent plays the GAN-generated game, GameGAN responds to the agent’s actions, generating new frames of the game environment in real time.

Just like in the original game, PAC-MA…

5 days, 23 hours ago @ blogs.nvidia.com
What’s a DPU?


“The CPU is for general purpose computing, the GPU is for accelerated computing and the DPU, which moves data around the data center, does data processing.”

A Data Processing Unit combines an industry-standard, high-performance, software-programmable multi-core CPU; a high-performance network interface; and flexible, programmable acceleration engines. So What Makes a DPU Different?

All these DPU capabilities are critical to enabling the isolated, bare-metal, cloud-native computing that will define the next generation of cloud-scale computing.

Instead, the network interface needs to be powerful and flexible enough to handle all network data path processing.

1 week ago @ blogs.nvidia.com
Facebook Rolls out a GPU-Accelerated AI Shopping Tool for Marketplace

GrokNet, a universal computer vision system, can identify items in categories such as fashion, auto, and home decor.

The model is in production today and is available for buyers and sellers in Facebook Marketplace.

This meant that the team had to aggregate a massive number of data sets, types of supervision, and loss functions into a single model.

The GrokNet system is based on the ResNeXt-101 architecture.

“While these systems are fragmented right now, incorporating everything into one system is the ambitious challenge,” the Facebook team said.

1 week ago @ news.developer.nvidia.com
NVIDIA Xavier Achieves Industry First with Expert Safety Assessment

The NVIDIA Xavier SoC passed the final assessment for safety product approval by TÜV SÜD, one of the most knowledgeable and stringent safety assessment bodies in the industry.

NVIDIA Xavier, the world’s first processor for autonomous driving, is the most complex SoC the safety agency has assessed in its 150-year history.

This current assessment completes the last step to show that Xavier SoC meets all applicable requirements of ISO 26262.

Using any one of these components separately would require significant investment to achieve the same safety functionality as the complete Xavier SoC.

By meeting these requirements, Xavier has demonstrated the ability to achieve the necessary complexity fo…

1 week ago @ blogs.nvidia.com
Microsoft and NVIDIA Announce June Preview for GPU-Acceleration Support for WSL

At their Build digital developers conference on May 19, Microsoft announced a Public Preview for GPU in Windows Subsystem for Linux (WSL).

WSL is a layer that enables executing Linux binaries on Microsoft Windows computing systems.

The announcement disclosed the collaboration with NVIDIA to deliver CUDA GPU-acceleration support to masses of Windows users.

Users can apply to the NVIDIA Developer Program and the Microsoft Windows Insider Program for access to the preview software.

Users can inquire about GPU in WSL here and can apply to the NVIDIA Developer Program here.

1 week, 1 day ago @ news.developer.nvidia.com
While the World Works from Home, NVIDIA’s AV Fleet Drives in the Data Center

During the GTC 2020 keynote, NVIDIA CEO Jensen Huang demonstrated how NVIDIA DRIVE technology is being developed and tested in simulation.

The 17-mile loop shows the NVIDIA DRIVE AV Software navigating the roadways, pedestrians and traffic in a highly accurate replica environment.

Recreating Sensor Data: With an accurate environment in place, high-fidelity development and testing next require accurately generated sensor data.

In addition to camera models, DRIVE Sim provides physically based lidar and radar sensors using ray tracing.

Vehicle dynamics also play a key role in accurate sensor data generation.

1 week, 1 day ago @ blogs.nvidia.com
COVID Caught on Camera: Startup’s Sensors Keep Hospitals Safe

Andrew Gostine’s startup aims to make hospitals more efficient, but when the coronavirus hit Chicago he pivoted to keeping them safer, too.

He’s also the CEO of Whiteboard Coordinator Inc., a startup that had a network of 400 cameras and other sensors deployed across Northwestern’s 10 hospitals before the pandemic.

Ten days later, the startup had thermal cameras linked to its network installed at 31 entrances to the hospitals.

So, the startup deployed another 400 cameras sporting night vision and microphones across the 10 hospitals.

The startup’s mission was born of Gostine’s personal passion for making hospitals more modern and efficient.

1 week, 1 day ago @ blogs.nvidia.com
Create at the Speed of Imagination with New Thin and Light Devices from Dell, HP and Microsoft

New NVIDIA-powered laptops and mobile workstations from Dell, HP and Microsoft give creators amazing choices to turn their imagination into actual creations.

Packing this much performance into an RTX Studio laptop of this size requires a little engineering ingenuity to keep the system performing smoothly.

For editors, engineers and scientists running intensive workloads, the Dell Precision 7550 and Dell Precision 7750 are available with up to Quadro RTX 5000 GPUs.

Purchase a new RTX Studio laptop or desktop and both new and existing Adobe users get a free three-month subscription to Adobe Creative Cloud.

Learn more about NVIDIA Studio and RTX Studio systems.

1 week, 1 day ago @ blogs.nvidia.com
Cut to the Video: Adobe Premiere Pro Helps Content Creators Work Faster with GPU-Accelerated Exports

With the latest release of Adobe Premiere Pro, available today, creators can get new NVIDIA GPU-enhanced features that help them deliver high-quality content faster than ever.

Elevate Editing Workflows with GPU Acceleration. With the new Premiere Pro 14.2, video creators gain massive time savings with new GPU-accelerated encoding.

Adobe and NVIDIA have optimized Premiere Pro for the built-in NVIDIA hardware encoder on NVIDIA Quadro and GeForce GPUs.

In addition to Premiere Pro, the NVIDIA hardware encoder speeds up video exports in Adobe Media Encoder, After Effects and Audition.

Find out more about the latest Premiere Pro release.

1 week, 1 day ago @ blogs.nvidia.com
Apple Machine Learning Journal
last post 5 months, 4 weeks ago
Apple at NeurIPS 2019

The conference, of which Apple is a Diamond Sponsor, will take place in Vancouver, Canada from December 8th to 14th.

If you’re interested in opportunities to make an impact on Apple products through machine learning research and development, check out our teams at Jobs at Apple.

We propose to evaluate both the generator and the discriminator by deriving corresponding Fisher Score and Fisher Information from the EBM.

In this work, we address this problem by introducing data parameters.

During training, at each iteration, as we update the model parameters, we also update the data parameters.

5 months, 4 weeks ago @ machinelearning.apple.com
Apple at Interspeech 2019

Apple is attending Interspeech 2019, the world’s largest conference on the science and technology of spoken language processing.

For Interspeech attendees, join the authors of our accepted papers at our booth to learn more about the great speech research happening at Apple.

If you’re interested in opportunities to make an impact on Apple products through machine learning research and development, check out our teams at Jobs at Apple.

The model can be used to check that text-to-speech (TTS) training speech follows the script and words are pronounced as expected.

Adding more annotated training data for any ML system typically improves accuracy, but only if it provides examples not alrea…

8 months, 2 weeks ago @ machinelearning.apple.com
Language Identification from Very Short Strings

For example, this capability is needed to load the right autocorrection lexicon and the right language model for predictive and multilingual typing.

Neural LID Architecture. We model LID as a character-level sequence labeling problem.

LSTM model sizes, on the other hand, are simply a function of the network parameters.

At Apple, bi-LSTM LID is now used for most tasks which require language identification, like text tagging and other public APIs part of the Natural Language framework.

Language Identification from Short Strings.

10 months, 1 week ago @ machinelearning.apple.com
Bridging the Domain Gap for Neural Models

This task is called the covariate shift problem, for the case where we have access to labeled data from one domain (source) and unlabeled data from another domain (target).

Unsupervised domain adaptation is an especially attractive alternative when the ground truth labels cannot be obtained easily for the task of interest.

An adversarial learning-based method for domain adaptation at pixel-level would try to translate/synthesize input images from one domain to the other, bringing the input distributions closer.

We can see clear improvements from models trained on source domain only to models trained with the proposed SWD method.

Conclusion. This method of unsupervised domain adaptation helps …

11 months, 2 weeks ago @ machinelearning.apple.com
Optimizing Siri on HomePod in Far‑Field Settings

Unlike Siri on iPhone, which operates close to the user’s mouth, Siri on HomePod must work well in a far-field setting.

Block diagram of the online multichannel signal processing chain on HomePod for Siri.

The RES is designed to suppress nonlinear components of the echo signal that aren’t being modeled by the linear MCEC.

The optimal integration of our speech processing technologies substantially improves the overall WERs across conditions.

A survey of convolutive blind source separation methods, Multichannel Speech Processing Handbook, 2007.

1 year, 5 months ago @ machinelearning.apple.com
Apple at NeurIPS 2018

This December we’ll be in Montreal, Canada, attending the 32nd Conference on Neural Information Processing Systems (NeurIPS).

We’ll have a booth staffed with Machine Learning experts from across Apple who would love to chat with you.

Please drop by if you’re attending the conference.

Apple is dedicated to advancing state-of-the-art machine learning technologies.

If you are interested in applying to specific machine learning positions, please explore opportunities at Machine Learning Jobs At Apple.

1 year, 6 months ago @ machinelearning.apple.com
Can Global Semantic Context Improve Neural Language Models?

In this article, we explore whether we can improve word predictions for the QuickType keyboard using global semantic context.

Can this global semantic context result in better language models?

All neural network solutions to date predict either a word in context or the local context itself, which doesn’t adequately reflect global semantic information.

Conclusion. We set out to assess the potential benefits of incorporating global semantic information into neural language models.

In summary, using bi-LSTM RNNs to train global semantic word embeddings can indeed lead to improved accuracy in neural language modeling.

1 year, 8 months ago @ machinelearning.apple.com
Finding Local Destinations with Siri’s Regionally Specific Language Models for Speech Recognition

The accuracy of automatic speech recognition (ASR) systems has improved phenomenally over recent years, due to the widespread adoption of deep learning techniques.

We decided to improve Siri’s ability to recognize names of local POIs by incorporating knowledge of the user’s location into our speech recognition system.

We’ve been able to significantly improve the accuracy of local POI recognition and understanding by incorporating users’ geolocation information into Siri’s ASR system.

Incremental Language Models for Speech Recognition Using Finite-state Transducers.

Convolutional Neural Networks for Speech Recognition IEEE/ACM Transactions on Audio, Speech, and Language Processing,…

1 year, 9 months ago @ machinelearning.apple.com
Personalized Hey Siri

When a user says, “Hey Siri, how is the weather today?” the phone wakes up upon hearing “Hey Siri” and processes the rest of the utterance as a Siri request.

The application of a speaker recognition system involves a two-step process: enrollment and recognition.

User Enrollment. The main design discussion for personalized “Hey Siri” (PHS) revolves around two methods for user enrollment: explicit and implicit.

Improving the Speaker TransformThe speaker transform is the most important part of any speaker recognition system.

At its core, the purpose of the “Hey Siri” feature is to enable users to make Siri requests.

2 years, 1 month ago @ machinelearning.apple.com
Learning with Privacy at Scale

We develop a system architecture that enables learning at scale by leveraging local differential privacy, combined with existing privacy best practices.

In this article, we give an overview of a system architecture that combines differential privacy and privacy best practices to learn from a user population.

Differential privacy [2] provides a mathematically rigorous definition of privacy and is one of the strongest guarantees of privacy available.

In our system, we choose not to collect raw data on the server which is required for central differential privacy; hence, we adopt local differential privacy, which is a superior form of privacy [3].

Conclusion. In this article, we have presented a…
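Local differential privacy, which the system above adopts, means each device randomizes its data before anything leaves the phone. The classic randomized-response mechanism is the simplest illustration of the idea; the sketch below is a toy one-bit example only (the function names and setting are assumptions for illustration, not Apple's production mechanism):

```python
import math
import random

def randomized_response(true_bit, epsilon):
    """Report the true bit with probability e^eps / (e^eps + 1),
    otherwise flip it. The server never sees the raw value."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return true_bit if random.random() < p_truth else 1 - true_bit

def estimate_fraction(reports, epsilon):
    """Debias the noisy reports to estimate the true fraction of 1s."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)

random.seed(0)
true_bits = [1] * 300 + [0] * 700  # the true fraction of 1s is 0.3
reports = [randomized_response(b, epsilon=1.0) for b in true_bits]
est = estimate_fraction(reports, epsilon=1.0)
```

Each individual report is plausibly deniable, yet the debiased aggregate converges to the true population fraction as the number of reports grows, which is what lets the server learn from the population without collecting raw data.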

2 years, 5 months ago @ machinelearning.apple.com
Uber Engineering
last post 2 weeks, 1 day ago
Announcing a New Framework for Designing Optimal Experiments with Pyro

We’ll treat working memory capacity as the length of the longest list of random digits that the participant can memorize.

InferenceWe use Bayesian inference to incorporate our new observation into an estimate of the participant’s working memory capacity.

It models the probability of correctly remembering the list of digits of different lengths for people with different working memory capacities, as shown in Figure 1. We also need a sense of what working memory capacities are plausible.

Computing the optimal design. Our score for experimental designs, the expected information gain (EIG), is notoriously difficult to estimate.

In our paper, we showed that this method can be remarkably accurate on a range of different exp…
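To make the inference step concrete, here is a toy discrete version of updating a belief over a participant's working memory capacity after each trial. This is an illustrative sketch only: the logistic response curve, the capacity grid, and all names are assumptions for this example, not the Pyro model from the paper:

```python
import math

def recall_prob(capacity, list_length, slope=1.0):
    """Probability of correctly recalling a digit list: high below the
    participant's capacity, falling off smoothly above it (an assumed
    logistic response curve, for illustration)."""
    return 1.0 / (1.0 + math.exp(slope * (list_length - capacity)))

def update(prior, list_length, recalled):
    """One Bayesian update; prior is a dict {capacity: probability}."""
    posterior = {}
    for cap, p in prior.items():
        like = recall_prob(cap, list_length)
        posterior[cap] = p * (like if recalled else 1.0 - like)
    z = sum(posterior.values())
    return {cap: p / z for cap, p in posterior.items()}

# Uniform prior over capacities 3..9; the participant recalls a list
# of 7 digits but fails on a list of 9, shifting mass toward ~8.
prior = {c: 1 / 7 for c in range(3, 10)}
post = update(update(prior, 7, True), 9, False)
best = max(post, key=post.get)
```

Pyro performs this same update with continuous distributions and variational inference; the discrete version just makes the mechanics visible.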

2 weeks, 1 day ago @ eng.uber.com
Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions

Last year we introduced the Paired Open-Ended Trailblazer (POET) to explore the idea of open-ended algorithms.

ANNECS: a new way to measure progress in open-ended systems. Quantifying the performance of open-ended algorithms has remained elusive for the field.

Compare those from Original POET in Figure 4a to those produced by Enhanced POET in Figure 4b, below.

If this piques your interest, be sure to check out videos of example Enhanced POET agents on the Uber AI YouTube channel.

Towards that end, we are not only releasing a paper with full technical details, but also have open sourced the code for Enhanced POET.

3 weeks ago @ eng.uber.com
Under the Hood of Uber ATG’s Machine Learning Infrastructure and Versioning Control Platform for Self-Driving Vehicles

A trained model requires as input the data set artifact, the model training code, and configuration files governing model training.

Example sequence of events: registering a new data set. Upon user registration of a new data set, the VerCD Data set Service stores the dependency metadata in our database.

Data set service API. The data set service is responsible for tracking the dependencies for building a given data set.

The REST API supports the functions of creating a new data set, reading the metadata for a data set, updating the metadata of a data set, deleting a data set, and getting the artifact locations of the data set (such as in S3 or HDFS).

For instance, the VerCD data set serv…

2 months, 3 weeks ago @ eng.uber.com
Building a Backtesting Service to Measure Model Performance at Uber-scale

To better assess the performance of our models, we built a backtesting service for measuring forecast model error rates.

The backtesting service runs in a distributed system, allowing multiple models (>10), many backtesting windows (>20), and models for different cities (>200) to run simultaneously.

Backtesting at scale. Our data science teams regularly create forecast models and statistics to better understand budget spending and project financial performance.

For the purposes of our backtesting service, we chose to leverage two primary backtesting data split mechanisms: backtesting with an expanding window and backtesting with a sliding window. Above, we showcase three windows for each metho…
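The two split mechanisms described (expanding window and sliding window) can be sketched as plain index generators; this is an illustrative sketch with assumed function names and parameters, not the service's implementation:

```python
def expanding_windows(n, initial_train, horizon):
    """Each split grows the training set; the test set is the next
    `horizon` points. Returns a list of (train_idx, test_idx) pairs."""
    splits = []
    end = initial_train
    while end + horizon <= n:
        splits.append((list(range(0, end)), list(range(end, end + horizon))))
        end += horizon
    return splits

def sliding_windows(n, train_size, horizon):
    """The training window keeps a fixed size and slides forward."""
    splits = []
    start = 0
    while start + train_size + horizon <= n:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + horizon))
        splits.append((train, test))
        start += horizon
    return splits

# 10 time steps, initial training window of 4, forecast horizon of 2.
exp = expanding_windows(n=10, initial_train=4, horizon=2)
sld = sliding_windows(n=10, train_size=4, horizon=2)
```

An expanding window keeps all history and grows each fold, while a sliding window holds the training size fixed, which helps when older data stops being representative of current demand.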

3 months, 2 weeks ago @ eng.uber.com
Uber AI in 2019: Advancing Mobility with Artificial Intelligence

At the forefront of this effort is Uber AI, Uber’s center for advanced artificial intelligence research and platforms.

In this year alone, AI research at Uber has led to significant improvements in demand prediction and more seamless pick-up experiences.

Fostering AI collaboration through open source. In 2019, Uber AI was committed to sharing knowledge and best practices with the broader scientific community through open source projects.

Looking towards 2020. Next year, Uber AI will continue to innovate, collaborate, and contribute to Uber’s platform services through the application of AI across our business.

For more on Uber AI, be sure to check out related articles on the Uber Engineering Blo…

5 months, 1 week ago @ eng.uber.com
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data

We in Uber AI Labs investigated the intriguing question of whether we can create learning algorithms that automatically generate training data, learning environments, and curricula to help AI agents rapidly learn.

Increasingly, neural architecture search (NAS) algorithms are being deployed to automate the search for architectures, with great results.

New learners are able to learn on synthetic data faster than on real data (red line vs. blue line in Figure 1).

In our experiments, the estimates come either from training for 128 SGD steps on GTN-generated data or real data.

Then, for each method, the final best architecture according to the estimate is trained a long time on real data.

5 months, 1 week ago @ eng.uber.com
Controlling Text Generation with Plug and Play Language Models

This article discusses an alternative approach to controlled text generation, titled the Plug and Play Language Model (PPLM), introduced in a recent paper from Uber AI.

In many ways, language models are like wise but unguided wooly mammoths that lumber wherever they please.

As we will show below, attribute models with only a single layer containing 4,000 parameters perform well at recognizing attributes and guiding generation.

Thus, we use the unmodified language model to ensure the fluency of language is maintained at or near the level of the original language model (in this example, GPT-2-medium).

Multiple attribute models. We may combine multiple attribute models in controlled generation, …

5 months, 3 weeks ago @ eng.uber.com
Food Discovery with Uber Eats: Using Graph Learning to Power Recommendations

To this end, we previously developed ML models to better understand queries and to perform multi-objective optimization in the Uber Eats search and recommender system, surfacing better food options.

Graph learning in a nutshell. To best understand how we made our Uber Eats recommendations more accurate, it helps to know the basics of how graph learning works.

For example, to represent an eater in our Uber Eats model we don’t only use order history to inform order suggestions, but also information about what food items are connected to past Uber Eats orders and insights about similar users.

For our Uber Eats use case, we opted for a graph neural network (GNN)-based approach to obtain an …
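At its core, a graph neural network layer updates each node's embedding by aggregating the embeddings of its neighbors. The toy mean-aggregation step below illustrates the principle only; the tiny graph, the names, and the plain-Python representation are assumptions for this sketch, and Uber's production GNN is far richer:

```python
def gnn_layer(embeddings, adjacency):
    """One message-passing step: each node's new embedding is the
    average of its own embedding and its neighbors' embeddings.
    embeddings: {node: [float, ...]}, adjacency: {node: [neighbors]}."""
    updated = {}
    for node, vec in embeddings.items():
        neighbors = adjacency.get(node, [])
        stacked = [vec] + [embeddings[n] for n in neighbors]
        dim = len(vec)
        updated[node] = [sum(v[d] for v in stacked) / len(stacked)
                         for d in range(dim)]
    return updated

# Tiny eater-dish graph: the eater is connected to two past orders.
emb = {"eater": [0.0, 0.0], "dish_a": [1.0, 0.0], "dish_b": [0.0, 1.0]}
adj = {"eater": ["dish_a", "dish_b"], "dish_a": ["eater"], "dish_b": ["eater"]}
out = gnn_layer(emb, adj)
```

After a few such layers, an eater's embedding mixes in information from dishes they ordered and, transitively, from similar eaters, which is the "insights about similar users" effect described above.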

5 months, 3 weeks ago @ eng.uber.com
Uber Goes to NeurIPS 2019

This year, Uber is presenting 11 papers at the NeurIPS 2019 conference in Vancouver, Canada!

Scalable Global Optimization via Local Bayesian Optimization. David Eriksson (Uber AI) · Michael Pearce (Uber AI intern / Warwick University) · Jacob Gardner (Uber AI) · Ryan Turner (Uber AI) · Matthias Poloczek (Uber AI). ArXiv. Spotlight talk: December 10 at 4:25 pm, West Ballroom C; poster #9: December 10 at 5:30 pm, East Exhibition Hall B&C. Bayesian optimization (BO) has recently emerged as a successful technique for the global optimization of black-box functions.

For additional information about our talks and posters, check out the Uber NeurIPS 2019 site.

Interested in the ML research that Uber …

5 months, 3 weeks ago @ eng.uber.com
Announcing the 2020 Uber AI Residency

On behalf of Uber, we invite you to join us on our journey as an Uber AI Resident.

Established in 2018, the Uber AI Residency is a 12-month training program for recent college and master’s graduates, professionals who are looking to reinforce their AI skills, and those with quantitative skills and interest in becoming an AI researcher at Uber.

This year’s AI residency program will focus on our self-driving cars project through Uber Advanced Technology Group (ATG).

Open source & publication opportunities. Across Uber, we are committed to an open and inclusive research mission that benefits the community at large through both Uber AI and Uber ATG Research.

Learn more about the Uber AI Residency…

6 months ago @ eng.uber.com
Get to Know Uber ATG at ICCV, CoRL, and IROS 2019

We hope our approach to sharing will deepen the interactions and collaborations between industry and academia, and will ultimately bring self-driving research communities together.

This year, Uber ATG has five publications accepted at ICCV, two publications accepted at CoRL, and two publications accepted at IROS.

In addition, Raquel Urtasun, Uber ATG Chief Scientist and Head of Uber ATG R&D, will be giving four talks at ICCV.

Please come visit us at ICCV (booth #D-7), IROS, and CoRL to learn more about our lab’s research, discuss the work with our researchers, and hear about career opportunities with Uber ATG.

Learn more about research opportunities with Uber ATG by visiting our careers page.

7 months ago @ eng.uber.com
Evolving Michelangelo Model Representation for Flexibility at Scale

To address these issues, we evolved Michelangelo’s use of Spark MLlib, particularly in the areas of model representation, persistence, and online serving.

Its end-to-end support for scheduled Spark-based data ingestion, model training, and evaluation, along with deployment for batch and online model serving, has gained wide acceptance across Uber.

More recently, Michelangelo has evolved to handle more use cases, including serving models trained outside of core Michelangelo.

Michelangelo had specific pipeline model definitions for each supported model type, with an in-house custom protobuf representation of trained models for serving.

It is important to note that Michelangelo online serving …

7 months, 2 weeks ago @ eng.uber.com
Searchable Ground Truth: Querying Uncommon Scenarios in Self-Driving Car Development

We use these traffic scenarios to develop machine learning models that help our self-driving cars safely react to common, and not so common, scenarios that come up in a given operational domain.

These specific scenarios can then be used to train our self-driving cars to safely navigate a traffic situation with bicyclists.

Modeled tables are crucial in making our data useful for training self-driving cars to operate safely.

The ability to query data that replicates traffic scenarios ranging from the everyday to the very rare will help prepare our self-driving cars for any situation.

There is no shortage of work to be done in making the future of self-driving cars a reality.

7 months, 3 weeks ago @ eng.uber.com
Science at Uber: Improving Transportation with Artificial Intelligence

In our Science at Uber video series, Uber employees talk about how we apply data science, artificial intelligence, machine learning, and other innovative technologies in our daily work.

Zoubin Ghahramani, Chief Scientist at Uber, spent many years in academia researching artificial intelligence.

Applied to the huge amount of data around transportation, artificial intelligence has the capability to make travel easier and more seamless.

At Uber, deep learning, an area of artificial intelligence research, finds use in multiple applications, including improving our understanding of cities and traffic, helping compute ETAs, and in developing self-driving cars.

Beyond deep learning, however, we al…

8 months, 1 week ago @ eng.uber.com
Three Approaches to Scaling Machine Learning with Uber Seattle Engineering

In an effort to constantly optimize our operations, serve our customers, and train our systems to perform better and better, we leverage machine learning (ML).

In addition, we make many of our ML tools open source, sharing them with the community to advance the state of the art.

In this spirit, members of our Seattle Engineering team shared their work at an April 2019 meetup on ML and AI at Uber.

Below, we highlight three different approaches Uber Seattle Engineering is currently working on to improve our ML ecosystem and that of the tech community at large.

Horovod: Distributed Deep Learning on Apache Spark. During his talk, senior software engineer Travis Addair, from the ML Platform team, …

8 months, 2 weeks ago @ eng.uber.com
neptune.ai
last post 6 hours ago
How to Do Data Exploration for Image Segmentation and Object Detection (Things I Had to Learn the Hard Way)

In this article I will share with you how I approach data exploration for image segmentation and object detection problems.

That said, when it comes to object detection and image segmentation datasets, there is no straightforward way to systematically do data exploration.

Data augmentation is by far the most important and widely used regularization technique (in image segmentation / object detection).

Since exploring predictions of object detection and image segmentation models can get quite messy, I would suggest you do it step by step.

Nonetheless, the COCO dataset (and the COCO format) became a standard way of organizing object detection and image segmentation datasets.

6 hours ago @ neptune.ai
This Week in Machine Learning: Bees, Sky Objects, & HMI

Here are the best picks from the last week from the world of machine learning.

Weekly Roundup: May 19-25. » Neptune.ai blog – make sure to visit our blog to find interesting and in-depth articles on machine learning from the last week.

» Microsoft launches new tools for building fairer machine learning models by Frederic Lardinois on TechCrunch | May 19. Microsoft announced WhiteNoise, a new open-source toolkit that’s available both on GitHub and through Azure Machine Learning.

What do they have in common with machine learning?

» Old but gold, the reliable Reddit thread on ML for more news on machine learning.

2 days, 19 hours ago @ neptune.ai
The Best Comet.ml Alternatives

Comet is one of the most popular tools used by people working on machine learning experiments.

It is a self-hosted and cloud-based meta machine learning platform allowing data scientists and teams to track, compare, explain, and optimize experiments and models.

As it’s offered both cloud-hosted and self-hosted, you should be able to manage ML experiments of your entire team.

Anyhow, there are many other tools available, and to help you find the right fit, we present a list of the best Comet.ml alternatives.

It allows you to visualize various aspects of machine learning experiments, such as metrics, model graphs, tensor histograms, and more.

3 days ago @ neptune.ai
The Best Tools for Machine Learning Model Visualization

The phrase “Every model is wrong but some are useful” is especially true in Machine Learning.

When developing machine learning models, you should always understand where they work as expected and where they fail miserably.

It’s open-source and offers a suite of tools for visualization and debugging of machine learning models.

Visdom – summary: it helps to interactively visualize any data (including remote machine model training), and it contains a ton of visualization atomics.

The tool enables machine learning (ML) researchers to more easily evaluate the influence of their hyperparameters, such as learning rate, regularizations, and architecture.

3 days, 2 hours ago @ neptune.ai
Random Forest Regression: When Does It Fail and Why?

We’ll cover the following items: Random Forest Regression vs. Linear Regression; the Random Forest Regression extrapolation problem; potential solutions; and whether you should use Random Forest for regression.

Random Forest Regression vs. Linear Regression. Random Forest Regression is quite a robust algorithm; however, the question is whether you should use it for regression.

Example of trained Linear Regression and Random Forest. In order to dive in further, let’s look at an example of a Linear Regression and a Random Forest Regression.

For this, we’ll apply the Linear Regression and a Random Forest Regression to the same dataset and compare the result.

There are a couple of options: use a linear model such as SVM regression, …
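The extrapolation problem behind these options comes from how trees predict: every leaf returns an average of training targets, so no tree ensemble can predict outside the range of y it saw during training. A minimal depth-1 "tree" (an illustrative sketch, not scikit-learn's implementation) makes this concrete:

```python
def fit_stump(xs, ys, threshold):
    """A depth-1 regression tree: predictions are the means of the
    training targets falling into each leaf."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    left_mean = sum(left) / len(left)
    right_mean = sum(right) / len(right)
    return lambda x: left_mean if x <= threshold else right_mean

# Train on y = 2x for x in 0..9, then query far outside the range.
xs = list(range(10))
ys = [2 * x for x in xs]
stump = fit_stump(xs, ys, threshold=4.5)
pred_inside = stump(3)     # a leaf mean of nearby targets
pred_outside = stump(100)  # still a leaf mean; capped by training y
```

A linear model fit to the same data would predict 200 at x = 100, while the stump can never exceed the mean of its rightmost leaf, which is exactly why linear models such as SVM regression are suggested when extrapolation matters.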

6 days, 6 hours ago @ neptune.ai
The Best Tools to Visualize Metrics and Hyperparameters of Machine Learning Experiments

Here are the six best tools to visualize metrics and hyperparameters of machine learning experiments.

It works wherever you run your code with any machine learning library, and for any machine learning task.

Comet is suitable for teams, individuals, academics, organizations, and anyone who wants to easily visualize experiments and facilitate work and run experiments.

6 days, 20 hours ago @ neptune.ai
This Week in Machine Learning: Amazon Releases Kendra, Brain-Inspired Algorithms, Confused AI Models & More

Machine learning is fascinating.

Here are the best picks from the last week from the world of machine learning.

Listen to our latest episode of Machine Learning That Works Podcast!

> Scientists Bridge Neuroscience With AI Machine Learning by Cami Rosso on Psychology Today | May 15. Researchers discover brain-inspired algorithms for faster AI learning.

> Old but gold, the reliable Reddit thread on ML for more news on machine learning.

1 week, 2 days ago @ neptune.ai
Top Open Source Tools and Libraries for Deep Learning – ICLR 2020 Experience

Uncovering connections between this data, though, requires machine learning models specifically designed for graphs.

AmpliGraph is an Apache 2 licensed suite of neural machine learning models known as knowledge graph embeddings.

Knowledge graph embeddings have applications in knowledge graph completion, knowledge discovery, and link-based clustering, just to cite a few.

Infill options include “ML infill”, in which automated machine learning models are trained for each column to predict infill.

Finally, we are open to hearing more about open source ecosystems in deep learning.

1 week, 3 days ago @ neptune.ai
Interview with a Chief AI Scientist: Arash Azhand

Some time ago I had a chance to interview a great artificial intelligence researcher and Chief AI Scientist at Lindera, Arash Azhand.

So more and more people are also trying to combine these artificial neural networks with other AI research fields.

One example of another AI research area could be probabilistic models.

I started at Lindera because the data science work here is much more reminiscent of my work at the university.

Are there other skills that you think are crucial to the leadership role in this very researchy data science world?

1 week, 6 days ago @ neptune.ai
The Best Tools, Libraries, Frameworks and Methodologies that Machine Learning Teams Actually Use – Things We Learned from 41 ML Startups [ROUNDUP]

And to answer that question we asked 41 Machine Learning startups from all over the world.

A ton of great advice that we grouped into: methodology, software development setup, machine learning frameworks, MLOps, and unexpected 🙂. Read on to figure out what will work for your machine learning team.

MLOps is becoming more important for machine learning startups. You may be wondering what MLOps is or why you should care.

Trust Insights USA | Massachusetts | Norfolk
Scanta USA | California | San Francisco Website | Twitter | LinkedIn | FB
Protecting machine learning algorithms and the businesses that use them.

Sensitrust United Kingdom | London Website | Twitter | LinkedIn | FB
Platform where customers and profes…

1 week, 6 days ago @ neptune.ai
This Week in Machine Learning: SEO, History, Product Management, & Top Startups

The world of machine learning is changing every second.

Over the past week, we’ve published a series of posts concerning the Papers from the ICLR 2020 Conference.

🎧> Top 10 Machine Learning Startups of 2020 by Priya Dialani on Analytics Insight | May 10
A list of the most innovative machine learning companies in 2020; hopefully you’ll find some inspiration here.

> Millions of historic newspaper images get the machine learning treatment at the Library of Congress by Devin Coldewey at Techcrunch | May 7
A new effort from the Library of Congress has digitized and organized photos and illustrations from centuries of news using state-of-the-art machine learning.

Two positions this week:
> A Prac…

2 weeks, 2 days ago @ neptune.ai
How to Keep Track of PyTorch Lightning Experiments with Neptune

Working with PyTorch Lightning and wondering which logger you should choose to keep track of your experiments?

Thinking of using PyTorch Lightning to structure your Deep Learning code and wouldn’t mind learning about its logging functionality?

Why PyTorch Lightning and Neptune?

If you’ve never heard of it, PyTorch Lightning is a very lightweight wrapper on top of PyTorch that is more like a coding standard than a framework.

Just go to your LightningModule and call the methods of the Neptune experiment available as self.logger.experiment.

2 weeks, 6 days ago @ neptune.ai
The Best NLP/NLU Papers from the ICLR 2020 Conference

The International Conference on Learning Representations (ICLR) took place last week, and I had the pleasure of participating in it.

You can catch up with the first post with the best deep learning papers here, the second post with reinforcement learning papers here, and the third post with generative models papers here.

Main authors: Nikita Kitaev LinkedIn | GitHub | Website, Łukasz Kaiser Twitter | LinkedIn | GitHub

6. First author: Sachin Mehta Twitter | LinkedIn | GitHub | Website

7. Paper | Code
First author: Chen Zhu LinkedIn | GitHub | Website

Summary
The depth and breadth of the ICLR publications is quite inspiring.

2 weeks, 6 days ago @ neptune.ai
The Best Generative Models Papers from the ICLR 2020 Conference

You can catch up with the first post with deep learning papers here, and the second post with reinforcement learning papers here.

This brings us to the third post of the series – here are the 7 best generative models papers from ICLR.

Best Generative Models Papers

1. Optimal Strategies Against Generative Attacks
In the GANs community, defense against generative attacks is a topic of growing importance.

This post focuses on the “generative models” topic, which is only one of the areas discussed during the conference.

2 weeks, 6 days ago @ neptune.ai
This Week in Machine Learning: DGM Models, Yeast, and Customer Retention

Here goes a dose of the latest news, discoveries, and inspiring stories from the world of Machine Learning.

Weekly Roundup: April 27 – May 4
> Neptune.ai blog – make sure to visit our blog to find interesting and in-depth articles on machine learning.

> Applying Machine Learning to… Yeast? on Google AI Blog | April 29

In collaboration with Calico Life Sciences, Google AI presents “Learning causal networks using inducible transcription factors and transcriptome-wide time series”, published in Molecular Systems Biology.

> How Machine Learning Can Help with Customer Retention by Euge Inzaugarat | April 30
In the article, the author writes about building a churn model to understand why custom…

3 weeks ago @ neptune.ai
✈️ Telegram
DL in NLP
last post 1 day, 20 hours ago
Things that caught my interest at the start of this week

Things that caught my interest at the start of this week
1. for machine translation. They show that it distorts the n-gram distribution, and that it may well be the cause of the machine-translation artifacts we have already touched on (e.g. multilingual datasets). They also propose a Bayesian sampling method that achieves both a high BLEU (like beam search) and preserves the distributions (like plain sampling).
2. If you don't know what JAX is, you may never find out. Because despite its interesting ideas, where you write code in numpy-like syntax that just works on CPU/GPU/TPU, it lacks a convenient, familiar nn.Module-style interface. Parallax is a quick-and-dirty attempt to build one. And for…

1 day, 20 hours ago @ t.me
The Reformer in 🤗, at last.