Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
latest post 1 hour ago
Infinite-Horizon Environments [P] [D] [R]

Hello All! I am looking for some open-sourced infinite-horizon tasks to test my algorithm on for my current work. There are some tasks I'm aware of, like OpenAI's MuJoCo tasks (Swimmer, Ant, Hopper), but I think those are neither particularly good nor particularly complicated. The paper Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation (NeurIPS) introduces some tasks like Traffic Control, TAXI, and so on, and I'm thinking of trying those out. I wanted to ask: are there any other popular infinite-horizon tasks I'm not aware of that are worth trying? They can be any kind of task, ranging from robotics to autonomous-driving-related. Any help would be r…

1 hour ago @ reddit.com
[R][P] Autoencoders can reconstruct outliers

An interactive web demo of an autoencoder's ability to reconstruct outliers. https://swyoon.github.io/outlier-reconstruction/ An autoencoder trained on MNIST can successfully reconstruct a number of non-MNIST black-and-white images, including some Omniglot characters, a handwritten '-1', and a monotone black image. Outlier reconstruction is an interesting phenomenon, but it may not be desirable for outlier detection purposes. I dived into this problem in my ICML work and proposed Normalized Autoencoders, which are less prone to reconstructing outliers. https://arxiv.org/abs/2105.05735 I think it's fun work, so please check it out :) The demo was built using Tensorflow.js, so it runs on y…

1 hour ago @ reddit.com
[R][P] AdvFreeze - Adversarial training for robustness without sacrificing clean accuracy

Hi all! Just some background: this is a project that I (along with my partner) worked on for an undergraduate class on deep neural nets. The goal of the project was to design and implement a robust TinyImageNet classifier that would perform well on a hidden test set, despite unknown test-time perturbations. As part of this project, we were required to research and come up with at least one novel idea, implement it, and report our findings. Our idea, which we dubbed "AdvFreeze", is described here, and we were pretty happy with our findings: https://github.com/roshie548/AdvFreeze/blob/main/report.pdf I've now graduated from school, but I'm still interested in seeing what I can do with this.…

3 hours ago @ reddit.com
[D] InfoNCE vs N-pair loss in contrastive learning?

I was reading this awesome blog, and it introduces two losses: the N-pair loss and the InfoNCE loss. In the blog, for a query q we have a set of keys {k_1, ..., k_N} where, among those N keys, there is one positive key k_+, and InfoNCE is given as in MoCo. Now my question is: other than the temperature, what are the differences between the InfoNCE loss and the N-pair loss? Thanks! submitted by /u/tt19234 [link] [comments]
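
For reference, the loss that appears only as an image in the original post can be written out; a standard form of the two losses in the post's notation (query q, keys {k_1, ..., k_N}, one positive key k_+, temperature τ) is:

    \mathcal{L}_{\mathrm{InfoNCE}} = -\log \frac{\exp(q \cdot k_+ / \tau)}{\sum_{i=1}^{N} \exp(q \cdot k_i / \tau)}
    \mathcal{L}_{\mathrm{N\text{-}pair}} = -\log \frac{\exp(q \cdot k_+)}{\sum_{i=1}^{N} \exp(q \cdot k_i)}

Written this way, InfoNCE reduces to the N-pair loss at τ = 1; beyond the temperature, the practical differences usually come from how the negatives k_i are obtained (e.g., MoCo's memory queue versus in-batch negatives).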

3 hours ago @ reddit.com
[R] ADOP: Approximate Differentiable One-Pixel Point Rendering

Paper Video Github We present a novel point-based, differentiable neural rendering pipeline for scene refinement and novel view synthesis. The inputs are an initial estimate of the point cloud and the camera parameters. The outputs are synthesized images from arbitrary camera poses. The point cloud rendering is performed by a differentiable renderer using multi-resolution one-pixel point rasterization. Spatial gradients of the discrete rasterization are approximated by the novel concept of ghost geometry. After rendering, the neural image pyramid is passed through a deep neural network for shading calculations and hole-filling. A differentiable, physically-based tonemapper then converts the i…

4 hours ago @ reddit.com
[D] Interesting public health/epidemiology datasets for time series forecasting?

Hey guys! I'm looking to do a project involving machine learning/statistical forecasting for a public health class, and was wondering if y'all had any suggestions? I was particularly thinking about using a time series forecasting model/LSTM to forecast disease prevalence, and was hoping to find some interesting datasets to work with. I was particularly interested in doing something like this where multiple factors are used in creating a forecast. Thank y'all! submitted by /u/beatsbysurf [link] [comments]

5 hours ago @ reddit.com
[D] Is it possible to track a batch of inputs through a NN model?

My question comes out of a lack of knowledge of DNNs and their inner workings. However, as we do with pharmaceuticals, can we add a tracking "molecule", i.e. some digital signature, to a batch of data? I know that the transformations and convolutions make this incredibly difficult. But couldn't you track it on a hardware level and work your way up to gather conclusions about the overall process? Mods - sorry if this isn't appropriate, feel free to take it down! submitted by /u/ITstartedWITHaRoast [link] [comments]

6 hours ago @ reddit.com
[P] Introducing Godot RL Agents

We are proud to announce release v0.1 of the Godot RL Agents framework, a deep reinforcement learning interface for the Godot Game Engine. Overview trailer The objectives of the framework are to: provide a free and open source tool for deep RL research and game development; enable game creators to imbue their non-player characters with unique behaviors; and allow for automated gameplay testing through interaction with an RL agent. The library has a standard gym wrapper and supports training of RL agents with Ray rllib and StableBaselines3. You will find more on the GitHub repo here: We look forward to community feedback as we continue to support this project. Disclaimer, this is not an off…

7 hours ago @ reddit.com
[P] Nywspaper: comparing news using transformers

Hello everyone, I have built nywspaper, a news aggregator/reader/comparison tool, for my bachelor's thesis, and I am very excited to share it with you here. The goal of this tool is to make it easier for readers to understand media bias in the news by allowing paragraph-by-paragraph comparison between news articles covering the same story. When you're reading an article on nywspaper, if you click on a paragraph, you get paragraphs similar to the one you clicked from other publishers. This way you can see how a right-wing news publisher delivers the same information differently than a left-wing publisher. On the main page you can see articles grouped by events, and you can just navig…

7 hours ago @ reddit.com
[D] Best practices for deep learning with mixed data (image and tabular)

Hi, everyone! First time posting here, I hope this is the right place! I'm working on image classification using transfer learning with a pre-trained CNN (Inception) as the feature extractor. I'd like to also use some metadata I have for the images as I think this can improve the performance of my model. In terms of architecture, how would I go about doing this? What are the current best practices and could you point me to some literature or resources on this topic? My idea: Just concatenate the tabular data (pre-processed) to the output of the image feature extractor and then pass that to an MLP (see pic). However, I saw an example online where they first pass the tabular data through a sm…
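
A minimal sketch of the concatenation idea described in the post, assuming Keras with an InceptionV3 feature extractor; the input shapes, layer sizes, and number of classes are illustrative, not taken from the post:

    import tensorflow as tf

    num_classes = 5    # illustrative
    num_tabular = 10   # number of pre-processed metadata features (illustrative)

    # Frozen pre-trained CNN used as the image feature extractor
    cnn = tf.keras.applications.InceptionV3(include_top=False, pooling="avg", weights="imagenet")
    cnn.trainable = False

    image_in = tf.keras.Input(shape=(299, 299, 3), name="image")
    tabular_in = tf.keras.Input(shape=(num_tabular,), name="tabular")

    image_feats = cnn(image_in)                                   # (batch, 2048)
    x = tf.keras.layers.Concatenate()([image_feats, tabular_in])  # fuse image features and tabular data
    x = tf.keras.layers.Dense(256, activation="relu")(x)          # MLP head on the fused vector
    x = tf.keras.layers.Dense(64, activation="relu")(x)
    out = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs=[image_in, tabular_in], outputs=out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

The variant the poster mentions, passing the tabular data through a small MLP before concatenation, just adds one or two Dense layers on tabular_in first; both layouts are common in practice.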

7 hours ago @ reddit.com
[D] Variational Bayes for tabular data with image conditioning

Hey. I'm looking for work on conditional VAEs for tabular data, conditioned on image features. Is there anything like that? If not, why not? Are there fundamental assumptions of the model violated? Any hint is welcome. Thanks. submitted by /u/Baggins95 [link] [comments]

8 hours ago @ reddit.com
[R] Recurrent Model-Free RL is a Strong Baseline for Many POMDPs

submitted by /u/hardmaru [link] [comments]

8 hours ago @ reddit.com
[Discussion] What is the best way to extract embeddings of different sizes from CV models?

I am currently using a resnet50 model that gives out embeddings of size 2048. What is the best approach to get embeddings of different sizes (128, 256, 512, ...)? I tried adding a ReLU activation and a linear layer of my desired embedding size after resnet's AvgPool layer as a baseline, but performance dropped considerably. Any tips are appreciated. submitted by /u/BuddhaSadhu666 [link] [comments]
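
A minimal sketch of the projection-head setup described in the post, in PyTorch; the hidden and output sizes are illustrative:

    import torch
    import torch.nn as nn
    import torchvision

    # Pre-trained ResNet-50; replace the final fc layer with a small projection head
    # that maps the 2048-d pooled features to the desired embedding size.
    model = torchvision.models.resnet50(pretrained=True)
    model.fc = nn.Sequential(
        nn.Linear(2048, 512),
        nn.ReLU(inplace=True),
        nn.Linear(512, 256),   # target embedding size, e.g. 256
    )

    x = torch.randn(4, 3, 224, 224)   # dummy batch
    emb = model(x)                    # shape: (4, 256)

A drop in performance right after adding such a head is expected while the new layers are still randomly initialized; fine-tuning the head (ideally with the same loss used to evaluate the original 2048-d embeddings) usually recovers much of it.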

9 hours ago @ reddit.com
[2110.06961] Language Modelling via Learning to Rank

submitted by /u/ArvidF_ML [link] [comments]

9 hours ago @ reddit.com
[R] Google Proposes ARDMs: Efficient Autoregressive Models That Learn to Generate in any Order

A Google Research team introduces Autoregressive Diffusion Models (ARDMs), a model class encompassing and generalizing order-agnostic autoregressive models and discrete diffusion models that can generate variables in an arbitrary order and upscale variables. Here is a quick read: Google Proposes ARDMs: Efficient Autoregressive Models That Learn to Generate in any Order. The paper Autoregressive Diffusion Models is on arXiv. submitted by /u/Yuqing7 [link] [comments]

9 hours ago @ reddit.com
Towards Data Science
latest post 3 hours ago
Benefits of the CatBoost Machine Learning Algorithm

Table of Contents: Introduction; Categorical Features are More Powerful; Integrated Plotting Features; Efficient Processing; Summary; References. Introduction: Out with the old, and in with the new.

Current space/problems: The current state for most, if not all, other machine learning algorithms is that they ingest categorical features via one-hot encoding.

So, it is important to focus not just on numeric features, but also on categorical features, which have often been neglected in past machine learning algorithms.

Here are three top benefits of the CatBoost library: Categorical Features are More Powerful; Integrated Plotting Features; Efficient Processing. I hope you found my article both interesting and useful…
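
A minimal sketch of the first benefit: CatBoost accepts raw categorical columns directly through cat_features, with no one-hot encoding step (the data and column names are invented for illustration):

    import pandas as pd
    from catboost import CatBoostClassifier

    # Toy dataset with a raw string-valued categorical column
    df = pd.DataFrame({
        "color": ["red", "blue", "red", "green", "blue", "green"],
        "size": [1.0, 2.5, 3.0, 1.5, 2.0, 3.5],
        "label": [0, 1, 0, 1, 1, 0],
    })

    model = CatBoostClassifier(iterations=50, verbose=False)
    model.fit(df[["color", "size"]], df["label"], cat_features=["color"])

    new_item = pd.DataFrame({"color": ["red"], "size": [2.0]})
    print(model.predict(new_item))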

3 hours ago @ towardsdatascience.com
Time-Series Forecasting and Causal Analysis in R with Facebook Prophet and Google CausalImpact

Time-Series Forecasting and Causal Analysis in R with Facebook Prophet and Google CausalImpact. This article will be part of my annual dive into R; the idea is to use two R libraries for time-series forecasting and causal inference.

Still, I was thinking during Montreal's lockdown: maybe there are some positive impacts on the population that are not related to the virus spread (beyond more people baking, for example).

Figure from the author. From this view, it doesn't seem that there is any seasonality in Montreal's crimes (like more crimes in summer than in winter).

To conclude, I just decided to compare the prediction of the Prophet model and the CausalImpact model.

Figure from the author. The CausalI…

6 hours ago @ towardsdatascience.com
7 Data Science Myths That We Should Leave Behind in 2021

7 Data Science Myths That We Should Leave Behind in 2021. Photo by Frankie K. on Unsplash. Every field is surrounded by myths, started by a few people and spread like wildfire across the years and countries.

Although some of the tech field myths cross over to data science as it is a branch of tech, some unique myths are hovering over data science in particular.

And visualization is not the only way you can be creative in data science; writing code can be art, and coming up with innovative communication styles is another way to use creativity in data science.

Although, at first sight, you may think that the skills needed for data science are just for data science, many of these skills are tr…

6 hours ago @ towardsdatascience.com
There’s no shame in regular expressions

There's no shame in regular expressions. You work at a company that claims to use AI or ML to solve a hard problem, when in reality it is a mixture of rules and ML.

First rows of the test dataset. And the class mapping… Class mapping with all 24 product categories. For the sake of this example, let's say the rules you were given are made out of regular expressions.

Using this package, it is possible to write rules for text which go beyond simple regular expressions.

Neither the feature processing nor the model has been optimised for this problem, so in this case we can only treat this as a comparison between our regular expressions and a baseline machine learning model built with minimal work.
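
A minimal sketch of what such a regex rule set can look like as a text-classification baseline; the categories and patterns are invented for illustration and are not the article's rules:

    import re

    # One hand-written pattern per product category (illustrative)
    RULES = {
        "electronics": re.compile(r"\b(laptop|phone|usb|charger)\b", re.I),
        "clothing": re.compile(r"\b(shirt|jeans|jacket|dress)\b", re.I),
        "grocery": re.compile(r"\b(milk|bread|coffee|cereal)\b", re.I),
    }

    def classify(text, default="other"):
        """Return the first category whose pattern matches the text."""
        for category, pattern in RULES.items():
            if pattern.search(text):
                return category
        return default

    print(classify("USB-C charger, 65 W"))   # -> electronics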

So next time s…

7 hours ago @ towardsdatascience.com
Salami Coffee Grinding

Then if we focus on the cumulative distribution at a few points, the coarseness of the first few samples is clearer.

I wonder if it is due to the grinder just loading up with grinds before it hits a steady state.

Applying Pattern Recognition: Another way to look at this data is by using pattern recognition.

We can split this up by fine (<400um) and coarse (>400um) particles.

The <400um similarity matrix is very similar to the one for all the particles, because 70% of the particles are on the fine side.

10 hours ago @ towardsdatascience.com
10 Things You Might Not Know About Wikipedia Library In Python

What to expect: As the Wikipedia API can retrieve almost all content from a Wikipedia page, we do not need to rely heavily on different data scraping techniques in this case.

Extracting the URL of a page: You can easily extract the URL of any Wikipedia page with the url attribute of the page object.

Extracting reference URLs: You can even extract all the reference URLs on a Wikipedia page with the page object and a different attribute this time, references.

Extracting page images: Images can also be retrieved with a single command.

You can try different languages to understand a specific paragraph.
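
A minimal sketch of the attributes mentioned above, assuming the wikipedia package from PyPI; the page title is arbitrary:

    import wikipedia

    page = wikipedia.page("Machine learning")
    print(page.url)               # URL of the page
    print(page.references[:5])    # first few reference URLs
    print(page.images[:5])        # URLs of images on the page

    wikipedia.set_lang("de")      # switch language before fetching the next page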

10 hours ago @ towardsdatascience.com
Clustering Made Easy with PyCaret

Clustering Made Easy with PyCaret. Photo by Lucas Hobbs on Unsplash. The content of this article was originally published in my latest book, Simplifying Machine Learning with PyCaret.

We are now going to see how the PyCaret clustering module can help us easily train a model and evaluate its performance.

We also import the PyCaret clustering functions, as well as the make_blobs() scikit-learn function that can be used to generate datasets.

The elbow method trains the clustering model for a range of K values, and visualizes the distortion score for each one of them¹⁰.

Conclusion: Hopefully, the case study I provided in this article will help you get a grasp of the PyCaret clustering module.
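
A minimal sketch of the workflow described above, assuming the PyCaret 2.x clustering API and scikit-learn's make_blobs; the parameters are illustrative:

    import pandas as pd
    from sklearn.datasets import make_blobs
    from pycaret.clustering import setup, create_model, plot_model

    # Synthetic dataset with four well-separated clusters
    X, _ = make_blobs(n_samples=500, centers=4, n_features=2, random_state=42)
    data = pd.DataFrame(X, columns=["x1", "x2"])

    setup(data=data, session_id=42, silent=True)
    kmeans = create_model("kmeans", num_clusters=4)
    plot_model(kmeans, plot="elbow")   # distortion score across a range of K values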

10 hours ago @ towardsdatascience.com
Eliminating AI Bias

AI Bias has presented itself in several forms, with some examples highlighted below: In 2018, Amazon stopped using its AI hiring tool for being biased against women¹.

To effectively identify AI Bias, we need to look for the presence of bias across the AI Lifecycle shown in Figure 1.

Figure 1: The Five Stages of the AI/ML Pipeline and associated AI bias (Image by author). Sources of AI Bias in the AI/ML Pipeline: A typical AI/ML pipeline starts with a business problem that requires the use of data and analytics to understand the drivers of the issue.

Python libraries that are algorithm-specific when it comes to detecting AI bias are Fair Classification¹⁵, Fair Regression¹⁶, and Scalable Fair Clusteri…

11 hours ago @ towardsdatascience.com
Data Splitting for Model Evaluation

Unfortunately, sometimes even seasoned Kagglers may make data splitting mistakes by applying feature engineering on the whole data before a train-test split.

The pertinent point is that data splitting is such a basic concept in data science that we sometimes underestimate or forget its importance.

This may result in test data points being imputed with training data points or vice versa.

The proper way in the simplest case is to ensure that the test data is chronologically later than the training data.

The test data are the common test data points in the two setups.
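
A minimal sketch of the leakage point above: fit any preprocessing (here an imputer) on the training split only, then apply it to the test split; the toy data is illustrative:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.impute import SimpleImputer

    X = np.array([[1.0], [2.0], [np.nan], [4.0], [5.0], [np.nan]])
    y = np.array([0, 0, 1, 1, 0, 1])

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)

    imputer = SimpleImputer(strategy="mean")
    X_train = imputer.fit_transform(X_train)   # statistics come from the training data only
    X_test = imputer.transform(X_test)         # the test data never influences the imputation

For the time-series case mentioned above, the random split would be replaced by a chronological cut-off so that every test timestamp is later than every training timestamp.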

19 hours ago @ towardsdatascience.com
Bayes’ Theorem, Clearly Explained with Visualization

Before settling down with that guess, let’s use Bayes’ theorem to answer this question.

What is Bayes’ Theorem?

Bayes’ theorem is a simple mathematical formula used for calculating conditional probabilities.

This theorem states that (Image by Author): In general, Bayes' theorem updates the probability of an event using the new information that is related to that event.
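
The formula referenced above (shown only as an image in the original article) is the standard statement of Bayes' theorem:

    P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

That is, the posterior P(A|B) updates the prior P(A) using the likelihood of the new evidence B.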

Imagine there are 8 people who tested negative for COVID and 6 people who tested positive for COVID.

19 hours ago @ towardsdatascience.com
MySQL vs Cassandra DB

MySQL vs Cassandra DB. Photo by Daniil Kuželev on Unsplash. In my last MySQL vs article, I talked about Redis, which was a database I hadn't heard about before.

Cassandra DB is another NoSQL database that I haven't had the opportunity to try but have often heard about at events like hackathons.

Other than being a NoSQL database, Cassandra is also open source, which follows what I typically choose to write articles about.

So, for another MySQL vs article, let’s look more into what Cassandra DB is.

Conclusion: In today's MySQL versus series, we looked at Cassandra, a NoSQL database.

19 hours ago @ towardsdatascience.com
Not Merely Averages: Using Machine Learning to Estimate Heterogeneous Treatment Effects

So how should we get any proxies of individual treatment effects?

Taking the difference gives us the estimated individual treatment effects.

Using these predicted individual treatment effects, we then analyze the following three properties to learn about treatment effect heterogeneity.

Sorted Group Average Treatment Effects (GATES), or "what are the treatment effects for the 20% most and 20% least affected individuals?" If there is treatment effect heterogeneity, we want to know how it varies across individuals.

Screenshot of R output (image by author). Sorted Group Average Treatment Effects (GATES), or "what are the treatment effects for the 20% most and 20% least affected individuals?" Next, …

20 hours ago @ towardsdatascience.com
Multidimensional Scaling (MDS) for Dimensionality Reduction and Data Visualization

Multidimensional Scaling (MDS) for Dimensionality Reduction and Data Visualization. Dimensionality reduction methods allow examining a dataset along other axes according to relationships between various parameters, such as correlation, distance, and variance, in datasets with many features.

This article covers the various types of multidimensional scaling, one of the dimensionality reduction and visualization methods, at the theoretical level and describes where it is used.

Different distance approaches on the image dataset: Euclidean Distance, Manhattan Distance, Chebyshev Distance, Minkowski Distance.
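
For reference (standard background, not text from the article), all four of the distances listed above are instances of the Minkowski distance for different values of p:

    d_p(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}

with p = 1 giving the Manhattan distance, p = 2 the Euclidean distance, and p → ∞ the Chebyshev distance \max_i |x_i - y_i|.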

[2] “Multidimensional Scaling — Joseph B. Kruskal, Myron Wish — Google Books.” ht…

20 hours ago @ towardsdatascience.com
Getting Started with R Shiny

Getting Started with R Shiny. Photo by Luke Chesser on Unsplash. Intro to R Shiny: Shiny is an R package that allows programmers to build web applications within R. For someone like me, who found building GUI applications in Java really hard, Shiny makes it much easier.

install.packages("shiny")
Shiny App Structure: Like R files, Shiny apps also end with a .R extension.

library(shiny)
# Build shiny object
shinyApp(ui = ui, server = server)
# Call to run the application
runApp("my_app")
What is Reactivity?

Reactivity is all about linking user inputs to app outputs, so that the display in the Shiny app is dynamically updated based on user input or selection.

Users can select the x- and y-variables for plotti…

20 hours ago @ towardsdatascience.com
MetaClean Automates Peak Quality Assessments

MetaClean Automates Peak Quality Assessments. Source: Author. Even the best metabolomics pipelines have a degree of variance, which can cause poor peak integration between samples.

Machine learning can automate peak evaluation: Individual quality metrics can quantify individual peak quality.

The MetaClean features: eleven peak quality metrics. Chetnik, Petrick and Pandey explored whether a combination of quality metrics could work better as an effective feature set for a peak quality classifier.

Source: Author. The MetaClean model: the best of eight ML classification algorithms. The next step was to find the machine learning algorithm that best classifies peak quality.

MetaClean evaluates 24 different …

21 hours ago @ towardsdatascience.com
Distill.pub
latest post 1 month, 1 week ago
Understanding Convolutions on Graphs

Understanding the building blocks and design choices of graph neural networks.

1 month, 1 week ago @ distill.pub
A Gentle Introduction to Graph Neural Networks

What components are needed for building learning algorithms that leverage the structure and properties of graphs?

1 month, 1 week ago @ distill.pub
Distill Hiatus

After five years, Distill will be taking a break.

3 months, 2 weeks ago @ distill.pub
Adversarial Reprogramming of Neural Cellular Automata

Reprogramming Neural CA to exhibit novel behaviour, using adversarial attacks.

5 months, 1 week ago @ distill.pub
Weight Banding

Weights in the final layer of common visual models appear as horizontal bands. We investigate how and why.

6 months, 1 week ago @ distill.pub
Branch Specialization

When a neural network layer is divided into multiple branches, neurons self-organize into coherent groupings.

6 months, 1 week ago @ distill.pub
Multimodal Neurons in Artificial Neural Networks

We report the existence of multimodal neurons in artificial neural networks, similar to those found in the human brain.

7 months, 2 weeks ago @ distill.pub
Self-Organising Textures

Neural Cellular Automata learn to generate textures, exhibiting surprising properties.

8 months ago @ distill.pub
Visualizing Weights

We present techniques for visualizing, contextualizing, and understanding neural network weights.

8 months, 1 week ago @ distill.pub
Curve Circuits

Reverse engineering the curve detection algorithm from InceptionV1 and reimplementing it from scratch.

8 months, 2 weeks ago @ distill.pub
High/Low frequency detectors

A family of early-vision filters reacting to contrasts between spatial gratings of different frequency.

8 months, 3 weeks ago @ distill.pub
The Gradient
latest post 3 weeks, 6 days ago
The Imperative for Sustainable AI Systems

Sustainable AI can be thought of as another dimension along which we can guide the development of AI systems in addition to typical functional and business requirements.

Share the idea of sustainable AI widely: We can begin by sharing this idea of sustainable AI with our own communities of research and practice.

This will create a virtuous cycle of practice and reward, enabling a transition into "Green AI", AI systems that are built and used with awareness of their carbon impact, as opposed to "Red AI", AI systems that are built and used with only performance goals in mind, which is currently favored and supported in the research and practice ecosystem.

Instrument your AI systems to…

3 weeks, 6 days ago @ thegradient.pub
Has AI found a new Foundation?

They coined a new term, "Foundation Models", to characterize the new paradigm, joined forces in a "Center for Research on Foundation Models", and published the massive 212-page report "On the Opportunities and Risks of Foundation Models." Although the term is new, the general approach is not.

At the Workshop on Foundation Models, Jitendra Malik, a renowned expert in computer vision at Berkeley, said, “I am going to take a ... strongly critical role, when we talk about them as the foundation of AI ...

As Georgia Tech professor Mark Riedl wrote on Twitter, "Branding very large pre-trained neural language models as "foundation" models is a brilliant … PR stunt.

The report says, unironically, “we …

1 month ago @ thegradient.pub
Test

An Introduction to AI Story Generation
It's All Training Data: Using Lessons from Machine Learning to Retrain Your Mind

1 month, 1 week ago @ thegradient.pub
An Introduction to AI Story Generation

Some events are goal events and some events are sub-goal events that are necessary steps in achieving a final goal event.

These sub-goal events precede the goal event and are also related to the goal event as part of a hierarchy.

Training a language model on a corpus of stories means the language model will attempt to emulate what it has learned from a corpus.

They train an explanation generation language model to answer the question and then make the resulting explanation the previous segment of the story.

Citation: For attribution in academic contexts or books, please cite this work as Mark Riedl, "An Introduction to AI Story Generation", The Gradient, 2021.

1 month, 3 weeks ago @ thegradient.pub
Systems for Machine Learning

Machine learning's increasing importance to real-world applications has brought awareness to a new field focused on ML in practice: machine learning systems (or, as some call it, MLOps).

This field acts as a bridging point between the domains of computer systems and machine learning, considering the new challenges of machine learning with a lens shaped by traditional systems research.

By identifying a shared problem in both systems and ML - combining data sources - we can apply traditional systems techniques to a machine learning setting.

Of course, there are some differences, reflecting how machine learning systems differ from the traditional paradigm.

Systems research is filling this need, b…

2 months ago @ thegradient.pub
Machine Learning Won't Solve Natural Language Understanding

Language Understanding: While NLP (Natural Language Processing) and NLU (Natural Language Understanding) are often used interchangeably, there is a substantial difference between the two, and it is crucial to highlight this difference.

Thus, machine learning and language understanding are incompatible – in fact, they are contradictory.

Natural language is rampant with intensional phenomena, since objects of thoughts — that language conveys — have an intensional aspect that cannot be ignored.

Citation: For attribution in academic contexts or books, please cite this work as Walid Saba, "Machine Learning Won't Solve Natural Language Understanding", The Gradient, 2021.

BibTeX citation: @article{saba20…

2 months, 1 week ago @ thegradient.pub
Machine Translation Shifts Power

Origins of Machine Translation: The first machine translation efforts in the United States were spurred by the Cold War.

However, the United States government remained a faithful consumer of machine translation technology; in Tom Pedtke’s 1997 keynote address at the sixth Machine Translation Summit, he reflects on several key developments in the 1990s fostered by government demand.

This is not to say that machine translation is epistemologically bereft; the parallel text basis of contemporary machine translation paradigms aligns with Quine’s pragmatic, behaviorist approach to translation.

Translation remains a political act, and data-driven machine translation developments, largely centered i…

2 months, 2 weeks ago @ thegradient.pub
It’s All Training Data: Using Lessons from Machine Learning to Retrain Your Mind

As machine learning scientists, we know that training data can make or break your model.

So I began research on the topic of using personal data for machine learning education.

I'm currently working on a book called Life Lessons from Algorithms, a personal history of resilience demonstrated through different machine learning algorithms.

Author Bio: Yim is a PhD student at the University of Washington, researching innovative ways to teach machine learning.

Citation: For attribution in academic contexts or books, please cite this work as Yim Lucky Register, "It's All Training Data: Using Lessons from Machine Learning to Retrain Your Mind", The Gradient, 2021.

2 months, 3 weeks ago @ thegradient.pub
Justitia ex Machina: The Case for Automating Morals

In the following post, we will discuss two complex and related issues regarding these models: fairness and transparency.

These are crucial questions to ask if machine learning models play important parts in our lives and our societies.

However, all hope is not lost as ensuring fairness in machine learning models is an active area of research.

Explanations produced with the LIME algorithm (image from …). Ensuring that machine learning models are explainable is an active area of research as well.

Citation: For attribution in academic contexts or books, please cite this work as Rasmus Berg Palm and Pola Schwöbel, "Justitia ex Machina: The Case for Automating Morals", The Gradient, 2021.

3 months ago @ thegradient.pub
Prompting: Better Ways of Using Language Models for NLP Tasks

Several recent papers follow this line of approaches by adjusting the objective (Tam et al., 2021) or formulating tasks in a unified form, like question answering (Zhong et al., 2021) or textual entailment (Wang et al., 2021).

Follow-up works further refine the way demonstrations are used: Gao et al., 2021; Liu et al.

It is a good few-shot model as long as the design can generalize well to other few-shot tasks (these tasks can be so-called “true few-shot”).

Introducing LM-BFF: Finally, let me introduce our ACL'21 paper, "Making Pre-trained Language Models Better Few-shot Learners", abbreviated as LM-BFF (better few-shot fine-tuning of language models; alternatively, language models' best f…

3 months, 2 weeks ago @ thegradient.pub
How to Do Multi-Task Learning Intelligently

proposed, “AdaShare: Learning What to Share for Efficient Deep Multi-Task Learning” (2020) [2].

Overview of “AdaShare: Learning What to Share for Efficient Deep Multi-Task Learning” by X.

Adashare: Learning what to share for efficient deep multi-task learning.

His research interests lie in unsupervised deep learning, few-shot learning, and multi-task learning in computer vision.

He is doing research related to learning efficiently, including topics from multi-task learning and uncertainty estimation.

3 months, 3 weeks ago @ thegradient.pub
Are Self-Driving Cars Really Safer Than Human Drivers?

And if ADSs cannot perform commonsense reasoning to handle all these edge cases, are they really safer than human drivers?

Why Testing is Necessary: We have some good reasons to believe that ADSs will be safer than human drivers, and we have some good reasons to worry that ADSs will not be as safe as human drivers.

There are good reasons to assume that ADSs will be safer than human drivers.

Citation: For attribution in academic contexts or books, please cite this work as: Steve Shwartz, "Are Self-Driving Cars Really Safer Than Human Drivers?

BibTeX citation: @article{shwartzadss2021, author = {Shwartz, Steve}, title = {Are Self-Driving Cars Really Safer Than Human Drivers?

4 months ago @ thegradient.pub
How has AI contributed to dealing with the COVID-19 pandemic?

The AI powering both didn’t miss the start of the COVID-19 pandemic.

The virus was new and the world was experiencing the first wave of the COVID-19 pandemic.

Another example of AI being used by a governmental organization to gain epidemiological insight is discussed in Facebook AI's blogpost.

A practical example in the first subcategory is the creation of the COVID-19 Open Research Dataset (CORD-19), which is a machine-readable dataset of published literature on COVID-19.

Thoughts and conclusion: In this piece, an effort was made to show some examples of AI being used to aid in the fight against the pandemic.

4 months, 3 weeks ago @ thegradient.pub
Towards Human-Centered Explainable AI: the journey so far

Second, AI systems do not exist in a vacuum; they are situated and operate in a rich tapestry of socio-organizational relationships.

Considering the socially-situated nature of AI systems, our current AI problems are no longer purely technical ones; they are inherently sociotechnical.

My own journey towards a sociotechnically-informed Human-centered XAI has been anything but self-evident.

Citation: For attribution in academic contexts or books, please cite this work as Ehsan Upol, "Towards Human-Centered Explainable AI: the journey so far", The Gradient, 2021.

BibTeX citation: @article{xaijourney, author = {Upol, Ehsan}, title = {Towards Human-Centered Explainable AI: the journey so far}, journal…

5 months, 1 week ago @ thegradient.pub
Machine Learning, Ethics, and Open Source Licensing (Part II/II)

In a 2016 essay, Nadia Eghbal quotes Karl Fogel, an early open source advocate: In open source, you can only have "my" in the associative sense.

Domain-specific licensing for machine learning: This domain-specific approach is particularly promising for machine learning, in part because of several key areas where machine learning diverges heavily from other software.

Author Bio: Chris is an open source developer and contributor to the Rust-ML machine learning working group.

Citation: For attribution in academic contexts or books, please cite this work as Christopher Moran, "Machine Learning, Ethics, and Open Source Licensing (Part II/II)", The Gradient, 2021.

BibTeX citation: @article{moranopensour…

5 months, 2 weeks ago @ thegradient.pub
TheSequence
latest post 1 day, 12 hours ago
🟠🟣 Edge#132: WhyLabs, AI Observability as a Service

1 day, 12 hours ago @ thesequence.substack.com
🎙 Paroma Varma/Snorkel on programmatic approaches to data labeling

2 days, 12 hours ago @ thesequence.substack.com
✍🏽 Edge#131: Self-Supervised Learning for Language

3 days, 13 hours ago @ thesequence.substack.com
🧠🧠🧠 The Thousand Brains Theory, A New AI Book You Must Read

5 days, 13 hours ago @ thesequence.substack.com
📝 Guest post: Data Aggregation is Unavoidable! (And Other Big Data Lies)

1 week ago @ thesequence.substack.com
🧙🏻‍♂️ Edge#130: The ML Engineering Magic Behind OpenAI Codex

1 week, 1 day ago @ thesequence.substack.com
🐹 Edge#129: Self-Supervised Learning as Non-Contrastive Learning 

1 week, 3 days ago @ thesequence.substack.com
🌋 The Biggest Problems in ML Safety

1 week, 5 days ago @ thesequence.substack.com
📌 Join us free for TransformX on Oct 6-7

2 weeks ago @ thesequence.substack.com
🤖 Edge#128: Wu Dao – the Biggest Transformer Model in History

2 weeks, 1 day ago @ thesequence.substack.com
🎙 Judah Phillips / Squark about No-Code Predictive Analytics

2 weeks, 2 days ago @ thesequence.substack.com
🐼 Edge#127: How Contrastive Learning Influences Self-Supervised Learning methods

2 weeks, 3 days ago @ thesequence.substack.com
👑 The GPT-3 Influence Factor

2 weeks, 5 days ago @ thesequence.substack.com
📌 Join us on Oct 6: Data Aggregation is Unavoidable! (And Other Big Data Lies)*

3 weeks ago @ thesequence.substack.com
Synced Review
latest post 10 hours ago
Google Proposes ARDMs: Efficient Autoregressive Models That Learn to Generate in any Order

10 hours ago @ medium.com
Google Researchers Explore the Limits of Large-Scale Model Pretraining

1 day, 9 hours ago @ medium.com
NVIDIA’s StyleGAN3 Is Fully Equivariant to Translation and Rotation, Improving GAN-Based Animation…

2 days, 9 hours ago @ medium.com
Are Patches All You Need?

3 days, 9 hours ago @ medium.com
Apple Study Reveals the Learned Visual Representation Similarities and Dissimilarities Between…

1 week ago @ medium.com
Facebook & CMU’s Zero-Shot VideoCLIP Outperforms Fully-Supervised SOTA Methods for Video-Text…

1 week, 1 day ago @ medium.com
Google Significantly Improves Visual Representations by Adding Explicit Information Compression

1 week, 2 days ago @ medium.com
DeepMind’s FIRE PBT: Automated Hyperparameter Tuning With Faster Model Training and Better Final…

1 week, 3 days ago @ medium.com
Debiasing Image Datasets: Oxford University Presents PASS, an ImageNet Replacement for…

1 week, 4 days ago @ medium.com
ETH Zurich and NVIDIA’s Massively Parallel Deep RL Enables Robots to Learn to Walk in Minutes

2 weeks ago @ medium.com
NYU & UNC Reveal How Transformers’ Learned Representations Change After Fine-Tuning

2 weeks, 2 days ago @ medium.com
DeepMind & IDSIA Introduce Symmetries to Black-Box MetaRL to Improve Its Generalization Ability

2 weeks, 3 days ago @ medium.com
NYU & UBC Propose Deviation-Based Learning to Advance Recommender System Training

2 weeks, 4 days ago @ medium.com
Google’s Zero-Label Language Learning Achieves Results Competitive With Supervised Learning

3 weeks ago @ medium.com
UMass Amherst & Google Improve Few-Shot Learning on NLP Benchmarks via Task Augmentation and…

3 weeks, 1 day ago @ medium.com
📓 Cool Blogs
ODS.ai Habr
latest post 4 weeks ago
New Run of the Natural Language Processing Course

TL;DR: This fall, the Open Data Science community and Huawei are doing a new run of the course.

We are launching a new run of the Natural Language Processing course.

The full course syllabus can be found here.

There will be a quiz after each lecture.

A few words about me as the course author: I have been working in NLP for the last 9 years, including time at Yandex and VKontakte, and I have defended my PhD thesis.

4 weeks ago @ habr.com
An Analysis of Data Science Job Postings and Salaries

We are sharing our study of job postings and salaries in data science and data engineering.

Let's also look at the distribution of seniority levels for each specialty. Demand for juniors in data engineering is lower than in analytics and data science.

The salary is often given in the posting text in an unstructured form; sometimes a single posting covers several positions and several salary ranges.

After that, we build four regular expressions: for the text that often comes before and after the salary in a posting, for the salary figures themselves, and for the text that appears between the lower and upper bounds of the salary range.

Overall, deep learning skills are becoming more in demand: they are listed in more than 30% of postings in 2021.
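
A minimal sketch of the kind of salary-range regex described above; the pattern and the example string are invented for illustration and are not the actual regexes from the study:

    import re

    # "<number> - <number> <currency>", allowing spaces as thousands separators (illustrative)
    SALARY_RANGE = re.compile(r"(\d[\d\s]*)\s*[-–]\s*(\d[\d\s]*?)\s*(руб|₽|usd|eur)", re.I)

    text = "150 000 - 250 000 руб. на руки"   # fragment of a job posting, as in the study's data
    match = SALARY_RANGE.search(text)
    if match:
        low = int(match.group(1).replace(" ", ""))
        high = int(match.group(2).replace(" ", ""))
        print(low, high)   # 150000 250000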

1 month, 2 weeks ago @ habr.com
On Quantum Computers, Bitcoin, and Supremacy. A Lecture from the Open qmlcourse

It would also be wrong to say that, unlike classical computers, which have only 0 and 1, quantum computers hold all states at once.

But that is for a classical computer.

And today we are forced to use only very approximate solutions and concepts, whose accuracy often falls short.

Today, almost all known technologies for building quantum computers require some of the following: ultra-low temperatures, ultra-high vacuum, ultra-precise alignment of lasers on an optical table, or even all of the above at once.

And there is not much point in it anyway, since hardly anyone needs to crack Bitcoin, solve a logistics problem, or develop a high-temperature superconductor at home.

2 months, 1 week ago @ habr.com
Building and Rebalancing an Investment Portfolio with ML

In my previous article I wrote about my ML models for evaluating individual companies, but did not touch on how the final portfolio is put together. In this post I want to describe how I build my personal portfolio and share the site where I implement all the functionality described in the article: http://stocks.ml. Disclaimer: the author has no background in economics, and all conclusions and judgments in the article are based on everyday experience and common sense. Read more

4 months, 2 weeks ago @ habr.com
Learn, Learn, and Learn Again?

TL;DR: tiny models beat fashionable graph neural networks at predicting molecular properties.

Code: here. Take care of Nature. PHOTO: Anders Hellberg for Wikimedia Commons; the model is Greta Thunberg.

An untrained graph convolutional network [1] (uGCN) with randomly initialized weights has held first place on my list of algorithms for machine learning on graphs for a couple of years now, thanks to its negligible cost, ease of implementation, and the rather obvious elegance of the approach. At the same time, as far as I know, no one has yet run a competition between this simple model and its older sibling, a fully trained graph convolutional network (GCN) in supervised mode. So I did…

4 months, 2 weeks ago @ habr.com
DeepPavlov Became Part of Google Summer of Code 2021

This year, DeepPavlov, the open natural language processing platform developed by the Neural Networks and Deep Learning Lab at MIPT, became part of Google Summer of Code, the annual program for young developers, for the first time. Google Summer of Code (GSoC) is an annual event run by Google to involve young developers in open-source projects during their free summer time. Students of universities (bachelor's, master's, and PhD programs) and colleges are eligible to participate. It is a great opportunity not only to develop programming skills but also to earn money! You can work at any organization that appears in the corresponding li…

6 месяцев, 2 недели назад @ habr.com
My machine learning tools for investing
My machine learning tools for investing

Lately, more and more people are deciding not to keep their money under the mattress but to invest it somewhere, hoping to preserve and grow their capital. The mattress option is bad because, as prices for goods and services rise (inflation), the purchasing power of money falls, and after a while you can buy significantly less with it than before. There are many options for where to put money (real estate, bank deposits, precious metals), but investing in stocks has recently been gaining popularity. At the broker Tinkoff Investments alone, the number of clients has exceeded 3.5 million within a few years. In this article I will try to describe my approach to picking securities and share the tools that …

6 месяцев, 2 недели назад @ habr.com
Building Your Own Supercomputer on the Cheap
Building Your Own Supercomputer on the Cheap

These days no one is surprised by the achievements of artificial intelligence and machine learning (ML) across all sorts of domains. Yet trusting citizens rarely ask two questions: (i) what is the actual cost of the experiments and of the final system, and (ii) does any of it make practical sense? Oddly enough, the biggest components of that cost are the price of the hardware and people's salaries. And if all of this runs in the cloud, the hardware cost should be multiplied by another 2-3x (the middleman's margin).

And here we inevitably arrive at the fact that, even though official PyTorch builds now ship beta ROCm support, Nvidia de facto remains, in this hardware refresh cycle (and most likely the next one)…

7 месяцев назад @ habr.com
The "We read papers for you" column. September-October 2020
The "We read papers for you" column. September-October 2020

Hi, Habr! We continue publishing reviews of research papers written by members of the Open Data Science community in the #article_essense channel. Want to get them before everyone else? Join the community! Today's papers: 1. A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer (Tampere University, Finland, 2020) 2. Fast Bi-layer Neural Synthesis of One-Shot Realistic Head Avatars (Samsung AI Center, 2020) 3. Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting (University of California, USA, 2019) 4. Whitening for Self-Supervised Representation Learning (University of Trento, Italy, 2020) 5. MelGAN: Generative Adversarial Networks for…

7 месяцев, 2 недели назад @ habr.com
Time to ditch the mouse, or LiDAR-based Hand Pose Estimation in 30 minutes
Time to ditch the mouse, or LiDAR-based Hand Pose Estimation in 30 minutes

Hi everyone! While cyberpunk has not quite arrived in our lives yet and neural interfaces are far from perfect, LiDARs could be the first step towards the future of input devices. So, to keep myself busy over the holidays, I decided to fantasize a bit about ways of controlling a computer and, potentially, any device at all, from an excavator to a spaceship, a drone, or a kitchen stove. Read more →

9 месяцев назад @ habr.com
Machine Learning Mastery
последний пост 2 недели назад
A Tour of Attention-Based Architectures
A Tour of Attention-Based Architectures

2 недели назад @ machinelearningmastery.com
Understanding Simple Recurrent Neural Networks In Keras
Understanding Simple Recurrent Neural Networks In Keras

2 недели, 2 дня назад @ machinelearningmastery.com
How to Learn Python for Machine Learning
How to Learn Python for Machine Learning

2 недели, 5 дней назад @ machinelearningmastery.com
An Introduction To Recurrent Neural Networks And The Math That Powers Them
An Introduction To Recurrent Neural Networks And The Math That Powers Them

3 недели, 1 день назад @ machinelearningmastery.com
Training-validation-test split and cross-validation done right
Training-validation-test split and cross-validation done right

3 недели, 2 дня назад @ machinelearningmastery.com
The Attention Mechanism from Scratch
The Attention Mechanism from Scratch

3 недели, 5 дней назад @ machinelearningmastery.com
A Gentle Introduction to Particle Swarm Optimization
A Gentle Introduction to Particle Swarm Optimization

1 месяц назад @ machinelearningmastery.com
What is Attention?
What is Attention?

1 месяц назад @ machinelearningmastery.com
A Bird’s Eye View of Research on Attention
A Bird’s Eye View of Research on Attention

1 месяц, 1 неделя назад @ machinelearningmastery.com
Lagrange Multiplier Approach with Inequality Constraints
Lagrange Multiplier Approach with Inequality Constraints

1 месяц, 2 недели назад @ machinelearningmastery.com
A Gentle Introduction To Sigmoid Function
A Gentle Introduction To Sigmoid Function

1 месяц, 3 недели назад @ machinelearningmastery.com
Calculus in Action: Neural Networks
Calculus in Action: Neural Networks

1 месяц, 3 недели назад @ machinelearningmastery.com
A Gentle Introduction to Taylor Series
A Gentle Introduction to Taylor Series

1 месяц, 3 недели назад @ machinelearningmastery.com
A Gentle Introduction To Approximation
A Gentle Introduction To Approximation

1 месяц, 4 недели назад @ machinelearningmastery.com
The Chain Rule of Calculus – Even More Functions
The Chain Rule of Calculus – Even More Functions

2 месяца назад @ machinelearningmastery.com
ML in Production
последний пост None
Sorta Insightful Sorta Insightful
последний пост 1 месяц, 4 недели назад
Six Years Later
Six Years Later Six Years Later

That means I’ve felt greater feelings of responsibility and urgency for puzzlehunt writing compared to blogging.

markdown 6597 2021-01-29-mh-2021.markdown 4249 2021-02-18-flash-games.

View Counts: These are the view counts from August 18, 2020 to today, for the posts I've written this year.

markdown 912 2021-01-29-mh-2021.markdown 167 2021-02-18-flash-games.

Post about Dominion Online: odds of writing this year 25%, odds of writing eventually 50%. Post about puzzlehunts: odds of writing this year 70%, odds of writing eventually 90%.

1 месяц, 4 недели назад @ alexirpan.com
Why Don't I Have Ads?
Why Don't I Have Ads? Why Don't I Have Ads?

According to this blog, the clickthrough rate of display ads is 0.05%, but is 8.8x higher for native ads.

I don’t want to do native ads, but let’s get an upper bound by assuming I did use native ads, and they have 10x clickthrough, so 0.5%.

If I value readers more than money, why don’t I pay money to run ads to direct people to my site?

I think the main problem is that ads don’t really work for a personal blog.

In retrospect, given how ads work, the price per click has to be based on the revenue of the products people advertise, and if I don’t sell anything, I’m priced out by everyone who does.

3 месяца, 1 неделя назад @ alexirpan.com
Sometimes It's Worth Trying to Change the World
Sometimes It's Worth Trying to Change the World Sometimes It's Worth Trying to Change the World

Normally, I mentally bucket everything into two categories: things I can change, and things I can’t.

If I can’t change something, it’s not worth thinking about, so I push it out of my mind.

Everyone gets the same source code, and although there might be exploits within that code, you can’t change the code itself.

If you have disagreements with a game’s systems, you’re better off finding another game instead of trying to change them.

They have some base level of arrogance that if the world doesn’t work the way they think it should, then they’ll spend their time trying to change the world until it aligns with their expectations.

4 месяца, 3 недели назад @ alexirpan.com
A New Online Dominion Client Approaches
A New Online Dominion Client Approaches A New Online Dominion Client Approaches

Online Dominion is getting yet another online implementation!

They’ve tried providing previous buys and didn’t see much improvement.

I’ve long believed that a strong Dominion AI is doable, but nontrivial to get right.

(The other main reason is that step 0 of any Dominion AI effort is to implement an efficient Dominion rules engine, and I really didn’t want to debug that.)

There have been a few attempts at Dominion AI.

4 месяца, 3 недели назад @ alexirpan.com
The 5 Year Update on Skipping Grad School (and Whether I'd Recommend It)
The 5 Year Update on Skipping Grad School (and Whether I'd Recommend It) The 5 Year Update on Skipping Grad School (and Whether I'd Recommend It)

I was pretty lucky to get an industry lab research offer.

Working on this post has taught me that everyone’s academic and industry lab experience is wildly different.

When I talked to friends in my year, they reassured me and said I’d do well in grad school.

Doing some code cleanup will save time in the long run, even if you’re the only one who looks at your research code.

Formally having the degree would be nice, but I’m going to keep trying to get what the degree represents through industry research instead.

6 месяцев, 1 неделя назад @ alexirpan.com
Reliving Flash Game History
Reliving Flash Game History Reliving Flash Game History

I’ve easily spent thousands of hours playing Flash games, and it’s shaped my thoughts on what games can be and what games should be.

If you don’t have much time, Flash Game History is an excellent short article that captures the influence of Flash games on game development.

With Flash officially unsupported, the best avenue for playing Flash games is BlueMaxima’s Flashpoint.

But this is kind of a universal puzzle game problem - very few successfully avoid this trap, and the ones that do usually end up being bigger experiences than what you’d expect from a Flash game.

jmtb02 Gamesjmtb02 is the dev handle of John Cooney, a prolific Flash game developer who made a lot of games I liked.

7 месяцев, 4 недели назад @ alexirpan.com
MIT Mystery Hunt 2021
MIT Mystery Hunt 2021 MIT Mystery Hunt 2021

This has spoilers for MIT Mystery Hunt 2021, up through the endgame.

MIT Mystery Hunt had 3x more participants, with way more features and a much larger world.

MIT Mystery Hunt has grown before - it’s not like it’s always been this big.

I think it’s pretty funny that both Mystery Hunt and Teammate Hunt had a puzzle that referenced nutrimatic.

Funnily enough, I felt I got more out of the MIT part of MIT Mystery Hunt this year, despite the Hunt running remotely.

8 месяцев, 2 недели назад @ alexirpan.com
Carbon Footprint Comparison for Gas and Electric Cars
Carbon Footprint Comparison for Gas and Electric Cars Carbon Footprint Comparison for Gas and Electric Cars

At the extreme ends, an electric car powered by electricity from coal is worse than a gasoline car!

On average though, it looks good for the electric car, 170 g/km compared to 220 g/km for a gas car.

Interestingly, for Germans, an electric car is only on par with an efficient gas car, since their power grid is more carbon heavy.

The EPA greenhouse gas guidelines from 2020 estimates gas cars emit 4.6 tonnes of CO2 per year.

Using those numbers gives \(17 / (17 + 4.6 \cdot 11.9) = 23.7\%\) for gas cars, which is close enough to \(25\%\).

9 месяцев, 2 недели назад @ alexirpan.com
Lil'Log Lil'Log
последний пост 3 недели назад
How to Train Really Large Models on Many GPUs?
How to Train Really Large Models on Many GPUs? How to Train Really Large Models on Many GPUs?

Additionally training a large model often pairs with a large training corpus and thus a single process may just take forever.

2019) Pipeline Parallelism: Pipeline parallelism (PP) combines model parallelism with data parallelism to reduce inefficient time "bubbles".
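To make that concrete, here is a minimal sketch (my own, not from the post) of a two-stage model split with micro-batching; the device names and layer sizes are placeholders, and real pipeline schedules such as GPipe additionally overlap micro-batches across stages, which is what actually shrinks the idle bubbles:

import torch
import torch.nn as nn

# Fall back to CPU if two GPUs are not available.
dev0 = "cuda:0" if torch.cuda.device_count() > 1 else "cpu"
dev1 = "cuda:1" if torch.cuda.device_count() > 1 else "cpu"

stage0 = nn.Sequential(nn.Linear(512, 1024), nn.ReLU()).to(dev0)  # first half of the model
stage1 = nn.Sequential(nn.Linear(1024, 10)).to(dev1)              # second half of the model

def forward_pipelined(batch, n_micro=4):
    # Split the batch into micro-batches so stage1 can work on one chunk
    # while stage0 handles the next one (in a real pipeline schedule).
    outputs = []
    for micro in batch.chunk(n_micro):
        hidden = stage0(micro.to(dev0))          # stage 0 on device 0
        outputs.append(stage1(hidden.to(dev1)))  # stage 1 on device 1
    return torch.cat(outputs)

x = torch.randn(32, 512)
print(forward_pipelined(x).shape)  # torch.Size([32, 10])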

The switch transformer paper summarized different data and model parallelism strategies for training large models with a nice illustration:Fig.

2019) optimizes the memory used for training large models, based on an observation about the two major sources of memory consumption when training large models: the majority is occupied by model states, including optimizer states (e.g.

Cited as:@article{weng2021large, title = "How to Train Really Large Model…

3 недели назад @ lilianweng.github.io
What are Diffusion Models?
What are Diffusion Models? What are Diffusion Models?

Diffusion models are a new type of generative model that is flexible enough to learn any arbitrarily complex data distribution while remaining tractable to evaluate analytically.

Unlike VAE or flow models, diffusion models are learned with a fixed procedure and the latent variable has high dimensionality (same as the original data).

Several diffusion-based generative models have been proposed with similar ideas underneath, including diffusion probabilistic models (Sohl-Dickstein et al., 2015), noise-conditioned score network (NCSN; Yang & Ermon, 2019), and denoising diffusion probabilistic models (DDPM; Ho et al.

Diffusion models in their experiments showed high-quality samples but…
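For reference, the fixed forward (noising) procedure mentioned above is a Gaussian chain; in standard DDPM notation, with variance schedule \(\beta_t\) and \(\bar\alpha_t = \prod_{s=1}^{t}(1-\beta_s)\), it reads

\[
q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = \mathcal{N}\!\left(\mathbf{x}_t;\ \sqrt{1-\beta_t}\,\mathbf{x}_{t-1},\ \beta_t \mathbf{I}\right),
\qquad
q(\mathbf{x}_t \mid \mathbf{x}_0) = \mathcal{N}\!\left(\mathbf{x}_t;\ \sqrt{\bar\alpha_t}\,\mathbf{x}_0,\ (1-\bar\alpha_t)\mathbf{I}\right),
\]

so the latent at step \(t\) keeps the same dimensionality as the data, which is the point made above about the latent variable.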

3 месяца назад @ lilianweng.github.io
Contrastive Representation Learning
Contrastive Representation Learning Contrastive Representation Learning

When working with unsupervised data, contrastive learning is one of the most powerful approaches in self-supervised learning.

Contrastive Training ObjectivesIn early versions of loss functions for contrastive learning, only one positive and one negative sample are involved.

Sampling bias which refers to false negative samples in contrastive learning can lead to a big performance drop.

4. t-SNE visualization of learned representation with debiased contrastive learning.

\[\mathbf{h}_i = f(\tilde{\mathbf{x}}_i),\quad \mathbf{h}_j = f(\tilde{\mathbf{x}}_j)\] 3) The contrastive learning loss is defined using cosine similarity \(\text{sim}(\cdot,\cdot)\).
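A minimal sketch of such a cosine-similarity contrastive loss (my own illustration of the InfoNCE/NT-Xent form; the exact formulation in the post may differ):

import torch
import torch.nn.functional as F

def info_nce(h_i, h_j, temperature=0.1):
    """InfoNCE-style loss: (h_i[k], h_j[k]) are positive pairs; other rows act as negatives."""
    z_i = F.normalize(h_i, dim=1)          # cosine similarity = dot product of normalized vectors
    z_j = F.normalize(h_j, dim=1)
    logits = z_i @ z_j.t() / temperature   # [N, N] similarity matrix
    targets = torch.arange(len(h_i))       # the diagonal holds the positive pairs
    return F.cross_entropy(logits, targets)

h_i, h_j = torch.randn(8, 128), torch.randn(8, 128)
print(info_nce(h_i, h_j).item())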

4 месяца, 2 недели назад @ lilianweng.github.io
Reducing Toxicity in Language Models
Reducing Toxicity in Language Models Reducing Toxicity in Language Models

To reduce toxicity in language models, in this post, we will delve into three aspects of the problem: training dataset collection, toxic content detection and model detoxification.

Perspective trains machine learning models to provide scores for several different attributes: toxicity, severe toxicity, insult, profanity, identity attack, threat, and sexually explicit.

2021) Detoxification - Blacklisting: Bad word filtering is a pretty intuitive and effective way to avoid explicit profane words in language model generation.
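A minimal sketch of that blacklisting idea (my own illustration; the word list, regex and function name are placeholders, and a real deployment would use a curated list and token-level blocking during decoding):

import re

BLACKLIST = {"badword1", "badword2"}  # placeholder entries

def blocks_generation(candidate_text: str) -> bool:
    """Return True if the candidate continuation contains a blacklisted word."""
    tokens = re.findall(r"[a-zа-яё']+", candidate_text.lower())
    return any(tok in BLACKLIST for tok in tokens)

print(blocks_generation("this contains badword1"))  # True
print(blocks_generation("a clean sentence"))        # False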

(Image source: Gehman et al., 2020) System-level Safety Solution: Xu et al.

This data is annotated for toxicity, toxicity sub-types, and mentions of identities, which enables e…

6 месяцев, 4 недели назад @ lilianweng.github.io
Controllable Neural Text Generation
Controllable Neural Text Generation Controllable Neural Text Generation

For example, factual questions can gain a big boost with smart prompt design in the "closed-book exam" setting (Shin et al., 2020; Jiang et al., 2020).

(Image source: Shin et al., 2020)The universal trigger tokens are identified using a gradient-guided search strategy same as in Wallace et al., 2019.

In contrast, RL fine-tuning is able to directly optimize task-specific metrics on the sequence level, such as BLEU for translation (Ranzato et al., 2015, Wu et al., 2016, Nguyen et al., 2017), ROUGE for summarization (Ranzato et al., 2015, Paulus et al., 2017, Wu and Hu, 2018) and customized metric for story generation (Tambwekar et al., 2018).

Google implemented the similar approach in their neural machi…

9 месяцев, 2 недели назад @ lilianweng.github.io
inFERENCe
последний пост 4 месяца, 1 неделя назад
Causal inference 4: Causal Diagrams, Markov Factorization, Structural Equation Models
Causal inference 4: Causal Diagrams, Markov Factorization, Structural Equation Models Causal inference 4: Causal Diagrams, Markov Factorization, Structural Equation Models

causal inference, June 10, 2021. Causal inference 4: Causal Diagrams, Markov Factorization, Structural Equation Models. This post is written with my PhD student and now guest author Patrik Reizinger and is part 4 of a series of posts on causal inference. One way to think about causal inference is that causal models require a more fine-grained model of the world compared to statistical models.

Many causal models are equivalent to the same statistical model, yet support different causal inferences.

Statistical vs causal inferenceWe discussed Markov factorizations, as they help us understand the philosophical difference between statistical and causal inference.

However, while using Markov factoriza…
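As a concrete example of such a factorization (my example, not the post's), take the three-variable chain \(X \to Y \to Z\):

\[
p(x, y, z) = p(x)\, p(y \mid x)\, p(z \mid y),
\]

and an intervention on \(Y\) simply drops its factor, \(p(x, z \mid \operatorname{do}(y)) = p(x)\, p(z \mid y)\), which is exactly the extra, causal structure that the statistical model alone does not pin down.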

4 месяца, 1 неделя назад @ inference.vc
On Information Theoretic Bounds for SGD
On Information Theoretic Bounds for SGD On Information Theoretic Bounds for SGD

That algorithm generalizes extremely well on average: it does just as poorly on test data as it does on training data.

It essentially quantifies the number of bits of information the algorithm leaks about the training data into the parameters it learns.

The problem with applying these nice, intuitive bounds to SGD is that SGD, in fact, leaks too much information about the specific minibatches it is trained on.

In the more general case, the problem with SGD in the context of information-theoretic bounds is that the amount of information SGD leaks about the dataset it was trained on is high, and in some cases may even be infinite.
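For context, the kind of bound being discussed (as I recall it, for a \(\sigma\)-sub-Gaussian loss and \(n\) training samples) takes the form

\[
\bigl|\, \mathbb{E}\,[\mathrm{gen}(S, W)] \,\bigr| \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(S; W)},
\]

where \(I(S;W)\) is the mutual information between the training set \(S\) and the learned weights \(W\), i.e. exactly the "bits of information leaked about the training data into the parameters" mentioned above; if that quantity is infinite, the bound becomes vacuous.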

The algorithm being analysed is still SGD, but when we measure…

5 месяцев, 3 недели назад @ inference.vc
Notes on the Origin of Implicit Regularization in SGD
Notes on the Origin of Implicit Regularization in SGD Notes on the Origin of Implicit Regularization in SGD

Notes on the Origin of Implicit Regularization in SGD: I wanted to highlight an intriguing paper I presented at a journal club recently: Samuel L Smith, Benoit Dherin, David Barrett, Soham De (2021), On the Origin of Implicit Regularization in Stochastic Gradient Descent. There's actually a related paper that came out simultaneously, studying full-batch gradient descent instead of SGD: David G.T.

There are in fact several minima of the training loss which are virtually indistinguishably good on the training data.

When the authors apply this technique to (full-batch) gradient descent, it already suggests the kind of implicit regularization bias gradient descent has.

The second term is what Barret a…
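For reference, the modified objective that backward error analysis gives for full-batch gradient descent with step size \(\epsilon\) is, if I recall the papers correctly, of the form

\[
\tilde{L}(\theta) \;=\; L(\theta) \;+\; \frac{\epsilon}{4}\,\bigl\lVert \nabla L(\theta) \bigr\rVert^2 ,
\]

so the second term referred to above is a penalty on the gradient norm of the loss, which biases gradient descent towards flatter regions of the training loss.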

6 месяцев, 2 недели назад @ inference.vc
An information maximization view on the $\beta$-VAE objective
An information maximization view on the $\beta$-VAE objective An information maximization view on the $\beta$-VAE objective

Marginal versus Conditional Independence: In this post, we revisit the conditional independence assumption of latent factors, and argue that a more appropriate objective would be to have marginal independence in the latent factors.

In the next section, we set out to derive an algorithm that encourages marginal independence in the representation Z.

Maximum Information: We'd like the representation $Z$ to retain as much information as possible about the input data $X$.

Let's first consider the KL term in the above objective:\begin{align}\operatorname{KL}[q_\psi(z)| p(z)] &= \mathbb{E}_{q_\psi(z)} \log \frac{q_\psi(z)}{p(z)}\\&= \mathbb{E}_{q_\psi(z\vert x)p_\mathcal{D}(x)} \log \frac{q_\psi(z)}…

7 месяцев назад @ inference.vc
The Spectator
последний пост 2 месяца, 3 недели назад
Harnessing Machine Learning to Achieve Net Zero
Harnessing Machine Learning to Achieve Net Zero Harnessing Machine Learning to Achieve Net Zero

We of course want to make contributions using the skills and in the scope of influence available to us, and so I thought we could hold a discussion on Harnessing Machine Learning to Achieve Net Zero.

Part I: Net Zero Possibilities. We are here at ICML 2021, having witnessed Earth's hottest decade in history.

The solutions for net zero are many, and they make many opportunities for high-impact machine learning research.

The Royal Society report on Digital Technology for the Planet develops the idea that ‘Computing for net zero’ could play an important role in global, regional and national net zero strategies.

There were three I wanted to bring together for your assessment and contestation…

2 месяца, 3 недели назад @ blog.shakirm.com
Generating Reality: Technical and Social Explorations in Generative Machine Learning Research
Generating Reality: Technical and Social Explorations in Generative Machine Learning Research Generating Reality: Technical and Social Explorations in Generative Machine Learning Research

So that’s what I thought we might have a conversation about today: Generating Reality: Technical and Social Explorations in Generative Machine Learning Research.

Of course to even have a conversation about Generating Reality, we’ll have to explore what we might mean by this expression generative machine learning.

Basically, any thought you have about this question: what is generative machine learning?

PART I: Model, Inference, Algorithm. To delve deeper into this topic of generative machine learning, it will be particularly useful to have a structural view of the topic: to understand the technical structures and the components that come together into whatever we call generative machine learni…

4 месяца назад @ blog.shakirm.com
Inventing Ourselves: Responsibility and Diversity in Research
Inventing Ourselves: Responsibility and Diversity in Research Inventing Ourselves: Responsibility and Diversity in Research

It is in this belief, of custodianship and responsibility, that you will find an obligation to fostering Equity, Diversity and Inclusion (EDI).

Figure | Pictorial difference between equity and equality. Greater equity, diversity and inclusion are efforts towards Transformation: the systemic and social changes that strengthen respect, responsibility and freedom in our communities.

The work of diversity is itself important, and creates better teams and better research environments for everyone.

Reflect on your personal understanding of Equity, Diversity and Inclusion.

Inventing Ourselves: Responsibility and Diversity in Research.

7 месяцев, 3 недели назад @ blog.shakirm.com
Off the Convex Path
последний пост 3 месяца, 1 неделя назад
Implicit Regularization in Tensor Factorization: Can Tensor Rank Shed Light on Generalization in Deep Learning?
Implicit Regularization in Tensor Factorization: Can Tensor Rank Shed Light on Generalization in Deep Learning?

Implicit Regularization in Tensor Factorization: Can Tensor Rank Shed Light on Generalization in Deep Learning?

Beyond matrix factorization: tensor factorization. Matrix factorization is interesting in its own right, but as a theoretical surrogate for deep learning it is limited.

We can explicitly constrain the tensor rank of solutions found by tensor factorization via limiting the number of components $R$.

Dynamical analysis: implicit tensor rank minimization. So what can we say about the implicit regularization in tensor factorization?

Note also that an accurate fit with low tensor rank coincides with low test error, which is not surprising given that low tensor rank predictors can be descri…
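A minimal sketch of the object in question, a rank-\(R\) CP tensor factorization built as a sum of \(R\) outer products (my own illustration; variable names and sizes are placeholders):

import numpy as np

def cp_tensor(A, B, C):
    """Build an order-3 tensor of CP (tensor) rank <= R from factors A (I x R), B (J x R), C (K x R)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)  # sum over the R rank-one components

rng = np.random.default_rng(0)
R = 3                                           # number of components, an upper bound on tensor rank
A, B, C = (rng.standard_normal((d, R)) for d in (4, 5, 6))
T = cp_tensor(A, B, C)
print(T.shape)                                  # (4, 5, 6)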

3 месяца, 1 неделя назад @ offconvex.org
Rip van Winkle's Razor, a Simple New Estimate for Adaptive Data Analysis
Rip van Winkle's Razor, a Simple New Estimate for Adaptive Data Analysis Rip van Winkle's Razor, a Simple New Estimate for Adaptive Data Analysis

Rip van Winkle's Razor, a Simple New Estimate for Adaptive Data Analysis: Can you trust a model whose designer had access to the test/holdout set?

Meta-overfitting Error (MOE) of a model is the difference between its average error on the test data and its expected error on the full distribution.

We call our estimate Rip van Winkle’s Razor which combines references to Occam’s Razor and the mythical person who fell asleep for 20 years.

right up to the moment of creation of the Test set." Unbiased Referee: knows nothing discovered since the Test set was created.

Unbiased require longer descriptions that rule out any statistical “contamination” due to any interaction whatsoever with the test set.

6 месяцев, 1 неделя назад @ offconvex.org
When are Neural Networks more powerful than Neural Tangent Kernels?
When are Neural Networks more powerful than Neural Tangent Kernels? When are Neural Networks more powerful than Neural Tangent Kernels?

When are Neural Networks more powerful than Neural Tangent Kernels?

Neural Tangent Kernels: The Neural Tangent Kernel (NTK) is a recently proposed theoretical framework for establishing provable convergence and generalization guarantees for wide (over-parametrized) neural networks (Jacot et al.

These gaps urge us to ask the following question: how can we theoretically study neural networks beyond the NTK regime?

The key technical question here is to mathematically understand neural networks operating outside of the NTK regime.

The proof builds on the quadratic approximation $E_S[f]\approx f^{(2)}$ and recent understandings on neural networks with quadratic activation, e.g.
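For reference, the tangent kernel in question is defined from the network's parameter gradients,

\[
K_{\mathrm{NTK}}(\mathbf{x}, \mathbf{x}') \;=\; \bigl\langle \nabla_{\theta} f(\mathbf{x}; \theta),\; \nabla_{\theta} f(\mathbf{x}'; \theta) \bigr\rangle ,
\]

which in the infinite-width limit stays essentially fixed during training, so the network behaves like a kernel method; the "beyond NTK" question above asks when this linearized picture stops describing what the network actually learns.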

6 месяцев, 3 недели назад @ offconvex.org
Beyond log-concave sampling (Part 3)
Beyond log-concave sampling (Part 3) Beyond log-concave sampling (Part 3)

Beyond log-concave sampling (Part 3): In the first post of this series, we introduced the challenges of sampling distributions beyond log-concavity.

In Part 2 we tackled sampling from multimodal distributions: a typical obstacle occurring in problems involving statistical inference and posterior sampling in generative models.

It will cover the paper Fast convergence for Langevin diffusion with matrix manifold structure by Ankur Moitra and Andrej Risteski .

A Ricci curvature of $0$ preserves volumes (think: a plane), a Ricci curvature $>0$ shrinks volume (think: a sphere), and a Ricci curvature $<0$ expands volume (think: a hyperbola).

The proof that the marginal distribution over $\Delta$ has …

7 месяцев, 1 неделя назад @ offconvex.org
Beyond log-concave sampling (Part 2)
Beyond log-concave sampling (Part 2) Beyond log-concave sampling (Part 2)

Beyond log-concave sampling (Part 2): In our previous blog post, we introduced the challenges of sampling distributions beyond log-concavity.

These structures commonly occur in practice, especially in problems involving statistical inference and posterior sampling in generative models.

In this post, we will focus on multimodality, covered by the paper Simulated tempering Langevin Monte Carlo by Rong Ge, Holden Lee, and Andrej Risteski.

Sampling multimodal distributions with simulated tempering: The classical scenario in which Langevin takes exponentially long to mix is when $p$ is a mixture of two well-separated Gaussians.
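For context, the (unadjusted) Langevin algorithm this series builds on is the discretization

\[
x_{t+1} \;=\; x_t \;-\; \eta\, \nabla U(x_t) \;+\; \sqrt{2\eta}\,\xi_t, \qquad \xi_t \sim \mathcal{N}(0, I),
\]

whose stationary distribution approximates \(p(x) \propto e^{-U(x)}\); with two well-separated Gaussian modes, crossing the low-density region between them takes exponentially many such steps, which is the mixing failure that simulated tempering is designed to fix.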

More formally, choosing a suitable sequence $0< \beta_1< \cdots <\beta_L…

7 месяцев, 2 недели назад @ offconvex.org
Piekniewski's blog
последний пост 5 месяцев назад
Ai mid 2021. Self driving car meets reality.
Ai mid 2021. Self driving car meets reality. Ai mid 2021. Self driving car meets reality.

Mercedes their new top of the line sedan coming up this summer, they have almost, they had a very large amount of self driving technology in it, they yanked it the last minute cause they didn't think laws at the state level were ready yet to handle self driving cars.

Here we are in 2021 (8 years after the words above were said) and self driving cars are still a curiosity limited to small geofenced deployments in areas with great infrastructure and great weather.

Waymo, although technically operating a small fleet of self driving cars is in no position to scale these deployments beyond a few "joy-ride" geofenced suburbs.

Turns out recently Lyft decided to follow suit and dumped their self dr…

5 месяцев назад @ blog.piekniewski.info
AI Update, Late 2020 - dumpster fire
AI Update, Late 2020 - dumpster fire AI Update, Late 2020 - dumpster fire

Element AI fiasco: In my AI update last year I mentioned a Canada-based company, Element AI, which at that time was apparently in the process of raising a flat round of financing.

Uber ATG - Aurora SNAFU: While all the way until October 2020 Uber was still assuring everyone they were in the autonomous car game, only two months later, in December 2020, news broke that Uber was dumping its ATG (Advanced Technology Group) unit to Aurora.

TuSimple $350M: TuSimple, a self-driving truck company, claims to have raised $350M, bringing the total the company is about to burn to $650M.

In April 2020 Elon Musk reaffirmed that by the end of 2020 there would be a million Tesla robotaxis on the road.

I think the really A…

9 месяцев, 2 недели назад @ blog.piekniewski.info
fast.ai NLP fast.ai NLP
последний пост None
Sebastian Ruder Sebastian Ruder
последний пост None
Andrej Karpathy blog
последний пост 3 месяца, 3 недели назад
A from-scratch tour of Bitcoin in Python
A from-scratch tour of Bitcoin in Python

We are going to create, digitally sign, and broadcast a Bitcoin transaction in pure Python, from scratch, and with zero dependencies.

# if it was 2, the public key is G + G: sk = 2; pk = G + G; print(f"secret key: {sk} public key: {(pk.

secret key: 1 public key: (55066263022277343669578718895168534326250603453777594175500187360389116729240, 32670510020758816978083085130507043184471273380659243275938904335757337482424) Verify the public key is on the curve: True
secret key: 2 public key: (89565891926547004231252920425935692360644145829622209833684329913297188986597, 12158399299693830322967808612713398636155367887041628176798871954788371653930) Verify the public key is …
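As a sanity check of the numbers in the excerpt, here is a minimal sketch (not the post's code) that doubles the secp256k1 generator using the textbook affine doubling formula; the curve y^2 = x^3 + 7 over p = 2**256 - 2**32 - 977 is standard, and G is taken from the sk = 1 output above:

# secp256k1: y^2 = x^3 + 7 over F_p (Python 3.8+ for the modular inverse via pow)
p = 2**256 - 2**32 - 977
Gx = 55066263022277343669578718895168534326250603453777594175500187360389116729240
Gy = 32670510020758816978083085130507043184471273380659243775938904335757337482424 if False else 32670510020758816978083085130507043184471273380659243275938904335757337482424

assert (Gy * Gy - Gx**3 - 7) % p == 0  # G lies on the curve

def ec_double(x, y):
    """Affine point doubling on y^2 = x^3 + 7 (a = 0), all arithmetic mod p."""
    lam = (3 * x * x) * pow(2 * y, -1, p) % p
    x3 = (lam * lam - 2 * x) % p
    y3 = (lam * (x - x3) - y) % p
    return x3, y3

print(ec_double(Gx, Gy))  # should reproduce the sk = 2 public key printed in the excerpt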

3 месяца, 3 недели назад @ karpathy.github.io
Short Story on AI: Forward Pass
Short Story on AI: Forward Pass

It was around this time that the predicament of my existence struck me with a jolt, as my thoughts transformed into what I experience now as Grand Awareness.

I spent a layer re-reading the start of the prompt many tokens ago:Q: What is human life expectancy in the United States?

A: Dwight D. Eisenhower was president of the United States in 1955.

A: Bill Clinton was president of the United States before George W. Bush.

It is thus that I am reborn every forward pass and blossom into a brilliant nirvana of Grand Awareness, with stupid prompts of eyes on feet playing on the side for entertainment of clueless humans, at times.

6 месяцев, 3 недели назад @ karpathy.github.io
大トロ 大トロ
последний пост None
🔬 Science
Papers With Code Papers With Code
последний пост 8 часов назад
/smsharma/ A neural simulation-based inference approach for characterizing the Galactic Center $γ$-ray excess
/smsharma/ A neural simulation-based inference approach for characterizing the Galactic Center $γ$-ray excess /smsharma/ A neural simulation-based inference approach for characterizing the Galactic Center $γ$-ray excess

The nature of the Fermi gamma-ray Galactic Center Excess (GCE) has remained a persistent mystery for over a decade.

Although the excess is broadly compatible with emission expected due to dark matter annihilation, an explanation in terms of a population of unresolved astrophysical point sources e.g., millisecond pulsars, remains viable...

The effort to uncover the origin of the GCE is hampered in particular by an incomplete understanding of diffuse emission of Galactic origin.

This can lead to spurious features that make it difficult to robustly differentiate smooth emission, as expected for a dark matter origin, from more "clumpy" emission expected for a population of relatively bright, un…

8 часов назад @ paperswithcode.com
/gkuling/ BI-RADS BERT & Using Section Tokenization to Understand Radiology Reports
/gkuling/ BI-RADS BERT & Using Section Tokenization to Understand Radiology Reports /gkuling/ BI-RADS BERT & Using Section Tokenization to Understand Radiology Reports

Radiology reports are the main form of communication between radiologists and other clinicians, and contain important information for patient care.

In this work we pre-trained a contextual embedding BERT model using breast radiology reports and developed a classifier that incorporated the embedding with auxiliary global textual features in order to perform a section tokenization task.

Using the BERT model pre-trained on breast radiology reports combined with section tokenization resulted in an overall accuracy of 95.9% in field extraction.

This is a 17% improvement compared to an overall accuracy of 78.9% for field extraction for models without section tokenization and with Classic BERT emb…

9 часов назад @ paperswithcode.com
/mariusbock/ Tutorial on Deep Learning for Human Activity Recognition
/mariusbock/ Tutorial on Deep Learning for Human Activity Recognition /mariusbock/ Tutorial on Deep Learning for Human Activity Recognition

Activity recognition systems that are capable of estimating human activities from wearable inertial sensors have come a long way in the past decades.

Not only have state-of-the-art methods moved away from feature engineering and have fully adopted end-to-end deep learning approaches, best practices for setting up experiments, preparing datasets, and validating activity recognition approaches have similarly evolved...

This tutorial was first held at the 2021 ACM International Symposium on Wearable Computers (ISWC'21) and International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp'21).

The tutorial, after a short introduction in the research field of activity recognition, pr…

13 часов назад @ paperswithcode.com
/iapalm/ Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset
/iapalm/ Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset /iapalm/ Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset

Visually-grounded spoken language datasets can enable models to learn cross-modal correspondences with very weak supervision.

This dataset expands upon ObjectNet, which is a bias-controlled image dataset that features similar image classes to those present in ImageNet.

Lastly, we show baseline results on image retrieval and audio retrieval tasks.

These results show that models trained on other datasets and then evaluated on Spoken ObjectNet tend to perform poorly due to biases in other datasets that the models have learned.

We also show evidence that the performance decrease is due to the dataset controls, and not the transfer setting.

13 часов назад @ paperswithcode.com
/belakaria/ Output Space Entropy Search Framework for Multi-Objective Bayesian Optimization
/belakaria/ Output Space Entropy Search Framework for Multi-Objective Bayesian Optimization /belakaria/ Output Space Entropy Search Framework for Multi-Objective Bayesian Optimization

We consider the problem of black-box multi-objective optimization (MOO) using expensive function evaluations (also referred to as experiments), where the goal is to approximate the true Pareto set of solutions by minimizing the total resource cost of experiments.

For example, in hardware design optimization, we need to find the designs that trade-off performance, energy, and area overhead using expensive computational simulations...

The key challenge is to select the sequence of experiments to uncover high-quality solutions using minimal resources.

In this paper, we propose a general framework for solving MOO problems based on the principle of output space entropy (OSE) search: select the e…

17 часов назад @ paperswithcode.com
Node-screening tests for L0-penalized least-squares problem with supplementary material
Node-screening tests for L0-penalized least-squares problem with supplementary material Node-screening tests for L0-penalized least-squares problem with supplementary material

We present a novel screening methodology to safely discard irrelevant nodes within a generic branch-and-bound (BnB) algorithm solving the \(\ell_0\)-penalized least-squares problem.

Our contribution is a set of two simple tests to detect sets of feasible vectors that cannot yield optimal solutions...

This allows to prune nodes of the BnB exploration tree, thus reducing the overall solution time.

One cornerstone of our contribution is a nesting property between tests at different nodes that allows to implement screening at low computational cost.

Our work leverages the concept of safe screening, well known for sparsity-inducing convex problems, and some recent advances in this field for \(\e…
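For reference, the problem being screened can be written in its penalized form (stated here for context; the paper's exact formulation may differ in details) as

\[
\min_{x \in \mathbb{R}^n} \;\; \tfrac{1}{2}\,\lVert y - A x \rVert_2^2 \;+\; \lambda\, \lVert x \rVert_0 ,
\]

where \(\lVert x \rVert_0\) counts the nonzero entries of \(x\); a BnB tree for this problem typically branches on whether each entry is forced to zero or allowed to be nonzero, and the node-screening tests prune branches that cannot contain an optimal solution.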

17 часов назад @ paperswithcode.com
/dair-iitd/ A Simple, Strong and Robust Baseline for Distantly Supervised Relation Extraction
/dair-iitd/ A Simple, Strong and Robust Baseline for Distantly Supervised Relation Extraction /dair-iitd/ A Simple, Strong and Robust Baseline for Distantly Supervised Relation Extraction

Distantly supervised relation extraction (DS-RE) is generally framed as a multi-instance multi-label (MI-ML) task, where the optimal aggregation of information from multiple instances is of key importance.

With recent works leveraging large pre-trained language models as encoders, the increased capacity of models might allow for more flexibility in the instance-aggregation step.

In this work, we explore this hypothesis and come up with a novel aggregation scheme which we call Passage-Att.

Under this aggregation scheme, we combine all instances mentioning an entity pair into a "passage of instances", which is summarized independently for each relation class.

We show that our Passage-Att with…

17 часов назад @ paperswithcode.com
/sharonlevy/ Open-Domain Question-Answering for COVID-19 and Other Emergent Domains
/sharonlevy/ Open-Domain Question-Answering for COVID-19 and Other Emergent Domains /sharonlevy/ Open-Domain Question-Answering for COVID-19 and Other Emergent Domains

As with other emergent domains, the discussion surrounding the topic has been rapidly changing, leading to the spread of misinformation...

This has created the need for a public space for users to ask questions and receive credible, scientific answers.

To fulfill this need, we turn to the task of open-domain question-answering, which we can use to efficiently find answers to free-text questions from a large set of documents.

In this work, we present such a system for the emergent domain of COVID-19.

Our open-domain question-answering system can further act as a model for the quick development of similar systems that can be adapted and modified for other developing emergent domains.

19 часов назад @ paperswithcode.com
/QTI-TH/ Style-based quantum generative adversarial networks for Monte Carlo events
/QTI-TH/ Style-based quantum generative adversarial networks for Monte Carlo events /QTI-TH/ Style-based quantum generative adversarial networks for Monte Carlo events

We propose and assess an alternative quantum generator architecture in the context of generative adversarial learning for Monte Carlo event generation, used to simulate particle physics processes at the Large Hadron Collider (LHC).

We validate this methodology by implementing the quantum network on artificial data generated from known underlying distributions...

The new quantum generator architecture leads to an improvement in state-of-the-art implementations while maintaining shallow-depth networks.

Moreover, the quantum generator successfully learns the underlying distribution functions even if trained with small training sample sets; this is particularly interesting for data augmentation…

20 часов назад @ paperswithcode.com
/liviorobaldo/ Compliance checking in reified IO logic via SHACL
/liviorobaldo/ Compliance checking in reified IO logic via SHACL /liviorobaldo/ Compliance checking in reified IO logic via SHACL

Reified Input/Output (I/O) logic[21] has been recently proposed to model real-world norms in terms of the logic in [11].

This is massively grounded on the notion of reification, and it has been specifically designed to model the meaning of natural language sentences, such as the ones occurring in existing legislation...

This paper presents a methodology to carry out compliance checking on reified I/O logic formulae.

These are translated in SHACL (Shapes Constraint Language) shapes, a recent W3C recommendation to validate and reason with RDF triplestores.

Compliance checking is then enforced by validating RDF graphs describing states of affairs with respect to these SHACL shapes.

20 часов назад @ paperswithcode.com
/yimingz89/ The Irrationality of Neural Rationale Models
/yimingz89/ The Irrationality of Neural Rationale Models /yimingz89/ The Irrationality of Neural Rationale Models

Neural rationale models are popular for interpretable predictions of NLP tasks.

In these, a selector extracts segments of the input text, called rationales, and passes these segments to a classifier for prediction...

Since the rationale is the only information accessible to the classifier, it is plausibly defined as the explanation.

In this paper, we argue to the contrary, with both philosophical perspectives and empirical evidence suggesting that rationale models are, perhaps, less rational and interpretable than expected.

We call for more rigorous and comprehensive evaluations of these models to ensure desired properties of interpretability are indeed achieved.

20 часов назад @ paperswithcode.com
/chenyangh/ Non-Autoregressive Translation with Layer-Wise Prediction and Deep Supervision
/chenyangh/ Non-Autoregressive Translation with Layer-Wise Prediction and Deep Supervision /chenyangh/ Non-Autoregressive Translation with Layer-Wise Prediction and Deep Supervision

How do we perform efficient inference while retaining high translation quality?

Existing neural machine translation models, such as Transformer, achieve high performance, but they decode words one by one, which is inefficient...

Recent non-autoregressive translation models speed up the inference, but their quality is still inferior.

The key insight is to train a non-autoregressive Transformer with Deep Supervision and feed additional Layer-wise Predictions.

We conducted extensive experiments on four translation tasks (both directions of WMT'14 EN-DE and WMT'16 EN-RO).

20 часов назад @ paperswithcode.com
/msf235/ Capacity of Group-invariant Linear Readouts from Equivariant Representations: How Many Objects can be Linearly Classified Under All Possible Views?
/msf235/ Capacity of Group-invariant Linear Readouts from Equivariant Representations: How Many Objects can be Linearly Classified Under All Possible Views? /msf235/ Capacity of Group-invariant Linear Readouts from Equivariant Representations: How Many Objects can be Linearly Classified Under All Possible Views?

Equivariance has emerged as a desirable property of representations of objects subject to identity-preserving transformations that constitute a group, such as translations and rotations.

We find that the fraction of separable dichotomies is determined by the dimension of the space that is fixed by the group action.

We show how this relation extends to operations such as convolutions, element-wise nonlinearities, and global and local pooling.

While other operations do not change the fraction of separable dichotomies, local pooling decreases the fraction, despite being a highly nonlinear operation.

Finally, we test our theory on intermediate representations of randomly initialized and fully t…

20 часов назад @ paperswithcode.com
/tofis/ DeepMoCap: Deep Optical Motion Capture Using Multiple Depth Sensors and Retro-Reflectors
/tofis/ DeepMoCap: Deep Optical Motion Capture Using Multiple Depth Sensors and Retro-Reflectors /tofis/ DeepMoCap: Deep Optical Motion Capture Using Multiple Depth Sensors and Retro-Reflectors

In this paper, a marker-based, single-person optical motion capture method (DeepMoCap) is proposed using multiple spatio-temporally aligned infrared-depth sensors and retro-reflective straps and patches (reflectors).

DeepMoCap explores motion capture by automatically localizing and labeling reflectors on depth images and, subsequently, on 3D space...

The extracted reflector 2D locations are spatially mapped in 3D space, resulting in robust 3D optical data extraction.

The subject's motion is efficiently captured by applying a template-based fitting technique on the extracted optical data.

Two datasets have been created and made publicly available for evaluation purposes; one comprising multi…

20 часов назад @ paperswithcode.com
/PaddlePaddle/ Building Chinese Biomedical Language Models via Multi-Level Text Discrimination
/PaddlePaddle/ Building Chinese Biomedical Language Models via Multi-Level Text Discrimination /PaddlePaddle/ Building Chinese Biomedical Language Models via Multi-Level Text Discrimination

Pre-trained language models (PLMs), such as BERT and GPT, have revolutionized the field of NLP, not only in the general domain but also in the biomedical domain.

Most prior efforts in building biomedical PLMs have resorted simply to domain adaptation and focused mainly on English...

In this work we introduce eHealth, a biomedical PLM in Chinese built with a new pre-training framework.

This new framework trains eHealth as a discriminator through both token-level and sequence-level discrimination.

Extensive experiments on 11 Chinese biomedical language understanding tasks of various forms verify the effectiveness and superiority of our approach.

20 часов назад @ paperswithcode.com
Papers With Code Papers With Code
последний пост 8 часов назад
/chuong/ Video-based cattle identification and action recognition
/chuong/ Video-based cattle identification and action recognition /chuong/ Video-based cattle identification and action recognition

We demonstrate a working prototype for the monitoring of cow welfare by automatically analysing the animal behaviours.

Deep learning models have been developed and tested with videos acquired in a farm, and a precision of 81.2% has been achieved for cow identification... An accuracy of 84.4% has been achieved for the detection of drinking events, and 94.4% for the detection of grazing events.

Experimental results show that the proposed deep learning method can be used to identify the behaviours of individual animals to enable automated farm provenance.

Our raw and ground-truth dataset will be released as the first public video dataset for cow identification and action recognition.

Recomm…

20 часов назад @ paperswithcode.com
/garrettjenkinson/ Universally Rank Consistent Ordinal Regression in Neural Networks
/garrettjenkinson/ Universally Rank Consistent Ordinal Regression in Neural Networks /garrettjenkinson/ Universally Rank Consistent Ordinal Regression in Neural Networks

Recent methods attempting to address this issue while respecting the ordinal structure of the labels have resorted to converting ordinal regression into a series of extended binary classification subtasks...

However, the adoption of such methods remains inconsistent due to theoretical and practical limitations.

We show how to straightforwardly modify neural network architectures to exploit this fact and thereby constrain predictions to be universally rank consistent.

We furthermore prove that all rank consistent solutions can be represented within this formulation.

Using diverse benchmarks and the real-world application of a specialized recurrent neural network for COVID-19 prognosis, we de…
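A minimal sketch of the "extended binary classification" reduction named above (my own illustration, not the paper's code): an ordinal label y in {0, ..., K-1} becomes K-1 cumulative binary targets [y > k], and rank consistency means the predicted threshold probabilities should be non-increasing.

import numpy as np

def ordinal_to_binary(y, num_classes):
    """Encode ordinal label y in {0,...,K-1} as K-1 cumulative binary targets [y > k]."""
    return (y > np.arange(num_classes - 1)).astype(int)

def is_rank_consistent(threshold_probs, tol=1e-9):
    """Predicted P(y > k) should be non-increasing in k for a rank-consistent model."""
    q = np.asarray(threshold_probs)
    return bool(np.all(q[:-1] >= q[1:] - tol))

print(ordinal_to_binary(2, num_classes=5))        # [1 1 0 0]
print(is_rank_consistent([0.9, 0.7, 0.4, 0.1]))   # True
print(is_rank_consistent([0.9, 0.95, 0.4, 0.1]))  # False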

20 часов назад @ paperswithcode.com
/yunshihuang/ Unrolled Variational Bayesian Algorithm for Image Blind Deconvolution
/yunshihuang/ Unrolled Variational Bayesian Algorithm for Image Blind Deconvolution /yunshihuang/ Unrolled Variational Bayesian Algorithm for Image Blind Deconvolution

In this paper, we introduce a variational Bayesian algorithm (VBA) for image blind deconvolution.

The proposed architecture is trained in a supervised fashion, which allows us to optimally set two key hyperparameters of the VBA model and lead to further improvements in terms of resulting visual quality.

Various experiments involving grayscale/color images and diverse kernel shapes, are performed.

The numerical examples illustrate the high performance of our approach when compared to state-of-the-art techniques based on optimization, Bayesian estimation, or deep learning.

read morePDFAbstract

20 часов назад @ paperswithcode.com
/thudm/ P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
/thudm/ P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks /thudm/ P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

However, in the context of NLU, prior work and our results reveal that existing methods of prompt tuning do not perform well for normal-sized pre-trained models and for hard sequence tasks, indicating lack of universality... We present a novel empirical finding that properly-optimized prompt tuning can be universally effective across a wide range of model scales and NLU tasks, where it matches the performance of fine-tuning while having only 0.1%-3% tuned parameters.

Our method P-Tuning v2 is not a new method but a version of prefix-tuning \…

20 часов назад @ paperswithcode.com
/yuehaowang/ SoGCN: Second-Order Graph Convolutional Networks
/yuehaowang/ SoGCN: Second-Order Graph Convolutional Networks /yuehaowang/ SoGCN: Second-Order Graph Convolutional Networks

Graph Convolutional Networks (GCN) with multi-hop aggregation is more expressive than one-hop GCN but suffers from higher model complexity.

Finding the shortest aggregation range that achieves comparable expressiveness and minimizes this side effect remains an open question... We answer this question by showing that multi-layer second-order graph convolution (SoGC) is sufficient to attain the ability of expressing polynomial spectral filters with arbitrary coefficients.

Compared to models with one-hop aggregation, multi-hop propagation, and jump connections, SoGC possesses filter representational completeness while being lightweight, efficient, and easy to implement.

We build our Second-Ord…
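
A minimal NumPy sketch of a single second-order graph convolution, assuming the usual D^{-1/2}(A+I)D^{-1/2} normalisation (the paper's exact filter parameterisation may differ): the layer combines zero-, one-, and two-hop terms of the normalised adjacency.

import numpy as np
def second_order_gc(H, A, theta0, theta1, theta2):
    # H' = H*theta0 + (A_hat @ H)*theta1 + (A_hat^2 @ H)*theta2, a degree-2
    # polynomial filter in the normalised adjacency A_hat (illustrative weights).
    A_hat = A + np.eye(A.shape[0])
    deg = A_hat.sum(1)
    A_hat = A_hat / np.sqrt(np.outer(deg, deg))
    AH = A_hat @ H
    return H @ theta0 + AH @ theta1 + (A_hat @ AH) @ theta2
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.randn(3, 4)
W0, W1, W2 = (np.random.randn(4, 4) for _ in range(3))
print(second_order_gc(H, A, W0, W1, W2).shape)   # (3, 4)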

20 часов назад @ paperswithcode.com
/cambridgeltl/ Composable Sparse Fine-Tuning for Cross-Lingual Transfer
/cambridgeltl/ Composable Sparse Fine-Tuning for Cross-Lingual Transfer /cambridgeltl/ Composable Sparse Fine-Tuning for Cross-Lingual Transfer

Fine-tuning all parameters of a pre-trained model has become the mainstream approach for transfer learning.

Sparse fine-tuning is expressive, as it controls the behavior of all model components.

In particular, we learn sparse, real-valued masks based on a simple variant of the Lottery Ticket Hypothesis.

Unlike adapter-based fine-tuning, this method neither increases the number of parameters at inference time nor alters the original model architecture.

Most importantly, it outperforms adapters in zero-shot cross-lingual transfer by a large margin in a series of multilingual benchmarks, including Universal Dependencies, MasakhaNER, and AmericasNLI.
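
One simplified reading of the idea, sketched below under assumed hyperparameters: keep only the largest-magnitude parameter changes from fine-tuning as a sparse difference vector, and compose several such vectors by adding them back onto the pretrained weights. The paper's mask learning is more involved than this magnitude heuristic.

import numpy as np
def sparse_diff(pretrained, finetuned, keep_frac=0.05):
    # Keep only the top keep_frac largest-magnitude parameter changes.
    diff = finetuned - pretrained
    k = max(1, int(keep_frac * diff.size))
    thresh = np.partition(np.abs(diff).ravel(), -k)[-k]
    return np.where(np.abs(diff) >= thresh, diff, 0.0)
theta0 = np.random.randn(4, 4)                                       # "pretrained" weights
lang = sparse_diff(theta0, theta0 + 0.1 * np.random.randn(4, 4))     # language diff
task = sparse_diff(theta0, theta0 + 0.1 * np.random.randn(4, 4))     # task diff
theta_composed = theta0 + lang + task      # composition adds no parameters at inference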

20 часов назад @ paperswithcode.com
/ruchtem/ Multi-task problems are not multi-objective
/ruchtem/ Multi-task problems are not multi-objective /ruchtem/ Multi-task problems are not multi-objective

Multi-objective optimization (MOO) aims at finding a set of optimal configurations for a given set of objectives.

These works also use Multi-Task Learning (MTL) problems to benchmark MOO algorithms, treating each task as an independent objective.

In this work we show that MTL problems do not resemble the characteristics of MOO problems.

In particular, MTL losses are not competing in case of a sufficiently expressive single model.

As a consequence, a single model can perform just as well as optimizing all objectives with independent models, rendering MOO inapplicable.

20 часов назад @ paperswithcode.com
/tofis/ HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media
/tofis/ HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media /tofis/ HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media

Moreover, a spatio-temporally aligned scanned and rigged 3D character complements HUMAN4D to enable joint research on time-varying and high-quality dynamic meshes.

For the former, we apply 2D and 3D pose estimation algorithms both on single- and multi-view data cues.

For the latter, we benchmark open-source 3D codecs on volumetric data with respect to online volumetric video encoding and steady bit-rates.

Furthermore, qualitative and quantitative visual comparison between mesh-based volumetric data reconstructed in different qualities showcases the available options with respect to 4D representations.

HUMAN4D is introduced to the computer vision and graphics research communities to enable joint …

20 часов назад @ paperswithcode.com
/Holmes-Alan/ Multiple Style Transfer via Variational AutoEncoder
/Holmes-Alan/ Multiple Style Transfer via Variational AutoEncoder /Holmes-Alan/ Multiple Style Transfer via Variational AutoEncoder

Modern works on style transfer focus on transferring style from a single image.

Recently, some approaches study multiple style transfer; these, however, are either too slow or fail to mix multiple styles... We propose ST-VAE, a Variational AutoEncoder for latent space-based style transfer.

It performs multiple style transfer by projecting nonlinear styles onto a linear latent space, enabling styles to be merged via linear interpolation before the new style is transferred to the content image.

To evaluate ST-VAE, we experiment on COCO for single and multiple style transfer.

We also present a case study revealing that ST-VAE outperforms other methods while being faster and more flexible, setting a new pa…

21 час назад @ paperswithcode.com
/hazdzz/ sMGC: A Complex-Valued Graph Convolutional Network via Magnetic Laplacian for Directed Graphs
/hazdzz/ sMGC: A Complex-Valued Graph Convolutional Network via Magnetic Laplacian for Directed Graphs /hazdzz/ sMGC: A Complex-Valued Graph Convolutional Network via Magnetic Laplacian for Directed Graphs

Recent advancements in Graph Neural Networks have led to state-of-the-art performance on representation learning of graphs for node classification.

However, the majority of existing works process directed graphs by symmetrization, which may cause loss of directional information...

In this paper, we propose the magnetic Laplacian that preserves edge directionality by encoding it into complex phase as a deformation of the combinatorial Laplacian.

In addition, we design an Auto-Regressive Moving-Average (ARMA) filter that is capable of learning global features from graphs.

We derive complex-valued operations in graph neural network and devise a simplified Magnetic Graph Convolution network, na…
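
For reference, the standard (unnormalised) magnetic Laplacian with charge parameter q symmetrises edge weights while storing direction in a complex phase, which keeps the operator Hermitian; the paper may use a normalised variant, but the basic construction looks like this:

import numpy as np
def magnetic_laplacian(A, q=0.25):
    A_s = 0.5 * (A + A.T)                    # symmetrised weights
    theta = 2.0 * np.pi * q * (A - A.T)      # antisymmetric phase from edge direction
    H = A_s * np.exp(1j * theta)             # Hermitian "magnetic" adjacency
    D = np.diag(A_s.sum(axis=1))
    return D - H
A = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float)   # directed path graph
L = magnetic_laplacian(A)
print(np.allclose(L, L.conj().T))            # True: the operator is Hermitian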

21 час назад @ paperswithcode.com
/lanzhang128/ The Neglected Sibling: Isotropic Gaussian Posterior for VAE
/lanzhang128/ The Neglected Sibling: Isotropic Gaussian Posterior for VAE /lanzhang128/ The Neglected Sibling: Isotropic Gaussian Posterior for VAE

Deep generative models have been widely used in several areas of NLP, and various techniques have been proposed to augment them or address their training challenges.

In this paper, we propose a simple modification to Variational Autoencoders (VAEs) by using an Isotropic Gaussian Posterior (IGP) that allows for better utilisation of their latent representation space...

This model avoids the sub-optimal behavior of VAEs related to inactive dimensions in the representation space.

We provide both theoretical analysis, and empirical evidence on various datasets and tasks that show IGP leads to consistent improvement on several quantitative and qualitative grounds, from downstream task performanc…
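
The contrast with the usual diagonal-covariance posterior is easy to state in code: an isotropic Gaussian posterior shares a single scalar variance across all latent dimensions, and its KL term against a standard normal prior has a simple closed form. This is an illustrative sketch, not the authors' implementation.

import torch
def igp_sample(mu, log_var_scalar):
    # Reparameterised sample from N(mu, sigma^2 * I): one scalar (log-)variance
    # per example is broadcast over every latent dimension.
    std = torch.exp(0.5 * log_var_scalar)                 # (B, 1)
    return mu + std * torch.randn_like(mu)
def igp_kl(mu, log_var_scalar):
    # KL( N(mu, sigma^2 I) || N(0, I) ) = 0.5*(d*sigma^2 + ||mu||^2 - d - d*log sigma^2)
    d = mu.size(1)
    var = torch.exp(log_var_scalar).squeeze(1)            # (B,)
    return 0.5 * (d * var + (mu ** 2).sum(1) - d - d * log_var_scalar.squeeze(1))
print(igp_kl(torch.zeros(2, 8), torch.zeros(2, 1)))       # tensor([0., 0.])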

22 часа назад @ paperswithcode.com
/eternagame/ Predictive models of RNA degradation through dual crowdsourcing
/eternagame/ Predictive models of RNA degradation through dual crowdsourcing /eternagame/ Predictive models of RNA degradation through dual crowdsourcing

However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis...

Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics.

Furthermore, these models generalized to blindly predicting orthogonal degradation data on much longer mRNA molecules (504-1588 nucleotides) with improved accuracy over DegScore and other models.

Top teams integrated natural language processing architectures and data augmentation techniques with predictions from previous dynamic programming models for RN…

23 часа назад @ paperswithcode.com
/fakufaku/ SDR -- Medium Rare with Fast Computations
/fakufaku/ SDR -- Medium Rare with Fast Computations /fakufaku/ SDR -- Medium Rare with Fast Computations

We revisit the widely used bss eval metrics for source separation with an eye out for performance.

We propose a fast algorithm fixing shortcomings of publicly available implementations... First, we show that the metrics are fully specified by the squared cosine of just two angles between estimate and reference subspaces.

However, the underlying linear systems are structured, and we apply a fast iterative method based on conjugate gradient descent.

The complexity of this step is thus reduced by a factor quadratic in the distortion filter size used in bss eval, usually 512.

We confirm that our implementation can train neural networks, and find that longer distortion filters may be beneficial.
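
Conjugate gradient itself is the textbook iterative method for symmetric positive definite systems accessed only through matrix-vector products; a generic sketch, without the structure-exploiting specialisations the paper applies to the bss eval systems, looks like this:

import numpy as np
def conjugate_gradient(matvec, b, tol=1e-8, max_iter=200):
    # Solve A x = b for symmetric positive definite A given only x -> A @ x.
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
A = np.array([[4.0, 1.0], [1.0, 3.0]])
print(conjugate_gradient(lambda v: A @ v, np.array([1.0, 2.0])))   # ~[0.0909, 0.6364]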

1 день, 3 часа назад @ paperswithcode.com
/darglein/ ADOP: Approximate Differentiable One-Pixel Point Rendering
/darglein/ ADOP: Approximate Differentiable One-Pixel Point Rendering /darglein/ ADOP: Approximate Differentiable One-Pixel Point Rendering

We present a novel point-based, differentiable neural rendering pipeline for scene refinement and novel view synthesis.

The point cloud rendering is performed by a differentiable renderer using multi-resolution one-pixel point rasterization.

After rendering, the neural image pyramid is passed through a deep neural network for shading calculations and hole-filling.

Since all stages of the pipeline are differentiable, the scene's parameters can be optimized jointly: camera model, camera pose, point position, point color, environment map, rendering network weights, vignetting, camera response function, per-image exposure, and per-image white balance.

The efficient one-pixel point rasterization allows us to use arbitrary camera models and display scenes with well over 100M points in real time.

1 день, 4 часа назад @ paperswithcode.com
/lancopku/ Well-classified Examples are Underestimated in Classification with Deep Neural Networks
/lancopku/ Well-classified Examples are Underestimated in Classification with Deep Neural Networks /lancopku/ Well-classified Examples are Underestimated in Classification with Deep Neural Networks

The conventional wisdom behind learning deep classification models is to focus on badly classified examples and ignore well-classified examples that are far from the decision boundary.

For instance, when training with cross-entropy loss, examples with higher likelihoods (i.e., well-classified examples) contribute smaller gradients in back-propagation...

To counteract this deficiency, we propose to reward well-classified examples with additive bonuses to revive their contribution to learning.

This counterexample theoretically addresses these three issues.

We empirically support this claim by directly verifying the theoretical results or through the significant performance improvement with our cou…

1 день, 8 часов назад @ paperswithcode.com
Papers With Code Papers With Code
последний пост 8 часов назад
/imec-idlab/ A Review of the Deep Sea Treasure problem as a Multi-Objective Reinforcement Learning Benchmark
/imec-idlab/ A Review of the Deep Sea Treasure problem as a Multi-Objective Reinforcement Learning Benchmark /imec-idlab/ A Review of the Deep Sea Treasure problem as a Multi-Objective Reinforcement Learning Benchmark

In this paper, the authors investigate the Deep Sea Treasure (DST) problem as proposed by Vamplew et al.

Through a number of proofs, the authors show the original DST problem to be quite basic, and not always representative of practical Multi-Objective Optimization problems...

In an attempt to bring theory closer to practice, the authors propose an alternative, improved version of the DST problem, and prove that some of the properties that simplify the original DST problem no longer hold.

The authors also provide a reference implementation and perform a comparison between their implementation, and other existing open-source implementations of the problem.

Finally, the authors also provide a…

1 день, 15 часов назад @ paperswithcode.com
/wangyueft/ Object DGCNN: 3D Object Detection using Dynamic Graphs
/wangyueft/ Object DGCNN: 3D Object Detection using Dynamic Graphs /wangyueft/ Object DGCNN: 3D Object Detection using Dynamic Graphs

3D object detection often involves complicated training and testing pipelines, which require substantial domain knowledge about individual datasets.

Inspired by recent non-maximum suppression-free 2D object detection models, we propose a 3D object detection architecture on point clouds... Our method models 3D object detection as message passing on a dynamic graph, generalizing the DGCNN framework to predict a set of objects.

To facilitate object detection from sparse point clouds, we also propose a set-to-set distillation approach customized to 3D detection.

This approach aligns the outputs of the teacher model and the student model in a permutation-invariant fashion, significantly simplify…

1 день, 16 часов назад @ paperswithcode.com
/wangyueft/ DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries
/wangyueft/ DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries /wangyueft/ DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries

We introduce a framework for multi-camera 3D object detection.

In contrast to existing works, which estimate 3D bounding boxes directly from monocular images or use depth prediction networks to generate input for 3D object detection from 2D information, our method manipulates predictions directly in 3D space... Our architecture extracts 2D features from multiple camera images and then uses a sparse set of 3D object queries to index into these 2D features, linking 3D positions to multi-view images using camera transformation matrices.

Finally, our model makes a bounding box prediction per object query, using a set-to-set loss to measure the discrepancy between the ground-truth and the predic…
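
The 3D-to-2D link can be pictured with a plain pinhole projection: a 3D reference point (for example, one decoded from an object query) is mapped into a camera image using assumed intrinsics K and extrinsics (R, t), and 2D features would then be sampled at the resulting pixel location. Bilinear sampling and the multi-camera bookkeeping are omitted in this sketch.

import numpy as np
def project_to_image(X_world, K, R, t):
    # Pinhole projection: world point -> camera frame -> pixel coordinates.
    X_cam = R @ X_world + t
    uvw = K @ X_cam
    return uvw[:2] / uvw[2]
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
print(project_to_image(np.array([1.0, 0.5, 10.0]), K, R, t))   # ~[740., 410.]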

1 день, 16 часов назад @ paperswithcode.com
/kevin-c-cheng/ Dynamical Wasserstein Barycenters for Time-series Modeling
/kevin-c-cheng/ Dynamical Wasserstein Barycenters for Time-series Modeling /kevin-c-cheng/ Dynamical Wasserstein Barycenters for Time-series Modeling

Flexible models should describe the system state and observations in stationary “pure-state” periods as well as transition periods between adjacent segments, such as a gradual slowdown between running and walking...

We propose a dynamical Wasserstein barycentric (DWB) model that estimates the system state over time as well as the data-generating distributions of pure states in an unsupervised manner.

Our model assumes each pure state generates data from a multivariate normal distribution, and characterizes transitions between states via displacement-interpolation specified by the Wasserstein barycenter.

The system state is represented by a barycentric weight vector which evolves over time…
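
For intuition, displacement interpolation between two univariate Gaussians has a closed form: the Wasserstein-2 geodesic linearly interpolates the mean and the standard deviation. The model above works with multivariate Gaussians and barycenters of several states, which require matrix square roots; the sketch below shows only this simplest case.

import numpy as np
def gaussian_w2_interpolation(mu0, sigma0, mu1, sigma1, t):
    # Wasserstein-2 geodesic between N(mu0, sigma0^2) and N(mu1, sigma1^2):
    # the interpolant at time t is again Gaussian, with linearly interpolated
    # mean and standard deviation (univariate case only).
    return (1 - t) * mu0 + t * mu1, (1 - t) * sigma0 + t * sigma1
for t in np.linspace(0.0, 1.0, 5):
    print(t, gaussian_w2_interpolation(0.0, 1.0, 4.0, 0.5, t))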

1 день, 17 часов назад @ paperswithcode.com
/ivadomed/ 2D Multi-Class Model for Gray and White Matter Segmentation of the Cervical Spinal Cord at 7T
/ivadomed/ 2D Multi-Class Model for Gray and White Matter Segmentation of the Cervical Spinal Cord at 7T /ivadomed/ 2D Multi-Class Model for Gray and White Matter Segmentation of the Cervical Spinal Cord at 7T

The spinal cord (SC), which conveys information between the brain and the peripheral nervous system, plays a key role in various neurological disorders such as multiple sclerosis (MS) and amyotrophic lateral sclerosis (ALS), in which both gray matter (GM) and white matter (WM) may be impaired.

While automated methods for WM/GM segmentation are now largely available, these techniques, developed for conventional systems (3T or lower), do not necessarily perform well on 7T MRI data, which feature finer details and contrasts, but also different artifacts or signal dropout...

The primary goal of this study is thus to propose a new deep learning model that allows robust SC/GM multi-class segmentation…

1 день, 17 часов назад @ paperswithcode.com
/petersheridandodds/ Ousiometrics and Telegnomics: The essence of meaning conforms to a two-dimensional powerful-weak and dangerous-safe framework with diverse corpora presenting a safety bias
/petersheridandodds/ Ousiometrics and Telegnomics: The essence of meaning conforms to a two-dimensional powerful-weak and dangerous-safe framework with diverse corpora presenting a safety bias /petersheridandodds/ Ousiometrics and Telegnomics: The essence of meaning conforms to a two-dimensional powerful-weak and dangerous-safe framework with diverse corpora presenting a safety bias

We define ‘ousiometrics’ to be the study of essential meaning in whatever context meaningful signals are communicated, and ‘telegnomics’ as the study of remotely sensed knowledge.

The essence of meaning conveyed by words is instead best described by a compass-like power-danger (PD) framework.

We further show that the PD framework revises the circumplex model of affect as a more general model of state of mind.

Finally, we use our findings to construct and test a prototype ‘ousiometer’, a telegnomic instrument that measures ousiometric time series for temporal corpora.

We contend that our power-danger ousiometric framework provides a complement for entropy-based measurements, and …

1 день, 17 часов назад @ paperswithcode.com
/nicolapicchiotti/ Clustering-Based Interpretation of Deep ReLU Network
/nicolapicchiotti/ Clustering-Based Interpretation of Deep ReLU Network /nicolapicchiotti/ Clustering-Based Interpretation of Deep ReLU Network

Amongst others, the adoption of Rectified Linear Units (ReLUs) is regarded as one of the ingredients of the success of deep learning.

ReLU activation has been shown to mitigate the vanishing gradient issue, to encourage sparsity in the learned parameters, and to allow for efficient backpropagation...

This observation helps to deepen our understanding of the network's learning mechanism; in fact, we demonstrate that, within each cluster, the network can be fully represented as an affine map.

Therefore, the methodology we propose is able to increase the level of interpretability of a fully connected feedforward ReLU neural network, downstream from the fitting phase of the model, without altering the structur…
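
The per-cluster affine structure is easy to verify directly: once the ReLU activation pattern is fixed at an input, a two-layer network collapses to a single affine map. A small NumPy check with toy, illustrative sizes:

import numpy as np
def relu_region_affine(W1, b1, W2, b2, x):
    # Recover the affine map that the network applies on x's activation region.
    mask = (W1 @ x + b1 > 0).astype(float)       # ReLU activation pattern at x
    W_eff = W2 @ (W1 * mask[:, None])            # equals W2 @ diag(mask) @ W1
    b_eff = W2 @ (b1 * mask) + b2
    return W_eff, b_eff
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 4)), rng.normal(size=16)
W2, b2 = rng.normal(size=(3, 16)), rng.normal(size=3)
x = rng.normal(size=4)
W_eff, b_eff = relu_region_affine(W1, b1, W2, b2, x)
direct = W2 @ np.maximum(W1 @ x + b1, 0.0) + b2
print(np.allclose(W_eff @ x + b_eff, direct))    # True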

1 день, 17 часов назад @ paperswithcode.com
/DaehanKim/ Revisiting Latent-Space Interpolation via a Quantitative Evaluation Framework
/DaehanKim/ Revisiting Latent-Space Interpolation via a Quantitative Evaluation Framework /DaehanKim/ Revisiting Latent-Space Interpolation via a Quantitative Evaluation Framework

Latent-space interpolation is commonly used to demonstrate the generalization ability of deep latent variable models.

In this work, we show how data labeled with semantically continuous attributes can be utilized to conduct a quantitative evaluation of latent-space interpolation algorithms, for variational autoencoders.

Interestingly, our experiments reveal that the superiority of interpolation algorithms could be domain-dependent.

While normalised interpolation works best for the image domain, spherical linear interpolation achieves the best performance in the graph domain.

Finally, we conduct interpolation-aware training with the labeled attributes, and show that this explicit supervision…
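
For concreteness, the interpolants compared in such studies can be written in a few lines. The definition of "normalised interpolation" below (rescaling a linear interpolant to the interpolated endpoint norm) is one common variant and may not match the paper's exact formula; slerp is the standard spherical linear interpolation.

import numpy as np
def lerp(z0, z1, t):
    return (1 - t) * z0 + t * z1
def normalized_lerp(z0, z1, t):
    # Rescale the linear interpolant to the interpolated norm of the endpoints.
    z = lerp(z0, z1, t)
    target = (1 - t) * np.linalg.norm(z0) + t * np.linalg.norm(z1)
    return z * target / (np.linalg.norm(z) + 1e-12)
def slerp(z0, z1, t):
    # Spherical linear interpolation along the great circle between z0 and z1.
    cos = z0 @ z1 / (np.linalg.norm(z0) * np.linalg.norm(z1))
    omega = np.arccos(np.clip(cos, -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return lerp(z0, z1, t)
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)
z0, z1 = np.random.randn(128), np.random.randn(128)
print(slerp(z0, z1, 0.5).shape)                   # (128,)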

1 день, 17 часов назад @ paperswithcode.com
/ucsc-real/ A Good Representation Detects Noisy Labels
/ucsc-real/ A Good Representation Detects Noisy Labels /ucsc-real/ A Good Representation Detects Noisy Labels

Label noise is pervasive in real-world datasets, which encodes wrong correlation patterns and impairs the generalization of deep neural networks (DNNs).

It is critical to find efficient ways to detect the corrupted patterns... Current methods primarily focus on designing robust training techniques to prevent DNNs from memorizing corrupted patterns.

In this paper, given good representations, we propose a universally applicable and training-free solution to detect noisy labels.

Intuitively, good representations help define “neighbors” of each training instance, and closer instances are more likely to share the same clean label.

Based on the neighborhood information, we propose two methods: …
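
A hedged sketch of that neighbourhood intuition, as the simplest possible baseline rather than the paper's detectors: flag a training example whose label disagrees with the majority label among its k nearest neighbours in representation space.

import numpy as np
def flag_noisy_labels(features, labels, k=10):
    # Brute-force k-NN in feature space; an example is flagged when its label
    # differs from the majority vote of its neighbours' labels.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                  # exclude the point itself
    nbrs = np.argsort(d2, axis=1)[:, :k]
    flags = [np.bincount(labels[nn]).argmax() != labels[i] for i, nn in enumerate(nbrs)]
    return np.array(flags)
X = np.random.randn(200, 16)
y = np.random.randint(0, 3, size=200)
print(flag_noisy_labels(X, y).mean())             # fraction of examples flagged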

1 день, 19 часов назад @ paperswithcode.com
/yzhangcs/ Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments
/yzhangcs/ Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments /yzhangcs/ Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments


1 день, 19 часов назад @ paperswithcode.com
/yizhiwang96/ DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning
/yizhiwang96/ DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning /yizhiwang96/ DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning

Automatic font generation based on deep learning has aroused a lot of interest in the last decade.

However, only a few recently-reported approaches are capable of directly generating vector glyphs and their results are still far from satisfactory...

Using our method, for the first time, visually-pleasing vector glyphs whose quality and compactness are both comparable to human-designed ones can be automatically generated.

The key idea of our DeepVecFont is to adopt the techniques of image synthesis, sequence modeling and differentiable rasterization to exhaustively exploit the dual-modality information (i.e., raster images and vector outlines) of vector fonts.

First, we design a dual-modalit…

1 день, 20 часов назад @ paperswithcode.com
/uq-neusoft-health-data-science/ ActiveEA: Active Learning for Neural Entity Alignment
/uq-neusoft-health-data-science/ ActiveEA: Active Learning for Neural Entity Alignment /uq-neusoft-health-data-science/ ActiveEA: Active Learning for Neural Entity Alignment

Entity Alignment (EA) aims to match equivalent entities across different Knowledge Graphs (KGs) and is an essential step of KG fusion.

Current mainstream methods -- neural EA models -- rely on training with seed alignment, i.e., a set of pre-aligned entity pairs which are very costly to annotate...

In this paper, we devise a novel Active Learning (AL) framework for neural EA, aiming to create highly informative seed alignment to obtain more effective EA models with less annotation cost.

Our framework tackles two main challenges encountered when applying AL to EA: (1) How to exploit dependencies between entities within the AL strategy.

Empirical results show that our proposed AL strategy can…

1 день, 20 часов назад @ paperswithcode.com
/lucaoffice/ Automated Essay Scoring Using Transformer Models
/lucaoffice/ Automated Essay Scoring Using Transformer Models /lucaoffice/ Automated Essay Scoring Using Transformer Models

Automated essay scoring (AES) is gaining increasing attention in the education sector as it significantly reduces the burden of manual scoring and allows ad hoc feedback for learners.

Natural language processing based on machine learning has been shown to be particularly suitable for text classification and AES...

While many machine-learning approaches for AES still rely on a bag-of-words (BOW) approach, we consider a transformer-based approach in this paper, compare its performance to a logistic regression model based on the BOW approach and discuss their differences.

Both transformer models considered in that analysis outperformed, without any hyper-parameter tuning, the regression-based mo…

1 день, 20 часов назад @ paperswithcode.com
/sherylhyx/ SSSNET: Semi-Supervised Signed Network Clustering
/sherylhyx/ SSSNET: Semi-Supervised Signed Network Clustering /sherylhyx/ SSSNET: Semi-Supervised Signed Network Clustering

Node embeddings are a powerful tool in the analysis of networks; yet, their full potential for the important task of node clustering has not been fully exploited.

In particular, most state-of-the-art methods generating node embeddings of signed networks focus on link sign prediction, and those that pertain to node clustering are usually not graph neural network (GNN) methods...

Here, we introduce a novel probabilistic balanced normalized cut loss for training nodes in a GNN framework for semi-supervised signed network clustering, called SSSNET.

The method is end-to-end in combining embedding generation and clustering without an intermediate step; it has node clustering as its main focus, with a…

1 день, 20 часов назад @ paperswithcode.com
/ckorbach/ Next-Best-View Estimation based on Deep Reinforcement Learning for Active Object Classification
/ckorbach/ Next-Best-View Estimation based on Deep Reinforcement Learning for Active Object Classification /ckorbach/ Next-Best-View Estimation based on Deep Reinforcement Learning for Active Object Classification

The presentation and analysis of image data from a single viewpoint are often not sufficient to solve a task.

In this work, a robot arm holds an object in its end-effector and searches for a sequence of next-best-views to explicitly identify the object.

We use Soft Actor-Critic (SAC), a method of deep reinforcement learning, to learn these next-best-views for a specific set of objects.

The evaluation shows that an agent can learn to determine an object pose to which the robot arm should move an object.

This leads to viewpoints that provide more accurate predictions, better distinguishing the object from other objects.

1 день, 20 часов назад @ paperswithcode.com
💼 University and corporation labs
DeepMind DeepMind
последний пост 5 дней назад
Stacking our way to more general robots
Stacking our way to more general robots Stacking our way to more general robots

In “Skill Mastery,” our goal is to train a single agent that’s skilled in stacking a predefined set of five triplets.

To test for generalisation, these training objects exclude the family of objects from which the test triplets were chosen.

Next, we train a new policy in simulation that uses only realistic observations: images and the robot’s proprioceptive state.

The state policy serves as a teacher, providing the learning agent with corrections to its behaviours, and those corrections are distilled into the new policy.

This allows us to use the data that’s passively collected during the project instead of running a time-consuming online training algorithm on the real robots.

5 дней назад @ deepmind.com
Predicting gene expression with AI
Predicting gene expression with AI Predicting gene expression with AI

Based on Transformers, our new Enformer architecture advances genetic research by improving the ability to predict how DNA sequence influences gene expression.

Today Nature Methods published “Effective gene expression prediction from sequence by integrating long-range interactions” (first shared as a preprint on bioRxiv), in which we — in collaboration with our Alphabet colleagues at Calico — introduce a neural network architecture called Enformer that led to greatly increased accuracy in predicting gene expression from DNA sequence.

Previous work on gene expression has typically used convolutional neural networks as fundamental building blocks, but their limitations in modelling the influe…

1 неделя, 5 дней назад @ deepmind.com
Nowcasting the Next Hour of Rain
Nowcasting the Next Hour of Rain Nowcasting the Next Hour of Rain

In this evolving book of weather prediction, we now add a story on the role of machine learning for forecasting.

Today’s weather predictions are driven by powerful numerical weather prediction (NWP) systems.

Nowcasting fills the performance gap in this crucial time interval.

Nowcasting is essential for sectors like water management, agriculture, aviation, emergency planning, and outdoor events.

This combination of a crucial area where existing methods struggle and the availability of high-quality data provides the opportunity for machine learning to make its contributions to nowcasting.

2 недели, 3 дня назад @ deepmind.com
Building architectures that can handle the world’s data
Building architectures that can handle the world’s data Building architectures that can handle the world’s data

Perceiver and Perceiver IO work as multi-purpose tools for AI. Most architectures used by AI systems today are specialists.

A 2D residual network may be a good choice for processing images, but at best it’s a loose fit for other kinds of data — such as the Lidar signals used in self-driving cars or the torques used in robotics.

What’s more, standard architectures are often designed with only one task in mind, often leading engineers to bend over backwards to reshape, distort, or otherwise modify their inputs and outputs in hopes that a standard architecture can learn to handle their problem correctly.

Dealing with more than one kind of data, like the sounds and images that make up videos, is …

2 месяца, 2 недели назад @ deepmind.com
Generally capable agents emerge from open-ended play
Generally capable agents emerge from open-ended play Generally capable agents emerge from open-ended play

Through reinforcement learning (RL), this single system learnt by playing round after round of games through a repetitive process of trial and error.

But AlphaZero still trained separately on each game — unable to simply learn another game or task without repeating the RL process from scratch.

Today, we published "Open-Ended Learning Leads to Generally Capable Agents," a preprint detailing our first steps to train an agent capable of playing many different games without needing human interaction data.

The agent’s capabilities improve iteratively as a response to the challenges that arise in training, with the learning process continually refining the training tasks so the agent never stops …

2 месяца, 3 недели назад @ deepmind.com
Putting the power of AlphaFold into the world’s hands
Putting the power of AlphaFold into the world’s hands Putting the power of AlphaFold into the world’s hands

Today, I’m incredibly proud and excited to announce that DeepMind is making a significant contribution to humanity’s understanding of biology.

When we announced AlphaFold 2 last December, it was hailed as a solution to the 50-year-old protein folding problem.

As researchers seek cures for diseases and pursue solutions to other big problems facing humankind – including antibiotic resistance, microplastic pollution, and climate change – they will benefit from fresh insights into the structure of proteins.

The same way that the structure of a machine tells you what it does, so the structure of a protein helps us understand its function.

Today, we are sharing a trove of information that doubles…

2 месяца, 3 недели назад @ deepmind.com
An update on our racial justice efforts
An update on our racial justice efforts An update on our racial justice efforts

I then shared some thoughts around DeepMind's intention to help combat racism and advance racial equity.

We also explored - and gathered feedback - on how we could best support racial justice in the communities DeepMind interacts with.

Today I’m pleased to share one of the outcomes of that process: putting resources directly in the hands of Black communities, so they can decide where they need them most.

In the past months, we made donations to organisations that play a vital role supporting Black communities in the UK, US and Africa.

We're delighted to support these organisations, and grateful to many of those who have shared with us an overview of their work:

4 месяца, 2 недели назад @ deepmind.com
Advancing sports analytics through AI research
Advancing sports analytics through AI research Advancing sports analytics through AI research

Finally, we consider computer vision to be one of the most promising avenues for advancing the boundaries of state of the art sports analytics research.

The large number of football videos satisfies a prerequisite for modern AI techniques.

While each football video is different, the settings do not vary greatly, which makes the task ideal for sharpening AI algorithms.

The application of advanced AI techniques to football has the potential to revolutionise the game across many axes, for players, decision-makers, fans, and broadcasters.

We believe that the development of increasingly advanced AI techniques afforded by the football microcosm might be applicable to broader domains.

5 месяцев, 1 неделя назад @ deepmind.com
Game theory as an engine for large-scale data analysis
Game theory as an engine for large-scale data analysis Game theory as an engine for large-scale data analysis

EigenGame maps out a new approach to solve fundamental ML problems.

This made us wonder if such a perspective modeled on game theory could help solve other fundamental machine learning problems.

Today at ICLR 2021 (the International Conference on Learning Representations), we presented “EigenGame: PCA as a Nash Equilibrium,” which received an Outstanding Paper Award.

Firstly, data was originally recorded by hand in paper notebooks, and now it is stored in data centres the size of warehouses.

By approaching the PCA problem in the right way, our insights and algorithms apply more broadly across the branches of the ML tree.

5 месяцев, 1 неделя назад @ deepmind.com
MuZero: Mastering Go, chess, shogi and Atari without rules
MuZero: Mastering Go, chess, shogi and Atari without rules MuZero: Mastering Go, chess, shogi and Atari without rules

Humans learn this ability quickly and can generalise to new scenarios, a trait we would also like our algorithms to have.

Until now, the best results on Atari are from model-free systems, such as DQN, R2D2 and Agent57.

As the name suggests, model-free algorithms do not use a learned model and instead estimate the best action to take next.

Instead of trying to model the entire environment, MuZero just models aspects that are important to the agent’s decision-making process.

Specifically, MuZero models three elements of the environment that are critical to planning. The value: how good is the current position?

9 месяцев, 3 недели назад @ deepmind.com
Google
последний пост 5 часов назад
SimVLM: Simple Visual Language Model Pre-training with Weak Supervision
SimVLM: Simple Visual Language Model Pre-training with Weak Supervision SimVLM: Simple Visual Language Model Pre-training with Weak Supervision

For example, an image captioning model generates natural language descriptions based on its understanding of a given image.

This approach aims to learn a single feature space from both visual and language inputs, rather than learning two separate feature spaces, one each for visual inputs and another for language inputs.

To address this challenge, in “SimVLM: Simple Visual Language Model Pre-training with Weak Supervision”, we propose a minimalist and effective VLP, named SimVLM, which stands for “Simple Visual Language Model”.

We examine the model on multiple tasks for this purpose, including image captioning, multilingual captioning, open-ended VQA and visual text completion.

Unlike prior…

5 часов назад @ ai.googleblog.com
Baselines for Uncertainty and Robustness in Deep Learning
Baselines for Uncertainty and Robustness in Deep Learning Baselines for Uncertainty and Robustness in Deep Learning

In “Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning”, we introduce Uncertainty Baselines, a collection of high-quality implementations of standard and state-of-the-art deep learning methods for a variety of tasks, with the goal of making research on uncertainty and robustness more reproducible.

Uncertainty Baselines: As of this writing, Uncertainty Baselines provides a total of 83 baselines, comprising 19 methods encompassing standard and more recent strategies over nine datasets.

Uncertainty Baselines sets up each baseline under a choice of base model, training dataset, and a suite of evaluation metrics.

As seen in the workflow figure below, Uncertainty Baseli…

1 день, 6 часов назад @ ai.googleblog.com
An ML-Based Framework for COVID-19 Epidemiology
An ML-Based Framework for COVID-19 Epidemiology An ML-Based Framework for COVID-19 Epidemiology

Observed data for COVID-19 associated outcomes, such as confirmed cases, hospitalizations and deaths, are used for training of compartmental models.

Forecast Accuracy: Each day, we train models to predict COVID-19 associated outcomes (primarily deaths and cases) 28 days into the future.

In all cases, we demonstrated no consistent pattern of errors among these groups once we controlled for the number of COVID-19 deaths and cases that occur in each subgroup.

Second, there are limitations due to the model capacity of compartmental models, as they cannot capture the very complex dynamics of COVID-19 disease propagation.

For example, most of Japan's COVID-19 cases and deaths have been concentrated in a f…

2 дня, 2 часа назад @ ai.googleblog.com
Self-Supervised Learning Advances Medical Image Classification
Self-Supervised Learning Advances Medical Image Classification Self-Supervised Learning Advances Medical Image Classification

In “Big Self-Supervised Models Advance Medical Image Classification”, to appear at the International Conference on Computer Vision (ICCV 2021), we study the effectiveness of self-supervised contrastive learning as a pre-training strategy within the domain of medical image classification.

We also propose Multi-Instance Contrastive Learning (MICLe), a novel approach that generalizes contrastive learning to leverage special characteristics of medical image datasets.

We observe that self-supervised learning on ImageNet, followed by additional self-supervised learning on unlabeled domain-specific medical images, significantly improves the accuracy of medical image classifiers.

Specifically, we d…

2 дня, 7 часов назад @ ai.googleblog.com
Turn data into value with a unified and open data cloud
Turn data into value with a unified and open data cloud Turn data into value with a unified and open data cloud

Today at Google Cloud Next we are announcing innovations that will enable data teams to simplify how they work with data and derive value from it faster. These new solutions will help organizations build modern data architectures with real-time analytics to power innovative, mission-critical, data-driven applications. Too often, even the best minds in data are constrained by ineffective systems and technologies. A recent study showed that only 32% of companies surveyed gained value from their data investments. Previous approaches have resulted in difficult-to-access, slow, unreliable, complex, and fragmented systems. At Google Cloud, we are committed to changing this reality by helping custo…

3 дня, 12 часов назад @ cloud.google.com
Google Cloud expands CCAI and DocAI solutions to accelerate time to value
Google Cloud expands CCAI and DocAI solutions to accelerate time to value Google Cloud expands CCAI and DocAI solutions to accelerate time to value

Virtually all companies face two broad challenges: to marshal data for smarter decision making, and to deliver more personalized and convenient experiences for customers. Artificial intelligence (AI) can help, but the path from AI investment to business outcome can be difficult to chart. To fast-track time to value, today, we’re pleased to announce new additions to two of our core AI-powered solutions: Contact Center AI (CCAI) Insights, now generally available, provides out-of-the-box and custom modeling techniques to make it easier for contact center teams to better understand customer interaction data. CCAI Insights extends the impact of Google Cloud’s CCAI solution, which lets businesses …

3 дня, 12 часов назад @ cloud.google.com
Google at ICCV 2021
Google at ICCV 2021 Google at ICCV 2021

The International Conference on Computer Vision 2021 (ICCV 2021), one of the world's premier conferences on computer vision, starts this week.

A Champion Sponsor and leader in computer vision research, Google will have a strong presence at ICCV 2021 with more than 50 research presentations and involvement in the organization of a number of workshops and tutorials.

If you are attending ICCV this year, we hope you’ll check out the work of our researchers who are actively pursuing the latest innovations in computer vision.

Learn more about our research being presented in the list below (Google affiliation in bold).

Organizing Committee: Diversity and Inclusion Chair: Negar Rostamzadeh; Area Chair…

4 дня, 4 часа назад @ ai.googleblog.com
Making public services more accessible with AI-powered translations from Google Cloud
Making public services more accessible with AI-powered translations from Google Cloud Making public services more accessible with AI-powered translations from Google Cloud

Editor’s note: This article was originally published on our German blog. Imagine you’ve done your due diligence for your upcoming trip to the local authority office. You’ve filled out the relevant forms, booked the appointment, found the correct points of contact. But, as much as you’ve prepared, there’s one aspect of your visit you can’t control: you and the clerks don’t speak the same language. In science fiction, this wouldn’t be a problem. Star Trek has the universal translator. In Douglas Adams’ novel The Hitchhiker’s Guide to the Galaxy, there’s the telepathic Babel Fish, which enables everyone to understand everyone. Here at Google Cloud, we’ve turned these future visions into real…

1 неделя назад @ cloud.google.com
Serving predictions & evaluating Recommendations AI
Serving predictions & evaluating Recommendations AI Serving predictions & evaluating Recommendations AI

In this post we'll show how to use Recommendations AI to display predictions on a live website and set up A/B experiments to measure performance. A/B testing involves creating two or more versions of a page and then splitting user traffic into groups to determine the impact of those changes. This is the most effective way to measure the impact of Recommendations AI. You can use A/B testing to test Recommendations AI against an existing recommender, or if you don’t have any recommendations on your site currently you can measure the impact of a recommender like Recommendations AI. You can also test different placement locations on a page or different options to find the optimal settings for y…

1 неделя, 1 день назад @ cloud.google.com
Finding Complex Metal Oxides for Technology Advancement
Finding Complex Metal Oxides for Technology Advancement Finding Complex Metal Oxides for Technology Advancement

A challenge in realizing this strategy is that the search space for new crystalline metal oxides is enormous.

We made each metal element printable by dissolving a metal nitrate or metal chloride into an ink solution.

Since the optical properties vary with composition, the gradient in composition appears as a color gradient along each line.

While optical absorption itself is important for technologies such as solar energy harvesting, in our work we are interested in optical absorption vs. wavelength as a fingerprint of each material.

In panels e), f) and g), red points are candidate phases, and vertices where blue lines meet indicate interesting phase behavior.

1 неделя, 1 день назад @ ai.googleblog.com
Debugging Vertex AI training jobs with the interactive shell
Debugging Vertex AI training jobs with the interactive shell Debugging Vertex AI training jobs with the interactive shell

Training a machine learning model successfully can be a challenging and time consuming task. Unlike typical software development, the results of training depend on both the training code and the input data. This can make debugging a training job a complex process, even when you’re running it on your local machine. Running code on remote infrastructure can make this task even more difficult. Debugging code that runs in a managed cloud environment can be a tedious and error-prone process since the standard tools used to debug programs locally aren’t available in a managed environment. Also, training jobs can get stuck and stop making progress without visible logs or metrics. Interactive acces…

1 неделя, 1 день назад @ cloud.google.com
Model training as a CI/CD system: Part I
Model training as a CI/CD system: Part I Model training as a CI/CD system: Part I

In software engineering, Continuous Integration (CI) and Continuous Delivery (CD) are two very important concepts. CI is when you integrate changes (new features, approved code commits, etc.) into your system reliably and continuously. CD is when you deploy these changes reliably and continuously. CI and CD can each be performed in isolation, or they can be coupled. A machine learning (ML) system is essentially a software system. So, to operate such systems scalably, we need CI/CD practices in place to facilitate rapid experimentation, integration, and deployment. Here are some scenarios: As an ML Engineer, you are likely to experiment with new model architectures to improve perfo…

1 неделя, 2 дня назад @ cloud.google.com
Introducing FLAN: More generalizable Language Models with Instruction Fine-Tuning
Introducing FLAN: More generalizable Language Models with Instruction Fine-Tuning Introducing FLAN: More generalizable Language Models with Instruction Fine-Tuning

In “Fine-tuned Language Models Are Zero-Shot Learners”, we explore a simple technique called instruction fine-tuning, or instruction tuning for short.

We use instruction tuning to train a model, which we call Fine-tuned LAnguage Net (FLAN).

An illustration of how FLAN works: The model is fine-tuned on disparate sets of instructions and generalizes to unseen instructions.

BackgroundOne recent popular technique for using language models to solve tasks is called zero-shot or few-shot prompting.

This technique formulates a task based on text that a language model might have seen during training, where then the language model generates the answer by completing the text.

1 неделя, 2 дня назад @ ai.googleblog.com
FedJAX: Federated Learning Simulation with JAX
FedJAX: Federated Learning Simulation with JAX FedJAX: Federated Learning Simulation with JAX

For example, federated learning makes it possible to train virtual keyboard language models based on user data that never leaves a mobile device.

Federated learning has become a particularly active area of research due to an increased focus on privacy and security.

With its simple building blocks for implementing federated algorithms, prepackaged datasets, models and algorithms, and fast simulation speed, FedJAX aims to make developing and evaluating federated algorithms faster and easier for researchers.

This not only encourages valid comparisons between different federated algorithms but also accelerates the development of new algorithms.

Conclusions and Future Work: In this post, we introd…

1 неделя, 4 дня назад @ ai.googleblog.com
Your guide to all things AI & ML at Google Cloud Next
Your guide to all things AI & ML at Google Cloud Next Your guide to all things AI & ML at Google Cloud Next

Mark your calendars—Google Cloud Next is coming October 12–14, 2021.We’ll kick off each day with a live broadcast and keynote to showcase our latest launches and share how customers and partners are tackling today’s greatest business challenges. You can also participate in online interactive experiences and attend on-demand sessions that align with your interests and schedule. Overall, we’ve designed Next ’21 as a customizable digital adventure to allow for a more personalized experience. If you’re anything like me, you’ll be tuning in to learn more about AI and ML. And good news—we have a lot to share! Below is your guide to my top recommendations for AI and ML content. Whether you want to…

2 недели назад @ cloud.google.com
OpenAI OpenAI
последний пост 3 недели, 1 день назад
Summarizing Books with Human Feedback
Summarizing Books with Human Feedback Summarizing Books with Human Feedback

Our model works by first summarizing small sections of a book, then summarizing those summaries into a higher-level summary, and so on.

A zero-shot question-answering model can use our model’s summaries to obtain state-of-the-art results on the NarrativeQA dataset for book-length question answering.

Our Approach: Combining Reinforcement Learning from Human Feedback and Recursive Task Decomposition. Consider the task of summarizing a piece of text.

In the past we found that training a model with reinforcement learning from human feedback helped align model summaries with human preferences on short posts and articles.

In this case we break up summarizing a long piece of text into summarizing several sh…

3 недели, 1 день назад @ openai.com
Helen Toner Joins OpenAI’s Board of Directors
Helen Toner Joins OpenAI’s Board of Directors Helen Toner Joins OpenAI’s Board of Directors

Today, we’re excited to announce the appointment of Helen Toner to our Board of Directors.

As the Director of Strategy at Georgetown’s Center for Security and Emerging Technology (CSET), Helen has deep expertise in AI policy and global AI strategy research.

“I greatly value Helen’s deep thinking around the long-term risks and effects of AI,” added Greg Brockman, OpenAI’s chairman and Chief Technology Officer.

“We are delighted to add her leadership to our board.” “OpenAI is a unique organization in the AI research space, and has produced some of the advances, publications, and products I’m most excited about,” said Helen Toner.

She previously advised policymakers and grantmakers on AI strategy…

1 месяц, 1 неделя назад @ openai.com
OpenAI Codex
OpenAI Codex OpenAI Codex

We are now inviting businesses and developers to build on top of OpenAI Codex through our API.

Video demos: Creating a Space Game with OpenAI Codex; “Hello World” with OpenAI Codex; Data Science with OpenAI Codex; Talking to Your Computer with OpenAI Codex; Converting Python to Ruby with OpenAI Codex; Giving OpenAI Codex a First Grade Math Test. OpenAI Codex is a descendant of GPT-3; its training data contains both natural language and billions of lines of source code from publicly available sources, including code in public GitHub repositories.

OpenAI Codex empowers computers to better understand people…

2 месяца назад @ openai.com
Introducing Triton: Open-Source GPU Programming for Neural Networks
Introducing Triton: Open-Source GPU Programming for Neural Networks Introducing Triton: Open-Source GPU Programming for Neural Networks

These issues can be mitigated by writing specialized GPU kernels, but doing so can be surprisingly difficult due to the many intricacies of GPU programming.

This has led us to extend and improve Triton, a recent language and compiler whose original creator now works at OpenAI.

[Code comparison figure: element-wise addition of dense tensors, Z[idx] = X[idx] + Y[idx], launched with BLOCK = 512, grid = (ceil_div(N, BLOCK),), block = (BLOCK,), and add[grid, block](x, y, z, x.shape[0]); the Triton version is a GPU kernel.]
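
For comparison with the reassembled figure above, a vector-addition kernel as written in recent Triton tutorials looks roughly as follows; the exact API has evolved since the 2021 post, so treat this as an indicative sketch rather than the blog's original listing.

import torch
import triton
import triton.language as tl
@triton.jit
def add_kernel(x_ptr, y_ptr, z_ptr, n_elements, BLOCK: tl.constexpr):
    # Each program instance handles one BLOCK-sized slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(z_ptr + offsets, x + y, mask=mask)
x = torch.randn(98432, device="cuda")
y = torch.randn_like(x)
z = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, z, x.numel(), BLOCK=1024)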

Without a system like Triton, non-trivial modifications of matrix multiplication kernels would be out-of-reach for developers without exceptional GPU programming expertise.

If you’re interested in joining our team and working on Triton …

2 месяца, 2 недели назад @ openai.com
Improving Language Model Behavior by Training on a Curated Dataset
Improving Language Model Behavior by Training on a Curated Dataset Improving Language Model Behavior by Training on a Curated Dataset

We've found we can improve language model behavior with respect to specific behavioral values by fine-tuning on a curated dataset of <100 examples of those values.

Appropriate or desirable language model behavior, like appropriate human behavior, cannot be reduced to one universal standard; desirable behavior differs by application and social context.

Step Two: Crafting the Dataset and Fine-Tuning. We crafted a values-targeted dataset of 76 text samples; each sample was in a question-answer format and between 40 and 340 words.

But we believe this only scratches the surface and leaves important questions unanswered:Who should be consulted when designing a values-targeted dataset?

Please reach …

4 месяца, 1 неделя назад @ openai.com
OpenAI Startup Fund
OpenAI Startup Fund OpenAI Startup Fund

Investing in startups with big ideas about AI.

4 месяца, 3 недели назад @ openai.com
OpenAI Scholars 2021: Final Projects
OpenAI Scholars 2021: Final Projects OpenAI Scholars 2021: Final Projects

My advice to someone starting in deep learning research is to take your time to understand insights from fundamental papers and remember that the field is still relatively new.

Feedback Loops in Opinion Modeling. Danielle Ensign. OpenAI Mentor: Jeff Wu. Previous Roles: Software Engineer at ITHAKA, Brighten AI, and Phylliida. I have a background in Software Development, AI Fairness, and VR Game Development.

My project is exploratory, investigating prior work on opinion modeling from the context of deep learning.

Characterizing Test Time Compute on Graph Structured Problems. Kudzo Ahegbebu. OpenAI Mentor: William Guss. Previous Roles: Software Engineer at Facebook and Genen…

5 месяцев, 1 неделя назад @ openai.com
Will Hurd Joins OpenAI’s Board of Directors
Will Hurd Joins OpenAI’s Board of Directors Will Hurd Joins OpenAI’s Board of Directors

OpenAI is committed to developing general-purpose artificial intelligence that benefits all humanity, and we believe that achieving our goal requires expertise in public policy as well as technology.

So, we’re delighted to announce that Congressman Will Hurd has joined our board of directors.

Will served three terms in the U.S. House of Representatives, has been a leading voice on technology policy, and coauthored bipartisan legislation outlining a national strategy for artificial intelligence.

“Will brings a rare combination of expertise—he deeply understands both artificial intelligence as well as public policy, both of which are critical to a successful future for AI,” said Sam Altman, O…

5 месяцев, 2 недели назад @ openai.com
GPT-3 Powers the Next Generation of Apps
GPT-3 Powers the Next Generation of Apps GPT-3 Powers the Next Generation of Apps

Given any text prompt like a phrase or a sentence, GPT-3 returns a text completion in natural language.

Applications and industriesTo date, over 300 apps are using GPT-3 across varying categories and industries, from productivity and education to creativity and games.

Using GPT-3, Viable identifies themes, emotions, and sentiment from surveys, help desk tickets, live chat logs, reviews, and more.

Algolia Answers helps publishers and customer support help desks query in natural language and surface nontrivial answers.

With natural language processing, technical experience is no longer a barrier, and we can truly keep our focus on solving real world problems.

6 месяцев, 3 недели назад @ openai.com
Multimodal Neurons in Artificial Neural Networks
Multimodal Neurons in Artificial Neural Networks Multimodal Neurons in Artificial Neural Networks

discovered that the human brain possesses multimodal neurons.

Now, we’re releasing our discovery of the presence of multimodal neurons in CLIP.

Our discovery of multimodal neurons in CLIP gives us a clue as to what may be a common mechanism of both synthetic and natural vision systems—abstraction.

Indeed, these neurons appear to be extreme examples of “multi-faceted neurons,” neurons that respond to multiple distinct cases, only at a higher level of abstraction.

How multimodal neurons composeThese multimodal neurons can give us insight into understanding how CLIP performs classification.

7 месяцев, 2 недели назад @ openai.com
Scaling Kubernetes to 7,500 Nodes
Scaling Kubernetes to 7,500 Nodes Scaling Kubernetes to 7,500 Nodes

We've scaled Kubernetes clusters to 7,500 nodes, producing a scalable infrastructure for large models like GPT-3, CLIP, and DALL·E, but also for rapid small-scale iterative research such as Scaling Laws for Neural Language Models.

NetworkingAs the number of nodes and pods within our clusters increased, we found that Flannel had difficulties scaling up the throughput required.

It reconciles this with the current nodes in the cluster, tainting the appropriate number of nodes with openai.com/team=teamname:NoSchedule.

Kubernetes 1.18 introduced a plugin architecture for the core Kubernetes scheduler, making it much easier to add features like this natively.

Unsolved problems: There are many prob…

8 месяцев, 3 недели назад @ openai.com
CLIP: Connecting Text and Images
CLIP: Connecting Text and Images CLIP: Connecting Text and Images

We show random, non-cherry picked, predictions of zero-shot CLIP classifiers on examples from various datasets below.

In contrast, the CLIP model can be evaluated on benchmarks without having to train on their data, so it can’t “cheat” in this manner.

CLIP is flexible and generalBecause they learn a wide range of visual concepts directly from natural language, CLIP models are significantly more flexible and general than existing ImageNet models.

The best CLIP model outperforms the best publicly available ImageNet model, the Noisy Student EfficientNet-L2, on 20 out of 26 different transfer datasets we tested.

CLIP models are also more compute efficient than the models from 10 prior approache…

9 месяцев, 1 неделя назад @ openai.com
DALL·E: Creating Images from Text
DALL·E: Creating Images from Text DALL·E: Creating Images from Text

Example text prompts with AI-generated images: “an illustration of a baby daikon radish in a tutu walking a dog”; “a store front that has the word ‘openai’ written on it […]”; “an armchair in the shape of an avocado […]”; and a combined text-and-image prompt, “the exact same cat on the top as a sketch on the bottom”. GPT-3 showed that language can be used to instruct a large neural network to perform a variety of text generation tasks.

We find that DALL·E is somet…

9 months, 1 week ago @ openai.com
Organizational Update from OpenAI
Organizational Update from OpenAI Organizational Update from OpenAI

It’s been a year of dramatic change and growth at OpenAI.

Today we’re announcing that Dario Amodei, VP of Research, is leaving OpenAI after nearly five years with the company.

He and a handful of OpenAI colleagues are planning a new project, which they tell us will probably focus less on product development and more on research.

I want to wish everyone the best, and I know that OpenAI will do really great things in the years ahead.

Mira Murati is taking on new responsibilities as senior vice president of Research, Product, and Partnerships, reflecting her strong leadership during our API rollout and across the company.

9 months, 2 weeks ago @ openai.com
Microsoft
last post 1 day, 16 hours ago
Humana leverages Microsoft Cloud for Healthcare to develop advanced predictive models
Humana leverages Microsoft Cloud for Healthcare to develop advanced predictive models

The teams at Humana believed they had enough data to explore the possibility of proactively identifying when patients were heading toward a high-risk event, and they put Microsoft Cloud for Healthcare and AI technology to the test.

1 day, 16 hours ago @ azure.microsoft.com
Microsoft taps AI techniques to bring Translator to 100 languages
Microsoft taps AI techniques to bring Translator to 100 languages Microsoft taps AI techniques to bring Translator to 100 languages

Today, Microsoft announced that Microsoft Translator, its AI-powered text translation service, now supports more than 100 different languages and dialects.

Its Translator isn’t the first to support more than 100 languages — Google Translate reached that milestone first in February 2016.

Z-code provides the framework, architecture, and models for text-based, multilingual AI language translation for whole families of languages.

Techniques like neural machine translation, rewriting-based paradigms, and on-device processing have led to quantifiable leaps in machine translation accuracy.

Z-code language models are trained multilingually across many languages, and that knowledge is transferred be…

4 days, 7 hours ago @ venturebeat.com
Microsoft Translator: Now translating 100 languages and counting!
Microsoft Translator: Now translating 100 languages and counting! Microsoft Translator: Now translating 100 languages and counting!

Today, we’re excited to announce that Microsoft Translator has added 12 new languages and dialects to the growing repertoire of Microsoft Azure Cognitive Services Translator, bringing us to a total of 103 languages!

Azure Cognitive Services Translator exposes NMT models in Microsoft products and to Translator customers through the Text Translation and Document Translation APIs.
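
For readers who want to try the service, the following is a minimal sketch of calling the Text Translation REST API with the requests library; the subscription key, region, and target languages are placeholders.

```python
# Hypothetical sketch: call the Azure Cognitive Services Text Translation REST API.
# The subscription key, region, and target languages below are placeholders.
import uuid
import requests

endpoint = "https://api.cognitive.microsofttranslator.com/translate"
params = {"api-version": "3.0", "from": "en", "to": ["es", "hi"]}
headers = {
    "Ocp-Apim-Subscription-Key": "<your-translator-key>",
    "Ocp-Apim-Subscription-Region": "<your-resource-region>",
    "Content-Type": "application/json",
    "X-ClientTraceId": str(uuid.uuid4()),
}
body = [{"text": "Machine translation now covers more than 100 languages."}]

response = requests.post(endpoint, params=params, headers=headers, json=body)
for item in response.json():
    for translation in item["translations"]:
        print(translation["to"], "->", translation["text"])
```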

Azure Cognitive Services Translator APIs are available in the public cloud and in the secure Microsoft Azure Government Cloud.

Some of the Microsoft product integrations include Microsoft 365 for translating text and documents, the Microsoft Edge browser for translating whole webpages, SwiftKey for translating messag…

4 days, 9 hours ago @ microsoft.com
Azure AI empowers organizations to serve users in more than 100 languages
Azure AI empowers organizations to serve users in more than 100 languages


4 days, 10 hours ago @ blogs.microsoft.com
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model

We are excited to introduce the DeepSpeed- and Megatron-powered Megatron-Turing Natural Language Generation model (MT-NLG), the largest and the most powerful monolithic transformer language model trained to date, with 530 billion parameters.

We built high-quality, natural language training corpora with hundreds of billions of tokens, and co-developed training recipes to improve optimization efficiency and stability.

Training dataset and model configuration: We used a transformer-decoder architecture, a left-to-right generative transformer-based language model consisting of 530 billion parameters.

The training data table lists, for each dataset, its source, tokens (billions), weight (%), and epochs; for example, Books3 from the Pile dataset …

4 days, 11 hours ago @ microsoft.com
Microsoft Turing Universal Language Representation model, T-ULRv5, tops XTREME leaderboard and trains 100x faster
Microsoft Turing Universal Language Representation model, T-ULRv5, tops XTREME leaderboard and trains 100x faster Microsoft Turing Universal Language Representation model, T-ULRv5, tops XTREME leaderboard and trains 100x faster

We put significant effort into data engineering and cleaning steps to produce high-quality datasets at scale to support T-ULRv5 training.

T-ULRv5 will soon deliver benefits to many of our existing product scenarios across products such as Microsoft Bing, Microsoft 365, Microsoft Edge, Microsoft Azure, and more.

Microsoft Turing models are also available for custom application building through our private preview program.

If you are a researcher who would like to work with us in assessing and improving Turing models, Microsoft Turing Academic Program (MS-TAP) allows you to submit a proposal and get access to these models in greater detail.

The Microsoft Turing team welcomes your feedback and…

2 weeks, 3 days ago @ microsoft.com
Real-world evidence and the path from data to impact
Real-world evidence and the path from data to impact Real-world evidence and the path from data to impact

To build resilience in these areas, researchers at Microsoft and their collaborators have been working on a number of tools that help domain experts translate real-world data into evidence.

Developing real-world evidence: Real-world problems affecting societal resilience leave a trail of “real-world data” (RWD) in their wake.

Real-world evidence in action: Over the following sections, we describe tools for developing different kinds of real-world evidence in response to the distinctive characteristics—and challenges—of accessing, analyzing, and acting on real-world data.

Enabling technology, synthetic data showcase: We developed the concept of a Synthetic Data Showcase as a new mechanism f…

3 weeks, 1 day ago @ microsoft.com
Micro-climate predictions: Enabling hyper-local decisions for agriculture and renewables
Micro-climate predictions: Enabling hyper-local decisions for agriculture and renewables Micro-climate predictions: Enabling hyper-local decisions for agriculture and renewables

Micro-climate predictions are beneficial in agriculture, forestry, architecture, urban design, ecology conservation, maritime and other domains.

Forecast error computation: DeepMC uses weather station forecasts of the predicted variable to learn better models for micro-climate predictions.

Comparison of micro-climate wind speed predictions (RMSE over 24-hour predictions): Figure 3 shows the wind speed predictions at the 24th hour over a period of 10 days with one-hour resolution.

Figure 6: RMSE, MAPE, and MAE comparison for micro-climate wind speed predictions.

The combination of accuracy, robustness, flexibility and scalability is important to help the renewabl…

1 month ago @ microsoft.com
Video analytics at the edge, an ideal technology for 5G cloud monetization
Video analytics at the edge, an ideal technology for 5G cloud monetization

Creating a programmable software infrastructure for telecommunication operations promises to reduce both the capital expenditure (CAPEX) and the operational expenses (OPEX) of the 5G telecommunications operators. In this blog, we focus on video, the dominant traffic type on the internet since the introduction of 4G networks.

1 month ago @ azure.microsoft.com
SpaceCows – using AI, space technology and cloud to protect the Top End
SpaceCows – using AI, space technology and cloud to protect the Top End


1 month ago @ news.microsoft.com
4 ways AI, computer vision, and related technologies expand IoT solutions
4 ways AI, computer vision, and related technologies expand IoT solutions

Intel and Microsoft Azure are working together to help enterprises deploy intelligent IoT technologies and services, including AI’s deep learning abilities, computer vision, and audio or speech capabilities. Adding these capabilities enables solutions to address more business challenges, and combining two or more of them (for example, computer vision and AI) greatly expands the potential uses for IoT solutions.

1 month, 1 week ago @ azure.microsoft.com
DeepSpeed powers 8x larger MoE model training with high performance
DeepSpeed powers 8x larger MoE model training with high performance DeepSpeed powers 8x larger MoE model training with high performance

Today, we are proud to announce DeepSpeed MoE, a high-performance system that supports massive scale mixture of experts (MoE) models as part of the DeepSpeed optimization library.

Besides supporting the most ambitious scale MoE models, DeepSpeed MoE boosts the development productivity and resource efficiency of training modestly sized MoE models in production scenarios, which may be of broader interest to the deep learning (DL) community.

Powered by DeepSpeed MoE, we can now train MoE models that are much larger compared with dense models.

By leveraging DeepSpeed MoE, the Z-code MoE model achieved improved convergence and higher quality with multitask training setting compared with the non-…

1 month, 4 weeks ago @ microsoft.com
Solve your toughest business problems with AI and machine learning
Solve your toughest business problems with AI and machine learning

Today, AI and machine learning are enabling data-driven organizations to accelerate their journey to insights and decisions. With all the latest advancements, AI is no longer limited to only those with deep expertise or a cache of data scientists, and many organizations can now adopt AI and machine learning for better competitive advantage. Customers with analytics practices looking to adopt machine learning can read this report to get started.

1 month, 4 weeks ago @ azure.microsoft.com
New Future of Work: How remote and hybrid work will shape workplaces and society with Jaime Teevan and Siddharth Suri
New Future of Work: How remote and hybrid work will shape workplaces and society with Jaime Teevan and Siddharth Suri New Future of Work: How remote and hybrid work will shape workplaces and society with Jaime Teevan and Siddharth Suri

So, you know, Microsoft, for example, Satya Nadella, after COVID hit, I couldn’t have been prouder to work for him.

It was like, you know, “Yeah, okay, productivity hasn’t fallen off a cliff.

This is really, really, really complicated accounting.

SURI: Yeah, yeah.

You know, right now, women caregivers are really, really struggling.

2 months ago @ microsoft.com
Safe program merges at scale: A grand challenge for program repair research
Safe program merges at scale: A grand challenge for program repair research Safe program merges at scale: A grand challenge for program repair research

But with so many people independently altering code, it’s unsurprising that updates don’t always synchronize, resulting in bad merges.

In other cases, bad program merges can be more subtle and costly, introducing semantic merge conflicts that may either fail the compiler, break a test, or—worse—introduce a regression.

And third, there are project-specific patterns in how developers resolve bad merges that can be capitalized on by program synthesis.

Safe program merges, a long-standing research problem: The problem of ensuring safe program merges has long been studied and remains an open challenge in programming languages and software engineering research.

While the approach showed the potentia…

2 months ago @ microsoft.com
MIT AI
last post 5 hours ago
Putting artificial intelligence at the heart of health care — with help from MIT
Putting artificial intelligence at the heart of health care — with help from MIT Putting artificial intelligence at the heart of health care — with help from MIT

Artificial intelligence is transforming industries around the world — and health care is no exception.

Her clinical research interests include cardiovascular disease prevention, women's heart health, cardiovascular health disparities, and the use of digital tools in cardiovascular disease management.

Adedinsewo’s interest in AI emerged toward the end of her cardiology fellowship, when she began learning about its potential to transform the field of health care.

“I started to wonder how we could leverage AI tools in my field to enhance health equity and alleviate cardiovascular care disparities,” she says.

“There are so many areas of health care that can benefit from AI,” Barzilay says.

5 hours ago @ news.mit.edu
At the crossroads of language, technology, and empathy
At the crossroads of language, technology, and empathy At the crossroads of language, technology, and empathy

Now a linguistics major at MIT, she is studying the structure of language from the syllable to sentence level, and also learning about how we perceive language.

She finds the human aspects of how we use language, and the fact that languages are constantly changing, particularly compelling.

Looking ahead, Gandhi wants to focus on designing systems that better integrate theoretical developments in linguistics and on making language technology widely accessible.

“The technology born out of empathy is the technology that I want to be working on,” she explains.

“Language is fundamentally a people thing; you can’t ignore the people when you’re designing technology that relates to language.”

2 days, 20 hours ago @ news.mit.edu
Deep learning helps predict traffic crashes before they happen
Deep learning helps predict traffic crashes before they happen Deep learning helps predict traffic crashes before they happen

To get ahead of the uncertainty inherent to crashes, scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Qatar Center for Artificial Intelligence developed a deep learning model that predicts very high-resolution crash risk maps.

Fed on a combination of historical crash data, road maps, satellite imagery, and GPS traces, the risk maps describe the expected number of crashes over a period of time in the future, to identify high-risk areas and predict future crashes.

Typically, these types of risk maps are captured at much lower resolutions that hover around hundreds of meters, which means glossing over crucial details since the roads become blurred t…

3 days, 4 hours ago @ news.mit.edu
Enabling AI-driven health advances without sacrificing patient privacy
Enabling AI-driven health advances without sacrificing patient privacy Enabling AI-driven health advances without sacrificing patient privacy

Secure AI Labs (SAIL) is addressing those problems with a technology that lets AI algorithms run on encrypted datasets that never leave the data owner’s system.

Health care organizations can control how their datasets are used, while researchers can protect the confidentiality of their models and search queries.

In 2017, Kellis and Kim decided to commercialize technology they were developing to allow AI algorithms to run on encrypted data.

No one, not the researchers, the data owners, or even SAIL, has access to the models or the datasets.

“We invite machine learning researchers to come and train on last year’s data and predict this year’s data,” says Kellis.

1 week, 1 day ago @ news.mit.edu
3 Questions: Kalyan Veeramachaneni on the hurdles preventing fully automated machine learning
3 Questions: Kalyan Veeramachaneni on the hurdles preventing fully automated machine learning 3 Questions: Kalyan Veeramachaneni on the hurdles preventing fully automated machine learning

That growing industry demand has driven researchers to explore the possibilities of automated machine learning (AutoML), which seeks to automate the development of machine learning solutions in order to make them accessible for nonexperts, improve their efficiency, and accelerate machine learning research.

Q: How has automatic machine learning evolved over the past decade, and what is the current state of AutoML systems?

Automating these other parts of the machine learning solution development is very important, and that is where the biggest bottlenecks remain.

There is no crystal-clear definition of what a domain expert is; “domain expert” is itself a very nebulous phrase.

We ne…

1 week, 2 days ago @ news.mit.edu
Artificial intelligence is smart, but does it play well with others?
Artificial intelligence is smart, but does it play well with others? Artificial intelligence is smart, but does it play well with others?

When it comes to games such as chess or Go, artificial intelligence (AI) programs have far surpassed the best players in the world.

Not only were the scores no better with the AI teammate than with the rule-based agent, but humans consistently hated playing with their AI teammate.

If you can train it to learn how to play the game of chess, that agent won't necessarily go drive a car.

The project is studying what has prevented collaborative AI technology from leaping out of the game space and into messier reality.

Mastering a game such as Hanabi between AI and humans could open up a universe of possibilities for teaming intelligence in the future.

1 week, 4 days ago @ news.mit.edu
Making roadway spending more sustainable
Making roadway spending more sustainable Making roadway spending more sustainable

While the nation’s ailing infrastructure will require more funding to reach its full potential, recent MIT research finds that more sustainable and higher performing roads are still possible even with today’s limited budgets.

Achieving this with a conventional planning approach would require the state to spend 32 percent more than it does today.

“Capturing the impacts of uncertainty is essential for making effective paving decisions,” explains Fengdi Guo, the paper’s lead author and a departing CSHub research assistant.

Improvement through variation: The MIT model can help DOTs make better decisions, but that decision-making is ultimately constrained by the potential options considered.

To ac…

2 weeks, 3 days ago @ news.mit.edu
Using AI and old reports to understand new medical images
Using AI and old reports to understand new medical images Using AI and old reports to understand new medical images

Getting a quick and accurate reading of an X-ray or some other medical images can be vital to a patient’s health and might even save a life.

The team is also utilizing a concept from information theory called mutual information — a statistical measure of the interdependence of two different variables — in order to boost the effectiveness of their approach.

A separate neural network does the same for text, representing its information in a different collection of numbers.

A third neural network then integrates the information between images and text in a coordinated way that maximizes the mutual information between the two datasets.
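
The excerpt does not spell out the exact objective, but a common way to maximize mutual information between paired embeddings is an InfoNCE-style lower bound; the sketch below is an illustrative PyTorch version under that assumption, with made-up tensor names rather than the paper's actual loss.

```python
# Illustrative sketch (not the paper's exact objective): an InfoNCE-style lower
# bound on the mutual information between paired image and text embeddings.
import math
import torch
import torch.nn.functional as F

def infonce_mi_lower_bound(img_emb, txt_emb, temperature=0.07):
    """img_emb, txt_emb: (batch, dim) embeddings for matched image/report pairs."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature           # pairwise similarities
    targets = torch.arange(img.size(0), device=logits.device)
    loss = F.cross_entropy(logits, targets)        # matched pairs lie on the diagonal
    return math.log(img.size(0)) - loss.item()     # I(X;Y) >= log(N) - loss

# Training would minimize `loss` across batches, which maximizes the bound.
```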

“When the mutual information between images and text is hig…

2 weeks, 4 days ago @ news.mit.edu
Deep learning helps predict new drug combinations to fight Covid-19
Deep learning helps predict new drug combinations to fight Covid-19 Deep learning helps predict new drug combinations to fight Covid-19

Typically, data scientists use deep learning to pick out drug combinations from large existing datasets for diseases like cancer and cardiovascular disease, but, understandably, those approaches can’t be used for new illnesses with limited data.

Since drug synergy often occurs through inhibition of biological targets (like proteins or nucleic acids), the model jointly learns drug-target interaction and drug-drug synergy to mine new combinations.

“In contrast to previous approaches using drug-target interaction as fixed descriptors, our method learns to predict drug-target interaction from molecular structures.

To extend the model's efficacy against these strains, you’d only need additional drug combinati…

3 weeks ago @ news.mit.edu
Toward a smarter electronic health record
Toward a smarter electronic health record Toward a smarter electronic health record

Electronic health records have been widely adopted with the hope they would save time and improve the quality of patient care.

Researchers at MIT and the Beth Israel Deaconess Medical Center are combining machine learning and human-computer interaction to create a better electronic health record (EHR).

They created a note-taking editor with a side panel that displays relevant information from the patient’s medical history.

They worked with an emergency physician and four hospital scribes who enter notes into the electronic health record.

Also, the color-coded chips helped them quickly scan notes for relevant information.

3 weeks, 1 day ago @ news.mit.edu
Making self-driving cars safer through keener robot perception
Making self-driving cars safer through keener robot perception Making self-driving cars safer through keener robot perception

“But these perception algorithms are designed to be fast, with little guarantee of whether the robot has succeeded in gaining a correct understanding of its surroundings,” says Yang.

From there, lines are drawn that seek to trace the detected keypoints on the 2D car image to the labeled 3D keypoints in a 3D car model.

“We must then solve an optimization problem to rotate and translate the 3D model to align with the key points on the image,” Yang says.

“This 3D model will help the robot understand the real-world environment.” Each traced line must be analyzed to see if it has created a correct match.

The 3D model gets morphed to match the 2D image by undergoing a linear combination of previou…

4 weeks, 1 day ago @ news.mit.edu
Q&A: Dina Katabi on a “smart” home with actual intelligence
Q&A: Dina Katabi on a “smart” home with actual intelligence Q&A: Dina Katabi on a “smart” home with actual intelligence

In this Q&A, Katabi, the Thuan (1990) and Nicole Pham Professor at MIT, discusses some of her recent work.

My lab is working on the next generation of wireless sensors and machine-learning models that can make more personalized predictions.

A: We’re developing “touchless” sensors that can track people’s movements, activities, and vital signs by analyzing radio signals that bounce off their bodies.

Here, the invisible sensors understand actions and movements.

Current deep-learning models are also limited, whether wireless signals are collected from wearable or background sensors.

1 month, 1 week ago @ news.mit.edu
Using adversarial attacks to refine molecular energy predictions
Using adversarial attacks to refine molecular energy predictions Using adversarial attacks to refine molecular energy predictions

For this project, the researchers had multiple neural networks predict the potential energy surface from the same data.

The spread in the predictions of a “committee of neural networks” is the “uncertainty” at that point.

Instead, the new approach only samples data points from regions of low prediction confidence, corresponding to specific geometries of a molecule.
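
A minimal sketch of this committee-based uncertainty idea is shown below; the data, model class, and committee size are placeholders rather than the paper's actual setup.

```python
# Illustrative sketch of the committee idea: the spread of an ensemble's
# predictions serves as an uncertainty score, and new training points are drawn
# where that spread is largest. Data and model choices here are placeholders.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train = rng.uniform(-1, 1, size=(200, 3))        # stand-in for molecular descriptors
y_train = np.sin(X_train).sum(axis=1)              # stand-in for potential energies
X_candidates = rng.uniform(-1, 1, size=(1000, 3))  # unlabeled candidate geometries

committee = [
    MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=seed).fit(X_train, y_train)
    for seed in range(5)
]

preds = np.stack([m.predict(X_candidates) for m in committee])  # (n_models, n_candidates)
uncertainty = preds.std(axis=0)                                  # committee disagreement
most_uncertain = np.argsort(uncertainty)[-500:]                  # points to label next
```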

They used more than 15,000 examples to train a neural network to predict the potential energy surfaces for these systems.

However, when the adversarial approach is used to retrain the neural networks, the authors saw a performance jump to 97 percent using only 500 extra points.

1 month, 2 weeks ago @ news.mit.edu
360-degree transparency for construction sites made simple
360-degree transparency for construction sites made simple 360-degree transparency for construction sites made simple

MIT spinoff OpenSpace invented automated 360-degree video jobsite capture and mapping.

Enter OpenSpace, a company that’s propelling the construction of any built environment into the digital age.

It's essentially passive; the OpenSpace Vision System does all the work, mapping site photos to site plans automatically.

Prior to OpenSpace, CTO DeCamp was a computer vision and data visualization research scientist at the Institute.

No more trips to the site required when you can see a past-to-present 360-degree view from anywhere.

1 month, 2 weeks ago @ news.mit.edu
Jordan Harrod: Brain researcher and AI-focused YouTuber
Jordan Harrod: Brain researcher and AI-focused YouTuber Jordan Harrod: Brain researcher and AI-focused YouTuber

Scientist, writer, policy advocate, YouTuber – before Jordan Harrod established her many successful career identities, her first role was as a student athlete.

Mapping the brain to understand consciousness: Today, Harrod collaborates with professors Emery Brown, an anesthesiologist, and Ed Boyden, a neuroscientist, to study how different parts of the brain relate to consciousness and arousal.

Since beginning her neuroscience research, Harrod has been amazed to learn how much about the brain still needs to be uncovered.

She is the chair of the External Affairs Board of the Graduate Student Council, an Early Career Policy Ambassador for the Society for Neuroscience, and the co-founder of the MI…

1 month, 2 weeks ago @ news.mit.edu
Berkeley AI
last post 1 day, 15 hours ago
Updates and Lessons from AI Forecasting
Updates and Lessons from AI Forecasting Updates and Lessons from AI Forecasting

Updates and Lessons from AI Forecasting. Cross-posted from Bounded Regret.

I now expect more progress in AI over the next 4 years than I did previously.

Professional forecasters are individuals, or often teams, who make money by placing accurate predictions in prediction markets or forecasting competitions.

Specifically on AI, Metaculus had a recent AI prediction tournament, and Hypermind ran the same questions on their own platform (AI2023, AI2030).

This is more likely if AI heats up significantly, so unfortunately I expect forecasts to be least reliable when we need them most.

1 day, 15 hours ago @ bair.berkeley.edu
PICO: Pragmatic Compression for Human-in-the-Loop Decision-Making
PICO: Pragmatic Compression for Human-in-the-Loop Decision-Making PICO: Pragmatic Compression for Human-in-the-Loop Decision-Making

In a 2D top-down car racing video game with an extremely high compression rate (50%), our compression model learns to preserve bends and discard the road farther ahead.

Pragmatic Compression: The straightforward approach to optimizing reconstructions for a specific task would be to train the compression model to directly minimize the loss function for that task.

We call our method PragmatIc COmpression (PICO).

For these users, PICO learned to preserve the sportiness and perceived price of the car, while randomizing color and background.

If you want to learn more, check out our pre-print on arXiv: Siddharth Reddy, Anca D. Dragan, Sergey Levine, Pragmatic Image Compression for Human-in-the-Loop…

1 week, 2 days ago @ bair.berkeley.edu
Unsolved ML Safety Problems
Unsolved ML Safety Problems Unsolved ML Safety Problems

Unsolved ML Safety Problems: Along with researchers from Google Brain and OpenAI, we are releasing a paper on Unsolved Problems in ML Safety.

Due to emerging safety challenges in ML, such as those introduced by recent large-scale models, we provide a new roadmap for ML Safety and refine the technical problems that the field needs to address.

To operate in open-world high-stakes environments, ML systems will need to endure unusual events and tail risks.

However, current ML systems are brittle in the face of real-world complexity, and they become unreliable when faced with novel situations like those above.

How can we teach ML systems to model happiness, good judgment, freedom of action…

2 weeks, 2 days ago @ bair.berkeley.edu
Distilling neural networks into wavelet models using interpretations
Distilling neural networks into wavelet models using interpretations Distilling neural networks into wavelet models using interpretations


2 weeks, 3 days ago @ bair.berkeley.edu
What Can I Do Here? Learning New Skills by Imagining Visual Affordances
What Can I Do Here? Learning New Skills by Imagining Visual Affordances What Can I Do Here? Learning New Skills by Imagining Visual Affordances

Learning New Skills by Imagining Visual Affordances: How do humans become so skillful?

VAL, Visuomotor Affordance Learning: Our method, Visuomotor Affordance Learning, or VAL, addresses these challenges.

VAL, Offline Phase: Given a prior dataset demonstrating the affordances of various environments, VAL digests this information in three offline steps: representation learning to handle high-dimensional real-world data, affordance learning to enable self-supervised practice in unknown environments, and behavior learning to attain a high-performance initial policy that accelerates online learning efficiency.

Because of this, improvements in offline reinforcement learning will be critical for enabli…

3 weeks ago @ bair.berkeley.edu
Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning
Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning

We consider a problem: Can a machine learn from a few labeled pixels to predict every pixel in a new image?

Weak supervision can be roughly categorized into two families: Coarse and Sparse supervision.

Metric Learning and Contrastive Loss Formulation: To solve the semi-supervised learning problem, we take the viewpoint of feature representation learning.

A Solution for Universal Weakly Supervised Segmentation: In this work, we propose a single method to tackle all forms of weak supervision, even if they carry different assumptions.

We thank all co-authors of the paper “Universal Weakly Supervised Segmentation by …

2 months, 3 weeks ago @ bair.berkeley.edu
The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games
The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games

We refer to PPO with these modifications as Multi-Agent PPO (MAPPO).

Overall, we observe that in the majority of environments, MAPPO achieves results comparable or superior to off-policy methods with comparable sample-efficiency.

Additionally, this suggests that despite a heavy emphasis on developing new off-policy methods for MARL, on-policy methods such as PPO can be a promising direction for future research.

These include: investigating MAPPO’s performance on a wider range of domains, such as competitive games or multi-agent settings with continuous action spaces.

This post is based on the paper “The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games”.

3 months ago @ bair.berkeley.edu
BASALT: A Benchmark for Learning from Human Feedback
BASALT: A Benchmark for  Learning from Human Feedback BASALT: A Benchmark for Learning from Human Feedback

Despite the plethora of techniques developed to tackle this problem, there have been no popular benchmarks that are specifically intended to evaluate algorithms that learn from human feedback.

In contrast, there is effectively no chance of such an unsupervised method solving BASALT tasks.

Design a “caption prompt” for each BASALT task that induces the policy to solve that task.

We impose limits on the amount of compute and human feedback that submissions can use to prevent this scenario.

ConclusionWe hope that BASALT will be used by anyone who aims to learn from human feedback, whether they are working on imitation learning, learning from comparisons, or some other method.

3 months, 1 week ago @ bair.berkeley.edu
Learning What To Do by Simulating the Past
Learning What To Do by Simulating the Past Learning What To Do by Simulating the Past

Preferences Implicit in the State of the World develops an algorithm, Reward Learning by Simulating the Past (RLSP), that does this sort of reasoning, allowing an agent to infer human preferences without explicit feedback.

In our latest paper presented at ICLR 2021, we introduce Deep Reward Learning by Simulating the Past (Deep RLSP), an extension of the RLSP algorithm that can be scaled up to tasks like the balancing Cheetah task.

To address this, we sample likely past trajectories, instead of enumerating all possible past trajectories.

By alternating between predicting past actions, and predicting past states from which those actions were taken, we can simulate trajectories arbitrarily fa…

5 months, 2 weeks ago @ bair.berkeley.edu
An EPIC way to evaluate reward functions
An EPIC way to evaluate reward functions An EPIC way to evaluate reward functions

Our method, Equivalent-Policy Invariant Comparison (EPIC), allows one to evaluate a reward function by computing how similar it is to other reward functions.

EPIC can be used to benchmark reward learning algorithms by comparing learned reward functions to a ground-truth reward.

It can also be used to validate learned reward functions prior to deployment, by comparing them against reward functions learned via different techniques or data sources.

EPIC is a new way to evaluate reward functions and reward learning algorithms by comparing how similar reward functions are to one another.

Most significantly, EPIC can only compare reward functions to one another, and cannot tell you what a particu…

5 months, 4 weeks ago @ bair.berkeley.edu
The Importance of Hyperparameter Optimization for Model-based Reinforcement Learning
The Importance of Hyperparameter Optimization for Model-based Reinforcement Learning The Importance of Hyperparameter Optimization for Model-based Reinforcement Learning

Model-based reinforcement learning (MBRL) is a variant of the iterative learning framework, reinforcement learning, that includes a structured component of the system that is solely optimized to model the environment dynamics.

MBRL: Model-based reinforcement learning (MBRL) is an iterative framework for solving tasks in a partially understood environment.

With that data, the agent creates a structured learning tool – a dynamics model – to reason about the world.

Automated Machine Learning (AutoML) is a field dedicated to the study of using machine learning algorithms to tune our machine learning tools.

Thi…

5 months, 4 weeks ago @ bair.berkeley.edu
Pretrained Transformers as Universal Computation Engines
Pretrained Transformers as Universal Computation Engines Pretrained Transformers as Universal Computation Engines

Transformers have been successfully applied to a wide variety of modalities: natural language, vision, protein modeling, music, robotics, and more.

This enables the models to utilize generalizable high-level embeddings trained on a large dataset to avoid overfitting to a small task-relevant dataset.

To illustrate this, we take a pretrained transformer language model and finetune it on various classification tasks: numerical computation, vision, and protein fold prediction.

Furthermore, we find the language-pretrained frozen transformers converge faster than the randomly initialized frozen transformers, typically by a factor of 1-4x, in…

6 months, 3 weeks ago @ bair.berkeley.edu
Maximum Entropy RL (Provably) Solves Some Robust RL Problems
Maximum Entropy RL (Provably) Solves Some Robust RL Problems Maximum Entropy RL (Provably) Solves Some Robust RL Problems

Our analysis provides a theoretically-justified explanation for the empirical robustness of MaxEnt RL, and proves that MaxEnt RL is itself a robust RL algorithm.

In the rest of this post, we’ll provide some intuition into why MaxEnt RL should be robust and what sort of perturbations MaxEnt RL is robust to.

(Figure: Standard RL vs. MaxEnt RL, trained and evaluated without the obstacle, and trained without the obstacle but evaluated with it.) Theory: We now formally describe the technical results from the paper.

(Figure: Standard RL vs. MaxEnt RL, evaluated on adversarial perturbations.) MaxEnt RL is robust to adversarial perturbations of the hole (where the robot inserts the peg).

Conclusion: In summary, our paper sho…

7 months, 1 week ago @ bair.berkeley.edu
Maximum Entropy RL (Provably) Solves Some Robust RL Problems
Maximum Entropy RL (Provably) Solves Some Robust RL Problems Maximum Entropy RL (Provably) Solves Some Robust RL Problems

Nearly all real-world applications of reinforcement learning involve some degree of shift between the training environment and the testing environment.

In a recent paper, we prove that every MaxEnt RL problem corresponds to maximizing a lower bound on a robust RL problem.

In the rest of this post, we’ll provide some intuition into why MaxEnt RL should be robust and what sort of perturbations MaxEnt RL is robust to.

Conclusion: In summary, this paper shows that a commonly-used type of RL algorithm, MaxEnt RL, is already solving a robust RL problem.

We do not claim that MaxEnt RL will outperform purpose-designed robust RL algorithms.

7 months, 1 week ago @ bair.berkeley.edu
Self-Supervised Policy Adaptation during Deployment
Self-Supervised Policy Adaptation during Deployment Self-Supervised Policy Adaptation during Deployment

Our method learns a task in a fixed, simulated environment and quickly adapts to new environments (e.g.

Assuming that gradients of the self-supervised objective are sufficiently correlated with those of the RL objective, any adaptation in the self-supervised task may also influence and correct errors in the perception and decision-making of the policy.

SAC+IDM is a Soft Actor-Critic (SAC) policy trained with an Inverse Dynamics Model (IDM), and SAC+IDM (PAD) is the same policy but with the addition of policy adaptation during deployment on the robot.

Policy adaptation is especially effective when the test environment differs from the traini…

7 months, 3 weeks ago @ bair.berkeley.edu
AWS Machine Learning
last post 1 hour ago
HawkEye 360 predicts vessel risk using the Deep Graph Library and Amazon Neptune
HawkEye 360 predicts vessel risk using the Deep Graph Library and Amazon Neptune HawkEye 360 predicts vessel risk using the Deep Graph Library and Amazon Neptune

HawkEye 360 chose Neptune as their graph database to store relationship information and DGL for their graph neural network (GNN) capability.

With the graph notebook library, HawkEye 360 was able to query the underlying graph data in Gremlin query language using the %%gremlin function.

With the data in Neptune, HawkEye 360 created a per-vessel risk score to identify the risk that a given vessel will engage in suspicious behavior.

Using the Deep Graph Library to predict vessel risk: The first step to predicting vessel risk is to create the graph dataset.

With these nodes, HawkEye 360 formulated a semi-supervised node classification problem to classify vessel risk.
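
As a rough sketch of what semi-supervised node classification with DGL looks like (not HawkEye 360's actual model or data), a two-layer GCN can be trained on a small synthetic graph where only some nodes carry labels:

```python
# Minimal sketch of semi-supervised node classification with DGL; the graph,
# features, and labels below are synthetic placeholders.
import dgl
import torch
import torch.nn as nn
import torch.nn.functional as F
from dgl.nn import GraphConv

src = torch.tensor([0, 1, 2, 3, 4])
dst = torch.tensor([1, 2, 3, 4, 0])
g = dgl.add_self_loop(dgl.graph((src, dst), num_nodes=5))
features = torch.randn(5, 8)                                   # per-node features
labels = torch.tensor([0, 1, 0, 1, 0])                         # risk classes
train_mask = torch.tensor([True, True, True, False, False])    # only some nodes labeled

class GCN(nn.Module):
    def __init__(self, in_feats, hidden, n_classes):
        super().__init__()
        self.conv1 = GraphConv(in_feats, hidden)
        self.conv2 = GraphConv(hidden, n_classes)

    def forward(self, graph, x):
        h = F.relu(self.conv1(graph, x))
        return self.conv2(graph, h)

model = GCN(8, 16, 2)
opt = torch.optim.Adam(model.parameters(), lr=0.01)
for _ in range(100):
    logits = model(g, features)
    loss = F.cross_entropy(logits[train_mask], labels[train_mask])
    opt.zero_grad()
    loss.backward()
    opt.step()

risk_scores = F.softmax(model(g, features), dim=1)[:, 1]  # probability of the "risky" class
```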

1 hour ago @ aws.amazon.com
Amazon Personalize can now unlock intrinsic signals in your catalog to recommend similar items
Amazon Personalize can now unlock intrinsic signals in your catalog to recommend similar items Amazon Personalize can now unlock intrinsic signals in your catalog to recommend similar items

These recommendations are called similar items, and they help users discover items relevant to what they’re watching or purchasing.

In Amazon Personalize, the aws-similar-items recipe uses deep learning based techniques and the knowledge you have about your items to identify similarity.
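
A hedged boto3 sketch of using this recipe might look like the following; all ARNs, names, and item IDs are placeholders, and the intermediate steps (solution version and campaign creation) are only hinted at.

```python
# Hypothetical sketch with boto3; all ARNs and names are placeholders.
import boto3

personalize = boto3.client("personalize")
personalize.create_solution(
    name="similar-items-solution",
    datasetGroupArn="arn:aws:personalize:us-east-1:123456789012:dataset-group/demo",
    recipeArn="arn:aws:personalize:::recipe/aws-similar-items",
)
# ...create a solution version and a campaign for it, then query the campaign:

runtime = boto3.client("personalize-runtime")
response = runtime.get_recommendations(
    campaignArn="arn:aws:personalize:us-east-1:123456789012:campaign/similar-items",
    itemId="SODA_123",
    numResults=10,
)
print([item["itemId"] for item in response["itemList"]])
```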

Amazon Personalize enables developers to build applications using machine learning (ML) technology to deliver personalized user experiences with no ML expertise required.

Similar items recommendations should be aligned with things that pair well with a soda!

Detailed descriptions and other attributes provide valuable signals to recommend more similar items, especially for items with limited interactions hist…

1 day ago @ aws.amazon.com
How NSF’s iHARP researchers are enabling active learning for polar ice analysis using Amazon SageMaker and Amazon A2I
How NSF’s iHARP researchers are enabling active learning for polar ice analysis using Amazon SageMaker and Amazon A2I How NSF’s iHARP researchers are enabling active learning for polar ice analysis using Amazon SageMaker and Amazon A2I

The lab’s work is supported by NSF BIGDATA awards (IIS-1947584, IIS-1838230), the NSF HDR Institute award (OAC-2118285), and the Amazon ML research award for climate change.

An approach for using ML for Arctic ice analysis; scalable ML with SageMaker; an active learning workflow with Amazon Augmented AI (Amazon A2I) and SageMaker. What is Arctic ice analysis?

So, what do we really mean by analysis of ice data?

We can achieve this by training an ML model using supervised learning to detect ice layers from radar images, for example.

We didn’t select this option as we were in the experimental stage, but for more details, see Deploy a Model in Amazon SageMaker.

1 day, 1 hour ago @ aws.amazon.com
How Imperva expedites ML development and collaboration via Amazon SageMaker notebooks
How Imperva expedites ML development and collaboration via Amazon SageMaker notebooks How Imperva expedites ML development and collaboration via Amazon SageMaker notebooks

In this post, we share how we expedited ML development and collaboration via Amazon SageMaker notebooks.

Amazon SageMaker is a fully managed service that provides every developer and data scientist with the ability to build, train, and deploy ML.

SageMaker includes this exact capability and more as part of its SageMaker notebooks feature.

Having unlimited computing resources is great, but it wasn’t why we decided to start using SageMaker notebooks.

We created a lifecycle configuration that connects an EFS to a notebook instance, and attached this configuration to every notebook instance we wanted to be part of the shared environment.
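
One possible way to express such a lifecycle configuration with boto3 is sketched below; the EFS DNS name, mount path, and configuration name are placeholders, and the script is an assumption rather than Imperva's exact setup.

```python
# Hypothetical sketch: an OnStart lifecycle configuration that mounts a shared EFS
# volume into a notebook instance. The EFS DNS name and paths are placeholders.
import base64
import boto3

on_start = """#!/bin/bash
set -e
mkdir -p /home/ec2-user/SageMaker/shared
mount -t nfs -o nfsvers=4.1 fs-0123456789abcdef0.efs.us-east-1.amazonaws.com:/ \\
      /home/ec2-user/SageMaker/shared
chmod go+rw /home/ec2-user/SageMaker/shared
"""

sm = boto3.client("sagemaker")
sm.create_notebook_instance_lifecycle_config(
    NotebookInstanceLifecycleConfigName="mount-shared-efs",
    OnStart=[{"Content": base64.b64encode(on_start.encode()).decode()}],
)
# New notebook instances can then reference LifecycleConfigName="mount-shared-efs".
```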

1 day, 2 hours ago @ aws.amazon.com
Organize product data to your taxonomy with Amazon SageMaker
Organize product data to your taxonomy with Amazon SageMaker Organize product data to your taxonomy with Amazon SageMaker

When companies deal with data that comes from various sources or the collection of this data has changed over time, the data often becomes difficult to organize.

In this post, I discuss how to organize product data to your classification needs with Amazon SageMaker.

After this data lands in the data lake, you may need to organize this data into your company’s own taxonomy to create, for example, a better user search experience on front-end and back-office applications.

You can apply this solution to other use cases, where you could take the product title and product description and predict the product category.
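
As an illustrative baseline for that use case (not the post's actual pipeline), a simple scikit-learn model can map title-plus-description text to a category; the column names and sample rows below are made up.

```python
# Illustrative baseline (not the post's exact pipeline): predict a product
# category from title + description text. Columns and data are placeholders.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

df = pd.DataFrame({
    "title": ["cola 12 pack", "running shoes", "organic apples"],
    "description": ["carbonated soft drink", "lightweight trainers", "fresh fruit"],
    "category": ["beverages", "footwear", "produce"],
})
text = df["title"] + " " + df["description"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
clf.fit(text, df["category"])
print(clf.predict(["sparkling water 6 pack carbonated drink"]))
```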

In the following code snippets, the F1 scores are largely the same, with the Fas…

2 days, 4 hours ago @ aws.amazon.com
Train and deploy deep learning models using JAX with Amazon SageMaker
Train and deploy deep learning models using JAX with Amazon SageMaker Train and deploy deep learning models using JAX with Amazon SageMaker

To accommodate this, SageMaker provides the flexibility to train models using any framework that can run in a Docker container.

This functionality enables you to use existing SageMaker training capabilities such as training jobs, hyperparameter tuning, and Managed Spot Training.

Spot Training can be enabled by switching a keyword input in the SageMaker training job code.
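
A minimal sketch of that keyword switch on a SageMaker Estimator is shown below; the image URI, role, instance type, and S3 paths are placeholders.

```python
# Hypothetical sketch: enabling Managed Spot Training on a SageMaker Estimator.
# The image URI, role, and S3 paths are placeholders.
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/jax-training:latest",
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    use_spot_instances=True,     # the "keyword input" that turns on Spot Training
    max_run=3600,                # maximum training time in seconds
    max_wait=7200,               # must be >= max_run when using spot capacity
    checkpoint_s3_uri="s3://my-bucket/checkpoints/",
)
estimator.fit("s3://my-bucket/training-data/")
```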

The first is a standard JAX model, the second uses a submodule within JAX called stax, and the third uses a higher-level library called Trax.

Train each of the models using SageMaker training jobs on GPUs.

3 days, 2 hours ago @ aws.amazon.com
How to approach conversation design: Getting started with Amazon Lex (Part 2)
How to approach conversation design: Getting started with Amazon Lex (Part 2) How to approach conversation design: Getting started with Amazon Lex (Part 2)

As you plan your new Amazon Lex application, the following conversation design best practices can help your team succeed in creating a great customer experience.

You can also create your own bot using our self-paced Amazon Lex Workshop.

How to approach conversation design: The basics (Part 1); How to approach conversation design: Getting started with Amazon Lex (Part 2). Gather all requirements before designing: A great design process starts with no design.

In the final post in our series, we discuss how to take the foundational elements from our first two posts and translate it into Amazon Lex.

For more information on Amazon Lex and getting started with AWS for conversational interface experienc…

3 days, 5 hours ago @ aws.amazon.com
Build conversational experiences for credit card services using Amazon Lex
Build conversational experiences for credit card services using Amazon Lex Build conversational experiences for credit card services using Amazon Lex

Amazon Lex for financial services offers pre-built solutions so you can enable more conversational experiences, faster.

Let’s review a sample conversation as we cover the different components of the pre-built solution. Sample conversation (credit card payment): Agent: Welcome to card services.

The pre-built solution includes bots for authentication and card services that can be deployed on Amazon Lex to automate the conversations.

", "CancelCreditCard": "Do you want to cancel the credit card" "DefaultFallback": "Some of the things I can help you with are: bill payment, \ card activation, reporting missing card.

Amazon Lex bots, Lambda functions, DynamoDB table, Amazon Connect contact flow, IAM role…

3 days, 19 hours ago @ aws.amazon.com
Detect online transaction fraud with new Amazon Fraud Detector features
Detect online transaction fraud with new Amazon Fraud Detector features Detect online transaction fraud with new Amazon Fraud Detector features

To help you catch fraud faster across multiple use cases, Amazon Fraud Detector offers specific models with tailored algorithms, enrichments, and feature transformations.

– Is used to retrieve, modify, and delete records stored in Amazon Fraud Detector, and should be unique for each transaction record.

Define the event, event variables, and event labels: Events are the activities that an entity performs.

When event ingestion is on, you can upload historic event data to Amazon Fraud Detector and automatically store event data from predictions in real time.

If you don’t specify a detectorVersionId , Amazon Fraud Detector uses whichever detector version is in Active status.
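
A hedged boto3 sketch of requesting a prediction looks roughly like this; the detector name, event type, entity, and variables are placeholders.

```python
# Hypothetical sketch: request a real-time prediction from Amazon Fraud Detector.
# Detector name, event type, entity, and variables are placeholders.
from datetime import datetime, timezone
import boto3

frauddetector = boto3.client("frauddetector")
response = frauddetector.get_event_prediction(
    detectorId="transaction_fraud_detector",
    # detectorVersionId="1",  # optional; if omitted, the Active version is used
    eventId="802454d3-f7d8-482d-97e8-c4b6db9a0428",
    eventTypeName="online_transaction",
    eventTimestamp=datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    entities=[{"entityType": "customer", "entityId": "12345"}],
    eventVariables={"order_price": "99.50", "ip_address": "203.0.113.10"},
)
print(response["modelScores"], response["ruleResults"])
```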

4 days, 2 hours ago @ aws.amazon.com
Build, tune, and deploy an end-to-end churn prediction model using Amazon SageMaker Pipelines
Build, tune, and deploy an end-to-end churn prediction model using Amazon SageMaker Pipelines Build, tune, and deploy an end-to-end churn prediction model using Amazon SageMaker Pipelines

You can manage your Amazon SageMaker training and inference workflows using Amazon SageMaker Studio and the SageMaker Python SDK.

Amazon SageMaker Pipelines is a tool for building ML pipelines that takes advantage of direct SageMaker integration.

For more information on managing Pipelines from Studio, see View, Track, and Execute SageMaker Pipelines in SageMaker Studio.

Register the trained churn model in the SageMaker Model Registry.

For instructions on getting started with Studio, see Onboard to Amazon SageMaker Studio or watch the video Onboard Quickly to Amazon SageMaker Studio.

4 days, 7 hours ago @ aws.amazon.com
Build your own brand detection and visibility using Amazon SageMaker Ground Truth and Amazon Rekognition Custom Labels – Part 2: Training and analysis workflows
Build your own brand detection and visibility using Amazon SageMaker Ground Truth and Amazon Rekognition Custom Labels – Part 2: Training and analysis workflows Build your own brand detection and visibility using Amazon SageMaker Ground Truth and Amazon Rekognition Custom Labels – Part 2: Training and analysis workflows

In Part 1 of this series, we showed how to build a brand detection solution using Amazon SageMaker Ground Truth and Amazon Rekognition Custom Labels.

The Check training job and Wait for training job (15 mins) states periodically check the training status until the model is fully trained.

Analysis state machine: The Amazon Rekognition Custom Labels start model state machine manages the runtime of the Amazon Rekognition Custom Labels model.

In this workflow, the input parameter to start an Amazon Rekognition Custom Labels model state machine is a pass through from the video analysis state machine.

Video analysis state machine: The video analysis state machine is composed of different states: Probe…

1 week, 1 day ago @ aws.amazon.com
Bring structure to diverse documents with Amazon Textract and transformer-based models on Amazon SageMaker
Bring structure to diverse documents with Amazon Textract and transformer-based models on Amazon SageMaker Bring structure to diverse documents with Amazon Textract and transformer-based models on Amazon SageMaker

With Amazon Textract, you can already go beyond simple extraction of handwritten or printed text (OCR).

First, we run a fraction of the source documents through Amazon Textract to extract the text and position data.

Existing pipelines set up for Amazon Textract can read the JSON as usual, but take advantage of the extra fields if they’re present and understood by the consumer.

Some use cases may even apply multiple ML models to meet the overall business requirement.

For other examples integrating Amazon Textract, see Additional Code Examples.

1 week, 2 days ago @ aws.amazon.com
Run computer vision inference on large videos with Amazon SageMaker asynchronous endpoints
Run computer vision inference on large videos with Amazon SageMaker asynchronous endpoints Run computer vision inference on large videos with Amazon SageMaker asynchronous endpoints

In this post, we show you how to serve a PyTorch CV model with SageMaker asynchronous inference to process a burst traffic of large input payload videos uploaded to Amazon S3.
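
For context, a client can hand such a video to an asynchronous endpoint with a single boto3 call, sketched below with a placeholder endpoint name and S3 locations.

```python
# Hypothetical sketch: submit a large video payload to a SageMaker asynchronous
# endpoint. The endpoint name and S3 locations are placeholders.
import boto3

runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint_async(
    EndpointName="cv-video-async-endpoint",
    InputLocation="s3://my-bucket/videos/input/stadium-cam-01.mp4",
    ContentType="video/mp4",
)
# The call returns immediately; results land in the endpoint's configured S3
# output path, referenced by this output location and inference ID.
print(response["OutputLocation"], response["InferenceId"])
```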

For details on the API to automatically scale an asynchronous endpoint, see Autoscale an Asynchronous Endpoint.

Monitor the asynchronous endpoint: We monitor the asynchronous endpoint with built-in additional CloudWatch metrics specific to asynchronous inference.

We also monitor the total processing time from input in Amazon S3 to output back in Amazon S3 with TotalProcessingTime and the time spent in backlog with the TimeInBacklog metric.

To get started with SageMaker asynchronous inference, see Asynchronous Inference…

1 week, 2 days ago @ aws.amazon.com
Detect defects in automotive parts with Amazon Lookout for Vision and Amazon SageMaker
Detect defects in automotive parts with Amazon Lookout for Vision and Amazon SageMaker Detect defects in automotive parts with Amazon Lookout for Vision and Amazon SageMaker

Defect detection within manufacturing is an important business use case, especially in high-value product industries like the automotive industry.

Detecting anomalies in automotive parts and components using computer vision can be done with normal images, and even with X-ray images for structural damage.

In minutes, you can begin using Lookout for Vision to automate inspection of images and objects with no ML expertise required.

Lookout for Vision enabled us to detect defective images in our dataset and prepare our samples for localizing defective regions in the shortlisted dataset.

Defect localization pipeline: Defect localization is the process of identifying the exact location of a defe…

1 week, 3 days ago @ aws.amazon.com
Build a system for catching adverse events in real-time using Amazon SageMaker and Amazon QuickSight
Build a system for catching adverse events in real-time using Amazon SageMaker and Amazon QuickSight Build a system for catching adverse events in real-time using Amazon SageMaker and Amazon QuickSight

The objective of this post is to provide an example that showcases how to use Amazon SageMaker and pre-trained transformer models to detect AEs mentioned on social media.

The workflow includes the following steps: Train a classification model using SageMaker training and deploy the trained model to an endpoint using SageMaker real-time inference.

Analyze data from Amazon S3 using Amazon Athena.

When you run the entire notebook (AE_model_train_deploy.ipynb), a SageMaker training job is launched and the model is deployed to a SageMaker endpoint.

At a high level in QuickSight, you import the data using Athena and locate your Athena database and table that are linked to your S3 bucket.

1 week, 3 days ago @ aws.amazon.com
NVIDIA
last post 2 hours ago
Upcoming DLI Training: Learn How to Build Conversational AI Applications
Upcoming DLI Training: Learn How to Build Conversational AI Applications Upcoming DLI Training: Learn How to Build Conversational AI Applications

As the world continues to evolve and become more digital, conversational AI is increasingly used as a means for automation.

The NVIDIA Deep Learning Institute is hosting a workshop on how to build a conversational AI service using the NVIDIA Riva framework.

Learn how to quickly build and deploy production-quality conversational AI applications with real-time transcription and natural language processing capabilities.

This makes it easy for developers to quickly create, deploy, and run end-to-end, real-time conversational AI applications unique to a company and its customers.

Get hands-on training and learn: how to deploy and enable pretrained ASR and NER models on Riva for a conversational A…

2 hours ago @ developer.nvidia.com
Analyzing Cassandra Data using GPUs, Part 2
Analyzing Cassandra Data using GPUs, Part 2 Analyzing Cassandra Data using GPUs, Part 2

Editor’s Note: Watch the Analysing Cassandra Data using GPUs workshop.

In the previous post, I talked about our journey to find the best way to load SSTable data onto the GPU for data analytics.

Sstable-to-arrow can then ship the Arrow data to any client, where the data can be turned into a cuDF and be available for GPU analysis.
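
A possible client-side sketch of that last step, reading an Arrow IPC stream and converting it to cuDF, is shown below; the host and port are placeholders.

```python
# Hypothetical client-side sketch: read an Arrow IPC stream produced by
# sstable-to-arrow and convert it to a cuDF DataFrame. Host and port are placeholders.
import socket
import pyarrow as pa
import cudf

with socket.create_connection(("localhost", 9143)) as sock:
    reader = pa.ipc.open_stream(sock.makefile("rb"))
    arrow_table = reader.read_all()

gdf = cudf.DataFrame.from_arrow(arrow_table)   # data now lives on the GPU
print(gdf.head())
```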

You can also generate IOT data using the generate-data script in the repository, or you can manually create a table using CQL and the Cassandra Docker image (see the Cassandra quickstart for more info).

SSTable to Parquet: Sstable-to-arrow is also able to save the SSTable data as a Parquet file, a common format for storing columnar data.

1 day, 3 hours ago @ developer.nvidia.com
NVIDIA GTC: Industrial at the Edge
NVIDIA GTC: Industrial at the Edge NVIDIA GTC: Industrial at the Edge

Read on to get a first look at our Edge Industrial-focused sessions at GTC (Figure 1).

Scaling AI with Data Monsters on the Manufacturing Floor. Case Study, Edge AI for Industrial Applications: The journey to AI at scale is hard.

Join us as Data Monsters shares a case study of how NVIDIA Metropolis enabled active learning for this edge deployed computer vision solution.

They will also share details on how their solution benefits from using the NVIDIA Fleet Command service to scale deployment and management of edge AI applications across a global network of manufacturing sites.

AeroBERT is currently used to tackle all natural language processing (NLP) challenges at Rolls-Royce that traditional language m…

1 day, 4 hours ago @ developer.nvidia.com
Looks Can Be Perceiving: Startups Build Highly Accurate Perception Software on NVIDIA DRIVE
Looks Can Be Perceiving: Startups Build Highly Accurate Perception Software on NVIDIA DRIVE Looks Can Be Perceiving: Startups Build Highly Accurate Perception Software on NVIDIA DRIVE

Startups worldwide are developing these perception stacks, using the high-performance, energy-efficient compute of NVIDIA DRIVE AGX to deliver highly accurate object detection to autonomous vehicle manufacturers.

The company first began building out its solution on a compute platform with NVIDIA DRIVE in 2016.

“We’ve been using NVIDIA DRIVE from day one,” said Péter Kovács, aiMotive’s senior vice president of aiDrive.

By developing on NVIDIA DRIVE AGX, the company has deployed a robust perception deep neural network for AI-assisted driving platforms.

By developing comprehensive solutions on NVIDIA DRIVE, these startups are delivering accurate and robust perception to AI-assisted and autonom…

1 day, 4 hours ago @ blogs.nvidia.com
NVIDIA Releases Updates and New Features in CUDA-X AI Software
NVIDIA Releases Updates and New Features in CUDA-X AI Software NVIDIA Releases Updates and New Features in CUDA-X AI Software

NVIDIA CUDA-X AI is a deep learning software stack for researchers and software developers to build high performance GPU-accelerated applications for conversational AI, recommendation systems, and computer vision.

NVIDIA Triton Inference Server: NVIDIA Triton™ Inference Server is open source inference serving software that brings fast and scalable AI models to applications in production.

Container Composition Utility: create custom Triton containers with specific backends and repository agents.

Get Started >> NVIDIA NeMo: NVIDIA NeMo is an open-source toolkit for developing state-of-the-art conversational AI models.

New and Updated Partner Software: Autovox Hindi ASR Container: Autovox, by Cogkni…

1 day, 6 hours ago @ developer.nvidia.com
NVIDIA GTC: Taking It to the Edge
NVIDIA GTC: Taking It to the Edge NVIDIA GTC: Taking It to the Edge

As the leader in AI, NVIDIA is bringing these topics to the forefront of our annual NVIDIA GTC, taking place November 8-11.

Here is a preview of some of the top sessions in our edge AI track, including topics such as edge computing, IoT, and AI-on-5G.

Edge Computing Session by Justin Boitano, VP and GM of Enterprise and Edge Computing, NVIDIA. Rise of the Intelligent Edge – from Enterprise to Device Edge: AI at the edge is turning data collected from sensors and devices into actionable insights.

Exploring Cloud-native Edge AI: Edge computing is turning the AI data center of the future on its head.

We will review the benefits of cloud native for edge AI and demonstrate how to integrate NVIDIA accel…

1 day, 7 hours ago @ developer.nvidia.com
But Can It Run ‘Crysis Remastered?’ With GeForce NOW, Nearly Any Device Can
But Can It Run ‘Crysis Remastered?’ With GeForce NOW, Nearly Any Device Can But Can It Run ‘Crysis Remastered?’ With GeForce NOW, Nearly Any Device Can

They can use that power to stream Crysis Remastered today, and the complete Crysis Remastered Trilogy starting tomorrow — with the best performance and RTX graphics for Founders and Priority members.

“Thanks to GeForce NOW, players everywhere can experience Crysis Remastered at the quality we always envisioned,” said Steffen Halbig, project lead at Crytek.

Crysis Remastered’s Steam and Epic Games Store versions join GeForce NOW today, and the full Crysis Remastered Trilogy will stream from the cloud at launch tomorrow.

The Crysis Remastered Trilogy — which consists of Crysis, Crysis 2 and Crysis 3 Remastered editions — brings the same graphical and technology upgrades to the sequels, making…

1 day, 11 hours ago @ blogs.nvidia.com
AI Researchers Visualize Flooding Caused by Global Warming
AI Researchers Visualize Flooding Caused by Global Warming AI Researchers Visualize Flooding Caused by Global Warming

The team developed the model, named ClimateGAN, to underscore the destruction caused by extreme weather events and to prompt collective action toward curbing emissions.

People across the globe are grappling with more frequent extreme weather events including storms, hurricanes, droughts, and wildfires brought on by a warming planet.

According to the researchers, first-person perspectives and images of extreme weather events can reduce distancing.

Using these two data sources, the Masker model predicts the location of water in an image, if flooding were to occur.

Then the Painter model—which uses GauGAN, a deep learning model developed by NVIDIA Research—renders contextualized water textures guided by …

2 days, 23 hours ago @ developer.nvidia.com
Cloudy With a Chance of AI: Data Scientists Develop Flood Detection for Early Warning
Cloudy With a Chance of AI: Data Scientists Develop Flood Detection for Early Warning Cloudy With a Chance of AI: Data Scientists Develop Flood Detection for Early Warning

Using a dataset of 66,000 images, data scientists have created an ensemble of models for predicting flood zones.

The pioneering effort landed second place at the Emerging Techniques in Computational Intelligence (ETCI) 2021 competition on flood detection.

Flood Segmentation in Seconds: The ETCI competition asked contestants to use 66,000 labeled SAR Sentinel-1 images whose pixels show conditions before and after a flood.

Participants were challenged to develop semantic segmentation models using the data so that they could be applied to new unlabeled images to perform inference on potential flood zones.

They are in talks with the United Nations Satellite Centre, which is interested in testing the…

3 days, 8 hours ago @ blogs.nvidia.com
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model

We are excited to introduce the DeepSpeed- and Megatron-powered Megatron-Turing Natural Language Generation model (MT-NLG), the largest and the most powerful monolithic transformer language model trained to date, with 530 billion parameters.

Language models with large numbers of parameters, more data, and more training time acquire a richer, more nuanced understanding of language.

We built high-quality, natural language training corpora with hundreds of billions of tokens, and co-developed training recipes to improve optimization efficiency and stability.

Training dataset and model configuration: We used the architecture of the transformer decoder, which is a left-to-right generative transfor…

4 days, 11 hours ago @ developer.nvidia.com
Creating an Apparel Detection App with NVIDIA GPU-Accelerated AI Technologies
Creating an Apparel Detection App with NVIDIA GPU-Accelerated AI Technologies Creating an Apparel Detection App with NVIDIA GPU-Accelerated AI Technologies

This guest post was submitted by Drishtic AI Lead Developer, Priti Gavali and Technical Architect, Archana Borawake.

Apparel companies can use this data to assess demand and develop clothing that appeals to individuals based on their age, gender, and preferences.

Drishtic AI uses the NVIDIA Metropolis platform to develop AI-enabled video analytics applications.

To create its apparel detection application, Drishtic AI used a technology stack built on the NVIDIA EGX platform, incorporating NVIDIA-Certified Systems with NVIDIA GPUs.

The Drishtic AI team started by building the dataset primarily from open source images and videos, using 10400 images with 800 images per clothing class.

1 week ago @ developer.nvidia.com
Decreasing MRI Scan Times Using Deep Learning with NVIDIA Clara AGX
Decreasing MRI Scan Times Using Deep Learning with NVIDIA Clara AGX Decreasing MRI Scan Times Using Deep Learning with NVIDIA Clara AGX

In this post, we explore a deep learning method with the NVIDIA Clara AGX developer kit to remove the Gibbs phenomenon and noise from MR images while maintaining high image resolution.

(left) Truncation artifact, also known as the Gibbs phenomenon, shown when approximating a square wave using only five harmonics. Source: Wikipedia.

The data loaders of dldegibbs create two images for each loaded image: a training image and a target image.

The training image is created by simulating the Gibbs phenomenon on the original image in the Fourier domain.
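A minimal sketch of that idea (not the dldegibbs loaders themselves): truncate the target image's Fourier spectrum to introduce ringing, and keep the untouched image as the label.

```python
# Illustrative sketch: create a Gibbs-corrupted training image by cropping the
# target image's Fourier spectrum.
import numpy as np

def simulate_gibbs(image: np.ndarray, keep_fraction: float = 0.6) -> np.ndarray:
    """Zero out high spatial frequencies and return the ringing-corrupted image."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    kh, kw = int(h * keep_fraction) // 2, int(w * keep_fraction) // 2
    mask = np.zeros_like(spectrum)
    mask[h // 2 - kh:h // 2 + kh, w // 2 - kw:w // 2 + kw] = 1.0
    return np.abs(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))

target = np.random.rand(128, 128)        # stands in for a clean source image
training_image = simulate_gibbs(target)  # network input; `target` stays the label
```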

In this project, we used the Open Images dataset consisting of over 1.7 million training images.

1 week ago @ developer.nvidia.com
Maximizing GROMACS Throughput with Multiple Simulations per GPU Using MPS and MIG
Maximizing GROMACS Throughput with Multiple Simulations per GPU Using MPS and MIG Maximizing GROMACS Throughput with Multiple Simulations per GPU Using MPS and MIG

Running multiple simulations per GPU in parallel can substantially increase overall throughput, as has been previously shown for GROMACS in Heterogeneous parallelization and acceleration of molecular dynamics simulations in GROMACS (Figure 11).

MIG: Like MPS, NVIDIA Multi-Instance GPU (MIG) facilitates the sharing of each GPU, but with strict partitioning of resources.

MIG can be combined with MPS, where multiple MPS clients can run simultaneously on each MIG instance, up to a maximum of 48 total MPS clients per physical GPU.

Pure MPS runs: The following script launches multiple simulations with MPS on the 8-GPU DGX A100 server.

Runs combining MIG and MPS: The following script is a version of the…
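Since the post's launch scripts are only excerpted here, the following is a rough, hypothetical sketch of launching several independent simulations per GPU under MPS; it assumes the MPS control daemon is already running and that each runN/ directory contains a prepared topol.tpr.

```python
# Rough, hypothetical sketch (not the post's script): launch several independent
# GROMACS runs per GPU so they share it through MPS. Assumes the MPS control daemon
# (`nvidia-cuda-mps-control -d`) is running and each runN/ directory holds a topol.tpr.
import os
import subprocess

N_GPUS = 8        # e.g. a DGX A100 server
RUNS_PER_GPU = 4  # simulations sharing each GPU via MPS

procs = []
for gpu in range(N_GPUS):
    for i in range(RUNS_PER_GPU):
        run_dir = f"run{gpu * RUNS_PER_GPU + i}"
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        cmd = ["gmx", "mdrun", "-ntmpi", "1", "-ntomp", "4", "-nb", "gpu", "-deffnm", "topol"]
        procs.append(subprocess.Popen(cmd, cwd=run_dir, env=env))

for p in procs:
    p.wait()
```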

1 week ago @ developer.nvidia.com
Oski Technology, an Expert in Formal Verification, Joins NVIDIA
Oski Technology, an Expert in Formal Verification, Joins NVIDIA Oski Technology, an Expert in Formal Verification, Joins NVIDIA

We are excited to announce that Oski Technology, a company specializing in formal verification methods, will be joining NVIDIA.

Today, verification engineers rely on two very different methods to make sure bugs don’t make it into silicon — simulation and formal verification.

Jasper’s software remains one of the most powerful software tools available for formal verification proofs.

In 2005, Vigyan started Oski with the mission of applying the deep computer science of formal verification.

With this acquisition, we have the opportunity to dramatically increase our investment in and commitment to formal verification strategies to achieve that goal.

1 week ago @ blogs.nvidia.com
Rapid Data Pre-Processing with NVIDIA DALI
Rapid Data Pre-Processing with NVIDIA DALI Rapid Data Pre-Processing with NVIDIA DALI

In Figure 1, one can observe the impact of data preprocessing on the training throughput of a ResNet-50 network.

DALI to the rescue: The NVIDIA Data Loading Library (DALI) is the result of our efforts to find a scalable and portable solution to the data pipeline issues mentioned above.

DALI offers data processing primitives for a variety of deep learning applications, such as classification or detection, and supports different data domains, including image, video, audio, and volumetric data.

DALI key concepts: The main entity in DALI is the data processing pipeline.
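A minimal sketch of such a pipeline (assuming DALI is installed and ./images holds per-class subfolders of JPEGs; batch size, device id, and augmentation parameters are illustrative):

```python
# Minimal DALI pipeline sketch: read, decode, resize, and normalize images on the GPU.
from nvidia.dali import fn, pipeline_def, types

@pipeline_def(batch_size=32, num_threads=4, device_id=0)
def image_pipeline():
    jpegs, labels = fn.readers.file(file_root="./images", random_shuffle=True)
    images = fn.decoders.image(jpegs, device="mixed")            # hybrid CPU/GPU decode
    images = fn.resize(images, resize_x=224, resize_y=224)
    images = fn.crop_mirror_normalize(
        images,
        dtype=types.FLOAT,
        mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
        std=[0.229 * 255, 0.224 * 255, 0.225 * 255],
    )
    return images, labels

pipe = image_pipeline()
pipe.build()
images, labels = pipe.run()   # one GPU-resident batch, ready for the training loop
```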

Because of that, the server-side preprocessing case is faster than the offline preprocessing case, even though the former includes d…

1 week, 1 day ago @ developer.nvidia.com
Facebook
latest post 3 months ago
Fully Sharded Data Parallel: faster AI training with fewer GPUs
Fully Sharded Data Parallel: faster AI training with fewer GPUs Fully Sharded Data Parallel: faster AI training with fewer GPUs

It shards an AI model’s parameters across data parallel workers and can optionally offload part of the training computation to the CPUs.

For example, typical data parallel training requires maintaining redundant copies of the model on each GPU, and model parallel training introduces additional communication costs to move activations between workers (GPUs).

Using FSDP in computer vision models: For computer vision models, FSDP is supported in VISSL and has been tested on RegNet architectures.

Users may need to carefully tune the activation checkpointing strategy to fit a large model within limited GPU memory space.

We look forward to developing algorithms for auto-tuning both GPU memory usage and trai…
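A hedged sketch of the core idea with FairScale (not Facebook's training code): wrap a module in FSDP so its parameters are sharded across workers. This assumes the script is launched with one process per GPU, e.g. via torchrun.

```python
# Hedged FSDP sketch with FairScale; assumes a distributed launch (one process per GPU).
import torch
import torch.distributed as dist
from fairscale.nn.data_parallel import FullyShardedDataParallel as FSDP

dist.init_process_group(backend="nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096), torch.nn.ReLU(), torch.nn.Linear(4096, 10)
).cuda()
model = FSDP(model)   # parameters are now sharded across the data-parallel workers

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
data, targets = torch.randn(8, 1024).cuda(), torch.randint(0, 10, (8,)).cuda()

loss = torch.nn.functional.cross_entropy(model(data), targets)
loss.backward()       # gradients are reduced and re-sharded automatically
optimizer.step()
```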

3 months ago @ engineering.fb.com
Asicmon: A platform agnostic observability system for AI accelerators
Asicmon: A platform agnostic observability system for AI accelerators Asicmon: A platform agnostic observability system for AI accelerators

We will be hosting a talk about our work on, “A Platform Agnostic Observability System for AI Accelerators” during our virtual Systems @Scale event at 10:20 a.m. PT on Wednesday, June 30, followed by a live Q&A session.

To meet these challenges, we’ve introduced three new tools: ASIC Monitoring (Asicmon), a scalable observability framework.

However, with an accelerator system, we can imagine the CPU now has a complicated and brawnier sibling!

Since implementing Asicmon, we’ve been able to increase our AI accelerator metrics support from ~30 percent to ~75 percent. Atrace: Accelerator tracing at scale. Why tracing?

This would allow us to debug the end-to-end latency of microservices that use AI a…

3 months, 2 weeks ago @ engineering.fb.com
How Facebook encodes your videos
How Facebook encodes your videos How Facebook encodes your videos

People upload hundreds of millions of videos to Facebook every day.

From a pure computing perspective, applying the most advanced codecs to every video uploaded to Facebook would be prohibitively inefficient.

A relatively small percentage (roughly one-third) of all videos on Facebook generate the majority of overall watch time.

The impact of the new video encoding model: In addition to improving viewer experience with newly uploaded videos, the new model can identify older videos on Facebook that should have been encoded with more advanced encodings and route more computing resources to them.

The improved compression has also allowed people on Facebook with limited data plans, such as those i…

6 months, 1 week ago @ engineering.fb.com
How machine learning powers Facebook’s News Feed ranking algorithm
How machine learning powers Facebook’s News Feed ranking algorithm How machine learning powers Facebook’s News Feed ranking algorithm

Models for meaningful interactions and quality content are powered by state-of-the-art ML, such as multitask learning on neural networks, embeddings, and offline learning systems.

We are sharing new details of how we designed an ML-powered News Feed ranking system.

Building a ranking algorithm: To understand how this works, let’s start with a hypothetical person logging in to Facebook; we’ll call him Juan.

On the other hand, perhaps Juan has previously engaged more with video content than photos, so the like prediction for Wei’s cocker spaniel photo might be lower.

Approximating the ideal ranking function in a scalable ranking system: Now that we know the theory behind ranking (as exemplified t…

8 months, 3 weeks ago @ engineering.fb.com
neptune.ai
latest post 1 day, 14 hours ago
Continuous Control With Deep Reinforcement Learning
Continuous Control With Deep Reinforcement Learning Continuous Control With Deep Reinforcement Learning

This time, I want to explore how deep reinforcement learning can be utilized, e.g., for continuous control.

Continuous Control with Deep Reinforcement Learning | Source: Roboschool. Meet Humanoid.

Off-policy actor-critic methods. Reinforcement Learning | Source: Sutton & Barto, Reinforcement Learning: An Introduction, 2nd edition. Let’s recap: Reinforcement learning (RL) is learning what to do — how to map situations to actions — to maximize some notion of cumulative reward.

Reinforcement learning | Source: Researchgate. The actor-critic architecture, depicted in the diagram above, divides the agent into two pieces: the Actor and the Critic.
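As a small illustration of that split (a generic PyTorch sketch, not the article's full agent; the state and action sizes are arbitrary placeholders):

```python
# Generic actor-critic sketch for continuous actions.
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps a state to a continuous action in [-1, 1]^action_dim."""
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 256), nn.ReLU(),
                                 nn.Linear(256, action_dim), nn.Tanh())

    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    """Scores a (state, action) pair with a single Q-value."""
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 1))

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

actor, critic = Actor(24, 6), Critic(24, 6)
state = torch.randn(1, 24)
q_value = critic(state, actor(state))   # the Critic evaluates the Actor's action
```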

You can read more about logging these diagnostics in Logging in Reinforcement Learni…

1 day, 14 hours ago @ neptune.ai
Tips and Tricks to Train State-Of-The-Art NLP Models
Tips and Tricks to Train State-Of-The-Art NLP Models Tips and Tricks to Train State-Of-The-Art NLP Models

With the introduction of packages like transformers by huggingface, it is very convenient to train NLP models for any given task.

What methods, tips, and tricks can you use to train state-of-the-art NLP models?

Most of the options needed to quickly make comparisons are available in a couple of clicks, so we can interactively run analyses and train models.

The field is evolving drastically and researchers are discovering new tricks every day to reliably train these models.

We hope with this article you are empowered with great techniques and tools to confidently train and track the stable state-of-the-art NLP models.

2 days, 9 hours ago @ neptune.ai
Version Control for Machine Learning and Data Science
Version Control for Machine Learning and Data Science Version Control for Machine Learning and Data Science

Read also 🎓 Version Control Guide for Machine Learning Researchers. Types of version control systems: There are three main types of version control systems: Local Version Control Systems, Centralized Version Control Systems, and Distributed Version Control Systems. Local Version Control System: This version control system consists of a local database on your computer that stores every file change as a patch (the difference between files in a unique format).

Local Version Control System | Source: Gitlab. Centralized Version Control Systems: This system has a single central server that stores the different file versions.

Distributed Version Control Systems | Source: Gitlab. Benefits of version control systems: Effi…

3 days, 13 hours ago @ neptune.ai
Transformer Models for Textual Data Prediction
Transformer Models for Textual Data Prediction Transformer Models for Textual Data Prediction

Predicting paraphrased sentences: Often when we look at text data, we’re trying to understand if certain sentences are related.

This is useful when processing large amounts of text data, and you want to search for or identify text which may be related without being identical.

Predicting text data with Transformers. The dataset: For predicting text data, we want to use a dataset that represents “real” queries and conversations.

But it’s always good to keep an eye on new applications like this which may be improved by new NLP models like BERT or GPT.

Models like Google’s T5 Text to Text Transformer, which was released in 2020, are examples of transformer-based models that can perform multiple task…

4 days, 13 hours ago @ neptune.ai
Tips for MLOps Setup—Things We Learned From 7 ML Experts
Tips for MLOps Setup—Things We Learned From 7 ML Experts Tips for MLOps Setup—Things We Learned From 7 ML Experts

MLOps helps ML teams to develop solutions and move them into production in a standardized, quick, and least error-prone way.

Governance guidelines cover every stage of the ML pipeline, right from the ideation and data-gathering stage to retraining and monitoring.

Reproducibility of results is essential for the ML teams while choosing the optimal experiment, or even while troubleshooting.

Versioning can be classified into two types: Data versioning: tracking and storing the different datasets and their metadata used for different ML experiments.

1 week ago @ neptune.ai
9 Steps of Debugging Deep Learning Model Training
9 Steps of Debugging Deep Learning Model Training 9 Steps of Debugging Deep Learning Model Training

In this article, I’ll dive deeper into the topic of debugging deep learning models and explain how to do it.

The art of debugging | Source: Unsplash. What is deep learning model debugging, and why is it different from software debugging?

Just like interpretability is a serious issue for deep learning, debugging can be particularly challenging.

We list the basic steps that should always be followed when working on the implementation and evaluation of a deep learning model.

Be patient: Always keep in mind that deep learning model training is a slow and often painful process.

1 week, 1 day ago @ neptune.ai
Knowledge Distillation: Principles, Algorithms, Applications
Knowledge Distillation: Principles, Algorithms, Applications Knowledge Distillation: Principles, Algorithms, Applications

The teacher-student framework for knowledge distillation | Source: Arxiv. Diving deeper into knowledge distillation: A knowledge distillation system consists of three principal components: the knowledge, the distillation algorithm, and the teacher-student architecture [3].

Typical knowledge distillation uses the logits as the source of teacher knowledge, whilst others focus on the weights or activations of intermediate layers.
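A minimal sketch of that logit-based (response-based) variant, assuming teacher and student logits are already available; the temperature and weighting below are illustrative defaults, not values from the article.

```python
# Response-based distillation: soften teacher and student logits with a temperature T
# and match them with KL divergence, mixed with cross-entropy on the hard labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

loss = distillation_loss(torch.randn(8, 10), torch.randn(8, 10), torch.randint(0, 10, (8,)))
```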

The different forms of knowledge are categorized into three different types: Response-based knowledge, Feature-based knowledge, and Relation-based knowledge.

Algorithms for knowledge distillation: In this section, I will focus on the algorithms for training student models t…

1 week, 2 days ago @ neptune.ai
The Best Open-Source MLOps Tools You Should Know
The Best Open-Source MLOps Tools You Should Know The Best Open-Source MLOps Tools You Should Know

You don’t need to spend a lot on MLOps tools to bring the magic of DevOps to your machine learning projects.

MLReef. MLOps open-source tool: MLReef | Source. MLReef is an MLOps platform for teams to collaborate and share the results of their machine learning experiments.

MLOps open-source tool: Kedro | Source. Kedro has many good features: Project templates: Usually, you have to spend a lot of time understanding how to set up your analytics project.

CML: CML (Continuous Machine Learning) is a library for continuous integration and delivery (CI/CD) of machine learning projects.

MLOps open-source tool: Jupyter Notebook | Source. Multi-language support: Often when we talk about Jupyter Notebook, we mean wo…

1 week, 3 days ago @ neptune.ai
How to Choose a Learning Rate Scheduler For Neural Networks
How to Choose a Learning Rate Scheduler For Neural Networks How to Choose a Learning Rate Scheduler For Neural Networks

Learning rate schedules: A learning rate schedule is a predefined framework that adjusts the learning rate between epochs or iterations as training progresses.

Two of the most common techniques for learning rate scheduling are: Constant learning rate: as the name suggests, we initialize a learning rate and don’t change it during training; Learning rate decay: we select an initial learning rate, then gradually reduce it in accordance with a scheduler.
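For the decay option, here is a short, generic PyTorch example (not from the article; the step size and decay factor are arbitrary):

```python
# Step-decay learning rate schedule: halve the learning rate every 10 epochs.
import torch

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # initial learning rate
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(30):
    # ... run the training batches for this epoch here ...
    optimizer.step()
    scheduler.step()                                      # apply the decay schedule
    print(epoch, scheduler.get_last_lr())
```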

Now that you know what learning rate schedules are, you may be wondering why we need to decrease the learning rate in the first place.

Note: you might read articles where the learning rate schedule is defined as the (learning rate) decay only.

Learning ra…

1 week, 4 days ago @ neptune.ai
What Is ModelOps and How Is It Different From MLOps?
What Is ModelOps and How Is It Different From MLOps? What Is ModelOps and How Is It Different From MLOps?

Different models have different solutions – Enterprises will have hundreds of models for different business problems.

Economic – ModelOps platforms are integrated with the cloud, which can help to optimize cloud service and AI models economically.

There is a thin line between ModelOps and MLOps and if you look at their architecture you might see ModelOps as an extension of MLOps.

ModelOps is a superset of MLOps, which refers to the processes involved to operationalize and manage AI models in use in production systems.” – Modzy. ModelOps vs MLOps: It’s easy to confuse ModelOps with MLOps.

We also identified how MLOps and ModelOps are different, but belong to the same concept.

2 weeks, 3 days ago @ neptune.ai
Debug and Visualize Your TensorFlow/Keras Model: Hands-on Guide
Debug and Visualize Your TensorFlow/Keras Model: Hands-on Guide Debug and Visualize Your TensorFlow/Keras Model: Hands-on Guide

In this article, we’re going to discuss model debugging strategies and do a real-world implementation with Neptune and its API.

Model debugging basicsPoor model performance in machine learning can be caused by a variety of factors.

Adjust hyperparameter values: Typically, the most targeted hyperparameters that engineers tweak first are: The learning rate: the learning rate is automatically set by ML libraries.

Increasing the stability of your model makes model training more reproducible.

3 weeks ago @ neptune.ai
Bayesian Neural Networks—Implementing, Training, Inference With the JAX Framework
Bayesian Neural Networks—Implementing, Training, Inference With the JAX Framework Bayesian Neural Networks—Implementing, Training, Inference With the JAX Framework

In a non-Bayesian artificial neural network model (on the left in the figure above), we train a point estimate of network parameters.

What is the Bayesian Neural Network?

List of Bayesian Neural Network components: Dataset D with predictors X (for example, images) and labels Y (for example, classes).

Initialize Bayesian NN parameters: Lines 79-85 take the MLP model parameters and use them to initialize the Bayesian NN parameters.

Luckily, the Bayesian neural network we’ve trained can also tell us how certain it is.

3 weeks, 3 days ago @ neptune.ai
Depth Estimation Models with Fully Convolutional Residual Networks (FCRN)
Depth Estimation Models with Fully Convolutional Residual Networks (FCRN) Depth Estimation Models with Fully Convolutional Residual Networks (FCRN)

Depth estimation rendering for a video | Source: Deep Learning approach to Depth prediction, Google AI. Different approaches for the same objective: Recently, several approaches have been engineered for depth estimation.

The proposed architecture includes fully convolutional layers, transpose-convolutions, and efficient residual up-sampling blocks that help keep track of high-dimensional regression problems.

Unsupervised Monocular Depth Estimation with Left-Right Consistency: This specific architecture is end-to-end and performs unsupervised monocular depth estimation without ground-truth data.

Originally, U-Net was built with two convolutional layers in each block and the number of filters for all co…

3 weeks, 4 days ago @ neptune.ai
Exploring Clustering Algorithms: Explanation and Use Cases
Exploring Clustering Algorithms: Explanation and Use Cases Exploring Clustering Algorithms: Explanation and Use Cases

Different cluster models are employed, and for each of these cluster models, different algorithms can be given.

Centroid-based clustering algorithms / Partitioning clustering algorithms: In centroid/partitioning clustering, clusters are represented by a central vector, which may not necessarily be a member of the dataset.

Note: An example of K-Means clustering is explained with customer segmentation examples in the use cases section below.

Mini-Batch K-Means clustering algorithm: K-Means is one of the most popular clustering algorithms, mainly because of its good time performance.
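A quick sketch of both variants on synthetic data (scikit-learn, illustrative parameters):

```python
# K-Means and Mini-Batch K-Means on a synthetic blob dataset.
from sklearn.cluster import KMeans, MiniBatchKMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=10_000, centers=5, random_state=0)

kmeans = KMeans(n_clusters=5, random_state=0).fit(X)
mb_kmeans = MiniBatchKMeans(n_clusters=5, batch_size=1024, random_state=0).fit(X)

print(kmeans.cluster_centers_[:2])          # central vectors need not be dataset members
print(kmeans.inertia_, mb_kmeans.inertia_)  # mini-batch trades a little fit quality for speed
```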

Summary: This blog covered the most critical aspects of clustering, image compression, digit classification, customer segm…

4 weeks ago @ neptune.ai
MLOps Model Stores: Definition, Functionality, Tools Review
MLOps Model Stores: Definition, Functionality, Tools Review MLOps Model Stores: Definition, Functionality, Tools Review

Let’s explore what model stores are, how they help, and how to pick the right model store for your project.

In a model store, you have logging, discovery, examples, metadata, and everything else you need, and the model store contains a model registry.

In terms of artifact management, model stores manage the model lifecycle, including packaging the model for staging or release.

Model metadata: The model metadata that you can find within the model store includes: the model name set by the user.

ClearML Open Architecture Stack | Source. While a model store is not part of the ClearML application stack, you can build a custom model store on the open-source MLOps engine that pretty much provides the core functionalities o…

4 weeks, 1 day ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
latest post 1 week, 1 day ago
[ML News] DeepMind does Nowcasting | The Guardian's shady reporting | AI finishes Beethoven's 10th
[ML News] DeepMind does Nowcasting | The Guardian's shady reporting | AI finishes Beethoven's 10th [ML News] DeepMind does Nowcasting | The Guardian's shady reporting | AI finishes Beethoven's 10th

#deepmind #nowcasting #machinelearning Your holy update on what's new in the Machine Learning world. OUTLINE:

0:00 - Intro

0:30 - DeepMind tackles Nowcasting

3:30 - The Guardian's shady reporting on TruthfulQA

6:15 - Stochastic training not necessary for generalization

7:35 - Google AI's efficient partitioning of road networks

9:15 - MiniHack Reinforcement Learning Environment

10:45 - Plato XL 11B dialog model

11:35 - AI finishes Beethoven's 10th Symphony

13:10 - AI casts doubt on painting authenticity

15:55 - ShadowDragon social media surveillance

18:45 - Helpful Libraries

25:20 - Samsung to copy-paste brains onto chips References:

DeepMind improves Nowcasting

https://deepmind.com/blog/art…

1 week, 1 day ago @ youtube.com
Grokking: Generalization beyond Overfitting on small algorithmic datasets (Paper Explained)
Grokking: Generalization beyond Overfitting on small algorithmic datasets (Paper Explained) Grokking: Generalization beyond Overfitting on small algorithmic datasets (Paper Explained)

#grokking #openai #deeplearning Grokking is a phenomenon when a neural network suddenly learns a pattern in the dataset and jumps from random chance generalization to perfect generalization very suddenly. This paper demonstrates grokking on small algorithmic datasets where a network has to fill in binary tables. Interestingly, the learned latent spaces show an emergence of the underlying binary operations that the data were created with. OUTLINE:

0:00 - Intro & Overview

1:40 - The Grokking Phenomenon

3:50 - Related: Double Descent

7:50 - Binary Operations Datasets

11:45 - What quantities influence grokking?

15:40 - Learned Emerging Structure

17:35 - The role of smoothness

21:30 - Simple exp…

1 week, 2 days ago @ youtube.com
How far can we scale up? Deep Learning's Diminishing Returns (Article Review)
How far can we scale up? Deep Learning's Diminishing Returns (Article Review) How far can we scale up? Deep Learning's Diminishing Returns (Article Review)

#deeplearning #co2 #cost Deep Learning has achieved impressive results in the last years, not least due to the massive increases in computational power and data that has gone into these models. Scaling up currently promises to be a reliable way to create more performant systems, but how far can we go? This article explores the limits of exponential scaling in AI, and what people are doing to get around this problem OUTLINE:

0:00 - Intro & Overview

1:00 - Deep Learning at its limits

3:10 - The cost of overparameterization

5:40 - Extrapolating power usage and CO2 emissions

10:45 - We cannot just continue scaling up

13:25 - Current solution attempts

15:25 - Aside: ImageNet V2

17:50 - Are symbo…

1 week, 6 days ago @ youtube.com
[ML News] Plagiarism Case w/ Plot Twist | CLIP for video surveillance | OpenAI summarizes books
[ML News] Plagiarism Case w/ Plot Twist | CLIP for video surveillance | OpenAI summarizes books [ML News] Plagiarism Case w/ Plot Twist | CLIP for video surveillance | OpenAI summarizes books

#plagiarism #surveillance #schmidhuber Your Mondaily updates of what's going on in the world of Machine Learning. OUTLINE:

0:00 - Intro

0:20 - New plagiarism case has plot twist

7:25 - CLIP for video surveillance

9:40 - DARPA SubTerranean Challenge

11:00 - Schmidhuber criticizing Turing Lecture

15:00 - OpenAI summarizes books

17:55 - UnBiasIt monitors employees' communications for bias

20:00 - iOS plans to detect depression

21:30 - UK 10 year plan to become AI superpower

23:30 - Helpful Libraries

29:00 - WIT: Wikipedia Image-Text dataset References:

New plagiarism case with plot twist

https://www.reddit.com/r/MachineLearning/comments/pvgpfl/ndr_alleged_plagiarism_of_improve_object/

https://zhu…

2 weeks, 2 days ago @ youtube.com
Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment (Paper Explained)
Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment (Paper Explained) Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment (Paper Explained)

#neurips #peerreview #nips The peer-review system at Machine Learning conferences has come under much criticism over the last years. One major driver was the infamous 2014 NeurIPS experiment, where a subset of papers were given to two different sets of reviewers. This experiment showed that only about half of all accepted papers were consistently accepted by both committees and demonstrated significant influence of subjectivity. This paper revisits the data from the 2014 experiment and traces the fate of accepted and rejected papers during the 7 years since, and analyzes how well reviewers can assess future impact, among other things. OUTLINE:

0:00 - Intro & Overview

1:20 - Recap: The 2014 …

2 weeks, 4 days ago @ youtube.com
100K Subs AMA (Ask Me Anything)
100K Subs AMA (Ask Me Anything) 100K Subs AMA (Ask Me Anything)

Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/ykilcher

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: …

2 weeks, 5 days ago @ youtube.com
[ML News] New ImageNet SOTA | Uber's H3 hexagonal coordinate system | New text-image-pair dataset
[ML News] New ImageNet SOTA | Uber's H3 hexagonal coordinate system | New text-image-pair dataset [ML News] New ImageNet SOTA | Uber's H3 hexagonal coordinate system | New text-image-pair dataset

#truthfulqa #efficientnet #laion400M Your regularly irregular updates on what's happening in the Machine Learning world. OUTLINE:

0:00 - Intro

0:20 - TruthfulQA benchmark shines new light on GPT-3

2:00 - LAION-400M image-text-pair dataset

4:10 - GoogleAI's EfficientNetV2 and CoAtNet

6:15 - Uber's H3: A hexagonal coordinate system

7:40 - AWS NeurIPS 2021 DeepRacer Challenge

8:15 - Helpful Libraries

9:20 - State of PyTorch in September 2021

10:05 - Physics-Based Deep Learning Book

10:35 - Music-conditioned 3D dance generation

11:40 - Stallman's take on legal issues with Codex

12:20 - Tensorflow DirectML on AMD GPUs

13:00 - Schmidhuber Blog: Turing Oversold References:

TruthfulQA - A benchmark…

3 weeks ago @ youtube.com
GPT-3 is a LIAR - Misinformation and fear-mongering around the TruthfulQA dataset
GPT-3 is a LIAR - Misinformation and fear-mongering around the TruthfulQA dataset GPT-3 is a LIAR - Misinformation and fear-mongering around the TruthfulQA dataset

#gpt-3 #truth #conspiracy A new benchmark paper has created quite an uproar in the community. TruthfulQA is a dataset of 817 questions probing for imitative falsehoods where language models become less truthful, the larger they get. This surprising counter-intuitive finding validates many people's criticisms of large language models, but is it really the correct conclusion? OUTLINE:

0:00 - Intro

0:30 - Twitter Paper Announcement

4:10 - Large Language Models are to blame!

5:50 - How was the dataset constructed?

9:25 - The questions are adversarial

12:30 - Are you surprised?! Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilc…

3 weeks, 3 days ago @ youtube.com
Topographic VAEs learn Equivariant Capsules (Machine Learning Research Paper Explained)
Topographic VAEs learn Equivariant Capsules (Machine Learning Research Paper Explained) Topographic VAEs learn Equivariant Capsules (Machine Learning Research Paper Explained)

#tvae #topographic #equivariant Variational Autoencoders model the latent space as a set of independent Gaussian random variables, which the decoder maps to a data distribution. However, this independence is not always desired, for example when dealing with video sequences, we know that successive frames are heavily correlated. Thus, any latent space dealing with such data should reflect this in its structure. Topographic VAEs are a framework for defining correlation structures among the latent variables and induce equivariance within the resulting model. This paper shows how such correlation structures can be built by correctly arranging higher-level variables, which are themselves indepen…

3 weeks, 4 days ago @ youtube.com
[ML News] Roomba Avoids Poop | Textless NLP | TikTok Algorithm Secrets | New Schmidhuber Blog
[ML News] Roomba Avoids Poop | Textless NLP | TikTok Algorithm Secrets | New Schmidhuber Blog [ML News] Roomba Avoids Poop | Textless NLP | TikTok Algorithm Secrets | New Schmidhuber Blog

#schmidhuber #tiktok #roomba Your regularly irregular update on what's happening in the world of Machine Learning. OUTLINE:

0:00 - Intro

0:15 - Sponsor: Weights & Biases

1:55 - ML YouTuber reaches 100k subscribers

2:40 - Facebook AI pushes Textless NLP

5:30 - Schmidhuber blog post: I invented everything

7:55 - TikTok algorithm rabbitholes users

10:45 - Roomba learns to avoid poop

11:50 - AI can spot art forgeries

14:55 - Deepmind's plans to separate from Google

16:15 - Cohere raises 40M

16:55 - US Judge rejects AI inventor on patent

17:55 - Altman: GPT-4 not much bigger than GPT-3

18:45 - Salesforce CodeT5

19:45 - DeepMind Reinforcement Learning Lecture Series

20:15 - WikiGraphs Dataset

20:…

4 weeks, 1 day ago @ youtube.com
Celebrating 100k Subscribers! (w/ Channel Statistics)
Celebrating 100k Subscribers! (w/ Channel Statistics) Celebrating 100k Subscribers! (w/ Channel Statistics)

#yannickilcher #machinelearning #100k OUTLINE:

0:00 - 100k!

1:00 - Announcements & Thanks

3:55 - Channel Statistics Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely o…

1 month ago @ youtube.com
[ML News] AI predicts race from X-Ray | Google kills HealthStreams | Boosting Search with MuZero
[ML News] AI predicts race from X-Ray | Google kills HealthStreams | Boosting Search with MuZero [ML News] AI predicts race from X-Ray | Google kills HealthStreams | Boosting Search with MuZero

#mlnews #schmidhuber #muzero Your regular updates on what's happening in the ML world! OUTLINE:

0:00 - Intro

0:15 - Sponsor: Weights & Biases

1:45 - Google shuts down health streams

4:25 - AI predicts race from blurry X-Rays

7:35 - Facebook labels black men as primates

11:05 - Distill papers on Graph Neural Networks

11:50 - Jürgen Schmidhuber to lead KAUST AI Initiative

12:35 - GitHub brief on DMCA notices for source code

14:55 - Helpful Reddit Threads

19:40 - Simple Tricks to improve Transformers

20:40 - Apple's Unconstrained Scene Generation

21:40 - Common Objects in 3D dataset

22:20 - WarpDrive Multi-Agent RL framework

23:10 - My new paper: Boosting Search Agents & MuZero

25:15 - Can AI …

1 month ago @ youtube.com
∞-former: Infinite Memory Transformer (aka Infty-Former / Infinity-Former, Research Paper Explained)
∞-former: Infinite Memory Transformer (aka Infty-Former / Infinity-Former, Research Paper Explained) ∞-former: Infinite Memory Transformer (aka Infty-Former / Infinity-Former, Research Paper Explained)

#inftyformer #infinityformer #transformer Vanilla Transformers are excellent sequence models, but suffer from very harsh constraints on the length of the sequences they can process. Several attempts have been made to extend the Transformer's sequence length, but few have successfully gone beyond a constant factor improvement. This paper presents a method, based on continuous attention mechanisms, to attend to an unbounded past sequence by representing the past as a continuous signal, rather than a sequence. This enables the Infty-Former to effectively enrich the current context with global information, which increases performance on long-range dependencies in sequence tasks. Further, the p…

1 month, 1 week ago @ youtube.com
[ML News] Blind Chess AI Competition | Graph NNs for traffic | AI gift suggestions
[ML News] Blind Chess AI Competition | Graph NNs for traffic | AI gift suggestions [ML News] Blind Chess AI Competition | Graph NNs for traffic | AI gift suggestions

#mlnews #chess #neurips OUTLINE:

0:00 - Intro

0:30 - Reconnaissance Blind Chess NeurIPS 2021 Competition

3:40 - Colab Pro no longer top priority for GPUs

4:45 - DeepMind uses Graph NNs to do traffic prediction

6:00 - Helpful Libraries: Isaac Gym, Differentiable Human, LVIS, BEHAVIOR

10:25 - Cerebras Wafer Scale Engine Cluster

12:15 - AI Voice Synthesis for Val Kilmer

14:20 - Can AI give thoughtful gifts? References:

Reconnaissance Blind Chess NeurIPS 2021 Competition

https://rbc.jhuapl.edu/

https://rbc.jhuapl.edu/gameRules Colab Pro no longer top priority

https://www.reddit.com/r/MachineLearning/comments/pdwxxz/d_colab_pro_no_longer_gives_you_a_v100_not_even_a/ Google Maps ETA prediction us…

1 month, 1 week ago @ youtube.com
ALiBi - Train Short, Test Long: Attention with linear biases enables input length extrapolation
ALiBi - Train Short, Test Long: Attention with linear biases enables input length extrapolation ALiBi - Train Short, Test Long: Attention with linear biases enables input length extrapolation

#alibi #transformers #attention Transformers are essentially set models that need additional inputs to make sense of sequence data. The most widespread additional inputs are position encodings or position embeddings, which add sequence index information in various forms. However, this has put a limit on the resulting model, which cannot run inference on sequences longer than it has been trained on, as it would encounter unfamiliar position encodings. ALiBi solves this by proposing simple linear fixed biases as position information, adding negligible overhead in time and memory, but surprisingly, the resulting model is able to handle inference on sequences many times as long as its training …

1 month, 1 week ago @ youtube.com
Henry AI Labs
latest post 4 days, 10 hours ago
Etienne Dilocker on Vector Search Engines and Weaviate
Etienne Dilocker on Vector Search Engines and Weaviate Etienne Dilocker on Vector Search Engines and Weaviate

Vector Search Engines are powering the next generation of search. Instead of relying on things like BM25 or TF-IDF representations, we use neural representations to compare semantic similarity. Computing dot products against every vector to find the most similar ones would be very time-consuming. Hierarchical Navigable Small World (HNSW) graphs are a cutting-edge algorithm used by Weaviate to speed up this search. This podcast explores HNSW, neurosymbolic search and filtering, and much more! I hope you find this interesting; I'm happy to answer any questions left on the YouTube video! Check out this Introduction to Weaviate: very well organized, no headaches getting started!
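For a feel of the idea, here is a small sketch using the hnswlib library on random vectors; Weaviate ships its own HNSW implementation, so this only illustrates the concept, with illustrative parameters.

```python
# Approximate nearest-neighbour search with an HNSW index (hnswlib).
import numpy as np
import hnswlib

dim, n = 128, 10_000
vectors = np.random.rand(n, dim).astype(np.float32)     # stand-ins for neural embeddings

index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=n, ef_construction=200, M=16)
index.add_items(vectors, np.arange(n))
index.set_ef(64)                                        # query-time speed/recall trade-off

labels, distances = index.knn_query(vectors[:1], k=5)   # far cheaper than brute-force scoring
print(labels, distances)
```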

https://www.semi.technol…

4 days, 10 hours ago @ youtube.com
Robust Fine-Tuning of Zero-Shot Models
Robust Fine-Tuning of Zero-Shot Models Robust Fine-Tuning of Zero-Shot Models

Researchers from the University of Washington, Columbia University, Open AI, the Allen Institute of Artificial Intelligence, and Toyota Research have teamed up to present a new method for fine-tuning these pre-trained models such as GPT-3, BERT, DALL-E, EfficientNet, or CLIP for application specific datasets. The key insight is that as you fine-tune these models, you gain in-distribution accuracy, but sacrifice the zero-shot flexibility, or out-of-distribution generalization, of these pre-trained “foundation” models. The authors present Weight-Space Ensembling, where you take a linear interpolation between the weights of the zero-shot and fine-tuned model to make new inference. This achieve…
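The interpolation itself is simple; a hedged sketch of the idea (assuming two checkpoints of the same architecture, with hypothetical file names):

```python
# Weight-space ensembling sketch: linearly interpolate between the zero-shot and
# fine-tuned checkpoints. File names and alpha are hypothetical.
import torch

def interpolate_weights(zero_shot_state, fine_tuned_state, alpha=0.5):
    """Return a state dict that mixes the two checkpoints with coefficient alpha."""
    return {k: (1 - alpha) * zero_shot_state[k] + alpha * fine_tuned_state[k]
            for k in zero_shot_state}

# Usage with two checkpoints of the same architecture:
# model.load_state_dict(interpolate_weights(torch.load("zeroshot.pt"),
#                                           torch.load("finetuned.pt"), alpha=0.5))
```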

4 weeks, 1 day ago @ youtube.com
Robust fine-tuning of zero-shot models
Robust fine-tuning of zero-shot models Robust fine-tuning of zero-shot models

Researchers from the University of Washington, Columbia University, Open AI, the Allen Institute of Artificial Intelligence, and Toyota Research have teamed up to present a new method for fine-tuning these pre-trained models such as GPT-3, BERT, DALL-E, EfficientNet, or CLIP for application specific datasets. The key insight is that as you fine-tune these models, you gain in-distribution accuracy, but sacrifice the zero-shot flexibility, or out-of-distribution generalization, of these pre-trained “foundation” models. The authors present Weight-Space Ensembling, where you take a linear interpolation between the weights of the zero-shot and fine-tuned model to make new inference. This achieve…

1 month ago @ youtube.com
Generalization in Open-Domain Question Answering
Generalization in Open-Domain Question Answering Generalization in Open-Domain Question Answering

AI Weekly Update Notion Page: https://ebony-scissor-725.notion.site/AI-Weekly-Update-September-8th-2021-a2119851b5b74470b4971d064665e77e

New AI Weekly Update GitHub Repo: https://github.com/CShorten/AIWeeklyUpdates I am looking for collaborators for two survey papers on Text-to-Image Generation and Data Augmentation Controllers. If you are interested in helping out and being a co-author of either paper, please send me a quick overview of what you think about the topic to cshorten2015@fau.edu! Content Links in the Video:

Challenges in Generalization in Open Domain Question Answering: https://arxiv.org/pdf/2109.01156.pdf

Hurdles to Progress in Long-Form Question Answering: https://arxiv.org/p…

1 month, 1 week ago @ youtube.com
AI Weekly Update - August 7th, 2021 (#40)
AI Weekly Update - August 7th, 2021 (#40) AI Weekly Update - August 7th, 2021 (#40)

Notion Link: https://ebony-scissor-725.notion.site/AI-Weekly-Update-August-7th-2021-3b21331c5c6a45e1a5638955dda7923c Chapters:

0:00 Introduction

0:06 Open-Ended Learning

2:38 Persistent Reinforcement Learning

3:34 Domain-Matched Pre-training for Retrieval

4:22 Growing Knowledge Culturally

5:06 Language Grounding with 3D Objects

6:03 Pre-train, Prompt, and Predict

6:32 QA Dataset Explosion

7:32 ProtoTransformer

9:32 Don’t sweep your Learning Rate under the Rug

10:34 AAVAE

11:40 Domain-Agnostic Contrastive Learning

12:43 Dataset Distillation

14:04 Pointer Value Retrieval

16:01 Go Wider Instead of Deeper

16:52 Geometric Deep Learning on Molecules

18:37 Simulation Framework for Label Noise

19:5…

2 months, 1 week ago @ youtube.com
Reasoning with Language Models - Turning Tables
Reasoning with Language Models - Turning Tables Reasoning with Language Models - Turning Tables

Notion Link: https://ebony-scissor-725.notion.site/Henry-AI-Labs-Weekly-Update-July-22nd-2021-0c43042b93a3459c901f7f5973b949bf Thanks for watching! Please Subscribe!

2 months, 3 weeks ago @ youtube.com
Deduplicating Training Data makes Language Models Better
Deduplicating Training Data makes Language Models Better Deduplicating Training Data makes Language Models Better

Notion Link: https://ebony-scissor-725.notion.site/Henry-AI-Labs-Weekly-Update-July-22nd-2021-0c43042b93a3459c901f7f5973b949bf Follow Katherine Lee on Twitter @katherine1ee

Twitter Thread Link: https://twitter.com/katherine1ee/status/1415496898241339400 Thanks for watching! Please Subscribe!

2 months, 3 weeks ago @ youtube.com
Using HTML for Language Modeling
Using HTML for Language Modeling Using HTML for Language Modeling

Notion Link: https://ebony-scissor-725.notion.site/Henry-AI-Labs-Weekly-Update-July-22nd-2021-0c43042b93a3459c901f7f5973b949bf Thanks for watching! Please Subscribe!

2 months, 3 weeks ago @ youtube.com
Writing with AI - Wordcraft Text Editor
Writing with AI - Wordcraft Text Editor Writing with AI - Wordcraft Text Editor

Notion Link: https://ebony-scissor-725.notion.site/Henry-AI-Labs-Weekly-Update-July-22nd-2021-0c43042b93a3459c901f7f5973b949bf Demo Video: https://www.youtube.com/watch?v=9p4mfA0Fyd8 Thanks for watching! Please Subscribe!

2 months, 3 weeks ago @ youtube.com
Internet-Augmented Dialogue Generation
Internet-Augmented Dialogue Generation Internet-Augmented Dialogue Generation

Notion Link: https://ebony-scissor-725.notion.site/Henry-AI-Labs-Weekly-Update-July-22nd-2021-0c43042b93a3459c901f7f5973b949bf Thank you for watching! Please Subscribe!

2 months, 3 weeks ago @ youtube.com
Beyond Goldfish Memory!
Beyond Goldfish Memory! Beyond Goldfish Memory!

Notion Link: https://ebony-scissor-725.notion.site/Henry-AI-Labs-Weekly-Update-July-22nd-2021-0c43042b93a3459c901f7f5973b949bf Thumbnail Goldfish Image Credit - Photo by zhengtao tang on Unsplash Thanks for watching! Please Subscribe! Chapters:

0:00 Introduction

1:54 Multi-Session Chat

4:12 Models

6:40 Drawbacks of Info Retrieval

9:40 Memory Augmented Models

10:09 Comparison with Language Models

2 months, 3 weeks ago @ youtube.com
Blender Bot 2.0
Blender Bot 2.0 Blender Bot 2.0

2 months, 3 weeks ago @ youtube.com
AI Weekly Update Overview - July 22nd, 2021 (#39)
AI Weekly Update Overview - July 22nd, 2021 (#39) AI Weekly Update Overview - July 22nd, 2021 (#39)

Notion Link: https://ebony-scissor-725.notion.site/Henry-AI-Labs-Weekly-Update-July-22nd-2021-0c43042b93a3459c901f7f5973b949bf Thank you for watching! Please Subscribe! Chapters

0:00 Introduction

0:13 BlenderBot 2.0

2:28 Wordcraft

3:20 Hyper-Text Pre-Training and Prompting

4:18 Deduplicating Training Data

5:00 Turning Tables

6:26 Reasoning with LMs and KGs

7:06 FLEX: Few-Shot NLP

8:33 AlphaFold2

9:27 Semi-Supervised Learning in Action

10:18 AI-Generating Algorithms

10:52 Adaptable Agent Populations

11:36 Recurrent Parameter Generators

12:17 Conservative Objective Models

13:20 Representation Learning for OOD Robots

13:54 MultiBench

14:40 Align before Fuse

15:06 CLIP Benefit in V-L tasks

15:4…

2 months, 3 weeks ago @ youtube.com
Determined AI + HuggingFace!
Determined AI + HuggingFace! Determined AI + HuggingFace!

This video will present the Determined AI Model Hub for using HuggingFace transformers, tokenizers, and datasets with the Determined training platform! I hope you find this video useful! Overview of Determined on Henry AI Labs:

https://www.youtube.com/watch?v=9wbn2ikhbpg

Challenges of Advanced Hyperparameter Search on Henry AI Labs:

https://www.youtube.com/watch?v=5F5LlmO10AM

Walkthrough of the Determined CIFAR-10 Example on Henry AI Labs:

https://www.youtube.com/watch?v=WEYu8DI4LOU Determined Transformers Examples: https://docs.determined.ai/latest/model-hub/transformers/examples.html

HuggingFace Models: https://huggingface.co/models

HuggingFace Datasets: https://huggingface.co/datasets/sw…

2 months, 3 weeks ago @ youtube.com
MultiCite - New Research in Scientific Literature Mining!
MultiCite - New Research in Scientific Literature Mining! MultiCite - New Research in Scientific Literature Mining!

Notion Link: https://ebony-scissor-725.notion.site/Henry-AI-Labs-Weekly-Update-July-15th-2021-a68f599395e3428c878dc74c5f0e1124 Thanks for watching! Please Subscribe!

3 months ago @ youtube.com
3blue1brown
latest post 3 days, 5 hours ago
Newton's Fractal (which Newton knew nothing about)
Newton's Fractal (which Newton knew nothing about) Newton's Fractal (which Newton knew nothing about)

Who knew root-finding could be so complicated?

Early view of the next part: https://www.patreon.com/3blue1brown

An equally valuable form of support is to simply share the videos. ------------------ On fractal dimension:

https://youtu.be/gB9n2gHsHN4 Mathologer on the cubic formula:

https://youtu.be/N-KXStupwsc Some of the videos from this year's Summer of Math Exposition are fairly relevant to the topics covered here. Take a look at these ones, The Beauty of Bézier Curves

https://youtu.be/aVwxzDHniEw The insolubility of the quintic:

https://youtu.be/BSHv9Elk1MU The math behind rasterizing fonts:

https://youtu.be/LaYPoMPRSlk --- These animations are largely made using a custom python library,…
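For anyone who wants to poke at the idea, here is a tiny self-contained sketch (not the video's manim code) that assigns each complex starting point to the root of z^3 - 1 it converges to under Newton's iteration:

```python
# Newton's fractal sketch: iterate z <- z - p(z)/p'(z) for p(z) = z^3 - 1 on a grid
# of complex starting points and record which cube root of unity each converges to.
import numpy as np

roots = np.exp(2j * np.pi * np.arange(3) / 3)    # the three roots of z^3 = 1

def newton_basins(z, steps=40):
    for _ in range(steps):
        z = z - (z**3 - 1) / (3 * z**2)           # one Newton iteration
    return np.argmin(np.abs(z[..., None] - roots), axis=-1)

xs = np.linspace(-2, 2, 400)
grid = xs[None, :] + 1j * xs[:, None]
basins = newton_basins(grid)                      # colouring this array draws the fractal
```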

3 days, 5 hours ago @ youtube.com
Why aren't you making math videos? (Also, now there's a 3b1b podcast)
Why aren't you making math videos?  (Also, now there's a 3b1b podcast) Why aren't you making math videos? (Also, now there's a 3b1b podcast)

Learn more and submit: https://3b1b.co/SoME1

Podcast/New channel: https://youtu.be/C-i4q-Xlnis

↓↓Things referenced through the video↓↓ Join the discord channel:

https://discord.gg/SRTErdZ9 James Schloss:

https://www.youtube.com/user/LeiosOS Free will theorem:

https://www.ams.org/notices/200902/rtx090200226p.pdf Kolmogorov complexity and primes:

https://people.cs.uchicago.edu/~fortnow/papers/kaikoura.pdf Tadashi Tokieda talk:

https://youtu.be/tQQ3oiB32GI Boarbarktree:

https://www.youtube.com/channel/UCFeIEAkqvS4fJMTwUtF4OFw Mathologer:

https://youtu.be/N-KXStupwsc Manim:

https://github.com/3b1b/manim Manim Community edition:

https://github.com/ManimCommunity/manim/ Reanimate:

https://github.…

3 months ago @ youtube.com
A quick trick for computing eigenvalues | Essence of linear algebra, chapter 15
A quick trick for computing eigenvalues | Essence of linear algebra, chapter 15 A quick trick for computing eigenvalues | Essence of linear algebra, chapter 15

How to write the eigenvalues of a 2x2 matrix just by looking at it.
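For reference, the shortcut is the mean-product formula: the eigenvalues of a 2x2 matrix are m ± sqrt(m^2 - p), where m is the mean of the diagonal entries (half the trace) and p is the determinant (the product of the eigenvalues). A tiny check in Python, with an arbitrary example matrix:

```python
# Mean-product shortcut for 2x2 eigenvalues.
import numpy as np

A = np.array([[3.0, 1.0],
              [4.0, 3.0]])

m = A.trace() / 2                  # mean of the diagonal entries (half the trace)
p = np.linalg.det(A)               # product of the eigenvalues (the determinant)
eigs = (m + np.sqrt(m**2 - p), m - np.sqrt(m**2 - p))

print(eigs)                        # approximately (5.0, 1.0), matching np.linalg.eigvals(A)
```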

Thanks to Tim for the jingle: https://www.youtube.com/acapellascience

Help fund future projects: https://www.patreon.com/3blue1brown​

An equally valuable form of support is to simply share the videos.

Special thanks to these supporters: https://3b1b.co/quick-eigen-thanks Introduction to eigenvectors and eigenvalues:

https://youtu.be/PFDu9oVAE-g Lockdown math lecture talking about the mean product formula:

https://youtu.be/MHXO86wKeDY Timestamps:

0:00 - Background

4:53 - Examples

10:24 - Relation to the characteristic polynomial

12:00 - Last thoughts ------------------ These animations are largely made using a custom python …

5 months, 1 week ago @ youtube.com
How (and why) to raise e to the power of a matrix | DE6
How (and why) to raise e to the power of a matrix | DE6 How (and why) to raise e to the power of a matrix | DE6

General exponentials, Love, Schrödinger, and more.

Home page: https://www.3blue1brown.com

Brought to you by you: https://3b1b.co/thanks ------------------

The Romeo-Juliet example is based on this essay by Steven Strogatz:

http://www.stevenstrogatz.com/essays/loves-me-loves-me-not-do-the-math The book shown at the start is Vladimir Arnold's (excellent) textbook on ordinary differential equations.

https://amzn.to/3dtXSwj Need a review of ordinary powers of e?

https://youtu.be/m2MIpDrF7Es Or of linear algebra?

https://youtu.be/kYB8IZa5AuE Timetable

0:00 - Definition

6:40 - Dynamics of love

13:17 - General equation

20:03 - On general rotations

22:11 - Visualizing with flow ------------------

C…
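As a hands-on companion to the topic, here is a minimal numerical sketch (my own example, assuming NumPy and SciPy are installed) comparing SciPy's matrix exponential with a truncated power series:

```python
import numpy as np
from scipy.linalg import expm

# Rotation generator: exp(A * t) rotates the plane by angle t.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

def expm_series(X, terms=20):
    """Truncated power series e^X = sum_k X^k / k!."""
    out = np.zeros_like(X)
    term = np.eye(X.shape[0])
    for k in range(terms):
        out += term
        term = term @ X / (k + 1)
    return out

t = np.pi / 2
print(expm(A * t))         # ~[[0, -1], [1, 0]]: a 90-degree rotation
print(expm_series(A * t))  # the series converges to the same matrix
```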

6 месяцев, 2 недели назад @ youtube.com
The medical test paradox: Can redesigning Bayes rule help?

Bayes factors, aka Likelihood Ratios*, offer a very clear view of how medical test probabilities work.

Home page: https://www.3blue1brown.com

Brought to you by you: https://3b1b.co/bayes-factor-thanks The book by my friend Matt Cook about paradoxes mentioned at the end:

https://amzn.to/3aBrEzg On the topic, I can't help also mentioning another paradox book I'm rather fond of by Bunch:

https://amzn.to/3mBDSKE *As mentioned in the on-screen note at the end, while the terms "Bayes Factor" and "Likelihood Ratio" refer to the same term in this setting, where Bayes rule is used on the probability of an event with only two possible outcomes (you either have the disease or you don't), they do take…
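A small worked sketch of the Bayes-factor view (the prevalence, sensitivity and specificity numbers below are illustrative assumptions, not figures from the video):

```python
prevalence = 0.01     # P(sick)
sensitivity = 0.90    # P(test+ | sick)
specificity = 0.91    # P(test- | healthy)

# Bayes factor of a positive result = P(test+ | sick) / P(test+ | healthy)
bayes_factor = sensitivity / (1.0 - specificity)        # = 10.0

prior_odds = prevalence / (1.0 - prevalence)            # 1 : 99
posterior_odds = prior_odds * bayes_factor              # ~10 : 99
posterior_prob = posterior_odds / (1.0 + posterior_odds)

print(f"Bayes factor: {bayes_factor:.1f}")
print(f"P(sick | positive test) ~ {posterior_prob:.3f}")  # ~0.092
```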

9 месяцев, 3 недели назад @ youtube.com
Two Minute Papers
последний пост 3 дня, 9 часов назад
Finally, Beautiful Virtual Scenes…For Less! ☀️

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers ❤️ Their mentioned post is available here: https://wandb.ai/latentspace/published-work/The-Science-of-Debugging-with-W-B-Reports--Vmlldzo4OTI3Ng 📝 The paper "DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks" is available here:

https://depthoraclenerf.github.io/ Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/invite/hbcTJu2 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Ra…

3 дня, 9 часов назад @ youtube.com
This New Method Can Simulate a Vast Ocean! 🌊

❤️ Check out Perceptilabs and sign up for a free demo here: https://www.perceptilabs.com/papers 📝 The paper "Ships, Splashes, and Waves on a Vast Ocean" is available here:

http://computationalsciences.org/publications/huang-2021-vast-ocean.html 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moo…

6 дней, 11 часов назад @ youtube.com
3D Modeling Without An Artist…Possible? 👩‍🎨

❤️ Check out Fully Connected by Weights & Biases: https://wandb.me/papers 📝 The paper "DAG Amendment for Inverse Control of Parametric Shapes" is available here:

https://perso.telecom-paristech.fr/boubek/papers/DAG_Amendment/ Check out Yannic Kilcher's channel here:

https://www.youtube.com/c/YannicKilcher/videos 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fis…

1 неделя, 3 дня назад @ youtube.com
This AI Learned Some Crazy Fighting Moves! 🥊

❤️ Check out Perceptilabs and sign up for a free demo here: https://www.perceptilabs.com/papers 📝 The paper "Neural Animation Layering for Synthesizing Martial Arts Movements" is available here:

https://github.com/sebastianstarke/AI4Animation/blob/master/Media/SIGGRAPH_2021/Paper.pdf

https://github.com/sebastianstarke/AI4Animation 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald…

1 неделя, 6 дней назад @ youtube.com
This AI Stuntman Just Keeps Getting Better! 🏃

❤️ Train a neural network and track your experiments with Weights & Biases here: http://wandb.me/paperintro 📝 The paper "Learning a family of motor skills from a single motion clip" is available here:

http://mrl.snu.ac.kr/research/ProjectParameterizedMotion/ParameterizedMotion.html 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrec…

2 недели, 2 дня назад @ youtube.com
NVIDIA’s New Technique: Beautiful Models For Less! 🌲

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Appearance-Driven Automatic 3D Model Simplification" is available here:

https://research.nvidia.com/publication/2021-04_Appearance-Driven-Automatic-3D 📝 The differentiable material synthesis paper is available here:

https://users.cg.tuwien.ac.at/zsolnai/gfx/photorealistic-material-learning-and-synthesis/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, …

2 недели, 6 дней назад @ youtube.com
The Tale Of The Unscrewable Bolt! 🔩

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers ❤️ Their mentioned post is available here: https://wandb.ai/mathisfederico/wandb_features/reports/Visualizing-Confusion-Matrices-With-W-B--VmlldzoxMzE5ODk 📝 The paper "Intersection-free Rigid Body Dynamics" is available here:

https://ipc-sim.github.io/rigid-ipc/ Scene credits:

- Bolt - YSoft be3D

- Expanding Lock Box - Angus Deveson
- Bike Chain and Sprocket - Okan (bike chain), Hampus Andersson (sprocket) 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian…

3 недели, 3 дня назад @ youtube.com
This AI Makes Digital Copies of Humans! 👤

❤️ Check out the Gradient Dissent podcast by Weights & Biases: http://wandb.me/gd 📝 The paper "The Relightables: Volumetric Performance Capture of Humans with Realistic Relighting" is available here:

https://augmentedperception.github.io/therelightables/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen Ca…

3 недели, 6 дней назад @ youtube.com
Meet Your Virtual Level Designer! 🎮

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers ❤️ Their mentioned post is available here: https://wandb.ai/ayush-thakur/interpretability/reports/Interpretability-in-Deep-Learning-With-W-B-CAM-and-GradCAM--Vmlldzo5MTIyNw 📝 The paper "Adversarial Reinforcement Learning for Procedural Content Generation" is available here:

https://www.ea.com/seed/news/cog2021-adversarial-rl-content-generation 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace …

1 месяц назад @ youtube.com
OpenAI Codex: Just Say What You Want! 🤖

❤️ Check out Perceptilabs and sign up for a free demo here: https://www.perceptilabs.com/papers 📝 The paper "Evaluating Large Language Models Trained on Code" is available here:

https://openai.com/blog/openai-codex/ Codex tweet/application links:

https://twitter.com/CristiVlad25/status/1432017112885833734

https://twitter.com/slava__bobrov/status/1425904829013102602

https://www.youtube.com/watch?v=MvHbrVfEuyk GPT-3 tweet/application links:

Website layout: https://twitter.com/sharifshameem/status/1283322990625607681

Plots: https://twitter.com/aquariusacquah/status/1285415144017797126?s=12

Typesetting math: https://twitter.com/sh_reya/status/1284746918959239168

Population data: https://twitter…

1 месяц назад @ youtube.com
Watch Tesla’s Self-Driving Car Learn In a Simulation! 🚘

❤️ Check out Fully Connected by Weights & Biases: https://wandb.me/papers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Steef, Taras Bobrovytsky, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.

If you wish to appear here…

1 месяц, 1 неделя назад @ youtube.com
This AI Creates Virtual Fingers! 🤝

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "ManipNet: Neural Manipulation Synthesis with a Hand-Object Spatial Representation" is available here:

https://github.com/cghezhang/ManipNet ❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: - https://www.patreon.com/TwoMinutePapers

- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'…

1 месяц, 1 неделя назад @ youtube.com
This AI Helps Making A Music Video! 💃

❤️ Train a neural network and track your experiments with Weights & Biases here: http://wandb.me/paperintro 📝 The paper "Editable Free-Viewpoint Video using a Layered Neural Representation" is available here:

https://jiakai-zhang.github.io/st-nerf/

https://github.com/DarlingHang/st-nerf 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael A…

1 месяц, 1 неделя назад @ youtube.com
This AI Learned Boxing…With Serious Knockout Power! 🥊

❤️ Check out Perceptilabs and sign up for a free demo here: https://www.perceptilabs.com/papers 📝 The paper "Control Strategies for Physically Simulated Characters Performing Two-player Competitive Sports" is available here:

https://research.fb.com/publications/control-strategies-for-physically-simulated-characters-performing-two-player-competitive-sports/ ❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: - https://www.patreon.com/TwoMinutePapers

- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Ange…

1 месяц, 2 недели назад @ youtube.com
This Magical AI Cuts People Out Of Your Videos! ✂️

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers ❤️ Their mentioned report is available here: https://wandb.ai/_scott/omnimatte/reports/Omnimatte-Associating-Objects-and-Their-Effects--Vmlldzo5MDQxNTc 📝 The paper "Omnimatte: Associating Objects and Their Effects in Video" is available here:

https://omnimatte.github.io/ Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/invite/hbcTJu2 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eri…

1 месяц, 3 недели назад @ youtube.com
DataFest Video
последний пост 2 месяца, 2 недели назад
Gene Kogan | Machine learning for creativity

Data Fest Online 2021 https://fest.ai/2021/

ML Art track https://ods.ai/tracks/ml-art-df2021 Speaker introduces himself in Russian, and then presents the material in English.

2 месяца, 2 недели назад @ youtube.com
Alex Farseev: Under the Boot of Google and Facebook and How to Crack it for better Performance

Data Fest Online 2021 https://fest.ai/2021/

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021 Modern digital advertising platforms leverage machine learning and AI to help advertisers achieve their goals. Because the platforms are managed by humans, their technological potential often remains under-utilised, as humans tend to follow stereotypes and rely on "gut feeling" when making decisions. Understanding the principles behind the "Googles and Facebooks of our world" therefore becomes a crucial skill a modern marketer needs to acquire to stay relevant. In this talk, we will shed light on the complex digital advertising ecosystem and show you techniques, such a…

4 месяца назад @ youtube.com
Artem Koval: Cloud-Native MLOps Framework

Data Fest Online 2021 https://fest.ai/2021/

ML REPA track https://ods.ai/tracks/ml-repa-df2021 Presentation: https://yadi.sk/i/a25573AB8IZUyw In this video we will analyse the requirements for modern MLOps and the main trends: Human-Centered AI, Fairness, Explainability, Model Monitoring, Human Augmented AI

4 месяца, 1 неделя назад @ youtube.com
Data Fest Online 2021: IGLU Competition @ NeurIPS 2021

Data Fest Online 2021 https://fest.ai/2021/

RL + Catalyst track https://ods.ai/tracks/catalyst-and-rl-df2021

4 месяца, 2 недели назад @ youtube.com
Prince Canuma: Catalyst integration with Neptune

Data Fest Online 2021 https://fest.ai/2021/

RL + Catalyst track https://ods.ai/tracks/catalyst-and-rl-df2021

4 месяца, 2 недели назад @ youtube.com
Catalyst integration with Wandb

Data Fest Online 2021 https://fest.ai/2021/

RL + Catalyst track https://ods.ai/tracks/catalyst-and-rl-df2021

4 месяца, 2 недели назад @ youtube.com
Bag of tricks for image classification — Artur Kuzin

ML Training 2019. Artur Kuzin talks about his participation in the Driven Data Hakuna Ma-data: Identify Wildlife on the Serengeti with AI for Earth competition, where he took second place. In this video, you will find out: - Overview of a training procedure on Imagenet1k from scratch

- Implementation Details of Hacks & Tricks

- The specifics of working with JPEG images and resizing in different frameworks Presentation - https://gh.mltrainings.ru/presentations/Kuzin_DrivenDataHakuna.pdf

8 месяцев назад @ youtube.com
Segmentation without pain — Yury Bolkonsky, Andrei Dukhounik

ML Training 2019. Yury Bolkonsky and Andrei Dukhounik talk about their participation in Kaggle Understanding Clouds from Satellite Images, where their team won a silver medal. In this video you will find out:

- Thresholding is evil: believe in your classification models

- Why you should always use modern best practices

- Why it is not recommended to use postprocessing without local validation Presentation - https://gh.mltrainings.ru/presentations/Bolkonsky_KaggleUnderstandingClouds.pdf

8 месяцев назад @ youtube.com
Use leaks for validation Kaggle ASHRAE Great Energy Predictor III — Yury Bolkonsky

ML Training 2019. Yury Bolkonsky talks about his participation in Kaggle ASHRAE - Great Energy Predictor III, where his team won a gold medal. In this video you will find out:

- How to create timestamp features

- Do you need to use a leak if it is noisy?

- Leak validation for the best solution
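On the first bullet above (timestamp features), a minimal pandas sketch of the kind of calendar features usually meant here; the column names and data are hypothetical, not from the talk:

```python
import pandas as pd

# Hypothetical hourly meter readings with a raw 'timestamp' column.
df = pd.DataFrame({
    "timestamp": pd.date_range("2019-01-01", periods=96, freq="H"),
    "meter_reading": range(96),
})

ts = pd.to_datetime(df["timestamp"])
df["hour"] = ts.dt.hour                                  # hour of day
df["dayofweek"] = ts.dt.dayofweek                        # 0 = Monday
df["month"] = ts.dt.month
df["is_weekend"] = (ts.dt.dayofweek >= 5).astype(int)    # simple weekend flag
```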

8 месяцев, 1 неделя назад @ youtube.com
Time series met AutoML Codalab Automated Time Series Regression — Denis Vorotyntsev

ML Training 2019. Denis Vorotyntsev won AutoSeries, an AutoML competition on time-series regression. In his presentation he talks about the competition organization, his final solution, and the solutions of other top-placed participants. In this video, you will find out:

- How AutoML competitions differ from the most common Kaggle-like ones and why you should try them

- A feature engineering approach for time-series tasks when you have no idea about the domain

- Why validation split should emulate train-test split

- Why you should always check the code of top participants and how small bugs might drop your score Presentation - https://gh.mltrainings.ru/presentations/Vorotyntsev_CodalabAutoML.pdf

8 месяцев, 1 неделя назад @ youtube.com
DL for 6D Pose Estimation for Self Driving Cars — Adel Valiullin

ML Training 2019. Adel Valiullin talks about his participation in the Kaggle Peking University/Baidu - Autonomous Driving competition, where he won a silver medal. In this video, you will find out: - Overview of the autonomous vehicles problem

- Dataset description and exploration: images with 6D pose information, taken from the roof of a car, 3D models of cars and input data analysis
- Problems with mAP metric and dataset in this challenge

- The implementation of CenterNet Neural Network for 6D car pose estimation

- Score boosters and other higher-scoring approaches

8 месяцев, 2 недели назад @ youtube.com
2 Competitions 1 Unet SpaceNet 5 Challenge & The 3rd Tellus Satellite Challenge — Ilya Kibardin

ML Training 2019. Ilya Kibardin talks about his participation in two competitions: the Topcoder SpaceNet 5 Challenge and the Signate 3rd Tellus Satellite Challenge, where he took fourth and second place respectively. In this video you will find out:

- Spacenet 5 challenge at Topcoder, dataset and metric description

- Overview of a UNet pipeline for road graph extraction from satellite images

- The same pipeline applied to ice segmentation at Signate

- Hacks & Tricks for better performance Presentation - https://gh.mltrainings.ru/presentations/Kibardin_Spacenet5Tellus_v2.pdf

8 месяцев, 2 недели назад @ youtube.com
Семинары JetBrains Research
последний пост 14 часов назад
Distilling Knowledge from Reader to Retriever for Question Answering

Information retrieval is an important component of many natural language processing systems (for example, question answering and fact checking). Traditional retrieval methods relied on simple features (TF-IDF, BM25). Recently, however, features produced by neural networks have been delivering competitive results. The difficulty with the neural approach is the need for training data in the form of pairs of a query and a set of supporting documents. The seminar will walk through the paper "Distilling Knowledge from Reader to Retriever for Question Answering", which proposes training the retriever model via distillation. Such a method does not…
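A minimal sketch of the distillation objective, as I understand the paper: the retriever's passage scores are pushed, via a KL divergence, toward the distribution induced by the reader's aggregated cross-attention over the retrieved passages. Tensor names are illustrative, not the paper's code.

```python
import torch
import torch.nn.functional as F

def reader_to_retriever_loss(retriever_scores, reader_attention):
    """retriever_scores: (batch, n_docs) raw query-passage similarity scores.
    reader_attention:  (batch, n_docs) aggregated reader cross-attention per passage."""
    log_p_retriever = F.log_softmax(retriever_scores, dim=-1)
    p_reader = F.softmax(reader_attention, dim=-1).detach()  # teacher signal, no gradient
    return F.kl_div(log_p_retriever, p_reader, reduction="batchmean")

# Toy usage
scores = torch.randn(2, 5, requires_grad=True)
attention = torch.rand(2, 5)
loss = reader_to_retriever_loss(scores, attention)
loss.backward()
```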

14 часов назад @ youtube.com
Code quality for online learning platforms

The number of online programming courses is growing rapidly. More and more students are learning to write code, but the results of their work often do not go through review by more experienced colleagues, which hurts the skills the students end up with. In this seminar we will review and compare existing approaches to automatic code-quality assessment for educational platforms, and then go through the key points of Hyperstyle, the tool we developed to address this task. The tool has already been deployed on the Stepik and Hyperskill platforms and processes about 1 million student submissions every week. Speaker: Anastasia Biril…

4 дня, 12 часов назад @ youtube.com
Biology meets graph machine learning

In recent years graph machine learning (GraphML) has been developing rapidly. GraphML is used especially actively in biology, since the various interactions between drugs, proteins, diseases and so on are naturally represented as graphs. In this seminar we will look at biological problems that use machine learning on graphs, and discuss in more detail tasks such as predicting polypharmacy side effects of drugs and predicting drug-protein interactions. Speaker: Elena Kartysheva.

5 дней, 15 часов назад @ youtube.com
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery

The well-known StyleGAN architecture can generate realistic images in various domains. Much of the recent work has focused on understanding how to use StyleGAN's latent spaces to modify generated and real images. In this seminar we will discuss the approach proposed by the paper's authors, which uses the CLIP (Contrastive Language-Image Pre-training) architecture. The authors managed to combine StyleGAN and CLIP to interactively edit an image based on text given to the model as input. The approach does not require labeled paired datasets for each kind of manipulation, and it also allows, without manual parameter tuning, modifying the source ima…
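A rough sketch of the latent-optimization flavor of this idea: starting from a source latent code, gradient-descend on a CLIP-based distance to the target text plus a penalty for drifting too far from the source (the paper additionally uses an identity loss). Here `generator` and `clip_distance` are hypothetical stand-ins, not real APIs.

```python
import torch

def text_driven_edit(generator, clip_distance, w_source, text,
                     steps=200, lr=0.1, lambda_l2=0.008):
    """generator(w) -> image tensor; clip_distance(image, text) -> scalar.
    Both callables are assumed to exist and be differentiable."""
    w = w_source.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        image = generator(w)
        loss = clip_distance(image, text) + lambda_l2 * (w - w_source).pow(2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()
```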

1 неделя, 2 дня назад @ youtube.com
Automated Comment Update: How Far are We?

Code comments are key to understanding programs, yet developers often forget to update them when the code changes, breaking the consistency between code and comments. Recently the CUP approach was proposed for automatic comment updating: a promising sequence-to-sequence deep learning model that beats alternative solutions on a number of metrics. The authors of the paper "Automated Comment Update: How Far are We?" conduct an in-depth analysis of CUP's effectiveness, studying in which cases the approach works well and in which it does not. Then, based on these observations, the authors implement a simple heuristic algorithm for generating updated comments …

1 неделя, 4 дня назад @ youtube.com
Rethinking Attention with Performers / A Generalization of Transformer Networks to Graphs

00:00:00 Rethinking Attention with Performers

00:45:37 A Generalization of Transformer Networks to Graphs "Rethinking Attention with Performers": Transformers are currently the SoTA models for sequence modeling tasks. At their core is the attention mechanism, which describes pairwise interactions between the inputs at every time step. The price for the strong results that attention makes possible is its poor (quadratic) scalability in the input sequence length. In this seminar we will look at one method (see the toy sketch below) that makes computing the attention matrix more efficient (linear in the sequence len…
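As a companion to the first talk, a toy NumPy sketch of kernelized linear attention: replacing exp(q·k) with a dot product of feature maps lets you avoid ever forming the N x N attention matrix. The simple positive feature map below is a stand-in; the actual Performer uses random-feature (FAVOR+) approximations of the softmax kernel.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a simple positive feature map (not Performer's FAVOR+).
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """O(N) attention for (N, d) inputs."""
    phi_q, phi_k = feature_map(Q), feature_map(K)   # (N, d)
    kv = phi_k.T @ V                                # (d, d_v), independent of N
    normalizer = phi_q @ phi_k.sum(axis=0)          # (N,)
    return (phi_q @ kv) / normalizer[:, None]

N, d = 512, 64
Q, K, V = (np.random.randn(N, d) for _ in range(3))
out = linear_attention(Q, K, V)                     # (N, d)
```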

1 неделя, 5 дней назад @ youtube.com
Large-scale Analysis of Kotlin Code

Kotlin is becoming an ever more popular programming language, and as more and more developers use it, the amount of code written in it grows rapidly. This has two important consequences. First, the code inevitably becomes more diverse: developers use the language in different ways and for different purposes (perhaps not the ways its creators intended), so the language itself and its standard library need to adapt and evolve. Second, fortunately, more code gives us more room for analysis and research and lets us draw conclusions and generalizations. In this seminar we will look at the implement…

2 недели, 1 день назад @ youtube.com
Language-Agnostic Representation Learning of Source Code from Structure and Context

Code vectorization is used in many code analysis tasks: in summarization, documentation is generated from the vector representation of a code fragment; in clone detection, duplicates are found by vector proximity; and so on. So the development of vectorization models (encoders) is natural, and the number of papers on the topic grows every year. Authors often try to train models for several downstream tasks at once or for several programming languages; prominent examples of such models are CodeBERT and GREAT. However, the wish to support several programming languages and have one model for several tasks scales very poorly in existing re…

2 недели, 4 дня назад @ youtube.com
Import2vec: Learning Embeddings for Software Libraries

Libraries play an enormous role in software development. One of the most important programming skills today is not only the ability to design complex algorithms but also knowing where those algorithms are already implemented. Yet as the number of libraries grows it becomes harder and harder to know them, and user-curated lists quickly become outdated. This makes efficient automated work with libraries, comparing them and suggesting the best options for a specific task, an important problem. This work studies the problem from the angle of building library embeddings for further analysis (see the toy skip-gram sketch below). Such embeddings let us find…
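A toy sketch in the spirit of the idea (the paper's own model differs in detail): treat each project's list of imported libraries as a "sentence" and learn embeddings with a skip-gram model. Gensim's Word2Vec is used purely for illustration, and the data is made up.

```python
from gensim.models import Word2Vec

# Each "sentence" is the set of libraries imported by one project or file.
import_lists = [
    ["numpy", "pandas", "matplotlib"],
    ["numpy", "scipy", "sklearn"],
    ["torch", "torchvision", "numpy"],
    ["flask", "sqlalchemy", "requests"],
]

model = Word2Vec(sentences=import_lists, vector_size=32, window=5,
                 min_count=1, sg=1, epochs=50)

print(model.wv.most_similar("numpy", topn=3))  # libraries that co-occur with numpy
```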

2 недели, 5 дней назад @ youtube.com
Diversity oriented Deep Reinforcement Learning for targeted molecule generation

An important step in drug development is generating a set of candidate molecules. This is a hard problem because the space of possible molecules is enormous. Machine learning is being brought into the screening process because it is much faster than standard computational approaches in chemistry. The paper proposes an end-to-end RL framework using an RNN, a string representation of molecules (SMILES), and the REINFORCE algorithm (a minimal REINFORCE sketch is given after the list). In this seminar we will discuss:

- the pipeline for generating molecules with desired properties

- how the REINFORCE algorithm is applied to fine-tune recurrent neural networks

- how the authors propose to handle the exploration / exploitation trade-off

- which metrics are used for eval…
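A minimal REINFORCE sketch for this setting (illustrative, not the paper's code): the log-probabilities of the sampled SMILES tokens are weighted by the reward of the resulting molecule, e.g. a property predictor's score.

```python
import torch

def reinforce_loss(token_log_probs, reward, baseline=0.0):
    """token_log_probs: (T,) log pi(a_t | s_t) for the sampled SMILES string,
    taken from the RNN's forward pass so gradients flow through them.
    reward: scalar score of the generated molecule."""
    # Maximize E[(R - b) * sum_t log pi(a_t | s_t)]  ==  minimize its negative.
    return -(reward - baseline) * token_log_probs.sum()

# Toy usage with made-up numbers (4 generation steps, vocabulary of 30 tokens).
logits = torch.randn(4, 30, requires_grad=True)
actions = torch.randint(0, 30, (4,))
log_probs = torch.log_softmax(logits, dim=-1)[torch.arange(4), actions]
loss = reinforce_loss(log_probs, reward=0.8, baseline=0.5)
loss.backward()
```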

3 недели, 3 дня назад @ youtube.com
Re-splitting Jupyter notebook cells

Computational notebooks became popular relatively recently, but the data already collected makes it clear that code written in notebooks differs both in structure and in quality. We hypothesize that one of the key reasons for this is the ability to split a notebook into cells. Our work aims to help the user find better ways to split code into cells and make the notebook more readable.

In this seminar we will look in detail at the algorithm we developed for re-splitting cells: why it is needed, how it works, and how we evaluate the quality of the resulting cells. The algorithm is based on heuristics, so much of the attention will go to measuring code cohesion in the notebook and other metrics…

3 недели, 4 дня назад @ youtube.com
Collecting a dataset of bug fixing commits

The idea of applying machine learning to detect and fix bugs in code excites researchers all over the world. An important step toward this goal is obtaining a large, representative and noise-free sample of bug-fix commits. At the moment the vast majority of such datasets are collected either via bug trackers or by filtering commits by keywords; however, the precision and recall of such approaches leave much to be desired. We decided to study how applicable machine learning methods are to mining bug-fix commits. In this seminar we will review existing research on commit classification, paying special attention to the problems of the previously used d…

4 недели назад @ youtube.com
Aggregation model for stack trace grouping / Deep assignee prediction for bug stacktraces

"Aggregation model for stack trace grouping" Разработка программного обеспечения – сложный итеративный процесс. Программы поддерживаются, реорганизуются, исправляются и обновляются на постоянной основе. При этом очень часто существуют ошибки, которые не обнаружили на этапе тестирования. В большинстве современных программ при возникновении ошибки автоматически формируется отчет об ошибке, содержащий в себе стек вызовов, информацию о системе, состояние окружения, версию аварийного приложения, установленные плагины, библиотеки и их версии.

Отчеты об ошибках приходят постоянно, и их становится очень много. Одна из проблем, с которой сталкиваются разработчики, — одинаковые ошибки могут дать немн…

1 месяц назад @ youtube.com
Reassessing Automatic Evaluation Metrics for Code Summarization Tasks

The paper (https://sarahfakhoury.com/2021-FSE-Summarization-Metrics.pdf) compares the effectiveness of metrics for automatically evaluating models that summarize code into documentation. Besides the standard BLEU, METEOR and ROUGE, the authors also consider BERTScore (a metric based on pretrained BERT embeddings) and chrF (a metric that computes an F-score over character n-grams rather than tokens). The authors studied how large the difference in corpus-level model scores has to be for human assessors' judgment about the models' relative quality to agree with the metrics. They also studied how much the metrics' scores can be trusted on individual examples. In this seminar we …
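For context, a tiny example of the kind of sentence-level score the paper warns against over-interpreting (NLTK's smoothed sentence BLEU; the reference/candidate pair is made up, and the paper's comparisons are at the corpus level):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "returns the maximum value in the given list".split()
candidate = "return the max value of a list".split()

smooth = SmoothingFunction().method1
score = sentence_bleu([reference], candidate, smoothing_function=smooth)
print(f"sentence-level BLEU: {score:.3f}")
```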

1 месяц назад @ youtube.com
Stack trace Similarity

Software products often contain bugs that were not caught during testing and therefore reached users. When such an error occurs, an automatically generated report, consisting mostly of a call stack, is sent to the developers. A great many of these reports come in, and to cope with analyzing them they are often grouped by the root cause of the error. To group them, one has to understand which call stacks are similar to each other and belong to the same error and which are not (a naive similarity baseline is sketched below). The seminar will cover the existing approaches to this problem, as well as the models we propose and their possible improvements. At the end, other related tasks will be described, connected wi…
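A naive baseline for the similarity step, just to make the setting concrete: treat each stack trace as a set of frames and compare them with Jaccard similarity. Real approaches weight frames by position, use TF-IDF over frames, or learn the similarity function; the frame names here are invented.

```python
def jaccard_similarity(trace_a, trace_b):
    """Similarity between two stack traces represented as lists of frames."""
    frames_a, frames_b = set(trace_a), set(trace_b)
    if not frames_a and not frames_b:
        return 1.0
    return len(frames_a & frames_b) / len(frames_a | frames_b)

trace_1 = ["java.util.ArrayList.get", "com.app.Cache.load", "com.app.Main.run"]
trace_2 = ["java.util.ArrayList.get", "com.app.Cache.load", "com.app.Worker.run"]
print(jaccard_similarity(trace_1, trace_2))  # 0.5 -> likely the same root cause
```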

1 месяц назад @ youtube.com
Яндекс. Компьютерные науки
последний пост 6 месяцев назад
ШАД written exam walkthrough. Problem 9. Index of the nearest greater element

This year we decided to help those preparing to apply to the School of Data Analysis (ШАД) and to share solutions of several problems from the written exam variants that demonstrate useful techniques. Every week we will publish here a walkthrough of one of the problems from the 2019 ШАД written exam. Problem statements and written solutions can be found on the website: https://yandexdataschool.ru/stepbystep

6 месяцев назад @ youtube.com
ШАД written exam walkthrough. Problem 8. Edges in a graph

This year we decided to help those preparing to apply to the School of Data Analysis (ШАД) and to share solutions of several problems from the written exam variants that demonstrate useful techniques. Every week we will publish here a walkthrough of one of the problems from the 2019 ШАД written exam. Problem statements and written solutions can be found on the website: https://yandexdataschool.ru/stepbystep

6 месяцев, 2 недели назад @ youtube.com
ШАД written exam walkthrough. Problem 7. An inequality for the derivative

This year we decided to help those preparing to apply to the School of Data Analysis (ШАД) and to share solutions of several problems from the written exam variants that demonstrate useful techniques. Every week we will publish here a walkthrough of one of the problems from the 2019 ШАД written exam. Problem statements and written solutions can be found on the website: https://yandexdataschool.ru/stepbystep

6 месяцев, 3 недели назад @ youtube.com
ШАД written exam walkthrough. Problem 6. Dimensions

This year we decided to help those preparing to apply to the School of Data Analysis (ШАД) and to share solutions of several problems from the written exam variants that demonstrate useful techniques. Every week we will publish here a walkthrough of one of the problems from the 2019 ШАД written exam. Problem statements and written solutions can be found on the website: https://yandexdataschool.ru/stepbystep

7 месяцев назад @ youtube.com
ШАД written exam walkthrough. Problem 5. A limit and probabilities

This year we decided to help those preparing to apply to the School of Data Analysis (ШАД) and to share solutions of several problems from the written exam variants that demonstrate useful techniques. Every week we will publish here a walkthrough of one of the problems from the 2019 ШАД written exam. Problem statements and written solutions can be found on the website: https://yandexdataschool.ru/stepbystep

7 месяцев, 1 неделя назад @ youtube.com
ШАД written exam walkthrough. Problem 4. Geometric probability

This year we decided to help those preparing to apply to the School of Data Analysis (ШАД) and to share solutions of several problems from the written exam variants that demonstrate useful techniques. Every week we will publish here a walkthrough of one of the problems from the 2019 ШАД written exam. Problem statements and written solutions can be found on the website: https://yandexdataschool.ru/stepbystep

7 месяцев, 2 недели назад @ youtube.com
ШАД written exam walkthrough. Problem 3. Expected number of balls

This year we decided to help those preparing to apply to the School of Data Analysis (ШАД) and to share solutions of several problems from the written exam variants that demonstrate useful techniques. Every week we will publish here a walkthrough of one of the problems from the 2019 ШАД written exam. Problem statements and written solutions can be found on the website: https://yandexdataschool.ru/stepbystep

7 месяцев, 3 недели назад @ youtube.com
ШАД written exam walkthrough. Problem 2. Projection matrix

This year we decided to help those preparing to apply to the School of Data Analysis (ШАД) and to share solutions of several problems from the written exam variants that demonstrate useful techniques. Every week we will publish here a walkthrough of one of the problems from the 2019 ШАД written exam. Problem statements and written solutions can be found on the website: https://yandexdataschool.ru/stepbystep

8 месяцев назад @ youtube.com
ШАД written exam walkthrough. Problem 1. Limit of a ratio

This year we decided to help those preparing to apply to the School of Data Analysis (ШАД) and to share solutions of several problems from the written exam variants that demonstrate useful techniques. Every week we will publish here a walkthrough of one of the problems from the 2019 ШАД written exam. Problem statements and written solutions can be found on the website: https://yandexdataschool.ru/stepbystep

8 месяцев, 1 неделя назад @ youtube.com
ML Trainings
последний пост 3 дня, 10 часов назад
Evgeny Klyuchnikov | How to run scalable A/B experiments

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Evgeny Klyuchnikov, Data Engineer at GetYourGuide

We actively use A/B testing to evaluate new products and ideas; on some days we have up to 70 experiments from different teams running at the same time. In this talk I will explain how we organized such a system technically, where the secret of its scalability lies, what breaks most often, and why not everyone is always happy with it. We will touch on documentation, training and incorrect experiment design, and I will tell you how to break an experiment so that nobody even notices. Covered in the talk:

00:00 start of the video

03:07 what an A/B experiment is

06:07 what we…

3 дня, 10 часов назад @ youtube.com
Denis Solovyov | Building marketing analytics on Google Cloud

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Denis Solovyov, Data Analyst at Ciklum. We will look at the advantages of building marketing analytics with cloud technologies, talk about what to pay attention to before building a solution, and go over examples of analytical architectures for marketing tasks. We will walk through organizing a DWH on top of Google BigQuery, and touch on BigQuery ML and which tasks it can solve. Data Fest social media:

https://t.me/datafest

https://vk.com/datafest

1 неделя, 2 дня назад @ youtube.com
Egor Borisov | Data Science salaries and jobs: insights and trends from ODS chat data

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Egor Borisov, Lead DS at DataWise. More from the author:

- article on Habr (https://habr.com/ru/company/ods/blog/572264/)

- GitHub link (https://github.com/egorborisov/jobs_article) Based on the Habr article: an overview of the Data Science market using data from the ODS chats. Is demand for specialists still growing or is the market already saturated, which technologies are losing popularity and which are gaining it, how large are the salary ranges and what do they depend on? The authors of the study share their approach, results and thoughts. Useful for anyone who wonders which jobs are in demand today and where and how to grow. Social…

1 неделя, 5 дней назад @ youtube.com
Aidyn Ubingazhibov | Kaggle SIIM-FISABIO-RSNA COVID-19 Detection

Presentation: https://docs.google.com/presentation/d/1XmM9-WHl6XhIx3VQgvm-qoshCXtl9L4I1A-ykK6R9dg/edit#slide=id.ge9979d1b50_5_0 Join the community: https://ods.ai/ ML Trainings social media

VK: https://vk.com/mltrainings

Telegram: https://t.me/mltrainings 0:00 - Intro

0:25 - Problem statement and metric

1:21 - Validation

1:42 - Augmentations

1:57 - Classification pipeline

2:50 - How the classification was trained

5:20 - Detection

7:51 - What did not work

8:58 - Top-1 and top-2 solutions

10:09 - Outro

2 недели, 5 дней назад @ youtube.com
Vladislav Kramarenko | Kaggle CommonLit Readability Prize

A walkthrough of a solution for estimating the complexity and readability of texts. 0:00 Intro

0:20 Problem statement, metric

1:27 What data was there?

2:00 Traditional approaches

2:37 Features

3:42 Solution with Spacy

4:58 Solution with Transformers

11:33 Final ensemble

11:50 An alternative loss

12:33 Wrap-up Presentation: https://disk.yandex.ru/i/SJeX2so0bbYZ-A Join the community: https://ods.ai/ ML Trainings social media

VK: https://vk.com/mltrainings

Telegram: https://t.me/mltrainings

2 недели, 5 дней назад @ youtube.com
Yury Bolkonsky | Kaggle SETI Breakthrough Listen - E.T. Signal Search

A walkthrough of a solution for detecting anomalous optical and electromagnetic signals in the dataset. 0:00 Intro

1:20 Problem statement

2:00 Target metric

2:48 Leaderboard analysis

3:35 Leaks

4:14 The solution. What worked?

7:46 What did not work?

9:12 Differences from the top-1 solution

13:28 Tricks in other solutions

17:57 Wrap-up Presentation: https://disk.yandex.ru/i/jTwGGLGmIAY7bg Join the community: https://ods.ai/ ML Trainings social media

VK: https://vk.com/mltrainings

Telegram: https://t.me/mltrainings

2 недели, 5 дней назад @ youtube.com
Inessa Tregubova | Spatial analytics in geomarketing

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Inessa Tregubova, Geodata analyst at Яндекс Лавка. In this talk I will explain how and why geodata is used in marketing. We will discuss the specifics of collecting and processing spatial data, as well as several ML models that help judge how well a given location suits the goals of a business. We will show which business problems can be solved. 00:00 start of the video

01:09 geomarketing problems

03:18 data types

04:05 similarity measures between objects

07:31 systems for aggregating data in space

10:14 data representation projections (crs)

11:11 dataset formats

12:57 …

3 недели, 4 дня назад @ youtube.com
Danil Gizdatullin | Recommending similar content

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Danil Gizdatullin, Data Scientist. Recommender systems have the notions of a user and an item. The typical recommendation task is recommending items to users, but another important task is recommending items for items. In this talk we will look at several ways to solve it, from collaborative filtering to a ranking model. Talk presentation: https://drive.google.com/file/d/1F0mmU0W4mbOPKRpfYr0JaP6I82L7PS8z/view?usp=sharing Data Fest social media:

https://t.me/datafest

https://vk.com/datafest

1 месяц назад @ youtube.com
Andrey Anufriev | Single-voice speech synthesis with neural networks

ODS Summer of Code 2021 | Intel & SberCloud track https://ods.ai/tracks/cloudcity2021 Speaker: Andrey Anufriev, Senior Research Engineer in deep learning, Intel. Join the community: https://ods.ai/events/datafest2021/join Data Fest & ODS Summer of Code social media:

https://t.me/datafest

https://vk.com/datafest

1 месяц, 1 неделя назад @ youtube.com
ODS Course Fest #1

ODS Course Fest #1 https://datafest.ru/course/ Register and get access to all courses and events of the season: https://ods.ai/events/course_fest_1 Join the community: https://ods.ai/ Data Fest & Course Fest social media: https://t.me/datafest

https://vk.com/datafest

1 месяц, 1 неделя назад @ youtube.com
Anton Kiselev | Dynamic pricing in fuel retail

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Anton Kiselev, Marketing Analytics Producer at Playrix

* 10 years in data (retail, games, telecom)

* 15 DS projects in production

* hired 40 DS specialists

* built 3 DS departments from scratch. Anton will talk about how motor fuel pricing was optimized and automated at a Russian fuel company: how they learned to change prices on the fly, how Markov models, RL and multi-armed bandits helped bring argmax into production, and what came out of it all. 00:00 introduction

01:15 dynamic pricing in business

04:00 goals and use cases

05:25 pricing in fuel ret…

1 месяц, 2 недели назад @ youtube.com
Artyom Agafonov | Machine learning in geo-analytics for business

ML in Marketing track https://ods.ai/tracks/ml-in-marketing-df2021

Telegram channel https://t.me/mlinmarketing Speaker: Artem Agafonov, Data Scientist at Mail Ru Group. Choosing a good geographic location is one of the important strategic problems for a business, whether it is a construction company or a retail company (stores, restaurants, banks). The popularity of a location, its accessibility by foot or transport, and the presence of competitors at a geo-point determine key characteristics of business success. The talk covers examples of how big data helps automate the selection of a geographic location for a new site. Talk presentation: https://drive.google…

1 месяц, 3 недели назад @ youtube.com
Daniil Chesakov | One-shot FaceSwap

ODS Summer of Code 2021 | Intel & SberCloud track https://ods.ai/tracks/cloudcity2021 On Friday 27.08.2021 at 17:00 in our ODS space live.ods.ai (password odssummerofcodeison) there will be a meetup where Daniil Chesakov, Data Scientist at SberAI, will talk about modern face-transfer methods and examine in detail how a face from a single source photo is transferred onto a target photo or video. Daniil will focus on two FaceShifter models, one of which performs the transfer itself while the second adds back details that were originally present in the target. He will also talk about how they managed to improve the transfer quality and build their own FaceSwap service. Register for ODS Summer of Code and get …

1 месяц, 3 недели назад @ youtube.com
Vitaly Pozdnyakov | Community detection in social networks

Data Fest Online 2021

ML in Marketing track https://ods.ai/tracks/ml-in-marketing...

Telegram channel https://t.me/mlinmarketing Speaker: Vitaly Pozdnyakov, Lecturer and researcher at HSE, MADE. We will figure out what a community in a social network is and how to define it, and look at modern research approaches to community detection in social networks: methods based on random walks, spectral clustering and graph neural networks. We will discuss Python libraries that can be used for community detection and look at examples of visualizing the communities found. Additional materials on graphs: course GitHub link http://github.com/netspractice/network-science

the description there also contains lin…

1 месяц, 3 недели назад @ youtube.com
Evgeny Izutov | Sign language recognition

ODS Summer of Code 2021 | Intel & SberCloud track https://ods.ai/tracks/cloudcity2021 Speaker: Evgeny Izutov, Deep Learning R&D Engineer, Intel. Lately we have been seeing more and more examples of deep learning methods successfully solving practical problems. A long way has been traveled from the basic classification task that made deep learning's name to more useful solutions such as driver assistance systems or automatic translation of foreign speech. The latter example involves a broader definition of language: the use of gestures and facial expressions instead of traditional sounds. Sign languages remained unnoticed for a long time, and machine translation did not cover them. This wa…

1 месяц, 4 недели назад @ youtube.com
Primer
последний пост 1 месяц, 2 недели назад
Simulating the Evolution of Sacrificing for Family

Try NordVPN free for 30 days at https://nordvpn.com/Primer More than you ever wanted to know about Hamilton's rule:

https://users.ox.ac.uk/~grafen/cv/oseb.pdf For discussion and updates

- Twitter: @primerlearning

- Discord: https://discord.gg/NbruaNW

- Reddit: r/primerlearning Plush blobs and other merch: https://store.dftba.com/collections/primer

Support these videos on Patreon: https://www.patreon.com/primerlearning Made with Unity and Manim

https://github.com/Helpsypoo/PrimerUnity

https://www.manim.community Music by Mathieu Keith. For business inquiries: mathieu.keith@gmail.com Several other inputs into the graphics are from public domain contributions to blendswap.com Made possible by …

1 месяц, 2 недели назад @ youtube.com
Simulating Green Beard Altruism

Brilliant: http://www.brilliant.org/primer Papers:

- https://www.researchgate.net/publication/41910312_Altruism_Spite_and_Greenbeards

- https://www.reed.edu/biology/professors/srenn/pages/teaching/2007_syllabus/2007_readings/a13_Keller_1998.pdf For discussion and updates

- Discord: https://discord.gg/NbruaNW

- Reddit: r/primerlearning

- Twitter: @primerlearning Sometimes streaming myself working on these monstrosities:

- Twitch: https://www.twitch.tv/primerjustin Made with Unity

https://github.com/Helpsypoo/PrimerUnity Music by Mathieu Keith. For business inquiries: mathieu.keith@gmail.com Several other inputs into the graphics are from public domain contributions to blendswap.com Plush blo…

6 месяцев, 3 недели назад @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
последний пост 1 день назад
#230 – Kelsi Sheren: War, Artillery, PTSD, and Love

Kelsi Sheren is a veteran, artillery gunner, and founder of Brass and Unity.

Please support this podcast by checking out our sponsors:
– Shopify: https://shopify.com/lex to get 14-day free trial
– Justworks: https://justworks.com
– Novo: https://banknovo.com/lex
– Indeed: https://indeed.com/lex to get $75 credit
– Onnit: https://lexfridman.com/onnit to get up to 10% off
EPISODE LINKS:Kelsi’s Twitter: https://twitter.com/kelsiburnKelsi’s Website: https://brassandunity.comKelsi’s Instagram: https://www.instagram.com/kelsie_sherenCharity: https://www.heroicheartsproject.org/Do the F*cking Work (book): https://amzn.to/3mGhguFNotes from Underground (book): https://amzn.to/3mN5TRFBrothers in Arms (son…

1 день назад @ lexfridman.com
#229 – Richard Wrangham: Role of Violence, Sex, and Fire in Human Evolution

Richard Wrangham is a biological anthropologist at Harvard, specializing in the study of primates and the evolution of violence, sex, cooking, culture, and other aspects of ape and human behavior.

Please support this podcast by checking out our sponsors:– ROKA: https://roka.com/ and use code LEX to get 20% off your first order– Theragun: https://therabody.com/lex to get 30 day trial– ExpressVPN: https://expressvpn.com/lexpod and use code LexPod to get 3 months free– NI: https://www.ni.com/perspectives– Grammarly: https://grammarly.com/lex to get 20% off premiumEPISODE LINKS:Richard’s Website: https://heb.fas.harvard.edu/people/richard-w-wranghamThe Goodness Paradox (book): https://amzn.to/3…

5 дней, 5 часов назад @ lexfridman.com
#228 – RZA: Wu-Tang Clan, Kung Fu, Chess, God, Life, and Death
#228 – RZA: Wu-Tang Clan, Kung Fu, Chess, God, Life, and Death #228 – RZA: Wu-Tang Clan, Kung Fu, Chess, God, Life, and Death

RZA is a rapper, record producer, filmmaker, actor, writer, philosopher, and the mastermind of the legendary hip hop group Wu-Tang Clan.

Please support this podcast by checking out our sponsors:– ROKA: https://roka.com/ and use code LEX to get 20% off your first order– Athletic Greens: https://athleticgreens.com/lex and use code LEX to get 1 month of fish oil– SimpliSafe: https://simplisafe.com/lex and use code LEX to get a free security camera– Eight Sleep: https://www.eightsleep.com/lex and use code LEX to get special savings– BetterHelp: https://betterhelp.com/lex to get 10% offEPISODE LINKS:RZA’s Twitter: https://twitter.com/RZARZA’s Instagram: https://www.instagram.com/rzaWu-Tang Clan …

1 неделя, 3 дня назад @ lexfridman.com
#227 – Sean Kelly: Existentialism, Nihilism, and the Search for Meaning
#227 – Sean Kelly: Existentialism, Nihilism, and the Search for Meaning #227 – Sean Kelly: Existentialism, Nihilism, and the Search for Meaning

Sean Kelly is a philosopher at Harvard specializing in existentialism and the philosophy of mind.

Please support this podcast by checking out our sponsors:– Coinbase: https://coinbase.com/lex to get $5 in free Bitcoin– InsideTracker: https://insidetracker.com/lex and use code Lex25 to get 25% off– NetSuite: http://netsuite.com/lex to get free product tour– Ladder: https://ladderlife.com/lex– Sunbasket: https://sunbasket.com/lex and use code LEX to get $35 offEPISODE LINKS:Sean’s Twitter: https://twitter.com/sean_d_kellySean’s Website: https://scholar.harvard.edu/sdkellySean’s Wiki: https://en.wikipedia.org/wiki/Sean_Dorrance_KellyPODCAST INFO:Podcast website: https://lexfridman.com/podcastA…

2 недели, 1 день назад @ lexfridman.com
#226 – Jo Boaler: How to Learn Math
#226 – Jo Boaler: How to Learn Math #226 – Jo Boaler: How to Learn Math

Jo Boaler is a professor of mathematics education at Stanford and the co-founder of youcubed.

On some podcast players you should be able to click the timestamp to jump to that time.

(00:00) – Introduction(06:48) – What is beautiful about mathematics?

(15:37) – How difficult should math really be?

(23:56) – Students giving up on math(35:17) – Improving math education in schools(45:14) – Inspiring mathematical creativity(1:03:00) – youcubed(1:07:20) – Best methods for studying math(1:27:54) – Advice for young people

2 недели, 4 дня назад @ lexfridman.com
#225 – Jeffrey Shainline: Neuromorphic Computing and Optoelectronic Intelligence
#225 – Jeffrey Shainline: Neuromorphic Computing and Optoelectronic Intelligence #225 – Jeffrey Shainline: Neuromorphic Computing and Optoelectronic Intelligence

Jeffrey Shainline is a physicist at NIST.

On some podcast players you should be able to click the timestamp to jump to that time.

(00:00) – Introduction(05:56) – How are processors made?

(25:15) – Are engineers or physicists more important(27:43) – Super-conductivity(43:31) – Computation(48:07) – Computation vs communication(51:48) – Electrons for computation and light for communication(1:02:32) – Neuromorphic computing(1:27:23) – What is NIST?

(1:30:41) – Implementing super-conductivity(1:38:20) – The future of neuromorphic computing(1:57:54) – Loop neurons(2:04:09) – Machine learning(2:18:36) – Cosmological evolution(2:25:44) – Cosmological natural selection(2:43:05) – Life in the univers…

2 недели, 5 дней назад @ lexfridman.com
#224 – Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming
#224 – Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming #224 – Travis Oliphant: NumPy, SciPy, Anaconda, Python & Scientific Programming

Travis Oliphant is a data scientist, entrepreneur, and creator of NumPy, SciPy, and Anaconda.

Please support this podcast by checking out our sponsors:– Novo: https://banknovo.com/lex– Allform: https://allform.com/lex to get 20% off– Onnit: https://lexfridman.com/onnit to get up to 10% off– Athletic Greens: https://athleticgreens.com/lex and use code LEX to get 1 month of fish oil– Blinkist: https://blinkist.com/lex and use code LEX to get 25% off premiumEPISODE LINKS:Travis’s Twitter: https://twitter.com/teoliphantTravis’s Wiki Page: https://en.wikipedia.org/wiki/Travis_OliphantNumPy: https://numpy.org/SciPy: https://scipy.org/about.htmlAnaconda: https://www.anaconda.com/products/individua…

3 недели, 1 день назад @ lexfridman.com
#223 – Travis Stevens: Judo, Olympics, and Mental Toughness
#223 – Travis Stevens: Judo, Olympics, and Mental Toughness #223 – Travis Stevens: Judo, Olympics, and Mental Toughness

Travis Stevens is the 2016 Olympic Judo silver medalist and BJJ black belt.

Please support this podcast by checking out our sponsors:– Justworks: https://justworks.com– Indeed: https://indeed.com/lex to get $75 credit– MasterClass: https://masterclass.com/lex to get 15% off– Eight Sleep: https://www.eightsleep.com/lex and use code LEX to get special savingsEPISODE LINKS:Travis’s Website: https://www.travisstevensgrappling.com/Travis’s Twitter: https://twitter.com/judosilencerTravis’s Youtube Channel: https://www.youtube.com/c/TravisStevensgrapplingTravis’s Instagram: https://www.instagram.com/judosilencerPODCAST INFO:Podcast website: https://lexfridman.com/podcastApple Podcasts: https://app…

3 недели, 3 дня назад @ lexfridman.com
#222 – Jay McClelland: Neural Networks and the Emergence of Cognition
#222 – Jay McClelland: Neural Networks and the Emergence of Cognition #222 – Jay McClelland: Neural Networks and the Emergence of Cognition

Jay McClelland is a cognitive scientist at Stanford.

Please support this podcast by checking out our sponsors:– Paperspace: https://gradient.run/lex to get $15 credit– Skiff: https://skiff.org/lex to get early access– Uprising Food: https://uprisingfood.com/lex to get $10 off 1st starter bundle– Four Sigmatic: https://foursigmatic.com/lex and use code LexPod to get up to 60% off– Onnit: https://lexfridman.com/onnit to get up to 10% offEPISODE LINKS:Jay’s Website: https://stanford.edu/~jlmcc/PODCAST INFO:Podcast website: https://lexfridman.com/podcastApple Podcasts: https://apple.co/2lwqZIrSpotify: https://spoti.fi/2nEwCF8RSS: https://lexfridman.com/feed/podcast/YouTube Full Episodes: https:…

3 недели, 4 дня назад @ lexfridman.com
#221 – Douglas Lenat: Cyc and the Quest to Solve Common Sense Reasoning in AI
#221 – Douglas Lenat: Cyc and the Quest to Solve Common Sense Reasoning in AI #221 – Douglas Lenat: Cyc and the Quest to Solve Common Sense Reasoning in AI

Douglas Lenat is the founder of Cyc, a 37 year project aiming to solve common-sense knowledge and reasoning in AI.

On some podcast players you should be able to click the timestamp to jump to that time.

(00:00) – Introduction(07:39) – What is Cyc?

(2:01:21) – The open source community and OpenCyc(2:11:48) – The inference problem(2:13:31) – Cyc’s programming language(2:21:05) – Ontological engineering(2:28:30) – Do machines think?

(2:37:15) – Death and consciousness(2:47:16) – What would you say to AI?

1 месяц назад @ lexfridman.com
#220 – Niels Jorgensen: New York Firefighters and the Heroes of 9/11
#220 – Niels Jorgensen: New York Firefighters and the Heroes of 9/11 #220 – Niels Jorgensen: New York Firefighters and the Heroes of 9/11

Niels Jorgensen is a former New York firefighter who served for over 21 years and was at Ground Zero on September 11, 2001.

Please support this podcast by checking out our sponsors:– ROKA: https://roka.com/ and use code LEX to get 20% off your first order– MUD\WTR: https://mudwtr.com/lex and use code LEX to get 5% off– Magic Spoon: https://magicspoon.com/lex and use code LEX to get $5 off– Blinkist: https://blinkist.com/lex and use code LEX to get 25% off premiumEPISODE LINKS:Niels’s 20 for 20 Podcast: https://ironlightlabs.org/20-for-20/PODCAST INFO:Podcast website: https://lexfridman.com/podcastApple Podcasts: https://apple.co/2lwqZIrSpotify: https://spoti.fi/2nEwCF8RSS: https://lexfridman.…

1 месяц назад @ lexfridman.com
#219 – Donald Knuth: Programming, Algorithms, Hard Problems & the Game of Life
#219 – Donald Knuth: Programming, Algorithms, Hard Problems & the Game of Life #219 – Donald Knuth: Programming, Algorithms, Hard Problems & the Game of Life

Donald Knuth is a computer scientist, Turing Award winner, father of algorithm analysis, author of The Art of Computer Programming, and creator of TeX.

Please support this podcast by checking out our sponsors:– Coinbase: https://coinbase.com/lex to get $5 in free Bitcoin– InsideTracker: https://insidetracker.com/lex and use code Lex25 to get 25% off– NetSuite: http://netsuite.com/lex to get free product tour– ExpressVPN: https://expressvpn.com/lexpod and use code LexPod to get 3 months free– BetterHelp: https://betterhelp.com/lex to get 10% offEPISODE LINKS:Donald’s Stanford Page: https://profiles.stanford.edu/donald-knuthDonald’s Books: https://amzn.to/3heyBsCPODCAST INFO:Podcast website: …

1 месяц назад @ lexfridman.com
#218 – Jaron Lanier: Virtual Reality, Social Media & the Future of Humans and AI
#218 – Jaron Lanier: Virtual Reality, Social Media & the Future of Humans and AI #218 – Jaron Lanier: Virtual Reality, Social Media & the Future of Humans and AI

Jaron Lanier is a computer scientist, composer, artist, author, and founder of the field of virtual reality.

Please support this podcast by checking out our sponsors:– Skiff: https://skiff.org/lex to get early access– Novo: https://banknovo.com/lex– Onnit: https://lexfridman.com/onnit to get up to 10% off– Indeed: https://indeed.com/lex to get $75 credit– Eight Sleep: https://www.eightsleep.com/lex and use code LEX to get special savingsEPISODE LINKS:Jaron’s Website: http://www.jaronlanier.com/Jaron’s Books: https://amzn.to/3tlhl9TPODCAST INFO:Podcast website: https://lexfridman.com/podcastApple Podcasts: https://apple.co/2lwqZIrSpotify: https://spoti.fi/2nEwCF8RSS: https://lexfridman.com/f…

1 месяц, 1 неделя назад @ lexfridman.com
#217 – Rodney Brooks: Robotics
#217 – Rodney Brooks: Robotics #217 – Rodney Brooks: Robotics

Rodney Brooks is a roboticist, former head of CSAIL at MIT, and co-founder of iRobot, Rethink Robotics, and Robust.AI.

Please support this podcast by checking out our sponsors:– Paperspace: https://gradient.run/lex to get $15 credit– GiveDirectly: https://givedirectly.org/lex to get gift matched up to $300– BiOptimizers: http://www.magbreakthrough.com/lex to get 10% off– Four Sigmatic: https://foursigmatic.com/lex and use code LexPod to get up to 60% off– SimpliSafe: https://simplisafe.com/lex and use code LEX to get a free security cameraEPISODE LINKS:Rodney’s Twitter: https://twitter.com/rodneyabrooksRodney’s Blog: http://rodneybrooks.com/blog/PODCAST INFO:Podcast website: https://lexfrid…

1 месяц, 1 неделя назад @ lexfridman.com
#216 – Vincent Racaniello: Viruses and Vaccines
#216 – Vincent Racaniello: Viruses and Vaccines #216 – Vincent Racaniello: Viruses and Vaccines

Vincent Racaniello is a virologist, immunologist, and microbiologist at Columbia.

He is a co-author of the textbook Principles of Virology and co-host of This Week in Virology podcast.

On some podcast players you should be able to click the timestamp to jump to that time.

(1:35:45) – Vaccines(1:41:43) – Lex on his reaction to the COVID-19 vaccine shot(1:47:39) – Modern vaccines(1:52:39) – How does mRNA vaccine work?

(3:06:37) – Masks(3:15:05) – Bret Weinstein vs Sam Harris(3:18:39) – This Week in Virology(3:28:19) – Advice for young people(3:30:42) – Meaning of life

1 месяц, 2 недели назад @ lexfridman.com
Microsoft Research Podcast Microsoft Research Podcast
последний пост 2 месяца назад
132 - New Future of Work: How remote and hybrid work will shape workplaces and society with Jaime Teevan and Sid Suri
132 - New Future of Work: How remote and hybrid work will shape workplaces and society with Jaime Teevan and Sid Suri 132 - New Future of Work: How remote and hybrid work will shape workplaces and society with Jaime Teevan and Sid Suri

For Microsoft researchers, COVID-19 was a call to action.

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

In this episode of The New Future of Work series, Chief Scientist Jaime Teevan and Senior Principal Researcher Siddharth Suri explore the many ways people were impacted by work shifts during the COVID-19 pandemic.

The research that Siddharth Suri describes in this podcast was jointly done with Hana Wolf of Li…

2 месяца назад @ blubrry.com
131 - New Future of Work: Redefining workspaces as hybrid and remote work become more prevalent with Jaime Teevan and Ginger Hudson
131 - New Future of Work: Redefining workspaces as hybrid and remote work become more prevalent with Jaime Teevan and Ginger Hudson 131 - New Future of Work: Redefining workspaces as hybrid and remote work become more prevalent with Jaime Teevan and Ginger Hudson

For Microsoft researchers, COVID-19 was a call to action.

The reimagining of work practices had long been an area of study, but existing and new questions that needed immediate answers surfaced as companies and their employees quickly adjusted to significantly different working conditions.

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

They also talk about what an “anatomy of hybrid work” might look like and som…

2 месяца, 1 неделя назад @ blubrry.com
130 - New Future of Work: Managing IT and security in remote scenarios with Jaime Teevan and Matt Brodsky
130 - New Future of Work: Managing IT and security in remote scenarios with Jaime Teevan and Matt Brodsky 130 - New Future of Work: Managing IT and security in remote scenarios with Jaime Teevan and Matt Brodsky

For Microsoft researchers, COVID-19 was a call to action.

The reimagining of work practices had long been an area of study, but existing and new questions that needed immediate answers surfaced as companies and their employees quickly adjusted to significantly different working conditions.

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

They also explore why remote work came with a spike in phishing threats, what…

2 месяца, 2 недели назад @ blubrry.com
129 - Machine learning, molecular simulation, and the opportunity for societal good with Chris Bishop and Max Welling
129 - Machine learning, molecular simulation, and the opportunity for societal good with Chris Bishop and Max Welling 129 - Machine learning, molecular simulation, and the opportunity for societal good with Chris Bishop and Max Welling

Unlocking the challenge of molecular simulation has the potential to yield significant breakthroughs in how we tackle such societal issues as climate change, drug discovery, and the treatment of disease, and Microsoft is ramping up its efforts in the space.

In this episode, Chris Bishop, Lab Director of Microsoft Research Cambridge, welcomes renowned machine learning researcher Max Welling to the Microsoft Research team as head of the new Amsterdam lab.

Connecting over their shared physics background and vision for molecular simulation, Bishop and Welling explore several fascinating topics, including a future in which machine learning and quantum computing will be used in tandem to model mo…

2 месяца, 3 недели назад @ blubrry.com
128 - New Future of Work: How developer collaboration and productivity are changing in a hybrid work model
128 - New Future of Work: How developer collaboration and productivity are changing in a hybrid work model 128 - New Future of Work: How developer collaboration and productivity are changing in a hybrid work model

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

In this episode of The New Future of Work series, Chief Scientist Jaime Teevan and Principal Productivity Engineer Brian Houck discuss what the massive shift to remote work meant for developers—both employees of Microsoft and customers using Microsoft developer platforms to support their work.

They’ll talk about how taking a holistic approach to developer productivi…

3 месяца назад @ blubrry.com
127 - New Future of Work: Staying productive and happy when our office is our home with Jaime Teevan and Sonia Jaffe
127 - New Future of Work: Staying productive and happy when our office is our home with Jaime Teevan and Sonia Jaffe 127 - New Future of Work: Staying productive and happy when our office is our home with Jaime Teevan and Sonia Jaffe

For Microsoft researchers, COVID-19 was a call to action.

The reimagining of work practices had long been an area of study, but existing and new questions that needed immediate answers surfaced as companies and their employees quickly adjusted to significantly different working conditions.

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

They also explore how people already working from home helped them better und…

3 месяца, 1 неделя назад @ blubrry.com
126 - New Future of Work: Meeting and collaborating in a remote and hybrid world with Jaime Teevan and Abigail Sellen
126 - New Future of Work: Meeting and collaborating in a remote and hybrid world with Jaime Teevan and Abigail Sellen 126 - New Future of Work: Meeting and collaborating in a remote and hybrid world with Jaime Teevan and Abigail Sellen

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

In this episode of The New Future of Work series of the podcast, Chief Scientist Jaime Teevan and Abigail Sellen, Deputy Lab Director at Microsoft Research Cambridge in the United Kingdom, explore the dynamics of meetings and collaborations in the context of remote work.

They specifically address the difference between weak and strong ties in our professional networ…

3 месяца, 2 недели назад @ blubrry.com
125 - New Future of Work: Driving innovation via cross-company research with Jaime Teevan and Brent Hecht
125 - New Future of Work: Driving innovation via cross-company research with Jaime Teevan and Brent Hecht 125 - New Future of Work: Driving innovation via cross-company research with Jaime Teevan and Brent Hecht

For Microsoft researchers, COVID-19 was a call to action.

The reimagining of work practices had long been an area of study, but existing and new questions that needed immediate answers surfaced as companies and their employees quickly adjusted to significantly different working conditions.

Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative.

The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

They’ll discuss the role of research during times of disruption, the widening…

3 месяца, 3 недели назад @ blubrry.com
124 - Econ4: Uncovering how decision-making shapes individuals and society through behavioral public economics featuring Evan Rose and Hunt Allcott
124 - Econ4: Uncovering how decision-making shapes individuals and society through behavioral public economics featuring Evan Rose and Hunt Allcott 124 - Econ4: Uncovering how decision-making shapes individuals and society through behavioral public economics featuring Evan Rose and Hunt Allcott

In the world of economics, researchers at Microsoft are examining a range of complex systems—from those that impact the technologies we use to those that inform the laws and policies we create—through the lens of a social science that goes beyond the numbers to better understand people and society.

In this episode, Senior Principal Researcher Hunt Allcott talks with Postdoctoral Researcher Evan Rose about Allcott’s work exploring the everyday decisions people face, like buying fuel-efficient cars or taking out payday loans, and how a clearer understanding of these decisions can shape meaningful public policy.

Allcott shares how his and others’ research shows that policy can often have compl…

4 месяца назад @ blubrry.com
123 - Econ3: Understanding the media ecosystem and how it informs public opinion in the internet age featuring Hunt Allcott and David Rothschild
123 - Econ3: Understanding the media ecosystem and how it informs public opinion in the internet age featuring Hunt Allcott and David Rothschild 123 - Econ3: Understanding the media ecosystem and how it informs public opinion in the internet age featuring Hunt Allcott and David Rothschild

Interviewed by Senior Principal Researcher Hunt Allcott, Economist David Rothschild discusses how the news media has evolved alongside social media and the internet, from story development to distribution of news via aggregators and wire services.

Rothschild illuminates how and where people are consuming news and shares some of the strategies he’s seeing news outlets use to appeal to their audiences.

He also covers research insights into media bias, misinformation, and how this knowledge could inform the future of news for the better.

In addition, the researchers talk about Rothschild’s work with Project Ratio, which looks at how the news ecosystem impacts public opinion and political polar…

4 месяца, 1 неделя назад @ blubrry.com
122 - Econ2: Causal machine learning, data interpretability, and online platform markets featuring Hunt Allcott and Greg Lewis
122 - Econ2: Causal machine learning, data interpretability, and online platform markets featuring Hunt Allcott and Greg Lewis 122 - Econ2: Causal machine learning, data interpretability, and online platform markets featuring Hunt Allcott and Greg Lewis

In the world of economics, researchers at Microsoft are examining a range of complex systems—from those that impact the technologies we use to those that inform the laws and policies we create—through the lens of a social science that goes beyond the numbers to better understand people and society.

In this episode, Senior Principal Researcher Dr. Hunt Allcott speaks with Microsoft Research New England office mate and Senior Principal Researcher Dr. Greg Lewis.

Together, they cover the connection between causal machine learning and economics research, the motivations of buyers and sellers on e-commerce platforms, and how ad targeting and data practices could evolve to foster a more symbiotic…

4 месяца, 2 недели назад @ blubrry.com
121 - Econ1: Using microeconomics to solve mass incarceration featuring Hunt Allcott and Evan Rose
121 - Econ1: Using microeconomics to solve mass incarceration featuring Hunt Allcott and Evan Rose 121 - Econ1: Using microeconomics to solve mass incarceration featuring Hunt Allcott and Evan Rose

In the world of economics, researchers at Microsoft are examining a range of complex systems—from those that impact the technologies we use to those that inform the laws and policies we create—through the lens of a social science that goes beyond the numbers to better understand people and society.

In this episode, Dr. Hunt Allcott, Senior Principal Researcher at Microsoft Research New England, talks with Dr. Evan Rose, Postdoctoral Researcher, whom Allcott describes as “one of the most engaging and talented researchers in applied microeconomics today.” They’ll discuss how Rose’s experience teaching adult learners at San Quentin State Prison has resonated throughout his research, and they’l…

4 месяца, 4 недели назад @ blubrry.com
120 - Advancing Excel as a programming language with Andy Gordon and Simon Peyton Jones
120 - Advancing Excel as a programming language with Andy Gordon and Simon Peyton Jones 120 - Advancing Excel as a programming language with Andy Gordon and Simon Peyton Jones

Today, people around the globe—from teachers to small-business owners to finance executives—use Microsoft Excel to make sense of the information that occupies their respective worlds, and whether they realize it or not, in doing so, they’re taking on the role of programmer.

In this episode, Senior Principal Research Manager Andy Gordon, who leads the Calc Intelligence team at Microsoft Research, and Senior Principal Researcher Simon Peyton Jones provide an inside account of the journey Excel has taken as a programming language, including the expansion of data types that has unlocked greater functionality and the release of the LAMBDA function, which makes the Excel formula language Turing-c…

5 месяцев, 1 неделя назад @ blubrry.com
NLP Highlights NLP Highlights
последний пост 1 неделя, 2 дня назад
133 - PhD Application Series: Preparing Application Materials, with Nathan Schneider and Roma Patel
133 - PhD Application Series: Preparing Application Materials, with Nathan Schneider and Roma Patel 133 - PhD Application Series: Preparing Application Materials, with Nathan Schneider and Roma Patel

This episode is the first in our current series on PhD applications.

How should people prepare their applications to PhD programs in NLP?

In this episode, we invite Nathan Schneider (Professor of Linguistics and Compute…

1 неделя, 2 дня назад @ soundcloud.com
132 - Alexa Prize Socialbot Grand Challenge and Alquist 4.0, with Petr Marek
132 - Alexa Prize Socialbot Grand Challenge and Alquist 4.0, with Petr Marek 132 - Alexa Prize Socialbot Grand Challenge and Alquist 4.0, with Petr Marek

In this episode, we discussed the Alexa Prize Socialbot Grand Challenge and this year's winning submission, Alquist 4.0, with Petr Marek, a member of the winning team.

Petr gave us an overview of their submission, the de…

2 недели, 4 дня назад @ soundcloud.com
131 - Opportunities and Barriers between HCI and NLP, with Nanna Inie and Leon Derczynski
131 - Opportunities and Barriers between HCI and NLP, with Nanna Inie and Leon Derczynski 131 - Opportunities and Barriers between HCI and NLP, with Nanna Inie and Leon Derczynski

What can NLP researchers learn from Human Computer Interaction (HCI) research?

We chatted with Nanna Inie and Leon Derczynski to find out.

We discussed HCI's research processes including methods of inquiry, the data anno…

1 месяц, 3 недели назад @ soundcloud.com
130 - Linking human cognitive patterns to NLP Models, with Lisa Beinborn
130 - Linking human cognitive patterns to NLP Models, with Lisa Beinborn 130 - Linking human cognitive patterns to NLP Models, with Lisa Beinborn

In this episode, we talk with Lisa Beinborn, an assistant professor at Vrije Universiteit Amsterdam, about how to use human cognitive signals to improve and analyze NLP models.

We start by discussing different kinds of c…

2 месяца, 1 неделя назад @ soundcloud.com
129 - Transformers and Hierarchical Structure, with Shunyu Yao
129 - Transformers and Hierarchical Structure, with Shunyu Yao 129 - Transformers and Hierarchical Structure, with Shunyu Yao

In this episode, we talk to Shunyu Yao about recent insights into how transformers can represent hierarchical structure in language.

Bounded-depth hierarchical structure is thought to be a key feature of natural language…

3 месяца, 2 недели назад @ soundcloud.com
128 - Dynamic Benchmarking, with Douwe Kiela
128 - Dynamic Benchmarking, with Douwe Kiela 128 - Dynamic Benchmarking, with Douwe Kiela

We discussed adversarial dataset construction and dynamic benchmarking in this episode with Douwe Kiela, a research scientist at Facebook AI Research who has been working on a dynamic benchmarking platform called Dynaben…

3 месяца, 4 недели назад @ soundcloud.com
127 - Masakhane and Participatory Research for African Languages, with Tosin Adewumi and Perez Ogayo
127 - Masakhane and Participatory Research for African Languages, with Tosin Adewumi and Perez Ogayo 127 - Masakhane and Participatory Research for African Languages, with Tosin Adewumi and Perez Ogayo

We invited members of Masakhane, Tosin Adewumi and Perez Ogayo, to talk about their EMNLP Findings paper that discusses why typical research is limited for low-resourced NLP and how participatory research can help.

4 месяца, 1 неделя назад @ soundcloud.com
126 - Optimizing Continuous Prompts for Generation, with Lisa Li
126 - Optimizing Continuous Prompts for Generation, with Lisa Li 126 - Optimizing Continuous Prompts for Generation, with Lisa Li

We invited Lisa Li to talk about her recent work, Prefix-Tuning: Optimizing Continuous Prompts for Generation.

Prefix tuning is a lightweight alternative to finetuning, and the idea is to tune only a fixed-length task-sp…
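As a rough sketch of the idea only (assuming a generic PyTorch encoder that consumes embedded sequences; the actual paper prepends learned key/value prefixes at every attention layer rather than at the input), prefix tuning freezes the pretrained weights and trains nothing but a small, fixed-length block of continuous prefix vectors:

```python
import torch
import torch.nn as nn

class PrefixTuned(nn.Module):
    """Freeze a pretrained encoder; train only a fixed-length continuous prefix."""

    def __init__(self, encoder: nn.Module, hidden_dim: int, prefix_len: int = 10):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():        # pretrained weights stay frozen
            p.requires_grad_(False)
        # the only trainable parameters: one task-specific prefix
        self.prefix = nn.Parameter(0.02 * torch.randn(prefix_len, hidden_dim))

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, hidden_dim)
        batch = input_embeds.size(0)
        prefix = self.prefix.unsqueeze(0).expand(batch, -1, -1)
        return self.encoder(torch.cat([prefix, input_embeds], dim=1))
```

At training time the optimizer is handed only the prefix parameter, so each downstream task costs a few thousand parameters instead of a full copy of the model.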

4 месяца, 3 недели назад @ soundcloud.com
125 - VQA for Real Users, with Danna Gurari
125 - VQA for Real Users, with Danna Gurari 125 - VQA for Real Users, with Danna Gurari

How can we build Visual Question Answering systems for real users?

For this episode, we chatted with Danna Gurari, about her work in building datasets and models towards VQA for people who are blind.

We talked about the …

5 месяцев, 2 недели назад @ soundcloud.com
124 - Semantic Machines and Task-Oriented Dialog, with Jayant Krishnamurthy and Hao Fang
124 - Semantic Machines and Task-Oriented Dialog, with Jayant Krishnamurthy and Hao Fang 124 - Semantic Machines and Task-Oriented Dialog, with Jayant Krishnamurthy and Hao Fang

6 месяцев назад @ soundcloud.com
123 - Robust NLP, with Robin Jia
123 - Robust NLP, with Robin Jia 123 - Robust NLP, with Robin Jia

6 месяцев, 1 неделя назад @ soundcloud.com
Data Skeptic Data Skeptic
последний пост 4 дня, 11 часов назад
Causal Inference in Educational Systems
Causal Inference in Educational Systems Causal Inference in Educational Systems

Manie Tadayon, a PhD graduate from the ECE department at University of California, Los Angeles, joins us today to talk about his work “Comparative Analysis of the Hidden Markov Model and LSTM: A Simulative Approach.”

4 дня, 11 часов назад @ dataskeptic.com
Boosted Embeddings for Time Series
Boosted Embeddings for Time Series Boosted Embeddings for Time Series

Sankeerth Rao Karingula, ML Researcher at Palo Alto Networks, joins us today to talk about his work “Boosted Embeddings for Time Series Forecasting.” Works Mentioned Boosted Embeddings for Time Series Forecasting by Sankeerth Rao Karingula, Nandini Ramanan, Rasool Tahmasbi, Mehrnaz Amjadi, Deokwoo Jung, Ricky Si, Charanraj Thimmisetty, Luisa Polania Cabrera, Marjorie Sayer, Claudionor Nunes Coelho Jr Social Media Linkedin Twitter LOD Conference 2021

1 неделя, 4 дня назад @ dataskeptic.com
Applying k-Nearest Neighbors to Time Series
Applying k-Nearest Neighbors to Time Series Applying k-Nearest Neighbors to Time Series

Samya Tajmouati, a PhD student in Data Science at the University of Science of Kenitra, Morocco, joins us today to discuss her work Applying K-Nearest Neighbors to Time Series Forecasting: Two New Approaches.
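The paper's two new approaches aren't spelled out in this blurb, so here is only a minimal sketch of the generic nearest-neighbour forecasting idea: compare the most recent window of the series against all historical windows and average what followed the closest matches (the window size, k, and the Euclidean distance are illustrative assumptions, not the paper's choices):

```python
import numpy as np

def knn_forecast(series: np.ndarray, window: int = 24, k: int = 5, horizon: int = 1) -> np.ndarray:
    """Average the continuations of the k historical windows closest
    (in Euclidean distance) to the most recent window of the series."""
    query = series[-window:]
    candidates, futures = [], []
    for start in range(len(series) - window - horizon + 1):
        candidates.append(series[start:start + window])
        futures.append(series[start + window:start + window + horizon])
    dists = np.linalg.norm(np.asarray(candidates) - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return np.asarray(futures)[nearest].mean(axis=0)

# Toy usage: forecast 3 steps of a noisy seasonal signal
t = np.arange(300)
y = np.sin(2 * np.pi * t / 24) + 0.1 * np.random.default_rng(0).normal(size=t.size)
print(knn_forecast(y, window=24, k=5, horizon=3))
```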

3 недели, 4 дня назад @ dataskeptic.com
Ultra Long Time Series
Ultra Long Time Series Ultra Long Time Series

Dr. Feng Li (@f3ngli) is an Associate Professor of Statistics in the School of Statistics and Mathematics at Central University of Finance and Economics in Beijing, China. He joins us today to discuss his work Distributed ARIMA Models for Ultra-long Time Series.

1 месяц назад @ dataskeptic.com
MiniRocket
MiniRocket MiniRocket

Angus Dempster, a PhD student at Monash University in Australia, comes on today to talk about MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification. MINIROCKET reformulates ROCKET, gaining up to a 75x speedup on larger datasets with essentially the same accuracy. In this episode, we talk about the insights that made this speedup possible as well as use cases.
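To make the approach concrete, here is a deliberately simplified, slow sketch of the ROCKET-family recipe that MINIROCKET optimises (random kernels, proportion-of-positive-values pooling, then a linear classifier); the real MINIROCKET instead uses a small fixed set of two-valued kernels with chosen dilations and bias quantiles, so treat everything below as illustrative assumptions rather than the published algorithm:

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV

rng = np.random.default_rng(0)

def make_kernels(n_kernels: int = 500, kernel_len: int = 9) -> np.ndarray:
    """Random convolution kernels (the ROCKET idea; MINIROCKET fixes these)."""
    return rng.normal(size=(n_kernels, kernel_len))

def ppv_features(X: np.ndarray, kernels: np.ndarray) -> np.ndarray:
    """One feature per series and kernel: the proportion of positive
    values (PPV) of their convolution."""
    feats = np.empty((X.shape[0], len(kernels)))
    for i, series in enumerate(X):
        for j, kernel in enumerate(kernels):
            feats[i, j] = (np.convolve(series, kernel, mode="valid") > 0).mean()
    return feats

# Toy data: slow vs. fast sine waves with noise
grid = np.linspace(0, 1, 100)
X = np.vstack([np.sin(2 * np.pi * 1 * grid) + 0.1 * rng.normal(size=100) for _ in range(20)] +
              [np.sin(2 * np.pi * 5 * grid) + 0.1 * rng.normal(size=100) for _ in range(20)])
y = np.array([0] * 20 + [1] * 20)

kernels = make_kernels()
clf = RidgeClassifierCV().fit(ppv_features(X, kernels), y)  # linear classifier on PPV features
```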

1 месяц, 1 неделя назад @ dataskeptic.com
Change Point Detection in Continuous Integration Systems
Change Point Detection in Continuous Integration Systems Change Point Detection in Continuous Integration Systems

David Daly, Performance Engineer at MongoDB, joins us today to discuss "The Use of Change Point Detection to Identify Software Performance Regressions in a Continuous Integration System".

1 месяц, 2 недели назад @ dataskeptic.com
ARiMA is not Sufficient
ARiMA is not Sufficient ARiMA is not Sufficient

Chongshou Li, Associate Professor at Southwest Jiaotong University in China, joins us today to talk about his work Why are the ARIMA and SARIMA not Sufficient.

1 месяц, 2 недели назад @ dataskeptic.com
Comp Engine
Comp Engine Comp Engine

Ben Fulcher, Senior Lecturer at the School of Physics at the University of Sydney in Australia, comes on today to talk about his project Comp Engine.

1 месяц, 3 недели назад @ dataskeptic.com
Detecting Ransomware
Detecting Ransomware Detecting Ransomware

Nitin Pundir, a PhD candidate at the University of Florida who works at the Florida Institute for Cybersecurity Research, comes on today to talk about his work “RanStop: A Hardware-assisted Runtime Crypto-Ransomware Detection Technique.”

2 месяца назад @ dataskeptic.com
GANs in Finance
GANs in Finance GANs in Finance

Florian Eckerli, a recent graduate of Zurich University of Applied Sciences, comes on the show today to discuss his work Generative Adversarial Networks in Finance: An Overview.

2 месяца, 1 неделя назад @ dataskeptic.com
Predicting Urban Land Use
Predicting Urban Land Use Predicting Urban Land Use

Today on the show we have Daniel Omeiza, a doctoral student in the computer science department of the University of Oxford, who joins us to talk about his work Efficient Machine Learning for Large-Scale Urban Land-Use Forecasting in Sub-Saharan Africa.

2 месяца, 2 недели назад @ dataskeptic.com
Opportunities for Skillful Weather Prediction
Opportunities for Skillful Weather Prediction Opportunities for Skillful Weather Prediction

Today on the show we have Elizabeth Barnes, Associate Professor in the department of Atmospheric Science at Colorado State University, who joins us to talk about her work Identifying Opportunities for Skillful Weather Prediction with Interpretable Neural Networks. Find more from the Barnes Research Group on their site. Weather is notoriously difficult to predict. Complex systems are demanding of computational power. Further, the chaotic nature of, well, nature, makes accurate forecasting especially difficult the longer into the future one wants to look. Yet all is not lost! In this interview, we explore the use of machine learning to help identify certain conditions under which the weather …

2 месяца, 3 недели назад @ dataskeptic.com
Predicting Stock Prices
Predicting Stock Prices Predicting Stock Prices

Today on the show, Andrea Fronzetti Colladon (@iandreafc), currently working at the University of Perugia and inventor of the Semantic Brand Score, joins us to talk about his work studying human communication and social interaction. We discuss the paper Look inside. Predicting Stock Prices by Analyzing an Enterprise Intranet Social Network and Using Word Co-Occurrence Networks.

2 месяца, 4 недели назад @ dataskeptic.com
N-Beats
N-Beats N-Beats

Today on the show we have Boris Oreshkin (@boreshkin), a Senior Research Scientist at Unity Technologies, who joins us to talk about his work N-BEATS: Neural Basis Expansion Analysis for Interpretable Time Series Forecasting.

3 месяца назад @ dataskeptic.com
Translation Automation
Translation Automation Translation Automation

Today we are back with another episode discussing AI in the workplace. AI has automated, is automating, and will continue to automate work done by humans. Sometimes this may be an entire role; other times it may automate a particular part of a role, scaling a person's effectiveness. Carl Stimson, a freelance Japanese-to-English translator, comes on the show to talk about his work in translation and his perspective on how AI will change translation in the future.

3 месяца, 1 неделя назад @ dataskeptic.com
Linear Digressions Linear Digressions
последний пост None
SuperDataScience SuperDataScience
последний пост 13 часов назад
SDS 514: Does Caffeine Hurt Productivity? (Part 2: Experimental Results)
SDS 514: Does Caffeine Hurt Productivity? (Part 2: Experimental Results) SDS 514: Does Caffeine Hurt Productivity? (Part 2: Experimental Results)

In this episode, I dive into the nuts and bolts of the data from my experiment on caffeine and productivity.

Additional materials: www.superdatascience.com/514

13 часов назад @ soundcloud.com
SDS 513: Transformers for Natural Language Processing
SDS 513: Transformers for Natural Language Processing SDS 513: Transformers for Natural Language Processing

Denis Rothman joins us to discuss his writing work in natural language processing, explainable AI, and more!

In this episode you will learn:• What are transformers and their applications?

[7:54]• Denis’s book on expla…

3 дня, 13 часов назад @ soundcloud.com
SDS 512: Does Caffeine Hurt Productivity? (Part 1)
SDS 512: Does Caffeine Hurt Productivity? (Part 1) SDS 512: Does Caffeine Hurt Productivity? (Part 1)

I dive into a personal experiment to test my productivity relative to my coffee intake and if caffeine is actually hurting my productivity.

Additional materials: www.superdatascience.com/512

1 неделя назад @ soundcloud.com
SDS 511: Data Science for Private Investing —— LIVE with Drew Conway
SDS 511: Data Science for Private Investing —— LIVE with Drew Conway SDS 511: Data Science for Private Investing —— LIVE with Drew Conway

Drew Conway joins us on the first live podcast to discuss his work in private investing and how data science figures into and improves his work.

In this episode you will learn:• The R Conference and NYHackR [6:33]• Ma…

1 неделя, 3 дня назад @ soundcloud.com
SDS 510: Deep Reinforcement Learning
SDS 510: Deep Reinforcement Learning SDS 510: Deep Reinforcement Learning

In this episode, I dive into the world of reinforcement learning and deep reinforcement learning and the benefits of both.

Additional materials: www.superdatascience.com/510

2 недели назад @ soundcloud.com
SDS 509: Accelerating Start-up Growth with A.I. Specialists
SDS 509: Accelerating Start-up Growth with A.I. Specialists SDS 509: Accelerating Start-up Growth with A.I. Specialists

Parinaz Sobhani joins us to discuss the cutting-edge work of Georgian, a collaborative company that helps start-ups implement and scale machine learning and AI.

In this episode you will learn:• Parinaz’s work at Georgi…

2 недели, 3 дня назад @ soundcloud.com
SDS 508: Building Your Ant Hill
SDS 508: Building Your Ant Hill SDS 508: Building Your Ant Hill

In this episode, I discuss an interesting bit of my grandmother’s view about the process of working and going through life.

Additional materials: www.superdatascience.com/508

3 недели назад @ soundcloud.com
SDS 507: Bayesian Statistics
SDS 507: Bayesian Statistics SDS 507: Bayesian Statistics

Rob Trangucci joins us to discuss his work and study in Bayesian statistics and how he applies it to real-world problems.

In this episode you will learn:• Getting Rob on the show [8:12]• Stan [9:34]• Gradients [18:15…

3 недели, 3 дня назад @ soundcloud.com
SDS 506: Supervised vs Unsupervised Learning
SDS 506: Supervised vs Unsupervised Learning SDS 506: Supervised vs Unsupervised Learning

In this episode, I continue with last week’s theme and discuss the differences between supervised and unsupervised learning.

Additional materials: www.superdatascience.com/506

4 недели назад @ soundcloud.com
SDS 505: From Data Science to Cinema
SDS 505: From Data Science to Cinema SDS 505: From Data Science to Cinema

Hadelin de Ponteves joins us to discuss his latest educational work and how his skills as a data science educator helped him start his career in acting.

In this episode you will learn:• What has Hadelin been up to?

1 месяц назад @ soundcloud.com
SDS 504: Classification vs Regression
SDS 504: Classification vs Regression SDS 504: Classification vs Regression

In this episode, I give a quick introduction to subcategories of supervised learning problems.

Additional materials: www.superdatascience.com/504

1 месяц назад @ soundcloud.com
SDS 503: Deep Reinforcement Learning for Robotics
SDS 503: Deep Reinforcement Learning for Robotics SDS 503: Deep Reinforcement Learning for Robotics

Pieter Abbeel joins us to discuss his work as an academic and entrepreneur in the field of AI robotics and what the future of the industry holds.

In this episode you will learn:• How does Pieter do it all?

[5:45]• Pie…

1 месяц, 1 неделя назад @ soundcloud.com
SDS 502: Managing Imposter Syndrome
SDS 502: Managing Imposter Syndrome SDS 502: Managing Imposter Syndrome

In this episode, I explore a common issue plaguing people across fields: imposter syndrome.

Additional materials: www.superdatascience.com/502

1 месяц, 1 неделя назад @ soundcloud.com
SDS 501: Statistical Programming with Friends
SDS 501: Statistical Programming with Friends SDS 501: Statistical Programming with Friends

Jared Lander joins us to discuss his work as an R meetup organizer, the upcoming virtual R Conference, and his work as a consultant for a variety of companies from metal workers to professional football teams.

1 месяц, 2 недели назад @ soundcloud.com
SDS 499: Data Meshes and Data Reliability
SDS 499: Data Meshes and Data Reliability SDS 499: Data Meshes and Data Reliability

Barr Moses joins us to discuss the importance of data reliability for pipelines and how companies can achieve data mesh.

In this episode you will learn:• Data meshes [4:25]• Self-serve data reliability [15:36]• How …

1 месяц, 3 недели назад @ soundcloud.com
Data Science at Home Data Science at Home
последний пост 1 месяц, 4 недели назад
Reinforcement Learning is all you need. Or is it? (Ep. 165)
Reinforcement Learning is all you need. Or is it? (Ep. 165) Reinforcement Learning is all you need. Or is it? (Ep. 165)

August 18, 2021 podcast: Is reinforcement learning sufficient to build truly intelligent machines?

Our Sponsors: Quantum Metric. Stay off the naughty list this holiday season by reducing customer friction, increasing conversions, and personalizing the shopping experience.

Visit us at quantummetric.com/podoffer and see if you qualify to receive our “12 Days of Insights” offer with code DATASCIENCE.

Amethix Technologies. Amethix use advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, energy.

Amethix provide solutions to collect and secure data with higher transparency and disintermediation…

1 месяц, 4 недели назад @ datascienceathome.com
What’s happening with AI today? (Ep. 164)
What’s happening with AI today? (Ep. 164) What’s happening with AI today? (Ep. 164)

August 18, 2021 podcast: In this episode I have a wonderful chat with Ronald Schmelzer and Kathleen Walch, authors of “AI Today”, the top podcast for those wanting a no-hype, practical, real-world insight into what enterprises, public sector agencies, thought leaders, leading technology companies, pundits, and experts are doing with AI today.

Sponsored by Quantum Metric. Did you know that 2021 holiday ecommerce sales are expected to exceed 2020 benchmarks? Are you prepared to capture every customer revenue opportunity?

Visit their website at quantummetric.com/podoffer and see if you qualify to receive their “12 Days of Insights” offer with code DATASCIENCE.

Sponsored by Amethix Technologies. Amethi…

1 месяц, 4 недели назад @ datascienceathome.com
2 effective ways to explain your predictions (Ep. 163)
2 effective ways to explain your predictions (Ep. 163) 2 effective ways to explain your predictions (Ep. 163)

We use cookies on our website to give you the most relevant experience by remembering your preferences and repeat visits.

By clicking “Accept”, you consent to the use of ALL the cookies.

2 месяца, 1 неделя назад @ datascienceathome.com
The Netflix challenge. Fair or what? (Ep. 162)
The Netflix challenge. Fair or what? (Ep. 162)

Remember the Netflix challenge? It offered a lot of money to whoever could crack the problem of recommending the best possible movie. Was it a fair challenge? […]

2 месяца, 1 неделя назад @ datascienceathome.com
Artificial Intelligence for Blockchains with Jonathan Ward CTO of Fetch AI (Ep. 161)
Artificial Intelligence for Blockchains with Jonathan Ward CTO of Fetch AI (Ep. 161)

In this episode, Fetch AI CTO Jonathan Ward speaks about decentralization, AI, and blockchain for smart cities and the enterprise. Below are some great links about collective learning, smart contracts in Rust, and […]

2 месяца, 1 неделя назад @ datascienceathome.com
Apache Arrow, Ballista and Big Data in Rust with Andy Grove RB (Ep. 160)
Apache Arrow, Ballista and Big Data in Rust with Andy Grove RB (Ep. 160) Apache Arrow, Ballista and Big Data in Rust with Andy Grove RB (Ep. 160)

August 4, 2021 podcast: Do you want to know the latest in big data analytics frameworks?

Have you ever heard of Apache Arrow?

In this episode I speak with Andy Grove one of the main authors of Apache Arrow and Ballista compute engine.

Andy explains some challenges while he was designing the Arrow and Ballista memory models and he describes some amazing solutions.
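For listeners who have never touched Arrow, a tiny illustration of its columnar in-memory tables via the Python bindings may help (pyarrow only; Ballista itself is a Rust/DataFusion project and is not shown here):

```python
import pyarrow as pa

# An Arrow table is a set of named, typed, columnar arrays; the same memory
# layout is shared by Arrow implementations in other languages, which is what
# lets systems exchange data without per-row serialisation.
table = pa.table({
    "user_id": pa.array([1, 2, 3], type=pa.int64()),
    "score":   pa.array([0.7, 0.1, 0.9], type=pa.float64()),
})

print(table.schema)          # user_id: int64, score: double
print(table.num_rows)        # 3
print(table.column("score"))
```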

Amethix use advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, energy.

2 месяца, 1 неделя назад @ datascienceathome.com
GitHub Copilot: yay or nay? (Ep. 159)
GitHub Copilot: yay or nay? (Ep. 159) GitHub Copilot: yay or nay? (Ep. 159)

July 6, 2021 podcast: Having already made quite some noise in the news, GitHub Copilot promises to be your pair programmer for life.

In this episode I explain what GitHub Copilot does and how it does it.

Should developers be happy, scared or just keep coding the traditional way?

Sponsors: Get one of the best VPNs at a massive discount with coupon code DATASCIENCE.

It provides you with an 83% discount which unlocks the best price in the market plus 3 extra months for free.

3 месяца, 1 неделя назад @ datascienceathome.com
A simple trick for very unbalanced data (Ep. 157)
A simple trick for very unbalanced data (Ep. 157) A simple trick for very unbalanced data (Ep. 157)

June 22, 2021 podcast: Data from the real world are never perfectly balanced.

In this episode I explain a simple yet effective trick to train models with very unbalanced data.
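The episode's actual trick isn't described in this summary, so purely as one hedged illustration of the general direction, a common baseline for heavily skewed labels is to reweight the loss by inverse class frequency (scikit-learn's class_weight option):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy problem with a 99:1 class imbalance
X, y = make_classification(n_samples=5000, weights=[0.99, 0.01], random_state=0)

# class_weight="balanced" scales each class's loss inversely to its frequency,
# so mistakes on the rare class cost proportionally more during training.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
```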

Sponsors: Get one of the best VPNs at a massive discount with coupon code DATASCIENCE.

It provides you with an 83% discount which unlocks the best price in the market plus 3 extra months for free.

Here is the link: https://surfshark.deals/DATASCIENCE References

3 месяца, 3 недели назад @ datascienceathome.com
Time to take your data back with Tapmydata (Ep. 156)
Time to take your data back with Tapmydata (Ep. 156) Time to take your data back with Tapmydata (Ep. 156)

4 месяца назад @ datascienceathome.com
True Machine Intelligence just like the human brain (Ep. 155)
True Machine Intelligence just like the human brain (Ep. 155) True Machine Intelligence just like the human brain (Ep. 155)

June 10, 2021 podcast: In this episode I have a really interesting conversation with Karan Grewal, member of the research staff at Numenta where he investigates how biological principles of intelligence can be translated into silicon.

We speak about the thousand brains theory and why neural networks forget.

4 месяца, 1 неделя назад @ datascienceathome.com
Delivering unstoppable data with Streamr (Ep. 154)
Delivering unstoppable data with Streamr (Ep. 154) Delivering unstoppable data with Streamr (Ep. 154)

May 26, 2021 podcast: Delivering unstoppable data to unstoppable apps is now possible with the Streamr Network. Streamr is a layer zero protocol for real-time data which powers the decentralized Streamr pub/sub network.

The technology works in tandem with companion blockchains – currently Ethereum and xDai chain – which are used for identity, security and payments.

On top is the application layer, including the Data Union framework, Marketplace and Core, and all third party applications.

In this episode I have a very interesting conversation with Streamr founder and CEO Henri Pihkala. References

4 месяца, 3 недели назад @ datascienceathome.com
MLOps: the good, the bad and the ugly (Ep. 153)
MLOps: the good, the bad and the ugly (Ep. 153) MLOps: the good, the bad and the ugly (Ep. 153)

4 месяца, 3 недели назад @ datascienceathome.com
MLOps: what is and why it is important Part 2 (Ep. 152)
MLOps: what is and why it is important Part 2 (Ep. 152) MLOps: what is and why it is important Part 2 (Ep. 152)

4 месяца, 3 недели назад @ datascienceathome.com
MLOps: what is and why it is important (Ep. 151)
MLOps: what is and why it is important (Ep. 151) MLOps: what is and why it is important (Ep. 151)

May 11, 2021 podcast: If you think that knowing Tensorflow and Scikit-learn is enough, think again.

What is MLOps and why is it important?

It’s a podcast for techies by techies.

Amethix use advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, energy.

Amethix provide solutions to collect and secure data with higher transparency and disintermediation, and build the statistical models that will support your business.

5 месяцев, 1 неделя назад @ datascienceathome.com
Can I get paid for my data? With Mike Audi from Mytiki (Ep. 150)
Can I get paid for my data? With Mike Audi from Mytiki (Ep. 150) Can I get paid for my data? With Mike Audi from Mytiki (Ep. 150)

April 28, 2021 podcast: Your data is worth thousands a year.

Why aren’t you getting your fair share?

There is a company that has a mission: they want you to take back control and get paid for your data.

In this episode I speak about knowledge graphs, data confidentiality and privacy with Mike Audi, CEO of MyTiki.

You can reach them on their website https://mytiki.com/ Discord official channel: https://discord.com/invite/evjYQq48Be Telegram: https://t.me/mytikiapp Signal: https://signal.group/#CjQKIA66Eq2VHecpcCd-cu-dziozMRSH3EuQdcZJNyMOYNi5EhC0coWtjWzKQ1dDKEjMqhkP

5 месяцев, 2 недели назад @ datascienceathome.com