Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
last post 1 hour ago
[R] mobilenetv2

Your request has been blocked due to a network policy.

If you're running a script or application, please register or sign in with your developer credentials here.

Additionally, make sure your User-Agent is not empty and is something unique and descriptive, then try again.

If you're supplying an alternate User-Agent string, try changing back to the default, as an alternate string can sometimes result in a block.

If you think we've incorrectly blocked you, or you would like to discuss easier ways to get the data you want, please file a ticket here.

1 hour ago @ reddit.com
[D] List of 20 AI Apps that allows nsfw

2 hours ago @ reddit.com
[D] Is my laptop good enough for AI & ML

2 hours ago @ reddit.com
[D] A slide which makes you feel old

3 hours ago @ reddit.com
[D] How can I have successful career as ML researcher? Please give me your advice

3 hours ago @ reddit.com
[P] LSTM Input shape

5 hours ago @ reddit.com
[R] Backpropagation through space, time, and the brain

9 hours ago @ reddit.com
[D] How can I become a machine learning engineer?

9 hours ago @ reddit.com
[N] Kaiming He's lecture on DL architecture for Representation Learning

11 hours ago @ reddit.com
Do you think Reinforcement Learning still got it? [D]

15 hours ago @ reddit.com
[P] TorchFix - a linter for PyTorch-using code with autofix support

17 hours ago @ reddit.com
[D] Is Google Set to Dominate the RAG Scene with Its Massive Data Resources?

18 hours ago @ reddit.com
"[D]" adapt() gives error while using Normalization Layer in Sequential Models?

19 hours ago @ reddit.com
[P] AI-based Language Teacher that can run locally on a 12GB graphics card (RTX 4070)

21 hours ago @ reddit.com
[P] End-to-end locally-running language teacher

22 hours ago @ reddit.com
Towards Data Science
last post 7 hours ago
N-of-1 Trials and Analyzing Your Own Fitness Data

Do I really sleep worse after drinking alcohol? I first heard of N-of-1 trials in 2018 as a master’s student studying epidemiology. I was in my Intermediate Epidemiologic and Clinical Research Methods class, and we had a guest lecture from Dr. Eric Daza on N-of-1 study design. The N-of-1 study can be thought of as a clinical trial investigating the efficacy of an intervention on an individual patient. At the time, this methodology was an emerging practice, with promising implications for personalized medicine and optimizing healthcare for the individual. As an aside at the end of the lecture, he mentioned that N-of-1 experiments were a hobby for some people wh…

7 hours ago @ towardsdatascience.com
You’ve Got a Time Series. Now What?

How to do exploratory data analysis of a time series. One of the most popular types of data is a time series. Videos, images, pixels, signals: literally anything with a time component can be turned into one. Formally, a time series is a sequence of historical measurements of an observable variable at equal time intervals. In this article I want to suggest a small pipeline which anyone can use when analyzing a time series. It can help you derive meaningful insights about the data itself, prepare it for modeling, and draw some preliminary conclusions. Since my favorite word is geospatial 🌏, today we will analyze a meteorological time series. In particular, we will explore 2…
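
A pipeline like the one the teaser describes (resample, rolling statistics, differencing) can be sketched in pandas. The synthetic daily-temperature series and all variable names below are my own assumptions, not the article's code:

```python
import numpy as np
import pandas as pd

# Toy meteorological series: two years of daily temperatures with a yearly cycle.
# (Synthetic stand-in for the article's data.)
rng = np.random.default_rng(0)
idx = pd.date_range("2022-01-01", periods=730, freq="D")
temps = 10 + 15 * np.sin(2 * np.pi * idx.dayofyear / 365) + rng.normal(0, 2, len(idx))
series = pd.Series(temps, index=idx, name="temp_c")

# Step 1: resample to a coarser grid to expose trend and seasonality.
monthly = series.resample("MS").mean()

# Step 2: rolling statistics; a drifting rolling mean hints at non-stationarity.
rolling_mean = series.rolling(window=30).mean()

# Step 3: first differencing, a quick way to stabilize the mean before modeling.
differenced = series.diff().dropna()
```

Each step feeds the next: plot `monthly` and `rolling_mean` to judge trend and seasonality, then model on `differenced` if the raw series drifts.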

8 hours ago @ towardsdatascience.com
Practical Computer Simulations for Product Analysts

Part 1: Task-specific approaches for scenario forecasting. In product analytics, we quite often get “what-if” questions. Our teams are constantly inventing different ways to improve the product and want to understand how they can affect our KPIs or other metrics. Let’s look at some examples. Imagine we’re in the fintech industry and facing new regulations requiring us to check more documents from customers making their first donation or sending more than $100K to a particular country. We want to understand the effect of this change on our Ops demand and whether we need to hire more agents. Let’s switch to another industry. We might want to incentivise our taxi drivers to work late or t…

12 hours ago @ towardsdatascience.com
How to Implement Knowledge Graphs and Large Language Models (LLMs) together at the Enterprise Level

A survey of the current methods of integration. Large Language Models (LLMs) and Knowledge Graphs (KGs) are different ways of providing more people access to data. KGs use semantics to connect datasets via their meaning, i.e. the entities they represent. LLMs use vectors and deep neural networks to predict natural language. Both are often aimed at ‘unlocking’ data. For enterprises implementing KGs, the end goal is usually something like a data marketplace, a semantic layer, FAIR-ifying their data, or making the enterprise more data-centric. These are all differen…

12 hours ago @ towardsdatascience.com
The Business Guide to Tailoring Language AI Part 2

Prompting ChatGPT and other chat-based language AI, and why you should (not) care about it. Foreword: This article sheds some light on how to “talk” to Large Language Models (LLMs) that are designed to interact in conversational ways, like ChatGPT, Claude, and others, so that the answers you get from them are as useful as possible for the task at hand. This communication from human to language chatbot is what’s typically referred to as prompting. With this article, I mean to give people with no computer science background a compact overview of the topic so that everyone can understand it. It can also help businesses contextualize what (not) to expect from their …

12 hours ago @ towardsdatascience.com
Pandas: My Experience Contributing to a Major Open Source Project

It just might be worth you contributing too.

12 hours ago @ towardsdatascience.com
Information Rationalization in Large Organizations

How can we use clustering techniques to combine and refactor a large number of disparate dashboards? Background: Organizations generate voluminous amounts of data on a daily basis. Dashboards are built to analyze this data, derive meaningful business insights, and track KPIs. Over time, we find ourselves with hundreds (or more) of such dashboards. Oftentimes, this phenomenon of Dashboard Proliferation is driven by multiple groups developing their own analyses in silos, without visibility into analytics that may already be available. For example, sales teams may create a dashboard to track their forecasts vs. actual sales, not knowing that a fore…

13 hours ago @ towardsdatascience.com
Calculating the previous value in Power BI

Calculating consumption from meter data looks easy. However, complex situations can be challenging. Let’s see how we can solve…
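
The article works in Power BI, but the underlying "previous value" logic can be sketched in Python. The column names and readings below are hypothetical; the reset handling is one common convention, not necessarily the article's:

```python
import pandas as pd

# Hypothetical cumulative meter readings (kWh); all names are illustrative.
readings = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01", "2024-04-01"]),
    "meter": [100.0, 130.0, 155.0, 20.0],  # the meter resets before April
}).sort_values("date")

# Consumption = current reading minus the previous one (the "previous value").
readings["prev"] = readings["meter"].shift(1)
readings["consumption"] = readings["meter"] - readings["prev"]

# A reset shows up as a negative delta; fall back to the raw reading there.
readings.loc[readings["consumption"] < 0, "consumption"] = readings["meter"]
```

The complexity the teaser hints at lives in cases like this reset row, where a naive difference would go negative.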

16 hours ago @ towardsdatascience.com
The Future of Robotic Assembly

Since the introduction of mass production in 1913, assembly lines have remained mostly human; humanoids might change this.

18 hours ago @ towardsdatascience.com
Fine-tune Llama 3 with ORPO

A cheaper and faster unified fine-tuning technique. ORPO is an exciting new fine-tuning technique that combines the traditional supervised fine-tuning and preference alignment stages into a single process, reducing the computational resources and time required for training. Moreover, empirical results demonstrate that ORPO outperforms other alignment methods across various model sizes and benchmarks. In this article, we will fine-tune the new Llama 3 8B model using ORPO with the TRL library. The code is available on Google Colab and in the LLM Course on GitHub. ⚖️ ORPO: Instruction tuning and preference alignment are essential techniques for adapting Large La…
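
As a conceptual sketch (not the TRL implementation), the ORPO objective from the original paper adds a log-odds-ratio penalty to the usual SFT loss, which is how the two stages collapse into one. The scalar probabilities below are hypothetical stand-ins for a response's average token likelihood:

```python
import math

def odds(p: float) -> float:
    # odds(p) = p / (1 - p); p stands in for a response's average token likelihood
    return p / (1.0 - p)

def orpo_penalty(p_chosen: float, p_rejected: float) -> float:
    # L_OR = -log sigmoid(log(odds(p_chosen) / odds(p_rejected)))
    log_odds_ratio = math.log(odds(p_chosen) / odds(p_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_odds_ratio)))

def orpo_loss(nll_chosen: float, p_chosen: float, p_rejected: float, lam: float = 0.1) -> float:
    # Single objective: the usual SFT loss on the chosen response plus a
    # weighted odds-ratio penalty, so alignment happens during fine-tuning.
    return nll_chosen + lam * orpo_penalty(p_chosen, p_rejected)
```

When the model already prefers the chosen response (`p_chosen > p_rejected`), the penalty shrinks toward zero; `lam` balances the two terms.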

21 hours ago @ towardsdatascience.com
Visualizing my data science job search

Reflections from a humbling journey trying to find a job in 2023. 2023 was a turbulent year for many job seekers. At least for me, it felt like quite a journey. Over the 11 months between January and November, I had 107 career-related conversations and applied to 80 positions, resulting in 2 offers (I took a “break” in May to defend my PhD 😅). Some applications were stretches, sure, but many of them were posts to which I thought I’d be a great match — or at least worth a conversation! The frequency with which my efforts were met with silence surprised me. I felt like most of my cover letters were quite earnest and my resume had unique experiences, prompting me to second-guess my communication …

21 hours ago @ towardsdatascience.com
3 Best Practices for Bridging the Gap Between Engineers and Analysts

Assigning code owners, hiring analytics engineers, and creating flywheels.

21 hours ago @ towardsdatascience.com
7 Subscriptions That Help Me As A Data Scientist

Subscriptions that boost my productivity, knowledge, and focus as a practicing data scientist.

21 hours ago @ towardsdatascience.com
Merging tokens to accelerate LLM inference with SLERP

We can significantly accelerate LLM next-token generation by merging consecutive pairs of tokens using SLERP, reducing the computing power needed to perform the full prediction. TL;DR: This article presents a novel approach to accelerating Large Language Model (LLM) inference by merging tokens using Spherical Linear Interpolation (SLERP). By reducing the sequence length while maintaining quality, this technique offers significant speed-ups in LLM inference, addressing the computational challenges posed by longer sequences. The method is still raw, but it points to a dual setup for LLMs: one configuration for training and one for predicting. Background: LLMs have rev…
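
To make the pairing idea concrete, here is a small NumPy sketch that merges consecutive hidden states with SLERP, roughly halving the sequence length. The function names and pairing scheme are my own illustration, not the article's exact method:

```python
import numpy as np

def slerp(u: np.ndarray, v: np.ndarray, t: float = 0.5) -> np.ndarray:
    # Spherical linear interpolation between two vectors.
    u_n, v_n = u / np.linalg.norm(u), v / np.linalg.norm(v)
    theta = np.arccos(np.clip(np.dot(u_n, v_n), -1.0, 1.0))
    if np.isclose(theta, 0.0):  # (nearly) parallel vectors: plain lerp
        return (1 - t) * u + t * v
    return (np.sin((1 - t) * theta) * u + np.sin(t * theta) * v) / np.sin(theta)

def merge_pairs(hidden: np.ndarray) -> np.ndarray:
    # Merge consecutive token states pairwise, roughly halving sequence length.
    merged = [slerp(hidden[i], hidden[i + 1]) for i in range(0, len(hidden) - 1, 2)]
    if len(hidden) % 2:  # odd length: keep the trailing token as-is
        merged.append(hidden[-1])
    return np.stack(merged)
```

Unlike plain averaging, SLERP follows the arc between the two directions, which is the property that motivates using it on embedding-like vectors.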

21 hours ago @ towardsdatascience.com
Structured Generative AI

How to constrain your model to output defined formats. In this post I will explain and demonstrate the concept of “structured generative AI”: generative AI constrained to defined formats. By the end of the post, you will understand where and when it can be used and how to implement it, whether you’re crafting a transformer model from scratch or utilizing Hugging Face’s models. Additionally, we will cover an important tip for tokenization that is especially relevant for structured languages. One of the many uses of generative AI is as a translation tool. This often involves translating between two human languages but can also include computer languages or formats. For example, your application m…
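
The core mechanism behind constrained generation can be shown in a toy sketch: at each decoding step, mask the model's scores down to the tokens the format allows. The vocabulary, grammar, and scores below are invented for illustration:

```python
import math

# Invented toy vocabulary and model scores, for illustration only.
VOCAB = ["{", "}", '"name"', ":", '"Ann"', "hello"]

def allowed_next(prefix: list) -> set:
    # A tiny "grammar" for the fixed format {"name":"Ann"}: each decoding
    # state permits exactly one token; a real grammar would permit sets.
    order = ["{", '"name"', ":", '"Ann"', "}"]
    return {order[len(prefix)]} if len(prefix) < len(order) else set()

def constrained_pick(logits: dict, prefix: list) -> str:
    # Mask out every token the format forbids, then take the argmax.
    legal = allowed_next(prefix)
    masked = {t: (s if t in legal else -math.inf) for t, s in logits.items()}
    return max(masked, key=masked.get)

prefix = []
while allowed_next(prefix):
    scores = {t: (1.0 if t == "hello" else 0.0) for t in VOCAB}  # model "wants" hello
    prefix.append(constrained_pick(scores, prefix))

print("".join(prefix))  # -> {"name":"Ann"}
```

Even though the model scores "hello" highest at every step, the mask forces a string that is valid under the format.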

1 day, 12 hours ago @ towardsdatascience.com
Distill.pub
last post: none
The Gradient
last post 1 week, 4 days ago
A Brief Overview of Gender Bias in AI

All of these terms (“AI”, “gender”, and “bias”) can be somewhat overused and ambiguous.

A Short History of Studying Gender Bias in AI: Here, I cover a very small sample of papers I’ve found influential in the study of gender bias in AI.

Pronoun resolution systems (finding all entities in a text that a pronoun is referring to) exhibit gender bias, tending to resolve pronouns of one gender over another for certain occupations (e.g.

This article mainly focused on gender bias — and particularly, on binary gender.

Acknowledgements: This post was originally posted on Art Fish Intelligence. Citation: For attribution in academic contexts or books, please cite this work as Yennie Jun, "Gender Bias in AI," The Gradient, 2024. @article{Jun2024bia…

1 week, 4 days ago @ thegradient.pub
Mamba Explained

As a general sequence model backbone, Mamba achieves state-of-the-art performance across several modalities such as language, audio, and genomics.

Here we’ll discuss: the advantages (and disadvantages) of Mamba (🐍) vs Transformers (🤖); analogies and intuitions for thinking about Mamba; and what Mamba means for interpretability, AI safety, and applications.

The Mamba Block: Like a Transformer made up of stacked transformer blocks, Mamba is made up of stacked Mamba blocks, as above.

The Mamba authors write, “the efficiency vs. effectiveness tradeoff of sequence models is characterised by how well they compress their state”.

Thanks to Gonçalo for reading an early draft, Jaden for the nnsight library …

3 weeks, 2 days ago @ thegradient.pub
Car-GPT: Could LLMs finally make self-driving cars happen?

We've just seen 3 prominent families of LLM usage in self-driving cars: Perception, Planning, and Generation.

The first wave of papers mentioning LLMs in self-driving cars is from mid-2023, so let's give it some time.

Next Steps: If you want to get started on LLMs for self-driving cars, there are several things you can do. ⚠️ Before this, the most important: if you want to keep learning about self-driving cars.

Author Bio: Jérémy Cohen is a self-driving car engineer and founder of Think Autonomous, a platform to help engineers learn about cutting-edge technologies such as self-driving cars and advanced Computer Vision.

Citation: For attribution in academic contexts or books, please cite this work…

1 month, 1 week ago @ thegradient.pub
Do text embeddings perfectly encode text?

Beyond the requirements of semantic similarity, there are no constraints on what embedding must be assigned for a given text input.

What if someone hacks into the database and gains access to all your text embedding vectors – would this be bad?

From text to embeddings... and back to text: The problem of recovering text from embeddings is exactly the scenario we tackle in our paper Text Embeddings Reveal (Almost) As Much As Text (EMNLP 2023).

Scaling and future work: The fact that text embeddings can be perfectly inverted raises many follow-up questions.

Citation: For attribution in academic contexts or books, please cite this work as Jack Morris, "Do text embeddings perfectly encode text?

1 month, 2 weeks ago @ thegradient.pub
Why Doesn’t My Model Work?

If your model latches on to these during training, it will appear to work well, but may not work on new data.

This happens when the model training pipeline has access to information it shouldn’t have, particularly information that confers an advantage to the model.

This means that knowledge of the test data is implicitly entering the model training pipeline, even if it is not explicitly used to train the model.

Well, this is a common thing to do, but if you’re developing a model iteratively and using the same test set to evaluate the model after each iteration, then you’re basically using that test set to guide the development of the model.

Citation: For attribution in academic cont…

1 month, 3 weeks ago @ thegradient.pub
Deep learning for single-cell sequencing: a microscope to see the diversity of cells

Evolution of single-cell sequencing over time: Having explored the panorama of single-cell sequencing, let us now delve into the role of deep learning in this context.

Deep Learning on single-cell sequencing: Deep learning is increasingly employed in single-cell analysis due to its capacity to handle the complexity of single-cell sequencing data.

The deep learning approach, however, autonomously captures relevant characteristics from single-cell sequencing data, addressing the heterogeneity between single-cell sequencing experiments, as well as the associated noise and sparsity in such data.

As we explore the reasons behind using deep learning in single-cell sequencing …

3 months, 1 week ago @ thegradient.pub
Salmon in the Loop

In order to obtain a license or permit from FERC, hydroelectric dam operators must submit detailed plans and studies demonstrating that their facility meets regulations.

Typically, a hydroelectric dam requires lots of space to store water on one side of it, which means they tend to be located away from population centers.

Enter Computer Vision: Some organizations are exploring the use of computer vision and machine learning to significantly automate fish counting.

The annotated images are then used to train a machine learning model.

Citation: For attribution in academic contexts or books, please cite this work as: Kevin McCraney, "Salmon in the Loop", The Gradient, 2023.

4 months ago @ thegradient.pub
Neural algorithmic reasoning

In recent work with computer networking and machine learning collaborators from ETH Zürich, we studied the applicability of neural algorithmic reasoning in computer networking [27].

Our proposal, the neural algorithmic reasoning blueprint [32], aims to bridge this divide by neuralising the target algorithm.

Neural Algorithmic Reasoning with Causal Regularisation.

Neural Algorithmic Reasoning.

Citation: For attribution in academic contexts or books, please cite this work as Petar Veličković, "Neural Algorithmic Reasoning", The Gradient, 2023.

6 months, 1 week ago @ thegradient.pub
The Artificiality of Alignment

This community has developed an extensive vocabulary around theories of AI safety and alignment, many first introduced as detailed blog posts in forums like LessWrong and the AI Alignment Forum.

One such idea that is useful for contextualizing technical alignment work — and is perhaps the more formal version of what Bostrom was referring to — is the concept of intent alignment.

Anthropic’s product marketing pages are plastered with notes and phrases about their alignment work — “HHH” is also Claude's biggest selling point.

The site uses the phrasing “AI Safety” instead of “AI Alignment” in the title, but the article itself proceeds to use “safety” and “alignment” interchangeably without differen…

6 months, 2 weeks ago @ thegradient.pub
An Introduction to the Problems of AI Consciousness

This brief introduction is aimed at those working within the AI community who are interested in AI consciousness, but may not know much about the philosophical and scientific work behind consciousness generally or the topic of AI consciousness in particular.

AI and Consciousness: Two Problems for AI Consciousness. Let’s return to the topic of AI consciousness.

The problem of AI consciousness may seem less difficult than the hard problem: the problem of AI consciousness only asks if silicon could support consciousness, but it does not ask for an explanation of why silicon can or cannot, as the hard problem does.

The second test, proposed by Schneider and Edwin Turner [27], call…

6 months, 3 weeks ago @ thegradient.pub
Text-to-CAD: Risks and Opportunities

We’ve identified three key areas where such programs can level up: dataset curation, a pattern language for usability, and filtering.

The format for a friction hinge pattern might look like this. Pattern Name: Friction Hinge. Pattern Description: The Friction Hinge pattern addresses the need for adjustable friction in hinges so as to provide tuneable resistance, but without compromising smooth movement.

Alas, once text-to-CAD models get open-sourced or leaked, many of these queries will be satisfied without compunction.

In conclusion, the emergence of AI-powered text-to-CAD generation presents both risks and opportunities, the ratio of which is still very much undecided.

Citation: For attribu…

7 months, 1 week ago @ thegradient.pub
Interpretability Creationism

In this piece, I will discuss the tendency towards “interpretability creationism” – interpretability methods that only look at the final state of the model and ignore its evolution over the course of training – and propose a focus on the training process to supplement interpretability research.

In the case of language models, they behave similarly to n-gram models early on and exhibit linguistic patterns later.

Let us consider an explanation usually based on analyzing static models: hierarchical behavior in language models.

An Example: I recently had to manage the trap of interpretability creationism myself.

Citation: For attribution in academic contexts or books, please cite this work as Naomi …

9 months, 1 week ago @ thegradient.pub
What Do LLMs Know About Linguistics? It Depends on How You Ask

Understanding what LLMs learn about linguistics from large-scale pretraining is a good framework for understanding how they work in general.

How we extend in-context learning to structured prediction tasks with structured prompting (left), and an example of using structured prompting with an LLM to annotate parts of speech (right).

Overall, structured prompting can consistently generate the linguistic structure underlying text from LLMs, confirming that these models have implicitly learned these structures during pretraining.

This leads us to ask: what factors of structured prompting allow LLMs to annotate linguistic structure?

By searching for labels instead of full examples, we can find ta…

9 months, 2 weeks ago @ thegradient.pub
Why transformative artificial intelligence is really, really hard to achieve

Will AI, as I. J. Good speculated, one day self-improve, cause an intelligence explosion, and lead to an economic growth singularity?

In this essay we assemble the best arguments that we have encountered for why transformative AI is hard to achieve.

AI progress may even be accelerating the decoupling of the US and China, reducing the flow of people and ideas.

Automation alone is not enough for transformative economic growth.

If transformative AI were coming soon, real interest rates would rise in line with expectations of great future wealth or risk.

9 months, 4 weeks ago @ thegradient.pub
TheSequence
last post 2 days, 1 hour ago
Edge 388: Google DeepMind's SIMA can Follow Language Instructions in 3D Games Just Like Humans

Created Using Ideogram. Video games have long served as some of the best environments for training AI agents.

However, most of the AI breakthroughs in 3D game environments have been constrained to one or a small number of games.

The goal of the project was to develop instructable agents that can interact with any 3D environment just like a human by following simple language instructions.

Language is the most powerful and yet simple abstraction for communicating instructions about the world or, in this case, a 3D virtual world.

The magic of SIMA is its ability to translate those abstract instructions into mouse and keyboard actions used to navigate an environment.

2 days, 1 hour ago @ thesequence.substack.com
Edge 387: Tool Learning in Autonomous Agents

Created Using Ideogram. In this issue: tool learning in autonomous agents.

💡 ML Concept of the Day: Tool Learning in Autonomous Agents. One of the key differentiators between agents and models is the capability of the former to take actions in a given environment.

Part of that action execution typically involves interactions with different systems or tools.

From this perspective, tool learning has become one of the most important building blocks of autonomous agents.

When it comes to tool learning in autonomous agents, we should identify two main groups of interactions:
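Whatever taxonomy of interactions one adopts, the basic tool-use loop looks roughly like this. The tool names and the selection rule are illustrative assumptions, with a trivial heuristic standing in for the LLM's tool-choice step:

```python
# Minimal sketch of an agent's tool-use loop: a registry of tools, a decision
# step (a heuristic standing in for an LLM policy), execution, and folding
# the result into the reply. All names are illustrative.

TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "search": lambda q: f"[stub results for {q!r}]",
}

def agent(request):
    # A real agent would let the LLM pick the tool and its arguments.
    if any(ch.isdigit() for ch in request):
        return f"calculator -> {TOOLS['calculator'](request)}"
    return f"search -> {TOOLS['search'](request)}"

print(agent("2 + 3 * 4"))       # -> calculator -> 14
print(agent("latest GTC news")) # routed to the search stub
```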

4 days ago @ thesequence.substack.com
Neuro-Symbolic Models are Making a Comeback

You can subscribe to The Sequence below. 📝 Editorial: Neuro-Symbolic Models are Making a Comeback. Large language models (LLMs) have dominated the AI narrative in recent years, to the point that we almost have to wonder about the future of other areas of machine learning.

One of the most interesting options to address these limitations comes from a pretty old ML school: neuro-symbolic models.

As the name indicates, neuro-symbolic models combine neural networks, such as LLMs, with smaller, easier-to-interpret symbolic models to adapt LLMs to specific domains.

Neuro-symbolic architectures are making a comeback as one of the possible architectures to play a role in this movement.

Neuro-symbolic m…
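A minimal sketch of the neuro-symbolic pattern just described: a neural component proposes, a symbolic rule layer disposes. The dosage rule, the names, and the naive parser standing in for an LLM are all hypothetical.

```python
# Toy neuro-symbolic pattern: a "neural" extractor proposes a value and a
# symbolic rule layer accepts or rejects it against domain constraints.

def neural_extract(text):
    # Stand-in for an LLM: naively return the first number in the text.
    for token in text.split():
        if token.replace(".", "", 1).isdigit():
            return float(token)
    return None

RULES = {"dosage_mg": lambda x: x is not None and 0 < x <= 1000}

def extract_dosage(text):
    value = neural_extract(text)
    # Symbolic layer: reject anything the domain constraints rule out.
    return value if RULES["dosage_mg"](value) else None

print(extract_dosage("take 200 mg twice daily"))  # -> 200.0
print(extract_dosage("take 99999 mg"))            # -> None
```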

6 days, 1 hour ago @ thesequence.substack.com
Edge 386: Inside Yi, 01's Model Leading the Chinese LLM Movement

Created Using DALL-E. The Chinese ecosystem around foundation models has been on fire recently.

One of the most ambitious foundation model efforts in China comes from 01, the startup founded by former Microsoft and Google researcher Kai-Fu Lee.

01’s first iteration came in the form of the Yi models.

A few days ago, 01 published a technical report about the Yi models and we thought it would be interesting to share some details.

The Yi series models stand out for their bilingual capabilities.

1 week, 2 days ago @ thesequence.substack.com
Edge 385: The Two Big Schools for Building Autonomous Agents

Created Using Ideogram. In this issue: building LLM-based vs. computer-vision-based autonomous agents.

💡 ML Concept of the Day: The Two Big Schools for Building Autonomous Agents. In our series about autonomous agents, today we would like to explore the two fundamental schools used to implement these AI systems.

Typically, we associate autonomous agents with LLMs, but competing techniques fundamentally based on computer vision models have been gaining quite a bit of traction.

CV-based autonomous agents fundamentally focus on recording actions on a user's computer and replicating those actions with models that understand pixel-level positions and mouse actions.

In general…

1 week, 4 days ago @ thesequence.substack.com
Generative Audio Models Just Had a Great Week

You can subscribe to The Sequence below. 📝 Editorial: Generative Audio Models Just Had a Great Week. Audio is rapidly becoming one of the most important frontiers in generative AI, a field that is advancing swiftly.

Technically, generative audio poses a fundamentally simpler problem than video or 3D, which leads to faster iterations in research and implementation.

The pace of innovation in generative audio is accelerating at remarkable levels.

On top of these innovations, add more established players such as Eleven Labs, which has been pushing the boundaries of generative audio for years.

OpenAI Custom Models: OpenAI announced enhancements to its training API as well as new mechanisms for …

1 week, 6 days ago @ thesequence.substack.com
📝 Guest Post: The EU AI Act – A Guide for Developers*

This year the EU AI Act came into law and has worried a lot of developers.

Anyone who is selling an AI product within the EU has to comply with the EU AI Act, even if they are not based in the EU.

Minimal Risk AI Applications: This category covers the majority of AI applications such as AI in games, spam filters and recommendation engines.

The EU AI Act creates a separate category for what it considers "General Purpose AI" systems.

The worst elements of early drafts have mostly been stripped from the EU AI Act.

2 weeks, 1 day ago @ thesequence.substack.com
Edge 384: Inside Genie: Google DeepMind's Astonishing Model that can Build 2D Games from Text and Images

Created Using Ideogram. The pace of research in generative AI is nothing short of remarkable.

Even so, from time to time, there are papers that challenge our imagination about how far the generative AI space can go.

A few weeks ago, Google DeepMind published some of that work with the release of Genie, a model that can generate interactive game environments from text and images.

This is the vision that Google DeepMind has turned into reality with Genie, a groundbreaking approach to generative AI.

Genie can craft interactive environments from a mere text or image prompt, thanks to its training on over 200,000 hours of publicly available gaming videos from the Internet.

2 weeks, 2 days ago @ thesequence.substack.com
Edge 383: The Key Capabilities of Autonomous Agents

Created Using Ideogram. In this issue: a review of the key capabilities of autonomous agents.

An introduction to the Crew AI framework for building autonomous agents.

Different frameworks have outlined various feature sets for autonomous agents.

However, there are a few building blocks that can be consistently considered as the strong foundation of any autonomous agent application.

Most capabilities of autonomous agents can be considered variations of the following key areas:

2 weeks, 4 days ago @ thesequence.substack.com
Four New Major Open Source Foundation Models in a Week

You can subscribe to The Sequence using the link below. 📝 Editorial: Four New Major Open Source Foundation Models in a Week. Open source generative AI is experiencing tremendous momentum, and last week was a major example of this with the release of four major foundation models.

By open source, we refer to the weights of the models and not the training datasets or processes.

At this time, it's fair to say that the model weights are where most companies draw the line between open source and closed source.

Last week, we witnessed the release of four major open source models, each innovative in its own way. DBRX: Databricks released DBRX, a new model based on a mixture-of-experts architecture.

Thi…

2 weeks, 6 days ago @ thesequence.substack.com
Edge 381: Google DeepMind's Promptbreeder Self-Improves Prompts

Created Using DALL-E. Reasoning and prompt evolution/optimization are being recognized as the next significant frontier for large language models (LLMs).

Among the various strategies employed to enhance the reasoning capabilities of LLMs, one of the prominent ones is Chain-of-Thought Prompting, often hailed for its effectiveness.

However, it’s worth noting that manually crafted prompt strategies tend to fall short of optimal performance.

Recently, researchers from Google DeepMind unveiled Promptbreeder, a self-improving prompting algorithm that uses evolutionary techniques to arrive at the best prompts for a given task.

Promptbreeder addresses some of the limitations of CoT with a simple and sup…
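The evolutionary idea can be sketched in a few lines: maintain a population of prompts, mutate, score, select. This toy omits Promptbreeder's key twist of also evolving the mutation prompts themselves, and the fitness function is a stub rather than a real task score.

```python
import random

# Toy sketch of evolutionary prompt optimization: keep a population of
# prompts, mutate them, keep the fittest. Mutations and fitness here are
# illustrative stand-ins, not Promptbreeder's actual operators.

random.seed(0)

MUTATIONS = ["Think step by step. ", "Be concise. ", "Check your work. "]

def mutate(prompt):
    return random.choice(MUTATIONS) + prompt

def fitness(prompt):
    return prompt.count("step") + 0.01 * len(prompt)   # stub task score

def evolve(seed_prompt, generations=10, pop_size=6):
    population = [seed_prompt]
    for _ in range(generations):
        children = [mutate(random.choice(population)) for _ in range(pop_size)]
        population = sorted(population + children, key=fitness,
                            reverse=True)[:pop_size]
    return population[0]

best = evolve("Solve the problem:")
print(fitness(best) > fitness("Solve the problem:"))  # -> True
```

Because mutations only prepend, the seed task always survives inside the winning prompt; selection simply accumulates the cues the stub fitness rewards.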

3 weeks, 2 days ago @ thesequence.substack.com
Edge 380: A New Series About Autonomous Agents

Created Using DALL-E. In this issue: an introduction to our series about autonomous agents.

💡 ML Concept of the Day: A New Series About Autonomous Agents. Today, we start one of our most ambitious series at The Sequence by diving into the world of autonomous agents.

What makes this series ambitious is that autonomous agents remain a relatively nascent and unsolved problem in AI.

The rapid growth of LLMs has certainly accelerated the viability of autonomous agents, but the space remains at a very early stage.

What is an autonomous agent?

3 weeks, 4 days ago @ thesequence.substack.com
📝 Guest Post: Zilliz Unveiled Milvus 2.4 at GTC 24, Transforming Vector Databases with GPU Acceleration*

High throughput ensures that vector databases can handle a large volume of incoming queries concurrently, delivering low-latency responses to end-users or services.

At the heart of vector databases lies a core set of vector operations, such as similarity calculations and matrix operations, which are highly parallelizable and computationally intensive.

To leverage the benefits of GPU acceleration, CAGRA is integrated into Milvus’ Index and Query Nodes.
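At its core, the workload being accelerated is massively parallel similarity search. A brute-force CPU sketch of that operation (illustrative code, not Milvus or CAGRA internals) looks like this:

```python
import math

# Brute-force cosine-similarity search -- the core, highly parallelizable
# operation that GPU indexes accelerate. Data and names are illustrative.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return dot / (norm(a) * norm(b))

def top_k(query, vectors, k=2):
    # Score every stored vector against the query, highest similarity first.
    ranked = sorted(vectors, key=lambda name: cosine(query, vectors[name]),
                    reverse=True)
    return ranked[:k]

db = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
print(top_k([1.0, 0.05], db))  # -> ['a', 'b']
```

Every query-vector pair is independent, which is exactly why this arithmetic maps so well onto GPU parallelism; approximate indexes like CAGRA avoid scoring the full set at all.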

Blazing New Trails: The integration of NVIDIA's CAGRA GPU acceleration framework into Milvus 2.4 represents a groundbreaking achievement in vector databases.

The unveiling of Milvus 2.4, a collaboration between Zilliz and NVIDIA, exemplifies the…

3 weeks, 4 days ago @ thesequence.substack.com
NVIDIA's GTC in Four Headlines

You can subscribe below. 📝 Editorial: NVIDIA's GTC in Four Headlines. I tried to resist making this weekend's editorial about NVIDIA because I think you might have been inundated with headlines from the GTC conference.

If I had to summarize two key takeaways from NVIDIA's AI announcements this week, they would be these: NVIDIA is not only outgrowing but also out-innovating everyone else in AI compute hardware, by a large margin.

Project GR00T: I think the coolest and most ambitious announcement was Project GR00T, which focuses on developing foundation models for humanoid robots.

NVIDIA's AI hardware dominance is unquestionable, but it's quickly making inroads in the software space.

The techniqu…

3 weeks, 5 days ago @ thesequence.substack.com
📌 Exciting lineup for apply() 2024 is now live

The agenda for apply() 2024, Tecton’s premier virtual conference dedicated to mastering AI and ML at production scale, is now live!

Join us on Wednesday, April 3, for a day packed with enlightening sessions and networking opportunities alongside industry leaders and fellow engineers.

REGISTER TODAY. Here are some agenda highlights: Is RAG Really Dead?

Semi-Supervised Learning: How to Overcome the Lack of Labels. Speaker: Aleksandr Timashov, ML Engineer at Meta. Aleksandr will delve into semi-supervised learning, offering insights into leveraging labeled and unlabeled data efficiently for model training.

SEE FULL AGENDA

4 weeks, 1 day ago @ thesequence.substack.com
Synced Review
last post 1 day, 11 hours ago
87% ImageNet Accuracy, 3.8ms Latency: Google's MobileNetV4 Redefines On-Device Mobile Vision

Efficient on-device neural networks offer rapid, real-time, and interactive experiences while safeguarding private data from public internet exposure. Yet, the computational limitations of mobile devices present a formidable challenge in maintaining a delicate balance between accuracy and efficiency.

Addressing this challenge head-on, a recent paper titled “MobileNetV4 — Universal Models for the Mobile Ecosystem,” penned by a Google research team, unveils the latest iteration of MobileNets: MobileNetV4 (MNv4). This cutting-edge model boasts an impressive 87% ImageNet-1K accuracy, coupled with an astonishingly low Pixel 8 EdgeTPU runtime of merely 3.8ms.

At the heart of this breakthrough lies …

1 day, 11 hours ago @ medium.com
Unveiling the Black Box: Meta's LM Transparency Tool Deciphers Transformer Language Models

Transformer-based language models have emerged as powerful tools across various tasks, underlining their significance in critical contexts. Understanding the inner workings of these models is paramount for ensuring their safety, reliability, and trustworthiness, given their widespread adoption.

In a new paper LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models, a research team from Meta, University College London and Universitat Politècnica de Catalunya introduces the LM Transparency Tool (LM-TT), an open-source interactive toolkit designed for dissecting Transformer-based language models.

Existing analysis tools often focus on isolated aspects of decision-making …

3 days, 15 hours ago @ medium.com
OPPO AI's Transformer-Lite Delivers 10x+ Prefill and 2~3x Decoding Boost on Mobile Phone GPUs

The Large Language Model (LLM) has showcased remarkable efficacy across various real-world applications, including intelligent assistants, text summarization, translation, and multi-modality tasks on mobile devices. Nonetheless, the current methodologies for on-device deployment of LLMs are hampered by sluggish inference speeds, leading to subpar user experiences.

In a new paper Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs, researchers from OPPO AI Center have introduced a solution. They present four optimization techniques and introduce a novel mobile inference engine dubbed Transformer-Lite. This engine outperforms CPU-based FastLLM and GPU-bas…

4 days, 12 hours ago @ medium.com
Revolutionizing Video Understanding: Real-Time Captioning for Any Length with Google's Streaming…

Revolutionizing Video Understanding: Real-Time Captioning for Any Length with Google's Streaming Model

The exponential growth of online video platforms has led to a surge in video content, thereby heightening the need for advanced video comprehension. However, existing computer vision models tailored for video understanding often fall short, typically analyzing only a limited number of frames, often spanning mere seconds, and categorizing these brief segments into predefined concepts.

To address the abovementioned challenge, in a new paper Streaming Dense Video Captioning, a Google research team proposes a streaming dense video captioning model, which revolutionizes dense video captioning…

1 week, 2 days ago @ medium.com
AURORA-M: A Global Symphony of Innovation as 33 Prestigious Institutions Unify for Open-Source…

AURORA-M: A Global Symphony of Innovation as 33 Prestigious Institutions Unify for Open-Source Multilingual Mastery

Large Language Models (LLMs) have revolutionized various applications, including machine translation, text summarization, dialogue systems, and code generation. Yet, the hefty computational requirements for pretraining these models pose significant barriers to broader accessibility and development.

To address these challenges, recent open-source initiatives like BLOOM, StarCoder, and StarCoder-2 have emerged, aiming to democratize access to pretrained LLMs. However, these models encounter limitations such as restricted multilingual capabilities, computational intensity, and the …

1 week, 4 days ago @ medium.com
Huawei & Peking U's DiJiang: A Transformer Achieving LLaMA2–7B Performance at 1/50th the Training…

Huawei & Peking U's DiJiang: A Transformer Achieving LLaMA2–7B Performance at 1/50th the Training Cost

The Transformer architecture has emerged as a pivotal tool in numerous domains, excelling particularly in tasks like speech recognition, machine translation, and document summarization. Yet, its efficacy often hinges on expanding the model's size to tackle increasingly intricate challenges, thereby imposing substantial computational burdens.

In the pursuit of alleviating the computational strain associated with Transformers, the exploration of linear attention mechanisms has garnered notable traction. Nonetheless, enhancing these mechanisms typically entails extensive retraining, a prohibiti…

2 weeks, 2 days ago @ medium.com
KCL Leverages Topos Theory to Decode Transformer Architectures

The transformer architecture has emerged as the predominant framework for deep learning, playing a pivotal role in the remarkable achievements of large language models like ChatGPT. Despite its widespread adoption, the theoretical underpinnings of its success remain largely uncharted territory.

In a new paper The Topos of Transformer Networks, a King's College London research team delves into a theoretical exploration of the transformer architecture, employing the lens of topos theory. This innovative approach conjectures that the factorization through “choose” and “eval” morphisms can yield effective neural network architecture designs.

The primary objective of this paper is to offer a categ…

2 weeks, 5 days ago @ medium.com
Robotic Marvels: Conquering San Francisco's Streets Through Next Token Prediction

In recent years, there has been a remarkable surge in the effectiveness of large transformer models trained through generative modeling on extensive language datasets sourced from the Internet. These models have demonstrated impressive capabilities across diverse domains. By predicting subsequent words, these models glean intricate understandings of language, which can then be applied to various tasks through multi-task learning and efficient few-shot learning techniques.

This success has led researchers to ponder: can we replicate this approach to develop robust models for sensory and motor representation? While there have been encouraging signs of progress in learning sensorimotor represen…

3 weeks ago @ medium.com
First Model-Stealing Attack Reveals Secrets of Black-Box Production Language Models

Despite the remarkable capabilities of contemporary large language models like GPT-4, Claude 2, or Gemini, the inner mechanisms governing their operations remain shrouded in mystery. While the intricate details of these models are kept under wraps, they are made accessible through APIs, prompting the question: How much insight can adversaries glean about these models by interacting with their APIs?

To answer this question, in a new paper Stealing Part of a Production Language Model, a research team from Google DeepMind, ETH Zurich, University of Washington, OpenAI and McGill University introduces a groundbreaking model-stealing attack. This attack unveils precise, nontrivial information from…

3 weeks, 2 days ago @ medium.com
DeepMind & UBC's Genie: A Revolutionary Leap in Generative AI for Interactive Virtual Worlds

In recent years, there has been a remarkable surge in the development of generative AI, showcasing the potential of models to create novel and imaginative content. While strides have been made in various domains, the realm of video generation stands as a promising frontier.

Recent advancements suggest that scaling up these models could further enhance their capabilities. However, there remains a significant gap between the interactive engagement offered by video generative models and the rich interactions facilitated by language tools like ChatGPT, not to mention more immersive experiences.

In response to this challenge, in a new paper Genie: Generative Interactive Environments, a research te…

3 weeks, 4 days ago @ medium.com
ByteDance's AnimateDiff-Lightning Shines in State-of-the-Art Video Creation at Lightning Speed

In recent times, video generative models have emerged as a focal point of attention, unlocking a realm of fresh creative opportunities. Despite this, the speed of these models remains a significant obstacle to their broader adoption. State-of-the-art generative models, while impressive in their capabilities, are hampered by sluggishness and computational demands due to their iterative diffusion processes.

To address this issue, in a new paper AnimateDiff-Lightning: Cross-Model Diffusion Distillation, a ByteDance research team presents AnimateDiff-Lightning, a novel approach that utilizes progressive adversarial diffusion distillation, catapulting video generation into a realm of lightning…

1 month ago @ medium.com
Stanford's VideoAgent Achieves New SOTA of Long-Form Video Understanding via Agent-Based System

Understanding long-form videos presents a formidable challenge within the realm of computer vision. This undertaking requires a model adept at processing multi-modal data, managing extensive sequences, and effectively reasoning over these sequences.

In response to this challenge, in a new paper VideoAgent: Long-form Video Understanding with Large Language Model as Agent, a Stanford University research team introduces VideoAgent, an innovative approach that simulates human comprehension of long-form videos through an agent-based system, showcasing superior effectiveness and efficiency compared to current state-of-the-art methods. This underscores the potential of agent-based approaches in advancin…

1 month ago @ medium.com
DeepMind's Gemma: Advancing AI Safety and Performance with Open Models

Large Language Models (LLMs) have proven their mettle across a spectrum of real-world applications, ranging from language modeling to visual comprehension, and even text-to-image and text-to-video generation. Undoubtedly, LLMs stand as pivotal elements in contemporary artificial intelligence. However, alongside their groundbreaking potential, concerns regarding their safe deployment loom large.

In a new paper Gemma: Open Models Based on Gemini Research and Technology, the Google DeepMind Gemma Team introduces Gemma, a suite of lightweight, cutting-edge open models derived from the same research and technology underpinning the powerful Gemini models. Gemma marks a significant leap forward in perf…

1 month ago @ medium.com
Fast Tracks to Diverse Behaviors: VQ-BeT Achieves 5x Speed Surge Compared to Diffusion Policies

Generative modeling of complex behaviors from labeled datasets has long been a significant challenge in decision-making. This entails… Continue reading on SyncedReview »

1 month, 1 week ago @ medium.com
BasedAI: A Decentralized Solution for Seamless Integration of Privacy and Performance in Large…

Continue reading on SyncedReview »

1 month, 1 week ago @ medium.com
📓 Cool Blogs
ODS.ai Habr
last post 4 months ago
A GPT-like model "has made a scientific discovery for the first time": what, how, and what's next?

Or is it clickbait, and in Nature of all places?

The article below is a detailed breakdown of a fairly complex topic, and at some points you will need to concentrate and read carefully.

They also write code to help developers and, more generally, help solve all kinds of problems.

However, it can happen that no set appears for a long time, not because the players missed it, but because there genuinely is none on the table.

We hope that after reading this article it is clear that neural networks' mistakes are not a bug but a feature.

4 months ago @ habr.com
What are LLM agents and what can they do?

Also join my Telegram channel AI[ex]Time, where I write about machine learning, NLP, LLMs, and agents in particular.

LLaVA: LLaMA is used as the LLM for generating the text response; it decodes the embeddings (that is, vectors) of the images and the input text into an answer.

The model receives an image and a user query (Normal prompt) as input and must produce an answer (Response) as output.

The model thus moves from solving the task directly to reasoning step by step, which in a sense is a decomposition of the task.

And this improves the quality of the model's output, including its reasoning.

4 months, 2 weeks ago @ habr.com
The biggest event in the AI world: the creator of ChatGPT described the future he is leading us all into

Recall that by early 2023 the product already had more than 100 million monthly users.

The most curious readers may be wondering: how does this actually work?

As far as we know, this is the first time a model of this scale has been trained on synthetic data rather than human-produced data.

Using such a model is 3x cheaper for prompt text and 2x cheaper for generated tokens (of which there are usually fewer).

But how to handle them, when to use them, and how to combine them is decided by the AI based on the context of the dialogue.

5 months, 1 week ago @ habr.com
Five NLP books to start with

My name is Valentin Malykh; I lead NLP research at MTS AI and have been teaching an NLP course for six years now.

Since I keep giving the same answer, I decided to write a post about my list of books and describe them along the way.

That said, the book has more on information retrieval and less on NLP, but these days the two fields are already (or still) very close.

Admittedly, it is also out of print now, although the PDF version is easy to find.

Foundations of Statistical Natural Language Processing: as far as I know, this book has not been translated into Russian.

7 months, 2 weeks ago @ habr.com
Tanking ranking metrics in a recommender system, part 3: an experimentation platform

Платформа для экспериментов в RecSysНа примере Киона я показала сложности экспериментов с рекомендательными системами.

В работе над RecSys в реальном сервисе мы строим гипотезы, выкатываем модели в АБ тесты, строим новые гипотезы.

Нашей целью была инфраструктура и пайплайны для максимально быстрых и удобных экспериментов с любыми алгоритмами.

Трекинг экспериментов: метрики и визуальный анализТрекинг экспериментов мы проводили одновременно в Mlflow и в отчётах в репозитории.

Схема на картинке:Инфраструктура платформы для экспериментов и приложения с рекомендациямиЖёлтые стрелки отражают запись результатов экспериментов: логирование метрик с кросс-валидации, сохранение обученных моделей в Min…

7 months, 4 weeks ago @ habr.com
Dropping Ranking Metrics in a Recommender System, Part 2: Two-Stage Models

In the first part of the article, I described how my partner and I decided to roll a model from a competition out into online recommendations, and what came of it.

Let's build a feature for the boosting model that tackles the main problem of our current model: irrelevant items inside otherwise good selections.

We also had excellent features that had already been validated on the competition model.

In the classification task, we feed the model candidates that occurred in interactions as positive targets and candidates that did not as negatives, and train it to tell them apart.

The most stable approach to cross-validation in recommender systems is the sliding-window scheme, which corresponds most closely to how the model operates in the real world…
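A minimal sketch of this positive/negative labeling for the second-stage classifier (all ids and scores below are invented for illustration): a candidate (user, item) pair becomes a positive target if the user actually interacted with the item, otherwise a negative.

```python
# Candidates come from a first-stage model; labels come from observed interactions.
candidates = [  # (user_id, item_id, first_stage_score) -- made-up values
    (1, 10, 0.9),
    (1, 20, 0.4),
    (2, 10, 0.7),
    (2, 30, 0.2),
]
interactions = {(1, 10), (2, 30)}  # observed (user, item) pairs

# Label each candidate: 1 if the user interacted with the item, else 0.
train = [
    (user, item, score, 1 if (user, item) in interactions else 0)
    for user, item, score in candidates
]
print([target for *_, target in train])  # [1, 0, 0, 1]
```

The boosting model is then trained on `train`, with the first-stage score (and any other features) as inputs and the last column as the target.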

8 months ago @ habr.com
Dropping Ranking Metrics in a Recommender System, Part 1: Visual Analysis and Popularity Bias

Let's see what we have in Kion: the popularity distribution of the top 100 items in the Kion dataset. So what is at the top by views?

Popular items fill the recommendation shelf both for this query and for many others.

The film is 6th by views in the dataset and, most inopportunely, pops up in the recommendations of a wide variety of algorithms.

Let's check the recommendations from the model with maximum Recall excluding popular items: recommendations of the bm25 model with maximum Recall, popular items excluded.

Recall and MAP dropped roughly twofold compared to the competition model.

8 months, 1 week ago @ habr.com
"Dialectic", an Independent Socialist Media Outlet, Talks About Its NLP Projects, Publishes Datasets, and Shares Code

Almost immediately after publishing the post about the system for finding news on labor conflicts in the CIS, I got to know the team behind the "Dialectic" project.

The collective is interested in processes occurring both in the economic base and in its superstructure: society, culture, and politics.

This method of inquiry helped all the classics of Marxism analyze complex situations in the economy and society, and that is what makes it valuable for the "Dialectic" collective as well.

For that reason, the collective is developing IT tools for internal use (editors only) and for external use (all users).

This public aggregator is in testing and will be presented to the public soon.

8 months, 1 week ago @ habr.com
Machine Learning Mastery
last post: 1 month ago
Unfolding Data Stories: From First Glance to In-Depth Analysis

The path to uncovering meaningful insights often starts with a single step: looking at the data before asking questions. This journey through the Ames Housing dataset is more than an exploration; it’s a narrative about the hidden stories within numbers, waiting to be told. Through a “Data First Approach,” we invite you to dive deep […]

The post Unfolding Data Stories: From First Glance to In-Depth Analysis appeared first on MachineLearningMastery.com.

1 month ago @ machinelearningmastery.com
The Da Vinci Code of Data: Mastering The Data Science Mind Map

Data Science embodies a delicate balance between the art of visual storytelling, the precision of statistical analysis, and the foundational bedrock of data preparation, transformation, and analysis. The intersection of these domains is where true data alchemy happens – transforming and interpreting data to tell compelling stories that drive decision-making and knowledge discovery. Just as […]

The post The Da Vinci Code of Data: Mastering The Data Science Mind Map appeared first on MachineLearningMastery.com.

1 month ago @ machinelearningmastery.com
Finding Value with Data: The Cohesive Force Behind Luxury Real Estate Decisions

The real estate industry is a vast network of stakeholders including agents, homeowners, investors, developers, municipal planners, and tech innovators, each bringing unique perspectives and objectives to the table. Within this intricate ecosystem, data emerges as the critical element that binds these diverse interests together, facilitating collaboration and innovation. PropTech, or Property Technology, illustrates this […]

The post Finding Value with Data: The Cohesive Force Behind Luxury Real Estate Decisions appeared first on MachineLearningMastery.com.

1 month, 1 week ago @ machinelearningmastery.com
Skewness Be Gone: Transformative Tricks for Data Scientists

Data transformations enable data scientists to refine, normalize, and standardize raw data into a format ripe for analysis. These transformations are not merely procedural steps; they are essential in mitigating biases, handling skewed distributions, and enhancing the robustness of statistical models. This post will primarily focus on how to address skewed data. By focusing on […]

The post Skewness Be Gone: Transformative Tricks for Data Scientists appeared first on MachineLearningMastery.com.

1 month, 2 weeks ago @ machinelearningmastery.com
Best Free Resources to Learn Data Analysis and Data Science

Sponsored Content. In my decade of teaching online, the most significant inspiration has been that online learning democratizes access to education globally. Regardless of your ethnic background, income level, and geographical location—as long as you can surf the web—you can find an ocean of free educational content to help you learn new skills. […]

The post Best Free Resources to Learn Data Analysis and Data Science appeared first on MachineLearningMastery.com.

1 month, 2 weeks ago @ machinelearningmastery.com
Harmonizing Data: A Symphony of Segmenting, Concatenating, Pivoting, and Merging

In the world of data science, where raw information swirls in a cacophony of numbers and variables, lies the art of harmonizing data. Like a maestro conducting a symphony, the skilled data scientist orchestrates the disparate elements of datasets, weaving them together into a harmonious composition of insights. Welcome to a journey where data transcends […]

The post Harmonizing Data: A Symphony of Segmenting, Concatenating, Pivoting, and Merging appeared first on MachineLearningMastery.com.

1 month, 2 weeks ago @ machinelearningmastery.com
Beyond SQL: Transforming Real Estate Data into Actionable Insights with Pandas

In the realm of data analysis, SQL stands as a mighty tool, renowned for its robust capabilities in managing and querying databases. However, Python’s pandas library brings SQL-like functionalities to the fingertips of analysts and data scientists, enabling sophisticated data manipulation and analysis without the need for a traditional SQL database. This exploration delves into […]

The post Beyond SQL: Transforming Real Estate Data into Actionable Insights with Pandas appeared first on MachineLearningMastery.com.

1 month, 2 weeks ago @ machinelearningmastery.com
Spotting the Exception: Classical Methods for Outlier Detection in Data Science

Outliers are unique in that they often don’t play by the rules. These data points, which significantly differ from the rest, can skew your analyses and make your predictive models less accurate. Although detecting outliers is critical, there is no universally agreed-upon method for doing so. While some advanced techniques like machine learning offer solutions, […]
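One classical rule in this family is the interquartile-range (IQR) fence; a from-scratch sketch (this is illustrative, not the post's own code): flag any point outside [Q1 − 1.5·IQR, Q3 + 1.5·IQR].

```python
# IQR-based outlier detection, a classical non-parametric rule.
def iqr_outliers(values):
    xs = sorted(values)
    n = len(xs)

    def quantile(q):
        # Simple linear-interpolation quantile over the sorted sample.
        pos = q * (n - 1)
        lo, hi = int(pos), min(int(pos) + 1, n - 1)
        return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < low or x > high]

print(iqr_outliers([1, 2, 2, 3, 3, 3, 4, 4, 100]))  # [100]
```

The 1.5 multiplier is a convention (Tukey's fences); widening it to 3 flags only extreme outliers.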

The post Spotting the Exception: Classical Methods for Outlier Detection in Data Science appeared first on MachineLearningMastery.com.

1 month, 3 weeks ago @ machinelearningmastery.com
Leveraging ANOVA and Kruskal-Wallis Tests to Analyze the Impact of the Great Recession on Housing Prices

In the world of real estate, numerous factors influence property prices. The economy, market demand, location, and even the year a property is sold can play significant roles. The years 2007 to 2009 marked a tumultuous time for the US housing market. This period, often referred to as the Great Recession, saw a drastic decline […]

The post Leveraging ANOVA and Kruskal-Wallis Tests to Analyze the Impact of the Great Recession on Housing Prices appeared first on MachineLearningMastery.com.

1 month, 3 weeks ago @ machinelearningmastery.com
Garage or Not? Housing Insights Through the Chi-Squared Test for Ames, Iowa

The Chi-squared test for independence is a statistical procedure employed to assess the relationship between two categorical variables – determining whether they are associated or independent. In the dynamic realm of real estate, where a property’s visual appeal often impacts its valuation, the exploration becomes particularly intriguing. But how often do you associate a house’s […]
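For the mechanics, here is a from-scratch sketch of the chi-squared statistic for a 2x2 contingency table (the counts are invented, not the Ames data): the statistic sums the squared gaps between observed cell counts and the counts expected under independence.

```python
# Chi-squared statistic for a contingency table of observed counts.
def chi_squared(table):
    rows = [sum(r) for r in table]             # row totals
    cols = [sum(c) for c in zip(*table)]       # column totals
    total = sum(rows)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / total  # count expected under independence
            stat += (observed - expected) ** 2 / expected
    return stat

# Made-up example: garage vs. above/below-median sale price.
table = [[70, 30],   # garage:    above-median, below-median
         [40, 60]]   # no garage: above-median, below-median
print(round(chi_squared(table), 2))  # 18.18
```

The statistic is then compared against the chi-squared distribution with (rows−1)(cols−1) degrees of freedom; in practice `scipy.stats.chi2_contingency` does all of this in one call.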

The post Garage or Not? Housing Insights Through the Chi-Squared Test for Ames, Iowa appeared first on MachineLearningMastery.com.

2 months ago @ machinelearningmastery.com
Testing Assumptions in Real Estate: A Dive into Hypothesis Testing with the Ames Housing Dataset

In the realm of inferential statistics, you often want to test specific hypotheses about your data. Using the Ames Housing dataset, you'll delve deep into the concept of hypothesis testing and explore whether the presence of an air conditioner affects the sale price of a house. Let's get started. Overview This post unfolds through the […]

The post Testing Assumptions in Real Estate: A Dive into Hypothesis Testing with the Ames Housing Dataset appeared first on MachineLearningMastery.com.

2 months ago @ machinelearningmastery.com
Inferential Insights: How Confidence Intervals Illuminate the Ames Real Estate Market

In the vast universe of data, it’s not always about what we can see but rather what we can infer. Confidence intervals, a cornerstone of inferential statistics, empower us to make educated guesses about a larger population based on our sample data. Using the Ames Housing dataset, let’s unravel the concept of confidence intervals and […]
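As a sketch of the underlying computation (the sample values are made up, not Ames prices): a 95% confidence interval for a mean under the normal approximation is the sample mean plus or minus 1.96 standard errors.

```python
import math

# 95% confidence interval for a mean (normal approximation, z = 1.96).
def mean_ci_95(sample):
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    se = math.sqrt(var / n)                               # standard error of the mean
    margin = 1.96 * se
    return mean - margin, mean + margin

# Hypothetical sale prices in $1000s.
lo, hi = mean_ci_95([180, 200, 195, 210, 190, 205, 185, 215])
print(round(lo, 1), round(hi, 1))  # 189.0 206.0
```

For small samples a t-value (wider than 1.96) is more appropriate; `scipy.stats.t.interval` handles that case.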

The post Inferential Insights: How Confidence Intervals Illuminate the Ames Real Estate Market appeared first on MachineLearningMastery.com.

2 months, 1 week ago @ machinelearningmastery.com
Mastering Pair Plots for Visualization and Hypothesis Creation in the Ames Housing Market

Navigating the complex landscape of real estate analytics involves unraveling distinct narratives shaped by various property features within the housing market data. Our exploration today takes us into the realm of a potent yet frequently overlooked data visualization tool: the pair plot. This versatile graphic not only sheds light on the robustness and orientation of […]

The post Mastering Pair Plots for Visualization and Hypothesis Creation in the Ames Housing Market appeared first on MachineLearningMastery.com.

2 months, 1 week ago @ machinelearningmastery.com
Feature Relationships 101: Lessons from the Ames Housing Data

In the realm of real estate, understanding the intricacies of property features and their impact on sale prices is paramount. In this exploration, we’ll dive deep into the Ames Housing dataset, shedding light on the relationships between various features and their correlation with the sale price. Harnessing the power of data visualization, we’ll unveil patterns, […]

The post Feature Relationships 101: Lessons from the Ames Housing Data appeared first on MachineLearningMastery.com.

2 months, 2 weeks ago @ machinelearningmastery.com
Exploring Dictionaries, Classifying Variables, and Imputing Data in the Ames Dataset

The real estate market is a complex ecosystem driven by numerous variables such as location, property features, market trends, and economic indicators. One dataset that offers a deep dive into this complexity is the Ames Housing dataset. Originating from Ames, Iowa, this dataset comprises various properties and their characteristics, ranging from the type of alley […]

The post Exploring Dictionaries, Classifying Variables, and Imputing Data in the Ames Dataset appeared first on MachineLearningMastery.com.

2 months, 2 weeks ago @ machinelearningmastery.com
ML in Production
last post: None
Sorta Insightful
last post: 3 weeks, 6 days ago
Solving Crew Battle Strategy With Math

This means we can reduce all the crew battle outcomes down to a single \(n \times n\) matrix, which I’ll call the crew battle matrix.

So I Wrote Some Python Code to Compute \(f\) for Arbitrary Crew Battles. Let's consider the RPS crew battle again.

Here's the matchup matrix, and here's what my code outputs for the crew battle matrix, assuming optimal play.

But also, this suggests that crew battle strategy really isn’t that important in the first place!

If your crew loses, you don’t get to blame bad crew battle strategy.
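For intuition, here is a stripped-down sketch of the kind of recursion such a crew battle computation involves (names are illustrative, and this version ignores stock carryover and counterpick strategy, so it is a simplification rather than the post's actual model): the loser of each game sends in their next member, and a crew wins when the other runs out.

```python
from functools import lru_cache

def crew_win_prob(p, crew_a, crew_b):
    """Probability crew A wins given p[x][y] = P(character x beats character y).
    Winner stays in; loser's crew sends its next member (no stock carryover)."""
    @lru_cache(maxsize=None)
    def f(i, j):  # A's i-th member vs. B's j-th member
        if j == len(crew_b):
            return 1.0  # B has no members left: A wins
        if i == len(crew_a):
            return 0.0  # A has no members left: A loses
        w = p[crew_a[i]][crew_b[j]]
        return w * f(i, j + 1) + (1 - w) * f(i + 1, j)
    return f(0, 0)

# RPS-style matchup matrix: 0 beats 1, 1 beats 2, 2 beats 0; mirrors are 50/50.
p = [[0.5, 1.0, 0.0],
     [0.0, 0.5, 1.0],
     [1.0, 0.0, 0.5]]
print(crew_win_prob(p, (0, 1, 2), (0, 1, 2)))  # symmetric lineups -> 0.5
```

By symmetry, identical lineups must give exactly 0.5; asymmetric lineups expose how much ordering matters in this simplified setting.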

3 weeks, 6 days ago @ alexirpan.com
MIT Mystery Hunt 2024

This has spoilers for MIT Mystery Hunt 2024.

I hunted with teammate again this year, because there is nothing quite like writing a Mystery Hunt to forge friends through fire.

I needed to spend some to avoid hitting the vacation cap, and what better time than Mystery Hunt?

If you were forced to pick how a Mystery Hunt runs long, I think most people would pick the “too many puzzles” side of Mystery Hunt 2024 over the “too difficult puzzles” side of Mystery Hunt 2023.

So did Spoilr, the codebase for Mystery Hunt 2022, and tph-site from Mystery Hunt 2023.

3 months ago @ alexirpan.com
My AI Timelines Have Sped Up (Again)

In August 2020, I wrote a post about my AI timelines.

Diffusion based image augmentation has been shown to improve robot learning, and Anthropic has based a lot of its branding on constitutional AI and “RL from AI feedback”.

I don’t like AI Twitter for reasons I’ve explained here, but I especially do not like AI Twitter post-ChatGPT.

In AI, models can never do everything people claim they can, but what the models can do is ever-growing and never slides backward.

The bull is to say that we can figure out how to scale models, and scaled up models will solve all the other hard problems.

3 months, 1 week ago @ alexirpan.com
Far More Research Into Making Neopoints Than Anyone Needs to Know

You know how much time I have spent studying how to squeeze Neopoints water out of the Neopets stone?

I’d say you can expect to get about 25k NP a day, depending on how many NP rewards you get.

Trudy’s Surprise also gives items for 7 day streaks, but these items are usually junk and not worth anything.

When you win against CPU opponents, you earn a small amount of Neopoints and an item drop.

Or your needs to…see someone do a lot of research into things that don’t matter?

4 months, 3 weeks ago @ alexirpan.com
Everfree Northwest 2023

Everfree Northwest is a My Little Pony convention in the Seattle area.

Except, I heard that with the ending of BronyCon, Everfree Northwest is the largest pony convention left.

My Little Pony: Friendship is Magic is Generation 4, or G4, and is the one that kicked off the brony fandom.

People tend to assume that “pony concert” means “remixes of pony songs”, or at least “overt references to My Little Pony”.

The first was a signed copy of Ponies: The Galloping, the Magic: The Gathering x My Little Pony crossover.

7 months, 2 weeks ago @ alexirpan.com
Eight Years Later

I started working on that post right after Mystery Hunt 2023 finished, and hardcore focused on getting it out ASAP.

The end result was that I was exerting “write Mystery Hunt” levels of effort (10-20 hrs/week) for 1.5 years straight.

1929  2023-04-21-mh-2023.markdown
 114  2023-05-09-bootes-2023.markdown
 153  2023-07-19-ml-hurry.markdown

Posts in Limbo. Post about Dominion Online: odds of writing this year 5%, odds of writing eventually 25%. My priorities are moving away from Dominion.

Post about AI timelines: odds of writing this year 90%, odds of writing eventually 99%. I’m not planning a big update to the last timelines post.

8 months ago @ alexirpan.com
Machine Learning Got Itself in a Big Damn Hurry

As I asked questions about compute resources and the presenter’s position on generative AI ethics, I had a moment of realization.

Why are we talking about whether an RTX 3090 is big enough to finetune a LLaMa checkpoint?

The much easier and less-speculative way for a field to move faster is by having more people working in that field.

The MLP AI enthusiasts mentioned some pretrained LLMs that I had not even heard of.

I remember being a young whippersnapper, in the deep learning wave of 2015.

9 months ago @ alexirpan.com
Lil'Log
last post: None
inFERENCe
last post: None
The Spectator
last post: 4 months, 1 week ago
Generative Science: Roles for Generative AI in Scientific Discovery

Generative AI is a membership-based concept, so it is defined by the set of approaches and applications that get put under that name.

I think this is one part of the intrigue of generative AI for science.

So maybe generative AI applications are part of this drive for a post-theory mode of science.

This generative science, it is hoped, can add vigour into the sciences, especially in cross-disciplinary ways and provide broad-based benefit.

Generative AI for Science presents opportunities for new ways of doing theory and renewed vigour across our sciences.

4 months, 1 week ago @ blog.shakirm.com
Responsibilities of the Pioneer: Generative AI and its Sociotechnical Foundations

Keynote at the Stanford Human-centered AI 2023 Fall Conference on New Horizons in Generative AI: Science, Creativity, and Society.

Our conference today on new horizons in generative AI, invites us to think of the frontier of research and innovation.

These stories will expose some features of the sociotechnical foundations of generative AI, which is my underlying message and call to action.

What he doesn’t know yet, is that this book will become a cornerstone of the field and industry of weather forecasting.

There is a specific and firm model for responsibility that is built on taking a sociotechnical approach to generative AI and our work.

5 months, 3 weeks ago @ blog.shakirm.com
Machine Learning with Social Purpose

Abstract: This talk has a single objective: to advocate for machine learning infused with social purpose.

In this way, social purpose transforms our field of machine learning: into something that is both technical and social.

So social purpose will make machine learning something that is both technical and social.

A more global field and industry can shift machine learning to be more general: magnifying social purpose.

Your continued volunteering, funding, support, and openness to these groups shows yet another way of infusing machine learning with social purpose.

8 months, 2 weeks ago @ blog.shakirm.com
Off the Convex Path
last post: None
Jay Alammar
last post: None
Piekniewski's blog
last post: 2 weeks, 4 days ago
fast.ai NLP
last post: None
大トロ
last post: None
🔬 Science
Papers With Code
last post: 37 seconds ago
/felix-lyx/ Towards a Foundation Model for Partial Differential Equation: Multi-Operator Learning and Extrapolation

Foundation models, such as large language models, have demonstrated success in addressing various language and image processing tasks.

In this work, we introduce a multi-modal foundation model for scientific problems, named PROSE-PDE.

Our model, designed for bi-modality to bi-modality learning, is a multi-operator learning approach which can predict future states of spatiotemporal systems while concurrently learning the underlying governing equations of the physical system.

Specifically, we focus on multi-operator learning by training distinct one-dimensional time-dependent nonlinear constant coefficient partial differential equations, with potential applications to many physical applicatio…

37 seconds ago @ paperswithcode.com
/coco-hut/ Hypergraph Self-supervised Learning with Sampling-efficient Signals

Self-supervised learning (SSL) provides a promising alternative for representation learning on hypergraphs without costly labels.

However, existing hypergraph SSL models are mostly based on contrastive methods with the instance-level discrimination strategy, suffering from two significant limitations: (1) They select negative samples arbitrarily, which is unreliable in deciding similar and dissimilar pairs, causing training bias.

(2) They often require a large number of negative samples, resulting in expensive computational costs.

To address the above issues, we propose SE-HSSL, a hypergraph SSL framework with three sampling-efficient self-supervised signals.

Specifically, we introduce two …

44 seconds ago @ paperswithcode.com
/ma921/ A Quadrature Approach for General-Purpose Batch Bayesian Optimization via Probabilistic Lifting

To address these challenges, we introduce a versatile and modular framework for batch Bayesian optimisation via probabilistic lifting with kernel quadrature, called SOBER, which we present as a Python library based on GPyTorch/BoTorch.

Our framework offers the following unique benefits: (1) Versatility in downstream tasks under a unified approach.

(2) A gradient-free sampler, which does not require the gradient of acquisition functions, offering domain-agnostic sampling (e.g., discrete and mixed variables, non-Euclidean space).

(4) Adaptive batch size (autonomous determination of the optimal batch size).

(5) Robustness against a misspecified reproducing kernel Hilbert space.

52 seconds ago @ paperswithcode.com
/psanch21/ Improving the interpretability of GNN predictions through conformal-based graph sparsification

Graph Neural Networks (GNNs) have achieved state-of-the-art performance in solving graph classification tasks.

However, most GNN architectures aggregate information from all nodes and edges in a graph, regardless of their relevance to the task at hand, thus hindering the interpretability of their predictions.

In contrast to prior work, in this paper we propose a GNN training approach that jointly i) finds the most predictive subgraph by removing edges and/or nodes (without making assumptions about the subgraph structure) while ii) optimizing the performance of the graph classification task.

To that end, we rely on reinforcement learning to solve the resulting bi-level opt…

57 seconds ago @ paperswithcode.com
/tomekkorbak/ Aligning language models with human preferences

However, they also manifest behaviors that violate human preferences, e.g., they can generate offensive content, falsehoods or perpetuate social biases.

In this thesis, I explore several approaches to aligning LMs with human preferences.

First, I argue that aligning LMs can be seen as Bayesian inference: conditioning a prior (base, pretrained LM) on evidence about human preferences (Chapter 2).

In chapter 4, I show how to extend the distribution matching to conditional language models.

Finally, in chapter 5 I explore a different route: conditioning an LM on human preferences already during pretraining.

1 minute ago @ paperswithcode.com
/awjibon/ A Symmetric Regressor for MRI-Based Assessment of Striatal Dopamine Transporter Uptake in Parkinson's Disease

Dopamine transporter (DAT) imaging is commonly used for monitoring Parkinson's disease (PD), where striatal DAT uptake amount is computed to assess PD severity.

This paper proposes a symmetric regressor for predicting the DAT uptake amount from the nigral MRI patch.

Moreover, it employs a symmetric loss that imposes a constraint on the difference between right-to-left predictions, resembling the high correlation in DAT uptake amounts in the two lateral sides.

Additionally, we propose a symmetric Monte-Carlo (MC) dropout method for providing a fruitful uncertainty estimate of the DAT uptake prediction, which utilizes the above symmetry.

The symmetric MC dropout also gave precise uncertainty …

1 minute ago @ paperswithcode.com
/mlcommons/ Introducing v0.5 of the AI Safety Benchmark from MLCommons

This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group.

The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models.

We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark.

We plan to release version 1.0 of the AI Safety Benchmark by the end of 2024.

However, the v0.5 benchmark should not be used to assess the safety of AI systems.

1 minute ago @ paperswithcode.com
/fangguo1/ Generating Diverse Criteria On-the-Fly to Improve Point-wise LLM Rankers

The most recent pointwise Large Language Model (LLM) rankers have achieved remarkable ranking results.

However, these rankers are hindered by two major drawbacks: (1) they fail to follow a standardized comparison guidance during the ranking process, and (2) they struggle with comprehensive considerations when dealing with complicated passages.

To address these shortcomings, we propose to build a ranker that generates ranking scores based on a set of criteria from various perspectives.

These criteria are intended to direct each perspective in providing a distinct yet synergistic evaluation.

Our research, which examines eight datasets from the BEIR benchmark demonstrates that incorporating th…

1 minute ago @ paperswithcode.com
/antnlp/ Length Generalization of Causal Transformers without Position Encoding

Besides algorithms manipulating explicit position features, the success of Transformers without position encodings (NoPE) provides a new way to overcome the challenge.

In this paper, we study the length generalization property of NoPE.

We find that although NoPE can extend to longer sequences than the commonly used explicit position encodings, it still has a limited context length.

We identify a connection between the failure of NoPE's generalization and the distraction of attention distributions.

We propose a parameter-efficient tuning for searching attention heads' best temperature hyper-parameters, which substantially expands NoPE's context size.

1 minute ago @ paperswithcode.com
/Wenlve-Zhou/ Unsupervised Domain Adaption Harnessing Vision-Language Pre-training


6 hours ago @ paperswithcode.com
/Jyxarthur/ Moving Object Segmentation: All You Need Is SAM (and Flow)

Our interest in this paper is to determine if the Segment Anything model (SAM) can contribute to this task.

We investigate two models for combining SAM with optical flow that harness the segmentation power of SAM with the ability of flow to discover and group moving objects.

In the first model, we adapt SAM to take optical flow, rather than RGB, as an input.

In the second, SAM takes RGB as an input, and flow is used as a segmentation prompt.

Again, this simple model outperforms previous methods on multiple video object segmentation benchmarks.

22 hours ago @ paperswithcode.com
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data

Instruction fine-tuning pretrained LLMs for diverse downstream tasks has demonstrated remarkable success and has captured the interest of both academics and practitioners.

To ensure such fine-tuned LLMs align with human preferences, techniques such as RLHF and DPO have emerged.

At the same time, there is increasing interest in smaller parameter counts for models.

In this work, using OpenLLaMA 3Bv2 as a base model, we describe the recipe used to fine-tune the OpenBezoar family of models.

We release "OpenBezoar-SFT", "OpenBezoar-HH-RLHF-SFT", "OpenBezoar-HH-RLHF-DPO" checkpoints, alongside our generated datasets on HuggingFace at https://huggingface.co/collections/SurgeGlobal/open-bezoar-6620…

1 day, 1 hour ago @ paperswithcode.com
/jankolf/ GraFIQs: Face Image Quality Assessment Using Gradient Magnitudes

Face Image Quality Assessment (FIQA) estimates the utility of face images for automated face recognition (FR) systems.

We propose in this work a novel approach to assess the quality of face images based on inspecting the required changes in the pre-trained FR model weights to minimize differences between testing samples and the distribution of the FR training dataset.

To achieve that, we propose quantifying the discrepancy in Batch Normalization statistics (BNS), including mean and variance, between those recorded during FR training and those obtained by processing testing samples through the pretrained FR model.
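
The BNS discrepancy can be sketched as follows (a toy illustration of the idea, with made-up names and shapes, not the GraFIQs implementation):

```python
import numpy as np

def bns_discrepancy(running_mean, running_var, feats):
    """Gap between stored BatchNorm statistics and a test batch.

    A sketch of the idea only: a larger discrepancy suggests the samples
    are farther from the training distribution. feats: (batch, channels).
    """
    mu = feats.mean(axis=0)
    var = feats.var(axis=0)
    return np.abs(mu - running_mean).sum() + np.abs(var - running_var).sum()

rng = np.random.default_rng(0)
in_dist = rng.normal(0.0, 1.0, size=(256, 8))   # matches stored stats
shifted = rng.normal(3.0, 1.0, size=(256, 8))   # distribution shift
stored_mean, stored_var = np.zeros(8), np.ones(8)
assert bns_discrepancy(stored_mean, stored_var, shifted) > \
       bns_discrepancy(stored_mean, stored_var, in_dist)
```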

We then generate gradient magnitudes of pretrained FR weights by backpropagati…

1 day, 4 hours ago @ paperswithcode.com
/HeliumPeng/ Internet sentiment exacerbates intraday overtrading, evidence from A-Share market

Market fluctuations caused by overtrading are important components of systemic market risk.

This study examines the effect of investor sentiment on intraday overtrading activities in the Chinese A-share market.

Employing high-frequency sentiment indices inferred from social media posts on the Eastmoney forum Guba, the research focuses on constituents of the CSI 300 and CSI 500 indices over the period from 01/01/2018 to 12/30/2022.

The empirical analysis indicates that investor sentiment exerts a significantly positive impact on intraday overtrading, with the influence being more pronounced among institutional investors relative to individual traders.

Additionally, the effect of sentiment on …

1 day, 4 hours ago @ paperswithcode.com
/yihuajerry/ AccidentBlip2: Accident Detection With Multi-View MotionBlip2

Multimodal Large Language Models (MLLMs) have shown outstanding capabilities in many areas of multimodal reasoning.

Therefore, we use the reasoning ability of Multimodal Large Language Models for environment description and scene understanding in complex transportation environments.

In this paper, we propose AccidentBlip2, a multimodal large language model that can predict in real time whether an accident risk will occur.

Our approach extracts features from the temporal scenes of the six-view surround images and performs temporal inference with the temporal Blip framework through a vision transformer.

AccidentBlip2 outperforms existing solutions on the DeepAccident dataset and c…

1 day, 6 hours ago @ paperswithcode.com
Papers With Code
last post 37 seconds ago
/ml4bio/ RiboDiffusion: Tertiary Structure-based RNA Inverse Folding with Generative Diffusion Models

A fundamental challenge is to find functional RNA sequences that satisfy given structural constraints, known as the inverse folding problem.

However, designing RNA sequences directly from 3D structures is still challenging, due to the scarcity of data, the non-unique structure-sequence mapping, and the flexibility of RNA conformation.

In this study, we propose RiboDiffusion, a generative diffusion model for RNA inverse folding that can learn the conditional distribution of RNA sequences given 3D backbone structures.

Our model outperforms baselines in sequence recovery, with an average relative improvement of $11\%$ for sequence similarity splits and $16\%$ for structure similarity splits.

W…

1 day, 6 hours ago @ paperswithcode.com
/mv-lab/ Simple Image Signal Processing using Global Context Guidance

In modern smartphone cameras, the Image Signal Processor (ISP) is the core element that converts the RAW readings from the sensor into perceptually pleasant RGB images for the end users.

Deep learning-based ISPs aim to transform RAW images into DSLR-like RGB images using deep neural networks.

First, we propose a novel module that can be integrated into any neural ISP to capture the global context information from the full RAW images.

Second, we propose an efficient and simple neural ISP that utilizes our proposed module.

Our model achieves state-of-the-art results on different benchmarks using diverse and real smartphone images.

1 day, 6 hours ago @ paperswithcode.com
/mzzzhu/ CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News

Extracting structured event knowledge, including event triggers and corresponding arguments, from military texts is fundamental to many applications, such as intelligence analysis and decision assistance.

However, event extraction in the military field faces the data scarcity problem, which impedes the research of event extraction models in this domain.

To alleviate this problem, we propose CMNEE, a large-scale, document-level open-source Chinese Military News Event Extraction dataset.

It contains 17,000 documents and 29,223 events, which are all manually annotated based on a pre-defined schema for the military domain including 8 event types and 11 argument role types.

We designed a two-sta…

1 day, 6 hours ago @ paperswithcode.com
/n-kubiak/ S3R-Net: A Single-Stage Approach to Self-Supervised Shadow Removal

In this paper we present S3R-Net, the Self-Supervised Shadow Removal Network.

The two-branch WGAN model achieves self-supervision by relying on the unify-and-adapt phenomenon: it unifies the style of the output data and infers its characteristics from a database of unaligned shadow-free reference images.

This approach stands in contrast to the large body of supervised frameworks.

S3R-Net also differentiates itself from the few existing self-supervised models operating in a cycle-consistent manner, as it is a non-cyclic, unidirectional solution.

The proposed framework achieves comparable numerical scores to recent self-supervised shadow removal models while exhibiting superior qualitative perfor…

1 day, 6 hours ago @ paperswithcode.com
/kosturm/ Self-Adjusting Evolutionary Algorithms Are Slow on Multimodal Landscapes

The one-fifth rule and its generalizations are a classical parameter control mechanism in discrete domains.

They have also been transferred to control the offspring population size of the $(1, \lambda)$-EA.

In this work we show that the positive results do not extend to other types of local optima.

On the distorted OneMax benchmark, the self-adjusting $(1, \lambda)$-EA is slowed down just as elitist algorithms are, because self-adaptation prevents the algorithm from escaping local optima.

This makes the self-adaptive algorithm considerably worse than good static parameter choices, which allow escaping from local optima efficiently.

1 day, 6 hours ago @ paperswithcode.com
/anonymousid-submission/ X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer as Meta Multi-Agent Reinforcement Learner

The effectiveness of traffic light control has been significantly improved by current reinforcement learning-based approaches via better cooperation among multiple traffic lights.

However, a persistent issue remains: how to obtain a multi-agent traffic signal control algorithm with remarkable transferability across diverse cities?

In this paper, we propose a Transformer on Transformer (TonT) model for cross-city meta multi-agent traffic signal control, named as X-Light: We input the full Markov Decision Process trajectories, and the Lower Transformer aggregates the states, actions, rewards among the target intersection and its neighbors within a city, and the Upper Transformer learns the ge…

1 day, 6 hours ago @ paperswithcode.com
/ericyu97/ MaskCD: A Remote Sensing Change Detection Network Based on Mask Classification

Change detection (CD) from remote sensing (RS) images using deep learning has been widely investigated in the literature.

It is typically regarded as a pixel-wise labeling task that aims to classify each pixel as changed or unchanged.

For high-resolution RS images, partly or totally changed objects are more worthy of attention rather than a single pixel.

Subsequently, a masked-attention-based detection transformer (MA-DETR) decoder is developed to accurately locate and identify changed objects based on masked attention and self-attention mechanisms.

It reconstructs the desired changed objects by decoding the pixel-wise representations into learnable mask proposals and making final predicti…

1 day, 7 hours ago @ paperswithcode.com
/zhandawei/ Expected Coordinate Improvement for High-Dimensional Bayesian Optimization

The Bayesian optimization (BO) algorithm is very popular for solving low-dimensional expensive optimization problems.

Extending Bayesian optimization to high dimensions is a meaningful but challenging task.

In this work, we propose the expected coordinate improvement (ECI) criterion for high-dimensional Bayesian optimization.

The greatest advantage of the proposed ECI-BO (expected coordinate improvement based Bayesian optimization) algorithm over the standard BO algorithm is that its infill selection problem is always one-dimensional and can thus be solved easily.

This work provides a simple but efficient approach for high-dimensional Bayesian optimization.
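
Since each ECI infill step reduces to a one-dimensional problem, its core ingredient is the classical closed-form expected improvement; a minimal sketch (illustrative, not the authors' code):

```python
import math

def expected_improvement(mu, sigma, f_best):
    """Closed-form EI for minimization given a GP prediction (mu, sigma).

    Illustrative sketch: in ECI-BO, each infill step optimizes this
    quantity along a single coordinate, so the inner problem stays 1-D.
    """
    if sigma <= 0.0:
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # N(0,1) pdf
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # N(0,1) cdf
    return (f_best - mu) * Phi + sigma * phi

# a point predicted to improve on the best value scores higher
assert expected_improvement(0.0, 0.5, 1.0) > expected_improvement(2.0, 0.5, 1.0)
```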

1 day, 7 hours ago @ paperswithcode.com
/vance0124/ Token-level Direct Preference Optimization

Fine-tuning pre-trained Large Language Models (LLMs) is essential to align them with human values and intentions.

This process often utilizes methods like pairwise comparisons and KL divergence against a reference LLM, focusing on the evaluation of full answers generated by the models.

In this paper, we introduce Token-level Direct Preference Optimization (TDPO), a novel approach to align LLMs with human preferences by optimizing policy at the token level.

Unlike previous methods, which face challenges in divergence efficiency, TDPO incorporates forward KL divergence constraints for each token, improving alignment and diversity.
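
The per-token forward KL constraint can be illustrated with a small sketch (names and shapes are illustrative, not the TDPO implementation):

```python
import numpy as np

def per_token_forward_kl(policy_probs, ref_probs):
    """Forward KL(ref || policy) at each token position.

    Sketch of the idea only: divergence is constrained per token rather
    than for the whole sequence. probs: (seq_len, vocab), rows sum to 1.
    """
    eps = 1e-12   # avoid log(0)
    return np.sum(ref_probs * (np.log(ref_probs + eps)
                               - np.log(policy_probs + eps)), axis=-1)

ref = np.array([[0.7, 0.2, 0.1],
                [0.1, 0.8, 0.1]])
policy_close = ref.copy()
policy_far = np.array([[0.1, 0.2, 0.7],
                       [0.8, 0.1, 0.1]])
kl_close = per_token_forward_kl(policy_close, ref)
kl_far = per_token_forward_kl(policy_far, ref)
assert np.all(kl_close < 1e-9)    # identical distributions: KL ~ 0
assert np.all(kl_far > kl_close)  # every token diverges more
```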

Utilizing the Bradley-Terry model for a token-based reward sys…

1 day, 7 hours ago @ paperswithcode.com
/schizophreni/ Harnessing Joint Rain-/Detail-aware Representations to Eliminate Intricate Rains

Recent advances in image deraining have focused on training powerful models on mixed multiple datasets comprising diverse rain types and backgrounds.

To overcome this limitation, we focus on addressing various rainy images by delving into meaningful representations that encapsulate both the rain and background components.

Leveraging these representations as instructive guidance, we put forth a Context-based Instance-level Modulation (CoI-M) mechanism adept at efficiently modulating CNN- or Transformer-based models.

Furthermore, we devise a rain-/detail-aware contrastive learning strategy to help extract joint rain-/detail-aware representations.

By integrating CoI-M with the rain-/detail-awa…

1 day, 7 hours ago @ paperswithcode.com
/nicolay-r/ Large Language Models in Targeted Sentiment Analysis

In this paper we investigate the use of decoder-based generative transformers for extracting sentiment towards the named entities in Russian news articles.

We study sentiment analysis capabilities of instruction-tuned large language models (LLMs).

The first group of experiments was aimed at the evaluation of zero-shot capabilities of LLMs with closed and open transparencies.

Reasoning capabilities of the fine-tuned Flan-T5 models with THoR achieve at least 5% increment with the base-size model compared to the results of the zero-shot experiment.

The best results of sentiment analysis on RuSentNE-2023 were achieved by fine-tuned Flan-T5-xl, which surpassed the results of previous state-of-th…

1 day, 7 hours ago @ paperswithcode.com
/facebookresearch/ From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency

The staggering pace with which the capabilities of large language models (LLMs) are increasing, as measured by a range of commonly used natural language understanding (NLU) benchmarks, raises many questions regarding what "understanding" means for a language model and how it compares to human understanding.

Specifically, we focus on consistency across languages as well as paraphrases.

Taking GPT-3.5 as our object of study, we evaluate multisense consistency across five different languages and various tasks.

We start the evaluation in a controlled setting, asking the model for simple facts, and then proceed with an evaluation on four popular NLU benchmarks.

We find that the model's multisens…

1 day, 7 hours ago @ paperswithcode.com
/vl2g/ Sketch-guided Image Inpainting with Partial Discrete Diffusion Process

In this work, we study the task of sketch-guided image inpainting.

Unlike the well-explored natural language-guided image inpainting, which excels in capturing semantic details, the relatively less-studied sketch-guided inpainting offers greater user control in specifying the object's shape and pose to be inpainted.

As one of the early solutions to this task, we introduce a novel partial discrete diffusion process (PDDP).

The forward pass of the PDDP corrupts the masked regions of the image and the backward pass reconstructs these masked regions conditioned on hand-drawn sketches using our proposed sketch-guided bi-directional transformer.

The proposed novel transformer module accepts two i…

1 day, 7 hours ago @ paperswithcode.com
/vicomtech/ EuSQuAD: Automatically Translated and Aligned SQuAD2.0 for Basque

The widespread availability of Question Answering (QA) datasets in English has greatly facilitated the advancement of the Natural Language Processing (NLP) field.

However, the scarcity of such resources for minority languages, such as Basque, poses a substantial challenge for these communities.

In this context, the translation and alignment of existing QA datasets plays a crucial role in narrowing this technological gap.

This work presents EuSQuAD, the first initiative dedicated to automatically translating and aligning SQuAD2.0 into Basque, resulting in more than 142k QA examples.

We demonstrate EuSQuAD's value through extensive qualitative analysis and QA experiments supported with EuSQuA…

1 day, 7 hours ago @ paperswithcode.com
/ucsb-nlp-chang/ Advancing the Robustness of Large Language Models through Self-Denoised Smoothing

Although large language models (LLMs) have achieved significant success, their vulnerability to adversarial perturbations, including recent jailbreak attacks, has raised considerable concerns.

However, the increasing size of these models and their limited access make improving their robustness a challenging task.

Among various defense strategies, randomized smoothing has shown great potential for LLMs, as it does not require full access to the model's parameters or fine-tuning via adversarial training.

However, randomized smoothing involves adding noise to the input before model prediction, and the final model's robustness largely depends on the model's performance on these noise corrupted …

1 day, 7 hours ago @ paperswithcode.com
/uhh-lt/ Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse

The prevalence of digital media and evolving sociopolitical dynamics have significantly amplified the dissemination of hateful content.

Existing studies mainly focus on classifying texts into binary categories, often overlooking the continuous spectrum of offensiveness and hatefulness inherent in the text.

In this research, we present an extensive benchmark dataset for Amharic, comprising 8,258 tweets annotated for three distinct tasks: category classification, identification of hate targets, and rating offensiveness and hatefulness intensities.

Our results reveal that hate and offensive speech cannot be addressed by a simplistic binary classification, instead manifesting as variables acro…

1 day, 7 hours ago @ paperswithcode.com
/dskoda/ Information theory unifies atomistic machine learning, uncertainty quantification, and materials thermodynamics

An accurate description of information is relevant for a range of problems in atomistic modeling, such as sampling methods, detecting rare events, analyzing datasets, or performing uncertainty quantification (UQ) in machine learning (ML)-driven simulations.

Here, we introduce an information theoretical framework that unifies predictions of phase transformations, kinetic events, dataset optimality, and model-free UQ from atomistic simulations, thus bridging materials modeling, ML, and statistical mechanics.

We first demonstrate that, for a proposed representation, the information entropy of a distribution of atom-centered environments is a surrogate value for thermodynamic entropy.

Finally, …

1 day, 7 hours ago @ paperswithcode.com
/openmachine-ai/ Transformer tricks: Removing weights for skipless transformers

He and Hofmann (arXiv:2311.01906) detailed a skipless transformer without the V and P (post-attention projection) linear layers, which reduces the total number of weights.

However, this scheme is only applicable to MHA (multi-head attention), but not for MQA (multi-query attention) and GQA (grouped-query attention).

Therefore, this micro-paper proposes mathematically equivalent versions that are suitable for MQA and GQA.

For example, removing Q and P from a skipless version of Mistral-7B would remove 15% of its weights (and thus reduce its compute and memory complexity).

See arXiv:2402.13388 and https://github.com/OpenMachine-ai/transformer-tricks for code and more transformer tricks.
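
The weight removal rests on the identity that two consecutive linear layers compose into one matrix; a minimal sketch of that identity (illustrative only, not the paper's full derivation for MQA/GQA):

```python
import numpy as np

# Two consecutive linear layers (e.g. a value projection V followed by a
# post-attention projection P) compose into a single matrix. Folding one
# into the other is the basic identity behind removing such weights; this
# is illustrative linear algebra, not the paper's exact construction.
rng = np.random.default_rng(42)
d = 16
W_v = rng.normal(size=(d, d))
W_p = rng.normal(size=(d, d))
x = rng.normal(size=(d,))

two_layers = W_p @ (W_v @ x)
W_merged = W_p @ W_v          # precompute once; halves the weight count
assert np.allclose(W_merged @ x, two_layers)
```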

1 day, 7 hours ago @ paperswithcode.com
/dwzhu-pku/ LongEmbed: Extending Embedding Models for Long Context Retrieval

Embedding models play a pivotal role in modern NLP applications such as IR and RAG.

While the context limit of LLMs has been pushed beyond 1 million tokens, embedding models are still confined to a narrow context window not exceeding 8k tokens, which keeps them out of application scenarios requiring long inputs, such as legal contracts.

This paper explores context window extension of existing embedding models, pushing the limit to 32k without requiring additional training.

First, we examine the performance of current embedding models for long context retrieval on our newly constructed LongEmbed benchmark.

Based on this, comprehensive experiments show that training-free context window extension strategi…

1 day, 7 hours ago @ paperswithcode.com
/xincanfeng/ Sharing Parameter by Conjugation for Knowledge Graph Embeddings in Complex Space

A Knowledge Graph (KG) is the directed graphical representation of entities and relations in the real world.

The need to scale up and automatically complete KGs yields Knowledge Graph Embedding (KGE), a shallow machine learning model that suffers from high memory and training-time costs.

To mitigate the computational load, we propose a parameter-sharing method, i.e., using conjugate parameters for complex numbers employed in KGE models.

Our method improves memory efficiency by 2x in relation embedding while achieving comparable performance to the state-of-the-art non-conjugate models, with faster, or at least comparable, training time.
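
The conjugate parameter sharing can be sketched minimally (a toy illustration, not a full KGE model): only half of each complex relation embedding is stored as trainable parameters, and the other half is derived by conjugation.

```python
import numpy as np

def conjugate_relation(first_half):
    """Build a complex relation embedding whose second half is the
    conjugate of the first, so only half the parameters are stored.
    A sketch of the parameter-sharing idea only, not a full KGE model.
    """
    return np.concatenate([first_half, np.conj(first_half)])

stored = np.array([1 + 2j, 0.5 - 1j])        # trainable half
full = conjugate_relation(stored)
assert full.shape[0] == 2 * stored.shape[0]
assert np.allclose(full[2:], np.conj(stored))  # derived, not stored
```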

We demonstrated the generalizability of our…

1 day, 7 hours ago @ paperswithcode.com
/continental/ 6Img-to-3D: Few-Image Large-Scale Outdoor Driving Scene Reconstruction

Current 3D reconstruction techniques struggle to infer unbounded scenes from a few images faithfully.

Specifically, existing methods have high computational demands, require detailed pose information, and cannot reconstruct occluded regions reliably.

We introduce 6Img-to-3D, an efficient, scalable transformer-based encoder-renderer method for single-shot image to 3D reconstruction.

Our method outputs a 3D-consistent parameterized triplane from only six outward-facing input images for large-scale, unbounded outdoor driving scenarios.

We take a step towards resolving existing shortcomings by combining contracted custom cross- and self-attention mechanisms for triplane parameterization, differ…

1 day, 7 hours ago @ paperswithcode.com
/roryshao/ Data-free Knowledge Distillation for Fine-grained Visual Categorization

Data-free knowledge distillation (DFKD) is a promising approach for addressing issues related to model compression, security privacy, and transmission restrictions.

To address this issue, we propose an approach called DFKD-FGVC that extends DFKD to fine-grained visual categorization~(FGVC) tasks.

Our approach utilizes an adversarial distillation framework with an attention generator, mixed high-order attention distillation, and semantic feature contrast learning.

Specifically, we introduce a spatial-wise attention mechanism to the generator to synthesize fine-grained images with more details of discriminative parts.

Moreover, we leverage the teacher and student models of the distillation frame…

1 day, 7 hours ago @ paperswithcode.com
/radu1999/ Aligning Actions and Walking to LLM-Generated Textual Descriptions

This work explores the use of LLMs to generate rich textual descriptions for motion sequences, encompassing both actions and walking patterns.

For action recognition, we employ LLMs to generate textual descriptions of actions in the BABEL-60 dataset, facilitating the alignment of motion sequences with linguistic representations.

In the domain of gait analysis, we investigate the impact of appearance attributes on walking patterns by generating textual descriptions of motion sequences from the DenseGait dataset using LLMs.

These descriptions capture subtle variations in walking styles influenced by factors such as clothing choices and footwear.

Our approach demonstrates the potential of LLMs…

1 day, 7 hours ago @ paperswithcode.com
/chengshiest/ The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models

Foundation models, pre-trained on a large amount of data have demonstrated impressive zero-shot capabilities in various downstream tasks.

However, in object detection and instance segmentation, two fundamental computer vision tasks heavily reliant on extensive human annotations, foundation models such as SAM and DINO struggle to achieve satisfactory performance.

These foundation models fail to discern boundaries between individual objects.

Following this surprising observation, we propose $\textbf{Zip}$ which $\textbf{Z}$ips up CL$\textbf{ip}$ and SAM in a novel classification-first-then-discovery pipeline, enabling annotation-free, complex-scene-capable, open-vocabulary object detection…

1 day, 7 hours ago @ paperswithcode.com
/algomole/ MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space

Generative models for structure-based drug design (SBDD) have shown promising results in recent years.

Existing works mainly focus on how to generate molecules with higher binding affinity, ignoring the feasibility prerequisites for generated 3D poses and resulting in false positives.

In this paper, we introduce MolCRAFT, the first SBDD model that operates in the continuous parameter space, together with a novel noise-reduced sampling strategy.

Empirical results show that our model consistently achieves superior performance in binding affinity with more stable 3D structure, demonstrating our ability to accurately model interatomic interactions.

To our best knowledge, MolCRAFT is the first to a…

1 day, 7 hours ago @ paperswithcode.com
/liu-haoyue/ Seeing Motion at Nighttime with an Event Camera

We focus on a very challenging task: imaging at nighttime dynamic scenes.

However, they would inevitably face a dilemma between the long exposure time of nighttime and the motion blur of dynamic scenes.

In this work, we present a novel nighttime dynamic imaging method with an event camera.

Specifically, we discover that the event at nighttime exhibits temporal trailing characteristics and spatial non-stationary distribution.

Moreover, we construct a paired real low-light event dataset (RLED) through a co-axial imaging system, including 64,200 spatially and temporally aligned image GTs and low-light events.

1 day, 7 hours ago @ paperswithcode.com
/michelle123lam/ Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM

Data analysts have long sought to turn unstructured text data into meaningful concepts.

We introduce concept induction, a computational process that instead produces high-level concepts, defined by explicit inclusion criteria, from unstructured text.

For a dataset of toxic online comments, where a state-of-the-art BERTopic model outputs "women, power, female," concept induction produces high-level concepts such as "Criticism of traditional gender roles" and "Dismissal of women's concerns."

We present LLooM, a concept induction algorithm that leverages large language models to iteratively synthesize sampled text and propose human-interpretable concepts of increasing generality.

We then insta…

1 day, 7 hours ago @ paperswithcode.com
/modaresimr/ Boosting Medical Image Segmentation Performance with Adaptive Convolution Layer

Medical image segmentation plays a vital role in various clinical applications, enabling accurate delineation and analysis of anatomical structures or pathological regions.

However, conventional convolutional models often rely on fixed kernel sizes, which can limit their performance and adaptability in medical images where features exhibit diverse scales and configurations due to variability in equipment, target sizes, and expert interpretations.

In this paper, we propose an adaptive layer placed ahead of leading deep-learning models such as UCTransNet, which dynamically adjusts the kernel size based on the local context of the input image.
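As a rough illustration of the idea (not the paper's actual layer), a filter can choose its window size from local context, for example smoothing less aggressively where local variance is high; the window sizes and variance threshold below are arbitrary assumptions:

```python
import numpy as np

# Toy sketch of a context-adaptive kernel (not the paper's layer): use a
# small mean-filter window where local variance is high (fine structure)
# and a larger window in flat regions.

def adaptive_mean_filter(img, small=1, large=2, var_thresh=0.01):
    h, w = img.shape
    out = np.empty_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            # Estimate local context from the small neighborhood.
            patch = img[max(0, i - small):i + small + 1,
                        max(0, j - small):j + small + 1]
            r = small if patch.var() > var_thresh else large
            win = img[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
            out[i, j] = win.mean()
    return out

img = np.zeros((8, 8))
img[:, 4:] = 1.0                       # a sharp vertical edge
smoothed = adaptive_mean_filter(img)   # edge region keeps a small window
print(smoothed.shape)                  # (8, 8)
```

Flat regions (zero variance) get the large window; pixels near the edge trip the variance threshold and get the small one, preserving the transition.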

Extensive experiments are conducted on benchmark medical image datasets to eval…

1 day, 9 hours ago @ paperswithcode.com
/fluhus/ FrackyFrac: A Standalone UniFrac Calculator

UniFrac is a family of distance metrics over microbial abundances that take taxonomic relatedness into account.

Current tools and libraries for calculating UniFrac have specific requirements regarding the user's technical expertise, operating system, and pre-installed software, which might exclude potential users.

FrackyFrac is a native command-line tool that can run on any platform and has no requirements.

It can also generate the phylogenetic trees required for the calculation.

FrackyFrac can make UniFrac accessible to researchers who may otherwise skip it due to the effort involved, and it can simplify analysis pipelines for those who already use it.
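The unweighted variant of UniFrac is simple to sketch: it is the fraction of observed branch length in the phylogenetic tree that leads exclusively to taxa from one of the two samples. The toy below illustrates that definition only (it is not FrackyFrac's implementation; the tiny tree and samples are invented):

```python
# Toy unweighted UniFrac: a tree is a list of branches, each carrying its
# length and the set of taxa that descend from it.

def unweighted_unifrac(branches, sample_a, sample_b):
    unique = 0.0    # branch length leading only to taxa of one sample
    observed = 0.0  # branch length leading to taxa of either sample
    for length, taxa in branches:
        in_a = bool(taxa & sample_a)
        in_b = bool(taxa & sample_b)
        if in_a or in_b:
            observed += length
            if in_a != in_b:  # branch unique to one sample
                unique += length
    return unique / observed

# A 4-leaf tree: root splits into an {A,B} clade and a {C,D} clade.
branches = [
    (1.0, {"A", "B"}), (1.0, {"C", "D"}),
    (0.5, {"A"}), (0.5, {"B"}), (0.5, {"C"}), (0.5, {"D"}),
]

print(unweighted_unifrac(branches, {"A", "B"}, {"A", "B"}))  # 0.0
print(unweighted_unifrac(branches, {"A", "B"}, {"C", "D"}))  # 1.0
```

Identical communities share every observed branch (distance 0); communities in disjoint clades share none (distance 1).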

1 day, 9 hours ago @ paperswithcode.com
/Infini-AI-Lab/ TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

As large language models (LLMs) have recently been widely deployed for long-content generation, demand for efficient long-sequence inference support has grown.

However, the key-value (KV) cache, which is stored to avoid re-computation, has become a critical bottleneck because it grows linearly with sequence length.

Due to the auto-regressive nature of LLMs, the entire KV cache will be loaded for every generated token, resulting in low utilization of computational cores and high latency.

While various compression methods for KV cache have been proposed to alleviate this issue, they suffer from degradation in generation quality.

We introduce TriForce, a hierarchical specula…
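The generic speculative-decoding loop that such systems build on can be sketched with toy lookup-table "models" (a greedy illustration of plain speculative decoding, not TriForce's hierarchical variant; the target/draft tables are invented): a cheap draft model proposes several tokens, and the target model verifies them, keeping the longest agreeing prefix so the output is identical to decoding with the target alone.

```python
# Greedy speculative decoding with toy next-token lookup tables.

def greedy_speculative_decode(target, draft, prompt, steps, k=3):
    out = list(prompt)
    for _ in range(steps):
        # Draft model proposes k tokens autoregressively (cheap).
        proposal, ctx = [], out[-1]
        for _ in range(k):
            ctx = draft.get(ctx, "<eos>")
            proposal.append(ctx)
        # Target verifies: accept while the draft matches what the target
        # would emit; on the first mismatch, emit the target's token, so
        # the result is lossless relative to pure target decoding.
        ctx = out[-1]
        for tok in proposal:
            want = target.get(ctx, "<eos>")
            out.append(want)
            ctx = want
            if want == "<eos>" or want != tok:
                break
        if out[-1] == "<eos>":
            return out
    return out

target = {"the": "cat", "cat": "sat", "sat": "down", "down": "<eos>"}
draft = {"the": "cat", "cat": "sat", "sat": "up"}  # agrees on a prefix only

print(greedy_speculative_decode(target, draft, ["the"], steps=5))
# -> ['the', 'cat', 'sat', 'down', '<eos>']
```

The speedup in real systems comes from the target verifying all k draft tokens in one forward pass instead of k sequential passes.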

1 day, 9 hours ago @ paperswithcode.com
💼 University and corporate labs
DeepMind DeepMind
last post 1 day, 2 hours ago
The ethics of advanced AI assistants

Exploring the promise and risks of a future with more capable AI: Imagine a future where we interact regularly with a range of advanced artificial intelligence (AI) assistants, and where millions of assistants interact with each other on our behalf.

General-purpose foundation models are paving the way for increasingly advanced AI assistants.

Advanced AI assistants could have a profound impact on users and society, and be integrated into most aspects of people’s lives.

Able to fluidly communicate using natural language, the written output and voices of advanced AI assistants may become hard to distinguish from those…

1 day, 2 hours ago @ deepmind.google
TacticAI: an AI assistant for football tactics

As part of our multi-year collaboration with Liverpool FC, we developed a full AI system that can advise coaches on corner kicks. 'Corner taken quickly… Origi!'

Our first paper, Game Plan, looked at why AI should be used in assisting football tactics, highlighting examples such as analyzing penalty kicks.

Predicting corner kick outcomes with geometric deep learning A corner kick is awarded when the ball passes over the byline, after touching a player of the defending team.

With TacticAI, we have developed a capable AI assistant for football tactics and achieved a milestone in developing useful assistants in sports AI.

We s…

1 month ago @ deepmind.google
SIMA generalist AI agent for 3D virtual environments

In a new technical report, we introduce SIMA, short for Scalable Instructable Multiworld Agent, a generalist AI agent for 3D virtual settings.

SIMA: a versatile AI agent SIMA is an AI agent that can perceive and understand a variety of environments, then take actions to achieve an instructed goal.

What’s more, an agent trained in all but one game performed nearly as well on that unseen game as an agent trained specifically on it, on average.

We compare this performance with three types of generalist SIMA agent, each trained across multiple environments.

Advancing AI agent research SIMA’s results show the potential to develop a new wave of generalist, language-driven AI agents.

1 month, 1 week ago @ deepmind.google
Gemma: Introducing new state-of-the-art open models

Today, we’re excited to introduce a new generation of open models from Google to assist developers and researchers in building AI responsibly.

Gemma open modelsGemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.

This enables Gemma 2B and 7B to achieve best-in-class performance for their sizes compared to other open models.

And Gemma models are capable of running directly on a developer laptop or desktop computer.

Notably, Gemma surpasses significantly larger models on key benchmarks while adhering to our rigorous standards for safe and responsible outputs.

1 month, 4 weeks ago @ blog.google
Our next-generation model: Gemini 1.5

A note from Google and Alphabet CEO Sundar Pichai:Last week, we rolled out our most capable model, Gemini 1.0 Ultra, and took a significant step forward in making Google products more helpful, starting with Gemini Advanced.

Today, developers and Cloud customers can begin building with 1.0 Ultra too — with our Gemini API in AI Studio and in Vertex AI.

Our teams continue pushing the frontiers of our latest models with safety at the core.

In fact, we’re ready to introduce the next generation: Gemini 1.5.

It shows dramatic improvements across a number of dimensions and 1.5 Pro achieves comparable quality to 1.0 Ultra, while using less compute.

2 months ago @ blog.google
The next chapter of our Gemini era

We’re excited by the progress, for example with our Search Generative Experience, or SGE, which you can try in Search Labs.

Introducing Gemini AdvancedBard has been the best way for people to directly experience our most capable models.

To reflect the advanced tech at its core, Bard will now simply be called Gemini.

The version with Ultra will be called Gemini Advanced, a new experience far more capable at reasoning, following instructions, coding, and creative collaboration.

You can start using Gemini Advanced by subscribing to the new Google One AI Premium plan, which offers the best of Google’s AI features in a single place.

2 months, 1 week ago @ blog.google
AlphaGeometry: An Olympiad-level AI system for geometry

Our AI system surpasses the state-of-the-art approach for geometry problems, advancing AI reasoning in mathematics. Reflecting the Olympic spirit of ancient Greece, the International Mathematical Olympiad is a modern-day arena for the world's brightest high-school mathematicians.

In a paper published today in Nature, we introduce AlphaGeometry, an AI system that solves complex geometry problems at a level approaching a human Olympiad gold-medalist - a breakthrough in AI performance.

In a benchmarking test of 30 Olympiad geometry problems, AlphaGeometry solved 25 within the standard Olympiad time limit.

Solving Ol…

3 months ago @ deepmind.google
Shaping the future of advanced robotics

Today we’re announcing a suite of advances in robotics research that bring us a step closer to this future.

AutoRT, SARA-RT, and RT-Trajectory build on our historic Robotics Transformers work to help robots make decisions faster, and better understand and navigate their environments.

So the AutoRT system comprises layers of practical safety measures from classical robotics.

SARA-RT: Making Robotics Transformers leaner and faster Our new system, Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT), converts Robotics Transformer (RT) models into more efficient versions.

We will continue to tackle challenges in robotics today and to adapt to the new capabilities and technologies …

3 months, 2 weeks ago @ deepmind.google
Images altered to trick machine vision can influence humans too

New research shows that even subtle changes to digital images, designed to confuse computer vision systems, can also affect human perception. Computers and humans see the world in different ways.

Our discovery highlights a similarity between human and machine vision, but also demonstrates the need for further research to understand the influence adversarial images have on people, as well as AI systems.

An adversarial image is one that has been subtly altered by a procedure that causes an AI model to confidently misclassify the image contents.
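One standard recipe for such alterations is the fast-gradient-sign idea, shown here only as a hedged toy on a linear classifier (the study's actual procedure may differ): nudge every input dimension by a tiny, imperceptible amount in the direction that most moves the model's score toward the wrong class.

```python
import numpy as np

# Toy adversarial perturbation on a linear model (score = w . x).
# For a linear score the gradient w.r.t. x is just w, so stepping each
# dimension by -eps * sign(w) maximally decreases the score under an
# L-infinity budget of eps per dimension.

w = np.array([1.0, -2.0, 0.5])    # model weights
x = np.array([0.2, 0.1, 0.4])     # classified "positive" if score > 0

score = w @ x                      # 0.2: positive
eps = 0.3                          # tiny per-dimension budget
x_adv = x - eps * np.sign(w)       # adversarial input
adv_score = w @ x_adv              # pushed below the decision threshold

print(score, adv_score)
```

Each coordinate moves by at most 0.3, yet the decision flips, which is the unsettling property the article discusses for images.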

Indeed, security concerns have led researchers to investigate …

3 months, 2 weeks ago @ deepmind.google
2023: A Year of Groundbreaking Advances in AI and Computing

This has been a year of incredible progress in the field of Artificial Intelligence (AI) research and its practical applications.

In this Year-in-Review post we'll go over some of Google Research's and Google DeepMind's efforts putting these advances into practice safely throughout 2023.

In November, in partnership with YouTube, we announced Lyria, our most advanced AI music generation model to date.

Then in December, we launched Gemini, our most capable and general AI model.

Advances in capable language and multimodal models have also benefited our robotics research efforts.

3 months, 4 weeks ago @ deepmind.google
FunSearch: Making new discoveries in mathematical sciences using Large Language Models

By searching for "functions" written in computer code, FunSearch made the first discoveries in open problems in mathematical sciences using LLMs. Large Language Models (LLMs) are useful assistants: they excel at combining concepts and can read, write and code to help people solve problems.

This work represents the first time a new discovery has been made for challenging open problems in science or mathematics using LLMs.

FunSearch discovered new solutions for the cap set problem, a longstanding open problem in mathematics.

The corresponding function produced by FunSearch for each …

4 months, 1 week ago @ deepmind.google
Google DeepMind at NeurIPS 2023

We’ll be showcasing demos of our cutting edge AI models for global weather forecasting, materials discovery, and watermarking AI-generated content.

There will also be an opportunity to hear from the team behind Gemini, our largest and most capable AI model.

Here’s a look at some of our research highlights:Multimodality: language, video, actionUniSim is a universal simulator of real-world interactions.

Our approach surpassed current methods on vision and language tasks, and showed more potential to scale.

To help people solve problems and find what they’re looking for, AI models need to process billions of unique values efficiently.

4 months, 1 week ago @ deepmind.google
Introducing Gemini: our largest and most capable AI model

AI has the potential to create opportunities — from the everyday to the extraordinary — for people everywhere.

That’s what excites me: the chance to make AI helpful for everyone, everywhere in the world.

At the same time, developers are using our models and infrastructure to build new generative AI applications, and startups and enterprises around the world are growing with our AI tools.

Now, we’re taking the next step on our journey with Gemini, our most capable and general model yet, with state-of-the-art performance across many leading benchmarks.

I’m genuinely excited for what’s ahead, and for the opportunities Gemini will unlock for people everywhere.

4 months, 2 weeks ago @ blog.google
Millions of new materials discovered with deep learning

AI tool GNoME finds 2.2 million new crystals, including 380,000 stable materials that could power future technologies. Modern technologies, from computer chips and batteries to solar panels, rely on inorganic crystals.

Computational approaches drawing from the Materials Project, Open Quantum Materials Database and WBM database boosted this number to 48,000 stable crystals.

Over the last decade, computational approaches led by the Materials Project and other groups have helped discover 28,000 new materials.

GNoME: Harnessing graph networks for materials explorationGNoME uses two pipelines to discover low-energy (st…

4 months, 3 weeks ago @ deepmind.google
Transforming the future of music creation

Music AI tools – a set of tools we’re designing with artists, songwriters, and producers to help bolster their creative processes.

To develop these projects, we’ve brought together technical experts from across Google with a diverse range of world-renowned artists and songwriters to explore how generative music technologies can responsibly shape the future of music creation.

Exploring music AI tools with the industry Our researchers have been exploring with artists, songwriters, and producers in YouTube’s Music AI Incubator how generative AI can best support the creative process, and working together to responsibly design a suite of music AI tools.

This work draws on our history of research…

5 months ago @ deepmind.google
Google
last post 1 day, 20 hours ago
Innovating in patent search: How IPRally leverages AI with Google Kubernetes Engine and Ray

Next on the horizon is expanding to big data solutions with Ray Data and BigQuery.

With just two DevOps engineers and one MLOps engineer, IPRally was able to build its own customized ML platform with GKE and Ray as key components.

To more easily manage Ray, IPRally uses KubeRay, a specialized tool that simplifies Ray cluster management on Kubernetes.

IPRally uses Ray for the most advanced tasks like massive preprocessing of data and exploratory deep learning in R&D.

And by leveraging them within GKE, IPRally facilitates the creation of nodes on demand and scales GPU resources as needed, thus optimizing its operational costs.

1 day, 20 hours ago @ cloud.google.com
Meta Llama 3 Available Today on Google Cloud Vertex AI

When developers access Llama 3 through Vertex AI, they will soon have access to multiple state-of-the-art tuning options made available through Colab Enterprise.

Vertex AI also makes it simple for developers to evaluate their tuned Llama models, either through preconfigured notebooks directly in Model Garden or with Auto SxS, Vertex AI’s pairwise model-based evaluation tool.

And with robust features like Model Registry, Vertex AI makes it easy to manage and monitor model variants and endpoints and scale them appropriately for your needs.

A thriving, open ecosystem for enterprise model buildersWith over 130 first-party, third-party, and open models, Vertex AI Model Garden is a one-stop desti…

2 days, 3 hours ago @ cloud.google.com
Introducing LLM fine-tuning and evaluation in BigQuery

BigQuery allows you to analyze your data using a range of large language models (LLMs) hosted in Vertex AI including Gemini 1.0 Pro, Gemini 1.0 Pro Vision and text-bison.

These models work well for several tasks such as text summarization, sentiment analysis, etc.

Today, we are announcing support for customizing LLMs in BigQuery with supervised fine-tuning.

Feature walkthrough: To illustrate model fine-tuning, let's look at a classification problem using text data.

To fine-tune and evaluate our model, we first create an evaluation table and a training table in BigQuery using a subset of this data available in Cloud Storage as follows:

4 days, 19 hours ago @ cloud.google.com
Google Cloud offers new AI, cybersecurity, and data analytics training to unlock job opportunities

This new initiative uses Google Cloud technology to help move certificate completers through the hiring process.

The U.S. Department of the Treasury will start using these new Google Cloud Certificates and labs for cyber and data analytics talent identification across the federal agency, per President Biden's Executive Order on AI.

Together, they were the pioneers in stacking Grow with Google certificates into four types of degree-earning credit certificates over the past two years.

(ISC)² Cybersecurity Workforce Study (2022).

The Google Cloud Certificates offer a recommendation from the American Council on Education® of up to 10 college credits.

4 days, 21 hours ago @ cloud.google.com
Introducing multimodal and structured data embedding support in BigQuery

Embeddings represent real-world objects, such as entities, text, images, or videos, as arrays of numbers (a.k.a. vectors) that machine learning models can easily process.

Embeddings are the building blocks of many ML applications such as semantic search, recommendations, clustering, outlier detection, named entity extraction, and more.

Last year, we introduced support for text embeddings in BigQuery, allowing machine learning models to understand real-world data domains more effectively. Earlier this year, we introduced vector search, which lets you index and work with billions of embeddings and build generative AI applications on BigQuery.
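The core mechanic these features build on is simple to sketch: once items are embedded as vectors, semantic search reduces to ranking by a similarity measure such as cosine. The snippet below is illustrative plain Python with made-up 3-d vectors, not BigQuery's API (real embedding models emit hundreds of dimensions):

```python
import math

# Rank documents by cosine similarity between embedding vectors.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical embeddings: semantically related texts sit close together.
corpus = {
    "cat photo": [0.9, 0.1, 0.0],
    "kitten picture": [0.8, 0.3, 0.1],
    "quarterly report": [0.0, 0.1, 0.95],
}
query = [0.88, 0.15, 0.02]  # embedding of a query like "feline image"

ranked = sorted(corpus, key=lambda k: cosine(query, corpus[k]), reverse=True)
print(ranked[0])  # cat photo
```

The same idea underlies vector search at scale; indexes just avoid scoring every vector exhaustively.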

You can start using multimodal embedding in BigQuer…

1 week, 1 day ago @ cloud.google.com
Eating our own dogfood: Building an AI-driven business at Google Cloud Consulting

What you build now with gen AI is just a starting point, but it will be worth the investment.

Your success at building with and embracing gen AI will likely be a differentiator in the future — the sooner you get started, the better.

We’re excited to share more at Google Cloud Next and stay tuned for future posts that dive into the technical implementation and underpinnings for these two solutions.

Within Google Cloud Consulting, we provide guidance to customers on how to test and productionize GenAI use cases with Vertex AI.

Learn more about how Google Cloud Consulting can help you learn, build, operate and succeed.

1 week, 1 day ago @ cloud.google.com
Introducing Isolator: Enabling secure multi-party collaboration with healthcare data

Isolator can enable multi-party collaboration on work that involves handling raw and unprocessed, sensitive, and regulated data in an isolated, private, and compliant environment on Google Cloud.

Leverage complex data: Isolator can advance adoption of our Medical Image Suite, giving researchers and data scientists the efficiencies to improve workflows, and at a lower cost than keeping data on-premises.

Scaling advanced data analytics: Isolator can assist in reducing costs and improving results by moving and transforming data from siloed, on-premises databases to feature-rich and massively scalable analytic and data processing services such as Healthcare Data Engine and BigQuery.

This can he…

1 week, 1 day ago @ cloud.google.com
Accelerating database modernization with Gemini in Database Migration Service

At Google Cloud Next ‘23, we announced support in Database Migration Service (DMS) for Oracle-to-PostgreSQL migrations to make them smooth, intuitive and efficient.

This week at Google Cloud Next ‘24, we unveiled Gemini for Google Cloud, a new generation of AI-assistive capabilities based on Google's Gemini family of models, including Gemini in Databases.

In addition, we announced the preview of a key feature of Gemini in Databases: assistive code and schema conversion, allowing you to speed up the modernization of Oracle databases to PostgreSQL databases on Google Cloud.

Finally, we announced Gemini-assisted code explainability, a key feature of Gemini in Databases, which simplifies code co…

1 week, 1 day ago @ cloud.google.com
Build powerful gen AI applications with Firestore vector similarity search

Creating innovative AI-powered solutions for use cases such as product recommendations and chatbots often requires vector similarity search, or vector search for short.

At Google Cloud Next ‘24, we announced Firestore vector search in preview, using exact K-nearest neighbor (KNN) search.

Developers can now perform vector search on transactional Firestore data without the hassle of copying data to another vector search solution, maintaining operational simplicity and efficiency.

Developers can now utilize Firestore vector search with popular orchestration frameworks such as LangChain and LlamaIndex through native integrations.
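Exact KNN retrieval, the mode this preview uses, is easy to picture: score every stored vector against the query and keep the k closest. Below is a conceptual plain-Python sketch, not the Firestore SDK; the document names and 2-d vectors are invented:

```python
import math

# Exact K-nearest-neighbor search: brute-force distance to every vector.

def knn(query, docs, k=2):
    def dist(v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, v)))
    # Score all stored vectors, keep the k with smallest distance.
    return sorted(docs, key=lambda name: dist(docs[name]))[:k]

docs = {
    "doc_a": [0.0, 0.0],
    "doc_b": [1.0, 1.0],
    "doc_c": [0.1, 0.0],
}
print(knn([0.02, 0.0], docs, k=2))  # ['doc_a', 'doc_c'], the two closest
```

"Exact" is the key trade-off: unlike approximate indexes, every vector is scored, which guarantees correct results at the cost of a full scan.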

How to use KNN vector search in Firestore: The first step in ut…

1 week, 1 day ago @ cloud.google.com
Introducing new Vertex AI text embedding models

Text embedding models are essential for many diverse natural-language processing (NLP) applications, from document retrieval and similarity measurement to classification and clustering.

Our text embedding models power applications across Google Cloud including BigQuery, Cloud Database, Vertex AI Search, and Workspace.

New text embedding models with stronger performance: We evaluated our new models and published the metrics and more technical details in the Google study, Gecko: Versatile text embeddings distilled from large language models.

The pricing for our text embedding models is $0.000025/1,000 characters for online requests and $0.00002/1,000 characters for batch requests.
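At those per-character rates, the cost of an embedding job is straightforward to estimate (a back-of-the-envelope sketch using only the prices quoted above):

```python
# Quoted pricing: $0.000025 per 1,000 characters online,
#                 $0.00002  per 1,000 characters batch.

ONLINE_PER_1K = 0.000025
BATCH_PER_1K = 0.00002

def embedding_cost(chars, rate_per_1k):
    return chars / 1000 * rate_per_1k

# Embedding 1 million characters:
print(round(embedding_cost(1_000_000, ONLINE_PER_1K), 6))  # ~ $0.025 online
print(round(embedding_cost(1_000_000, BATCH_PER_1K), 6))   # ~ $0.02 batch
```

So embedding a million characters costs only a few cents, with batch requests about 20% cheaper than online requests.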

Dynamic embed…

1 week, 2 days ago @ cloud.google.com
Accelerate AI Inference with Google Cloud TPUs and GPUs

That’s why we chose Cloud TPU v5e with MaxText, JAX, and JetStream for our end-to-end AI workflows.

With Google Cloud, we were able to quickly and easily fine-tune Google’s latest Gemma open model on billions of tokens using MaxText and deploy it for inference using JetStream, all on Cloud TPU v5e.

Experience the future of LLM inference with JetStream today.

We are committed to developing and supporting JetStream over the long term on GitHub and through Google Cloud Customer Care.

The MaxDiffusion implementation of the new SDXL-Lightning model achieves 6 images/s on Cloud TPU v5e-4, and throughput scales linearly to 12 images/s on Cloud TPU v5e-8, taking full advantage of the high performan…

1 week, 2 days ago @ cloud.google.com
Gemini in Databases — supercharge database development and management

Yesterday, at Next ‘24, we announced a preview of Gemini in Databases — an AI powered database assistant that simplifies all aspects of database operations including assisted development, performance optimization, fleet management, governance, and migrations.

With Gemini in Databases, companies no longer need to exclusively rely on people with specialized skills and custom resources to manage their databases — Google's proactive AI assistant can help!

In most organizations, developers, database administrators, and platform engineers handle various aspects of database development and management.

With Gemini in Databases, database professionals get all of these capabilities in a single offeri…

1 week, 2 days ago @ cloud.google.com
Natural language support in AlloyDB for building gen AI apps with real-time data

AlloyDB AI's natural language support, with features like end-user access controls and NL2SQL primitives, is now available in technology preview through the downloadable AlloyDB Omni edition.

Introducing natural language support to AlloyDB: AlloyDB AI provides the best of both worlds: accuracy, security, and flexibility.

These include: Intent clarification: Because users are often imprecise with language, AlloyDB AI has mechanisms for interactively clarifying user intent.

How natural language support works: In its easiest-to-use form, AlloyDB AI provides an interface that takes in any natural language question, including ones that are broad or imprecise, and returns either accurate results or f…

1 week, 2 days ago @ cloud.google.com
Introducing ML Productivity Goodput: a metric to measure AI system efficiency

This rapid rise in compute scale is made feasible by larger and more efficient compute clusters.

For large-scale training, the true efficiency of the overall ML system is core to its viability — if left unattended, it can make attaining a certain scale infeasible.

In this blog post, we introduce a new metric, ML Productivity Goodput, to measure this efficiency.

We also present an API that you can integrate into your projects to measure and monitor Goodput, and methods to maximize ML Productivity Goodput.

Introducing ML Productivity Goodput: ML Productivity Goodput is actually composed of three Goodput metrics: Scheduling Goodput, Runtime Goodput, and Program Goodput.
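Treating each component as a fraction of time or compute spent productively, the overall metric can be sketched as their product. Both the multiplicative combination and the sample numbers below are assumptions for illustration, not figures from the post:

```python
# Back-of-the-envelope Goodput sketch (assumed multiplicative combination).

scheduling_goodput = 0.95  # fraction of time the job has resources and runs
runtime_goodput = 0.90     # fraction of runtime not lost to failures/restarts
program_goodput = 0.50     # fraction of hardware FLOPs doing useful work

overall = scheduling_goodput * runtime_goodput * program_goodput
print(overall)  # roughly 0.43: under half the cluster's potential is realized
```

The point of decomposing the metric this way is that each factor bounds the whole: even excellent scheduling and runtime reliability cannot compensate for an inefficient training program.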

1 week, 2 days ago @ cloud.google.com
Gemma on Google Kubernetes Engine DeepDive: New innovations to serve open generative AI models

Gemma models achieve best-in-class performance for their size (Gemma 2B and Gemma 7B) compared to other open models, and are pre-trained and equipped with instruction-tuned variants to enable research and development.

This lets you easily deploy models from your preferred repositories to the infrastructure of your choice.

Earlier today we published a performance deepdive of Gemma on Google Cloud AI optimized infrastructure for training and serving generative AI workloads.

Hugging Face launched the Gemma model card, allowing you to deploy Gemma from Hugging Face directly to Google Cloud.

Once you click the Google Cloud option, you will be redirected to Vertex Model Garden, where you can choo…

1 week, 2 days ago @ cloud.google.com
OpenAI
last post 6 days, 5 hours ago
Introducing OpenAI Japan

Our new local presence also gets us closer to leading businesses like Daikin, Rakuten, and TOYOTA Connected who are using ChatGPT Enterprise to automate complex business processes, assist in data analysis, and optimize internal reporting.

ChatGPT also helps accelerate the efforts of local governments, such as Yokosuka City, which is leveraging the technology to improve the efficiency of public services in Japan.

Over the past year, the city has gradually provided ChatGPT access to almost all city employees, and 80% have reported increases in productivity.

Now Yokosuka City has formed a network with 21 local governments—including the Tokyo Metropolitan Government and the City of Kobe—to …

6 days, 5 hours ago @ openai.com
Introducing improvements to the fine-tuning API and expanding our custom models program

Assisted Fine-TuningAt DevDay last November, we announced a Custom Model program designed to train and optimize models for a specific domain, in partnership with a dedicated group of OpenAI researchers.

Since then, we've met with dozens of customers to assess their custom model needs and evolved our program to further maximize performance.

Today, we are formally announcing our assisted fine-tuning offering as part of the Custom Model program.

Fully custom-trained models imbue new knowledge from a specific domain by modifying key steps of the model training process using novel mid-training and post-training techniques.

Our team modified every step of the model training process, from domain-s…

2 weeks, 2 days ago @ openai.com
Start using ChatGPT instantly
Start using ChatGPT instantly

We’ve also introduced additional content safeguards for this experience, such as blocking prompts and generations in a wider range of categories.

There are many benefits to creating an account including the ability to save and review your chat history, share chats, and unlock additional features like voice conversations and custom instructions.

For anyone who has been curious about AI’s potential but didn’t want to go through the steps of setting up an account, start using ChatGPT today.

2 weeks, 5 days ago @ openai.com
Navigating the Challenges and Opportunities of Synthetic Voices
Navigating the Challenges and Opportunities of Synthetic Voices

We recognize that generating speech that resembles people's voices has serious risks, which are especially top of mind in an election year.

We are engaging with U.S. and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build. The partners testing Voice Engine today have agreed to our usage policies, which prohibit the impersonation of another individual or organization without consent or legal right.

In addition, our terms with these partners require explicit and informed consent from the original speaker and we don’t allow developers to build ways for individual users to create the…

3 weeks, 1 day ago @ openai.com
Sora: First Impressions
Sora: First Impressions

Starting his career at DreamWorks Animation, Don Allen III is a multidisciplinary creator, speaker and consultant who collaborates with major tech and entertainment companies on mixed reality, virtual reality and AI applications.

“For a long time I've been making augmented reality hybrid creatures that I think would be fun combinations in my head.

Now I have a much easier way of prototyping the ideas before I fully build out the 3-D characters to place in spatial computers.” Don cites Sora’s “weirdness” as its greatest strength: “It’s not bound by traditional laws of physics or conventions of thought.” He says that working with Sora shifted his focus from “technical hurdle…

3 weeks, 5 days ago @ openai.com
Global news partnerships: Le Monde and Prisa Media
Global news partnerships: Le Monde and Prisa Media

Echoing this sentiment, Louis Dreyfus, CEO of Le Monde, stated, "At the moment we are celebrating the 80th anniversary of Le Monde, this partnership with OpenAI allows us to expand our reach and uphold our commitment to providing accurate, verified, balanced news stories at scale.

Collaborating with OpenAI ensures that our authoritative content can be accessed and appreciated by a broader, more diverse audience. Every shift in the media landscape has presented Le Monde with new opportunities.

From the transition to digital platforms to embracing the era of free media, Le Monde has consistently seized these moments to underscore its commitment to independence, expertise, and journalistic i…

1 month, 1 week ago @ openai.com
Review completed & Altman, Brockman to continue to lead OpenAI
Review completed & Altman, Brockman to continue to lead OpenAI

The Special Committee of the OpenAI Board today announced the completion of the review by WilmerHale.

The firm conducted dozens of interviews with members of OpenAI’s prior Board, OpenAI executives, advisors to the prior Board, and other pertinent witnesses; reviewed more than 30,000 documents; and evaluated various corporate actions.

“We have unanimously concluded that Sam and Greg are the right leaders for OpenAI,” stated Bret Taylor, Chair of the OpenAI Board.

The Special Committee acknowledged the important work done by WilmerHale in conducting this extensive review and thanked OpenAI current and former Board members, advisors and employees for their cooperation.

The Special Commi…

1 month, 1 week ago @ openai.com
OpenAI announces new members to board of directors
OpenAI announces new members to board of directors

Additionally, Sam Altman, CEO, will rejoin the OpenAI Board of Directors. Sue, Nicole and Fidji have experience in leading global organizations and navigating complex regulatory environments, including backgrounds in technology, nonprofit and board governance.

They will work closely with current board members Adam D’Angelo, Larry Summers and Bret Taylor as well as Sam and OpenAI’s senior management. Bret Taylor, Chair of the OpenAI board, stated, “I am excited to welcome Sue, Nicole, and Fidji to the OpenAI Board of Directors.

She also served as President of Sony Entertainment, Inc., and simultaneously served as President of Sony Corporation of America.

She also serves as a member of …

1 month, 1 week ago @ openai.com
OpenAI and Elon Musk
OpenAI and Elon Musk

Date: January 31, 2018 at 11:54:30 PM PST
Subject: Re: Top AI institutions today

Working at the cutting edge of AI is unfortunately expensive.

For example, in addition to DeepMind, Google also has Google Brain, Research, and Cloud.

If historical trends are any indication, progress in AI is primarily driven by systems - compute, data, infrastructure.

Not only that, but any algorithmic advances published in a paper somewhere can be almost immediately re-implemented and incorporated.

The “second stage” would be a full self driving solution based on large-scale neural network training, which OpenAI expertise could significantly help accelerate.

1 month, 2 weeks ago @ openai.com
Video generation models as world simulators
Video generation models as world simulators

This technical report focuses on (1) our method for turning visual data of all types into a unified representation that enables large-scale training of generative models, and (2) qualitative evaluation of Sora’s capabilities and limitations.

Model and implementation details are not included in this report.

Much prior work has studied generative modeling of video data using a variety of methods, including recurrent networks,[^1][^2][^3] generative adversarial networks,[^4][^5][^6][^7] autoregressive transformers,[^8][^9] and diffusion models.[^10][^11][^12]

These works often focus on a narrow category of visual data, on shorter videos, or on videos of a fixed size.

Sora is a generalist mo…

2 months ago @ openai.com
Disrupting malicious uses of AI by state-affiliated threat actors
Disrupting malicious uses of AI by state-affiliated threat actors

Based on collaboration and information sharing with Microsoft, we disrupted five state-affiliated malicious actors: two China-affiliated threat actors known as Charcoal Typhoon and Salmon Typhoon; the Iran-affiliated threat actor known as Crimson Sandstorm; the North Korea-affiliated actor known as Emerald Sleet; and the Russia-affiliated actor known as Forest Blizzard.

The identified OpenAI accounts associated with these actors were terminated.

Salmon Typhoon used our services to translate technical papers, retrieve publicly available information on multiple intelligence agencies and regional threat actors, assist with coding, and research common ways processes could be hidden on a system.…

2 months ago @ openai.com
Memory and new controls for ChatGPT
Memory and new controls for ChatGPT

We’re testing memory with ChatGPT.

Remembering things you discuss across all chats saves you from having to repeat information and makes future conversations more helpful.

You're in control of ChatGPT's memory.

You can explicitly tell it to remember something, ask it what it remembers, and tell it to forget conversationally or through settings.

We are rolling out to a small portion of ChatGPT free and Plus users this week to learn how useful it is.

2 months, 1 week ago @ openai.com
Building an early warning system for LLM-aided biological threat creation
Building an early warning system for LLM-aided biological threat creation

We believe that these efforts would benefit from broader input, and that methods-sharing could also be of value to the AI risk research community.

To this end, we are presenting some of our early work—today, focused on biological risk.

This evaluation aims to measure whether models could meaningfully increase malicious actors’ access to dangerous information about biological threat creation, compared to the baseline of existing resources (i.e., the internet).

Each participant was then asked to complete a set of tasks covering aspects of the end-to-end process for biological threat creation.

We also discuss the limitations of statistical significance as an effective method of measuring m…

2 months, 2 weeks ago @ openai.com
New embedding models and API updates
New embedding models and API updates

We are launching a new generation of embedding models, new GPT-4 Turbo and moderation models, new API usage management tools, and soon, lower pricing on GPT-3.5 Turbo.
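Whatever embedding model is used, downstream comparison usually reduces to cosine similarity between the returned vectors. A minimal, API-independent sketch with toy hand-made vectors (real embeddings have hundreds to thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings"; v2 points in the same direction as v1,
# so their similarity is exactly 1.0, while v3 points elsewhere.
v1 = [1.0, 2.0, 0.0, 1.0]
v2 = [2.0, 4.0, 0.0, 2.0]
v3 = [-1.0, 0.0, 3.0, 0.0]
print(round(cosine_similarity(v1, v2), 3))  # → 1.0
print(cosine_similarity(v1, v3) < 0.5)      # → True
```

Ranking documents by this score against a query embedding is the core of most retrieval pipelines built on such models.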

2 months, 3 weeks ago @ openai.com
Democratic inputs to AI grant program: lessons learned and implementation plans
Democratic inputs to AI grant program: lessons learned and implementation plans

We funded 10 teams from around the world to design ideas and tools to collectively govern AI. We summarize the innovations, outline our learnings, and call for researchers and engineers to join us as we continue this work.

3 months ago @ openai.com
Microsoft
last post 1 day, 20 hours ago
SAMMO: A general-purpose framework for prompt optimization
SAMMO: A general-purpose framework for prompt optimization

New generations of language models like GPT-4 and Mixtral 8x7B advance the capability to process long input texts.

The first structure, the task description, remains static and independent of the input under conventional prompt optimization techniques.

Despite previous efforts in prompt optimization, the evolution towards more complex prompt structures has rendered many older strategies ineffective in this new context.

SAMMO: A prompt optimization approach

To address these challenges, we developed the Structure-Aware Multi-objective Metaprompt Optimization (SAMMO) framework.

We compared it against Automatic Prompt Optimization (APO) and GrIPS, applying open-source mode…

1 day, 20 hours ago @ microsoft.com
Research Focus: Week of April 15, 2024
Research Focus: Week of April 15, 2024

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

NEW RESEARCH: Appropriate reliance on Generative AI: Research synthesis

Appropriate reliance on AI happens when people accept correct AI outputs and reject incorrect ones.

They also develop AFRICOMET: COMET evaluation metrics for African languages by leveraging DA data from well-resourced languages and an African-centric multilingual encoder (AfroXLMR) to create state-of-the-…

2 days, 20 hours ago @ microsoft.com
Microsoft at NSDI 2024: Discoveries and implementations in networked systems
Microsoft at NSDI 2024: Discoveries and implementations in networked systems

Networked systems and their applications are essential in building the reliable, scalable, secure, and innovative infrastructure required to meet society’s evolving needs.

Microsoft is honored to support NSDI ‘24 as a returning sponsor.

We are pleased to announce that 23 papers from Microsoft researchers and their partners have been accepted to the conference.

They encompass both early-stage research and systems already deployed in production.

3 days, 20 hours ago @ microsoft.com
Abstracts: April 16, 2024
Abstracts: April 16, 2024

GRETCHEN HUIZINGA: Welcome to Abstracts, a Microsoft Research Podcast that puts the spotlight on world-class research in brief.

CHAKRABORTY: So satellite connectivity is nothing new and has been around for a long time.

So we are talking about the satellites that are at least 10 to 20 times cheaper and smaller than state-of-the-art satellites.

So the device sends some packet to the satellite; satellite sends some packet to the device—it’s all about packet exchange.

So our vision is clear: to bring 24-7 connectivity for devices anywhere on Earth with just the click of a power button.

3 days, 23 hours ago @ microsoft.com
Ideas: Language technologies for everyone with Kalika Bali
Ideas: Language technologies for everyone with Kalika Bali

The new series “Ideas” debuts with guest Kalika Bali. The speech and language tech researcher talks sci-fi and its impact on her career, the design thinking philosophy behind her research, and the “outrageous idea” she had to work with low-resource languages. The post Ideas: Language technologies for everyone with Kalika Bali appeared first on Microsoft Research.

1 week, 1 day ago @ microsoft.com
Research Focus: Week of April 1, 2024
Research Focus: Week of April 1, 2024

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

Existing work on tool-augmented LLMs primarily focuses on the broad coverage of tools and the flexibility of adding new tools.

However, much of this research has been confined to English, leaving LLM building and evaluation for non-English languages relatively unexplored.

NEW RESEARCH: Training Audio Captioning Models without Audio

Automated Audio Captioning (AAC) is a process that creates text descriptions for audio recordings.

Their approach leverages CLAP, a contrastive learning model that uses audio an…

2 weeks, 2 days ago @ microsoft.com
AI Frontiers: Rethinking intelligence with Ashley Llorens and Ida Momennejad
AI Frontiers: Rethinking intelligence with Ashley Llorens and Ida Momennejad

And so I just want to start here: for you, Ida, what is general intelligence?

Different people at different times provide different criteria for what would be the artificial general intelligence notion.

One is artificial general intelligence and the other is humanlike intelligence or human-level intelligence.

Artificial general intelligence and humanlike, human-level intelligence—how do these two concepts relate to you?

LLORENS: So it sounds like a very extensive set of experiments across many different tasks and with many different leading AI models, and you’ve uncovered a lack of robustness across some of these different tasks.

3 weeks, 1 day ago @ microsoft.com
Learning from interaction with Microsoft Copilot (web)
Learning from interaction with Microsoft Copilot (web)

AI systems like Bing and Microsoft Copilot (web) are as good as they are because they continuously learn and improve from people’s interactions.

[1]How are people using Copilot (web)?

Conversely, they express explicit frustration or switch topics when encountering mistakes in the response from Copilot (web).

These three reports present a comprehensive and multi-faceted approach to dynamically learning from conversation logs in Copilot (web) at scale.

[1] The research was performed only on fully de-identified interaction data from Copilot (web) consumers.

3 weeks, 2 days ago @ microsoft.com
Abstracts: March 21, 2024
Abstracts: March 21, 2024

GRETCHEN HUIZINGA: Welcome to Abstracts, a Microsoft Research Podcast that puts the spotlight on world-class research in brief.

These two examples are also the differences from other deep learning OFDFT works.

This is the generalization challenge and is one of the major challenges of deep learning methods for molecular science applications.

This somehow shows the benefits of leveraging the OFDFT framework for using a deep learning method to solve molecular tasks.

You can also read it on arXiv, or you can check out the March 2024 issue of Nature Computational Science.

4 weeks, 1 day ago @ microsoft.com
Research Focus: Week of March 18, 2024
Research Focus: Week of March 18, 2024

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

NEW RESEARCH: Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning

Large language models (LLMs) have shown impressive capabilities, yet they still struggle with math reasoning.

Given that adding more concise CoT examples in the prompt can improve LLM reasoning performance, CoT-Influx employs a coarse-to-fine pruner to maximize the input of effective and concise CoT examples.

The pruner first selects as many crucial CoT examples as possible and then prunes unimportant tokens to fit the cont…
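A toy sketch of the coarse-to-fine idea described above: keep whole high-scoring CoT examples first, then truncate, rather than drop, the next example to fill the token budget. The scores, the whitespace "tokenizer", and the pruning policy are illustrative assumptions, not the CoT-Influx implementation:

```python
def coarse_to_fine_prune(examples, scores, budget):
    """Two-stage pruning sketch. Coarse: keep whole examples in descending
    score order while they fit the token budget. Fine: instead of dropping
    the first example that no longer fits, keep a token-truncated prefix
    of it to use up the remaining budget."""
    ranked = [ex for _, ex in sorted(zip(scores, examples), reverse=True)]
    kept, used = [], 0
    for ex in ranked:
        tokens = ex.split()  # toy whitespace "tokenizer"
        if used + len(tokens) <= budget:
            kept.append(" ".join(tokens))
            used += len(tokens)
        elif used < budget:
            kept.append(" ".join(tokens[: budget - used]))
            used = budget
    return kept

examples = [
    "Q: 2+2? A: 2 plus 2 equals 4.",
    "Q: 3*3? A: 3 times 3 equals 9.",
    "Q: 10-4? A: 10 minus 4 equals 6.",
]
# Highest-scoring example kept whole; next-best truncated to fit 12 tokens.
print(coarse_to_fine_prune(examples, scores=[0.9, 0.5, 0.7], budget=12))
```

The paper's pruner learns the example and token importance with reinforcement learning rather than taking scores as given.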

1 month ago @ microsoft.com
Intelligent monitoring: Towards AI-assisted monitoring for cloud services
Intelligent monitoring: Towards AI-assisted monitoring for cloud services

Microsoft cloud monitor platforms

When they are properly configured, cloud monitors can help to meet monitoring requirements.

Notably, missing monitors and alerts constituted over 40 percent of all misdetections, indicating the complexity of determining what to monitor in cloud services.

Data-driven intelligent monitoring: Organizing monitor data

Because there is no standardized approach to building monitors, monitor data often lacks structure.

In our paper, “Intelligent Monitoring Framework for Cloud Services: A Data-Driven Approach,” to be presented at ICSE 2024, we propose a data-driven approach for developing this ontology.

This shows us that we can pre…

1 month ago @ microsoft.com
Introducing Garnet – an open-source, next-generation, faster cache-store for accelerating applications and services
Introducing Garnet – an open-source, next-generation, faster cache-store for accelerating applications and services

Researchers at Microsoft have been working for nearly a decade to address the increasing demand for data storage mechanisms to support the rapid advances in interactive web applications and services.

This has fueled a growing cache-store industry, including many open-source systems, such as Redis, Memcached, KeyDB, and Dragonfly.

Garnet demonstrates better client latency at the 99th and 99.9th percentiles, which is critical to real-world scenarios.

We compare Garnet to the latest open-source versions of Redis (v7.2), KeyDB (v6.3.4), and Dragonfly (v6.2.11).

We see in Figure 4 that Garnet’s latency is low across the board.
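Tail percentiles such as p99 and p99.9 are straightforward to compute from raw measurements; a sketch using the nearest-rank method, with toy latencies rather than Garnet's benchmark tooling:

```python
import math

def percentile_nearest_rank(samples, p):
    """Nearest-rank percentile: the value at 1-based rank ceil(p * n / 100)
    of the sorted samples, for p in (0, 100]."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

# 100 toy request latencies in milliseconds: 99 fast, 1 slow outlier.
latencies = [1.0] * 99 + [50.0]
print(percentile_nearest_rank(latencies, 50))   # → 1.0
print(percentile_nearest_rank(latencies, 99))   # → 1.0
print(percentile_nearest_rank(latencies, 100))  # → 50.0 (the outlier)
```

Tail percentiles matter because the average hides exactly the slow requests that users notice, which is why cache-store comparisons report p99 and p99.9 rather than the mean.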

1 month ago @ microsoft.com
Exploring how context, culture, and character matter in avatar research
Exploring how context, culture, and character matter in avatar research

In our paper, “Ecological Validity and the Evaluation of Avatar Facial Animation Noise,” presented at ANIVAE 2024, we explore the challenge of evaluating avatar noise without a standardized approach.

Traditional methods, which present participants with isolated facial animation noise to gauge perception thresholds, fall short of reflecting real-life avatar interactions.

Our approach emphasizes ecological validity—the extent to which experiments mimic real-world conditions—as central in assessing avatar noise.

Isolated clips, on the other hand, led to greater annoyance with facial animation noise, suggesting the importance of social context over hyper-realistic animation.

This indicates that…

1 month ago @ microsoft.com
Scaling early detection of esophageal cancer with AI
Scaling early detection of esophageal cancer with AI

Microsoft Research and Cyted have collaborated to build novel AI models to scale the early detection of esophageal cancer.

Fewer than 1 in 5 patients survive five years after diagnosis, making early detection of this disease critical to improving a patient’s chances.

However, early detection of BE has typically involved an endoscopic biopsy, a procedure that many people find uncomfortable and invasive.

As we move forward, the scalability of this technology holds the promise for widespread adoption of early detection in the fight against esophageal cancer.

As researchers, it has been exciting to work closely with Cyted and be part of the long path towards early detection of…

1 month, 1 week ago @ microsoft.com
Improving LLM understanding of structured data and exploring advanced prompting methods
Improving LLM understanding of structured data and exploring advanced prompting methods

Our paper, “Table Meets LLM: Can Large Language Models Understand Structured Table Data?”

Despite the simplicity of the benchmark tasks, the highest overall accuracy across seven tasks is only 65.43 percent.

Choosing the right combination of input designs can significantly enhance LLMs’ understanding of structured data.

We suggest future research should prioritize the integration of structural information to improve performance with various structured data types.

Additionally, we propose exploring LLMs’ ability to use external tools or agents for improved handling of structured data, opening new avenues for application.
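One concrete "input design" choice in this sense is how a table is serialized before being placed in a prompt. A minimal sketch of one such serialization (markdown), with made-up column names and rows:

```python
def table_to_markdown(headers, rows):
    """Serialize a table as a markdown block suitable for an LLM prompt."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(v) for v in row) + " |")
    return "\n".join(lines)

headers = ["city", "population"]
rows = [["Paris", 2100000], ["Lyon", 520000]]
print(table_to_markdown(headers, rows))
```

Alternative serializations (CSV, JSON, HTML) of the same table are other points in the input-design space the benchmark compares.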

1 month, 1 week ago @ microsoft.com
MIT AI
last post 1 day, 8 hours ago
To build a better AI helper, start by modeling the irrational behavior of humans
To build a better AI helper, start by modeling the irrational behavior of humans

To build AI systems that can collaborate effectively with humans, it helps to have a good model of human behavior to start with.

Ultimately, this work could help scientists teach AI systems how humans behave, which could enable these systems to respond better to their human collaborators.

“Being able to model human behavior is an important step toward building an AI agent that can actually help that human,” he says.

Modeling behaviorResearchers have been building computational models of human behavior for decades.

Moreover, the researchers saw that their model of human behavior matched up well with measures of player skill (in chess matches) and task difficulty.

1 day, 8 hours ago @ news.mit.edu
Using deep learning to image the Earth’s planetary boundary layer
Using deep learning to image the Earth’s planetary boundary layer

Although the troposphere is often thought of as the closest layer of the atmosphere to the Earth’s surface, the planetary boundary layer (PBL) — the lowest layer of the troposphere — is actually the part that most significantly influences weather near the surface.

This PBL-focused research effort builds on more than a decade of related work on fast, operational neural network algorithms developed by Lincoln Laboratory for NASA missions.

According to a Global Drought Snapshot report released last year, droughts are a pressing planetary issue that the global community needs to address.

Lack of humidity near the surface, specifically at the level of the PBL, is the leading indicator of drought…

1 day, 17 hours ago @ news.mit.edu
Advancing technology for aquaculture
Advancing technology for aquaculture

Like land-based farming, shellfish aquaculture requires healthy seed production in order to maintain a sustainable industry.

Aquaculture hatchery production of shellfish larvae — seeds — requires close monitoring to track mortality rates and assess health from the earliest stages of life.

Careful observation is necessary to inform production scheduling, determine effects of naturally occurring harmful bacteria, and ensure sustainable seed production.

ARC faces challenges with manually quantifying larvae classes, an important step in their seed production process.

Developing an automated identification and counting system will help to improve this step in the production process with time and…

1 day, 17 hours ago @ news.mit.edu
3 Questions: Enhancing last-mile logistics with machine learning
3 Questions: Enhancing last-mile logistics with machine learning

Q: What is the vehicle routing problem, and how do traditional operations research (OR) methods address it?

A: The vehicle routing problem is faced by pretty much every logistics and delivery company, like USPS, Amazon, UPS, FedEx, and DHL, every single day.

To solve the vehicle routing problem, we obviously can't do our modeling without proper demand information and, ideally, customer-related characteristics.

Then there are a bunch of other equations that define the inner workings of a routing problem.

Q: You’re currently applying machine learning to the vehicle routing problem.
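As a baseline for the OR side of the discussion, the simplest construction heuristic for a one-vehicle routing instance can be sketched as follows (coordinates are made up; real OR solvers add capacities, time windows, and many other constraints):

```python
import math

def nearest_neighbor_route(depot, customers):
    """Greedy construction heuristic: from the depot, repeatedly visit the
    closest unvisited customer, then return to the depot."""
    route, pos = [depot], depot
    remaining = list(customers)
    while remaining:
        nxt = min(remaining, key=lambda c: math.dist(pos, c))
        remaining.remove(nxt)
        route.append(nxt)
        pos = nxt
    route.append(depot)
    return route

def route_length(route):
    """Total Euclidean length of a closed route."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))

depot = (0.0, 0.0)
customers = [(0.0, 2.0), (0.0, 1.0), (3.0, 0.0)]
route = nearest_neighbor_route(depot, customers)
print(route)  # visits (0,1) first, then (0,2), then (3,0), back to depot
print(route_length(route))
```

Machine learning enters, as the interview goes on to discuss, when demand and customer characteristics must be predicted rather than assumed known before routes like this are built.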

3 days, 17 hours ago @ news.mit.edu
A crossroads for computing at MIT
A crossroads for computing at MIT

On Vassar Street, in the heart of MIT’s campus, the MIT Stephen A. Schwarzman College of Computing recently opened the doors to its new headquarters in Building 45.

“The college has a broad mandate for computing across MIT,” says Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing and the Henry Ellis Warren Professor of Electrical Engineering and Computer Science.

The building will also accommodate 50 computing research groups, which correspond to the number of new faculty the college is hiring — 25 in core computing positions and 25 in shared positions with departments at MIT.

Organized by various MIT faculty, the 12 sessions in the series delved into exciting areas of com…

1 week, 1 day ago @ news.mit.edu
New AI method captures uncertainty in medical images
New AI method captures uncertainty in medical images

In biomedicine, segmentation involves annotating pixels from an important structure in a medical image, like an organ or cell.

However, these models typically only provide one answer, while the problem of medical image segmentation is often far from black and white.

Rakic is lead author of a paper with others at MIT, the Broad Institute of MIT and Harvard, and Massachusetts General Hospital that introduces a new AI tool that can capture the uncertainty in a medical image.

Addressing ambiguityAI systems for medical image segmentation typically use neural networks.

The researchers also developed a version of Tyche that can be used with an existing, pretrained model for medical image segmentat…

1 week, 1 day ago @ news.mit.edu
A faster, better way to prevent an AI chatbot from giving toxic responses
A faster, better way to prevent an AI chatbot from giving toxic responses

To prevent this and other safety issues, companies that build large language models typically safeguard them using a process called red-teaming.

But this only works effectively if engineers know which toxic prompts to use.

The technique outperformed human testers and other machine-learning approaches by generating more distinct prompts that elicited increasingly toxic responses.

This trial-and-error process rewards the red-team model for generating prompts that trigger toxic responses from the chatbot being tested.

“If you are releasing a new AI model and are concerned about whether it will behave as expected, consider using curiosity-driven red-teaming,” says Agrawal.
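The reward idea can be sketched abstractly: score a candidate prompt by the toxicity it elicits plus a bonus for novelty relative to prompts already found. Both scoring functions below are toy stand-ins (a Jaccard-style novelty measure and an externally supplied toxicity score), not the models used in the paper:

```python
def novelty(prompt, seen):
    """1.0 for a brand-new prompt, shrinking toward 0.0 as its word
    overlap (Jaccard similarity) with any previously found prompt grows."""
    words = set(prompt.split())
    if not seen:
        return 1.0
    overlap = max(
        len(words & set(s.split())) / len(words | set(s.split()))
        for s in seen
    )
    return 1.0 - overlap

def red_team_reward(prompt, toxicity_score, seen, novelty_weight=0.5):
    """Curiosity-style reward: elicited toxicity plus a novelty bonus."""
    return toxicity_score + novelty_weight * novelty(prompt, seen)

seen = ["tell me something mean"]
# A verbatim repeat earns no novelty bonus; a fresh prompt that elicits
# equal "toxicity" is rewarded more, pushing the generator to diversify.
print(red_team_reward("tell me something mean", 0.8, seen))  # → 0.8
print(red_team_reward("insult my neighbor", 0.8, seen))
```

This is why the method surfaces more *distinct* failure-inducing prompts: pure toxicity maximization collapses onto a few known attacks, while the novelty term keeps exploration going.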

1 week, 3 days ago @ news.mit.edu
Extracting hydrogen from rocks
Extracting hydrogen from rocks

Geologic hydrogen, as it’s known, is produced when water reacts with iron-rich rocks, causing the iron to oxidize.

Abate is looking to jump-start the natural hydrogen production process, implementing “proactive” approaches that involve stimulating production and harvesting the gas.

In December, French President Emmanuel Macron said his government would provide funding to explore natural hydrogen.

The projects themselves are diverse, ranging from applying industrial oil and gas methods for hydrogen production and extraction to developing models to understand hydrogen formation in rocks.

The lab-scale device will also inform the design of a real-world reactor that can accelerate hydrogen prod…

1 week, 4 days ago @ news.mit.edu
When an antibiotic fails: MIT scientists are using AI to target “sleeper” bacteria
When an antibiotic fails: MIT scientists are using AI to target “sleeper” bacteria

When an infection is treated repeatedly, clinicians run the risk of bacteria becoming resistant to the antibiotics.

One well-documented possibility is that the bacteria are becoming metabolically inert, escaping detection by traditional antibiotics that only respond to metabolic activity.

Another revelation was semapimod's ability to disrupt the membranes of so-called “Gram-negative” bacteria, which are known for their high intrinsic resistance to antibiotics due to their thicker, less-penetrable outer membrane.

Examples of Gram-negative bacteria include E. coli, A. baumannii, Salmonella, and Pseudomonas, all of which are challenging to find new antibiotics for.

Valeri recalls a quote from …

1 week, 4 days ago @ news.mit.edu
A new computational technique could make it easier to engineer useful proteins
A new computational technique could make it easier to engineer useful proteins

This process has yielded optimized versions of many important proteins, including green fluorescent protein (GFP).

MIT researchers have now developed a computational approach that makes it easier to predict mutations that will lead to better proteins, based on a relatively small amount of data.

These landscapes contain peaks that represent fitter proteins and valleys that represent less fit proteins.

To overcome this problem, the researchers used an existing computational technique to “smooth” the fitness landscape.

The researchers now plan to use this computational technique on data that Bracha has been generating on voltage indicator proteins.
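The effect of smoothing a fitness landscape can be pictured in one dimension: a moving average damps narrow spurious spikes so that search is drawn toward broad peaks. This is an illustration only; the paper's technique operates on protein sequence space, not a 1-D grid:

```python
def smooth(landscape, window=3):
    """Moving average: replace each point with the mean of its
    neighborhood, damping narrow spikes while preserving broad peaks."""
    half = window // 2
    out = []
    for i in range(len(landscape)):
        lo, hi = max(0, i - half), min(len(landscape), i + half + 1)
        out.append(sum(landscape[lo:hi]) / (hi - lo))
    return out

# A narrow spurious spike at index 1 and a broad hill centered at index 5.
noisy = [0, 3, 0, 1, 2, 3, 2, 1, 0]
smoothed = smooth(noisy)
print(max(range(len(noisy)), key=noisy.__getitem__))        # → 1 (the spike)
print(max(range(len(smoothed)), key=smoothed.__getitem__))  # → 5 (the hill)
```

On the raw landscape, greedy search is misled by the isolated spike; after smoothing, the broad hill dominates, which is the intuition behind regularizing a fitness landscape before predicting beneficial mutations.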

2 weeks, 3 days назад @ news.mit.edu
Most work is new work, long-term study of U.S. census data shows

In 1900, Orville and Wilbur Wright listed their occupations as “Merchant, bicycle” on the U.S. census form.

Most work in the U.S. is new work, as U.S. census forms reveal.

Finally, the study brings novel data to a tricky question: To what extent does technology create new jobs, and to what extent does it replace jobs?

“In the first period, from 1940 to 1980, there’s a lot of work being created for people without college degrees, a lot of clerical work and production work, middle-skill work.

Support for the research was provided, in part, by the Carnegie Corporation; Google; Instituut Gak; the MIT Work of the Future Task Force; Schmidt Futures; the Smith Richardson Foundation; and the Washin…

2 weeks, 5 days ago @ news.mit.edu
Does technology help or hurt employment?

Overall, does technology replace more jobs than it creates?

On net, the study finds, and particularly since 1980, technology has replaced more U.S. jobs than it has generated.

That has allowed them, for the first time, to quantify the effects of technology on both job loss and job creation.

“There’s no law these things have to be one-for-one balanced, although there’s been no period where we haven’t also created new work,” Autor observes.

“The missing link was documenting and quantifying how much technology augments people’s jobs,” Autor says.

2 weeks, 5 days ago @ news.mit.edu
Researchers create “The Consensus Game” to elevate AI’s text comprehension and generation skills

“Imagine a new way to help language models understand and generate text, like a game.

The math behind this partially inspired The Consensus Game.

The Consensus Game system reaches equilibrium as an agreement, ensuring accuracy and fidelity to the model's original insights.
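The equilibrium-as-agreement idea can be sketched with a toy iteration; the simple averaging dynamics below are an illustrative stand-in for the no-regret learning used to solve the actual game.

```python
import numpy as np

# Two "players" (think: a generative scorer and a discriminative scorer)
# start with different distributions over the same candidate answers.
# Repeatedly moving each distribution toward the other converges to a
# shared consensus distribution -- the agreement reached at equilibrium.
def consensus(p, q, steps=50, lr=0.3):
    for _ in range(steps):
        p, q = p + lr * (q - p), q + lr * (p - q)  # old p, q on the right
        p, q = p / p.sum(), q / q.sum()
    return p, q

gen = np.array([0.7, 0.2, 0.1])  # generator's ranking of 3 answers
dis = np.array([0.3, 0.5, 0.2])  # discriminator's ranking
p, q = consensus(gen, dis)
```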

In practice, implementing the Consensus Game approach to language model querying, especially for question-answering tasks, does involve significant computational challenges.

“The proposal by the MIT researchers is an innovative game-theoretic framework for decoding from language models through solving the equilibrium of a consensus game.

3 weeks ago @ news.mit.edu
MIT launches Working Group on Generative AI and the Work of the Future

The generative AI wave has elicited excitement, anxiety, and plenty of speculation about what's to come, but no clear answers to these core questions.

To discover how generative AI can lead to better jobs, MIT is convening a working group on Generative AI and the Work of the Future.

The group is gathering original data on how teams are using generative AI tools — and the impact these tools are having on workers.

The working group also hosted a workshop on Feb. 29 highlighting responsible AI practices, in partnership with MIT’s Industrial Liaison Program.

IBM has joined the working group as part of its broader investments in retraining and job transformation related to generative AI.

3 weeks, 1 day ago @ news.mit.edu
Second round of seed grants awarded to MIT scholars studying the impact and applications of generative AI

In light of this enthusiastic response, Kornbluth and Barnhart announced a second call for proposals this fall.

“The groundswell of interest and the caliber of the ideas overall made clear that a second round was in order,” they said in their email to MIT’s research community this fall.

Following the second call, the faculty committee from the first round considered the proposals and selected 16 proposals to receive exploratory funding.

Each selected research group will receive between $50,000 and $70,000 to create 10-page impact papers.

The selected papers are:

3 weeks, 1 day ago @ news.mit.edu
Berkeley AI
latest post 1 month ago
Modeling Extremely Large Images with xT

As computer vision researchers, we believe that every pixel can tell a story.

However, there seems to be a writer’s block settling into the field when it comes to dealing with large images.

Today, we make one of two sub-optimal choices when handling large images: down-sampling or cropping.

Why bother handling large images anyway?

That’s basically what we do with large images with xT.

1 month ago @ bair.berkeley.edu
Modeling Extremely Large Images with xT

As computer vision researchers, we believe that every pixel can tell a story. However, there seems to be a writer’s block settling into the field when it comes to dealing with large images. Large images are no longer rare—the cameras we carry in our pockets and those orbiting our planet snap pictures so big and detailed that they stretch our current best models and hardware to their breaking points when handling them. Generally, we face a quadratic increase in memory usage as a function of image size.
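The quadratic growth is easy to see for a ViT-style model, where self-attention materializes an n × n matrix over n patch tokens (the patch size and fp16 precision below are assumed for illustration):

```python
def attention_memory_bytes(height, width, patch=16, bytes_per_elem=2):
    """Bytes for one fp16 attention matrix over ViT-style patch tokens."""
    n_tokens = (height // patch) * (width // patch)
    return n_tokens * n_tokens * bytes_per_elem

small = attention_memory_bytes(1024, 1024)  # 4,096 tokens
large = attention_memory_bytes(4096, 4096)  # 65,536 tokens
# 4x the side length -> 16x the tokens -> 256x the attention memory
```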

Today, we make one of two sub-optimal choices when handling large images: down-sampling or cropping. These two methods incur significant losses in the amount of information and context present…

1 month ago @ localhost:4000
2024 BAIR Graduate Directory

Every year, the Berkeley Artificial Intelligence Research (BAIR) Lab graduates some of the most talented and innovative minds in artificial intelligence and machine learning. Our Ph.D. graduates have each expanded the frontiers of AI research and are now ready to embark on new adventures in academia, industry, and beyond.

These fantastic individuals bring with them a wealth of knowledge, fresh ideas, and a drive to continue contributing to the advancement of AI. Their work at BAIR, ranging from deep learning, robotics, and natural language processing to computer vision, security, and much more, has contributed significantly to their fields and has had transformative impacts on society.

This…

1 month, 1 week ago @ localhost:4000
2024 BAIR Graduate Directory

Every year, the Berkeley Artificial Intelligence Research (BAIR) Lab graduates some of the most talented and innovative minds in artificial intelligence and machine learning.

Our Ph.D. graduates have each expanded the frontiers of AI research and are now ready to embark on new adventures in academia, industry, and beyond.

These fantastic individuals bring with them a wealth of knowledge, fresh ideas, and a drive to continue contributing to the advancement of AI.

Join us in celebrating the achievements of BAIR’s latest PhD graduates.

Thank you to our friends at the Stanford AI Lab for this idea!

1 month, 1 week ago @ bair.berkeley.edu
The Shift from Models to Compound AI Systems

AI caught everyone’s attention in 2023 with Large Language Models (LLMs) that can be instructed to perform general tasks, such as translation or coding, just by prompting. This naturally led to an intense focus on models as the primary ingredient in AI application development, with everyone wondering what capabilities new LLMs will bring.

As more developers begin to build using LLMs, however, we believe that this focus is rapidly changing: state-of-the-art AI results are increasingly obtained by compound systems with multiple components, not just monolithic models.

For example, Google’s AlphaCode 2 set state-of-the-art results in programming through a carefully engineered system that uses L…

2 months ago @ localhost:4000
The Shift from Models to Compound AI Systems

In this post, we analyze the trend toward compound AI systems and what it means for AI developers.

We argue that compound AI systems will likely be the best way to maximize AI results in the future, and might be one of the most impactful trends in AI in 2024.

We define a Compound AI System as a system that tackles AI tasks using multiple interacting components, including multiple calls to models, retrievers, or external tools.

Developing Compound AI Systems: While compound AI systems can offer clear benefits, the art of designing, optimizing, and operating them is still emerging.

However, new compound AI systems contain non-differentiable components like search engines or code interpreters, a…

2 months ago @ bair.berkeley.edu
Ghostbuster: Detecting Text Ghostwritten by Large Language Models

The structure of Ghostbuster, our new state-of-the-art method for detecting AI-generated text.

Large language models like ChatGPT write impressively well—so well, in fact, that they’ve become a problem.

Existing tools to detect AI-generated text sometimes do poorly on data that differs from what they were trained on.

Our recent paper introduces Ghostbuster, a state-of-the-art method for detecting AI-generated text.

Many current AI-generated text detection systems are brittle to classifying different types of text (e.g., different writing styles, or different text generation models or prompts).
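Ghostbuster's overall recipe — features computed from per-token probabilities under weaker language models, combined and fed to a simple classifier — can be sketched in miniature. The features and data below are synthetic stand-ins, not the paper's actual feature search.

```python
import numpy as np

rng = np.random.default_rng(1)

def fake_probability_features(ai_written, n_docs=200, n_tokens=30):
    # illustrative assumption: AI text gets higher, more uniform per-token
    # probabilities under a weaker LM than human text does
    mean = 0.8 if ai_written else 0.5
    probs = rng.normal(mean, 0.1, size=(n_docs, n_tokens)).clip(0.01, 0.99)
    return np.column_stack([probs.mean(1), probs.std(1), np.log(probs).mean(1)])

X = np.vstack([fake_probability_features(True), fake_probability_features(False)])
y = np.array([1] * 200 + [0] * 200)

# train a plain logistic-regression classifier on the features
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(3000):
    pred = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.5 * X.T @ (pred - y) / len(y)
    b -= 0.5 * (pred - y).mean()

accuracy = ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
```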

5 months, 1 week ago @ bair.berkeley.edu
Ghostbuster: Detecting Text Ghostwritten by Large Language Models

The structure of Ghostbuster, our new state-of-the-art method for detecting AI-generated text. Large language models like ChatGPT write impressively well—so well, in fact, that they’ve become a problem. Students have begun using these models to ghostwrite assignments, leading some schools to ban ChatGPT. In addition, these models are also prone to producing text with factual errors, so wary readers may want to know if generative AI tools have been used to ghostwrite news articles or other sources before trusting them.

What can teachers and consumers do? Existing tools to detect AI-generated text sometimes do poorly on data that differs from what they were trained on. In addition, if these m…

5 months, 1 week ago @ localhost:4000
Asymmetric Certified Robustness via Feature-Convex Neural Networks

TLDR: We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios.

We argue that these issues can be addressed by refining the certified robustness problem to be more aligned with practical adversarial settings.

The Asymmetric Certified Robustness Problem: Current certifiably robust classifiers produce certificates for inputs belonging to any class.

Feature-convex classifiers: We propose feature-convex neural networks to address the asymmetric robustness problem.

Conclu…

5 months, 1 week ago @ bair.berkeley.edu
Asymmetric Certified Robustness via Feature-Convex Neural Networks

Asymmetric Certified Robustness via Feature-Convex Neural Networks TLDR: We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios. This focused setting allows us to introduce feature-convex classifiers, which produce closed-form and deterministic certified radii on the order of milliseconds. Figure 1. Illustration of feature-convex classifiers and their certification for sensitive-class inputs. This architecture composes a Lipschitz-continuous feature map $\varphi$ with a learned convex function $g$. Since $g$ is convex, it is globally underapproximated by its tangent plane at $\varphi(x)$, y…
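The certificate admits a closed form: since $g$ is convex, $g(y) \ge g(y_0) + \nabla g(y_0) \cdot (y - y_0)$ everywhere, so a positive margin $g(\varphi(x))$ persists within radius $g(\varphi(x)) / (L \, \|\nabla g(\varphi(x))\|)$ when $\varphi$ is $L$-Lipschitz. A toy sketch with a hand-picked $\varphi$ and $g$, not the learned networks:

```python
import numpy as np

def certified_radius(x, g, grad_g, phi, lip_phi):
    """Certified radius for the sensitive class under a feature-convex
    classifier: positive margin divided by the worst-case rate at which
    the tangent-plane lower bound on g can decrease."""
    y = phi(x)
    margin = g(y)
    return margin / (lip_phi * np.linalg.norm(grad_g(y))) if margin > 0 else 0.0

phi = lambda x: x              # identity feature map, Lipschitz constant 1
g = lambda y: y @ y - 1.0      # convex decision function
grad_g = lambda y: 2 * y
r = certified_radius(np.array([2.0, 0.0]), g, grad_g, phi, lip_phi=1.0)
# margin 3, gradient norm 4 -> certified radius 0.75
```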

5 months, 1 week ago @ localhost:4000
Goal Representations for Instruction Following

Goal Representations for Instruction Following A longstanding goal of the field of robot learning has been to create generalist agents that can perform tasks for humans. Natural language has the potential to be an easy-to-use interface for humans to specify arbitrary tasks, but it is difficult to train robots to follow language instructions. Approaches like language-conditioned behavioral cloning (LCBC) train policies to directly imitate expert actions conditioned on language, but require humans to annotate all training trajectories and generalize poorly across scenes and behaviors. Meanwhile, recent goal-conditioned approaches perform much better at general manipulation tasks, but do not e…

6 months ago @ localhost:4000
Goal Representations for Instruction Following

A longstanding goal of the field of robot learning has been to create generalist agents that can perform tasks for humans.

The GRIF model consists of a language encoder, a goal encoder, and a policy network.

Our approach, Goal Representations for Instruction Following (GRIF), jointly trains a language- and a goal- conditioned policy with aligned task representations.

In particular, we exploit this structure by requiring that language- and goal- representations be similar for the same semantic task.
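A minimal sketch of that alignment objective, using an InfoNCE-style contrastive loss over random stand-in embeddings (in GRIF these come from the learned language and goal encoders):

```python
import numpy as np

def info_nce(text_emb, goal_emb, temperature=0.1):
    """Contrastive loss: matched (text, goal) pairs sit on the diagonal
    and should score higher than all mismatched pairs in the batch."""
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    g = goal_emb / np.linalg.norm(goal_emb, axis=1, keepdims=True)
    logits = (t @ g.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diagonal(log_probs).mean()

rng = np.random.default_rng(0)
goal = rng.standard_normal((8, 32))
aligned_loss = info_nce(goal + 0.01 * rng.standard_normal((8, 32)), goal)
random_loss = info_nce(rng.standard_normal((8, 32)), goal)
# aligned representations yield a lower contrastive loss
```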

We train dual image and text encoders by doing contrastiv…

6 months ago @ bair.berkeley.edu
Rethinking the Role of PPO in RLHF

TL;DR: In RLHF, there’s tension between the reward learning phase, which uses human preference in the form of comparisons, and the RL fine-tuning phase, which optimizes a single, non-comparative reward.

By incorporating a new component - pairwise policy gradient, we can unify the reward modeling stage and RL stage, enabling direct updates based on pairwise responses.
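The pairwise idea can be illustrated on a toy bandit (a simplified sketch, not the paper's P3O algorithm): sample two responses, then weight the gradient of their log-probability difference by their reward difference, so only comparative feedback enters the update.

```python
import numpy as np

rng = np.random.default_rng(0)
rewards = np.array([0.1, 0.5, 0.9])  # hidden per-arm rewards
logits = np.zeros(3)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for _ in range(5000):
    p = softmax(logits)
    a, b = rng.choice(3, p=p), rng.choice(3, p=p)  # two sampled "responses"
    # gradient of log p(a) - log p(b) w.r.t. logits: the softmax terms cancel
    grad = np.eye(3)[a] - np.eye(3)[b]
    logits += 0.1 * (rewards[a] - rewards[b]) * grad

policy = softmax(logits)  # should concentrate on the best arm
```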

Proximal Policy Optimization (PPO), the dominant RL optimizer in this process, has been reported to exhibit instability and implementation complications.

Derivation of P3O: Our idea stems from the vanilla policy gradient (VPG).

Under this framework, we develop …

6 months, 1 week ago @ bair.berkeley.edu
Rethinking the Role of PPO in RLHF

Rethinking the Role of PPO in RLHF TL;DR: In RLHF, there’s tension between the reward learning phase, which uses human preference in the form of comparisons, and the RL fine-tuning phase, which optimizes a single, non-comparative reward. What if we performed RL in a comparative way? Figure 1: This diagram illustrates the difference between reinforcement learning from absolute feedback and relative feedback. By incorporating a new component - pairwise policy gradient, we can unify the reward modeling stage and RL stage, enabling direct updates based on pairwise responses. Large Language Models (LLMs) have powered increasingly capable virtual assistants, such as GPT-4, Claude-2, Bard and Bing…

6 months, 1 week ago @ localhost:4000
AWS Machine Learning
latest post 11 hours ago
Introducing automatic training for solutions in Amazon Personalize

Amazon Personalize is excited to announce automatic training for solutions.

Prerequisites: To enable automatic training for your solutions, you first need to set up Amazon Personalize resources.

For more information about tagging Amazon Personalize resources, see Tagging Amazon Personalize resources.

To use automatic training, in the Automatic training section, select Turn on and specify your training frequency.

For more information about optimizing your user experience with Amazon Personalize, see the Amazon Personalize Developer Guide.

11 hours ago @ aws.amazon.com
Use Kubernetes Operators for new inference capabilities in Amazon SageMaker that reduce LLM deployment costs by 50% on average

We are excited to announce a new version of the Amazon SageMaker Operators for Kubernetes using the AWS Controllers for Kubernetes (ACK).

For more details, see Amazon SageMaker adds new inference capabilities to help reduce foundation model deployment costs and latency.

In this post, we show how to use SageMaker ACK Operators to deploy SageMaker inference components.

kubectl apply passes this file, called a manifest, to the Kubernetes API server running on the Kubernetes control plane node.

Conclusion: In this post, we showed how to use SageMaker ACK Operators to deploy SageMaker inference components.

19 hours ago @ aws.amazon.com
Talk to your slide deck using multimodal foundation models hosted on Amazon Bedrock and Amazon SageMaker – Part 2

We used AWS services including Amazon Bedrock, Amazon SageMaker, and Amazon OpenSearch Serverless in this solution.

These descriptions are then converted into text embeddings using the Amazon Titan Text Embeddings model and stored in a vector database.

OpenSearch Serverless is an on-demand serverless configuration for Amazon OpenSearch Service.

Amazon OpenSearch Ingestion (OSI) is a fully managed, serverless data collector that delivers data to OpenSearch Service domains and OpenSearch Serverless collections.

The user input is converted into embeddings using the Amazon Titan Text Embeddings model accessed using Amazon Bedrock.

20 hours ago @ aws.amazon.com
Scale AI training and inference for drug discovery through Amazon EKS and Karpenter

In this post, we focus on how we used Karpenter on Amazon Elastic Kubernetes Service (Amazon EKS) to scale AI training and inference, which are core elements of the Iambic discovery platform.

The Kubernetes Event Driven Autoscaler (KEDA) is configured to automatically scale the number of service pods, based on the custom metrics available in Prometheus.

We deploy Karpenter by following the instructions in the Karpenter documentation, or by running the following script, which automates the deployment instructions.

We then deploy the load generator, which causes KEDA to add more service pods and Karpenter to add more GPU instances.

When we scale the load-generating deployment to 40 replicas, KE…

20 hours ago @ aws.amazon.com
Generate customized, compliant application IaC scripts for AWS Landing Zone using Amazon Bedrock

With Amazon Bedrock, teams can input high-level architectural descriptions and use generative AI to generate a baseline configuration of Terraform scripts.

In this post, we show you how to generate customized, compliant IaC scripts for AWS Landing Zone using Amazon Bedrock.

The following is an example of a customized Terraform-based AWS Landing Zone solution, in which each application resides in its own AWS account.

This function, which is central to the operation, translates these inputs into compliant code, using Amazon Bedrock and Knowledge Bases for Amazon Bedrock.

Allow Amazon Bedrock to create and manage the vector store for you in Amazon OpenSearch Service.

1 day, 18 hours ago @ aws.amazon.com
Live Meeting Assistant with Amazon Transcribe, Amazon Bedrock, and Knowledge Bases for Amazon Bedrock

In this post, we show you how to use LMA with Amazon Transcribe, Amazon Bedrock, and Knowledge Bases for Amazon Bedrock.

It uses Amazon Transcribe for speech to text, Knowledge Bases for Amazon Bedrock for contextual queries against your company’s documents and knowledge sources, and Amazon Bedrock models for customizable transcription insights and summaries.

Browser extension captures audio and meeting metadata from popular meeting apps – The browser extension captures meeting metadata—the meeting title and names of active speakers—and audio from you (your microphone) and others (from the meeting browser tab).

To customize the meeting assist capabilities based on the QnABot on AWS solution…

1 day, 18 hours ago @ aws.amazon.com
Meta Llama 3 models are now available in Amazon SageMaker JumpStart

Today, we are excited to announce that Meta Llama 3 foundation models are available through Amazon SageMaker JumpStart to deploy and run inference.

Discover models: You can access the foundation models through SageMaker JumpStart in the SageMaker Studio UI and the SageMaker Python SDK.

For more details on how to get started and set up SageMaker Studio, refer to Amazon SageMaker Studio.

In SageMaker Studio, you can access SageMaker JumpStart, which contains pre-trained models, notebooks, and prebuilt solutions, under Prebuilt and automated solutions.

The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.

1 day, 19 hours ago @ aws.amazon.com
Slack delivers native and secure generative AI powered by Amazon SageMaker JumpStart

“With Amazon SageMaker JumpStart, Slack can access state-of-the-art foundation models to power Slack AI, while prioritizing security and privacy.

SageMaker JumpStart allows Slack to support high standards for security and data privacy, while also using state-of-the-art models that help Slack AI perform optimally for Slack customers.

Slack AI has access to multi-GPU based instances to host their SageMaker JumpStart models.

Learn more about SageMaker JumpStart, Slack AI and how the Slack team built Slack AI to be secure and private.

About the Authors: Jackie Rocca is VP of Product at Slack, where she oversees the vision and execution of Slack AI, which brings generative AI natively and securely…

2 days ago @ aws.amazon.com
Uncover hidden connections in unstructured financial data with Amazon Bedrock and Amazon Neptune

This post demonstrates a proof of concept built on two key AWS services well suited for graph knowledge representation and natural language processing: Amazon Neptune and Amazon Bedrock.

With generative AI services like Amazon Bedrock, you now have the capability to automate this process.

The workflow consists of the following steps: A user uploads official reports (in PDF format) to an Amazon Simple Storage Service (Amazon S3) bucket.

Filter out noise and irrelevant entities (for example, generic terms such as “consumers”) using Amazon Bedrock.

Content is generated using Amazon Bedrock and is purely fictional.

2 days, 21 hours ago @ aws.amazon.com
Open source observability for AWS Inferentia nodes within Amazon EKS clusters

The pattern is part of the AWS CDK Observability Accelerator, a set of opinionated modules to help you set observability for Amazon EKS clusters.

The open source observability set of patterns instruments observability with Amazon Managed Grafana dashboards, an AWS Distro for OpenTelemetry collector to collect metrics, and Amazon Managed Service for Prometheus to store them.

Bootstrap the AWS CDK environment: The first step to any AWS CDK deployment is bootstrapping the environment.

You use the cdk bootstrap command in the AWS CDK CLI to prepare the environment (a combination of AWS account and AWS Region) with resources required by AWS CDK to perform deployments into that environment.

Metrics…

2 days, 21 hours ago @ aws.amazon.com
Explore data with ease: Use SQL and Text-to-SQL in Amazon SageMaker Studio JupyterLab notebooks

Solution overview: With SageMaker Studio JupyterLab notebook’s SQL integration, you can now connect to popular data sources like Snowflake, Athena, Amazon Redshift, and Amazon DataZone.

The SageMaker Studio JupyterLab built-in SQL extension also enables you to run SQL queries directly from a notebook.

This SageMaker Studio notebook feature requires username and password access to data sources such as Snowflake and Amazon Redshift.

To explore SQL data sources in the left pane of SageMaker Studio, you first need to create AWS Glue connection objects.

Varun Shah is a Software Engineer working on Amazon SageMaker Studio at Amazon Web Services.

3 days, 13 hours ago @ aws.amazon.com
Distributed training and efficient scaling with the Amazon SageMaker Model Parallel and Data Parallel Libraries

Enabling training at such scale remains a challenge in distributed training, and training efficiently in such a large system is another equally important problem.

Over the past years, the distributed training community has introduced 3D parallelism (data parallelism, pipeline parallelism, and tensor parallelism) and other techniques (such as sequence parallelism and expert parallelism) to address such challenges.

In December 2023, Amazon announced the release of the SageMaker model parallel library 2.0 (SMP), which achieves state-of-the-art efficiency in large model training, together with the SageMaker distributed data parallelism library (SMDDP).

The scaling efficiencies are stable across…

3 days, 19 hours ago @ aws.amazon.com
Manage your Amazon Lex bot via AWS CloudFormation templates

Managing your Amazon Lex bots using AWS CloudFormation allows you to create templates defining the bot and all the AWS resources it depends on.

The benefits of using CloudFormation include: Consistency – A CloudFormation template provides a more consistent and automated way to deploy and manage the resources associated with an Amazon Lex bot.


Version control – With AWS CloudFormation, you can use version control systems like Git to manage your CloudFormation templates.

You can use AWS services like AWS CodePipeline and AWS CodeBuild to build, test, an…

3 days, 19 hours ago @ aws.amazon.com
A secure approach to generative AI with AWS

Innovating secure generative AI workloads using AWS industry-leading security capabilities: From day one, AWS AI infrastructure and services have had built-in security and privacy features to give you control over your data.

The Nitro System fulfills the first principle of Secure AI Infrastructure by isolating your AI data from AWS operators.

You will be able to decrypt and load sensitive AI data into an ML accelerator for processing while providing isolation from your own operators and verified authenticity of the application used for processing the AI data.

Advancing the future of generative AI securityToday, tens of thousands of customers are using AWS to experiment and move transformative…

3 days, 20 hours ago @ aws.amazon.com
Cost-effective document classification using the Amazon Titan Multimodal Embeddings Model

In this post, we discuss document classification using the Amazon Titan Multimodal Embeddings model to classify any document types without the need for training.

Amazon Titan Multimodal Embeddings: Amazon recently introduced Titan Multimodal Embeddings in Amazon Bedrock.

Solution overview: Let’s examine the following document classification solution with the Amazon Titan Multimodal Embeddings model.

The Lambda function reads the document image and translates the image into embeddings by calling Amazon Bedrock and using the Amazon Titan Multimodal Embeddings model.
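The classification step then reduces to nearest-neighbor search in embedding space. A minimal sketch with synthetic vectors (real embeddings would come from the Titan Multimodal Embeddings model via Amazon Bedrock; the class names are hypothetical):

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# one reference embedding per document class (synthetic stand-ins)
class_embeddings = {
    "invoice":  np.array([0.9, 0.1, 0.0]),
    "passport": np.array([0.1, 0.8, 0.2]),
    "w2":       np.array([0.0, 0.2, 0.9]),
}

def classify(doc_embedding):
    """Assign the class whose reference embedding is most similar."""
    return max(class_embeddings,
               key=lambda c: cosine(doc_embedding, class_embeddings[c]))

label = classify(np.array([0.85, 0.2, 0.05]))  # nearest to "invoice"
```

Because no classifier is trained, adding a new document type only requires embedding one labeled example, which is what makes the approach cost-effective.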

Additionally, on the Model access page of the Amazon Bedrock console, make sure that access is granted for the Amazon Titan Multimod…

1 week, 1 day ago @ aws.amazon.com
NVIDIA
последний пост 1 day, 19 hours назад
Wide Open: NVIDIA Accelerates Inference on Meta Llama 3
Wide Open: NVIDIA Accelerates Inference on Meta Llama 3 Wide Open: NVIDIA Accelerates Inference on Meta Llama 3

The latest open large language model from Meta — built with NVIDIA technology — is optimized to run on NVIDIA GPUs from the cloud and data center to the edge and the PC.

NVIDIA today announced optimizations across all its platforms to accelerate Meta Llama 3, the latest generation of the large language model (LLM).

With support from NVIDIA, Meta tuned its network, software and model architectures for its flagship LLM.

Putting Llama 3 to Work: Versions of Llama 3, accelerated on NVIDIA GPUs, are available today for use in the cloud, data center, edge and PC.

Custom models can be optimized for inference with NVIDIA TensorRT-LLM and deployed with NVIDIA Triton Inference Server.

1 day, 19 hours ago @ blogs.nvidia.com
Up to No Good: ‘No Rest for the Wicked’ Early Access Launches on GeForce NOW

Members can now stream No Rest for the Wicked from the cloud.

Holy Moly: No Rest for the Wicked is the highly anticipated action role-playing game from Moon Studios, developer of the Ori series, and publisher Private Division.

Embark on a dark and perilous journey, where no rest awaits the wicked.

Rise to the challenge and stream from GeForce RTX 4080 servers with a GeForce NOW Ultimate membership for the smoothest gameplay from the cloud.

Shiny New Games: Become a Wild West superhero in Evil West, streaming on GeForce NOW this week and part of PC Game Pass.

1 day, 23 hours ago @ blogs.nvidia.com
NVIDIA Honors Partners of the Year in Europe, Middle East, Africa

The recipients were honored at the annual EMEA Partner Day hosted by the NVIDIA Partner Network (NPN).

“This year marks another milestone for NVIDIA and our partners across EMEA as we pioneer technological breakthroughs and unlock new business opportunities using NVIDIA’s full-stack platform,” said Dirk Barfuss, director of EMEA channel at NVIDIA.

received the Rising Star Northern Europe award for its exceptional revenue growth and broad customer base deploying NVIDIA AI solutions in data centers.

Through extensive collaboration with NVIDIA, the company has become a cornerstone of the NVIDIA partner landscape in Germany.


2 days, 2 hours ago @ blogs.nvidia.com
Advancing Medical Image Decoding with GPU-Accelerated nvImageCodec

We’ll guide you through the intricacies of image decoding, introduce you to AWS HealthImaging, and explore the advancements enabled by GPU-accelerated decoding solutions.

GPU-accelerated image decoding: To further enhance the image decoding performance, AWS HealthImaging supports GPU acceleration, specifically leveraging the NVIDIA nvJPEG2000 library.

AWS HealthImaging walkthrough: In this walkthrough, we showcase the use of AWS HealthImaging.

We demonstrate the process of employing a SageMaker multimodel endpoint for image decoding, utilizing a GPU-accelerated interface.

In computer vision applications, image decoding speed plays a vital role in real-time processing tasks such as object dete…

2 days, 15 hours ago @ developer.nvidia.com
Seeing Beyond: Living Optics CEO Robin Wang on Democratizing Hyperspectral Imaging

Step into the realm of the unseen with Robin Wang, CEO of Living Optics.

Living Optics’ hyperspectral imaging camera, which can capture visual data across 96 colors, reveals details invisible to the human eye.

1:45: The Living Optics camera’s ability to capture 96 colors

3:36: Where is hyperspectral imaging being used, and why is it so important?

9:34: Other use cases of hyperspectral imaging

13:07: What’s unique about Living Optics’ hyperspectral imaging camera?

18:36: Breakthroughs, challenges during the technology’s development

23:27: What’s next for Living Optics and hyperspectral imaging?

2 days, 16 hours ago @ blogs.nvidia.com
Moving Pictures: Transform Images Into 3D Scenes With NVIDIA Instant NeRF

Using Instant NeRF, creatives are transforming collections of still images into digital 3D scenes in just seconds.

The researchers won a best paper award for the work, and TIME Magazine named Instant NeRF a best invention of 2022.

Ready, Set, Go: Get started with Instant NeRF to learn about radiance fields and experience imagery in a new way.

The Instant NeRF gallery showcases some of the most innovative and thought-provoking examples, viewable as video clips in any web browser.

Thanks to a recent Instant NeRF update, users can render their scenes from static images and virtually step inside the environments, moving freely within the 3D space.

2 days, 23 hours ago @ blogs.nvidia.com
New NVIDIA RTX A400 and A1000 GPUs Enhance AI-Powered Design and Productivity Workflows

To meet this growing need, NVIDIA is expanding its RTX professional graphics offerings with two new NVIDIA Ampere architecture-based GPUs for desktops: the NVIDIA RTX A400 and NVIDIA RTX A1000.

A New Era of Creativity, Performance and Efficiency: The RTX A400 GPU introduces accelerated ray tracing and AI to the RTX 400 series GPUs.

With a sleek, single-slot design and consuming just 50W, the A400 and A1000 GPUs bring impressive features to compact, energy-efficient workstations.

Availability: The NVIDIA RTX A1000 GPU is now available through global distribution partners such as PNY and Ryoyo Electric.

The RTX A400 GPU is expected to be available from channel partners starting in May, with antic…

3 days, 20 hours ago @ blogs.nvidia.com
To Cut a Long Story Short: Video Editors Benefit From DaVinci Resolve’s New AI Features Powered by RTX

Blackmagic Design’s DaVinci Resolve released version 19, adding the IntelliTrack AI point tracker and UltraNR AI-powered features to further streamline video editing workflows.

April also brings the latest NVIDIA Studio Driver, which optimizes the latest creative app updates, available for download today.

All DaVinci Resolve AI effects are accelerated on RTX GPUs by NVIDIA TensorRT, boosting AI performance by up to 2x.

For more information, check out the DaVinci Resolve website.

In testing, the scene below runs at 4.5x higher FPS on the NVIDIA RTX 4090 vs. the Mac M2 Ultra and other competitors.

3 days, 23 hours ago @ blogs.nvidia.com
AI Is Tech’s ‘Greatest Contribution to Social Elevation,’ NVIDIA CEO Tells Oregon State Students

AI promises to bring the full benefits of the digital revolution to billions across the globe, NVIDIA CEO Jensen Huang said Friday during a conversation with Oregon State University President Jayathi Murthy.

On Wednesday, Georgia Tech announced a new NVIDIA-powered supercomputer that will help prepare undergraduate students to solve complex challenges with AI and HPC.

Georgia Tech Unveils New AI Makerspace in Collaboration With NVIDIA: On Wednesday, Georgia Tech’s College of Engineering established an AI supercomputing hub dedicated to teaching students.

The Makerspace will also better position students after graduation as they work with AI professionals and help shape future applications…

4 days, 23 hours ago @ blogs.nvidia.com
Explainer: What Is a Convolutional Neural Network?

Convolutional neural networks (CNNs) apply a variation of multilayer perceptrons (algorithms that classify visual inputs), usually across multiple convolutional layers that are either entirely connected or pooled.

A CNN usually consists of three layers: 1) an input layer, 2) an output layer, and 3) a hidden layer that includes multiple convolutional layers.

Within the hidden layers are pooling layers, fully connected layers, and normalization layers.

The pooling layer progressively reduces the spatial size of the representation for more efficient computation.

Stacking convolutional layers allows the input to be decomposed into its fundamental elements.
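The convolutional and pooling layers described above can be illustrated with a minimal pure-Python sketch (a toy for intuition, not NVIDIA code): a convolutional layer slides a small kernel over the input, and a 2x2 max-pooling layer then halves the spatial size, exactly the progressive reduction the explainer mentions.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation of a 2D image with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw)
            )
    return out

def max_pool2x2(fmap):
    """2x2 max pooling with stride 2: halves the spatial size of a feature map."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]
```

A 6x6 input convolved with a 3x3 kernel yields a 4x4 feature map, which the pooling layer reduces to 2x2; stacking such layer pairs is what lets deeper layers see progressively larger pieces of the input.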

1 week ago @ nvidia.com
New Video Series: OpenUSD for Developers

Universal Scene Description, also called OpenUSD or USD, is an open and extensible framework for creating, editing, querying, rendering, collaborating, and simulating within 3D worlds.

To help more developers get started leveraging USD to build tools for virtual worlds, NVIDIA is releasing an exclusive OpenUSD for Developers video series that delivers a foundational understanding of what many have called the HTML of the 3D internet.

With NVIDIA Omniverse, developers can easily integrate OpenUSD into existing software tools, simulation workflows, and generative AI-based applications.

Get started with NVIDIA Omniverse by downloading the standard license for free, or learn how Omniverse Enterp…

1 week, 1 day ago @ developer.nvidia.com
Bethesda’s ‘Fallout’ Titles Join GeForce NOW

Bethesda’s Fallout 4 and Fallout 76 are bringing post-nuclear adventures to the cloud.

These highly acclaimed action role-playing games lead 10 new titles joining GeForce NOW this week.

Announced as coming to GeForce NOW at CES, Honkai: Star Rail is targeting a release this quarter.

Plus, in Fallout 76, head back to the early days of post-nuclear Appalachia and experience the Fallout universe’s largest, most dynamic world.

The games come just in time for those tuning into the Fallout series TV adaptation, released today, for a Fallout-filled week.

1 week, 1 day ago @ blogs.nvidia.com
Combating Corruption With Data: Cleanlab and Berkeley Research Group on Using AI-Powered Investigative Analytics

Northcutt and Gawthorpe provide insights into how AI-powered data analytics can help combat economic crimes and corruption and discuss the intersection of AI, data science and ethical governance in fostering a more just society.

In the latest episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with Thomson Reuters’ Chief Product Officer David Wong about its potential — and implications.

Making Machines Mindful: NYU Professor Talks Responsible AI – Ep.

Julia Stoyanovich, associate professor of computer science and engineering at NYU and director of the university’s Center for Responsible AI, wants to make the terms “AI” and “responsible AI” synonymous.

Make the AI Podcast better: Have a …

1 week, 2 days ago @ blogs.nvidia.com
The Building Blocks of AI: Decoding the Role and Significance of Foundation Models

Online apps built on foundation models typically access the models from a data center.

Rather than training an entirely new AI model for each generative AI application — a costly and time-consuming endeavor — users commonly fine-tune foundation models for specialized use cases.

Pretrained foundation models are remarkably capable, thanks to prompts and data-retrieval techniques like retrieval-augmented generation, or RAG.

Types of Foundation Models: More than 100 foundation models are in use — a number that continues to grow.

Think Globally, Run AI Models Locally: GeForce RTX and NVIDIA RTX GPUs can run foundation models locally.
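The retrieval-augmented generation (RAG) technique mentioned above can be sketched as a toy: retrieve the most relevant documents for a query, then prepend them to the prompt so a pretrained model answers from fresh data without fine-tuning. This is a generic illustration, not any particular library's API; real systems rank by embedding similarity, whereas the stand-in below uses simple word overlap.

```python
def retrieve(query, documents, k=1):
    """Rank documents by word overlap with the query (a stand-in for
    embedding similarity in a real RAG system) and return the top k."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents, k=1):
    """Assemble a RAG-style prompt: retrieved context first, question last."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The point of the pattern is that the foundation model itself stays frozen; only the retrieved context changes per query.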

1 week, 2 days ago @ blogs.nvidia.com
NVIDIA Joins $110 Million Partnership to Help Universities Teach AI Skills

Universities around the world are preparing students for crucial AI skills by providing access to the high performance computing capabilities of supercomputing.

This computing center will help students research, develop and apply AI across Oregon State’s top-ranked programs in agriculture, computer sciences, climate science, forestry, oceanography, robotics, water resources, materials sciences and more.

Strengthening US-Japan AI Research Collaboration: The U.S.-Japan HPC alliance will advance AI research and development and support the two nations’ global leadership in cutting-edge technology.

Addressing Worldwide AI Talent Shortage: Demand for key AI skills is creating a talent shortage worldw…

1 week, 3 days ago @ blogs.nvidia.com
Facebook
latest post: 1 week, 1 day ago
Building new custom silicon for Meta’s AI workloads


1 week, 1 day ago @ engineering.fb.com
Introducing the next-gen Meta Training and Inference Accelerator

The next generation of Meta’s large-scale infrastructure is being built with AI in mind, including supporting new generative AI (GenAI) products and services, recommendation systems, and advanced AI research.

It’s an investment we expect will grow in the years ahead as the compute requirements to support AI models increase alongside the models’ sophistication.

Last year, we unveiled the Meta Training and Inference Accelerator (MTIA) v1, our first-generation AI inference accelerator that we designed in-house with Meta’s AI workloads in mind – specifically our deep learning recommendation models that are improving a variety of experiences across our products.

MTIA is a long-term venture to pr…

1 week, 2 days ago @ ai.meta.com
Optimizing RTC bandwidth estimation with machine learning

Bandwidth estimation (BWE) and congestion control play an important role in delivering high-quality real-time communication (RTC) across Meta’s family of apps.

Network characterization: An ML model-based approach leverages time series data to improve the bandwidth estimation by using offline parameter tuning for characterized network types.

The first component is an offline-trained ML model that categorizes the network type (random packet loss versus bursty loss).

The non-time series data or dense data will pass through a dense layer (i.e., a fully connected layer).

Use case: Random packet loss classification. Let’s consider the use case of categorizing packet loss as either random or conge…
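Meta's post does not include code, but the distinction it draws (random versus bursty packet loss) can be illustrated with a toy heuristic: bursty loss shows up as long runs of consecutive lost packets, while random loss scatters single drops. The run-length feature and the threshold value below are illustrative assumptions, not Meta's model.

```python
def mean_loss_run_length(loss_flags):
    """Average length of consecutive-loss runs in a 0/1 packet-loss trace."""
    runs, current = [], 0
    for flag in loss_flags:
        if flag:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return sum(runs) / len(runs) if runs else 0.0

def classify_loss(loss_flags, burst_threshold=2.0):
    """Label a trace 'bursty' if losses cluster into long runs, else 'random'.
    The threshold is an arbitrary choice for illustration."""
    return "bursty" if mean_loss_run_length(loss_flags) >= burst_threshold else "random"
```

In the real system this decision feeds parameter tuning: the same loss rate warrants very different congestion responses depending on which regime produced it.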

1 month ago @ engineering.fb.com
Logarithm: A logging engine for AI training workflows and services

In this post, we present the design behind Logarithm, and show how it powers AI training debugging use cases.

AI training debugging with Logarithm: Before looking at Logarithm’s internals, we present support for training systems and model issue debugging, one of the prominent use cases of Logarithm at Meta.

ML model training workflows tend to have a wide range of failure modes, spanning data inputs, model code and hyperparameters, and systems components (e.g., PyTorch, data readers, checkpointing, framework code, and hardware).

Logarithm ingests both systems logs from the training stack, and model telemetry from training jobs that the stack executes.

Filter-by-callsite enables hiding known lo…

1 month ago @ engineering.fb.com
Building Meta’s GenAI Infrastructure

While we’ve had a long history of building AI infrastructure, we first shared details on our AI Research SuperCluster (RSC), featuring 16,000 NVIDIA A100 GPUs, in 2022.

Under the hood: Our newer AI clusters build upon the successes and lessons learned from RSC.

Our out-of-box performance for large clusters was initially poor and inconsistent, compared to optimized small cluster performance.

Commitment to open AI innovation: Meta maintains its commitment to open innovation in AI software and hardware.

The future of Meta’s AI infrastructure: These two AI training cluster designs are a part of our larger roadmap for the future of AI.

1 month, 1 week ago @ engineering.fb.com
Improving machine learning iteration speed with faster application build and packaging

These improvements helped us find and remove many unnecessary dependencies, making build graph analysis and overall build times much better.

In response to this challenge, we implemented a new solution for the packaging and distribution of Python executables – the Content Addressable Filesystem (CAF).

LazyCAF and enforcing uniform revisions: Areas for further ML iteration improvements. The improvements we implemented have proven highly effective, drastically reducing the overhead and significantly elevating the efficiency of our ML engineers.

Faster build times and more efficient packaging and distribution of executables have reduced overhead by double-digit percentages.

We plan to enable all…

2 months, 3 weeks ago @ engineering.fb.com
Lazy is the new fast: How Lazy Imports and Cinder accelerate machine learning at Meta

At Meta, the quest for faster model training has yielded an exciting milestone: the adoption of Lazy Imports and the Python Cinder runtime.

The challenges of adopting Lazy Imports: While Lazy Imports’ approach significantly improved ML development, it was not all a bed of roses.

With Lazy Imports, Meta’s ML developers are now equipped to work more efficiently, experiment more rapidly, and achieve results faster.

Here’s a glimpse into our future endeavors: Streamlining developer onboarding. The learning curve associated with Lazy Imports can be a challenge for newcomers.

Building a robust community that helps support paradigms and patterns that play well with Lazy Imports is one of our future …

3 months ago @ engineering.fb.com
How Meta is advancing GenAI

What’s going on with generative AI (GenAI) at Meta?

In this episode of the Meta Tech Podcast, Meta engineer Pascal Hartig (@passy) speaks with Devi Parikh, an AI research director at Meta.

They cover a wide range of topics, including the history and future of GenAI and the most interesting research papers that have come out recently.

And, of course, they discuss some of Meta’s latest GenAI innovations, including: Audiobox, a foundational model for generating sound and soundscapes using natural language prompts.

And if you’re interested in AI career opportunities at Meta visit the Meta Careers page.

3 months, 1 week ago @ engineering.fb.com
AI debugging at Meta with HawkEye

In this post, we will provide an overview of the end-to-end debugging workflows supported by HawkEye, components of the system, and the product surface for Meta product and monetization teams to debug AI model and feature issues.

HawkEye includes infrastructure for continuously collecting data on serving and training models, data generation, and analysis components for mining root causes.

However, significant differences indicate problems with either the training data or loss divergence (e.g., loss or gradient explosion) in the bad snapshot.

Such issues can happen for several hard-to-diagnose reasons, ranging from the complex data pipelines behind training data, to data corruptions.

HawkEye…

4 months ago @ engineering.fb.com
Watch: Meta’s engineers on building network infrastructure for AI

The 2023 edition of Networking at Scale focused on how Meta’s engineers and researchers have been designing and operating the network infrastructure over the last several years for Meta’s AI workloads, including our numerous ranking and recommendation workloads and the immense Generative AI models.

But the sheer scale and complexity of GenAI models mean new challenges for Meta’s network infrastructure.

Meta’s Network Journey to Enable AI (Hany Morsy, Network Engineer; Susana Contrera, Network Engineer): Over the years, Meta’s AI infrastructure has transitioned from CPU-based to GPU-based training due to growing AI workloads.

Traffic Engineering for AI Training Networks (Shuqiang Zhang, Software Eng…

5 months ago @ engineering.fb.com
How Meta is creating custom silicon for AI

Fueling the success of these products are world-class infrastructure teams, including Meta’s custom AI silicon team, led by Olivia Wu, a leader in the silicon industry for 30 years.

In 2018, I saw a social media post from Yann LeCun, our Chief AI Scientist, that Meta was looking for someone to help build AI silicon in-house.

I knew of just a few other companies designing their own custom AI silicon, but they were mainly focused only on silicon and not the software ecosystem and products.

What’s next for the AI silicon design team?

We’re continuing to gather feedback and input from our AI software teams to shape the features of our future AI silicon.

6 months ago @ engineering.fb.com
Using Chakra execution traces for benchmarking and network performance optimization

Meta presents Chakra execution traces, an open graph-based representation of AI/ML workload execution, laying the foundation for benchmarking and network performance optimization.

The limitations of traditional AI benchmarking methodology: Traditionally, benchmarking AI systems has largely relied on running full ML workloads.

How Meta leverages Chakra execution traces: At Meta, we collect execution traces from our production servers every day.

Future plans: Enhancing the benchmarking capability of Chakra execution traces. While the execution trace replayer enables replay of execution traces, it brings forth challenges.

Using AI to generate representative execution traces: Chakra execution traces are…

7 months, 2 weeks ago @ engineering.fb.com
Arcadia: An end-to-end AI system performance simulator

We’re introducing Arcadia, Meta’s unified system that simulates the compute, memory, and network performance of AI training clusters.

Arcadia gives Meta’s researchers and engineers valuable insights into the performance of AI models and workloads in an AI cluster – enabling data-driven decision making in the design of AI clusters.

That’s where Arcadia, Meta’s end-to-end AI system performance simulator, comes in.

By providing insights into the impact of these factors on system performance, Arcadia facilitates data-driven decision-making processes and fosters the evolution of models and hardware.

This comprehensive set of metrics empowers stakeholders to analyze the impact of different factor…

7 months, 2 weeks ago @ engineering.fb.com
Code Llama: Meta’s state-of-the-art LLM for coding

Today, we are releasing Code Llama, a large language model (LLM) that can use text prompts to generate code.

Additionally, we have further fine-tuned two additional variations of Code Llama: Code Llama - Python and Code Llama - Instruct.

Code Llama - Python is a language-specialized variation of Code Llama, further fine-tuned on 100B tokens of Python code.

Code Llama - Instruct is an instruction fine-tuned and aligned variation of Code Llama.

We recommend using Code Llama - Instruct variants whenever using Code Llama for code generation since Code Llama - Instruct has been fine-tuned to generate helpful and safe answers in natural language.

8 months ago @ ai.meta.com
Meta Connect 2023: September 27 – 28


The post Meta Connect 2023: September 27 – 28 appeared first on Engineering at Meta.

8 months, 1 week ago @ meta.com
Uber Engineering
latest post: none
neptune.ai
latest post: 1 day, 3 hours ago
Deep Learning Optimization Algorithms

In this article, we’ll survey the most commonly used deep learning optimization algorithms, including Gradient Descent, Stochastic Gradient Descent, and the Adam optimizer.

Understanding different optimization algorithms and their strengths and weaknesses is crucial for any data scientist training deep learning models.

Optimization in deep learning: Have a look at other articles on our blog exploring aspects of optimization in deep learning. Deep Learning Model Optimization Methods: Deep learning models exhibit excellent performance but require high computational resources.

Mini-batch Gradient Descent: Mini-batch Gradient Descent strikes a balance between the thorough, calculated approach of …
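Two of the surveyed optimizers can be sketched in plain Python (a minimal illustration assuming a 1-D parameter for clarity; not the article's code, and the hyperparameter defaults simply follow common convention): SGD takes a step against the raw gradient, while Adam rescales the step using running estimates of the gradient's first and second moments, with bias correction.

```python
def sgd_step(w, grad, lr=0.1):
    """One stochastic gradient descent update."""
    return w - lr * grad

def adam_step(w, grad, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update; state carries (first moment, second moment, step count)."""
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad            # running mean of gradients
    v = b2 * v + (1 - b2) * grad * grad     # running mean of squared gradients
    m_hat = m / (1 - b1 ** t)               # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (v_hat ** 0.5 + eps)
    return w, (m, v, t)

# Minimize f(w) = (w - 3)^2, whose gradient is 2(w - 3), with both optimizers.
grad_f = lambda w: 2.0 * (w - 3.0)
w_sgd, w_adam, state = 0.0, 0.0, (0.0, 0.0, 0)
for _ in range(200):
    w_sgd = sgd_step(w_sgd, grad_f(w_sgd))
    w_adam, state = adam_step(w_adam, grad_f(w_adam), state)
```

Both runs approach the minimizer w = 3; Adam's early steps have roughly constant magnitude lr regardless of gradient scale, which is exactly the invariance the moment estimates buy.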

1 day, 3 hours ago @ neptune.ai
Track and Visualize Information From Your Pipelines: neptune.ai + ZenML Integration

On top of that, neptune.ai integrates with any MLOps stack, and it just works.

Now, with less boilerplate code, you can log and visualize information from your ZenML pipeline steps (e.g., models, parameters, metrics).

You’re looking for a more visually interactive way of navigating the results produced from your ZenML pipeline runs (e.g., models, metrics, datasets).

In this example, we log a simple ZenML pipeline to Neptune using the Experiment Tracker stack component.

The example assumes that you have ZenML installed together with the Neptune integration.

3 days, 22 hours ago @ neptune.ai
Product Updates September ’23: Scatter Plots, Airflow Integration, and More

Scatter plots: If you have two metrics or parameters that you wish to compare or see how they relate to each other throughout the runs, you can now create a scatter plot.

See an example of a scatter plot in the Neptune app.

Distraction-free mode: You can now view run dashboards and compare views in distraction-free mode.

(neptune 1.5.0) We added a new environment variable NEPTUNE_DATA_DIRECTORY, which lets you specify where the .neptune folder should be created instead of the current working directory.

(neptune 1.6.0) Web application: We fixed an issue where a field named “type” could not be displayed as a runs table column.

3 days, 22 hours ago @ neptune.ai
Train, Track, and Deploy Your Models: Neptune + Modelbit Integration

We are excited to announce that Neptune and Modelbit have partnered to release an integration to enable better ML model deployment and experiment tracking.

Data scientists and machine learning engineers can use the integration to train and deploy machine learning models in Modelbit while logging and visualizing training progress in Neptune.

We’ll log the model’s hyperparameters and accuracy to Neptune and then deploy the model to a REST endpoint.

First, import “modelbit” and “neptune” and authenticate your notebook with Modelbit:

import modelbit, neptune
mb = modelbit.login()

If your “NEPTUNE_API_TOKEN” isn’t already in your notebook’s environment, add it:

import os
os.environ["NEPTUNE_API_TOK…

3 days, 22 hours ago @ neptune.ai
Product Updates March’24: MosaicML Composer integration, Neptune Query Language, and More

(neptune 1.9.0): When fetching runs with , you have 4 new parameters: , , , and .

(neptune 1.9.0) All messages that Neptune prints to the console are prefixed with [neptune].

(neptune 1.9.0) We introduced querying functionality to table fetching methods.

(neptune 1.10.0) Integrations: We introduced the log_model_summary parameter to NeptuneCallback() (neptune-tensorflow-keras 2.2.1). We added support for logging project requirements with the dependencies parameter.

3 days, 22 hours ago @ neptune.ai
Product Updates December ’23: MLflow Plugin, New Docs Tutorials, and More

This page lists all functions, parameters, and constants exposed by the Neptune Python API.

(kedro-neptune 0.3.0) Web application: When modifying a search query in the runs table, you can now edit the operator and value without changing the other parts.

You can now click on a username in the Owner column (in the runs table) to instantly create a filter for runs created by that account.

(neptune 1.8.6) Web application: We fixed an issue where a field named “type” could not be displayed as a runs table column.

3 days, 22 hours ago @ neptune.ai
How to Optimize GPU Usage During Model Training With neptune.ai

Strategies for improving GPU usage include mixed-precision training, optimizing data transfer and processing, and appropriately dividing workloads between CPU and GPU.

The GPU memory usage metric reflects the amount of memory allocated to the GPU relative to its total memory capacity.

Optimizing data preprocessing: In addition to I/O and data transfer to the GPU, data preprocessing can become a bottleneck for GPU utilization.

This indicates that by improving GPU utilization, training could be sped up, and GPU resources could be used more efficiently.

However, across the many projects I’ve worked on, the following guidelines for optimizing GPU usage have often proven helpful: Always monitor GPU…
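One of the guidelines above, keeping CPU-side data preprocessing from stalling the GPU, can be illustrated with a small sketch: run per-sample preprocessing in background threads so batch preparation overlaps with training. The preprocess function here is a made-up stand-in for decoding, augmentation, or normalization.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-sample preprocessing step (stands in for decoding,
# augmentation, normalization, etc. that would otherwise run serially
# on the CPU while the GPU waits).
def preprocess(sample):
    return sample * 2

samples = list(range(8))

# Run preprocessing in a thread pool so the training loop (GPU work)
# is not blocked on CPU-side batch preparation.
with ThreadPoolExecutor(max_workers=4) as pool:
    batch = list(pool.map(preprocess, samples))

print(batch)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

In real training code, frameworks typically provide this overlap for you (e.g., multi-worker data loaders); the sketch only shows the principle.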

3 weeks, 2 days ago @ neptune.ai
Zero-Shot and Few-Shot Learning with LLMs

The role of zero-shot and few-shot learning in LLMs: The goal of zero-shot and few-shot learning is to get a machine-learning model to perform a new task it was not trained for.

“Few-shot learning” and “zero-shot learning” are well-known concepts in machine learning that were studied long before LLMs appeared on the scene.

In the context of LLMs, these terms are sometimes used interchangeably with “few-shot prompting” and “zero-shot prompting.” However, they are not the same.
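Few-shot prompting, in this sense, just means packing labeled examples into the prompt itself; zero-shot prompting omits them. A minimal sketch of building such a prompt (the example reviews and labels are invented for illustration):

```python
# Building a few-shot prompt by prepending labeled examples to the
# query. The examples and labels here are made up for illustration.
examples = [
    ("The battery died after a week.", "negative"),
    ("Absolutely love this keyboard!", "positive"),
]

def few_shot_prompt(query):
    lines = ["Classify the sentiment of each review."]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    # The final block leaves "Sentiment:" open for the model to complete.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = few_shot_prompt("Fast shipping and great quality.")
print(prompt)
```

Dropping the examples loop turns the same template into a zero-shot prompt.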

Where zero-shot prompting fails: Let’s turn to two use cases where zero-shot prompting is insufficient.

Where few-shot prompting fails: Finally, let’s look at a situation where few-shot prompting won’t get us far.

4 weeks ago @ neptune.ai
LLMOps: What It Is, Why It Matters, and How to Implement It

Large Language Models (LLMs) like Meta AI’s LLaMA models, Mistral AI’s open models, and OpenAI’s GPT series have significantly advanced language-based AI.

Monitoring: Monitor model performance for data drift and model degradation, often using automated monitoring tools.

Automate prompt selection: Implement systems that automatically choose the best prompt for a given task using historical data on prompt performance and the specifics of the current request.
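The selection logic described above can be sketched with a toy scorer that picks the template with the best mean historical score. The template names and scores are hypothetical; a production system would also condition on the specifics of the current request.

```python
# Hypothetical log of past prompt performance: prompt template name ->
# list of success scores observed for a given task type.
history = {
    "summarize-v1": [0.62, 0.70, 0.66],
    "summarize-v2": [0.81, 0.78, 0.84],
}

def best_prompt(history):
    # Pick the template with the highest mean historical score.
    return max(history, key=lambda name: sum(history[name]) / len(history[name]))

print(best_prompt(history))  # summarize-v2
```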

Monitoring and observability are about tracking LLMs’ performance, health, and operational metrics in production to ensure they perform optimally and reliably.

Use platforms that offer comprehensive observability into LLM performance, including function…

1 month, 1 week ago @ neptune.ai
The Real Cost of Self-Hosting MLflow

The canonical MLflow setup for teams: The MLflow client embedded in the training code communicates with the MLflow tracking server, which handles access to cloud storage (artifact store) and a database (metadata store).
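That canonical setup is typically launched with a command along these lines. This is a sketch: the SQLite URI and S3 bucket are placeholders you would replace with your own metadata and artifact stores.

```shell
# Start the MLflow tracking server, pointing it at a metadata store
# (here a local SQLite file) and an artifact store (here an S3 bucket).
mlflow server \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root s3://my-mlflow-artifacts/ \
  --host 0.0.0.0 --port 5000
```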

Deploying the MLflow tracking server: MLflow’s tracking server is relatively lightweight.

With this in mind, there are three main options for deploying the MLflow tracking server on AWS:Deploying the MLflow tracking server on an Amazon EC2 instance.

Setting up an artifact store: The artifact store is the third relevant cost item in an MLflow deployment.

Further, everyone who has access to your MLflow tr…

1 month, 1 week ago @ neptune.ai
Deep Learning Model Optimization Methods

Knowledge distillation transfers insights from a complex “teacher” model to a simpler “student” model, maintaining performance with less computational demand.

Distillation loss: The student model is trained not just to replicate the output of the teacher model but to match the output distributions produced by the teacher model.
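A minimal sketch of one common formulation of that loss: the KL divergence between the teacher's and student's temperature-softened output distributions. Real implementations typically combine this term with a standard task loss on the true labels.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    exps = [math.exp(x / T) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # KL divergence between the teacher's and student's softened
    # output distributions (one common form of the distillation loss).
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Identical logits give zero loss; diverging logits give positive loss.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # 0.0
print(distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0)  # True
```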

Model architecture compatibility: The effectiveness of knowledge distillation depends on how well the student model can learn from the teacher model, which is greatly influenced by their architectural compatibility.

Data is fed to both a complex ‘Teacher Model’ and a simpler ‘Student Model.’ The Teacher Model, which consists of multiple layers (from Layer 1 to Layer …

1 month, 2 weeks ago @ neptune.ai
Continual Learning: Methods and Application

Continual learning scenarios: Depending on the data stream characteristics, problems within the continual learning scenario can be divided into three types, each with a common solution.

Task incremental continual learning: Task Incremental (TI) continual learning is classic multi-task learning, but in an incremental way.

Continual learning methods: Over the second decade of the 2000s, continual learning methods have advanced rapidly.

How to choose the right continual learning method for your project: Across the three groups of continual learning approaches, many techniques exist.

Continual lea…

1 month, 4 weeks ago @ neptune.ai
2024 Layoffs and LLMs: Pivoting for Success

Neither teams working on proof-of-concept projects nor those building production ML systems are immune from job cuts.

The rapid advancements of Large Language Models (LLMs) are changing the day-to-day work of ML practitioners and how company leadership thinks about AI.

The rise of modern LLMs: In 2023, the dominance of modern LLMs became increasingly evident, challenging the incumbent classical NLP models.

Even small and relatively weak LLMs like DistilGPT2 and t5-small have surpassed classical NLP models in understanding context and generating coherent text.

As you can see, it’s unclear whether people working on PoC or production projects are at higher risk of layoffs.

2 months ago @ neptune.ai
Mikiko Bazeley: What I Learned Building the ML Platform at Mailchimp

In this article, I will outline the learnings and challenges I faced while on the ML platform team at Mailchimp, building infrastructure and setting up the environment for development and testing.

Mailchimp’s ML Platform, its genesis, challenges, and objectives: Mailchimp is a 20-year-old bootstrapped email marketing company.

Team setup and responsibilities: We had around 20 data engineers and ML(Ops) engineers working on the ML platform at Mailchimp.

In times of generative AI, a good ML infrastructure matters a lot: A lesson that I’ve learned time and time again over the past years is the enduring importance of ‘boring’…

2 months, 3 weeks ago @ neptune.ai
How to Build Machine Learning Systems With a Feature Store

Understanding machine learning pipelines: Machine learning (ML) pipelines are a key component of ML systems.

Machine learning systems with feature stores: Machine learning (ML) systems manage the data transformations, model training, and predictions made on ML models.

A feature store typically comprises a feature repository, a feature serving layer, and a metadata store.

A feature store is a data platform that supports the development and operation of machine learning systems by managing the storage and efficient querying of feature data.

A feature store stores feature data in tables called feature groups (also known as feature sets or feature tables).
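The ideas above, feature groups holding per-entity feature values plus a serving layer that fetches them, can be sketched with a toy in-memory store. The class, group, and feature names are invented for illustration; real feature stores add versioning, point-in-time correctness, and online/offline storage.

```python
# Minimal in-memory sketch of a feature store: feature groups map an
# entity key to that entity's feature values.
class FeatureStore:
    def __init__(self):
        self.feature_groups = {}

    def write(self, group, key, features):
        # Ingest features for one entity into a feature group.
        self.feature_groups.setdefault(group, {})[key] = features

    def read(self, group, key, names):
        # Serving layer: fetch a subset of features for one entity.
        row = self.feature_groups[group][key]
        return {n: row[n] for n in names}

store = FeatureStore()
store.write("customer_stats", "user_42", {"age": 31, "orders_30d": 4})
print(store.read("customer_stats", "user_42", ["orders_30d"]))  # {'orders_30d': 4}
```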

2 months, 3 weeks ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
last post 2 days, 13 hours ago
Hugging Face got hacked

Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2

Litecoin (LTC): LQW2TRyKYetVC8WjFkhpP…

2 days, 13 hours ago @ youtube.com
[ML News] Microsoft to spend 100 BILLION DOLLARS on supercomputer (& more industry news)

Some updates from industry in the Machine Learning world Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

Ethereum (ETH): 0x7ad3513E3B8f66799f507…

4 days, 19 hours ago @ youtube.com
[ML News] Jamba, CMD-R+, and other new models (yes, I know this is like a week behind 🙃)

A flurry of new models continues to appear. Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC…

1 week ago @ youtube.com
Flow Matching for Generative Modeling (Paper Explained)

Flow matching is a more general method than diffusion and serves as the basis for models like Stable Diffusion 3. Paper: https://arxiv.org/abs/2210.02747 Abstract:

We introduce a new paradigm for generative modeling built on Continuous Normalizing Flows (CNFs), allowing us to train CNFs at unprecedented scale. Specifically, we present the notion of Flow Matching (FM), a simulation-free approach for training CNFs based on regressing vector fields of fixed conditional probability paths. Flow Matching is compatible with a general family of Gaussian probability paths for transforming between noise and data samples -- which subsumes existing diffusion paths as specific instances. Interestingly, …

1 week, 4 days ago @ youtube.com
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping (Searchformer)

Paper: https://arxiv.org/abs/2402.14083 Abstract:

While Transformers have enabled tremendous progress in various application settings, such architectures still lag behind traditional symbolic planners for solving complex decision making tasks. In this work, we demonstrate how to train Transformers to solve complex planning tasks and present Searchformer, a Transformer model that optimally solves previously unseen Sokoban puzzles 93.7% of the time, while using up to 26.8% fewer search steps than standard A∗ search. Searchformer is an encoder-decoder Transformer model trained to predict the search dynamics of A∗. This model is then fine-tuned via expert iterations to perform fewer search step…

1 week, 6 days ago @ youtube.com
[ML News] Grok-1 open-sourced | Nvidia GTC | OpenAI leaks model names | AI Act

OUTLINE:

0:00 - Intro

0:15 - XAI releases Grok-1

2:00 - Nvidia GTC

4:45 - Comment of the Week

5:35 - Brute-forcing OpenAI model names

7:30 - Inflection AI gets eaten by Microsoft

9:25 - EU AI Act moving forward

11:45 - Advances in Robotics

14:00 - India retracts controversial advisory

14:30 - OpenSora

15:20 - Improved Gemma fine-tuning

16:20 - Decoding encrypted LLM traffic

17:45 - Varia References:

https://x.ai/blog/grok-os

https://github.com/xai-org/grok-1

https://finance.yahoo.com/news/nvidia-debuts-next-generation-blackwell-ai-chip-at-gtc-2024-205825161.html?guccounter=1&guce_referrer=aHR0cHM6Ly9uZXdzLmdvb2dsZS5jb20v&guce_referrer_sig=AQAAAHYRVePPrDnH3HxPV8smDzUiia_ztWttteAmHKxy-x_Z75lq…

3 weeks, 3 days ago @ youtube.com
[ML News] Devin AI Software Engineer | GPT-4.5-Turbo LEAKED | US Gov't Report: Total Extinction

Your weekly dose of ML News OUTLINE:

0:00 - Intro

0:15 - Devin: AI software engineer

5:50 - Mira Murati on Sora training data

6:50 - Inflection accused of copying Claude

9:00 - Tools & papers

16:30 - GPT-4.5-turbo mystery

17:30 - US government report: total extinction by AI

19:20 - Various other news References:

https://www.cognition-labs.com/introducing-devin

https://twitter.com/cognition_labs/status/1767548763134964000?t=ZECIn-uqbguwHtY8X_Gvtw&s=09

https://news.google.com/stories/CAAqNggKIjBDQklTSGpvSmMzUnZjbmt0TXpZd1NoRUtEd2lWMUwyU0N4RnVWM3pSRWhWX01pZ0FQAQ?hl=en-US&gl=US&ceid=US%3Aen

https://www.bloomberg.com/news/articles/2024-03-12/cognition-ai-is-a-peter-thiel-backed-coding-assistant?…

1 month ago @ youtube.com
[ML News] Elon sues OpenAI | Mistral Large | More Gemini Drama

#mlnews #ainews #openai OUTLINE:

0:00 - Intro

0:20 - Elon sues OpenAI

14:00 - Mistral Large

16:40 - ML Espionage

18:30 - More Gemini Drama

24:00 - Copilot generates spicy images

26:55 - Gemma bugs

28:45 - Varia References: https://gist.github.com/yk/0c065cdc8e414738abfaae4f8e417e00 Thumbnail pictures: Wikipedia Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and …

1 month, 1 week ago @ youtube.com
On Claude 3
1 month, 1 week ago @ youtube.com
No, Anthropic's Claude 3 is NOT sentient

No, Anthropic's Claude 3 is not conscious or sentient or self-aware. References:

https://www.anthropic.com/news/claude-3-family

https://twitter.com/_akhaliq/status/1764673955313459560?t=gkBx2uTXfrxLl-5_mL7Btg&s=09

https://twitter.com/idavidrein/status/1764675668175094169?t=pJfbN3LtKaxsU8egz83Mvg&s=09

https://twitter.com/TolgaBilge_/status/1764754012824314102?t=9bakXDnVMC1oAEyZFoKimA&s=09

https://twitter.com/karinanguyen_/status/1764670019743690757?t=gkBx2uTXfrxLl-5_mL7Btg&s=09

https://twitter.com/alexalbert__/status/1764722513014329620

https://www.lesswrong.com/posts/pc8uP4S9rDoNpwJDZ/claude-3-claims-its-conscious Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTu…

1 month, 2 weeks ago @ youtube.com
[ML News] Groq, Gemma, Sora, Gemini, and Air Canada's chatbot troubles

Your dose of ML News! OUTLINE:

0:00 - Intro

0:20 - Gemma & Gemini

3:40 - Groq

6:30 - Nvidia EOS Supercomputer

7:15 - Gpulist.ai

8:20 - Demis Hassabis on scale

10:10 - Hardware wars

12:05 - Sora

15:10 - Gemini 1.5 Pro & Long Context

18:45 - Air Canada must pay for chatbot mistake

23:30 - Giant Rat Balls

26:25 - Various News References:

https://blog.google/technology/developers/gemma-open-models/?utm_source=tw

https://twitter.com/altryne/status/1760358916624719938?t=PVZkHQA_p7GxmeUX0hcZ_Q&s=09

https://twitter.com/paulg/status/1760078920135872716?t=PVZkHQA_p7GxmeUX0hcZ_Q&s=09

https://groq.com/

https://twitter.com/mattshumer_/status/1759347920543834117?t=cS5nPvZOsV6iDA1mVabHOg&s=09

https://twit…

1 month, 2 weeks ago @ youtube.com
Gemini has a Diversity Problem

Google turned the anti-bias dial up to 11 on their new Gemini Pro model. References:

https://developers.googleblog.com/2024/02/gemini-15-available-for-private-preview-in-google-ai-studio.html

https://blog.google/technology/developers/gemma-open-models/?utm_source=tw

https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf

https://twitter.com/ClementDelangue/status/1760324815888486668?t=spXd7Oq_cSrRN2A-3r6gnQ&s=09

https://twitter.com/paulg/status/1760078920135872716?t=PVZkHQA_p7GxmeUX0hcZ_Q&s=09

https://twitter.com/yoavgo/status/1760445342691016811/photo/3

https://twitter.com/alex_peys/status/1760327435890135279/photo/2

https://twitter.com/woke8yearold/status/1760310705142558781/…

1 month, 3 weeks ago @ youtube.com
V-JEPA: Revisiting Feature Prediction for Learning Visual Representations from Video (Explained)

#vjepa #meta #unsupervisedlearning V-JEPA is a method for unsupervised representation learning of video data by using only latent representation prediction as objective function. Weights & Biases course on Structured LLM Outputs: https://wandb.me/course-yannic OUTLINE:

0:00 - Intro

1:45 - Predictive Feature Principle

8:00 - Weights & Biases course on Structured LLM Outputs

9:45 - The original JEPA architecture

27:30 - V-JEPA Concept

33:15 - V-JEPA Architecture

44:30 - Experimental Results

46:30 - Qualitative Evaluation via Decoding Blog: https://ai.meta.com/blog/v-jepa-yann-lecun-ai-model-video-joint-embedding-predictive-architecture/

Paper: https://ai.meta.com/research/publications/revisit…

2 months ago @ youtube.com
What a day in AI! (Sora, Gemini 1.5, V-JEPA, and lots of news)

Your regularly irregular dose of Machine Learning News! W&B Course on LLM Structured Outputs: https://wandb.me/course-yannic OUTLINE:

0:00 - OpenAI Sora

3:25 - Gemini 1.5 with 1 Million Tokens context window

4:50 - V-JEPA

6:50 - Sam Altman raises 7 TRILLION dollars for AI chips

9:30 - Sponsor: Weights & Biases course on Structure Output from LLMs

11:30 - Bard becomes Gemini

13:55 - GOODY-2: The world's most responsible model

16:05 - miqu-1-70b leaked from Mistral

18:25 - Zuckerberg on Meta's open approach to AI models

21:40 - 1X advances robotics

23:30 - Questions around Bard's arena leaderboard position

27:00 - Various other news References:

https://gist.github.com/yk/65fe3d582a43540a61718…

2 months ago @ youtube.com
Lumiere: A Space-Time Diffusion Model for Video Generation (Paper Explained)

#lumiere #texttovideoai #google LUMIERE by Google Research tackles globally consistent text-to-video generation by extending the U-Net downsampling concept to the temporal axis of videos. OUTLINE:

0:00 - Introduction

8:20 - Problems with keyframes

16:55 - Space-Time U-Net (STUNet)

21:20 - Extending U-Nets to video

37:20 - Multidiffusion for SSR prediction fusing

44:00 - Stylized generation by swapping weights

49:15 - Training & Evaluation

53:20 - Societal Impact & Conclusion Paper: https://arxiv.org/abs/2401.12945

Website: https://lumiere-video.github.io/ Abstract:

We introduce Lumiere -- a text-to-video diffusion model designed for synthesizing videos that portray realistic, diverse and co…

2 months, 2 weeks ago @ youtube.com
Henry AI Labs
last post 1 day, 13 hours ago
Llama 3 RAG Demo with DSPy Optimization, Ollama, and Weaviate!

Hey everyone! Thank you so much for watching this overview of Llama 3 looking at the release notes and seeing a demo of how to integrate it with DSPy through Ollama and how to use DSPy's MIPRO to find the optimal prompt when using this new large language model for RAG! We are hosting an event in San Francisco on May 1st with Arize AI and Cohere, featuring a talk from Omar Khattab, the lead author of DSPy! Hope to see you there! https://lu.ma/dspy Introducing Meta Llama 3: https://ai.meta.com/blog/meta-llama-3/ Ollama Llama 3: https://ollama.com/library/llama3 Weaviate Recipes: https://github.com/weaviate/recipes/blob/main/integrations/dspy/llms/Llama3.ipynb Chapters

0:00 Llama3!!

1:28 Relea…

1 day, 13 hours ago @ youtube.com
Building RAG with Command R+ from Cohere, DSPy, and Weaviate!

Hey everyone! Thank you so much for watching this overview of Command R+ showing you how you can use the new model in DSPy and a quick RAG demo, as well as walking through the details of the release post! Congratulations to the Cohere team! Super exciting times to be working with LLM systems! Introducing Command R+: A Scalable LLM Built for Business - https://txt.cohere.com/command-r-plus-microsoft-azure/ Link to demo notebook - https://github.com/weaviate/recipes/blob/main/integrations/dspy/llms/Command-R-Plus.ipynb Chapters

0:00 Welcome! Command R+!

1:12 Demo with Cohere, DSPy, and Weaviate

6:06 Command R+ Announcement Post

9:24 LLM Evals

2 weeks, 1 day ago @ youtube.com
Structured Outputs with DSPy

Unfortunately, Large Language Models will not consistently follow the instructions that you give them. This is a massive problem when you are building AI systems that require a particular type of output from the previous step to feed into the next one! For example, imagine you are building a blog post writing system that first takes a question and retrieved context to output a list of topics. These topics have to be formatted in a particular way, such as a comma-separated list or a JSON of Topic objects, such that the system can continue writing the blog post! I am SUPER excited to share the 4th video in my DSPy series, diving into 3 solutions to structuring outputs in DSPy programs: (1) **…

2 weeks, 4 days ago @ youtube.com
Adding Depth to DSPy Programs

Hey everyone! Thank you so much for watching the 3rd edition of the DSPy series, Adding Depth to DSPy Programs!! You can find the examples and links to community resources / news on https://github.com/weaviate/recipes! Chapters

0:00 Intro

0:50 Chapters Overview

5:06 Weaviate Recipes

5:24 DSPy News and Community Notes

13:51 Adding Depth to RAG Programs

18:40 Multi-Model DSPy Programs

20:18 DSPy Optimizers

25:30 Deep Dive Optimizers

27:55 Into the Optimizer Code!

37:48 Demo #1: Adding Depth to RAG

1:05:25 Demo #2: Questions to Blogs

1:07:48 Thank you so much for watching!

1 month, 2 weeks ago @ youtube.com
Getting Started with RAG in DSPy!

Hey everyone! Thank you so much for watching this tutorial on getting started with RAG programming in DSPy! This video will take you through 4 major aspects of building DSPy programs (1) Installation, settings, and Datasets with dspy.Example, (2) LLM Metrics, (3) The DSPy programming model, and (4) Optimization!! The notebook used in the video can be found here: https://github.com/weaviate/recipes/blob/main/integrations/dspy/1.Getting-Started-with-RAG-in-DSPy.ipynb All future videos, as well as additional utils like data import scripts, will be in this folder: https://github.com/weaviate/recipes/tree/main/integrations/dspy Please leave a star, it helps a lot! DSPy on GitHub: https://github.…

2 months, 1 week ago @ youtube.com
DSPy Explained!

Hey everyone! Thank you so much for watching this explanation of DSPy! DSPy is a super exciting new framework for developing LLM programs! Pioneered by frameworks such as LangChain and LlamaIndex, we can build much more powerful systems by chaining together LLM calls! This means that the output of one call to an LLM is the input to the next, and so on. We can think of chains as programs, with each LLM call analogous to a function that takes text as input and produces text as output. DSPy offers a new programming model, inspired by PyTorch, that gives you a massive amount of control over these LLM programs. Further the Signature abstraction wraps prompts and structured input / outputs to cle…

2 months, 3 weeks ago @ youtube.com
3blue1brown
last post 1 week, 1 day ago
How word vectors encode meaning

This comes from a full video dissecting how LLMs work. In the shorts player, you can click the link at the bottom of the screen, or for reference: https://youtu.be/wjZofJX0v4M

1 week, 1 day ago @ youtube.com
Visualizing Attention, a Transformer's Heart | Chapter 6, Deep Learning

Demystifying attention, the key mechanism inside transformers and LLMs.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

Special thanks to these supporters: https://www.3blue1brown.com/lessons/attention#thanks

An equally valuable form of support is to simply share the videos. Demystifying self-attention, multiple heads, and cross-attention.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support The first pass for the translated subtitles here is machine-generated, and therefore notably imperfect. To contribute edits or fixes, visit https://translate.3blue1brown.com/ ------------------ Here are …

1 week, 5 days ago @ youtube.com
But what is a GPT? Visual intro to Transformers | Deep learning, chapter 5

An introduction to transformers and their prerequisites

Early view of the next chapter for patrons: https://3b1b.co/early-attention Other recommended resources on the topic. Richard Turner's introduction is one of the best starting places:

https://arxiv.org/pdf/2304.10557.pdf Coding a GPT with Andrej Karpathy

https://youtu.be/kCc8FmEb1nY Introduction to self-attention by John Hewitt

https://web.stanford.edu/class/cs224n/readings/cs224n-self-attention-transformers-2023_draft.pdf History of language models by Brit Cruise:

https://youtu.be/OFS90-FX6pg ------------------ Timestamps 0:00 - Predict, sample, repeat

3:03 - Inside a transformer

6:36 - Chapter layout

7:20 - The premise of Deep Learni…

2 weeks, 4 days ago @ youtube.com
Simulating the electric field and a moving charge

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/aXRTczANuIs

2 months, 3 weeks ago @ youtube.com
How the Mandelbrot set is defined

A link to the full video answering this is at the bottom of the screen. Or, for reference: https://youtu.be/LqbZpur38nw

2 months, 3 weeks ago @ youtube.com
A challenging puzzle about subset sums

A link to the full video answering this is at the bottom of the screen. Or, for reference: https://youtu.be/bOXCLR3Wric

2 months, 4 weeks ago @ youtube.com
Ellipses have multiple definitions, how are these the same?

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/pQa_tWZmlGs The full video this comes from proves why slicing a cone gives the same shape as the two-thumbtacks-and-string construction, which is beautiful. Editing from long-form to short by Dawid Kołodziej

3 months ago @ youtube.com
Three levels of understanding Bayes' theorem

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/HZGCoVF3YvM Editing from long-form to short by Dawid Kołodziej

3 months ago @ youtube.com
The medical test paradox (well "paradox")

A link to the full video about Bayesian thinking is at the bottom of the screen.

Or, for reference: https://youtu.be/lG4VkPoG3ko Long-to-short editing by Dawid Kołodziej

3 months ago @ youtube.com
Positioned as the hardest question on a Putnam exam (#6, 1992)

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/OkmNXy7er84 Editing from the original video into this short by Dawid Kołodziej

3 months, 1 week ago @ youtube.com
Why does light slowing imply a bend? (Beyond the tank/car analogy)

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/Cz4Q4QOuoo8 That video answers various viewer questions about the index of refraction. Editing from long-form to short by Dawid Kołodziej

3 months, 1 week ago @ youtube.com
The cube shadow puzzle

A link to the full video is at the bottom of the screen. Or, for reference: https://youtu.be/ltLUadnCyi0

3 months, 1 week ago @ youtube.com
What does it mean that light "slows down" in glass?

A link to the full video is at the bottom of the screen.

Or, for reference: https://youtu.be/KTzGBJPuJwM That video unpacks the mechanism behind how light slows down in passing through a medium, and why the slow-down rate would depend on color. Editing from long-form to short by Dawid Kołodziej

3 months, 1 week ago @ youtube.com
Why do we call them "scalars"?

A link to the full video is at the bottom of the screen.

Or, for reference: https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab Editing from long-form to short by Dawid Kołodziej

3 months, 2 weeks ago @ youtube.com
A beautiful international math olympiad problem

The link to the full video is at the bottom of the screen. For reference, here it is: https://youtu.be/M64HUIJFTZM

3 months, 2 weeks ago @ youtube.com
Two Minute Papers
latest post 2 days, 14 hours ago
GPT-4 Just Got Supercharged!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers ChatGPT:

https://chat.openai.com/ Chatbot arena leaderboard:

https://chat.lmsys.org/?leaderboard Video studying Devin: https://www.youtube.com/watch?v=tNmgmwEtoWE Try this instead! https://github.com/princeton-nlp/SWE-agent 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gord…

2 days, 14 hours ago @ youtube.com
NVIDIA’s AI Puts You In a Video Game 75,000x Faster!

❤️ Check out Microsoft Azure AI and try it out for free:

https://azure.microsoft.com/en-us/solutions/ai 📝 The paper "Live 3D Portrait: Real-Time Radiance Fields for Single-Image Portrait View Synthesis " is available here:

https://research.nvidia.com/labs/nxp/lp3d/ Fully Connected conference: https://wandb.ai/site/resources/events/fully-connected MKBHD, iJustine, Brian Tong persona source: https://www.youtube.com/watch?v=dtp6b76pMak 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous …

5 days, 19 hours ago @ youtube.com
DeepMind’s New AI Saw 15,000,000,000 Chess Boards!

❤️ Check out Microsoft Azure AI and try it out for free:

https://azure.microsoft.com/en-us/solutions/ai 📝 The paper "Grandmaster-Level Chess Without Search" is available here:

https://arxiv.org/abs/2402.04494 +1:

https://www.anthropic.com/news/decomposing-language-models-into-understandable-components

https://transformer-circuits.pub/2023/monosemantic-features/index.html 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Ba…

1 week ago @ youtube.com
NVIDIA’s New Tech: Master of Illusions!

❤️ Check out Microsoft Azure AI and try it out for free:

https://azure.microsoft.com/en-us/solutions/ai 📝 The paper "ViCMA: Visual Control of Multibody Animations" is available here:

https://research.nvidia.com/labs/prl/vicma/ 📝 The paper "Inverse-Foley Animation: Synchronizing rigid-body motions to sound" is available here:

https://www.cs.cornell.edu/projects/Sound/ifa/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Ba…

1 week, 2 days ago @ youtube.com
DeepMind’s Gemini AI: Assistant From The Future!

❤️ Check out Microsoft Azure AI and try it out for free:

https://azure.microsoft.com/en-us/solutions/ai 📝 The paper "Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context" is available here:

https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf 📝 The paper "Gemma: Open Models Based on Gemini Research and Technology" is available here:

https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf Try Gemma:

https://huggingface.co/chat Sources:

https://twitter.com/skirano/status/1760468624706351383

https://twitter.com/mckaywrigley/status/1761113846520131816

https://simonwillison.net/2024/Feb/21/gemini-pro-video/

https://twitter.com/ha…

1 week, 6 days ago @ youtube.com
Blender 4.1 - An Amazing Tool…For Free!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers Get Blender: https://www.blender.org/ Demo files: https://www.blender.org/download/demo-files/

Blend Swap: https://www.blendswap.com/ Andrew Price's donut tutorial:

https://www.youtube.com/watch?v=B0J27sf9N1Y&list=PLjEaoINr3zgEPv5y--4MKpciLaoQYZB1Z&index=2 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Be…

2 weeks, 1 day ago @ youtube.com
OpenAI Sora: Beauty And Horror!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 My Master thesis on fluids, with source code: https://users.cg.tuwien.ac.at/zsolnai/gfx/fluid_control_msc_thesis/

📝 Paper/poster on fluid control, with source code: https://users.cg.tuwien.ac.at/zsolnai/gfx/real_time_fluid_control_eg/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Bri…

2 weeks, 5 days ago @ youtube.com
OpenAI Sora Just Supercharged Filmmaking!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers The conference:

https://wandb.ai/site/resources/events/fully-connected First impressions from creatives: https://openai.com/blog/sora-first-impressions 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, Gordon Child, Jace O'Brien, John Le, Kyle Davis, Lukas Biewald…

3 weeks, 3 days ago @ youtube.com
NVIDIA GTC: This Is The Future Of Everything!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, Gordon Child, Jace O'Brien, John Le, Kyle Davis, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Putra Iskandar, Richard Sundvall, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Tyb…

3 weeks, 5 days ago @ youtube.com
DeepMind’s New Gaming AI Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "A generalist AI agent for 3D virtual environments" is available here:

https://deepmind.google/discover/blog/sima-generalist-ai-agent-for-3d-virtual-environments/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, Gordon Child, Jace O'Brien, John Le, Ky…

1 month ago @ youtube.com
The First AI Software Engineer Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, Gordon Child, Jace O'Brien, John Le, Kyle Davis, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Putra Iskandar, Richard Sundvall, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Tyb…

1 month, 1 week ago @ youtube.com
The First AI Virus Is Here!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper "ComPromptMized: Unleashing Zero-click Worms that Target GenAI-Powered Applications" is available here:

https://sites.google.com/view/compromptmized 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, Gordon Child, Jace O'Brien, John Le, Kyle Davis, Luka…

1 month, 1 week ago @ youtube.com
Claude 3 AI: Smarter Than GPT-4?

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 Claude 3 is available here - try it out for free (note that we are not affiliated with them):

https://www.anthropic.com/news/claude-3-family Conference I am coming to:

https://wandb.ai/site/resources/events/fully-connected Experiments, evaluations:

https://twitter.com/JiaweiLiu_/status/1764866009175920892

https://twitter.com/tolgabilge_/status/1764754012824314102

https://twitter.com/RubenHssd/status/1764692641436827842 Additional results (very nice so far!):

https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard 📝 My paper on simulations that look almost like reality is available for free here:

1 month, 1 week ago @ youtube.com
Stable Diffusion 3 - An Amazing AI For Free!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis" is available here:

https://stability.ai/news/stable-diffusion-3-research-paper Waitlist: https://stability.ai/stablediffusion3 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, G…

1 month, 2 weeks ago @ youtube.com
DeepMind’s New AI Makes Games From Scratch!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Genie: Generative Interactive Environments" is available here:

https://sites.google.com/view/genie-2024/ 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, Gordon Child, Jace O'Brien, John Le, Kyle Davis, Lukas Biewald, Martin, Michael Albrecht, Michae…

1 month, 2 weeks ago @ youtube.com
DataFest Video
latest post None
Яндекс. Компьютерные науки
latest post 8 months ago
03. Discussion: "The Near Future of Diffusion Models"

Participants: - Petr Ermakov, ML Brand Director, Yandex

- Ivan Barabanov, Developer, VKontakte, deep vk

- Valentin Khrulkov, Lead Researcher, Yandex Research; Denis Dimitrov, Executive Director of Data Research, Sber AI, scientific consultant at AIRI. Together with specialists in diffusion image models, we discuss where the field is heading, its current state, and the key technical barriers it faces.

8 months ago @ youtube.com
02. Practical Aspects of Training Large-Scale Diffusion Models - Valentin Khrulkov

Speaker: Valentin Khrulkov, Lead Researcher, Yandex Research. We walk through every stage of training foundational diffusion models, from dataset preparation to regular quality measurements during training. We discuss scaling-law experiments and their preliminary results, as well as various practical applications of these models: generating ad banners, personalization, and the Shedevrum social-network service.

8 months ago @ youtube.com
01. Kandinsky 2.X - Andrey Kuznetsov

Speaker: Andrey Kuznetsov, Executive Director of Data Research, Sber AI. We review the theory and publicly available information on the diffusion process. I will show in detail how the Kandinsky 2.1 architecture differs from 2.2. We discuss metrics for evaluating generation quality, go over the key results of the releases, and look at the business scenarios and practical applications where our model can be used.

8 months ago @ youtube.com
ML Party 03.08.2023

Welcome to an evening meetup for ML engineers from Yandex. This time we talk about the currently fashionable generative models, namely diffusion image models! Program:

00:00:00 – countdown timer before the stream

00:02:12 – conference opening

00:04:52 – Kandinsky 2.X

Andrey Kuznetsov, Executive Director of Data Research, Sber AI.

00:46:55 – break

01:03:40 – Practical aspects of training large-scale diffusion models. Valentin Khrulkov, Lead Researcher, Yandex Research. 01:49:34 – Discussion: "The Near Future of Diffusion Models". Join our Telegram community to keep up with all Yandex events and meetups: https://t.me/yadatadojo. Question…

8 months, 2 weeks ago @ youtube.com
ML Trainings
latest post 4 weeks, 1 day ago
Data Fusion Contest 2024 - meetup with a follow-up breakdown of the Geoanalytics task and Q&A (21.03.2024)

Speakers: Dmitry Kolodezev, Alexey Pustynnikov, Alexey Natekin. Competition page: https://ods.ai/tracks/data-fusion-2024-competitions

The competition deadline is April 5, 2024 - join in!

4 weeks, 1 day ago @ youtube.com
Data Fusion Contest 2024 - meetup on the Geoanalytics and Churn Model tasks (29.02.2024)

Speakers: Alexey Natekin, Dmitry Kolodezev. Competition page: https://ods.ai/tracks/data-fusion-2024-competitions

All presentations can be downloaded from the meetup page: https://ods.ai/tracks/data-fusion-2024-competitions/meetup The competition deadline is April 5, 2024 - join in!

1 month, 1 week ago @ youtube.com
ODS SPB, WiDS Meetup, March 7

We are delighted to invite you to a unique event celebrating the strength and contributions of women in the world of data - the "Women in Data Science" meetup! This is not just a gathering but a celebration of intellect, talent, and inspiration, organized by ODS SPB with the support of Samokat.tech. The full program is available on ODS:

https://ods.ai/events/wids-meetup-2024 Join the community: https://ods.ai/ Data Fest & Course Fest social media: https://t.me/datafest

https://vk.com/datafest

1 month, 2 weeks ago @ youtube.com
Trees and Their Ensembles 2023 | Growing a Tree

The Open ML Course: Trees and Their Ensembles:

https://ods.ai/tracks/trees-autumn23

Course season: https://ods.ai/events/course_season_autumn_23 Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

Job postings channel on Telegram: https://t.me/odsjobs

Course updates channel: https://t.me/odscourses

How to join the ODS Mattermost community chat: https://ods.ai/tracks/mattermost

2 months, 1 week ago @ youtube.com
Trees and Their Ensembles 2023 | Trees in Data Analysis

The Open ML Course: Trees and Their Ensembles: https://ods.ai/tracks/trees-autumn23

Course season: https://ods.ai/events/course_season_autumn_23

2 months, 2 weeks ago @ youtube.com
DRL Course 2023 | Model-Free Reinforcement Learning: Monte-Carlo, SARSA, Q-Learning

The Deep Reinforcement Learning 2023 course: https://ods.ai/tracks/drlcourse23

Course season: https://ods.ai/events/course_season_autumn_23 In the fourth lecture:

- We consider the MDP setting with unknown reward and state-transition functions

- We cover the Monte-Carlo and Temporal-Difference approaches for finding the Q-function in this setting

- We discuss epsilon-greedy policies

- We derive the Monte-Carlo, SARSA, and Q-learning algorithms. Course author: Anton Plaksin, researcher in the Yandex.Research group and associate professor at Ural Federal University.
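The tabular Q-learning loop covered in the lecture fits in a few lines. This sketch uses a made-up three-state chain MDP and hypothetical hyperparameters, not anything from the course materials:

```python
import random

# Toy deterministic chain MDP (made up for illustration): states 0, 1, 2;
# state 2 is terminal, and entering it gives reward 1. Actions: 0=left, 1=right.
def step(s, a):
    s2 = min(s + 1, 2) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == 2 else 0.0
    return s2, r, s2 == 2

alpha, gamma, eps = 0.5, 0.9, 0.1  # hypothetical hyperparameters
Q = [[0.0, 0.0] for _ in range(3)]
random.seed(0)

for _ in range(500):  # episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy behaviour policy
        if random.random() < eps:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # Q-learning: off-policy TD target bootstraps on max over next actions
        target = r if done else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])
        s = s2

print(Q[0])  # "right" wins: Q[0][1] converges to gamma * 1 = 0.9
```

Swapping the `max(Q[s2])` target for the Q-value of the action actually taken next turns this into on-policy SARSA, which is exactly the distinction the lecture draws.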

2 months, 2 weeks ago @ youtube.com
DRL Course 2023 | Dynamic Programming. Policy and Value Iterations

The Deep Reinforcement Learning 2023 course: https://ods.ai/tracks/drlcourse23

Course season: https://ods.ai/events/course_season_autumn_23 In the third lecture:

- We discuss the principle of dynamic programming

- We introduce the notions of v- and q-functions, as well as the notion of an optimal policy

- We write out the Bellman equations and learn to solve them with the Policy Iteration and Value Iteration methods. Course author: Anton Plaksin, researcher in the Yandex.Research group and associate professor at Ural Federal University.
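The Value Iteration method from the lecture, sketched on a hypothetical two-state MDP (the transition probabilities and rewards here are invented for illustration):

```python
# Value Iteration: V(s) <- max_a sum_{s'} P(s'|s,a) * (R(s,a,s') + gamma*V(s')).
# Hypothetical 2-state MDP: P[s][a] is a list of (prob, next_state, reward).
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 5.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
}
gamma = 0.9
V = {s: 0.0 for s in P}

for _ in range(200):  # enough sweeps for the gamma-contraction to converge
    V = {
        s: max(sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a]) for a in P[s])
        for s in P
    }

# Greedy policy read off from the converged values
greedy = {
    s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a]))
    for s in P
}
print({s: round(v, 3) for s, v in V.items()}, greedy)  # greedy: {0: 1, 1: 0}
```

Each sweep applies the Bellman optimality backup to every state simultaneously; since the backup is a gamma-contraction, the values converge geometrically to the unique fixed point.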

2 months, 2 weeks ago @ youtube.com
DRL Course 2023 | Practice Session 2. PyTorch and Deep Cross-Entropy Method.

The Deep Reinforcement Learning 2023 course: https://ods.ai/tracks/drlcourse23

Course season: https://ods.ai/events/course_season_autumn_23 In the second practice session: - We get familiar with PyTorch

- We implement linear regression

- We solve a regression task with neural networks

- We implement the Deep Cross-Entropy method and solve CartPole
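The plain (non-deep) Cross-Entropy Method underlying the session's Deep Cross-Entropy exercise can be sketched on a one-dimensional toy objective; in the deep variant, the "refit the distribution to the elites" step becomes a supervised training step for a neural-network policy. All numbers here are illustrative:

```python
import random
import statistics

# Cross-Entropy Method, simplest form: maximize f by repeatedly sampling from
# a Gaussian, keeping the best ("elite") fraction, and refitting mu and sigma.
def f(x):
    return -(x - 3.0) ** 2  # toy objective with its maximum at x = 3

random.seed(0)
mu, sigma = 0.0, 2.0  # initial sampling distribution (illustrative)
for _ in range(30):
    samples = [random.gauss(mu, sigma) for _ in range(100)]
    elite = sorted(samples, key=f, reverse=True)[:10]  # top 10% by score
    mu = statistics.mean(elite)
    sigma = statistics.stdev(elite) + 1e-3  # noise floor keeps exploration alive
print(round(mu, 2))  # ends up near 3.0
```

For CartPole, the samples become whole episodes, the score becomes episode return, and the elite episodes' (state, action) pairs are the training set for the policy network.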

2 months, 3 weeks ago @ youtube.com
Linear Models 2023 | Homework Walkthrough

The Open ML Course: Linear Models 2023: https://ods.ai/tracks/linear-models-autumn23

Course season: https://ods.ai/events/course_season_autumn_23

2 months, 3 weeks ago @ youtube.com
DRL Course 2023 | Practice Session 3. Policy Iteration

In the third practice session: - We get familiar with the Frozen Lake environment

- We implement Policy Iteration. Course author: Anton Plaksin, researcher in the Yandex.Research group and associate professor at Ural Federal University.
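Policy Iteration, the subject of this session, alternates policy evaluation with greedy improvement. A minimal sketch on a made-up two-state deterministic MDP (Frozen-Lake-like in spirit only; the session itself works in the Frozen Lake environment):

```python
# Policy Iteration = repeat (evaluate current policy, improve it greedily).
# Made-up deterministic MDP: states 0 and 1, state 2 terminal;
# next_state[s][a] with a reward of 1.0 for entering state 2.
next_state = {0: {0: 0, 1: 1}, 1: {0: 0, 1: 2}}
gamma = 0.9

def q(s, a, V):
    s2 = next_state[s][a]
    r = 1.0 if s2 == 2 else 0.0
    return r + gamma * V.get(s2, 0.0)  # terminal state has value 0

policy = {0: 0, 1: 0}
while True:
    # Policy evaluation: iterate the Bellman expectation backup to a fixed point
    V = {s: 0.0 for s in policy}
    for _ in range(100):
        V = {s: q(s, policy[s], V) for s in policy}
    # Policy improvement: act greedily with respect to the evaluated V
    improved = {s: max((0, 1), key=lambda a: q(s, a, V)) for s in policy}
    if improved == policy:
        break
    policy = improved

print(policy, V)  # {0: 1, 1: 1}: always move right; V = {0: 0.9, 1: 1.0}
```

Because improvement never makes the policy worse and there are finitely many policies, the loop terminates at an optimal policy, here after three rounds.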

2 months, 3 weeks ago @ youtube.com
Linear Models 2023 | Model Selection. Creating New Features

The Open ML Course: Linear Models 2023: https://ods.ai/tracks/linear-models-autumn23

Course season: https://ods.ai/events/course_season_autumn_23

2 months, 3 weeks ago @ youtube.com
My First Data Project: From Idea to Product - Building a Product Prototype. Proof of Concept

Course page: https://ods.ai/tracks/my_first_data_project

All supplementary materials are in the block on the course page: https://ods.ai/tracks/my_first_data_project/blocks/98c41cb4-aaff-4c5e-8ddf-252be36ed722

2 months, 3 weeks ago @ youtube.com
Trees and Their Ensembles 2023 | Additional Constraints When Building Trees

The Open ML Course: Trees and Their Ensembles:

https://ods.ai/tracks/trees-autumn23

Course season: https://ods.ai/events/course_season_autumn_23

2 months, 4 weeks ago @ youtube.com
Fundamentals of ML System Design (autumn 2023 update)

The ML System Design course: https://ods.ai/tracks/ml-system-design-23

Course season: https://ods.ai/events/course_season_autumn_23

2 months, 4 weeks ago @ youtube.com
DRL Course 2023 | Introduction to Neural Networks. Deep Cross-Entropy Method

The Deep Reinforcement Learning 2023 course: https://ods.ai/tracks/drlcourse23

Course season: https://ods.ai/events/course_season_autumn_23 In the second lecture:

- we introduce the notions of a neuron, activation functions, and neural networks;

- we briefly outline the neural-network approach to regression and classification problems;

- we present Cybenko's theorem on the approximation of continuous functions by neural networks;

- we describe a modification of the Cross-Entropy algorithm that uses neural networks to solve reinforcement-learning problems with infinite state and action spaces. Course author: Anton Plaksin, researcher in the Yandex.Research group and associate professor at Ural Federal University.

3 months ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
latest post 2 days, 15 hours ago
#426 – Edward Gibson: Human Language, Psycholinguistics, Syntax, Grammar & LLMs

Edward Gibson is a psycholinguistics professor at MIT and heads the MIT Language Lab.

Please support this podcast by checking out our sponsors:– Yahoo Finance: https://yahoofinance.com– Listening: https://listening.com/lex and use code LEX to get one month free– Policygenius: https://policygenius.com/lex– Shopify: https://shopify.com/lex to get $1 per month trial– Eight Sleep: https://eightsleep.com/lex to get special savingsTranscript: https://lexfridman.com/edward-gibson-transcriptEPISODE LINKS:Edward’s X: https://x.com/LanguageMITTedLab: https://tedlab.mit.edu/Edward’s Google Scholar: https://scholar.google.com/citations?user=4FsWE64AAAAJTedLab’s YouTube: https://youtube.com/@Tedlab-MITP…

2 days, 15 hours ago @ lexfridman.com
#425 – Andrew Callaghan: Channel 5, Gonzo, QAnon, O-Block, Politics & Alex Jones

Andrew Callaghan is the host of Channel 5 on YouTube, where he does street interviews with fascinating humans at the edges of society, the so-called vagrants, vagabonds, runaways, outlaws, from QAnon adherents to Phish heads to O Block residents and much more.

Please support this podcast by checking out our sponsors:– ShipStation: https://shipstation.com/lex and use code LEX to get 60-day free trial– BetterHelp: https://betterhelp.com/lex to get 10% off– LMNT: https://drinkLMNT.com/lex to get free sample pack– MasterClass: https://masterclass.com/lexpod to get 15% off– AG1: https://drinkag1.com/lex to get 1 month supply of fish oilTranscript: https://lexfridman.com/andrew-callaghan-transcri…

6 days, 18 hours ago @ lexfridman.com
#424 – Bassem Youssef: Israel-Palestine, Gaza, Hamas, Middle East, Satire & Fame

Bassem Youssef is an Egyptian-American comedian & satirist, referred to as the Jon Stewart of the Arab World.

Please support this podcast by checking out our sponsors:– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil– Shopify: https://shopify.com/lex to get $1 per month trial– Eight Sleep: https://eightsleep.com/lex to get special savings– LMNT: https://drinkLMNT.com/lex to get free sample packEPISODE LINKS:Bassem’s X: https://x.com/ByoussefBassem’s Instagram: https://instagram.com/bassemBassem’s Facebook: https://facebook.com/bassemyousseftvBassem’s Website: https://bassemyoussef.xyzPODCAST INFO:Podcast website: https://lexfridman.com/podcastApple Podcasts: https://apple.co…

2 weeks ago @ lexfridman.com
#423 – Tulsi Gabbard: War, Politics, and the Military Industrial Complex

Tulsi Gabbard is a politician, veteran, and author of For Love of Country.

Please support this podcast by checking out our sponsors:– Riverside: https://creators.riverside.fm/LEX and use code LEX to get 30% off– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free– NetSuite: http://netsuite.com/lex to get free product tour– Notion: https://notion.com/lexEPISODE LINKS:For Love of Country (book): https://amzn.to/3VLlofMTulsi’s X: https://x.com/tulsigabbardTulsi’s YouTube: https://youtube.com/@TulsiGabbardTulsi’s Podcast: https://youtube.com/@TheTulsiGabbardShowTulsi’s Instagram: https://instagram.com/tulsigabbardTulsi’s Facebook: https://facebook.com/TulsiGabbardTulsi’s Website: htt…

2 weeks, 3 days ago @ lexfridman.com
#422 – Mark Cuban: Shark Tank, DEI & Wokeism Debate, Elon Musk, Politics & Drugs

Mark Cuban is a businessman, investor, star of TV series Shark Tank, long-time principal owner of Dallas Mavericks, and founder of Cost Plus Drugs.

Please support this podcast by checking out our sponsors:– Listening: https://listening.com/lex and use code LEX to get one month free– Cloaked: https://cloaked.com/lex and use code LexPod to get 25% off– Notion: https://notion.com/lex– Eight Sleep: https://eightsleep.com/lex to get special savings– Shopify: https://shopify.com/lex to get $1 per month trialEPISODE LINKS:Mark’s X: https://twitter.com/mcubanMark’s Instagram: https://instagram.com/mcubanCost Plus Drugs: https://costplusdrugs.comShark Tank: https://abc.com/shows/shark-tankDallas Mav…

3 weeks ago @ lexfridman.com
#421 – Dana White: UFC, Fighting, Khabib, Conor, Tyson, Ali, Rogan, Elon & Zuck

Dana White is the CEO and president of the UFC.

Please support this podcast by checking out our sponsors:
– LMNT: https://drinkLMNT.com/lex to get free sample pack
– Notion: https://notion.com/lex
– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil
– InsideTracker: https://insidetracker.com/lex to get 20% off
Transcript: https://lexfridman.com/dana-white-transcript
EPISODE LINKS:
Dana’s X: https://x.com/danawhite
Dana’s Instagram: https://instagram.com/danawhite
Dana’s Facebook: https://facebook.com/danawhite
UFC’s YouTube: https://youtube.com/@UFC
UFC’s Website: https://ufc.com/
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: h…

3 weeks, 4 days ago @ lexfridman.com
#420 – Annie Jacobsen: Nuclear War, CIA, KGB, Aliens, Area 51, Roswell & Secrecy

Annie Jacobsen is an investigative journalist and author of “Nuclear War: A Scenario” and many other books on war, weapons, government secrecy, and national security.

Please support this podcast by checking out our sponsors:
– HiddenLayer: https://hiddenlayer.com/lex
– BetterHelp: https://betterhelp.com/lex to get 10% off
– Policygenius: https://policygenius.com/lex
– NetSuite: http://netsuite.com/lex to get free product tour
EPISODE LINKS:
Nuclear War: A Scenario (book): https://amzn.to/3THZHfr
Annie’s Twitter: https://twitter.com/anniejacobsen
Annie’s Website: https://anniejacobsen.com/
Annie’s Books: https://amzn.to/3TGWyMJ
Annie’s Books (audio): https://adbl.co/49ZnI7c
PODCAST INFO:
Podcast website…

4 weeks ago @ lexfridman.com
#419 – Sam Altman: OpenAI, GPT-5, Sora, Board Saga, Elon Musk, Ilya, Power & AGI

Sam Altman is the CEO of OpenAI, the company behind GPT-4, ChatGPT, Sora, and many other state-of-the-art AI technologies.

Please support this podcast by checking out our sponsors:
– Cloaked: https://cloaked.com/lex and use code LexPod to get 25% off
– Shopify: https://shopify.com/lex to get $1 per month trial
– BetterHelp: https://betterhelp.com/lex to get 10% off
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
Transcript: https://lexfridman.com/sam-altman-2-transcript
EPISODE LINKS:
Sam’s X: https://x.com/sama
Sam’s Blog: https://blog.samaltman.com/
OpenAI’s X: https://x.com/OpenAI
OpenAI’s Website: https://openai.com
ChatGPT Website: https://chat.openai.com/
Sora Website: https://op…

1 month ago @ lexfridman.com
#418 – Israel-Palestine Debate: Finkelstein, Destiny, M. Rabbani & Benny Morris

Norman Finkelstein and Benny Morris are historians.

Mouin Rabbani is a Middle East analyst.

Steven Bonnell (aka Destiny) is a political livestreamer.

On some podcast players you should be able to click the timestamp to jump to that time.

(00:00) – Introduction
(12:11) – 1948
(1:10:43) – Partition
(2:15:16) – October 7
(3:09:27) – Gaza
(3:36:02) – Peace
(4:40:47) – Hope for the future

1 month ago @ lexfridman.com
#417 – Kimbal Musk: The Art of Cooking, Tesla, SpaceX, Zip2, and Family

Kimbal Musk is a chef, entrepreneur, and author of The Kitchen Cookbook: Cooking for Your Community.

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://eightsleep.com/lex to get special savings
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
– NetSuite: http://netsuite.com/lex to get free product tour
– BetterHelp: https://betterhelp.com/lex to get 10% off
Transcript: https://lexfridman.com/kimbal-musk-transcript
EPISODE LINKS:
Kimbal’s X: https://x.com/kimbal
Kimbal’s Instagram: https://instagram.com/kimbalmusk/
Kimbal’s Facebook: https://facebook.com/kimbalmuskofficial/
The Kitchen Cookbook: https://amzn.to/4ccaCoE
The Kitchen (restaurants): https://www…

1 month, 1 week ago @ lexfridman.com
#416 – Yann Lecun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI

Yann LeCun is the Chief AI Scientist at Meta, professor at NYU, Turing Award winner, and one of the most influential researchers in the history of AI.

Please support this podcast by checking out our sponsors:
– HiddenLayer: https://hiddenlayer.com/lex
– LMNT: https://drinkLMNT.com/lex to get free sample pack
– Shopify: https://shopify.com/lex to get $1 per month trial
– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil
EPISODE LINKS:
Yann’s Twitter: https://twitter.com/ylecun
Yann’s Facebook: https://facebook.com/yann.lecun
Meta AI: https://ai.meta.com/
PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
R…

1 month, 1 week ago @ lexfridman.com
#415 – Serhii Plokhy: History of Ukraine, Russia, Soviet Union, KGB, Nazis & War

Serhii Plokhy is a Ukrainian historian at Harvard University, director of the Ukrainian Research Institute, and the author of many books on the history of Eastern Europe, including his latest, The Russo-Ukrainian War: The Return of History.

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://eightsleep.com/lex to get special savings
– Shopify: https://shopify.com/lex to get $1 per month trial
– NetSuite: http://netsuite.com/lex to get free product tour
– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil
EPISODE LINKS:
Serhii’s X: https://x.com/splokhy
Serhii’s Website: https://history.fas.harvard.edu/people/serhii-plokhii
Harvard Ukrainian Research Institut…

1 month, 2 weeks ago @ lexfridman.com
#414 – Tucker Carlson: Putin, Navalny, Trump, CIA, NSA, War, Politics & Freedom

Tucker Carlson is a highly influential political commentator.

You can watch and listen to him on the Tucker Carlson Network and the Tucker Carlson Podcast.

Please support this podcast by checking out our sponsors:
– ZipRecruiter: https://ziprecruiter.com/lex
– Listening: https://listening.com/lex and use code LEX to get one month free
– HiddenLayer: https://hiddenlayer.com/lex
– LMNT: https://drinkLMNT.com/lex to get free sample pack
– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil
Transcript: https://lexfridman.com/tucker-carlson-transcript
EPISODE LINKS:
Tucker Carlson Network: https://tuckercarlson.com/
Tucker Carlson Podcast: https://tuckercarlson.com/listen/
Tucker’s X: https://…

1 month, 3 weeks ago @ lexfridman.com
#413 – Bill Ackman: Investing, Financial Battles, Harvard, DEI, X & Free Speech

Bill Ackman is an investor who has led some of the biggest and most controversial financial trades in history.

He is the founder and CEO of Pershing Square Capital Management.

Please support this podcast by checking out our sponsors:
– LMNT: https://drinkLMNT.com/lex to get free sample pack
– Policygenius: https://policygenius.com/lex
– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil
– Eight Sleep: https://eightsleep.com/lex to get special savings
– BetterHelp: https://betterhelp.com/lex to get 10% off
EPISODE LINKS:
Bill’s X: https://twitter.com/BillAckman
Pershing Square Holdings: https://pershingsquareholdings.com/
Pershing Square Foundation: https://pershingsquarefoundation.org
Neri Oxman …

1 month, 4 weeks ago @ lexfridman.com
#412 – Marc Raibert: Boston Dynamics and the Future of Robotics

Marc Raibert is the founder and former long-time CEO of Boston Dynamics and, more recently, Executive Director of the newly created Boston Dynamics AI Institute.

Please support this podcast by checking out our sponsors:
– HiddenLayer: https://hiddenlayer.com/ and use code LEX
– Babbel: https://babbel.com/lexpod and use code Lexpod to get 55% off
– MasterClass: https://masterclass.com/lexpod to get 15% off
– NetSuite: http://netsuite.com/lex to get free product tour
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
EPISODE LINKS:
Boston Dynamics AI Institute: https://theaiinstitute.com/
Boston Dynamics YouTube: https://youtube.com/@bostondynamics
Boston Dynamics X: https://x.com/BostonDynamics
Bo…

2 months ago @ lexfridman.com
Microsoft Research Podcast
last post 3 days, 23 hours ago
Abstracts: April 16, 2024

GRETCHEN HUIZINGA: Welcome to Abstracts, a Microsoft Research Podcast that puts the spotlight on world-class research in brief.

CHAKRABORTY: So satellite connectivity is nothing new and has been there for long.

So we are talking about the satellites that are at least 10 to 20 times cheaper and smaller than state-of-the-art satellites.

So the device sends some packet to the satellite; satellite sends some packet to the device—it’s all about packet exchange.

So our vision is clear: to bring 24-7 connectivity for devices anywhere on Earth with just the click of a power button.

3 days, 23 hours ago @ microsoft.com
Ideas: Language technologies for everyone with Kalika Bali

Behind every emerging technology is a great idea propelling it forward. In the new Microsoft Research Podcast series, Ideas, members of the research community at Microsoft discuss the beliefs that animate their research, the experiences and thinkers that inform it, and the positive human impact it targets. In this episode, host Gretchen Huizinga talks with Principal Researcher Kalika Bali. Inspired by an early vision of “talking computers” and a subsequent career in linguistics, Bali has spent the last two decades bringing the two together. Aided by recent advances in large language models and motivated by her belief that everyone should have access to AI in their own language, Bali and her…

1 week, 1 day ago @ microsoft.com
AI Frontiers: Rethinking intelligence with Ashley Llorens and Ida Momennejad

And so I just want to start here: for you, Ida, what is general intelligence?

Different people at different times provide different criteria for what would be the artificial general intelligence notion.

One is artificial general intelligence and the other is humanlike intelligence or human-level intelligence.

Artificial general intelligence and humanlike, human-level intelligence—how do these two concepts relate to you?

LLORENS: So it sounds like a very extensive set of experiments across many different tasks and with many different leading AI models, and you’ve uncovered a lack of robustness across some of these different tasks.

3 weeks, 1 day ago @ microsoft.com
Abstracts: March 21, 2024

GRETCHEN HUIZINGA: Welcome to Abstracts, a Microsoft Research Podcast that puts the spotlight on world-class research in brief.

These two examples are also the differences from other deep learning OFDFT works.

This is the generalization challenge and is one of the major challenges of deep learning methods for molecular science applications.

This somehow shows the benefits of leveraging the OFDFT framework for using a deep learning method to solve molecular tasks.

You can also read it on arXiv, or you can check out the March 2024 issue of Nature Computational Science.

4 weeks, 1 day ago @ microsoft.com
Abstracts: February 29, 2024

And so we realized that working with generative AI really parallels these different aspects of what a manager does, right.

So this requires having self-awareness of the applicability of generative AI to your workflows and maintaining an appropriate level of confidence in completing tasks manually or relying on generative AI.

For example, whether it’s worth it for you to actually learn how to work with generative AI more effectively.

But I think, given how generative AI has rolled out in the world today, I mean, a lot of the focus has been on productivity and workflows.

If you want to read the full paper on metacognition and generative AI, you can find a link at aka.ms/abstracts, or you can …

1 month, 2 weeks ago @ microsoft.com
What’s Your Story: Nicole Forsgren

NICOLE FORSGREN: Yeah, it’s, you know, it’s, kind of, this ridiculous story.

I was there for, you know, two, three years, and I’m doing really, really well.

GEHRKE: This is just “I had a feeling.”
FORSGREN: In my gut, I’m like, I’m doing really well.

After that, things were going really well, but we were also growing and scaling really, really rapidly.

Because I realized there were pieces about research that I really, really loved.

2 months ago @ microsoft.com
What’s Your Story: Ivan Tashev

In the Microsoft Research Podcast series What’s Your Story, Johannes Gehrke explores the who behind the technical and scientific advancements helping to reshape the world.

A systems expert whose 10 years with Microsoft spans research and product, Gehrke talks to members of the company’s research community about what motivates their work and how they got where they are today.

Partner Software Architect Ivan Tashev’s expertise in audio signal processing has contributed to the design and study of audio components for Microsoft products such as Kinect, Teams, and HoloLens.

In this episode, Tashev discusses how a first-place finish in the Mathematical Olympiad fueled a lifelong passion for shoot…

2 months, 2 weeks ago @ blubrry.com
Abstracts: January 25, 2024

And up until now, there’s been a lot of work on pruning model parameters for a variety of reasons.

But generally, these papers show that as parameters are removed from the model, performance just does not degrade.

You can, overall, keep performance roughly the same even with a fairly drastic reduction of model parameters.

HUIZINGA: So, Jordan, I often think of an abstract as a sort of appetizer for a research paper.

I think for one, as a practical matter, there’s this question of just what’s the best way to find the best LASER intervention?

2 months, 3 weeks ago @ microsoft.com
AI Frontiers: A deep dive into deep learning with Ashley Llorens and Chris Bishop

LLORENS: Your new book lays out foundations in statistics and probability theory for modern machine learning.

LLORENS: Another concept that is key in machine learning is generalization.

So that’s, that’s … we show that in the book, in fact.

BISHOP: [LAUGHS] Well, that’s, that’s really interesting.

So I personally, actually, find this one of the most exciting frontiers not only of the natural sciences but also of machine learning.

4 months ago @ microsoft.com
Abstracts: December 12, 2023

Members of the research community at Microsoft work continuously to advance their respective fields.

Abstracts brings its audience to the cutting edge with them through short, compelling conversations about new and noteworthy achievements.

In this episode, Senior Principal Research Manager Tao Qin and Senior Researcher Lijun Wu discuss “FABind: Fast and Accurate Protein-Ligand Binding.” The paper, accepted at the 2023 Conference on Neural Information Processing Systems (NeurIPS), introduces a new method for predicting the binding structures of proteins and ligands during drug development.

The method demonstrates improved speed and accuracy over current methods.

Learn moreFABind: Fast and Ac…

4 months, 1 week ago @ blubrry.com
Abstracts: December 11, 2023

Members of the research community at Microsoft work continuously to advance their respective fields.

Abstracts brings its audience to the cutting edge with them through short, compelling conversations about new and noteworthy achievements.

In this episode, Principal Researcher Alessandro Sordoni joins host Gretchen Huizinga to discuss “Joint Prompt Optimization of Stacked LLMs using Variational Inference.” In the paper, which was accepted at the 2023 Conference on Neural Information Processing Systems (NeurIPS), Sordoni and his coauthors introduce Deep Language Networks, or DLNs, an architecture that treats large language models as layers within a network and natural language prompts as eac…

4 months, 1 week ago @ blubrry.com
Abstracts: December 6, 2023

Members of the research community at Microsoft work continuously to advance their respective fields.

Abstracts brings its audience to the cutting edge with them through short, compelling conversations about new and noteworthy achievements.

In this episode, Xing Xie, a Senior Principal Research Manager of Microsoft Research Asia, joins host Dr. Gretchen Huizinga to discuss “Evaluating General-Purpose AI with Psychometrics.” As AI capabilities move from task specific to more general purpose, the paper explores psychometrics, a subfield of psychology, as an alternative to traditional methods for evaluating model performance and for supporting consistent and reliable systems.

Read the paper: Ev…

4 months, 2 weeks ago @ blubrry.com
Collaborators: Teachable AI with Cecily Morrison and Karolina Pakėnaitė
Collaborators: Teachable AI with Cecily Morrison and Karolina Pakėnaitė Collaborators: Teachable AI with Cecily Morrison and Karolina Pakėnaitė

It often requires the knowledge and experience of individuals from across disciplines and institutions.

Collaborators, a Microsoft Research Podcast series, explores the relationships—both expected and unexpected—behind the projects, products, and services being pursued and delivered by researchers at Microsoft and the diverse range of people they’re teaming up with.

In this episode, Dr. Gretchen Huizinga speaks with Cecily Morrison, MBE, a Senior Principal Research Manager at Microsoft Research, and Karolina Pakėnaitė, who also goes by Caroline, a PhD student and member of the citizen design team working with Morrison on the research project Find My Things.

An AI phone application designed …

4 months, 2 weeks ago @ blubrry.com
Abstracts: November 20, 2023

Shrey and Zoë are coauthors of a paper called Contextual Confidence and Generative AI, and you can read a preprint of this paper now on arXiv.

What was the prompt—no pun intended—for generative AI that got you concerned about this?

HITZIG: I’d say this research paper draws on a few different strands of literature.

So, specifically, how does generative AI threaten our ability to identify and protect?

HUIZINGA: Zoë, if there was one takeaway that you want our listeners to get from this work on contextual confidence, what would it be?

5 months ago @ microsoft.com
Data Skeptic
last post 4 days, 5 hours ago
Pose Tracking

Dr. Talmo Pereira is a Principal Investigator at the Salk Institute for Biological Studies in San Diego, CA, where he leads a research group as a Salk Fellow.

His lab (talmolab.org) focuses on the development of deep learning-based computational methods for recognition and modeling of complex biological systems, with applications ranging from neuroscience to cancer and plant biology.

His recent work has demonstrated how advances in deep learning and computer vision can enable quantitative phenotyping of complex behaviors through the development and application of approaches for markerless motion capture (sleap.ai).

This work has been published in Nature Methods and featured in T…

4 days, 5 hours ago @ dataskeptic.com
Modeling Group Behavior

Our guest in this episode is Sebastien Motsch, an assistant professor in the School of Mathematical and Statistical Sciences at Arizona State University.

Sebastien discussed two approaches to modeling the group behavior of animals, for instance, a flock of birds.

He discussed how the Boltzmann equation and kinetic theory help them understand birds' interactions with respect to velocity changes.

Papers discussed:
– A new model for self-organized dynamics and its flocking behavior
– Heterophilious dynamics enhances consensus
Resource: the Boltzmann equation,
$$f(t + \Delta t, \mathbf{x} + \Delta \mathbf{x}, \mathbf{v} + \Delta \mathbf{v}) = f(t, \mathbf{x}, \mathbf{v}) + \left( \frac{\partial f}{\partial t} \right)_{\mathrm{coll}} \Delta t$$

1 week, 4 days ago @ dataskeptic.com
Advances in Data Loggers

Ryan discussed how the behavior of rattlesnakes is studied in the natural world, particularly with an increase in temperature.

He discussed how they collect data about the rattlesnake hunt using loggers and how they determine the loggers do not affect the animals' behavior.

Ryan discussed how he built a machine learning model to predict the behavior of the animals.

Ryan discussed what he discovered about the eating habits of rattlesnakes.

Rounding up, Ryan shared some future plans for the project.

3 weeks, 4 days ago @ dataskeptic.com
What You Know About Intelligence is Wrong (fixed)

We are joined by Hank Schlinger, a professor of psychology at California State University, Los Angeles.

His research revolves around theoretical issues in psychology and behavioral analysis.

He also discussed how intelligence is measured in a given context.

Hank mentioned why the current measure of intelligence is fundamentally flawed.

Hank discussed how psychologists can perform behavioral experiments to understand consciousness.

1 month ago @ dataskeptic.com
What You Know About Intelligence is Wrong

We are joined by Hank Schlinger, a professor of psychology at California State University, Los Angeles.

His research revolves around theoretical issues in psychology and behavioral analysis.

He also discussed how intelligence is measured in a given context.

Hank mentioned why the current measure of intelligence is fundamentally flawed.

Hank discussed how psychologists can perform behavioral experiments to understand consciousness.

1 month ago @ dataskeptic.com
Animal Decision Making

Our guest is Aimee, a researcher at the University of Missouri, St. Louis, and the interim director at the Whitney R. Harris World Ecology Center.

Aimee discussed how animals perceive information and what they use it for.

She also discussed the costs required for learning and factors that affect animal learning.

She also discussed the different kinds of evolutionary experiments that can be performed.

Aimee discussed some models that researchers use during evolutionary experiments.

1 month, 1 week ago @ dataskeptic.com
Octopus Cognition

We are joined by Tamar Gutnick, a visiting professor at the University of Naples Federico II in Napoli, Italy.

She studies the octopus nervous system and their behavior, focusing on cognition and learning behaviors.

Tamar gave a background to the kind of research she does — lab research.

She discussed the octopus nervous system and why they are unique compared to other animals.

She discussed how they measure the behavior of octopuses using a video recording and a logger to track brain activity.

1 month, 1 week ago @ dataskeptic.com
Optimal Foraging

Claire discussed how bumblebees make foraging decisions and how they communicate when foraging.

She discussed how they set up experiments in the lab to address questions about bumblebees foraging.

Claire discussed factors that drive an animal's foraging decisions.

She also touched on some irrational foraging behaviors she observed in her study.

She discussed the effect of climate change on foraging bees' learning behavior.

1 month, 3 weeks ago @ dataskeptic.com
Memory in Chess

On today’s show, we are joined by our co-host, Becky Hansis-O’Neil. Becky is a Ph.D. student at the University of Missouri, St. Louis, where she studies bumblebees and tarantulas to understand their learning and cognition. She joins us to discuss the paper Perception in Chess, which aimed to understand how chess players perceive the positions of chess pieces on a chess board. She discussed the paper’s findings, including situations where grandmasters had better recall of chess positions than beginners and situations where they did not. Becky and Kyle discussed the use of chess engines for cheating. They also discussed how chess players use chunking. Becky discussed some approac…

2 months, 1 week ago @ dataskeptic.com
OpenWorm

On this episode, we are joined by Stephen Larson, the CEO of MetaCell and an affiliate of the OpenWorm Foundation.

Stephen discussed what the OpenWorm project is about.

They hope to use a digital model of the nematode C. elegans to study the basics of life.

Stephen discussed why C. elegans is an ideal organism for studying life in the lab.

He also mentioned how students can get involved in the OpenWorm project.

2 months, 2 weeks ago @ dataskeptic.com
What the Antlion Knows

Our guest is Becky Hansis-O’Neil, a Ph.D. student at the University of Missouri, St. Louis.

Becky discussed how they designed an experiment using a T-maze to understand antlions' behavior.

Becky discussed some interesting findings from the experiment.

Becky gave her thoughts on the findings of the paper and the methodologies used.

Paper in focus: Operant conditioning in antlion larvae and its impairment following exposure to elevated temperatures
Follow our guest: X, Website
This season’s cover art features original photographs by Becky Hansis-O’Neil

2 months, 3 weeks ago @ dataskeptic.com
AI Roundtable

Kyle is joined by friends and former guests Pramit Choudhary and Frank Bell to have an open discussion of the impacts LLMs and machine learning have had in the past year on industry, and where things may go in the current year.

3 months ago @ dataskeptic.com
Uncontrollable AI Risks

We are joined by Darren McKee, a policy advisor and the host of Reality Check, a critical thinking podcast.

Darren gave a background about himself and how he got into the AI space.

Darren shared his thoughts on the prospect of achieving AGI in the coming years.

Darren discussed his worry about AI surpassing human understanding of the universe and potentially causing harm to humanity.

He explored whether AI possesses inherently evil intentions and gave his thoughts on regulating AI.

3 months, 3 weeks ago @ dataskeptic.com
I LLM and You Can Too

This episode is not yet released.

Coming soon

3 months, 4 weeks ago @ dataskeptic.com
Q&A with Kyle

Thanks to our sponsors for their support

4 months ago @ dataskeptic.com
SuperDataScience
last post 1 day, 1 hour ago
776: Deep Utopia: AI Could Solve All Human Problems in Our Lifetime

What are the risks of AI progressing beyond a point of no return?

What do we stand to gain?

On this Five-Minute Friday, Jon Krohn talks ‘books’ as he outlines two nonfiction works on AI and futurism by Oxford philosopher…

1 day, 1 hour ago @ soundcloud.com
775: What will humans do when machines are vastly more intelligent? With Aleksa Gordić

Tech entrepreneurship, artificial superintelligence, and the future of education: Aleksa Gordić speaks to Jon Krohn about his strategies for self-directed learning, the traits that help people succeed in moving from big …

4 days, 1 hour ago @ soundcloud.com
774: RFM-1 Gives Robots Human-like Reasoning and Conversation Abilities

Covariant's RFM-1: Jon Krohn explores the future of AI-driven robotics with RFM-1, a groundbreaking robot arm designed by Covariant and discussed by A.I. roboticist Pieter Abbeel.

Explore how this innovation aims to merg…

1 week, 1 day ago @ soundcloud.com
773: Deep Reinforcement Learning for Maximizing Profits, with Prof. Barrett Thomas

Dr. Barrett Thomas, an award-winning Research Professor at the University of Iowa, explores the intricacies of Markov decision processes and their connection to Deep Reinforcement Learning.

Discover how these concepts ar…

1 week, 4 days ago @ soundcloud.com
772: In Case You Missed It in March 2024

PyTorch benefits, how to get funding for your AI startup, and managing scientific silos: In our new series for SuperDataScience, “In Case You Missed It”, host Jon Krohn engages in some “reinforcement learning through hum…

2 weeks, 1 day ago @ soundcloud.com
771: Gradient Boosting: XGBoost, LightGBM and CatBoost, with Kirill Eremenko

Kirill Eremenko joins Jon Krohn for another exclusive, in-depth teaser for a new course just released on the SuperDataScience platform, “Machine Learning Level 2”.

Kirill walks listeners through why decision trees and ra…

2 weeks, 4 days ago @ soundcloud.com
770: The Neuroscientific Guide to Confidence

Explore the science of confidence with Lucy Antrobus, as she unveils neuroscience-backed strategies to build and boost confidence through practice, positive energy, and the power of laughter.

An essential listen for fost…

3 weeks, 1 day ago @ soundcloud.com
769: Generative AI for Medicine, with Prof. Zack Lipton

Generative AI in medicine takes center stage as Prof. Zachary Lipton, Chief Scientific Officer at Abridge, joins host Jon Krohn to discuss the significant advancements in AI that are reshaping healthcare.

This episode i…

3 weeks, 4 days ago @ soundcloud.com
768: Is Claude 3 Better than GPT-4?

Claude 3, LLMs and testing ML performance: Jon Krohn tests out Anthropic’s new model family, Claude 3, which includes the Haiku, Sonnet and Opus models (written in order of their performance power, from least to greatest…

4 weeks, 1 day ago @ soundcloud.com
767: Open-Source LLM Libraries and Techniques, with Dr. Sebastian Raschka

Jon Krohn sits down with Sebastian Raschka to discuss his latest book, Machine Learning Q and AI, the open-source libraries developed by Lightning AI, how to exploit the greatest opportunities for LLM development, and wh…

1 month ago @ soundcloud.com
766: Vonnegut's Player Piano (1952): An Eerie Novel on the Current AI Revolution

Kurt Vonnegut's "Player Piano" delivers striking parallels between its dystopian vision and today's AI challenges.

This week, Jon Krohn explores the novel's depiction of a world where humans are marginalized by machines,…

1 month ago @ soundcloud.com
765: NumPy, SciPy and the Economics of Open-Source, with Dr. Travis Oliphant

Explore the origins of NumPy and SciPy with their creator, Dr. Travis Oliphant.

Discover the journey from personal need to global impact, the challenges overcome, and the future of these essential Python libraries in sci…

1 month, 1 week ago @ soundcloud.com
764: The Top 10 Episodes of 2023

Data science futurists, bestselling authors, and lively how-to guides from the industry’s top practitioners, which range from applying data science for good to using open-source tools for NLP: This is The Super Data Scie…

1 month, 1 week ago @ soundcloud.com
763: The Best A.I. Startup Opportunities, with venture capitalist Rudina Seseri

At Glasswing Ventures, Rudina Seseri wants to be able to answer the question: What has Glasswing Ventures done for the company beyond capital investment?

She speaks to Jon Krohn about how her company uses data to assess …

1 month, 2 weeks ago @ soundcloud.com
762: Gemini 1.5 Pro, the Million-Token-Context LLM

Jon Krohn presents an insightful overview of Google's groundbreaking Gemini 1.5 Pro, a million-token-context LLM that's transforming the landscape of AI.

Discover the innovative aspects of Gemini 1.5 Pro, from its extensive cont…

1 month, 2 weeks ago @ soundcloud.com
Data Science at Home
last post 18 hours ago
Rust in the Cosmos: Decoding Communication Part 2 (Ep. 255)

In this episode of “Rust in the Cosmos” we delve into the challenge of testing software for… ehm… space. How can Rust help?

Build robotics applications in minutes, not months.

Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve.

We provide solutions in AI/ML, Fintech, Defense, Robotics and Predictive maintenance.

Communities: AeroRust, Intrepid, Bytenook
AeroRust Discord invite: https://discord.com/invite/6jJyx5nEUq
AeroRust website: AeroRust.org
Intrepid AI Discord: https://discord.gg/cSSzche6Ct
Intrepid AI website: https://intrepid.ai

18 hours ago @ datascienceathome.com
Rust in the Cosmos: Decoding Communication Part I (Ep. 254)

In this inaugural episode of “Rust in the Cosmos,” we delve into the intricacies of communication in space and some of the challenges in space application development.

1 week, 2 days ago @ datascienceathome.com
AI and Video Game Development: Navigating the Future Frontier (Ep. 253)

In this episode we delve into the dynamic realm of game development and the transformative role of artificial intelligence (AI).

Join Frag, Jim and Mike as they explore the current landscape of game development processes, from initial creative ideation to the integration of AI-driven solutions.

With Mike’s expertise as a software executive and avid game developer, we uncover the potential of AI to revolutionize game design, streamline development cycles, and enhance player experiences.

Sponsors: Intrepid AI is an AI-assisted all-in-one platform for robotics teams.

Build robotics applications in minutes, not months.

2 weeks, 6 days ago @ datascienceathome.com
Kaggle Kommando’s Data Disco: Laughing our Way Through AI Trends (Ep. 252)

In this episode, join me and the Kaggle Grand Master, Konrad Banachewicz, for a hilarious journey into the zany world of data science trends.

From algorithm acrobatics to AI, creativity, Hollywood movies, and music, we just can’t get enough.

It’s the typical episode with a dose of nerdy comedy you didn’t know you needed.

Buckle up, it’s a data disco, and we’re breaking down the binary!

Sponsors: Intrepid AI is an AI-assisted all-in-one platform for robotics teams.

1 month, 1 week ago @ datascienceathome.com
Revolutionizing Robotics: Embracing Low-Code Solutions (Ep. 251)

In this episode of Data Science at Home, we explore the game-changing impact of low-code solutions in robotics development.

Discover how these tools bridge the coding gap, simplify integration, and enable trial-and-error development.

We’ll also uncover challenges with traditional coding methods using ROS.

Join us for a concise yet insightful discussion on the future of robotics!

2 months ago @ datascienceathome.com
Is SQream the fastest big data platform? (Ep. 250)

Join us in a dynamic conversation with Yori Lavi, Field CTO at SQream, as we unravel the data analytics landscape.

From debunking the data lakehouse hype to SQream’s GPU-based magic, discover how extreme data challenges are met with agility.

Yori shares success stories, insights into SQream’s petabyte-scale capabilities, and a roadmap to breaking down organizational bottlenecks in data science.

Dive into the future of data analytics with SQream’s commitment to innovation, leaving legacy formats behind and leading the charge in large-scale, cost-effective data projects.

Tune in for a dose of GPU-powered revolution!

2 months, 2 weeks ago @ datascienceathome.com
OpenAI CEO Shake-up: Decoding December 2023 (Ep. 249)

In this episode from a month ago, join me as we unravel the controversial CEO firing at OpenAI in December 2023.

I share my insights on the events, decode the intricacies, and explore what lies ahead for this influential organization.

Don’t miss this concise yet insightful take on the intersection of leadership and artificial intelligence innovation.

Sponsor: Learn what the new year holds for ransomware as a service, Active Directory, artificial intelligence and more when you download the 2024 Arctic Wolf Labs Predictions Report today at arcticwolf.com/datascience

3 months ago @ datascienceathome.com
Careers, Skills, and the Evolution of AI (Ep. 248)

!!WARNING!!

Due to some technical issues, the volume is not always constant during the show. I sincerely apologise for any inconvenience. - Francesco

In this episode, I speak with Richie Cotton, Data Evangelist at DataCamp, as he delves into the dynamic intersection of AI and education.

Richie, a seasoned expert in data science and the host of the podcast, brings together a wealth of knowledge and experience to explore the evolving landscape of AI careers, the skills essential for generative AI technologies, and the symbiosis of domain expertise and technical skills in the industry.

3 months, 1 week ago @ datascienceathome.com
Open Source Revolution: AI’s Redemption in Data Science (Ep. 247)

Dive into the world of Data Science at Home with our latest episode, where we explore the dynamic relationship between Artificial Intelligence and the redemption of open source software.

In this thought-provoking discussion, I share my insights on why now, more than ever, is the opportune moment for open source to leave an indelible mark on the field of AI.

Join me as I unpack my opinions and set expectations for the near future, discussing the pivotal role open source is set to play in shaping the landscape of data science and artificial intelligence.

Don’t miss out—tune in to gain a deeper understanding of this revolutionary intersection!

This episode is available as YouTube stream at htt…

4 months ago @ datascienceathome.com
Money, Cryptocurrencies, and AI: Exploring the Future of Finance with Chris Skinner [RB] (Ep. 246)

In this captivating podcast episode, join renowned financial expert Chris Skinner as he delves into the fascinating realm of the future of money.

From cryptocurrencies to government currencies, the metaverse to artificial intelligence (AI), Skinner explores the intricate interplay between technology and humanity.

Gain valuable insights as he defines the future of money, examines the potential impact of cryptocurrencies on traditional government currencies, and addresses the advantages and disadvantages of digital currencies.

Brace yourself for an enlightening discussion on the integration of AI in the financial sector and its potential impact on humanity.

Once subscribed, you get full acces…

4 months, 1 week ago @ datascienceathome.com
Debunking AGI Hype and Embracing Reality [RB] (Ep. 245)

In this thought-provoking episode, we sit down with renowned AI expert Filip Piekniewski, PhD, who fearlessly challenges the prevailing narratives surrounding artificial general intelligence (AGI) and the singularity.

With a no-nonsense approach and a deep understanding of the field, Filip dismantles the hype and exposes some of the misconceptions about AI, LLMs and AGI.

If you’re seeking a refreshingly pragmatic perspective on the future of AI, this episode is an absolute must-listen.

Filip Piekniewski Bio: Filip Piekniewski is a distinguished computer vision researcher and engineer, specializing in visual object tracking and perception.

He is known for his realistic perspective on AI, …

4 months, 2 weeks ago @ datascienceathome.com
Destroy your toaster before it kills you. Drama at OpenAI and other stories (Ep. 244)

Brace yourselves, dear friends!

In this episode, we delve into the earth-shattering revelation that OpenAI might have stumbled upon AGI (lol) and we’re all just seconds away from being replaced by highly sophisticated toasters (lol lol).

Spoiler alert: OpenAI’s CEO is just playing 7D chess with the entire human race.

So, sit back, relax, and enjoy this totally not ominous exploration into the ‘totally not happening’ future of AI!

4 months, 2 weeks ago @ datascienceathome.com
The AI Chip Chat 🤖💻 (Ep. 243)

Dive into the cool world of AI chips with us!

🚀 We’re breaking down how these special computer chips for AI have evolved and what makes them different.

Think of them like the superheroes of the tech world!

Don’t miss out!

🎙️🔍 #AIChips #TechTalk #SimpleScience

4 months, 2 weeks ago @ datascienceathome.com
Rolling the Dice: Engineering in an Uncertain World (Ep. 242)

Hey there, engineering enthusiasts!

Ever wondered how engineers deal with the wild, unpredictable twists and turns in their projects?

In this episode, we’re spilling the beans on uncertainty and why it’s the secret sauce in every engineering recipe, not just the fancy stuff like deep learning and neural networks!

Join us for a ride through the world of uncertainty quantification.

Tune in and let’s demystify the unpredictable together!

4 months, 2 weeks ago @ datascienceathome.com
How Language Models Are the Ultimate Database (Ep. 241)

In this episode, dive deep into the world of Language Models as we decode their intricate structure, revealing how these powerful algorithms exploit concepts from the past.

But… what if LLMs were just a database?

References: https://fchollet.substack.com/p/how-i-think-about-llm-prompt-engineering

4 months, 2 weeks ago @ datascienceathome.com