Very ML
State-of-the-art Machine Learning News Feed
/r/MachineLearning
latest post 52 minutes ago
[P] I created a small PyTorch utility to import custom datasets

Hi guys, TorchClassifierData is a small PyTorch utility to import, split, normalize, and visualize custom datasets for classification tasks, which is indispensable for real-world problems. You can find a full notebook that uses TorchClassifierData to train a classifier on this Kaggle dataset here. The source code is available on my GitHub. Thank you. submitted by /u/charles_data_dev

52 minutes ago @ reddit.com
[D] Why do Diffusion models work so well while SG-MCMC does not?

Diffusion models are basically Langevin sampling. What are the key differences and tricks that set them apart from Langevin dynamics? Why do they work so well while very similar sampling methods don't? submitted by /u/Dangerous-Flan-6581

1 hour ago @ reddit.com
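
For context on the Langevin question above, here is a minimal sketch of unadjusted Langevin dynamics on a toy 1-D Gaussian target. The target, step size, and function names are illustrative assumptions, not taken from the post. The score is known in closed form here, whereas score-based diffusion models, roughly speaking, learn a noise-conditioned score with a neural network and anneal the noise level during sampling.

```python
import math
import random


def score_gaussian(x, mu=2.0, sigma=1.0):
    """Score (gradient of the log-density) of N(mu, sigma^2) at x."""
    return -(x - mu) / sigma**2


def langevin_samples(score, n_steps=20000, step=0.1, burn_in=1000, x0=0.0, seed=0):
    """Unadjusted Langevin dynamics: x <- x + step * score(x) + sqrt(2 * step) * noise."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for t in range(n_steps + burn_in):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * score(x) + math.sqrt(2.0 * step) * noise
        if t >= burn_in:  # discard the early, not-yet-mixed part of the chain
            samples.append(x)
    return samples


samples = langevin_samples(score_gaussian)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean ~ {mean:.2f}, variance ~ {var:.2f}")
```

With a finite step size the chain is only approximately distributed as the target, so the printed mean and variance should land near 2 and 1 rather than match them exactly.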
[D] In ML, a PhD gives you a 10-year head start over weekend warriors

https://preview.redd.it/cczbhu367aqb1.png?width=1600&format=png&auto=webp&s=f1761911d7ce3bbefaef43774b5d60f638886893 ML is often portrayed as a magical field where anyone with a laptop and Python skills can build amazing AI systems. The reality is less democratic: mastering ML requires gritty, systematic work best learned through formal training. You need rock-solid foundations in math, programming, and core concepts, skills acquired through advanced education, which (almost always) is beyond self-taught hackers. Most think a PhD is unnecessary, but the reality is that advanced degrees provide the deepest training. Patience and persistence do matter, but a PhD gives you a 10-year head star…

1 hour ago @ reddit.com
[D] Offer From Big 4 vs. Startup

So briefly about my current experience: I graduated 2 years ago with a bachelor's in data science and I have 2-3 years of experience as a data scientist/ML engineer/software engineer. I’ve got competing offers, one from a Big 4 firm as a software systems engineer - AI/ML and the other from a startup as a machine learning engineer. The startup salary is higher than the Big 4 one. Additionally, the startup isn’t necessarily a unicorn; it’s a relatively small startup with an interesting product, but it doesn’t necessarily blow me away. The salary at the startup is 15 percent higher than that of the Big 4 offer. For those wondering, I did already negotiate the salary and they did increase it marginally. I…

2 hours ago @ reddit.com
[P] Hardware Resources for training SwinBert

So I've been thinking of implementing SwinBert for a college project and have been wondering what resources I would need in a computer. Any ideas? submitted by /u/Big-Brain_69

2 hours ago @ reddit.com
[D] Career advice for a mid-level ML engineer (Perception/CV)?

I’ve been having a bit of an existential crisis lately and wanted to ask for advice on how to move forward. I have a Master’s in CS with research experience and a few publications applying machine learning in a fairly niche area (so not novel from the ML side). Since graduating, I’ve worked ~2 years as an ML engineer at a small company (niche area, different from my research). I’ve done quite well here and have played a critical role in taking several big greenfield projects to completion. Most of my work is framing problems, understanding what’s possible with current research, then building the data pipelines and training models (with small mods here and there). My main worry is that I might…

3 hours ago @ reddit.com
[D] Image-to-text web-scraping

I'm curious if anyone has tried pix2struct-large for web-scraping text from websites. If so, how well did it perform? If not, is there something else that is considered better to use? submitted by /u/ReddSpark

4 hours ago @ reddit.com
[D] Where will the demand for AI work be in future?

Hypothesis: Big tech companies are investing vast amounts of money to develop general models on which others will build. They'll develop interfaces to make it easier for others to fine-tune on top of their models, so there will be less and less need for ML engineers who know how to create a deep learning model in PyTorch, and more and more need for data engineers who simply plug into pre-trained models. An AI assistant will also be quicker at coding up a more bespoke AI model for a company's needs, guided by data engineers. What do people think? Is this a scenario that they think will play out? Where will the demand for AI skills come from in the future? submitted by /…

5 hours ago @ reddit.com
[R] Researchers announce GPT4Tools: a method for teaching LLMs how to use tools for visual tasks

LLMs are great with words but can't handle visual tasks like understanding images. Teaching them to use visual tools could make them much more capable. A new paper introduces GPT4Tools, a method to efficiently teach existing LLMs to invoke tools for visual tasks without proprietary data. My highlights from the paper: (1) uses ChatGPT as a "teacher" to generate instructional data for other LLMs; (2) fine-tunes LLMs like Vicuna on this data using selective weight tuning (keeps base model frozen); (3) allows a smaller 13B LLM to match 175B GPT-3.5 on seen tools after tuning; (4) data augmentation with negative/context samples was found to be the secret sauce to get this to work; (5) can generalize to brand new visual …

5 hours ago @ reddit.com
[D] Tools to gather and collaborate on fine-tuning datasets?

Hey all, I run a small team and we are collaborating on a few datasets that we use to fine-tune GPT-3.5. We are currently using Google Sheets, and I'm wondering if there is a tool where we can organize our data, preferably with version control. Any ideas? submitted by /u/zeJaeger

5 hours ago @ reddit.com
[D] Colored Point Cloud Completion

Hello, I have created point clouds from images using Point-E. Sadly they are very sparse (for example, when inputting an image of a house, the roof has very few points in it), and I am searching for other models that could make the point cloud denser and predict the color of every point. Point-E outputs xyz and rgb vectors for every point. Do any of you have advice for me here? submitted by /u/bySmily

6 hours ago @ reddit.com
[P] Just published my second blog post on Medium about feature scaling in machine learning, please have a look

submitted by /u/indusop

6 hours ago @ reddit.com
[D][P] How to create a 3D Gymnasium environment for a MuJoCo env?

Hi, I'm a student working on an RL project for university and need some guidance. I have created a 3D model with MuJoCo (I have the XML file). How do I create an environment in Gymnasium with this XML file? For the sake of an example, let's say I have the XML file of the humanoid model: how do I load it in Gymnasium so that I can train it to walk? (This is just an example because the current project is harder to explain, but I will use the humanoid model in the project.) Or is the approach I'm trying not appropriate at all? I came across this Stack Overflow post where they say MuJoCo itself is good for this, but it was hard for me to understand due to the lack of examples. Would really a…

7 hours ago @ reddit.com
[D] Is Machine Learning Related to the Central Limit Theorem?

Machine Learning (ML) is known for requiring lots of data in order to make plausible predictions, but the Central Limit Theorem (CLT) says the more data you get, the more it will form into a normal distribution, meaning that it will gravitate towards the actual mean value. Isn't this the reason why ML needs more data for it to be precise? submitted by /u/HighwayLatter1786

8 hours ago @ reddit.com
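
As a point of reference for the question above, the CLT is a statement about the distribution of the sample mean rather than about the raw data: for i.i.d. draws with finite variance, sample means are approximately normal with a spread close to sigma / sqrt(n). The following is a minimal sketch using only the Python standard library. The exponential distribution, the trial counts, and the function names are illustrative assumptions, not from the post.

```python
import math
import random
import statistics


def sample_means(n, trials=2000, seed=0):
    """Return `trials` sample means, each over n draws from Exponential(1)."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]


def fraction_within_one_sd(values):
    """Share of values within one standard deviation of their mean (about 0.68 for a normal)."""
    mu = statistics.fmean(values)
    sd = statistics.stdev(values)
    return sum(abs(v - mu) <= sd for v in values) / len(values)


results = {}
for n in (5, 50, 500):
    means = sample_means(n)
    results[n] = (statistics.stdev(means), fraction_within_one_sd(means))
    print(f"n={n:4d}  sd of means={results[n][0]:.3f}  "
          f"1/sqrt(n)={1 / math.sqrt(n):.3f}  within 1 sd={results[n][1]:.2f}")
```

Exponential(1) has mean 1 and standard deviation 1, so the spread of the sample means should track 1/sqrt(n) even though the raw data stay skewed. The sketch only illustrates what the theorem says; it does not by itself settle why ML models benefit from more data.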
[P] Made a simple semantic segmentation annotation tool with segment-anything masks support in PyQt5

I just open-sourced (MIT License) a semantic segmentation annotation tool powered by the segment-anything model that I have used for a while in my projects. Hopefully it will help someone, as it seems to me that it is more suitable for small projects than the popular, huge web-based annotation tools. Link to the project: SAMAT (any feedback in the Discussions section on GitHub is appreciated). Features: brush annotation (as opposed to polygons); Magic Wand (like in Photoshop) powered by segment-anything masks (it is optional, if you don’t have a cool GPU to prepare masks). Why yet another annotation tool? Before starting this project I tried supervisely, segments.ai, roboflow and several others, but foun…

9 hours ago @ reddit.com
Towards Data Science
latest post 1 hour ago
Large Language Models: RoBERTa — A Robustly Optimized BERT Approach

Learn about key techniques used for BERT optimisation. Introduction: The appearance of the BERT model led to significant progress in NLP. Deriving its architecture from the Transformer, BERT achieves state-of-the-art results on various downstream tasks: language modeling, next sentence prediction, question answering, NER tagging, etc. Despite the excellent performance of BERT, researchers still continued experimenting with its configuration in hopes of achieving even better metrics. Fortunately, they succeeded and presented a new model called …

1 hour ago @ towardsdatascience.com
Don’t Start Your Data Science Journey Without These 5 Must-Do Steps From a Spotify Data Scientist

A complete guide to everything I wish I’d done before starting my Data Science journey. Here’s to acing your first year with data.

6 hours ago @ towardsdatascience.com
Matplotlib Tutorial: Let’s Take Your Country Maps to Another Level

How to draw beautiful maps with Python and Matplotlib.

8 hours ago @ towardsdatascience.com
Causal Python: Five Novel Causal Ideas At NeurIPS 2023

New exciting ideas that marry causality with generative modeling, conformal prediction and topology.

8 hours ago @ towardsdatascience.com
Anomaly Detection in TensorFlow and Keras Using the Autoencoder Method

A cutting-edge unsupervised method for noise removal, dimensionality reduction, anomaly detection, and more.

1 day, 2 hours ago @ towardsdatascience.com
How to Program a Neural Network

A step-by-step guide to implementing a neural network from scratch.

1 day, 2 hours ago @ towardsdatascience.com
Optimizing LLMs with C, and running GPT, Lama, Whisper on your laptop

In this first article, we’ll dive into ggml, the fantastic tensor library created by Georgi Gerganov. How does it work? How is the tensor…

1 day, 3 hours ago @ towardsdatascience.com
Temporal-Difference Learning and the importance of exploration: An illustrated guide

Comparing model-free and model-based RL methods on a dynamic grid world. Recently, Reinforcement Learning (RL) algorithms have gained a lot of traction by solving research problems such as protein folding, reaching a superhuman level in drone racing, or even integrating human feedback in your favorite chatbots. Indeed, RL provides useful solutions to a variety of sequential decision-making problems. Temporal-Difference Learning (TD learning) methods are a popular subset of RL algorithms. TD learning methods combine key aspects of Monte Carlo and Dynamic Programming methods to accelerate learning without requiring a perfect model of the environment dynamics. In this a…

1 day, 6 hours ago @ towardsdatascience.com
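
To make the TD idea in the teaser above concrete, here is a minimal sketch of tabular TD(0) value estimation. It uses a small random-walk chain rather than the article's grid world, and the environment, hyperparameters, and function names are all illustrative assumptions. Like Monte Carlo it learns from sampled transitions, and like Dynamic Programming it bootstraps from the current estimate of the next state.

```python
import random


def td0_random_walk(n_states=5, episodes=20000, alpha=0.01, gamma=1.0, seed=0):
    """Estimate state values of a symmetric random walk with tabular TD(0).

    Stepping off the right end gives reward 1, off the left end reward 0.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states  # initial guesses for the non-terminal states
    for _ in range(episodes):
        s = n_states // 2  # every episode starts in the middle state
        while True:
            s_next = s + rng.choice((-1, 1))
            if s_next < 0:
                reward, v_next, done = 0.0, 0.0, True
            elif s_next >= n_states:
                reward, v_next, done = 1.0, 0.0, True
            else:
                reward, v_next, done = 0.0, values[s_next], False
            # TD(0) update: move V(s) toward the bootstrapped target r + gamma * V(s')
            values[s] += alpha * (reward + gamma * v_next - values[s])
            if done:
                break
            s = s_next
    return values


values = td0_random_walk()
true_values = [(i + 1) / 6 for i in range(5)]  # analytic values for this chain
print([round(v, 2) for v in values])
```

No model of the transition probabilities is used anywhere in the update, which is the sense in which the method is model-free.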
A Taxonomy of Natural Language Processing

An overview of different fields of study and recent developments in NLP. This post is based on our RANLP 2023 paper “Exploring the Landscape of Natural Language Processing Research”. You can read more details there. Introduction: As an efficient approach to understand, generate, and process natural language texts, research in natural language processing (NLP) has exhibited a rapid spread and wide adoption in recent years. Given the rapid developments in NLP, obtaining an overview of the domain and maintaining it is difficult. This blog post aims to provide a structured overview of different fields of study in NLP and analyzes recent trends in this domain. Fields of stud…

1 day, 9 hours ago @ towardsdatascience.com
Organizational Processes for Machine Learning Risk Management

Organizational processes are a key nontechnical determinant of reliability in ML systems.

1 day, 16 hours ago @ towardsdatascience.com
Creating and Publishing Your Own Python Package for Absolute Beginners

Create, build, and publish a Python package in 5 minutes.

1 day, 16 hours ago @ towardsdatascience.com
From Hacks to Harmony: Structuring Product Rules in Recommendations

Don’t let heuristics undermine your ML; learn to combine them. In today’s data-driven landscape, recommendation systems power everything from social media feeds to e-commerce. While it’s tempting to think that machine learning algorithms do all the heavy lifting, that’s only half the story. Real-world systems often rely on a mix of machine learning and heuristic rules (commonly referred to as product rules, business rules, or simply hacks) to generate the most relevant recommendations. For example: you can’t recommend tracks from the same artist too often; you should include content from subscriptions in the feed, but not overwhelm it; if a user has already disliked a certain category or author…

1 day, 16 hours ago @ towardsdatascience.com
Now You See Me (CME): Concept-based Model Extraction

A label-efficient approach to Concept-based Models. From the AIMLAI workshop paper presented at the CIKM conference: “Now You See Me (CME): Concept-based Model Extraction” (GitHub). TL;DR. Problem: Deep Neural Network models are black boxes, which cannot be interpreted directly. As a result, it is difficult to build trust in such models. Existing methods, such as Concept Bottleneck Models, make such models more interpretable, but require a high annotation cost for annotating underlying concepts. Key Innovation: A method for generating Concept-based Models in a weakly-supervised fashion, requiring vastly fewer annotations as a result. Solution: Our Concept-base…

2 days, 2 hours ago @ towardsdatascience.com
Exploring what makes an AI Ethics Toolkit tick

AI Ethics Toolkits are everywhere, but do we really understand them? Introduction: As AI systems’ use in applications with critical implications continues to multiply, experts have been calling for more participatory and value-conscious practices when designing these systems. There are a number of benefits that increased stakeholder participation can bring to AI systems design, including making them more inclusive, combating existing biases, and providing accountability. In response, the field of AI ethics has produced a significant number of toolkits in recent years.…

2 days, 6 hours ago @ towardsdatascience.com
Quantifying GPT-4’s Hidden Regressions Over Time

Part 3 of a study on generative AI usage and testing. GPT-4 is bigger and better than GPT-3. GPT-4 can draft up eloquent speeches, pass standardized exams, and even interpret images. Since GPT-4's release on March 14, 2023, OpenAI has continued to iterate on and update it to improve its performance for the millions of queries it receives each day. However, is the latest version of GPT-4 in OpenAI’s API, called “gpt-4”, actually better than the initial version from March, called “gpt-4-0314”? From the perspective of a machine learning engineer at Kolena, this article is a continuation in a series of discussions highlighting a testing paradigm for LLMs, comparing the perfo…

2 days, 7 hours ago @ towardsdatascience.com
Distill.pub
latest post: none
The Gradient
latest post 2 weeks, 1 day ago
Text-to-CAD: Risks and Opportunities

We’ve identified three key areas where such programs can level up: dataset curation, a pattern language for usability, and filtering.

The format for a friction hinge pattern might look like this: Pattern Name: Friction Hinge. Pattern Description: The Friction Hinge pattern addresses the need for adjustable friction in hinges so as to provide tuneable resistance, but without compromising smooth movement.

Alas, once text-to-CAD models get open-sourced or leaked, many of these queries will be satisfied without compunction.

In conclusion, the emergence of AI-powered text-to-CAD generation presents both risks and opportunities, the ratio of which is still very much undecided.

Citation: For attribu…

2 weeks, 1 day ago @ thegradient.pub
Interpretability Creationism

In this piece, I will discuss the tendency towards “interpretability creationism” – interpretability methods that only look at the final state of the model and ignore its evolution over the course of training – and propose a focus on the training process to supplement interpretability research.

In the case of language models, they behave similarly to ngram models early on and exhibit linguistic patterns later.

Let us consider an explanation usually based on analyzing static models: hierarchical behavior in language models.

An Example: I recently had to manage the trap of interpretability creationism myself.

Citation: For attribution in academic contexts or books, please cite this work as Naomi …

2 months, 2 weeks ago @ thegradient.pub
What Do LLMs Know About Linguistics? It Depends on How You Ask

Understanding what LLMs learn about linguistics from large-scale pretraining is a good framework for understanding how they work in general.

How we extend in-context learning to structured prediction tasks with structured prompting (left) and an example of using structured prompting with an LLM to annotate parts-of-speech (right).

Overall, structured prompting can consistently generate the linguistic structure underlying text from LLMs, confirming that these models have implicitly learned these structures during pretraining.

This leads us to ask: what factors of structured prompting allow LLMs to annotate linguistic structure?

By searching for labels instead of full examples, we can find ta…

2 months, 2 weeks ago @ thegradient.pub
Why transformative artificial intelligence is really, really hard to achieve

Good speculated, one day self-improve, cause an intelligence explosion, and lead to an economic growth singularity?

In this essay we assemble the best arguments that we have encountered for why transformative AI is hard to achieve.

AI progress may even be accelerating the decoupling of the US and China, reducing the flow of people and ideas.

Automation alone is not enough for transformative economic growth.

If transformative AI were coming soon, real interest rates would rise in line with expectations of great future wealth or risk.

3 months ago @ thegradient.pub
Modern AI is Domestification

Introduction: As internet-scale AI models mature rapidly from coarse research demos to productionized user-facing systems, expectations have increased and goalposts have moved drastically.

Incorporating AI Feedback: AI Critics. While RLHF provides a powerful mechanism to transfer human knowledge to AI models, it also faces practical limitations: human feedback can be noisy, inconsistent, and expensive to collect.

To tackle these challenges, Reinforcement Learning from AI Feedback (RLAIF) aims to bring existing AI models into the loop by utilizing prompted pretrained models to generate preference data for training reward models.

Here are a few examples of AI feedback that amplify existing AI pri…

4 months ago @ thegradient.pub
Artificial Curiosity as Moral Virtue

An artificially intelligent agent could act and think in ways driven by curiosity, and, in these ways, exercise the moral virtue of curiosity - letting them become more and more human-like.

We begin to investigate this question through an exploration of artificial curiosity in the context of the free energy principle - as put forward by neuroscientist Karl Friston.

The free energy principle provides a way for adaptive systems to unify action, perception, and learning.

Artificial curiosity, as it could follow from the free energy principle, could be examined appropriately.

For attribution in academic contexts or books, please cite this work as Syed Hussain Ather, "Artificial Curiosity as Mora…

4 months, 1 week ago @ thegradient.pub
In-Context Learning, In Context

Background: Much recent work on large language models (LLMs) has explored the phenomenon of in-context learning (ICL).

We use the term “in-context learning” to describe the inner loop of this process, which occurs within the forward-pass upon each sequence.

In “What Learning Algorithm is In-Context Learning,” Akyürek et al., using linear regression as a toy problem, provide evidence for the hypothesis that transformer-based in-context learners indeed implement standard learning algorithms implicitly.

First, “The Learnability of In-Context Learning” presents a first-of-its-kind PAC-based framework for in-context learnability.

“Large Language Models Do In-Context Learning Differently” further s…

4 months, 4 weeks ago @ thegradient.pub
Software²: A new generation of AIs that become increasingly general by producing their own training data

Despite the large impact of the training data on model performance, mainstream training practices are not inherently data-seeking.

Instead, they ignore the quality of information within the training data in favor of maximizing data quantity.

In this sense, we say an IGI performs open-ended learning, and call its associated process for collecting training data open-ended exploration.

In general, the full data space can be unbounded and cannot be captured by a single, predefined dataset or simulator.

5 months ago @ thegradient.pub
Grounding Large Language Models in a Cognitive Foundation: How to Build Someone We Can Talk To

And we need our robots to tell the truth and say when they don’t know—to actually be useful, robots need to be trustworthy.

Robots need strong mental models so that they can learn quickly, form long-term memories, and understand truth.

Strengthening Mental Models using Curriculum Learning to Acquire a Cognitive Foundation: Robots need strong mental models to learn quickly and adapt to novel situations.

When large language models (LLMs) train only on text they are only learning from part of the information.

Conclusion: Large language models (LLMs) such as ChatGPT are trained on internet data, which is the end product of evolution instead of its beginning.

5 months, 1 week ago @ thegradient.pub
Towards Geometric Deep Learning

The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods.

Geometric Deep Learning is concerned with exposing these regularities through unified geometric principles that can be applied throughout a broad spectrum of applications.

Kunihiko Fukushima and the neocognitron, an early geometric deep learning architecture and a precursor of the modern convolutional neural networks.

The ‘Erlangen Programme’ of Deep Learning: Our historical overview of the geometric foundations of deep learning has now naturally brought us to the blueprint that underpins this book.

The Geometric Deep Learning Blueprint can be used to derive from…

7 months, 1 week ago @ thegradient.pub
Artists enable AI art - shouldn't they be compensated?

However, there is another side to the AI art process, one that is not talked about enough.

In this article, I will cover why this is the case, the debate around artist compensation in AI art, and some possible solutions to the problem.

With that out of the way, let’s move on to the overarching debate about AI art and whether it copies artists.

Let’s connect: https://rb.gy/m5ok2y My Instagram: https://rb.gy/gmvuy9 My Twitter: https://twitter.com/Machine0177681 Citation: For attribution of this in academic contexts or books, please cite this work as: Devansh Lnu, "Artists enable AI art - shouldn't they be compensated?

BibTeX citation: @article{Lnu2023aiart, author = {Lnu, Devansh}, title = {Artists enabl…

7 months, 2 weeks ago @ thegradient.pub
Do Large Language Models learn world models or just surface statistics?

Back to the mystery on whether large language models are learning surface statistics or world models, there have been some tantalizing clues suggesting language models may build interpretable “world models” with probing techniques.

Back to the question we have at the beginning: do language models learn world models or just surface statistics?

Our experiment provides evidence supporting that these language models are developing world models and relying on the world model to generate sequences.

Citation: For attribution of this in academic contexts or books, please cite this work as: Kenneth Li, "Do Large Language Models learn world models or just surface statistics?

BibTeX citation (this blog):…

8 months ago @ thegradient.pub
Reasons to Punish Autonomous Robots

Autonomous Military Robots: Design and Plausibility. As Sparrow notes, ‘autonomy’ means different things to different authors (2007, 65).

Danaher predicts that people won’t desire to punish autonomous robots because the robots don’t seem deserving of punishment.

That gives us someone to punish and might also help make sure autonomous military robots are only used judiciously.

Thus, to the extent that deterrence, restoring trust, communicating condemnation, or providing education provide good reasons for punishing human agents, they also provide reasons to punish autonomous robots.

If it is not reasonable or ethically defensible to punish* autonomous robots, we should look hard at whether it i…

8 months, 1 week ago @ thegradient.pub
Learning to Make the Right Mistakes - a Brief Comparison Between Human Perception and Multimodal LMs

This is because their top-down perception has not had enough “experience/training data” to learn and refine itself.

In a way, we can say that their “world model” is not as good as that of adults.

An interesting consequence of a strong top-down perception is the ability of us humans to see things like animals/faces in the clouds (Pareidolia).

Multimodal Language Models (LMs) are an attempt to make such language models perceive the world in a way that’s one step closer to that of humans.

His work focuses on reverse engineering Large Multimodal Language Models to make them explainable to humans.

9 months, 3 weeks ago @ thegradient.pub
TheSequence
latest post 13 hours ago
Do Amazon and Apple Have Any Moats in Generative AI?
Do Amazon and Apple Have Any Moats in Generative AI? Do Amazon and Apple Have Any Moats in Generative AI?

📝 Editorial: Do Amazon and Apple Have Any Moats in Generative AI?

Amazon seems to be banking on partnerships with startups like Hugging Face to enable generative AI capabilities in AWS.

However, neither company has been particularly acquisitive lately, and the valuations for top generative AI companies are in the double-digit billions.

Generative AI evolves at a multi-exponential pace, making it incredibly challenging for companies of any size to regain a lost market.

Both Apple and Amazon should be formidable competitors in the generative AI space, but for now, they appear to lack a defining moat.

13 hours ago @ thesequence.substack.com
📣 Event: Announcing apply(ops), the Biggest Virtual Conference for the Global ML Community!

Our favorite virtual ML data engineering conference is coming back on November 14!

And this time, the theme is apply(ops), where the focus will be on platforms and architectures for production ML use cases.

And we’re still adding more and putting the final touches to the agenda.

And as usual, emcee Meech (aka Demetrios Brinkmann) will be hosting and handing out gifts.

Don’t miss this chance to connect and network with the speakers and the community around the world in live Q&As!

2 days, 12 hours ago @ thesequence.substack.com
Edge 328: Inside AudioCraft: Meta AI’s New Family of Generative Audio Models

Created Using Audiogram

Audio is rapidly becoming one of the new frontiers of generative AI.

In the pursuit of generating high-fidelity audio, Meta AI faces the challenge of modeling intricate signals and patterns at diverse scales.

Among various audio types, music proves especially daunting due to its amalgamation of local and long-range patterns, spanning from individual notes to complex musical structures with multiple instruments.

While conventional approaches rely on symbolic representations like MIDI or piano rolls, they fall short in capturing the expressive nuances and stylistic richness intrinsic to music.

Meta AI recently introduced AudioCraft, a family of generative AI models for …

3 days, 13 hours ago @ thesequence.substack.com
Edge 327: A New Series About Fine-Tuning Foundation Models

Created Using Midjourney

In this Issue:

An intro to our series about fine-tuning foundation models.

A review of Meta AI’s LIMA paper that shows astonishing examples of the value proposition of model fine-tuning.

An overview of H2O.ai’s LLM Studio for fine-tuning models.

💡 ML Concept of the Day: A New Series About Fine-Tuning Foundation Models

Coming out of our longest and most successful series about new techniques in foundation models, we thought it might be a good idea to deep dive into some of those methods.

Today, we are starting a new series about maybe the most popular technique associated with the customization of foundation models: fine-tuning.

5 days, 13 hours ago @ thesequence.substack.com
NVIDIA, The Most Influential VC In Generative AI

📝 Editorial: NVIDIA, The Most Influential VC In Generative AI

These days, NVIDIA is synonymous with AI computing.

The growing dependence of AI workloads on NVIDIA GPUs has transformed the chip manufacturer into one of the most influential tech companies in the world.

The recent additions to the portfolio feature some of the hottest companies in the AI space.

NVIDIA's influence in the AI space expands far beyond its computing capabilities, as it strategically invests in some of the biggest innovators in the generative AI space.

🤖 Cool AI Tech Releases

TensorRT-LLM: NVIDIA announced the new TensorRT-LLM compiler that drastically improves inference on NVIDIA H100 GPUs.

1 week ago @ thesequence.substack.com
🎥 Building, Training & Deploying High-Quality ML Models—a Virtual Hands-On Lab

Want to get some hands-on training to learn how to generate accurate training datasets for ML models, implement feature pipelines, and better manage the lifecycle of ML models and features?

Join us at Tecton’s virtual hands-on lab on Wednesday, September 27!

Tecton Developer Advocate Nick Acosta will walk through concepts and code that will help you build a modern technical architecture that simplifies the process of managing ML models and features.

Through a series of code examples, you’ll learn how to use Tecton to build high-quality features that can train and productionize ML models, utilize Tecton and Databricks to power real-time ML, and more.

1 week, 2 days ago @ thesequence.substack.com
Meet SDXL 1.0: Stability AI New Text-to-Image Super Model

Created Using Ideogram

Stability AI has been at the center of the text-to-image revolution with the release of the Stable Diffusion family of models.

Incorporating some of those breakthroughs led Stability to gradually improve Stable Diffusion.

The latest result of this work was the release of SDXL, a very advanced latent diffusion model designed for text-to-image synthesis.

With this release, SDXL is now the state-of-the-art text-to-image generation model from Stability AI.

SDXL is now available via ClipDrop, GitHub or the Stability AI Platform.

1 week, 3 days ago @ thesequence.substack.com
The Sequence Chat: Lianmin Zheng, UC Berkeley About Vicuna, Chatbot Arena and the Open Source LLM Revolution

I am committed to open-source AI research by developing better models (e.g., Vicuna), evaluations (e.g., Chatbot Arena and MT-bench), and systems (e.g., FastChat, Alpa).

🛠 AI Work

You are one of the researchers behind Vicuna, one of the most popular open-source LLMs released to date.

The rapid advancement of large language models (LLMs) has revolutionized AI systems, resulting in unprecedented levels of intelligence as seen in OpenAI's ChatGPT.

Chatbot Arena is a benchmark platform for large language models (LLMs) that features anonymous, randomized battles in a crowdsourced manner.

The latest Vicuna is fine-tuned from Llama-2.

1 week, 4 days ago @ thesequence.substack.com
Edge 325: A Summary of Our Series About New Techniques Foundation Models

💡 ML Concept of the Day: A Summary of Our Series About New Techniques Foundation Models

The universe of foundation models is absolutely fascinating and research in that space is advancing at an incredible pace.

Over the last few weeks, we have covered some of the latest methods that are likely to become relevant in the next generation of foundation models.

Throughout 18 issues, we covered all sorts of different areas such as reinforcement learning with human feedback, fine-tuning, tool augmentation, knowledge augmentation, memory, in-context learning and many others.

We could keep going with this series for a while but, instead, we are going to start running smaller series that go even deepe…

1 week, 5 days ago @ thesequence.substack.com
Falcon-180B Takes Open Source LLMs Closer to GPT-4

Created Using Midjourney

Next Week in The Sequence:

Edge 325: We conclude our longest and most successful series about new techniques in foundation models with a comprehensive summary.

📝 Editorial: Falcon-180B Takes Open Source LLMs Closer to GPT-4

A few months ago, The Technology Innovation Institute (TII) in the United Arab Emirates (UAE) took the world of foundation models by storm with the release of the Falcon LLM model.

The momentum in open-source foundation models is real and is not showing any signs of slowing down.

Frontiers of Multimodal Learning: Microsoft Research published a summary of recent papers detailing their responsible approach to multimodal learning.

🤖 Cool AI Tech ReleasesF…

2 weeks ago @ thesequence.substack.com
🔥Building Plaid’s ML Fraud Detection Application—an apply() Fireside Chat

Want to know how Plaid, a leading fintech company, built the ML infrastructure that powers Signal, its payment fraud detection and prevention application?

Then join us on September 20 at 9:30 AM PT for this virtual fireside chat!

Guest speaker Renault Young, a Software Engineer at Plaid, and Kevin Stumpf, CTO of Tecton, will discuss:

Overcoming the challenges faced during Plaid's quest for effective payment fraud detection, including how to solve for out-of-order transaction data.

Technical solutions they adopted, including a new ML platform that allows for generating training datasets and designed for feature freshness and seamless large-scale production.

The benefits derived from this new …

2 weeks, 2 days ago @ thesequence.substack.com
Edge 324: A Deep Dive Into Code Llama: Meta AI’s Open Source Entrance in the Code LLM Space

Since OpenAI unveiled Codex (now part of GPT-4) last year, the level of innovation in coding language models has been breathtaking.

In the last few months, we have seen code LLMs released by companies like Salesforce, Hugging Face, DeepMind, Amazon and many others.

A few days ago, Meta AI jumped into the code LLM frenzy with the release of Code Llama, an open-source code LLM based on the recently released Llama 2.

By intensively extracting data from the code-centric dataset, Code Llama evolved to bolster its coding proficiency, effectively enhancing the capabilities inherited from Llama 2.

This advanced version demonstrates prowess in code generation and articulating natural language about code.

2 weeks, 3 days ago @ thesequence.substack.com
Edge 323: Types of Memory-Augmentation in Foundation Models

Created Using Midjourney

In this Issue:

Different types of memory in foundation models.

💡 ML Concept of the Day: Types of Memory-Augmentation in Foundation Models

Memory is a key component of modern LLM architectures and one that is rapidly being incorporated into LLM frameworks.

Memory allows LLMs to expand their contextual knowledge of a conversation by retrieving concepts from former interactions.

Conceptually, LLM memory can be seen as a set of context-target pairs that can be aggregated to obtain next token probabilities.
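
The idea of aggregating context-target pairs into next-token probabilities can be sketched in a few lines. This is only an illustration under assumptions that are not in the newsletter: contexts are toy 2-D vectors, each stored pair votes for its target with a weight that decays with squared distance to the query, and the names `next_token_probs`, `memory`, and `temperature` are hypothetical.

```python
import math
from collections import defaultdict

def next_token_probs(memory, query, temperature=1.0):
    """Aggregate (context_vector, target_token) pairs into next-token probabilities.

    Each stored pair votes for its target token with weight
    exp(-squared_distance / temperature); the votes are then normalized.
    """
    scores = defaultdict(float)
    for context_vec, target in memory:
        dist = sum((a - b) ** 2 for a, b in zip(context_vec, query))
        scores[target] += math.exp(-dist / temperature)
    total = sum(scores.values())
    return {token: s / total for token, s in scores.items()}

# Toy memory: two contexts that were followed by "cat", one by "dog".
memory = [
    ((1.0, 0.0), "cat"),
    ((0.9, 0.1), "cat"),
    ((0.0, 1.0), "dog"),
]
probs = next_token_probs(memory, query=(1.0, 0.0))
```

Because the query sits next to the two "cat" contexts, "cat" receives most of the probability mass. A real system would use learned embeddings and an approximate nearest-neighbor index instead of a linear scan.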

Most LLM memory forms can be grouped into three main groups:

2 weeks, 5 days ago @ thesequence.substack.com
The Other OpenAI Competitor that Just Raised a Lot of Money

📝 Editorial: The Other OpenAI Competitor that Just Raised a Lot of Money

When we consider standalone OpenAI competitors, our thoughts turn toward companies such as Anthropic or Cohere, which have been securing remarkable venture funding.

In addition to its core foundational models, AI21 Labs offers hyper-optimized task-specific models for various domains, such as content management or expert knowledge systems.

Just last week, AI21 Labs announced the completion of a $155 million financing round that valued the company at $1.4 billion.

AI21 Labs is undeniably a name to monitor, as they appear to be a formidable presence in the LLM space.

MoZO: AI researchers from Princeton University published a…

3 weeks ago @ thesequence.substack.com
Edge 322: Inside Generative Agents: How Google and Stanford Researchers Used Generative AI to Learn to Simulate Human Behavior

Created Using Midjourney

Simulating human behavior has long been one of the crown jewels of artificial intelligence (AI).

Recently, researchers from Google and Stanford University collaborated on a paper demonstrating generative agents that are able to simulate human behavior across complex tasks.

A wide variety of inferences can be drawn by generative agents about themselves, other agents, and their environment.

When the end user changes their environment or commands them in natural language, generative agents respond accordingly.

In a society where generative agents are present, emergent social dynamics can be observed, where new relationships are formed, information diffuses, and coordinat…

3 weeks, 3 days ago @ thesequence.substack.com
Synced Review
last post 20 hours ago
Language Models Redefined: Transforming Textual Mastery into Compression Brilliance

20 hours ago @ medium.com
From Stagnant to Stunning: Google Transforms Still Images into Photo-Realistic Animations

Motion is one of the most conspicuous visual cues in the natural world, and humans possess a remarkable sensitivity to it. While humans…

3 days, 21 hours ago @ medium.com
Unveiling the Enigma: Meta AI & UPC Decodes the Inner Workings of Large Scale Language Models

Recent developments in large-scale language models have showcased their remarkable ability to tackle a wide array of tasks using a single…

4 days, 10 hours ago @ medium.com
Revolutionizing Autonomous Agents: AGENTS Framework Puts Power in Your Hands

In recent years, the rapid progress of Large Language Models (LLMs) has showcased their potential in creating autonomous agents capable of…

6 days, 5 hours ago @ medium.com
DeepMind Decodes the Puzzle of ‘Grokking’ In Neural Network Generalization Through Circuit…

One of the intriguing puzzles within the realm of neural network generalization is a phenomenon known as “grokking.” It involves a neural…

1 week, 2 days ago @ medium.com
Microsoft’s phi-1.5

Large Language Models (LLMs) have undoubtedly showcased remarkable performance in the field of natural language processing. However…

1 week, 3 days ago @ medium.com
Unlocking the Power of Visual Modeling: Microsoft’s Sparse MoEs Redefine Efficiency and Excellence

In recent years, sparsely-gated Mixture-of-Experts models (sparse MoEs) have garnered substantial attention and acclaim for their…

1 week, 4 days ago @ medium.com
Revolutionizing Optimization: DeepMind Leverages Large Language Models as Intelligent Optimizers

Optimization plays a pivotal role in a diverse array of real-world applications. Nevertheless, traditional optimization algorithms often…

1 week, 5 days ago @ medium.com
Equall & Apple’s Revolutionizing Transformers: One Wide Feedforward for Unprecedented Efficiency…

The Transformer architecture has demonstrated remarkable scalability, leading to substantial improvements in accuracy. However, this…

2 weeks, 2 days ago @ medium.com
Unlocking Limitless Retrieval Power: Google’s MEMORY-VQ Revolutionizes LLMs with Remarkable…

Retrieval augmentation is a commonly employed and effective approach for enhancing the factual knowledge of language models, while…

2 weeks, 3 days ago @ medium.com
MIT’s AskIt Provides A Unified Programming Interface for Code Generation with LLMs

2 weeks, 4 days ago @ medium.com
70 billion parameter LLaMA2 model training accelerated by 195% with best foundation model practice…

Initially triggered by ChatGPT, the large model boom is continuing to intensify.

2 weeks, 5 days ago @ medium.com
Meta AI’s Nougat Enables Conversion of Mathematic Expressions from PDF Files to Machine Readable…

The majority of scientific knowledge is most commonly stored in the form of Portable Document Format (PDF), which are also the second most…

3 weeks, 2 days ago @ medium.com
CMU & Tsinghua U’s Prompt2Model Generates Deployable Models Following Natural Language Instructions

3 weeks, 3 days ago @ medium.com
Published In Nature: New Breakthrough of Speech-to-Text BCI Achieves Speed of 62 Words/Minute

Speech brain–computer interfaces (BCIs) are an innovative technology that establishes a communication channel between a user and certain…

3 weeks, 4 days ago @ medium.com
📓 Cool Blogs
ODS.ai Habr
last post 2 weeks, 4 days ago
Five NLP Books to Start With

My name is Valentin Malykh, I lead NLP research at MTS AI, and I have been teaching an NLP course for six years.

Since I keep giving the same answer, I decided to write a post with my list of books and a short description of each.

This book covers more information retrieval than NLP, but these days the two fields are already (or still) very close.

True, it too can no longer be obtained, although a PDF version is easy to find.

Foundations of Statistical Natural Language Processing

As far as I know, this book has not been translated into Russian.

2 weeks, 4 days ago @ habr.com
Dropping Ranking Metrics in a Recommender System, Part 3: An Experimentation Platform

An experimentation platform for RecSys

Using Kion as an example, I showed the difficulties of experimenting with recommender systems.

When working on RecSys in a real service, we form hypotheses, roll models out into A/B tests, and form new hypotheses.

Our goal was infrastructure and pipelines for the fastest and most convenient experiments possible with any algorithm.

Experiment tracking: metrics and visual analysis

We tracked experiments both in MLflow and in reports kept in the repository.

The diagram is shown in the figure: Infrastructure of the experimentation platform and the recommendation application.

Yellow arrows show where experiment results are written: logging cross-validation metrics, saving trained models to Min…

1 month ago @ habr.com
Dropping Ranking Metrics in a Recommender System, Part 2: Two-Stage Models

In the first part of the article I described how my teammate and I decided to roll out a competition model into online recommendations, and what came of it.

Let's build a feature for the boosting model that addresses the main problem of our current model: irrelevant items in otherwise good recommendation lists.

We also had some great features that had already proven themselves on the competition model.

In the classification task, we feed the model candidates that appeared in interactions as positive targets and candidates that did not as negative targets, and train it to tell them apart.

The most stable approach to cross-validation in recommender systems is the sliding-window scheme, which most closely matches how the model operates in the real world…
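
A sliding-window split can be sketched as follows. This is a minimal illustration, not the article's code: the function name `sliding_window_folds`, the window lengths, and the dates are all assumptions chosen for the example. Each fold trains on a fixed-length window and tests on the period immediately after it, then both windows move forward in time.

```python
from datetime import date, timedelta

def sliding_window_folds(start, end, train_days, test_days, step_days):
    """Yield (train_start, train_end, test_end) date boundaries.

    Train covers [train_start, train_end) and test covers [train_end, test_end),
    so every test window lies strictly after its training window.
    """
    train_start = start
    while True:
        train_end = train_start + timedelta(days=train_days)
        test_end = train_end + timedelta(days=test_days)
        if test_end > end:
            break
        yield train_start, train_end, test_end
        train_start += timedelta(days=step_days)

# Hypothetical interaction log spanning January 2023.
folds = list(sliding_window_folds(
    start=date(2023, 1, 1), end=date(2023, 2, 1),
    train_days=14, test_days=7, step_days=7,
))
```

With these made-up settings the log yields two folds. Interactions would then be filtered by timestamp into each fold's train and test ranges, so the model never sees data from the future of its test window.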

1 month, 1 week ago @ habr.com
Dropping Ranking Metrics in a Recommender System, Part 1: Visual Analysis and Popularity Bias

Let's see what we have in Kion: Distribution of popularity of the top 100 items in the Kion dataset.

What is at the top of the view counts?

Popular items fill the recommendation shelf for this request and for many others.

The film ranks 6th by views in the dataset and shows up, at very inopportune moments, in the recommendations of all sorts of algorithms.

Let's check the recommendations from the model with the highest Recall without popular items: Recommendations of the bm25 model with the highest Recall without popular items.

Recall and MAP dropped by roughly a factor of 2 compared with the competition model.
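
For readers unfamiliar with the two metrics, here is a small sketch of Recall@k and average precision@k for a single user (MAP is the mean of the latter over users). The exact definitions used in the article are not shown in this excerpt, so the normalization below is an assumption, and the item IDs are made up.

```python
def recall_at_k(recommended, relevant, k):
    """Share of the user's relevant items that appear in the top-k list."""
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & set(relevant))
    return hits / len(relevant)

def average_precision_at_k(recommended, relevant, k):
    """Sum of precision@i over ranks i <= k holding a relevant item,
    divided by min(number of relevant items, k)."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits, score = 0, 0.0
    for i, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / i
    return score / min(len(relevant), k)

recs = ["a", "b", "c", "d"]   # ranked recommendations for one user
liked = ["a", "c", "x"]       # items the user actually interacted with
r = recall_at_k(recs, liked, k=4)
ap = average_precision_at_k(recs, liked, k=4)
```

Recall only counts how many relevant items were retrieved, while average precision also rewards placing them near the top, which is why the two can move differently when popular items are removed.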

1 month, 2 weeks ago @ habr.com
“Dialektik,” an Independent Socialist Media Outlet, Talks About Its NLP Projects, Publishes Datasets, and Shares Code

Almost immediately after publishing a post about a system for finding news on labor conflicts in the CIS, I got to know the team behind the Dialektik project.

The team is interested in processes taking place both in the economic base and in its superstructure: society, culture, and politics.

This method of inquiry helped all the classics of Marxism analyze complex situations in the economy and society, which is why it is valuable to the Dialektik team as well.

For this reason the team develops IT tools for internal use (editors only) and external use (all users).

This public aggregator is in the testing stage and will soon be presented to the public.

1 month, 2 weeks ago @ habr.com
Build Your Own AI Assistant with ChatGPT and Streamlit

And so, thanks to an interest in Data Science, you can use these bots to help people, and also chat with them about all sorts of topics.

AI can do my job

The ChatGPT website is sometimes slow, and in some countries, such as Russia, a VPN is required to access it.

If there are no messages, an introductory AI message is created with ai_role and added to the list, followed by the user's input.

Splitting the code into these functions makes it easy to configure and control the conversation flow between the user and the AI.

Predefined AI roles are assigned for each language via AI_ROLE_OPTIONS_EN and AI_ROLE_OPTIONS_RU; this will come in handy later.

5 months, 1 week ago @ habr.com
Humanity vs. Artificial Intelligence: Could the Development of Neural Networks Lead to Catastrophe

In 2019 they poured billions into OpenAI, and in 2023 they poured in more to get access to a preview version of GPT-4.

It is a logical or even philosophical problem: how do we state as precisely as possible what we actually mean, rather than what we think we want to achieve?

Just as “kill everyone” would not solve it either.

On the one hand, he signs letters calling for a pause on research and shouts “stop!” at OpenAI on Twitter.

The second part of this article will go through all the arguments of both AI proponents and opponents in more detail, to give you a deeper picture.

5 months, 3 weeks ago @ habr.com
GPT-4: What the New Neural Network Has Learned, and Why It Is a Little Creepy

The neural network will try to make sense of both the visual data and the text prompt at once, and give its answer.

But that's no problem: you can simply copy-paste the error text into the dialogue with GPT-4 and tell it, “come on, just do it properly already, okay?”, and it will actually apologize and fix everything!

Clearly these are the most mainstream and popular projects, which on the one hand are easy to write, but on the other hand are still full-fledged demonstrations.

And the comparison here is not with random people, but with people who actually prepared for these exams!

And the point here is not at all about racist jokes or instructions for building bombs at home (and the fear of subsequent lawsuits and legal proceedings)…

6 months, 1 week ago @ habr.com
How ChatGPT Works: Explaining in Plain Russian the Evolution of Language Models from T9 to a Miracle

Yet few people understand how neural networks like ChatGPT actually work inside.

Language models generate long texts with no effort at all, but they do so one word at a time.
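
The "one word at a time" principle can be shown with a toy model. The sketch below is an assumption-laden stand-in, far simpler than the neural networks the article discusses: it counts which word follows which in a tiny made-up text and always appends the most frequent continuation. The names `train_bigrams` and `generate` are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.split()
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def generate(follows, start, max_words):
    """Greedy generation: repeatedly append the most frequent next word."""
    out = [start]
    for _ in range(max_words - 1):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # the last word was never followed by anything
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)

model = train_bigrams("the cat sat on the mat and the cat sat")
sentence = generate(model, start="the", max_words=4)
```

Each step looks only at the text produced so far and picks a continuation. Large language models follow the same loop, except that the next-word choice comes from a neural network conditioned on the whole preceding context and is usually sampled rather than taken greedily.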

Most people easily understand from context that in one case “it” refers to the bait, and in the other to the fish.

Brief summary: GPT-2 came out in 2019, and it exceeded its predecessor by a factor of 10 in both the volume of training text data and the size of the model itself (number of parameters).

In short, InstructGPT (also known as GPT-3.5) is precisely GPT-3 fine-tuned with feedback to maximize the rating given by a human.

6 months, 3 weeks ago @ habr.com
A/B Tests Are Not Only Valuable Fur… But Also Processes

A more detailed description of each stage can be found in my talk here and in this article.

Defining the geography of the pilot and choosing the objects for testing (pilot group, where we deploy the MVP) and for comparison (control group, where we deploy nothing).

On why such manual comparison is bad and what A/B testing should improve (and how to explain this to the business!

And if we expect an effect on a couple of sales categories, then we should test on those rather than on total sales.

We roll out however it turns out, then run causal impact, and voilà, we have an estimate for both the pilot and the rollout.

7 months ago @ habr.com
[Translation] Running Stable Diffusion Locally and in the Cloud with Diffusers and dstack

In this article, I use a simple example to show how to solve this problem with diffusers and dstack.

To run a script through dstack, it must be defined as a workflow in a YAML file under .dstack/workflows.

Configuring AWS as a remote

By default, workflows in dstack run locally.

Conclusion

If you found this article interesting, you can dig deeper into the topic by studying the diffusers and dstack documentation.

In one of the following articles we will dig into not only image generation but also fine-tuning Stable Diffusion.

7 months, 1 week ago @ habr.com
Теория вероятностей в машинном обучении. Часть 2: модель классификации
Теория вероятностей в машинном обучении. Часть 2: модель классификации Теория вероятностей в машинном обучении. Часть 2: модель классификации

Модель классификации и функция потерьЧтобы задать вероятностную модель, нам нужно определить, в какой форме она будет предсказывать распределение .

Если мы используем в обучении сигмоиду, то модель непосредственно предсказывает логит , то есть логарифм того, во сколько раз второй класс вероятнее первого.

Например, в пусть в задаче классификации эмоций по видеозаписи датасет размечен сразу несколькими людьми-аннотаторами, которые иногда дают разные ответы.

Если в задаче классификации в эталонном распределении вероятности классов равны 0.7 и 0.3, то мы хотели бы, чтобы в предсказании они тоже были бы равны 0.7 и 0.3.
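
The two statements above (a sigmoid-based model outputs a logit, and a soft reference distribution of 0.7/0.3 is best matched by predicting 0.7/0.3) can be checked numerically. The following is only a minimal sketch and not code from the article; the function names and the grid search over candidate probabilities are illustrative assumptions.

```python
import math


def sigmoid(z):
    """Map a logit to the probability of the second class."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log of how many times more likely the second class is than the first."""
    return math.log(p / (1.0 - p))


def cross_entropy(target, pred):
    """Cross-entropy between a reference distribution and a predicted one."""
    return -sum(t * math.log(q) for t, q in zip(target, pred))


# Reference ("soft label") distribution: (P(first class), P(second class)).
target = (0.7, 0.3)

# Hypothetical grid of candidate predictions for the second-class probability.
candidates = [i / 100 for i in range(1, 100)]
best_p2 = min(candidates, key=lambda p2: cross_entropy(target, (1 - p2, p2)))

# The logit the model would have to output to produce that probability.
best_logit = logit(best_p2)

print(best_p2, best_logit, sigmoid(best_logit))
```

On this grid the cross-entropy is smallest when the predicted second-class probability equals the reference value 0.3, and the corresponding logit is negative because the second class is less likely than the first.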

In this section we considered the more general case, in which the reference distribu…

7 months, 3 weeks ago @ habr.com
Probability theory in machine learning. Part 1: the regression model

In the fifth section we look at a regression model with confidence estimation, in formulas and in code.

Generally speaking, most of machine learning can be regarded as statistical inference, which we will examine more formally later.

Moreover, the more complex the model and the less data there is, the less accurate the approximation becomes, which is what causes overfitting.

Let us test this method in practice by training a model on the tabular California Housing dataset, where the task is to predict housing prices in different districts of California from 8 input features.

A positive correlation indicates that the model copes, to some extent, with estimating its own confidence in its predictions.

7 months, 3 weeks ago @ habr.com
ChatGPT as a search tool: solving the main problem

Size here means the number of parameters in the model, and for LLMs this number exceeds several billion.

Language models and facts. Language models (LMs) solve a very simple task: predicting the next word (or token, a part of a word).

This is clear to us humans and, as modern language models show, it is clear to them as well.

We have already discussed what a token is and that a token vocabulary is built for the model in advance and used to feed in the input text.

How a model for DotA 2 sees the battlefield: the principle of collecting features to feed into the neural network.

8 months ago @ habr.com
Machine Learning Mastery
last post 2 weeks, 1 day ago
Statistical Tests in R

R as a data analytics platform is expected to have a lot of support for various statistical tests. In this post, you are going to see how you can run statistical tests using the built-in functions in R. Specifically, you are going to learn: What is t-test and how to do it in R What […]

The post Statistical Tests in R appeared first on MachineLearningMastery.com.

2 weeks, 1 day ago @ machinelearningmastery.com
Generating Random Numbers in R

Whether working on a machine learning project, a simulation, or other models, you need to generate random numbers in your code. R, as a programming language, has several functions for random number generation. In this post, you will learn about them and see how they can be used in a larger program. Specifically, you will […]

The post Generating Random Numbers in R appeared first on MachineLearningMastery.com.

2 weeks, 5 days ago @ machinelearningmastery.com
Plotting Graphs in R

Visualizing data can sometimes help people understand it better. As a data analytics platform, R provides some advanced plotting functions. In this post, you will learn how to use the built-in plot functions to create some common visualizations. Specifically, you will learn how to create: Line plot Scatter plot Pie charts Let’s get started. Overview […]

The post Plotting Graphs in R appeared first on MachineLearningMastery.com.

3 weeks, 1 day ago @ machinelearningmastery.com
Logic, Flow Control, and Functions in R

R is a procedural programming language. Therefore, it has the full set of flow control syntax like many other languages. Indeed, the flow control syntax in R is similar to Java and C. In this post, you will see some examples of using the flow control syntax in R. Let’s get started. Overview This post […]

The post Logic, Flow Control, and Functions in R appeared first on MachineLearningMastery.com.

3 weeks, 5 days ago @ machinelearningmastery.com
Built-in Datasets in R

The ecosystem in R contains not only the function libraries to help you perform statistical analysis but also the data library that gives you some famous datasets to test out your program. There are a lot of built-in datasets in R. In this post, you will: Learn some of the built-in datasets Know how to […]

The post Built-in Datasets in R appeared first on MachineLearningMastery.com.

4 weeks ago @ machinelearningmastery.com
Operations on Vectors in R

Vectors are the native way of handling data in R. In addition to the vector operations you saw in the linear algebra textbook, R supports a lot more. In this post, you will learn about: How to manipulate a vector How to treat vectors as sets Let’s get started. Overview This post is divided into […]

The post Operations on Vectors in R appeared first on MachineLearningMastery.com.

1 month ago @ machinelearningmastery.com
Surviving in the R Environment

R is not only a programming language but also a programming shell with read-eval-print loop (REPL). The shell is how most people use R. But when you drill deeper, knowing more about what’s working behind the scenes is handy. In this post, you will learn: How to manage variables in R How to manage packages […]

The post Surviving in the R Environment appeared first on MachineLearningMastery.com.

1 month ago @ machinelearningmastery.com
A Gentle Introduction to Lists and Data Frames in R

Vectors in R are supposed to be of homogeneous data type. You can use a list as the container if there are mixed data types, such as numbers and strings. The list and data frame are closely related in R. The data frame is probably more useful because it reflects how we usually collect statistics. […]

The post A Gentle Introduction to Lists and Data Frames in R appeared first on MachineLearningMastery.com.

1 month ago @ machinelearningmastery.com
A Gentle Introduction to Vectors in R

R is a language for programming with data. Unlike many other languages, the primitive data types in R are not scalars but vectors. Therefore, understanding how to deal with vectors is crucial to programming or reading the R code. In this post, you will learn about various vector operations in R. Specifically, you will know: […]

The post A Gentle Introduction to Vectors in R appeared first on MachineLearningMastery.com.

1 month, 1 week ago @ machinelearningmastery.com
Celebrating Devart’s 26th Birthday with an Exclusive 20% Discount on Data Connectivity Tools!

Sponsored Post Devart, a leading provider of database connectivity solutions, is celebrating its 26th birthday. Devart has been at the forefront of delivering innovative solutions that empower businesses and developers to connect, manage, and optimize their databases seamlessly. With a rich history of providing top-notch database tools and unparalleled customer support, Devart […]

The post Celebrating Devart’s 26th Birthday with an Exclusive 20% Discount on Data Connectivity Tools! appeared first on MachineLearningMastery.com.

1 month, 1 week ago @ machinelearningmastery.com
An Introduction to R

R is a programming language of its own kind. It is a language for statistics, and its ecosystem has a lot of libraries for all kinds of statistical tasks. It is a language targeted at statisticians rather than computer scientists. Hence you will see some unorthodox patterns in the language. In this post, you will learn […]

The post An Introduction to R appeared first on MachineLearningMastery.com.

1 month, 1 week ago @ machinelearningmastery.com
MOSTLY AI: The most accurate synthetic data generator

Sponsored Post By Georgios Loizou, AI & Machine Learning Product Owner at MOSTLY AI As businesses attempt to extract relevant insights and build powerful machine-learning models, the need for high-quality, accurate synthetic datasets has grown. MOSTLY AI is excited to present our latest findings. In this blogpost we will present the results […]

The post MOSTLY AI: The most accurate synthetic data generator appeared first on MachineLearningMastery.com.

2 months ago @ machinelearningmastery.com
Building Your mini-ChatGPT at Home

ChatGPT is fun to play with. Chances are, you also want to have your own copy running privately. Realistically, that’s impossible because ChatGPT is not downloadable software, and it needs tremendous computing power to run. But you can build a trimmed-down version that can run on commodity hardware. In this post, you will […]

The post Building Your mini-ChatGPT at Home appeared first on MachineLearningMastery.com.

2 months ago @ machinelearningmastery.com
Advanced Techniques for Research with ChatGPT

Research has always been essential to human progress and has evolved tremendously over the past few years. With the advent of advanced technologies, new tools and techniques have emerged to conduct more efficient research and to stay at the forefront of knowledge. One such technology is ChatGPT, a large language model that uses deep learning […]

The post Advanced Techniques for Research with ChatGPT appeared first on MachineLearningMastery.com.

2 months, 3 weeks ago @ machinelearningmastery.com
Using the Natural Language Understanding Capability of ChatGPT

ChatGPT, as a large language model, is well known for understanding human languages. Instead of asking ChatGPT for an answer you don’t know, you can make it work on existing information while leveraging the natural language understanding (NLU) capability. In this post, you will learn How to make ChatGPT produce a summary from a long text […]

The post Using the Natural Language Understanding Capability of ChatGPT appeared first on MachineLearningMastery.com.

2 months, 3 weeks ago @ machinelearningmastery.com
ML in Production
last post None
Sorta Insightful
last post 2 weeks, 5 days ago
Everfree Northwest 2023

Everfree Northwest is a My Little Pony convention in the Seattle area.

Except, I heard that with the ending of BronyCon, Everfree Northwest is the largest pony convention left.

My Little Pony: Friendship is Magic is Generation 4, or G4, and is the one that kicked off the brony fandom.

People tend to assume that “pony concert” means “remixes of pony songs”, or at least “overt references to My Little Pony”.

The first was a signed copy of Ponies: The Galloping, the Magic: The Gathering x My Little Pony crossover.

2 weeks, 5 days ago @ alexirpan.com
Eight Years Later

I started working on that post right after Mystery Hunt 2023 finished, and hardcore focused on getting it out ASAP.

The end result was that I was exerting “write Mystery Hunt” levels of effort (10-20 hrs/week) for 1.5 years straight.

markdown 1929 2023-04-21-mh-2023.markdown 114 2023-05-09-bootes-2023.markdown 153 2023-07-19-ml-hurry.markdown

Posts in Limbo. Post about Dominion Online: Odds of writing this year: 5%. Odds of writing eventually: 25%. My priorities are moving away from Dominion.

Post about AI timelines: Odds of writing this year: 90%. Odds of writing eventually: 99%. I’m not planning a big update to the last timelines post.

1 month, 1 week ago @ alexirpan.com
Machine Learning Got Itself in a Big Damn Hurry

As I asked questions about compute resources and the presenter’s position on generative AI ethics, I had a moment of realization.

Why are we talking about whether an RTX 3090 is big enough to finetune a LLaMa checkpoint?

The much easier and less-speculative way for a field to move faster is by having more people working in that field.

The MLP AI enthusiasts mentioned some pretrained LLMs that I had not even heard of.

I remember being a young whippersnapper, in the deep learning wave of 2015.

2 months, 1 week ago @ alexirpan.com
A Boötes Shaped Addendum to Writing Mystery Hunt 2023

But why would you solve a puzzle from an old Mystery Hunt, when there are a bunch of puzzles to testsolve for the upcoming Mystery Hunt?

I’ve since had two people from Galactic tell me that they did this for the Students round in Mystery Hunt 2021.

As the FAQ mentions, the puzzles in that puzzlehunt were originally going to appear in Mystery Hunt 2023.

There are plenty of people who only do MIT Mystery Hunt and nothing else.

One advantage of Mystery Hunt always happening on MLK weekend is that anyone who wants to attend has ample warning time to clear their calendar.

4 months, 2 weeks ago @ alexirpan.com
Writing MIT Mystery Hunt 2023

This post is about 55,000 words long, and is riddled with spoilers for pretty much every aspect of MIT Mystery Hunt 2023.

I write a post about Mystery Hunt 2022, where I make a few predictions about how writing Mystery Hunt 2023 will go.

We talked a bit about whether this was okay, since some organizers for Teammate Hunt 2021 were not writing Mystery Hunt this year.

The first choice we had to make was whether we’d use the hunt codebase from Palindrome, or use the tph-site codebase we’d built over Teammate Hunt 2020 and Teammate Hunt 2021.

The assumption we made is that most Mystery Hunt teams do not have an active codebase, and default to using the code from the previous Mystery Hunt.

5 months ago @ alexirpan.com
A Prelude to the Inevitable Long Post About MIT Mystery Hunt 2023

The first time I ever wrote for a puzzlehunt was Mystery Hunt 2013.

Ten years later, teammate wrote another Mystery Hunt that went into Monday, with a similarly large number of free answers as MH 2013.

I don’t think there was any single reason that Mystery Hunt was so hard this year, but there was definitely a systematic underestimation of difficulty and length.

However, there are some first-time constructors on teammate this year, where their Hunt puzzles are their first puzzles for the public.

I’m pretty sure I’ve spent more time on Hunt this year than I spent in all my past puzzle writing combined.

8 months, 1 week ago @ alexirpan.com
Lil'Log
last post None
inFERENCe
last post 3 months, 2 weeks ago
We may finally crack Maths. But should we?

June 8, 2023. We may finally crack Maths.

Automating mathematical theorem proving has been a long standing goal of artificial intelligence and indeed computer science.

Mathematical theorem proving, and code generation in general, is a single-player game.

So what would the paper ”Dual-use of artificial-intelligence-powered mathematical theorem proving” be about?

Mathematical theorem proving, especially the one based on formal mathematics, is closely related to software verification, and formally verifiable software engineering.

3 months, 2 weeks ago @ inference.vc
Mortal Komputation: On Hinton's argument for superhuman AI.

AGI opinion. May 30, 2023. Mortal Komputation: On Hinton's argument for superhuman AI.

Analogue hardware allows for lower energy cost but at the cost of mortality: algorithm and hardware are inseparable - the argument goes.

Digital intelligence has two advantages: aggregating learning from parallel experiences, and backpropagation, which is implausible on analogue hardware. Hinton concludes these advantages can/will lead to superhuman digital intelligence.

Brains running on analogue hardware are mortal: once the hardware dies, the algorithm dies with it.

What if we prematurely conclude digital brains are superior to analogue brains just because we haven't yet managed to make analogue computation …

3 months, 3 weeks ago @ inference.vc
Autoregressive Models, OOD Prompts and the Interpolation Regime

March 30, 2023. Autoregressive Models, OOD Prompts and the Interpolation Regime. A few years ago I was very much into maximum likelihood-based generative modeling and autoregressive models (see this, this or this).

AR models > distributions over sequences. I have always thought of AR models as just a smart way to parametrize probability distributions over multidimensional vectors or sequences.

Let's consider fitting AR models on a dataset generated by a probabilistic context free grammar $\mathbb{P}[S=\text{“}a^nb^n\text{”}]=q^n(1-q)$.

Now consider two autoregressive models, $Q_1$ and $Q_2$, which both perfectly fit our training distribution.
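
To make the training distribution above concrete, here is a small sketch that samples strings of the form a^n b^n with probability q^n (1 - q) and evaluates that probability exactly. It is an illustration only, not code from the post; the function names are hypothetical, and it assumes n starts at 0, so the empty string has probability 1 - q.

```python
import random


def sample_string(q, rng):
    """Draw n with P(n) = q**n * (1 - q), then return 'a'*n + 'b'*n."""
    n = 0
    while rng.random() < q:  # continue with probability q, stop with 1 - q
        n += 1
    return "a" * n + "b" * n


def string_probability(s, q):
    """Exact probability of s under the grammar; 0.0 for any other string."""
    n, remainder = divmod(len(s), 2)
    if remainder or s != "a" * n + "b" * n:
        return 0.0
    return q ** n * (1 - q)


rng = random.Random(0)  # fixed seed so the draw is reproducible
sample = sample_string(0.5, rng)
print(repr(sample), string_probability(sample, 0.5))
print(string_probability("aabb", 0.5))  # q**2 * (1 - q) with q = 0.5
print(string_probability("aba", 0.5))   # not of the form a^n b^n
```

Any string that is not of the form a^n b^n has probability zero under this distribution, so it never appears in the training data.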

Low probability prompts. But surely, we are not just interested in e…

5 months, 4 weeks ago @ inference.vc
We May be Surprised Again: Why I take LLMs seriously.

March 22, 2023. We May be Surprised Again: Why I take LLMs seriously.

"Deep Learning is Easy, Learn something Harder" - I proclaimed in one of my early and provocative blog posts from 2016.

I wasn't alone in my deep learning skepticism, in fact I'm far from being the most extreme deep learning skeptic.

An important change-point in many of our attitudes was the 2016 paper Understanding deep learning requires rethinking generalization.

I have now realised that $\operatorname{argmin}\mathcal{L}$ is a very poor description of what actually happens in deep learning.

6 months ago @ inference.vc
The Spectator
last post 1 month, 3 weeks ago
Machine Learning with Social Purpose

Abstract: This talk has a single objective: to advocate for machine learning infused with social purpose.

In this way, social purpose transforms our field of machine learning: into something that is both technical and social.

So social purpose will make machine learning something that is both technical and social.

A more global field and industry can shift machine learning to be more general: magnifying social purpose.

Your continued volunteering, funding, support, and openness to these groups shows yet another way of infusing machine learning with social purpose.

1 month, 3 weeks ago @ blog.shakirm.com
Elevating our Evaluations: Technical and Sociotechnical Standards of Assessment in Machine Learning

In machine learning, you could say we have two types of evaluations.

The science of machine learning especially is still one that is all about theory generation, falsification, and regeneration.

To use this for machine learning evaluation, we should first generalise this optimisation.

Access for all, not just some, and quality for all, remains a challenge in realising positive benefit from machine learning.

To connect the dots here, this is just another way to understand the sociotechnical nature of machine learning.

4 months, 4 weeks ago @ blog.shakirm.com
Generative Models for Climate Informatics

Abstract: Generative models are probabilistic models of data that allow easy sampling and exploration of a data distribution using a learned model.

For today I put forward a title of ‘Generative Models for Climate Informatics’.

There are many ways to think about generative models, so in part 1, I will give a high-level view of generative models as I see them.

I’ll give more focus to aspects of evaluation that will need our special care when working with generative models.

Generative models in their simplest description are probabilistic models of data.

5 months, 1 week ago @ blog.shakirm.com
Off the Convex Path
last post None
Piekniewski's blog
last post 5 months, 1 week ago
The Atom of Intelligence

The initial atom of intelligence never had the luxury of abstraction or determinism.

Control loops of bigger organisms are composed of control loops of those at lower scale.

If the idea of the internal and external control loop of an organism is the analog of the atom for intelligence, the ability to compose such loops across scales is the analog of a chemical bond.

These complex organisms not only build control loops from much smaller control loops, but also utilize numerous slower loops.

The web of feedback loops goes back all the way to the primordial atom of intelligence.

5 months, 1 week ago @ blog.piekniewski.info
Ai Reflections

AI enthusiast: Maybe, but who cares, ultimately the thing works better than anything before.

AI enthusiast: yes but there are some measurable ways to verify that machine gets better than a human.

AI enthusiast: OK, but what about the Turing test, ultimately when humans get convinced that AI agent is sentient just as they are, it's game over, AGI is here.

AI enthusiast: But GPT can beat human at programming, can write better poems and makes fewer and fewer mistakes.

But as far as critical, practical applications in the real world go, AI deployment has been nothing but a failure.

5 months, 2 weeks ago @ blog.piekniewski.info
AI psychosis

In reality A.I.

And here we come back to the psychosis we're in right now and particularly how I think most of it is really unjustified and wrong.

Later on numerous flaws were pointed out in the study and as it turns out A.I.

The problem with GPT though, as with many other AI contraptions, is that obviously it knows extremely little about the world.

We will just sugar rush ourselves with AI Elon Musk type hyper-promises, scare ourselves to the point of anxiety by the nonexistent AI daemons until we all slowly but surely go totally insane.

7 months, 2 weeks ago @ blog.piekniewski.info
Science, dogma and mysteries.

Now almost 13 years after my PhD defense, my view is that science is actually a rather fragile thread we use to hold together and explain various mysteries in the world.

But I now view science as any other social activity, being influenced by zeitgeist, politics, fashion, financing and often stuck in a dogma, no different than the dogma that threatened Galileo or Copernicus.

A few years back I digested all of his books and this experience has completely changed my view on science.

Science is the best method, but scientific community is mostly toxic. The general theme of this post is that by looking at several seemingly disconnected aspects of science and technology we can see that our contemp…

8 months, 3 weeks ago @ blog.piekniewski.info
What actually is statistics?

Data science essentially is glorified statistics with a computer, AI is deeply statistical at its very core, we use statistical analysis for pretty much everything from economy to biology.

Statistics is a craft that allows us to analyze and predict certain subset of complex signals that are not possible to describe in terms of dynamics.

Now let me repeat this once again: statistics can be applied to some data sometimes.

Also the smaller signals can be reasonably "independent" of each other, but can all be dependent on some other bigger external thing.

they only applied statistics to what can be understood with mechanics but at a slightly higher level of organization.

9 months, 2 weeks ago @ blog.piekniewski.info
fast.ai NLP
last post None
Andrew Karpathy blog
last post None
大トロ
last post None
🔬 Science
Papers With Code
last post 1 hour ago
/shilinyan99/ PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation

However, existing datasets for video segmentation only focus on conventional planar images.

To address the challenge, in this paper, we present a panoramic video dataset, PanoVOS.

To quantify the domain gap between 2D planar videos and panoramic videos, we evaluate 15 off-the-shelf video object segmentation (VOS) models on PanoVOS.

Through error analysis, we found that all of them fail to tackle pixel-level content discontinuities of panoramic videos.

Our dataset poses new challenges in panoramic VOS and we hope that our PanoVOS can advance the development of panoramic segmentation/tracking.

1 hour ago @ paperswithcode.com
/zhaoolee/ The Cambridge Law Corpus: A Corpus for Legal AI Research

We introduce the Cambridge Law Corpus (CLC), a corpus for legal AI research.

Most cases are from the 21st century, but the corpus includes cases as old as the 16th century.

Together with the corpus, we provide annotations on case outcomes for 638 cases, done by legal experts.

Using our annotated data, we have trained and evaluated case outcome extraction with GPT-3, GPT-4 and RoBERTa models to provide benchmarks.

As a consequence, the corpus will only be released for research purposes under certain restrictions.

1 hour ago @ paperswithcode.com
/desehuileng0o0/ Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision

Self-supervised representation learning for human action recognition has developed rapidly in recent years.

Most of the existing works are based on skeleton data while using a multi-modality setup.

In this work, we first propose an Implicit Knowledge Exchange Module (IKEM) which alleviates the propagation of erroneous knowledge between low-performance modalities.

Then, we further propose three new modalities to enrich the complementary information between modalities.

The experimental results demonstrate the effectiveness of our approach, unlocking the efficient use of skeleton-based multi-modality data.

1 hour ago @ paperswithcode.com
/wenet-e2e/ Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition

Current speaker recognition systems primarily rely on supervised approaches, constrained by the scale of labeled datasets.

To boost the system performance, researchers leverage large pretrained models such as WavLM to transfer learned high-level features to the downstream speaker recognition task.

Another group of researchers directly apply self-supervised methods such as DINO to speaker embedding learning, yet they have not explored its potential on large-scale in-the-wild datasets.

In this paper, we present the effectiveness of DINO training on the large-scale WenetSpeech dataset and its transferability in enhancing the supervised system performance on the CNCeleb dataset.

Additionally, w…

18 hours ago @ paperswithcode.com
/mpfefferk/ Regret and Conservatism of Constrained Stochastic Model Predictive Control

We analyse conservatism and regret of stochastic model predictive control (SMPC) when using moment-based ambiguity sets for modeling unknown uncertainties.

To quantify the conservatism, we compare the deterministic constraint tightening while taking a distributionally robust approach against the optimal tightening when the exact distributions of the stochastic uncertainties are known.

Furthermore, we quantify the regret by comparing the performance when the distributions of the stochastic uncertainties are known and unknown.

Analysing the accumulated sub-optimality of SMPC due to the lack of knowledge about the true distributions of the uncertainties marks the novel contribution of this wor…

20 hours ago @ paperswithcode.com
/FlorisE/ NeuralLabeling: A versatile toolset for labeling vision datasets using Neural Radiance Fields

We present NeuralLabeling, a labeling approach and toolset for annotating a scene using either bounding boxes or meshes and generating segmentation masks, affordance maps, 2D bounding boxes, 3D bounding boxes, 6DOF object poses, depth maps and object meshes.

NeuralLabeling uses Neural Radiance Fields (NeRF) as renderer, allowing labeling to be performed using 3D spatial tools while incorporating geometric clues such as occlusions, relying only on images captured from multiple viewpoints as input.

To demonstrate the applicability of NeuralLabeling to a practical problem in robotics, we added ground truth depth maps to 30000 frames of transparent object RGB and noisy depth maps of glasses pla…

21 hours ago @ paperswithcode.com
/Unbabel/ Scaling up COMETKIWI: Unbabel-IST 2023 Submission for the Quality Estimation Shared Task

We present the joint contribution of Unbabel and Instituto Superior Técnico to the WMT 2023 Shared Task on Quality Estimation (QE).

Our team participated in all tasks: sentence- and word-level quality prediction (task 1) and fine-grained error span detection (task 2).

Our multilingual approaches are ranked first for all tasks, reaching state-of-the-art performance for quality estimation at word-, span- and sentence-level granularity.

Compared to the previous state-of-the-art COMETKIWI-22, we show large improvements in correlation with human judgements (up to 10 Spearman points).

Moreover, we surpass the second-best multilingual submission to the shared-task with up to 3.8 absolute points.

22 hours ago @ paperswithcode.com
/louislau1129/ CoMFLP: Correlation Measure based Fast Search on ASR Layer Pruning

Layer pruning (LP) is a commonly used compression method to remove redundant layers.

Previous studies on LP usually identify the redundant layers according to a task-specific evaluation metric.

To address this problem, we propose CoMFLP, a fast search LP algorithm based on correlation measure.

The correlation between layers is computed to generate a correlation matrix, which identifies the redundancy among layers.

Experiments on an ASR task show that the pruning proposal determined by CoMFLP outperforms existing LP methods while only requiring constant time complexity.
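
As a rough illustration of the correlation-matrix idea in the abstract above, the sketch below computes pairwise Pearson correlations between flattened layer outputs and then picks the span of consecutive layers whose boundary outputs correlate most. Pearson correlation, the `most_redundant_span` heuristic, and all function names are assumptions made for illustration; the abstract does not say which correlation measure or search procedure CoMFLP actually uses.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of floats."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant vector has no defined correlation; treat it as 0.
    return cov / (std_x * std_y) if std_x > 0 and std_y > 0 else 0.0


def correlation_matrix(layer_outputs):
    """layer_outputs[i] is the flattened output of layer i on the same inputs."""
    n_layers = len(layer_outputs)
    return [
        [pearson(layer_outputs[i], layer_outputs[j]) for j in range(n_layers)]
        for i in range(n_layers)
    ]


def most_redundant_span(corr, n_prune):
    """Hypothetical heuristic (not taken from the paper): return the index i
    for which layer i and layer i + n_prune have the most correlated outputs,
    suggesting that the n_prune layers after layer i change the
    representation the least."""
    best_start, best_score = 0, float("-inf")
    for i in range(len(corr) - n_prune):
        if corr[i][i + n_prune] > best_score:
            best_start, best_score = i, corr[i][i + n_prune]
    return best_start


# Toy outputs of four layers on the same four inputs (made-up numbers).
layer_outputs = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 1.0, 3.0, 2.0],
    [1.0, 3.0, 2.0, 5.0],
]
corr = correlation_matrix(layer_outputs)
print(most_redundant_span(corr, n_prune=1))
```

Building the matrix needs only forward passes to collect layer outputs, so under this reading the search itself involves no fine-tuning or task-specific evaluation.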

22 hours ago @ paperswithcode.com
/boris-bergsma/ Cluster-based pruning techniques for audio data

Deep learning models have become widely adopted in various domains, but their performance heavily relies on a vast amount of data.

Datasets often contain a large number of irrelevant or redundant samples, which can lead to computational inefficiencies during the training.

In this work, we introduce k-means clustering as a method for efficient data pruning, for the first time in the audio domain.

We discuss how k-means clustering can significantly reduce the size of audio datasets while maintaining the classification performance across neural networks (NNs) with different architectures.

We further comment on the role of scaling analysis in identifying the optimal pruning …

22 hours ago @ paperswithcode.com
/sdascoli/ Boolformer: Symbolic Regression of Logic Functions with Transformers

In this work, we introduce Boolformer, the first Transformer architecture trained to perform end-to-end symbolic regression of Boolean functions.

First, we show that it can predict compact formulas for complex functions which were not seen during training, when provided a clean truth table.

We evaluate the Boolformer on a broad set of real-world binary classification datasets, demonstrating its potential as an interpretable alternative to classic machine learning methods.

Finally, we apply it to the widespread task of modelling the dynamics of gene regulatory networks.

Using a recent benchmark, we show that Boolformer is competitive with state-of-the-art genetic algorithms with a speedup of…

1 day, 18 hours ago @ paperswithcode.com
/gledsonrt/ Stochastic stiffness identification and response estimation of Timoshenko beams via physics-informed Gaussian processes

Machine learning models trained with structural health monitoring data have become a powerful tool for system identification.

This paper presents a physics-informed Gaussian process (GP) model for Timoshenko beam elements.

The optimised GP model is further employed for probabilistic predictions of unobserved responses.

Additionally, an entropy-based method for physics-informed sensor placement optimisation is presented, exploiting heterogeneous sensor position information and structural boundary conditions built into the GP model.

Probabilistic predictions of structural responses and internal forces are in closer agreement with measured data.

1 day, 19 hours ago @ paperswithcode.com
/zelaki/ Weakly-supervised Automated Audio Captioning via text only training

In recent years, datasets of paired audio and captions have enabled remarkable success in automatically generating descriptions for audio clips, namely Automated Audio Captioning (AAC).

However, it is labor-intensive and time-consuming to collect a sufficient number of paired audio and captions.

Our approach leverages the similarity between audio and text embeddings in CLAP.

During training, we learn to reconstruct the text from the CLAP text embedding, and during inference, we decode using the audio embeddings.

To mitigate the modality gap between the audio and text embeddings we employ strategies to bridge the gap during training and inference stages.

1 day, 19 hours ago @ paperswithcode.com
/amrita-medical-ai/ Automatic Endoscopic Ultrasound Station Recognition with Limited Data

Pancreatic cancer is a lethal form of cancer that significantly contributes to cancer-related deaths worldwide.

Despite advances in medical imaging techniques, pancreatic cancer remains a challenging disease to detect.

Endoscopic ultrasound (EUS) is the most effective diagnostic tool for detecting pancreatic cancer.

To obtain complete imaging of the pancreas, practitioners must learn to guide the endoscope into multiple "EUS stations" (anatomical locations), which provide different views of the pancreas.

We build an AI-assisted tool that utilizes deep learning techniques to identify these stations of the stomach in real time during EUS procedures.

1 day, 19 hours ago @ paperswithcode.com
/mattapow/ Differentiable Phylogenetics via Hyperbolic Embeddings with Dodonaphy

Motivation: Navigating the high dimensional space of discrete trees for phylogenetics presents a challenging problem for tree optimisation.

To address this, hyperbolic embeddings of trees offer a promising approach to encoding trees efficiently in continuous spaces.

However, they require a differentiable tree decoder to optimise the phylogenetic likelihood.

Results: We illustrate the potential for differentiable optimisation over tree space for maximum likelihood inference.

We then perform variational Bayesian phylogenetics by optimising embedding distributions in hyperbolic space.

1 day, 19 hours ago @ paperswithcode.com
/aymericb213/ Incremental Constrained Clustering by Minimal Weighted Modification

Clustering is a well-known task in Data Mining that aims at grouping data instances according to their similarity.

Constrained clustering has been introduced for better modeling the expectations of the expert.

Nevertheless, constrained clustering is not yet sufficient, since it usually requires the constraints to be given before the clustering process.

In this paper we address a more general problem that aims at modeling the exploratory clustering process, through a sequence of clustering modifications where expert constraints are added on the fly.

We present an incremental constrained clustering framework integrating active query strategies and a Constraint Programming model to fit the exper…

2 days, 13 hours ago @ paperswithcode.com
Papers With Code
last post 1 hour ago
/yuweien1120/ A class-weighted supervised contrastive learning long-tailed bearing fault diagnosis approach using quadratic neural network

Deep learning has achieved remarkable success in bearing fault diagnosis.

In this paper, we propose a supervised contrastive learning approach with a class-aware loss function to enhance the feature extraction capability of neural networks for fault diagnosis.

The developed class-weighted contrastive learning quadratic network (CCQNet) consists of a quadratic convolutional residual network backbone, a contrastive learning branch utilizing a class-weighted contrastive loss, and a classifier branch employing logit-adjusted cross-entropy loss.

By utilizing class-weighted contrastive loss and logit-adjusted cross-entropy loss, our approach encourages equidistant representation of class features…
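
For readers unfamiliar with the logit-adjusted cross-entropy mentioned above, here is a minimal sketch of the commonly used form: the log of each class prior, scaled by a temperature, is added to the logits before the softmax. The function name, the `tau` parameter, and the toy priors are assumptions for illustration; the excerpt does not give the exact formulation used in CCQNet.

```python
import math


def logit_adjusted_cross_entropy(logits, label, class_priors, tau=1.0):
    """Cross-entropy on logits shifted by tau * log(prior) per class.

    logits       -- raw scores, one per class
    label        -- index of the true class
    class_priors -- empirical class frequencies (positive, summing to 1)
    tau          -- strength of the adjustment (assumed default of 1.0)
    """
    adjusted = [z + tau * math.log(p) for z, p in zip(logits, class_priors)]
    # Numerically stable log-sum-exp.
    peak = max(adjusted)
    log_norm = peak + math.log(sum(math.exp(a - peak) for a in adjusted))
    return log_norm - adjusted[label]


# With equal logits, a rare true class is penalised more than a frequent one,
# which pushes the network to give tail classes a larger margin.
priors = [0.9, 0.1]
print(logit_adjusted_cross_entropy([0.0, 0.0], label=1, class_priors=priors))
print(logit_adjusted_cross_entropy([0.0, 0.0], label=0, class_priors=priors))
```

With uniform priors the adjustment is the same constant for every class and the loss reduces to ordinary cross-entropy.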

2 days, 15 hours ago @ paperswithcode.com
/LIN-SHANG/ InstructERC: Reforming Emotion Recognition in Conversation with a Retrieval Multi-task LLMs Framework

2 days, 16 hours ago @ paperswithcode.com
/lcs2-iiitd/ Focal Inferential Infusion Coupled with Tractable Density Discrimination for Implicit Hate Speech Detection

Although pre-trained large language models (PLMs) have achieved state-of-the-art on many NLP tasks, they lack understanding of subtle expressions of implicit hate speech.

Such nuanced and implicit hate is often misclassified as non-hate.

Various attempts have been made to enhance the detection of (implicit) hate content by augmenting external context or enforcing label separation via distance-based metrics.

We test FiADD on three implicit hate datasets and observe significant improvement in the two-way and three-way hate classification tasks.

We analyze the generated latent space to understand its evolution under FiADD, which corroborates the advantage of employing FiADD for implicit hate s…

2 days, 18 hours ago @ paperswithcode.com
/elisakreiss/ ContextRef: Evaluating Referenceless Metrics For Image Description Generation

Referenceless metrics (e.g., CLIPScore) use pretrained vision-language models to assess image descriptions directly without costly ground-truth reference texts.

In this paper, we introduce ContextRef, a benchmark for assessing referenceless metrics for such alignment.

ContextRef has two components: human ratings along a variety of established quality dimensions, and ten diverse robustness checks designed to uncover fundamental weaknesses.

A crucial aspect of ContextRef is that images and descriptions are presented in context, reflecting prior work showing that context is important for description quality.

Using ContextRef, we assess a variety of pretrained models, scoring functions, and te…

2 days, 19 hours ago @ paperswithcode.com
/theoverhelst/ Uplift vs. predictive modeling: a theoretical analysis

Despite the growing popularity of machine-learning techniques in decision-making, the added value of causal-oriented strategies with respect to pure machine-learning approaches has rarely been quantified in the literature.

These strategies are crucial for practitioners in various domains, such as marketing, telecommunications, health care and finance.

This paper presents a comprehensive treatment of the subject, starting from firm theoretical foundations and highlighting the parameters that influence the performance of the uplift and predictive approaches.

The focus of the paper is on a binary outcome case and a binary action, and the paper presents a theoretical analysis of uplift modeling…
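
To make the binary-outcome, binary-action setting concrete, the sketch below estimates uplift for a customer segment as the difference between the outcome rate under treatment and under control. It assumes records from a randomized experiment; the function names and the toy segments are hypothetical, and this illustrates only the basic quantity, not the paper's analysis.

```python
def outcome_rate(records, treated):
    """Share of positive outcomes among records with the given treatment flag.

    records -- iterable of (treated: bool, outcome: 0 or 1) pairs
    """
    outcomes = [y for t, y in records if t == treated]
    if not outcomes:
        raise ValueError("no records for this treatment group")
    return sum(outcomes) / len(outcomes)


def uplift(records):
    """Estimated P(Y=1 | treated) - P(Y=1 | control) for one segment."""
    return outcome_rate(records, True) - outcome_rate(records, False)


def make_segment(n_per_arm, treated_positives, control_positives):
    """Build toy records: n_per_arm treated and n_per_arm control customers."""
    treated = [(True, 1)] * treated_positives + [(True, 0)] * (n_per_arm - treated_positives)
    control = [(False, 1)] * control_positives + [(False, 0)] * (n_per_arm - control_positives)
    return treated + control


# Segment A responds often regardless of the action; segment B responds
# rarely but is strongly influenced by it (made-up numbers).
segment_a = make_segment(50, treated_positives=41, control_positives=40)
segment_b = make_segment(50, treated_positives=15, control_positives=5)

for name, segment in (("A", segment_a), ("B", segment_b)):
    print(name, outcome_rate(segment, False), uplift(segment))
```

A purely predictive ranking by outcome probability would favour segment A, while an uplift ranking favours segment B; this is the kind of contrast the paper examines theoretically.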

2 days, 19 hours ago @ paperswithcode.com
/alagoz/ Generating Hierarchical Structures for Improved Time Series Classification Using Stochastic Splitting Functions

This study introduces a novel hierarchical divisive clustering approach with stochastic splitting functions (SSFs) to enhance classification performance in multi-class datasets through hierarchical classification (HC).

The approach is evaluated on 46 multi-class time series datasets using popular classifiers (svm and rocket) and SSFs (potr, srtr, and lsoo).

While the number of classes and flat classification (FC) score show consistent significance, variations are observed with different splitting functions.

Overall, the proposed approach presents a promising strategy for enhancing classification by generating hierarchical structure in multi-class time series datasets.

Future research direct…

2 days, 19 hours ago @ paperswithcode.com
/ictmcg/ Bad Actor, Good Advisor: Exploring the Role of Large Language Models in Fake News Detection

Recent advances in large language models (LLMs) have shown remarkable performance in various tasks, but whether and how LLMs could help with fake news detection remains underexplored.

In this paper, we investigate the potential of LLMs in fake news detection.

Based on these findings, we propose that current LLMs may not substitute fine-tuned SLMs in fake news detection but can be a good advisor for SLMs by providing multi-perspective instructive rationales.

To instantiate this proposal, we design an adaptive rationale guidance network for fake news detection (ARG), in which SLMs selectively acquire insights on news analysis from the LLMs' rationales.

Experiments on two real-world datasets d…

2 days, 19 hours ago @ paperswithcode.com
/itu-ai-ml-in-5g-challenge/ Multimodal Transformers for Wireless Communications: A Case Study in Beam Prediction

Wireless communications at high-frequency bands with large antenna arrays face challenges in beam management, which can potentially be improved by multimodality sensing information from cameras, LiDAR, radar, and GPS.

In this paper, we present a multimodal transformer deep learning framework for sensing-assisted beam prediction.

We also evaluate data processing and augmentation techniques such as image enhancement, segmentation, background filtering, multimodal data flipping, radar signal transformation, and GPS angle calibration.

This outperforms using other modalities and arbitrary data processing techniques, which demonstrates the effectiveness of transformers with feature fusion in perf…

2 days, 19 hours ago @ paperswithcode.com
/takhemlata/ t-EER: Parameter-Free Tandem Evaluation of Countermeasures and Biometric Comparators

Presentation attack (spoofing) detection (PAD) typically operates alongside biometric verification to improve reliability in the face of spoofing attacks.

Even though the two sub-systems operate in tandem to solve the single task of reliable biometric verification, they address different detection tasks and are hence typically evaluated separately.

We introduce a new metric for the joint evaluation of PAD solutions operating in situ with biometric verification.

In contrast to the tandem detection cost function proposed recently, the new tandem equal error rate (t-EER) is parameter free.

The proposed approach is a strong candidate metric for the tandem evaluation of PAD systems and biometric …

2 days, 19 hours ago @ paperswithcode.com
/thu-ml/ How Robust is Google's Bard to Adversarial Image Attacks?

However, due to the unsolved adversarial robustness problem of vision models, MLLMs can have more severe safety and security risks by introducing the vision inputs.

In this work, we study the adversarial robustness of Google's Bard, a chatbot competing with ChatGPT that recently released its multimodal capability, to better understand the vulnerabilities of commercial MLLMs.

By attacking white-box surrogate vision encoders or MLLMs, the generated adversarial examples can mislead Bard to output wrong image descriptions with a 22% success rate based solely on the transferability.

We show that the adversarial examples can also attack other MLLMs, e.g., a 26% attack success rate against Bing Ch…

2 days, 19 hours ago @ paperswithcode.com
/ai-s2-lab/ FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency

Text-based speech editing (TSE) techniques are designed to enable users to edit the output audio by modifying the input text transcript instead of the audio itself.

Despite much progress in neural network-based TSE techniques, the current techniques have focused on reducing the difference between the generated speech segment and the reference target in the editing region, ignoring its local and global fluency in the context and original utterance.

To maintain speech fluency, we propose a fluent speech editing model, termed FluentEditor, which uses a fluency-aware training criterion in TSE training.

The subjective and objective experimental results on VCTK demonstrate tha…

2 days, 19 hours ago @ paperswithcode.com
/jkminder/ SALSA-CLRS: A Sparse and Scalable Benchmark for Algorithmic Reasoning

We introduce an extension to the CLRS algorithmic learning benchmark, prioritizing scalability and the utilization of sparse representations.

Many algorithms in CLRS require global memory or information exchange, mirrored in its execution model, which constructs fully connected (not sparse) graphs based on the underlying problem.

However, many important algorithms do not demand a fully connected graph; these algorithms, primarily distributed in nature, align closely with the message-passing paradigm employed by Graph Neural Networks.

Hence, we propose SALSA-CLRS, an extension of the current CLRS benchmark specifically with scalability and sparseness in mind.

Our approach includes adapted al…

2 days, 19 hours ago @ paperswithcode.com
/xaviergrool/ FGFusion: Fine-Grained Lidar-Camera Fusion for 3D Object Detection

Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving.

In this paper, we propose Fine-Grained Lidar-Camera Fusion (FGFusion), which makes full use of multi-scale features of the image and point cloud and fuses them in a fine-grained way.

First, we design a dual pathway hierarchy structure to extract both high-level semantic and low-level detailed features of the image.

Second, an auxiliary network is introduced to guide point cloud features to better learn the fine-grained spatial information.

Finally, we propose multi-scale fusion (MSF) to fuse the last N feature maps of image and point cloud.

2 days, 19 hours ago @ paperswithcode.com
/microsoft/ Privacy-Preserving In-Context Learning with Differentially Private Few-Shot Generation

We study the problem of in-context learning (ICL) with large language models (LLMs) on private datasets.

This scenario poses privacy risks, as LLMs may leak or regurgitate the private examples demonstrated in the prompt.

We propose a novel algorithm that generates synthetic few-shot demonstrations from the private dataset with formal differential privacy (DP) guarantees, and show empirically that it can achieve effective ICL.

We conduct extensive experiments on standard benchmarks and compare our algorithm with non-private ICL and zero-shot solutions.

These results open up new possibilities for ICL with privacy protection for a broad range of applications.

2 days, 19 hours ago @ paperswithcode.com
/bartn8/ Active Stereo Without Pattern Projector

This paper proposes a novel framework integrating the principles of active stereo in standard passive camera systems without a physical pattern projector.

We virtually project a pattern over the left and right images according to the sparse measurements obtained from a depth sensor.

Any such device can be seamlessly plugged into our framework, allowing for the deployment of a virtual active stereo setup in any possible environment, overcoming the limitations of pattern projectors, such as limited working range or environmental conditions.

Experiments on indoor/outdoor datasets, featuring both long and close-range, support the seamless effectiveness of our approach, boosting the accuracy of …

2 days, 19 hours ago @ paperswithcode.com
/cyfml/ Unveiling the Hidden Realm: Self-supervised Skeleton-based Action Recognition in Occluded Environments

To integrate action recognition methods into autonomous robotic systems, it is crucial to consider adverse situations involving target occlusions.

Such a scenario, despite its practical relevance, is rarely addressed in existing self-supervised skeleton-based action recognition methods.

We first pre-train using occluded skeleton sequences, then use k-means clustering (KMeans) on sequence embeddings to group semantically similar samples.

Imputing incomplete skeleton sequences to create relatively complete sequences as input provides significant benefits to existing skeleton-based self-supervised models.

Meanwhile, building on the state-of-the-art Partial Spatio-Temporal Learning (PSTL), we i…

2 days, 19 hours ago @ paperswithcode.com
/liuxiaoyu1104/ Beyond Image Borders: Learning Feature Extrapolation for Unbounded Image Composition

For improving image composition and aesthetic quality, most existing methods modulate the captured images by striking out redundant content near the image borders.

However, such image cropping methods are limited in the range of image views.

Nonetheless, the synthesized extrapolated regions may be included in the cropped image, making the composition result unrealistic and potentially degrading image quality.

In this paper, we circumvent this issue by presenting a joint framework for both unbounded recommendation of camera view and image composition (i.e., UNIC).

Extensive experiments are conducted on the datasets constructed upon existing image cropping datasets, showing the effecti…

2 days, 19 hours ago @ paperswithcode.com
/apple/ Smooth ECE: Principled Reliability Diagrams via Kernel Smoothing

Calibration measures and reliability diagrams are two fundamental tools for measuring and interpreting the calibration of probabilistic predictors.

Calibration measures quantify the degree of miscalibration, and reliability diagrams visualize the structure of this miscalibration.

However, the most common constructions of reliability diagrams and calibration measures -- binning and ECE -- both suffer from well-known flaws (e.g.

We prove that with a careful choice of bandwidth, this method yields a calibration measure that is well-behaved in the sense of (Błasiok, Gopalan, Hu, and Nakkiran 2023a) -- a consistent calibration measure.

Moreover, the reliability diagram obtained from this smoo…
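
The general idea of a kernel-smoothed reliability diagram can be sketched as a weighted average of labels around each confidence level, in place of hard bins. The code below is a simplified illustration with a plain Gaussian kernel and a fixed, hand-picked bandwidth; the function name and defaults are assumptions, and it does not reproduce the paper's specific kernel or bandwidth selection.

```python
import math


def smoothed_reliability(preds, labels, grid, bandwidth=0.1):
    """Kernel-weighted estimate of E[label | prediction = t] for each t in grid.

    preds     -- predicted probabilities in [0, 1]
    labels    -- binary outcomes (0 or 1), same length as preds
    grid      -- confidence levels at which to evaluate the curve
    bandwidth -- Gaussian kernel width (assumed value, not tuned)
    """
    curve = []
    for t in grid:
        weights = [math.exp(-0.5 * ((t - p) / bandwidth) ** 2) for p in preds]
        total = sum(weights)
        if total > 0:
            curve.append(sum(w * y for w, y in zip(weights, labels)) / total)
        else:
            # No prediction is close enough to t to carry any weight.
            curve.append(float("nan"))
    return curve


# A perfectly calibrated predictor would give a curve close to the diagonal.
preds = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]
grid = [0.25, 0.5, 0.75]
print(smoothed_reliability(preds, labels, grid))
```

Unlike binning, every prediction contributes to every point of the curve, with influence that decays smoothly with distance, so the diagram does not jump when a prediction crosses a bin edge.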

2 days, 19 hours ago @ paperswithcode.com
/matt3o/ AutoPET Challenge 2023: Sliding Window-based Optimization of U-Net

Tumor segmentation in medical imaging is crucial and relies on precise delineation.

Combining PET with Computed Tomography (CT) can enhance tumor segmentation by integrating metabolic and anatomic information.

FDG-PET/CT scans are pivotal for cancer staging and reassessment, utilizing radiolabeled fluorodeoxyglucose to highlight metabolically active regions.

Accurately distinguishing tumor-specific uptake from physiological uptake in normal tissues is a challenging aspect of precise tumor segmentation.

The AutoPET challenge addresses this by providing a dataset of 1014 FDG-PET/CT studies, encouraging advancements in accurate tumor segmentation and analysis within the FDG-PET/CT domain.

2 days, 19 hours ago @ paperswithcode.com
/mgithubl/ TMac: Temporal Multi-Modal Graph Learning for Acoustic Event Classification

Audiovisual data is everywhere in this digital age, which raises higher requirements for the deep learning models developed on them.

Handling the information in multi-modal data well is the key to a better audiovisual model.

It indicates that temporal information is important in multi-modal acoustic event modeling for both intra- and inter-modal.

With this motivation, we propose a Temporal Multi-modal graph learning method for Acoustic event Classification, called TMac, by modeling such temporal information via graph learning techniques.

In particular, we construct a temporal graph for each acoustic event, dividing its audio data and video data into multiple segments.

2 days, 19 hours ago @ paperswithcode.com
/rajeshjnu2006/ Dictionary Attack on IMU-based Gait Authentication

We present a novel adversarial model for authentication systems that use gait patterns recorded by the inertial measurement unit (IMU) built into smartphones.

The attack idea is inspired by and named after the concept of a dictionary attack on knowledge (PIN or password) based authentication systems.

In particular, this work investigates whether it is possible to build a dictionary of IMUGait patterns and use it to launch an attack or find an imitator who can actively reproduce IMUGait patterns that match the target's IMUGait pattern.

Nine physically and demographically diverse individuals walked at various levels of four predefined controllable and adaptable gait factors (speed, step lengt…

2 days, 19 hours ago @ paperswithcode.com
/saintslab/ Activation Compression of Graph Neural Networks using Block-wise Quantization with Improved Variance Minimization

Efficient training of large-scale graph neural networks (GNNs) has been studied with a specific focus on reducing their memory consumption.

(2022) proposed extreme activation compression (EXACT) which demonstrated drastic reduction in memory consumption by performing quantization of the intermediate activation maps down to using INT2 precision.

They showed little to no reduction in performance while achieving large reductions in GPU memory consumption.

In this work, we present an improvement to the EXACT strategy by using block-wise quantization of the intermediate activation maps.

Further, we present a correction to the assumptions on the distribution of intermediate activation maps in EXA…
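
The block-wise quantization idea can be illustrated with a small sketch: the flattened activations are split into fixed-size blocks, and each block gets its own offset and scale before being mapped to 2-bit codes. The function names and block size are assumptions for illustration, and deterministic rounding is used here for simplicity, so this is not the EXACT implementation or the authors' improved variant.

```python
def quantize_blockwise(values, block_size, bits=2):
    """Quantize a flat list of floats block by block.

    Returns a list of (offset, scale, codes) tuples, one per block, where
    each code is an integer in [0, 2**bits - 1].
    """
    levels = (1 << bits) - 1
    blocks = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        low, high = min(block), max(block)
        # A constant block needs no scaling; avoid dividing by zero.
        scale = (high - low) / levels if high > low else 1.0
        codes = [round((v - low) / scale) for v in block]
        blocks.append((low, scale, codes))
    return blocks


def dequantize_blockwise(blocks):
    """Reconstruct approximate float values from (offset, scale, codes) blocks."""
    return [low + scale * code for low, scale, codes in blocks for code in codes]


def max_abs_error(original, reconstructed):
    return max(abs(a - b) for a, b in zip(original, reconstructed))


# Two groups of activations with very different ranges (made-up numbers).
activations = [0.0, 1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 40.0]
per_block = dequantize_blockwise(quantize_blockwise(activations, block_size=4))
whole = dequantize_blockwise(quantize_blockwise(activations, block_size=8))
print(max_abs_error(activations, per_block), max_abs_error(activations, whole))
```

Giving each block its own range keeps the four available levels close to the values they represent, at the cost of storing one offset and one scale per block.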

2 days, 19 hours ago @ paperswithcode.com
/ai-s2-lab/ Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech

Prosodic phrasing is crucial to the naturalness and intelligibility of end-to-end Text-to-Speech (TTS).

As the study of prosodic phrasing has been linguistically motivated, prosodic phrasing for expressive emotion rendering has not been well studied.

In this paper, we propose an emotion-aware prosodic phrasing model, termed EmoPP, to mine the emotional cues of an utterance accurately and predict appropriate phrase breaks.

We first conduct objective observations on the ESD dataset to validate the strong correlation between emotion and prosodic phrasing.

Then the objective and subjective evaluations show that EmoPP outperforms all baselines and achieves remarkable performance in ter…

2 days, 19 hours ago @ paperswithcode.com
/manymuch/ A Vision-Centric Approach for Static Map Element Annotation

The recent development of online static map element (a.k.a.

However, available public datasets currently cannot provide high-quality training data regarding consistency and accuracy.

To this end, we present CAMA: a vision-centric approach for Consistent and Accurate Map Annotation.

Without LiDAR inputs, our proposed framework can still generate high-quality 3D annotations of static map elements.

Compared with the original nuScenes static map element, models trained with annotations from CAMA achieve lower reprojection errors (e.g., 4.73 vs. 8.03 pixels).

2 days, 19 hours ago @ paperswithcode.com
/hedlen/ MEFLUT: Unsupervised 1D Lookup Tables for Multi-exposure Image Fusion

In this paper, we introduce a new approach for high-quality multi-exposure image fusion (MEF).

We show that the fusion weights of an exposure can be encoded into a 1D lookup table (LUT), which takes pixel intensity value as input and produces fusion weight as output.

We learn one 1D LUT for each exposure, so that all pixels from a given exposure can query that exposure's 1D LUT independently for high-quality and efficient fusion.

In addition, we collect a new MEF dataset consisting of 960 samples, 155 of which are manually tuned by professionals as ground-truth for evaluation.

Moreover, our 1D LUT approach takes less than 4 ms to process a 4K image on a PC GPU.

2 days, 19 hours ago @ paperswithcode.com
/jun-long-li/ TCOVIS: Temporally Consistent Online Video Instance Segmentation

In recent years, significant progress has been made in video instance segmentation (VIS), with many offline and online methods achieving state-of-the-art performance.

While offline methods have the advantage of producing temporally consistent predictions, they are not suitable for real-time scenarios.

Conversely, online methods are more practical, but maintaining temporal consistency remains a challenging task.

In this paper, we propose a novel online method for video instance segmentation, called TCOVIS, which fully exploits the temporal information in a video clip.

For instance, on YouTube-VIS 2021, TCOVIS achieves 49.5 AP and 61.3 AP with ResNet-50 and Swin-L backbones, respectively.

2 days, 19 hours ago @ paperswithcode.com
/summitgao/ Convolution and Attention Mixer for Synthetic Aperture Radar Image Change Detection

Synthetic aperture radar (SAR) image change detection is a critical task and has received increasing attention in the remote sensing community.

However, existing SAR change detection methods are mainly based on convolutional neural networks (CNNs), with limited consideration of global attention mechanisms.

In this letter, we explore Transformer-like architecture for SAR change detection to incorporate global attention.

First, to compensate for the inductive bias of the Transformer, we combine self-attention with shift convolution in parallel.

The parallel design effectively captures the global semantic information via the self-attention and performs local feature extraction through shift con…

2 days, 19 hours ago @ paperswithcode.com
/dvlab-research/ LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

We present LongLoRA, an efficient fine-tuning approach that extends the context sizes of pre-trained large language models (LLMs), with limited computation cost.

Typically, training LLMs with long context sizes is computationally expensive, requiring extensive training hours and GPU resources.

Notably, we find that LoRA for context extension works well under the premise of trainable embedding and normalization.

LongLoRA demonstrates strong empirical results on various tasks on LLaMA2 models from 7B/13B to 70B.

LongLoRA extends models' context while retaining their original architectures, and is compatible with most existing techniques, like FlashAttention-2.

2 days, 19 hours ago @ paperswithcode.com
/freedomintelligence/ AceGPT, Localizing Large Language Models in Arabic

This paper explores the imperative need and methodology for developing a localized Large Language Model (LLM) tailored for Arabic, a language with unique cultural characteristics that are not adequately addressed by current mainstream models like ChatGPT.

The objective is to train culturally aware and value-aligned Arabic LLMs that can serve the diverse application-specific needs of Arabic-speaking communities.

Extensive evaluations demonstrated that the resulting LLM, called AceGPT, is the SOTA open Arabic LLM in various benchmarks, including instruction-following benchmark (i.e., Arabic Vicuna-80 and Arabic AlpacaEval), knowledge benchmark (i.e., Arabic MMLU and EXAMs), as well a…

2 days, 19 hours ago @ paperswithcode.com
/lukasberglund/ The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"

If a model is trained on a sentence of the form "A is B", it will not automatically generalize to the reverse direction "B is A".

This is the Reversal Curse.

if "A is B" occurs, "B is A" is more likely to occur).

The Reversal Curse is robust across model sizes and model families and is not alleviated by data augmentation.

This shows a failure of logical deduction that we hypothesize is caused by the Reversal Curse.

2 days, 19 hours ago @ paperswithcode.com
💼 University and corporation labs
DeepMind
last post 6 days ago
A catalogue of genetic mutations to help pinpoint the cause of diseases

Missense variants are genetic mutations that can affect the function of human proteins.

AlphaMissense predicted the pathogenicity of all possible 71 million missense variants.

Missense variants can be used in the diagnosis of rare genetic diseases, where a few or even a single missense variant may directly cause disease.

Our adapted model can predict the pathogenicity of missense variants altering individual amino acids of proteins.

Red dots represent known pathogenic missense variants, blue dots represent known benign variants from the ClinVar database.

6 days ago @ deepmind.com
Identifying AI-generated images with SynthID

Today, in partnership with Google Cloud, we’re launching a beta version of SynthID, a tool for watermarking and identifying AI-generated images.

Part of this responsibility is giving users more advanced tools for identifying AI-generated images so their images, and even some edited versions, can be identified at a later date. SynthID generates an imperceptible digital watermark for AI-generated images.

Google Cloud is the first cloud provider to offer a tool for creating AI-generated images responsibly and identifying them with confidence.

Traditional watermarks aren’t sufficient for identifying AI-generated images because they’re often applied like a stamp on an image and can e…

3 weeks, 6 days ago @ deepmind.com
RT-2: New model translates vision and language into action

A visual-language model (VLM) pre-trained on web-scale data is learning from RT-1 robotics data to become RT-2, a visual-language-action (VLA) model that can control a robot.

RT-2 shows improved generalisation capabilities and semantic and visual understanding beyond the robotic data it was exposed to.

In our work, we adapt Pathways Language and Image model (PaLI-X) and Pathways Language model Embodied (PaLM-E) to act as the backbones of RT-2.

Generalisation and emergent skills: We performed a series of qualitative and quantitative experiments on our RT-2 models, on over 6,000 robotic trials.

Success rates of emergent skill evaluations: our RT-2 models outperform both previous robotics transf…

1 month, 4 weeks ago @ deepmind.com
Using AI to fight climate change

How we’re applying the latest AI developments to help fight climate change and build a more sustainable, low-carbon world. AI is a powerful technology that will transform our future, so how can we best apply it to help combat climate change and find sustainable solutions?

Understand weather, climate, and their effects: Better understanding the core problems and their effects is a critical first step to tackling climate change.

Moreover, we’re partnering with non-profit Climate Change AI to close important gaps in climate-related data.

Currently, this partnership focuses on building a comprehensive wishlist of datasets whose availability would advance AI solutions for climate change.

If you …

2 months ago @ deepmind.com
Google DeepMind’s latest research at ICML 2023

In reinforcement learning, one way this can be defined is through reward.

When deployed, AI systems need to be robust enough for the real world.

We look at how to better train reinforcement learning algorithms within constraints, as AI tools often have to be limited for safety and efficiency.

In our research, which was recognised with an ICML 2023 Outstanding Paper Award, we explore how we can teach models complex long-term strategy under uncertainty with imperfect information games.

Developing advanced AI systems that can generalise in human-like ways will help to create AI tools we can use in our everyday lives and to tackle new challenges.

2 months, 1 week ago @ deepmind.com
Developing reliable AI tools for healthcare

This question is particularly important in healthcare, where predictive AI is increasingly used in high-stakes tasks to assist clinicians.

However, for many healthcare providers, it’s simply not possible to redesign a predictive AI model.

The predictive AI outputs a confidence score between 0 (certain no disease is present) and 1 (certain that disease is present).

Here, the existing predictive AI model remains unchanged. CoDoC learns to establish the relative accuracy of the predictive AI model compared with clinicians’ interpretation, and how that relationship fluctuates with the predictive AI’s confidence scores.
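
To make the idea of comparing AI and clinician accuracy across confidence scores concrete, here is a toy sketch. It is not CoDoC's actual algorithm: the binning, the 0.5 decision threshold, the tie-breaking rule, and the names `fit_deferral_table` and `decide` are all hypothetical choices made for illustration.

```python
def fit_deferral_table(records, n_bins=5):
    """Learn, per confidence bin, whether to defer to the clinician.

    records: iterable of (ai_confidence, clinician_says_disease, disease_present),
    with ai_confidence in [0, 1] and the other two values booleans.
    Returns a list of n_bins booleans; True means "defer to the clinician".
    """
    ai_correct = [0] * n_bins
    clinician_correct = [0] * n_bins
    for confidence, clinician_opinion, truth in records:
        bin_index = min(int(confidence * n_bins), n_bins - 1)
        ai_prediction = confidence >= 0.5  # assumed fixed threshold
        ai_correct[bin_index] += ai_prediction == truth
        clinician_correct[bin_index] += clinician_opinion == truth
    # Defer only where the clinician was strictly more accurate (ties keep the AI).
    return [clinician_correct[b] > ai_correct[b] for b in range(n_bins)]


def decide(confidence, clinician_opinion, defer_table):
    """Return the final call; the predictive model itself is left unchanged."""
    n_bins = len(defer_table)
    bin_index = min(int(confidence * n_bins), n_bins - 1)
    if defer_table[bin_index]:
        return clinician_opinion
    return confidence >= 0.5
```

The sketch only wraps the existing model's confidence score, which mirrors the point in the summary that the predictive AI does not need to be redesigned.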

To bring technology like CoDoC safely to real-world medical settings, …

2 months, 1 week ago @ deepmind.com
Exploring institutions for global AI governance

An intergovernmental or multi-stakeholder Advanced AI Governance Organisation could help internationalise and align efforts to address global risks from advanced AI systems by setting governance norms and standards and assisting in their implementation.

In doing so, it would help underserved societies benefit from cutting-edge AI technology and promote international access to AI technology for safety and governance objectives.

For example, a Commission on Advanced AI will face significant scientific challenges given the extreme uncertainty about AI trajectories and capabilities and the limited scientific research on advanced AI issues to date.

The rapid rate of AI progress and limited capac…

2 months, 2 weeks ago @ deepmind.com
RoboCat: A self-improving robotic agent

New foundation agent learns to operate different robotic arms, solves tasks from as few as 100 demonstrations, and improves from self-generated data.

The spin-off agent practises on this new task/arm an average of 10,000 times, generating more training data.

Learning to operate new robotic arms and solve more complex tasks: With RoboCat’s diverse training, it learned to operate different robotic arms within a few hours.

Left: A new robotic arm RoboCat learned to control. Right: Video of RoboCat using the arm to pick up gears. After observing 1000 human-controlled demonstrations, collected in just hours, RoboCat could direct this new arm dexterously enough to pick up gears successfully 86% of…

3 months, 1 week ago @ deepmind.com
Optimising computer systems with more generalised AI tools

As part of our efforts to build increasingly capable and general AI systems, we’re working to create AI tools with a broad understanding of the world, so useful knowledge can be transferred between many different types of tasks.

General-purpose tools to power our digital future: From playing games to solving complex engineering problems at the heart of every device, our AI tools are saving billions of people time and energy.

And this is just the start. We envision a future where more general-purpose AI tools can help optimise the entire computing ecosystem that powers our digital world.

But to support these tools, we’ll need faster, more efficient, and a more sustainable digital infrastru…

3 months, 2 weeks ago @ deepmind.com
AlphaDev discovers faster sorting algorithms

For sorting algorithms, this means unordered numbers go in and correctly sorted numbers come out.

AlphaDev wins the game by discovering a correct, faster program. Discovering faster sorting algorithms: AlphaDev uncovered new sorting algorithms that led to improvements in the LLVM libc++ sorting library that were up to 70% faster for shorter sequences and about 1.7% faster for sequences exceeding 250,000 elements. We focused on improving sorting algorithms for shorter sequences of three to five elements.
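
For readers unfamiliar with fixed-length sorting routines for very short sequences, the sketch below shows a standard textbook three-element sorting network built from compare-exchange steps. It is not the program AlphaDev discovered, and the function names are illustrative only.

```python
def compare_exchange(values, i, j):
    """Swap values[i] and values[j] in place if they are out of order."""
    if values[i] > values[j]:
        values[i], values[j] = values[j], values[i]


def sort3(a, b, c):
    """Sort three items with a fixed sequence of three compare-exchange steps."""
    values = [a, b, c]
    compare_exchange(values, 0, 1)
    compare_exchange(values, 1, 2)  # the largest item is now in the last slot
    compare_exchange(values, 0, 1)  # order the remaining two items
    return tuple(values)
```

The sequence of comparisons is the same for every input, which is what makes routines of this kind natural targets for instruction-level optimisation.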

Finding novel approaches: AlphaDev not only found faster algorithms, but also uncovered novel approaches.

Left: The original implementation with max(B, min(A, C, D)) used in a larger sorting al…

3 months, 2 weeks ago @ deepmind.com
An early warning system for novel AI risks

AI researchers already use a range of evaluation benchmarks to identify unwanted behaviours in AI systems, such as AI systems making misleading statements, biased decisions, or repeating copyrighted content.

Model safety evaluations, including those assessing extreme risks, will be a critical component of safe AI development and deployment.

By identifying the risks early on, this will unlock opportunities to be more responsible when training new AI systems, deploying these AI systems, transparently describing their risks, and applying appropriate cybersecurity standards.

A blueprint for embedding model evaluations for extreme risks into important decision making processes throughout model t…

4 months ago @ deepmind.com
DeepMind’s latest research at ICLR 2023

Research towards AI models that can generalise, scale, and accelerate science. Next week marks the start of the 11th International Conference on Learning Representations (ICLR), taking place 1-5 May in Kigali, Rwanda.

We present a new approach where models learn by solving two problems in one.

We developed a new approach and open-source training data set to help models learn to explore in human-like ways over long time horizons.

On the other hand, adversarial attacks are a way of probing the limits of AI models by pushing them to create wrong or harmful outputs.

RL models also learn by trial and error which can be very data-intensive and time-consuming.

5 months ago @ deepmind.com
How can we build human values into AI?

These questions shed light on the role played by principles – the foundational values that drive decisions big and small in AI.

For humans, principles help shape the way we live our lives and our sense of right and wrong.

A tool for fairer decision-making: A key goal for AI researchers has been to align AI systems with human values.

However, there is no consensus on a single set of human values or preferences to govern AI – we live in a world where people have diverse backgrounds, resources and beliefs.

The veil of ignorance may provide a starting point for the selection of principles with which to align AI.

5 months ago @ deepmind.com
Announcing Google DeepMind

Earlier today we announced some changes that will accelerate our progress in AI and help us develop more capable AI systems safely and responsibly.

In the coming years, AI - and ultimately AGI - has the potential to drive one of the greatest social, economic and scientific transformations in history. That’s why today Sundar is announcing that DeepMind and the Brain team from Google Research will be joining forces as a single, focused unit called Google DeepMind.

By creating Google DeepMind, I believe we can get to that future faster.

Through Google DeepMind, we are bringing together our world-class talent in AI with the computing power, infrastructure and resources to create the next gene…

5 months, 1 week ago @ deepmind.com
Competitive programming with AlphaCode

As part of DeepMind’s mission to solve intelligence, we created a system called AlphaCode that writes computer programs at a competitive level.

AlphaCode placed at about the level of the median competitor, marking the first time an AI code generation system has reached a competitive level of performance in programming competitions.

We pre-train our model on selected public GitHub code and fine-tune it on our relatively small competitive programming dataset.

"Solving competitive programming problems is a really hard thing to do, requiring both good coding skills and problem solving creativity in humans.

AlphaCode ranked within the top 54% in real-world programming competitions, an advancem…

9 months, 3 weeks ago @ deepmind.com
Google
last post 2 days, 8 hours ago
Bring AI to Looker with the Machine Learning Accelerator

Earlier this year we released the Machine Learning (ML) Accelerator for Looker, a Looker application that integrates Looker with BigQuery ML.

It streamlines machine learning workflows by providing simplified access to machine learning tools and letting users use trusted data to train and run those models.

The Machine Learning Accelerator is available free from the Looker Marketplace, and can be installed in any Looker instance.

Today, we are adding a new ML model type: Single Time Series Forecasting using the ARIMA Plus model in BigQuery ML.

Get started today: To get started with the ML Accelerator for Looker, install it from the Looker Marketplace.

2 days, 8 hours ago @ cloud.google.com
Distilling step-by-step: Outperforming larger language models with less training data and smaller model sizes

In “Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes”, presented at ACL 2023, we set out to tackle this trade-off between model size and training data collection cost.

Less training data: Compared to standard fine-tuning, the distilling step-by-step method achieves better performance using much less training data.

Smaller deployed model size: Compared to few-shot CoT prompted LLMs, distilling step-by-step achieves better performance using much smaller model sizes.

Distilling step-by-step is able to outperform LLM baselines by using much smaller models, e.g., over 700× smaller models on ANLI.

Distilling step-by-step outperforms few-shot LLMs with smaller models usi…

3 days, 3 hours ago @ blog.research.google
Realizing telecommunications transformation with Google Cloud generative AI

For example, Vodafone is experimenting with Vertex AI Search.

“We are building an intelligent assistant to securely and quickly search contracts,” said Sherif Bakir, CEO of Vodafone Voice and Roaming Services, about Vodafone’s use of Vertex AI Search.

Underpinning all of this, Google Cloud is committed to addressing the accuracy, privacy and security, regulatory compliance, and intellectual property infringement risks of gen AI, implementing strong guard rails as we adopt and implement gen AI technologies.

Many of the foundation models, Vertex AI Search and Conversation, and Telecom Subscriber Insights are already generally available to all Google Cloud customers.

You can also learn more ab…

4 days, 11 hours ago @ cloud.google.com
Meet the inaugural cohort of the Google for Startups Accelerator: AI First in Europe

Today’s startups are addressing the world's most pressing issues, and artificial intelligence (AI) is one of their most powerful tools.

To empower startups to scale their business towards success in the rapidly evolving AI landscape, Google for Startups Accelerator: AI First offers a 10-week, equity-free program for AI-first startups in partnership with Google Cloud.

Startups that are selected for the cohort also benefit from dedicated Google AI technical expertise and receive credits via the Google for Startups Cloud Program.

Out of hundreds of impressive applications, today we welcome the inaugural cohort of the Google for Startups Accelerator: AI First.

So with no further ado, we present…

5 days, 15 hours ago @ cloud.google.com
MediaPipe FaceStylizer: On-device real-time few-shot face stylization

Users can fine-tune the generator to learn a style from one or a few images using MediaPipe Model Maker, and deploy to on-device face stylization applications with the customized model using MediaPipe FaceStylizer.

To enable such a face stylization pipeline, we built the pipeline with a GAN inversion encoder and efficient face generator model (see below).

With the newly designed architecture, we train the BlazeStyleGAN model by distilling it from a teacher StyleGAN model.

MediaPipe Solutions: The MediaPipe FaceStylizer is going to be released to public users in MediaPipe Solutions.

Users can leverage MediaPipe Model Maker to train a customized face stylization model using their own style imag…

1 week, 2 days ago @ blog.research.google
Applying Generative AI to product design with BigQuery DataFrames

With so many factors to consider, multiplied across an entire product catalog, the process must be designed to scale.

In this blog post, we will show how the power of data analytics and generative AI can help unleash the creative process, and accelerate testing.

We will provide a step-by-step guide on how to generate potential drug names using BigQuery DataFrames.

We will be using BigQuery DataFrames to perform generative AI operations.

BigQuery DataFrames directly supports a wide variety of ML use cases, which we will showcase here.

1 week, 2 days ago @ cloud.google.com
On-device content distillation with graph neural networks

Our on-device content distillation model smoothly transforms long-form content into a simple and customizable layout for a more pleasant reading journey while also outperforming the leading alternative approaches.

Leaf nodes containing text content are classified as essential or non-essential content.

The features used in the model can be grouped into intermediate node features, leaf-node text features, and element position features.

Node metrics assess the classification performance at the granularity of the accessibility tree node, which is analogous to a paragraph level.

The F1-score of the Chrome content distillation model for headline and main text content on the test sets of different…

1 week, 3 days ago @ blog.research.google
Document AI Workbench is now powered by generative AI to structure document data faster

That’s why we launched Document AI Workbench.

Document AI Workbench enables users to customize models for document processing tasks.

At Google Next ’23, we spoke about mobilizing document insights using generative AI and announced the public preview launches of two generative AI powered features in Document AI Workbench: Custom Extractor with generative AI and the Summarizer.

Custom Extractor with generative AI: Generative AI-powered extraction is now available, in public preview, within the Custom Extractor.

To get started, go to Document AI Workbench and create a Custom Extractor.

1 week, 4 days ago @ cloud.google.com
What is Optical Character Recognition? OCR explained by Google

The earliest multilingual OCR systems adopted a language-specific approach, whereby a specific text line recognition model was trained for each supported language.

What sets Google OCR apart: Google Cloud offers two standalone OCR products, Vision API Text Detection and Document AI Enterprise Document OCR, which allow users to perform high-quality extraction across a wide range of languages, advanced features, and an enterprise-ready API.

This is in large part due to the close partnership between Google Cloud and Google Research to develop and employ the latest advancements in OCR technology.

Document AI Enterprise OCR is Google Cloud’s OCR specialized for document use cases.

How can Google h…

1 week, 4 days ago @ cloud.google.com
What is Optical Character Recognition? OCR explained by Google

Over time, OCR systems generalized beyond these font-specific approaches, with Ray Kurzweil often credited as the developer of the first omni-font OCR system in the 1970s.

This both simplified the OCR pipeline and led to better OCR accuracy by enabling the use of larger recognition models trained on more data.

What sets Google OCR apart: Google Cloud offers two standalone OCR products, Vision API Text Detection and Document AI Enterprise Document OCR, which allow users to perform high-quality extraction across a wide range of languages, advanced features, and an enterprise-ready API.

Document AI Enterprise OCR is Google Cloud’s OCR specialized for document use cases.

Why OCR is important when…

1 week, 4 days ago @ cloud.google.com
World scale inverse reinforcement learning in Google Maps

Routing in Google Maps remains one of our most helpful and frequently used features.

To this end, in "Massively Scalable Inverse Reinforcement Learning in Google Maps", we share the result of a multi-year collaboration among Google Research, Maps, and Google DeepMind to surpass this IRL scalability limitation.

Google Maps improvements in route match rate relative to the existing baseline, when using the RHIP inverse reinforcement learning policy.

Our 360M parameter reward model provides intuitive wins for Google Maps users in live A/B experiments.

Examining road segments with a large absolute difference between the learned rewards and the baseline rewards can help improve certain Google Map…

1 week, 5 days ago @ blog.research.google
Generative AI on Google Cloud: New training content, from introductory to advanced

We have brand-new generative AI training options, and are constantly adding to our training catalog on Google Cloud Skills Boost.

This includes two learning paths that each feature comprehensive content: one is for the non-technical audience (introductory) ‘Introduction to Generative AI learning path’, and the other is for technical practitioners (more advanced) ‘Generative AI for Developers learning path’.

Powered by Google Cloud technology, generative AI is reimagining the way that organizations do business by leveraging data.

Plus, labs give you direct access to Generative AI Studio and Vertex AI (Google Cloud’s machine learning platform) to get hands-on, technical experience.

Here is a …

1 week, 5 days ago @ cloud.google.com
Helping you deliver high-performance, cost-efficient AI inference at scale with GPUs and TPUs

Organizations are deploying these AI models in their products and services, driving the exponential growth in demand for AI-accelerated computing for AI training and inference, to reliably and efficiently serve AI models to users across the world.

Google Cloud inference systems deliver a 2-4x performance improvement and more than a 2x cost-efficiency improvement over existing offerings.

GPU-accelerated AI inference on Google Cloud: Google Cloud and NVIDIA continue to partner to help bring the most advanced GPU-accelerated inference platform to our customers.

NVIDIA’s MLPerf™ 3.1 results for the L4 GPU speak to G2’s strengths: up to 1.8x improvement in performance p…

1 week, 6 days ago @ cloud.google.com
Differentially private median and more

In these cases, a change of a single data point may cascade — changing multiple slices and thus increasing composition cost.

We can therefore ask the following fundamental question: What is the smallest input size N for which we can solve the private interval point problem?

Even though the interval point task seems very basic, it captures the essence of the difficulty of private solutions for common aggregation tasks.

Private learning of axis-aligned rectangles: For the next task, the input is a set of n labeled data points, where each point x = (x_1, ..., x_d) is a d-dimensional vector over a domain X.

This is because an exact solution could be very sensitive to the presence of a particular…

2 weeks, 2 days ago @ blog.research.google
How Vertex AI empowered our Supply Chain Science team at Wayfair

For example, we can easily check which model code was used in the training pipeline run, and we can then check the data collection pipeline to inspect the code that generated the data used for training.

This is possible thanks to additional tooling that we built on top of Vertex AI and described in our previous blog.

Another change that made our lives easier is the serverless nature of Vertex AI.

Reduced dependency on other teams leads to the final key benefit of our Vertex AI transition: We now own the entire data collection workflow necessary for our model development.

Most of our ML happens in the development environment, where we use historical data for model development.

2 weeks, 2 days ago @ cloud.google.com
OpenAI
last post 5 days, 17 hours ago
OpenAI Red Teaming Network

You can read more about our prior red teaming efforts, including our past work with external experts, on models such as DALL·E 2 and GPT-4.

The OpenAI Red Teaming Network is a community of trusted and experienced experts that can help to inform our risk assessment and mitigation efforts more broadly, rather than one-off engagements and selection processes prior to major model deployments.

Members of the network will be called upon based on their expertise to help red team at various stages of the model and product development lifecycle.

Outside of red teaming campaigns commissioned by OpenAI, members will have the opportunity to engage with each other on general red teaming practices and f…

5 days, 17 hours ago @ openai.com
Introducing OpenAI Dublin

The strength of Ireland’s tech and startup ecosystem across Dublin and cities like Cork, Galway, and Limerick has shown impressive growth and advancement.

“IDA Ireland welcomes the decision by OpenAI to establish a European presence in Dublin.

Ireland is a recognized hub for administrative, regulatory, and innovation activities for the world’s leading digital companies.

OpenAI’s investment confirms this and endorses Ireland’s focus on building a flourishing AI ecosystem,” said Michael Lohan, CEO of IDA Ireland.

This marks a significant milestone in our journey, enabling us to better understand, serve, and collaborate with our European partners, users, and customers.

1 week, 4 days ago @ openai.com
Join us for OpenAI’s first developer conference on November 6 in San Francisco

Please join us for our first developer conference, OpenAI DevDay, on November 6, 2023 in San Francisco. The one-day event will bring hundreds of developers from around the world together with the team at OpenAI to preview new tools and exchange ideas.

In-person attendees will also be able to join breakout sessions led by members of OpenAI’s technical staff.

Since launching our API in 2020, we’ve continuously updated it to include our most advanced models, making it easier than ever for developers to integrate cutting-edge AI into their projects with a simple API call.

Today, over 2 million developers are using GPT-4, GPT-3.5, DALL·E and Whisper for a wide range of use cases—from inte…

2 weeks, 4 days ago @ openai.com
Teaching with AI

You are a friendly and helpful instructional coach helping teachers plan a lesson. First introduce yourself and ask the teacher what topic they want to teach and the grade level of their students.

Wait for the teacher to respond.

Do not move on until the teacher responds. Next ask the teacher if students have existing knowledge about the topic or if this is an entirely new topic.

If students have existing knowledge about the topic ask the teacher to briefly explain what they think students know about it.

Wait for the teacher to respond.

3 weeks, 3 days ago @ openai.com
Introducing ChatGPT Enterprise

Customization: Securely extend ChatGPT’s knowledge with your company data by connecting the applications you already use. Availability for all team sizes: a self-serve ChatGPT Business offering for smaller teams. Power tools: Even more powerful versions of Advanced Data Analysis and browsing that are optimized for work. Solutions for your function: more tools for specific roles, such as data analysts, marketers, customer support and more. We…

3 weeks, 6 days ago @ openai.com
OpenAI partners with Scale to provide support for enterprises fine-tuning models

OpenAI and Scale are joining forces to help more companies benefit from fine-tuning our most advanced models.

We recently launched fine-tuning for GPT-3.5 Turbo, and will bring fine-tuning to GPT-4 this fall.

With fine-tuning, companies can now securely customize our most advanced models on proprietary data, making our most powerful models even more useful.

Scale customers can now fine-tune OpenAI models just as they would through OpenAI, while also benefiting from Scale’s enterprise AI expertise and Data Engine.

Scale has already demonstrated value for customers by fine-tuning GPT-3.5 for Brex.

1 month ago @ openai.com
GPT-3.5 Turbo fine-tuning and API updates

Developers can now bring their own data to customize GPT-3.5 Turbo for their use cases.

1 month ago @ openai.com
OpenAI acquires Global Illumination

OpenAI has acquired the team at Global Illumination, a company founded by Thomas Dimson, Taylor Gordon, and Joey Flynn.

The entire team has joined OpenAI to work on our core products including ChatGPT.

Global Illumination is a company that has been leveraging AI to build creative tools, infrastructure, and digital experiences.

The team previously designed and built products early on at Instagram and Facebook and have also made significant contributions at YouTube, Google, Pixar, Riot Games, and other notable companies.

We’re very excited for the impact they’ll have here at OpenAI.

1 month, 1 week ago @ openai.com
Using GPT-4 for content moderation

Our large language models like GPT-4 can understand and generate natural language, making them applicable to content moderation.

The models can make moderation judgments based on policy guidelines provided to them.

We can repeat steps 2 and 3 until we are satisfied with the policy quality.

This iterative process yields refined content policies that are translated into classifiers, enabling the deployment of the policy and content moderation at scale.

Optionally, to handle large amounts of data at scale, we can use GPT-4's predictions to fine-tune a much smaller model.

1 month, 1 week ago @ openai.com
Confidence-Building Measures for Artificial Intelligence: Workshop proceedings

The first two authors are listed in order of contribution, and the remaining authors are listed alphabetically.

The claims in this paper do not represent the views of any author’s organization.

For questions about this paper, contact Sarah Shoker at [email protected] and Andrew Reddie at [email protected].

*Significant contribution, including writing, providing detailed input for the paper, research, workshop organization, and setting the direction of the paper.

**Significant contribution, including providing detailed input for the paper, research, workshop organization, and setting the direction of the paper.

1 month, 3 weeks ago @ openai.com
Frontier Model Forum

The Forum will coordinate research to progress these efforts in areas such as adversarial robustness, mechanistic interpretability, scalable oversight, independent research access, emergent behaviors and anomaly detection.

There will be a strong focus initially on developing and sharing a public library of technical evaluations and benchmarks for frontier AI models.

Support the AI safety ecosystem by identifying the most important open research questions on AI safety.

The Frontier Model Forum will play a vital role in coordinating best practices a…

2 months ago @ openai.com
Moving AI governance forward

They will also develop tools or APIs to determine if a particular piece of content was created with their system.

Audiovisual content that is readily distinguishable from reality or that is designed to be readily recognizable as generated by a company’s AI system—such as the default voices of AI assistants—is outside the scope of this commitment.

7) Prioritize research on societal risks posed by AI systems, including on avoiding harmful bias and discrimination, and protecting privacy: Companies making this commitment recognize the importance of avoiding harmful biases from being propagated by, and discrimination enacted by, AI systems.

Companies commit generally to empowering trust and …

2 months ago @ openai.com
Custom instructions for ChatGPT

We’re introducing custom instructions so that you can tailor ChatGPT to better meet your needs.

This feature will be available in beta starting with the Plus plan today, expanding to all users in the coming weeks.

Custom instructions allow you to add preferences or requirements that you’d like ChatGPT to consider when generating its responses.

We’ve heard your feedback about the friction of starting each ChatGPT conversation afresh.

ChatGPT will consider your custom instructions for every conversation going forward.

2 months ago @ openai.com
Partnership with American Journalism Project to support local news

The American Journalism Project, the leading venture philanthropy working to rebuild local news, today announced a new partnership with OpenAI, the AI research and deployment company behind ChatGPT, to explore ways in which the development of artificial intelligence (AI) can support a thriving, innovative local news field.

“To ensure local journalism remains an essential pillar of our democracy, we need to be smart about the potential powers and pitfalls of new technology,” said Sarabeth Berman, CEO of the American Journalism Project.

Through this partnership, AJP aims to build the support structure for community-driven local news organizations to expand their capacities through AI.

The…

2 months, 1 week ago @ openai.com
Frontier AI regulation: Managing emerging risks to public safety

Contributions include writing, editing, research, detailed feedback, and participation in a workshop on a draft of the paper.

Given the size of the group, inclusion as an author does not entail endorsement of all claims in the paper, nor does inclusion entail an endorsement on the part of any individual’s organization.

*Significant contribution, including writing, research, convening, and setting the direction of the paper.

**Significant contribution, including editing, convening, detailed input, and setting the direction of the paper.

Markus Anderljung (markus.ander[email protected]) and Anton Korinek ([email protected]).

2 months, 2 weeks ago @ openai.com
Microsoft
last post 4 days, 7 hours ago
Neural Graphical Models

In the field of reasoning under uncertainty, probabilistic graphical models (PGMs) stand out as a powerful tool for analyzing data.

Learning, inference, and sampling are operations that make graphical models useful for domain exploration.

Various graphical models impose restrictions on the set of distributions or types of variables in the domain.

In our paper, “Neural Graphical Models,” presented at ECSQARU 2023, we propose Neural Graphical Models (NGMs), a new type of PGM that learns to represent the probability funct…

4 days, 7 hours ago @ microsoft.com
Announcing the DeepSpeed4Science Initiative: Enabling large-scale scientific discovery through sophisticated AI system technologies

We work closely with internal and external teams who own AI-driven science models that represent key science missions, to identify and address general domain-specific AI system challenges.

Our long-term vision is to develop DeepSpeed4Science into a new platform and a unified repository for sharing advanced AI system technologies that support scientific discoveries.

As the featured showcases for this release, they represent two common AI system challenges facing today’s AI-driven structural biology research.

One of our high-level goals is to generalize AI system technologies that broadly address the major system pain points for large-scale scientific discoveries.

We are looking forward to be…

5 days, 8 hours ago @ microsoft.com
Microsoft at ACM SIGCOMM 2023: Innovating the future of networking

In this evolving landscape, Microsoft is at the forefront, spearheading innovation efforts in networking and strengthening the foundational network infrastructure that underpins the cloud ecosystem.

ACM SIGCOMM, the premier annual conference of the Association for Computing Machinery’s special interest group on data communication (SIGCOMM), is dedicated to the study of communication and computer networks.

Paper highlights: The papers Microsoft published at SIGCOMM 2023 span a wide spectrum of networking domains, ranging from 5G and wide area networks (WAN) to enterprise networks.

For the complete list of accepted publications …

1 week, 3 days ago @ microsoft.com
AI Frontiers: The future of scale with Ahmed Awadallah and Ashley Llorens

Things are moving really, really fast.

One is that these benchmarks are being saturated really, really quickly.

And then came GPT-4, and we’re seeing that it’s doing really, really, really well, even in like zero-shot and, and, and few-shot settings, where the tasks are completely new to the models.

And pretraining is the traditional language model training as we have always done it.

doing the language model training with neural networks, and we’re still training neural networks the same way, autoregressive next word prediction.

1 week, 3 days ago @ microsoft.com
Research Focus: Week of September 11, 2023

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

Most database vendors provide business intelligence (BI) tools as an efficient and user-friendly platform for customers to perform data cleaning, preparation and linking tasks to obtain actionable semantic data.

However, customers are increasingly interested in querying semantic data through various modalities including SQL, imperative programming languages such as Python, and natural language queries.

Despite these advantages, lithium-ion batteries face challenges related to capacity degradation and pe…

1 week, 4 days ago @ microsoft.com
Abstracts: September 13, 2023

And they work really well nowadays.

And so that was our first really exciting result, that we could generate proteins that were really of high quality using our method.

And these types of disordered proteins have really, really important roles in biology and in disease.

YANG: So there have been a few other preprints or papers that talk about applying diffusion to protein sequences.

So people will also train these models on small sets of related protein sequences.

1 week, 4 days ago @ microsoft.com
FP2: Fully In-Place Functional Programming provides memory reuse for pure functional programs

How can we overcome this limitation and harness the benefits of functional programming while maintaining efficient memory usage?

Fully in-place functional programming avoids allocation: Recent developments have made it possible to avoid such allocations.

In our paper, “FP2: Fully in-Place Functional Programming,” which we’re presenting at ICFP 2023, we describe the new fip keyword.

To learn about fully in-place functional programming and the Koka language, start at the Koka homepage.

We encourage readers to continue exploring and experimenting with fully in-place programming.

1 week, 5 days ago @ microsoft.com
Understanding social biases through the text-to-image generation lens

In our paper, “Social Biases through the Text-to-Image Generation Lens,” presented at AIES 2023, we conduct a thorough analysis for studying and quantifying common societal biases reflected in generated images.

Image generation results: Emphasizes the distribution observed in image generation outputs.

Notably, images generated by DALLE-v2 provide minimal representation of women in the professions of CEO and computer programmer.

Conversely, in images generated by Stable Diffusion, women are consistently represented in the roles of nurses and housekeepers 100% of the time.

Safeguarding against bias in T2I models: As T2I generation models become increasingly…

2 weeks, 2 days ago @ microsoft.com
Intern Insights: Dr. Josh Benaloh with Anunay Kulshrestha and Karan Newatia

BENALOH: So let’s, let’s talk a little bit more about ElectionGuard.

I wonder if you can tell the listeners a little bit about what it is and how it works.

It’s OK to go into the math a little bit, get a little bit geeky, maybe talk a little bit about the fundamentals of homomorphic encryption and how that leads to things.

BENALOH: So we talked about your backgrounds a little bit at the beginning and then dove into the geeky stuff very quickly.

Anunay, you were talking to me a little bit before we started about your difficulty with geometry as a student.

2 weeks, 2 days ago @ microsoft.com
Incorporating chemists’ insight with AI models for single-step retrosynthesis prediction

In retrosynthesis, a molecule can be represented as either a 2D graph or a 1D SMILES (simplified molecular-input line-entry system) sequence.
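
To make the two representations concrete, here is a minimal sketch that turns a 1D SMILES string into a 2D graph (a list of atoms plus a list of bonded index pairs). The helper `linear_smiles_to_graph` is hypothetical and deliberately a toy: it assumes an unbranched chain of single-letter atoms joined by single bonds, with hydrogens left implicit, and it is not a general SMILES parser.

```python
def linear_smiles_to_graph(smiles):
    """Convert an unbranched, single-bonded SMILES chain into (atoms, bonds).

    Toy assumption: every character is a one-letter uppercase element symbol
    (e.g. C, N, O). Branches, rings, bond symbols, and two-letter elements
    such as Cl are not handled and raise ValueError.
    """
    atoms = []
    bonds = []
    for symbol in smiles:
        if not (symbol.isalpha() and symbol.isupper()):
            raise ValueError(f"unsupported SMILES character: {symbol!r}")
        atoms.append(symbol)
        if len(atoms) > 1:
            # Consecutive atoms in a linear SMILES string are bonded.
            bonds.append((len(atoms) - 2, len(atoms) - 1))
    return atoms, bonds


# Ethanol is written "CCO" in SMILES.
atoms, bonds = linear_smiles_to_graph("CCO")
# atoms == ["C", "C", "O"], bonds == [(0, 1), (1, 2)]
```

The string form suits sequence models, while the atom and bond lists are the natural input for graph-based models.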

While AI has made significant strides in predicting reactants, it’s crucial to acknowledge the expertise of human chemists.

Our paper, Single-step retrosynthesis prediction by leveraging commonly preserved substructures, proposes a novel approach that leverages commonly preserved substructures in organic synthesis.

This approach incorporates chemists’ insight in retrosynthesis, bringing the AI model closer to the way human experts think.

The substructure SMILES and the predicted fragment SMILES are then combined to form a complete r…

2 weeks, 3 days ago @ microsoft.com
Frontiers of multimodal learning: A responsible AI approach

In this digital tapestry, multimodal models are the weavers, bringing together different threads into a coherent whole.

In this blog, we’ll cover some of that research and other groundbreaking work into multimodal AI from Microsoft Research, examining the complexities of evaluating multimodal models and paths toward their improvement.

Historically, machine learning models have functioned within a closed-world assumption, limited by their training data or specific application contexts.

The first paper unveils a practical mixed-reality approach for continual learning that includes multimodal models in the loop.

Related reading: Literature on multimodal models directly discussed in this blog. Lite…

2 weeks, 4 days ago @ microsoft.com
Rethinking trust in direct messages in the AI era

In this blog post, we focus on the challenge of establishing trust and accountability in direct communication (between two people), such as email, direct messages on social media platforms, SMS, and even phone calls.

Reputation system for online accountability: Consider what an online communication user experience could be like with an integrated reputation system.

Messages with an attached reputation tag are called reputed messages, whereas those without an associated reputation are called generic messages.

However, in this case your email client has noted the valid reputation tag and automatically moved the email to a reputed messages folder.
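
The routing behavior described here can be sketched in a few lines. Everything in the example is hypothetical: the `Message` fields, the folder names, and the `tag_is_valid` callback stand in for whatever verification the proposed reputation system would perform, which the excerpt does not specify. Treating a message with an invalid tag as generic is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Message:
    sender: str
    body: str
    reputation_tag: Optional[str] = None  # None means no tag was attached


def route(message: Message, tag_is_valid: Callable[[str], bool]) -> str:
    """Return the folder for a message: 'reputed' or 'generic'."""
    tag = message.reputation_tag
    if tag is not None and tag_is_valid(tag):
        # A valid reputation tag makes this a reputed message.
        return "reputed"
    # No tag, or a tag that fails validation (assumed to count as generic).
    return "generic"


# Hypothetical validator that accepts a single known tag.
folder = route(Message("alice", "hello", "tag-1"), lambda tag: tag == "tag-1")
```

Passing the validator in as a callback keeps the routing rule separate from how tags are actually checked.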

Right before calling you, they sent you a reputa…

2 weeks, 5 days ago @ microsoft.com
AI Frontiers: AI in India and beyond with Sriram Rajamani

There are new things that are coming up, like something that’s, that’s called Beckn.

So that’s actually the shock and the awe and the emotional journey that these people went through.

So I think that’s actually what I heard in your podcast, and that’s indicative of actually what the rest of my colleagues are going through.

You know, for example, Ashley, you know about the HyWay project, where we’re thinking about technology to make hybrid work work better.

But there is a gap between that capability and having programmers just program in natural language, right.

3 weeks, 3 days ago @ microsoft.com
Building a “heavy metal quartet” of AI compilers

As AI technology and large-scale AI models become increasingly prevalent across the digital world, their unique characteristics are posing new challenges for compilers.

Currently, most AI compilers focus on addressing data flow execution efficiency and do not provide efficient support for control flow.

After cutting the data flow into parallel computing blocks of different sizes, it then integrates (grinds) control flow into data flow, so that control flow can also be executed efficiently on the accelerator.

Grinder uses a heuristic strategy to find an effective scheduling scheme and can automatically move control flow into device kernels, thereby achieving optimizations across control flow…

3 weeks, 4 days ago @ microsoft.com
Research Focus: Week of August 28, 2023

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft.

NEW RESEARCH: FiGURe: Simple and Efficient Unsupervised Node Representations with Filter Augmentations. Contrastive learning is a powerful method for unsupervised graph representation learning.

However, these representations struggle when dealing with heterophilic tasks, where edges tend to connect nodes with different labels.
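
One common way to quantify how heterophilic a graph is uses the edge homophily ratio: the fraction of edges whose endpoints share a label. The sketch below is illustrative only; the function name and the edge-list and label-dictionary inputs are assumptions, and the excerpt does not say which measure the paper uses.

```python
def edge_homophily(edges, labels):
    """Fraction of edges that connect two nodes with the same label.

    Values near 1 indicate a homophilic graph; values near 0 indicate a
    heterophilic graph, where edges mostly join differently labeled nodes.
    """
    if not edges:
        raise ValueError("edge_homophily needs at least one edge")
    same_label = sum(1 for u, v in edges if labels[u] == labels[v])
    return same_label / len(edges)


# Hypothetical 4-node path graph: only the edge (2, 3) joins equal labels.
ratio = edge_homophily(
    [(0, 1), (1, 2), (2, 3)],
    {0: "a", 1: "b", 2: "a", 3: "a"},
)
```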

Yet these methods operate in semi-supervised settings, and the extension of these ideas in unsupervised learning still needs to be explored.

In 2018, Sull…

3 weeks, 4 days ago @ microsoft.com
MIT AI
last post 4 days, 3 hours ago
School of Engineering welcomes Songyee Yoon PhD ’00 as visiting innovation scholar

Songyee Yoon PhD ’00, an entrepreneur, innovator, investor, and leader in AI and the gaming industry, has been appointed as a School of Engineering visiting innovation scholar for the 2023-24 academic year.

She played a pivotal role in founding the NCSOFT AI Center, a state-of-the-art AI research facility that has helped the company develop and integrate the latest AI and machine learning technologies into their products.

As a visiting innovation scholar, Yoon will engage in a variety of activities with faculty, students, and staff across MIT’s School of Engineering.

“I am both humbled and excited to embark on my journey as an innovation scholar, where I will champion entrepreneurship and e…

4 days, 3 hours ago @ news.mit.edu
Meet the 2023-24 Accenture Fellows

The MIT and Accenture Convergence Initiative for Industry and Technology has selected five new research fellows for 2023-24.

Now in its third year, the initiative underscores the ways in which industry and research can collaborate to spur technological innovation.

The 2023-24 fellows will conduct research in areas including artificial intelligence, sustainability, and robotics.

The 2023-24 Accenture Fellows are: Yiyue Luo. Yiyue Luo is a PhD candidate who is developing innovative integrations of tactile sensing and haptics, interactive sensing and AI, digital fabrication, and smart wearables.

Luo’s research holds exciting potential to advance ground-breaking applications for smart textiles, he…

5 days, 3 hours ago @ news.mit.edu
Four Lincoln Laboratory technologies win five 2023 R&D 100 awards

These four technologies developed at MIT Lincoln Laboratory, either wholly or with collaborators, received 2023 R&D 100 Awards.

“Lincoln Laboratory has been very fortunate to receive 86 R&D 100 Awards over the past 14 years.

Until recently, Air Force airmen, such as pilots and loadmasters, would have to schedule each flight’s crew manually, on a whiteboard.

Until now, quantum memory systems have fallen short on one or more of these capabilities.

In addition to these winning technologies, five other Lincoln Laboratory technologies were named R&D 100 award finalists.

5 days, 4 hours ago @ news.mit.edu
MIT scholars awarded seed grants to probe the social implications of generative AI

“In the past year, generative AI has captured the public imagination and raised countless questions about how this rapidly advancing technology will affect our world,” Kornbluth says.

“This summer, to help shed light on those questions, we offered our faculty seed grants for the most promising ‘impact papers’ — basically, proposals to pursue intensive research on some aspect of how generative AI will shape people’s life and work.

Those papers will be shared widely via a publication venue managed and hosted by the MIT Press and the MIT Libraries.

Reflecting generative AI’s wide-ranging impact beyond the technology sphere, 11 of the selected proposals have at least one author from the School …

6 days, 5 hours ago @ news.mit.edu
Multi-AI collaboration helps reasoning and factual accuracy in large language models

This method empowers these expansive language models to heighten their adherence to factual data and refine their decision-making.

The crux of the problem with large language models (LLMs) lies in the inconsistency of their generated responses, leading to potential inaccuracies and flawed reasoning.

This simplicity, the team says, could help researchers and developers use the tool to improve the consistency and factual accuracy of language model outputs across the board.

Additionally, the language models showed off enhanced abilities to generate accurate arithmetic evaluations, illustrating potential across different domains.

Beyond its application to language models, the approach could als…

6 days, 11 hours ago @ news.mit.edu
AI-driven tool makes it easy to personalize 3D-printable models

To do this, many of these amateur artisans access free, open-source repositories of user-generated 3D models that they download and fabricate on their 3D printer.

A designer could utilize this tool, called Style2Fab, to personalize 3D models of objects using only natural language prompts to describe their desired design.

Style2Fab is driven by deep-learning algorithms that automatically partition the model into aesthetic and functional segments, streamlining the design process.

A stylization tool would need to preserve the geometry of externally and internally functional segments while enabling customization of nonfunctional, aesthetic segments.

In addition, they want to enhance Style2Fab s…

1 week, 2 days ago @ news.mit.edu
A pose-mapping technique could remotely evaluate patients with cerebral palsy

MIT engineers hope to alleviate some of that stress with a new method that remotely evaluates patients’ motor function.

The researchers tested the method on videos of more than 1,000 children with cerebral palsy.

The researchers then looked for ways to automatically decipher patterns in the cerebral palsy data that are characteristic of each clinical motor function level.

They took the backbone of this pretrained model and added to it a new classification layer, specific to the clinical scores related to cerebral palsy.

They found that the pretrained network learned to correctly classify children’s mobility levels, and it did so more accurately than if it were trained only on the cerebral p…

1 week, 3 days ago @ news.mit.edu
How an archeological approach can help leverage biased data in AI to improve medicine

When encountering biased data, particularly for AI models used in medical settings, the typical response is to either collect more data from underrepresented groups or generate synthetic data making up for missing parts to ensure that the model performs equally well across an array of patient populations.

Self-reported race is a social construct that is both a proxy for other information, and deeply proxied itself in other medical data.

They'll say, 'I'm really scared of an AI misdiagnosing me,’ or ‘I'm concerned it will treat me poorly,’” Ghassemi says.

“I tell them, you shouldn't be scared of some hypothetical AI in health tomorrow, you should be scared of what health is right now.

If we …

1 week, 4 days ago @ news.mit.edu
A. Michael West: Advancing human-robot interactions in health care

“I liked what I was learning.” As a rising senior at Yale, West was selected to participate in the MIT Summer Research Program (MSRP).

For West, MSRP was an education in what “exactly grad school was, especially what it would be like at MIT.” It was also, and most importantly, a source of validation that West could succeed in the higher levels of academia.

Having benefited from the MSRP experience, West gave back once he enrolled at MIT by working as an MSRP group leader for two summers.

His involvement as a leader and mentor in MSRP is just one way West has sought to give back.

It’s something I’d like to get into after graduation.” While earning prestigious fellowships and advancing human-rob…

1 week, 4 days ago @ news.mit.edu
Helping computer vision and language models understand what they see

Powerful machine-learning algorithms known as vision and language models, which learn to match text with images, have shown remarkable results when asked to generate captions or summarize videos.

Researchers from MIT, the MIT-IBM Watson AI Lab, and elsewhere have demonstrated a new technique that utilizes computer-generated data to help vision and language models overcome this shortcoming.

They used this annotated dataset to “fix” vision and language models so they can learn concepts more effectively.

The researchers sought to mitigate this problem by using synthetic data to fine-tune a vision and language model.

Their synthetic dataset and fine-tuning strategy improved the ability of popul…

1 week, 4 days ago @ news.mit.edu
AI model speeds up high-resolution computer vision

Researchers from MIT, the MIT-IBM Watson AI Lab, and elsewhere have developed a more efficient computer vision model that vastly reduces the computational complexity of this task.

To do this, the vehicle might use a powerful computer vision model to categorize every pixel in a high-resolution image of this scene, so it doesn’t lose sight of objects that might be obscured in a lower-quality image.

The result is a new model series for high-resolution computer vision that performs up to nine times faster than prior models when deployed on a mobile device.

Not only could this technique be used to help autonomous vehicles make decisions in real-time, it could also improve the efficiency of other…

1 week, 5 days ago @ news.mit.edu
System combines light and electrons to unlock faster, greener computing

Photonic computing is one potential remedy for the growing computational demands of machine-learning models.

The prototype’s novel design enables impressive speeds, creating the first photonic computing system to serve real-time machine-learning inference requests.

“Controlling this dataflow between photonics and electronics was the Achilles’ heel of past state-of-the-art photonic computing works.

Before Lightning, photonic and electronic computing schemes operated independently, speaking different languages.

“Building a photonic computing system without a count-action programming abstraction is like trying to steer a Lamborghini without knowing how to drive,” says Ghobadi, who is a senior …

1 week, 6 days ago @ news.mit.edu
Making life friendlier with personal robots

“As a child, I wished for a robot that would explain others’ emotions to me,” says Sharifa Alghowinem, a research scientist in the Media Lab’s Personal Robots Group (PRG).

In her early life, Alghowinem faced difficulties with understanding social cues and never scored well on standardized tests, but her dreams carried her through.

Breazeal's research explores the potential for companion robots to go far beyond assistants who obey transactional commands, like requests for the daily weather, adding items to shopping lists, or controlling lighting.

Still in the fundraising stage, the plan is to equip social robots to teach the children English language and social-emotional skills and provide ac…

2 weeks ago @ news.mit.edu
AI pilot programs look to reduce energy use and emissions on MIT campus

That’s the idea behind a cross-departmental effort working to reduce campus energy use through AI building controls that respond in real-time to internal and external factors.

“As we work to decarbonize our campus, we’re exploring all avenues,” says Vice President for Campus Services and Stewardship Joe Higgins, who originally pitched the idea to students at the 2019 MIT Energy Hack.

“The goal here is energy savings, but that’s not something we can fully assess until we complete a whole building,” explains Norford.

Expanding impact: The successful completion of these programs will also open the possibility for even greater energy savings — bringing MIT closer to its decarbonization goals.

“Be…

2 weeks, 2 days ago @ news.mit.edu
Jackson Jewett wants to design buildings that use less concrete

Jewett arrived at MIT in 2017, initially only planning on completing the master’s program in civil and environmental engineering.

He was particularly interested in applying this approach to concrete design, and he collaborated with Carstensen to help demonstrate its viability.

After earning his master’s, Jewett spent a year and a half as a structural engineer in New York City.

The work Jewett completed for his master’s thesis was just the start of a long learning process.

Having worked in the industry before starting the PhD program, Jewett has an eye toward doing work that can be feasibly implemented.

2 weeks, 2 days ago @ news.mit.edu
Berkeley AI
last post 2 months, 1 week ago
Training Diffusion Models with Reinforcement Learning

Diffusion models have recently emerged as the de facto standard for generating complex, high-dimensional outputs.

However, most use cases of diffusion models are not directly concerned with matching the training data, but instead with a downstream objective.

In this post, we show how diffusion models can be trained on these downstream objectives directly using reinforcement learning (RL).

This led to two variants of DDPO: DDPO_SF, which uses the simple score function estimator of the policy gradient, also known as REINFORCE; and DDPO_IS, which uses a more powerful importance sampl…
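
The score function (REINFORCE) estimator named above can be illustrated on a toy problem. The sketch below is not DDPO: it swaps the diffusion denoising process for a hypothetical four-armed bandit with a softmax policy, and the reward, batch size, and learning rate are all assumptions chosen for illustration. It only shows the estimator's shape, which is reward (minus a baseline) times the gradient of the log-probability of the sampled action.

```python
import math
import random


def softmax(logits):
    """Convert logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, rng, batch_size=64, lr=0.5):
    """One score-function (REINFORCE) update on a categorical policy."""
    probs = softmax(logits)
    actions = rng.choices(range(len(logits)), weights=probs, k=batch_size)
    rewards = [reward_fn(a) for a in actions]
    baseline = sum(rewards) / batch_size  # mean-reward baseline reduces variance
    grad = [0.0] * len(logits)
    for action, reward in zip(actions, rewards):
        advantage = reward - baseline
        for i, p in enumerate(probs):
            # d/d(logit_i) of log pi(action) equals 1[i == action] - p_i
            indicator = 1.0 if i == action else 0.0
            grad[i] += advantage * (indicator - p) / batch_size
    # Gradient ascent on expected reward.
    return [x + lr * g for x, g in zip(logits, grad)]


def train(num_actions=4, best_action=2, steps=200, seed=0):
    """Train the toy policy; only `best_action` yields reward 1."""
    rng = random.Random(seed)

    def reward_fn(action):
        return 1.0 if action == best_action else 0.0

    logits = [0.0] * num_actions
    for _ in range(steps):
        logits = reinforce_step(logits, reward_fn, rng)
    return softmax(logits)


probs = train()
```

After training, `probs` concentrates on the rewarded action. In the diffusion setting the same estimator would be applied to the log-probabilities of the denoising steps, which this sketch does not attempt.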

2 months, 1 week ago @ bair.berkeley.edu
On the Stepwise Nature of Self-Supervised Learning

Figure 1: stepwise behavior in self-supervised learning.

Self-supervised learning (SSL) has emerged as a leading framework for learning these representations for images directly from unlabeled data, similar to how LLMs learn representations for language directly from web-scraped text.

Background: We focus here on joint-embedding SSL methods — a superset of contrastive methods — which learn representations that obey view-invariance criteria.

Figure 3: stepwise learning is apparent in Barlow Twins, SimCLR, and VICReg.

The loss and embeddings of all three methods display stepwise learning, with embeddings iteratively increasing in rank as pred…

2 months, 2 weeks ago @ bair.berkeley.edu
Generating 3D Molecular Conformers via Equivariant Coarse-Graining and Aggregated Attention

Figure 1: CoarsenConf architecture.

(IV) Given the FG latent vector and the RDKit approximation, the decoder $p_\theta(X |\mathcal{R}, z)$ learns to recover the low-energy FG structure through autoregressive equivariant message passing.

We introduce a method, which we call Aggregated Attention, to learn the optimal variable length mapping from the latent CG representation to FG coordinates.

Aggregated Attention is responsible for the efficient translation from the latent CG representation to viable FG coordinates (Figure 1(III)).

Experimental Results. Table 1: Quality of generated conformer ensembles fo…

2 months, 3 weeks ago @ bair.berkeley.edu
GPT-4 + Stable-Diffusion = ?: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

TL;DR: Text Prompt -> LLM -> Intermediate Representation (such as an image layout) -> Stable Diffusion -> Image.

Recent advancements in text-to-image generation with diffusion models have yielded remarkable results synthesizing highly realistic and diverse images.

In contrast, our method, LLM-grounded Diffusion (LMD), delivers much better prompt understanding in text-to-image generation in those scenarios.

Figure 1: LLM-grounded Diffusion enhances the prompt understanding ability of text-to-image diffusion models.

This approach comes with significant costs: It is time-consuming and expensive to trai…

4 months ago @ bair.berkeley.edu
Interactive Fleet Learning

A robot fleet is a modern analogue of a fleet of ships, where the word fleet has an etymology tracing back to flēot (‘ship’) and flēotan (‘float’) in Old English.

IFL Formalism and Algorithms: To this end, in a recent paper at the Conference on Robot Learning we introduced the paradigm of Interactive Fleet Learning (IFL), the first formalism in the literature for interactive learning with multiple robots and multiple humans.

Takeaways and Future Directions: To address the gap between the theory and practice of robot fleet learning as well as facilitate future research, we introduce new formalisms, algorithms, and benchmarks for Interactive Fleet Learning.

Acknowledgements: This post is …

5 months, 3 weeks ago @ bair.berkeley.edu
Koala: A Dialogue Model for Academic Research

In this post, we introduce Koala, a chatbot trained by fine-tuning Meta’s LLaMA on dialogue data gathered from the web.

We introduce a new model, Koala, which provides an additional piece of evidence toward this discussion.

We train our Koala model on a single Nvidia DGX server with 8 A100 GPUs.

Understanding large language models: because Koala inference can be performed on relatively inexpensive commodity GPUs, it enables us to better inspect and understand the internals of dialogue language models, making (previously black-box) language models more interpretable.

The Team: The Koala model is a joint effort across multiple research groups in th…

5 months, 3 weeks ago @ bair.berkeley.edu
Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation

Reinforcement learning provides a conceptual framework for autonomous agents to learn from experience, analogously to how one might train a pet with treats.

Can we instead devise reinforcement learning systems for robots that allow them to learn directly “on-the-job”, while performing the task that they are required to do?

Learning systems have the ability to create the entire control algorithm for the robot, and are not limited to tuning a few parameters in a script.

The key step in this work allows these real-world learning systems to autonomously collect the data needed to enable the success of…

8 months, 1 week ago @ bair.berkeley.edu
AWS Machine Learning
last post 2 days, 3 hours ago
Improving your LLMs with RLHF on Amazon SageMaker

In this post, we will demonstrate how to fine-tune a base model with RLHF on Amazon SageMaker.

In this blog post, we illustrate how RLHF can be performed on Amazon SageMaker by conducting an experiment with the popular, open-sourced RLHF repo Trlx.

Train your reward model: Our reward model is based on GPT-J-6B and is fine-tuned on the previously mentioned HH dataset.

RLHF Training: Now that we have acquired all the required components for RLHF training (i.e., an SFT model and a reward model), we can begin optimizing the policy using RLHF.

Table columns: Prompt | SFT Model | RLHF Model. First prompt: I’m a big fan of Mexican street corn.

2 days, 3 hours ago @ aws.amazon.com
How United Airlines built a cost-efficient Optical Character Recognition active learning pipeline

In this post, we discuss how United Airlines, in collaboration with the Amazon Machine Learning Solutions Lab, built an active learning framework on AWS to automate the processing of passenger documents.

About the Authors: Xin Gu is the Lead Data Scientist – Machine Learning at United Airlines’ Advanced Analytics and Innovation division.

He has 5+ years of experience in building machine learning and deep learning solutions and focuses on computer vision and reinforcement learning with human feedback.

Previously, he was head of Central US, Greater China Region, LATAM and Automotive Vertical in AWS Machine Learning Solutions Lab.

He helped AWS customers identify and build machine learning solu…

3 days, 7 hours ago @ aws.amazon.com
Optimize generative AI workloads for environmental sustainability

With the increasing complexity and scale of generative AI models, it is crucial to work towards minimizing their environmental impact.

Generative AI problem framing: When framing your generative AI problem, consider the following: Align your use of generative AI with your sustainability goals – When scoping your project, be sure to take sustainability into account: What are the trade-offs between a generative AI solution and a less resource-intensive traditional approach?

Conclusion: As generative AI models become bigger, it is essential to consider the environmental impact of our workloads.

Because the field of generative AI is continuously progressing, staying updated with the latest cou…

3 days, 7 hours ago @ aws.amazon.com
Train and deploy ML models in a multicloud environment using Amazon SageMaker

For example, you may want to use Amazon SageMaker to build and train ML models, or use Amazon SageMaker JumpStart to deploy pre-built foundation or third-party ML models with a few clicks.

AWS provides support for scenarios where organizations want to bring their own model to Amazon SageMaker or into Amazon SageMaker Canvas for predictions.

We train the model using Amazon SageMaker, store the model artifacts in Amazon Simple Storage Service (Amazon S3), and deploy and run the model in Azure.

SageMaker Studio allows data scientists, ML engineers, and data engineers to prepare data, build, train, and deploy ML models on one web interface.

For more info…

4 days, 7 hours ago @ aws.amazon.com
Generative AI and multi-modal agents in AWS: The key to unlocking new value in financial markets

One of the ways to handle multi-modal data that is gaining popularity is the use of multi-modal agents.

Implementing a multi-modal agent with AWS consolidates key insights from diverse structured and unstructured data on a large scale.

All this is achieved using AWS services, thereby increasing the financial analyst’s efficiency to analyze multi-modal financial data (text, speech, and tabular data) holistically.

Solution overview: The following diagram illustrates the conceptual architecture to use generative AI with multi-modal data using agents.

The platform uses a framework to determine the most suitable multi-modal agent tool to answer the question.

5 days, 8 hours ago @ aws.amazon.com
How VirtuSwap accelerates their pandas-based trading simulations with an Amazon SageMaker Studio custom container and AWS GPU instances

Prerequisites: To run this step-by-step guide, you need an AWS account with permissions to SageMaker, Amazon Elastic Container Registry (Amazon ECR), AWS Identity and Access Management (IAM), and AWS CodeBuild.

Create a policy named sm-build-policy with the following permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "codebuild:DeleteProject",
        "codebuild:CreateProject",
        "codebuild:BatchGetBuilds",
        "codebuild:StartBuild"
      ],
      "Resource": "arn:aws:codebuild:*:*:project/sagemaker-studio*"
    },
    {
      "Effect": "Allow",
      "Action": "logs:CreateLogStream",
      "Resource": "arn:aws:logs:*:*:log-group:/aws/codebuild/sagemaker-studio*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:G…

5 days, 8 hours ago @ aws.amazon.com
Unlock ML insights using the Amazon SageMaker Feature Store Feature Processor

Amazon SageMaker Feature Store provides an end-to-end solution to automate feature engineering for machine learning (ML).

Now, create the car-data feature group:

# Create Feature Group - Car sale records.
car_sales_fg = FeatureGroup(
    name=CAR_SALES_FG_NAME,
    feature_definitions=CAR_SALES_FG_FEATURE_DEFINITIONS,
    sagemaker_session=sagemaker_session,
)
create_car_sales_fg_resp = car_sales_fg.create(
    record_identifier_name=CAR_SALES_FG_RECORD_IDENTIFIER_NAME,
    event_time_feature_name=CAR_SALES_FG_EVENT_TIME_FEATURE_NAME,
    s3_uri=CAR_SALES_FG_OFFLINE_STORE_S3_URI,
    enable_online_store=True,
    role_arn=CAR_SALES_FG_ROLE_ARN,
)

Create the car-data-aggregated feature group:

# Create Feature Group - Aggregat…

5 days, 8 hours ago @ aws.amazon.com
Orchestrate Ray-based machine learning workflows using Amazon SageMaker

Amazon SageMaker Feature Store provides a scalable repository for storing, retrieving, and sharing ML features for model training.

Trained models can be stored, versioned, and tracked in Amazon SageMaker Model Registry for governance and management.

We set up an end-to-end Ray-based ML workflow, orchestrated using SageMaker Pipelines.

SageMaker Pipelines has direct integration with SageMaker, the SageMaker Python SDK, and SageMaker Studio.

First, let’s get the model registered in SageMaker Model Registry:

xgb_regressor_model = ModelPackage(
    role_arn,
    model_package_arn=model_package_arn,
    name=model_name
)

The model’s current status is PendingApproval.

6 days, 6 hours ago @ aws.amazon.com
Designing resilient cities at Arup using Amazon SageMaker geospatial capabilities

SageMaker geospatial capabilities make it easy for data scientists and machine learning (ML) engineers to build, train, and deploy models using geospatial data.

Arup addressed this challenge by partnering with AWS and using SageMaker geospatial capabilities to enable analysis at a city scale and beyond.

Over larger areas of interest, this modeling can also be used to forecast the UHI effect alongside climate projections to quantify the scale of the UHI effect.

Unlock the possibilities of earth observation data in your sustainability projects by exploring the SageMaker geospatial capabilities on the SageMaker console today.

To find out more, refer to Amazon SageMaker geospatial capabilities,…

6 days, 6 hours ago @ aws.amazon.com
Learn how to build and deploy tool-using LLM agents using AWS SageMaker JumpStart Foundation Models

With access to tools, LLM agents can become more powerful—at the cost of additional complexity.

In this post, we introduce LLM agents and demonstrate how to build and deploy an e-commerce LLM agent using Amazon SageMaker JumpStart and AWS Lambda.

For a fully managed experience for building LLM agents, AWS also provides the agents for Amazon Bedrock feature (in preview).

Reliability and performance: LLM agents can be unreliable, especially for complex tasks that cannot be completed within a few loops.

He is passionate about helping companies with AI/ML product development, and the future of LLM agents and co-pilots.

1 week, 2 days ago @ aws.amazon.com
Build a classification pipeline with Amazon Comprehend custom classification (Part I)

Amazon Comprehend custom classification can be useful in this situation.

In the first part of this multi-part blog series, you will learn how to create a scalable training pipeline and prepare training data for Comprehend Custom Classification models.

Q02ClassifierOutputBucketName (Required) – The Amazon S3 bucket name to store outputs from Amazon Comprehend and the pipeline.

Conclusion: In this post, we showed you the concept of a scalable training pipeline for Amazon Comprehend custom classification models and provided an automated solution for efficiently training new models.

1 week, 3 days ago @ aws.amazon.com
Fine-tune Falcon 7B and other LLMs on Amazon SageMaker with @remote decorator

With Amazon SageMaker, you can now run a SageMaker training job simply by annotating your Python code with the @remote decorator.

In this post, we showcase how to fine-tune a Falcon-7B foundation model (FM) using the @remote decorator from the SageMaker Python SDK.

Continue providing dependencies in a requirements.txt and supply that to the remote decorator.

Set up remote decorator configurations: Create a configuration file where all the configurations related to the Amazon SageMaker training job are specified.

{region}.amazonaws.com/huggingface-pytorch-training:2.0.0-transformers4.28.1-gpu-py310-cu118-ubuntu20.04'
InstanceType: ml.g5.12xlarge
RoleArn: arn:aws:iam::111122223333:role/ExampleSageMakerRole

It’s not…

1 week, 3 days ago @ aws.amazon.com
Simplify access to internal information using Retrieval Augmented Generation and LangChain Agents

Amazon API Gateway hosts a REST API with various endpoints to handle user requests that are authenticated using Amazon Cognito.

The LangChain agent verifies permissions, modifies ticket status, and notifies the correct individuals using Amazon Simple Notification Service (Amazon SNS).

Conclusion: In this post, we showed you how to simplify access to internal information by summarizing from multiple repositories in real-time.

To learn more about working with generative AI on AWS, refer to Announcing New Tools for Building with Generative AI on AWS.

In our next post, we’ll outline ways to implement this solution using Amazon Bedrock and the Amazon Titan LLM.

1 week, 3 days ago @ aws.amazon.com
Visualize an Amazon Comprehend analysis with a word cloud in Amazon QuickSight

Amazon Comprehend is a fully managed service that uses natural language processing (NLP) to extract insights about the content of documents.

Amazon Comprehend develops insights by recognizing the entities, key phrases, sentiment, themes, and custom elements in a document.

Then, we use Amazon QuickSight to generate a simple yet powerful word cloud visual to easily spot themes or trends.

Analyze data using Amazon Comprehend: There are many types of text-based and image information that can be processed using Amazon Comprehend.

This post described the steps to build a word cloud in QuickSight that visualizes a text content analysis from Amazon Comprehend.
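
A word cloud sizes each term by how often it occurs, so the step between text analysis and the visual is essentially a frequency count. The sketch below shows that step only, under stated assumptions: `phrase_weights` is a hypothetical helper, the input list is made-up sample data standing in for extracted key phrases, and nothing here calls Amazon Comprehend or QuickSight.

```python
from collections import Counter


def phrase_weights(key_phrases, top_n=5):
    """Count normalized key phrases and scale the counts to (0, 1] for sizing."""
    counts = Counter(p.strip().lower() for p in key_phrases if p.strip())
    if not counts:
        return []
    top = counts.most_common(top_n)
    largest = top[0][1]  # the most frequent phrase gets weight 1.0
    return [(phrase, count / largest) for phrase, count in top]


# Hypothetical key phrases, standing in for the output of a text-analysis step.
phrases = [
    "Delivery time", "delivery time", "battery life",
    "Battery life", "delivery time", "screen",
]
weights = phrase_weights(phrases)
```

Normalizing case before counting keeps "Delivery time" and "delivery time" from appearing as two separate words in the cloud.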

1 week, 4 days ago @ aws.amazon.com
Amazon SageMaker simplifies the Amazon SageMaker Studio setup for individual users

With this new capability, individual users can launch Amazon SageMaker Studio with default presets in minutes.

To use SageMaker Studio or other personal apps such as Amazon SageMaker Canvas, or to collaborate in shared spaces, AWS customers need to first set up a SageMaker domain.

Introducing the simplified Quick Studio setup: The new Quick Studio setup experience for SageMaker provides a new onboarding and administration experience that makes it easy for individual users to set up and manage SageMaker Studio.

Components of the Quick Studio setup: When using the Quick Studio setup, SageMaker creates the following resources: A new IAM role with the appropriate permissions for using SageMaker Stud…

1 week, 5 days ago @ aws.amazon.com
NVIDIA
last post 3 days, 9 hours ago
NVIDIA Studio Lineup Adds RTX-Powered Microsoft Surface Laptop Studio 2

Editor’s note: This post is part of our weekly In the NVIDIA Studio series, which celebrates featured artists, offers creative tips and tricks and demonstrates how NVIDIA Studio technology improves creative workflows.

The NVIDIA Studio laptop lineup is expanding with the new Microsoft Surface Laptop Studio 2, powered by GeForce RTX 4060, GeForce RTX 4050 or NVIDIA RTX 2000 Ada Generation Laptop GPUs, providing powerful performance and versatility for creators.

Backed by the NVIDIA Studio platform, the Surface Laptop Studio 2, announced today, offers maximum stability with preinstalled Studio Drivers, plus exclusive tools to accelerate professional and creative workflows.

The September Studi…

3 days, 9 hours ago @ blogs.nvidia.com
Run AI on Your PC? GeForce Users Are Ahead of the Curve

Video editing in DaVinci Resolve powered by AI doubles its speed, and Adobe Photoshop’s photo editing tasks become 3x as swift.

NVIDIA RTX AI tech demonstrates a staggering 10x faster speeds in distinct workflows when juxtaposed against its competitors.

Beyond gaming and content creation, RTX GPUs bring AI to all types of users.

Video Streamers: RTX Video Super Resolution uses AI to increase the resolution and improve the quality of streamed video, elevating the home video experience.

Three pillars: lightning-fast graphics processing from GPUs, AI capabilities integral to GeForce and the omnipresent cloud.

3 days, 9 hours ago @ blogs.nvidia.com
Into the Omniverse: Blender 4.0 Alpha Release Sets Stage for New Era of OpenUSD Artistry

For seasoned 3D artists and budding digital creation enthusiasts alike, an alpha version of the popular 3D software Blender is elevating creative journeys.

A Universal Upgrade for Blender Workflows: With Blender 4.0 Alpha, 3D creators across industries and enterprises can access optimized OpenUSD workflows for various use cases.

See how Moment Factory uses Omniverse, Blender and USD to bring their immersive events to life in their recent livestream.

Get started with NVIDIA Omniverse by downloading the standard license for free or learn how Omniverse Enterprise can connect your team.

Stay up to date on the platform by subscribing to the newsletter and following NVIDIA Omniverse on Instagram, L…

3 days, 11 hours ago @ blogs.nvidia.com
NVIDIA CEO Jensen Huang to Headline AI Summit in Tel Aviv

NVIDIA founder and CEO Jensen Huang will highlight the newest in generative AI and cloud computing at the NVIDIA AI Summit in Tel Aviv from Oct. 15-16.

Deep dive into AI: The first day is dedicated to intensive learning sessions hosted by the NVIDIA Deep Learning Institute.

Workshops encompass topics like “Fundamentals of Deep Learning” and “Building AI-Based Cybersecurity Pipelines,” among a range of other topics.

Edge AI & Robotics Developer Day activities will explore innovations in AI and the NVIDIA Jetson Orin platform.

3 days, 11 hours ago @ blogs.nvidia.com
Cash In: ‘PAYDAY 3’ Streams on GeForce NOW

Time to get the gang back together — PAYDAY 3 streams on GeForce NOW this week.

The Perfect Heist: PAYDAY 3 is the highly anticipated sequel to one of the world’s most popular co-op shooters.

Step out of retirement and back into the life of crime in the shoes of the Payday Gang — who bring the envy of their peers and the nightmare of law enforcement wherever they go.

Upgrade to a GeForce NOW Ultimate membership to pull off every heist at the highest quality.

Here’s the full list of this week’s GeForce NOW library additions: Starting today, the Cyberpunk 2077 2.0 patch will also be supported, adding DLSS 3.5 technology and other new features.

3 days, 11 hours ago @ blogs.nvidia.com
Virtually Incredible: Mercedes-Benz Folds Next-Gen Luxury EV Platform Into Production With NVIDIA Omniverse and Generative AI

The digital twin in production helps ensure Mercedes-Benz assembly lines can be retooled, configured and optimized in physically accurate simulations first.

By leveraging Omniverse, Mercedes-Benz can interact directly with its suppliers, reducing coordination processes by 50%.

The Rastatt plant is being used to pioneer digital production in the paint shop.

Running simulations in NVIDIA Omniverse enables factory planners to optimize factory floor and production line layouts for supply routes, and production lines can be validated without having to disrupt production.

This accelerates how quickly new production lines can reach maximum capacity and reduces the risk of re-work or stoppages.

4 days, 6 hours ago @ blogs.nvidia.com
New Video: Representing Data with OpenUSD Custom Schemas

In the third installment of this OpenUSD series, I share what developers must know about custom schemas.

Specifically, we dive into: Data formalization: Custom schemas formalize data models, such as geometric meshes.

OpenUSD includes core schemas like geometry and shading, with continuous development of custom schemas to broaden the digital landscape.

Universal Scene Description (OpenUSD): Custom Schemas. For the latest USD resources and tutorials, visit our OpenUSD resources page.

4 days, 8 hours ago @ developer.nvidia.com
Oracle Cloud Infrastructure Offers New NVIDIA GPU-Accelerated Compute Instances

To help meet this need, Oracle Cloud Infrastructure today announced general availability of NVIDIA H100 Tensor Core GPUs on OCI Compute, with NVIDIA L40S GPUs coming soon.

The BM.GPU.H100.8 OCI Compute shape includes eight NVIDIA H100 GPUs, each with 80GB of HBM3 GPU memory.

NVIDIA L40S GPU Instance on OCI: The NVIDIA L40S GPU, based on the NVIDIA Ada Lovelace architecture, is a universal GPU for the data center, delivering breakthrough multi-workload acceleration for LLM inference and training, visual computing and video applications.

The OCI Compute bare-metal instances with NVIDIA L40S GPUs will be available for early access later this year, with general availability coming early in 2024.

5 days ago @ blogs.nvidia.com
Meet the Omnivore: Industrial Designer Blends Art and OpenUSD to Create 3D Assets for AI Training

Boehmer creates realistic 3D assets that can be used with SORDI.ai, short for Synthetic Object Recognition Dataset for Industries.

Creating Realistic 3D Assets for Training AI: As a design intern, Boehmer’s main tasks are animation and 3D modeling.

Once the 3D assets are created, Boehmer uses the models to start assembling scenes.

Join In on the Creation: Anyone can build their own Omniverse extension or Connector to enhance their 3D workflows and tools.

See how creators are using OpenUSD to accelerate a variety of 3D workflows in the latest OpenUSD All Stars.

5 days, 6 hours ago @ blogs.nvidia.com
Ray Shines with NVIDIA AI: Anyscale Collaboration to Help Developers Build, Tune, Train and Scale Production LLMs

Developers will have the flexibility to deploy open-source NVIDIA software with Ray or opt for NVIDIA AI Enterprise software running on the Anyscale Platform for a fully supported and secure production deployment.

NeMo is an end-to-end, cloud-native framework to build, customize and deploy generative AI models anywhere.

Whether developers use Ray open source or the supported Anyscale Platform, Anyscale’s core functionality helps them easily orchestrate LLM workloads.

The NVIDIA AI integration can help developers build, train, tune and scale AI with even greater efficiency.

NVIDIA AI integrations with Anyscale are in development and expected to be available by the end of the year.

6 days, 11 hours ago @ blogs.nvidia.com
Adobe Scales ML Pipelines for Optimized Delivery of Brand Messages

Amazon EMR on EKS Integration: Amazon EMR on EKS provides a deployment option for Amazon EMR that allows organizations to run open-source big data frameworks on Amazon Elastic Kubernetes Service (Amazon EKS).

With Amazon EMR on EKS, Spark applications run on the Amazon EMR runtime for Apache Spark.

Amazon EKS deployment architecture: To effectively deploy Amazon EMR on EKS with Spark-RAPIDS, we’ll follow the recommended architecture provided by Data on EKS (DoEKS).

NVIDIA GPU Operator: The NVIDIA GPU Operator is a Kubernetes operator that streamlines the management and provisioning of NVIDIA GPU resources within Amazon EKS clusters.

Deploying the solution: The solution involves the creation and con…

1 week, 3 days ago @ aws.amazon.com
Shout at the Devil: Capcom’s ‘Devil May Cry 5’ Joins GeForce NOW

GFN Thursday is downright demonic, as Devil May Cry 5 comes to GeForce NOW.

Capcom’s action-packed third-person brawler leads 15 titles joining the GeForce NOW library this week, including Gears Tactics and The Crew Motorfest.

The Devil Returns: Devil May Cry 5 is the next title from Capcom’s catalog to come to GeForce NOW.

Gears Tactics is the next PC Game Pass title to arrive in the cloud.

Gears Tactics is a fast-paced, turn-based strategy game from one of the most acclaimed video game franchises — Gears of War.

1 week, 3 days ago @ blogs.nvidia.com
Unlocking the Language of Genomes and Climates: Anima Anandkumar on Using Generative AI to Tackle Global Challenges

Generative AI can also predict extreme weather events like hurricanes or heat waves.

Even with an AI boost, trying to predict natural events is challenging because of the sheer number of variables and unknowns.

Overjet’s Ai Wardah Inam on Bringing AI to Dentistry: Overjet, a member of NVIDIA Inception, is moving fast to bring AI to dentists’ offices.

Subscribe to the AI Podcast: Now Available on Amazon Music. The AI Podcast is now available through Amazon Music.

Make the AI Podcast better.

1 week, 4 days ago @ blogs.nvidia.com
NVIDIA Lends Support to Washington’s Efforts to Ensure AI Safety

The commitments are designed to advance common standards and best practices to ensure the safety of generative AI systems until regulations are in place, the White House said.

They include: Testing the safety and capabilities of AI products before they’re deployed, Safeguarding AI models against cyber and insider threats, and Using AI to help meet society’s greatest challenges, from cancer to climate change.

The subcommittee’s hearing, “Oversight of AI: Rules for Artificial Intelligence,” is among actions from policymakers around the world trying to identify and address potential risks of generative AI.

That result helped spark a decade of rapid advances using GPUs, leading to ChatGPT and othe…

1 week, 5 days ago @ blogs.nvidia.com
Mobility Gets Amped: IAA Show Floor Energized by Surge in EV Reveals, Generative AI

Electric Vehicles Dominate the Show Floor: The auto industry’s move toward electrification was on full display at IAA, with a number of global automakers showcasing their current and upcoming electric mobility lineup.

It’s powered by NVIDIA DRIVE Orin, which delivers 254 TOPS of compute to enable safe, high-speed and urban intelligent-driving capabilities.

XPENG’s inaugural presence at IAA served as the ideal opportunity to introduce its latest models to Europe, including its G9 and P7 EVs, with NVIDIA DRIVE Orin under the hood.

The automaker’s intelligent G6 Coupe SUV, also powered by NVIDIA DRIVE Orin, will be made available to the European market next year.

Ecosystem Partners Paint IAA Sho…

1 week, 5 days ago @ blogs.nvidia.com
Facebook
last post 2 weeks, 3 days ago
Using Chakra execution traces for benchmarking and network performance optimization

Meta presents Chakra execution traces, an open graph-based representation of AI/ML workload execution, laying the foundation for benchmarking and network performance optimization.

The limitations of traditional AI benchmarking methodology: Traditionally, benchmarking AI systems has largely relied on running full ML workloads.

How Meta leverages Chakra execution traces: At Meta, we collect execution traces from our production servers every day.

Future plans: Enhancing the benchmarking capability of Chakra execution traces. While the execution trace replayer enables replay of execution traces, it brings forth challenges.

Using AI to generate representative execution traces: Chakra execution traces are…

2 weeks, 3 days ago @ engineering.fb.com
Arcadia: An end-to-end AI system performance simulator

We’re introducing Arcadia, Meta’s unified system that simulates the compute, memory, and network performance of AI training clusters.

Arcadia gives Meta’s researchers and engineers valuable insights into the performance of AI models and workloads in an AI cluster – enabling data-driven decision making in the design of AI clusters.

That’s where Arcadia, Meta’s end-to-end AI system performance simulator, comes in.

By providing insights into the impact of these factors on system performance, Arcadia facilitates data-driven decision-making processes and fosters the evolution of models and hardware.

This comprehensive set of metrics empowers stakeholders to analyze the impact of different factor…

2 weeks, 3 days ago @ engineering.fb.com
Code Llama: Meta’s state-of-the-art LLM for coding

Today, we are releasing Code Llama, a large language model (LLM) that can use text prompts to generate code.

Additionally, we have further fine-tuned two additional variations of Code Llama: Code Llama - Python and Code Llama - Instruct.

Code Llama - Python is a language-specialized variation of Code Llama, further fine-tuned on 100B tokens of Python code.

Code Llama - Instruct is an instruction fine-tuned and aligned variation of Code Llama.

We recommend using Code Llama - Instruct variants whenever using Code Llama for code generation since Code Llama - Instruct has been fine-tuned to generate helpful and safe answers in natural language.

1 month ago @ ai.meta.com
Meta Connect 2023: September 27 – 28

The post Meta Connect 2023: September 27 – 28 appeared first on Engineering at Meta.

1 month, 1 week ago @ meta.com
Scaling the Instagram Explore recommendations system

Every day, hundreds of millions of people visit Explore on Instagram to discover something new, making it one of the largest recommendation surfaces on Instagram.

Readers might notice that the leitmotif of this post will be clever use of caching and pre-computation in different ranking stages.

In most large-scale recommender systems, the retrieval stage consists of multiple candidates’ retrieval sources (“sources” for short).

To model media retrieval for different user groups with various interests, we utilize all these mentioned source types together and mix them with tunable weights.

After training, user embeddings should be close to the embeddings of relevant items for a given user.
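As a rough illustration of the retrieval idea in the excerpt above, the sketch below ranks items by the dot product between a user embedding and item embeddings. The vectors, item names and the `top_k` helper are made up for illustration; they are assumptions and are not taken from the Instagram post.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Dot product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k(user: List[float], items: Dict[str, List[float]], k: int) -> List[Tuple[str, float]]:
    """Return the k items whose embeddings score highest against the user embedding."""
    scored = [(name, dot(user, vec)) for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional embeddings, chosen only for this example.
user_embedding = [0.9, 0.1, 0.0]
item_embeddings = {
    "cooking_reel": [0.8, 0.2, 0.1],
    "car_review": [0.0, 0.1, 0.9],
    "baking_photo": [0.7, 0.0, 0.2],
}

candidates = top_k(user_embedding, item_embeddings, k=2)
print([name for name, _ in candidates])
```

A production system would likely use approximate nearest-neighbor search rather than scoring every item, but the ranking criterion is the same.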

1 month, 2 weeks ago @ engineering.fb.com
MSVP is Meta’s first video processing ASIC

MSVP’s video encoding algorithms: The MSVP encoder has two main goals: to be highly power efficient and to deliver the same or better video quality as software encoders.

Here’s a simplified version of the data flow of modern hybrid (hardware and software) video encoders: [Figure: Simplified video encoder modules.]

We mainly focused on three levels: block level, frame level, and group of picture (GOP) level.

Smart quantization: Quantization is the only lossy part of video compression, and it is also the dominant bit rate control knob in any video coding standard.
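To make the "lossy" point concrete, here is a minimal sketch of uniform scalar quantization, assuming a single step size applied to a handful of made-up coefficient values. Real encoders such as MSVP use codec-specific quantizers, so the function names and numbers here are illustrative assumptions only.

```python
from typing import List


def quantize(coeffs: List[float], step: float) -> List[int]:
    """Map each coefficient to the index of the nearest multiple of `step`."""
    return [round(c / step) for c in coeffs]


def dequantize(levels: List[int], step: float) -> List[float]:
    """Reconstruct approximate coefficients from quantization indices."""
    return [q * step for q in levels]


def max_error(coeffs: List[float], step: float) -> float:
    """Largest absolute reconstruction error for a given step size."""
    restored = dequantize(quantize(coeffs, step), step)
    return max(abs(a - b) for a, b in zip(coeffs, restored))


# Hypothetical transform coefficients.
coeffs = [12.3, -4.7, 0.9, 30.2]

for step in (1.0, 8.0):
    print(step, quantize(coeffs, step), max_error(coeffs, step))
```

A larger step gives smaller indices, which are cheaper to encode, at the cost of a larger reconstruction error (bounded by half the step). That trade-off is why the step size acts as a bit rate control knob.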

Since VP9’s frame type is different from H.264’s, the strategy for making frame level decisions is also different, as shown in the following figu…

4 months, 1 week ago @ ai.meta.com
Meta introduces its first-generation AI inference accelerator

These workloads run on PyTorch with first-class Python integration, eager-mode development, and the simplicity of APIs.

Deep learning recommendation models (DLRMs), in particular, are important for improving experiences across Meta’s services and applications.

We found that GPUs were not always optimal for running Meta’s specific recommendation workloads at the levels of efficiency required at our scale.

Our solution to this challenge was to design a family of recommendation-specific Meta Training and Inference Accelerator (MTIA) ASICs.

In addition, we maintained the user experience and developer efficiency offered by PyTorch eager-mode development.

4 months, 1 week ago @ ai.meta.com
Uber Engineering
last post None
neptune.ai
last post 2 weeks, 3 days ago
Software Engineering Patterns for Machine Learning

Best practices for building ETLs for ML. The significance of ETLs in machine learning projects: Exploring a pivotal facet of every machine learning endeavor: ETLs.

Best practices for building training and inference algorithms. The nature of training in machine learning: Training is often seen as an engaging and imaginative aspect of machine learning tasks.

Transition to production and challenges: After constructing the machine learning model, the next step is transitioning it into a production…

2 weeks, 3 days ago @ neptune.ai
ML Pipeline Architecture Design Patterns (With 10 Real-World Examples)

2 Exploration of standard ML pipeline/system design and architectural practices in prominent tech companies
3 Explanation of common ML pipeline architecture design patterns
4 Introduction to common components of ML pipelines
5 Introduction to tools, techniques and software used to implement and maintain ML pipelines
6 ML pipeline architecture examples
7 Comm…

1 month, 2 weeks ago @ neptune.ai
Organizing ML Monorepo With Pants

Finally, we’ll see how to harness the power of the Pants build system to organize your machine learning monorepo into a robust CI/CD build system.

[Figure: Machine learning monorepo | Source: Author]

A monorepo (short for monolithic repository) is a software development strategy where code for many projects is stored in the same repository.

Machine learning with monorepos: There are at least six reasons why monorepos are particularly suitable for machine learning projects.

Cross-functional collaboration: Machine learning projects often involve collaboration between data scientists, ML engineers, and software engineers.

In it, we are running the following commands: pants tailor --check update-build-files --…

1 month, 3 weeks ago @ neptune.ai
Learnings From Building the ML Platform at Stitch Fix

In this episode, Stefan Krawczyk shares his learnings from building the ML Platform at Stitch Fix.

As you’ve been running the ML data platform team, how do you do that?

So I have a blog post on some of my learnings, building the platform at Stitch Fix.

But at Stitch Fix, the ratio of safe, if you just take the engineering, the platform team that was focused on helping pipelines, right?

But there is an order of magnitude less around being an MLOps engineer or a member of the ML platform team.

1 month, 3 weeks ago @ neptune.ai
Deploying Conversational AI Products to Production With Jason Flaks

Every episode is focused on one specific ML topic, and during this one, we talked to Jason Flaks about deploying conversational AI products to production.

Today, we have Jason Flaks with us, and we’ll be talking about deploying conversational AI products to production.

Conversational AI is about building technology and products that are capable of interacting with humans in this unbounded conversational domain space.

Chatbot: This is about enabling people to engage with our product via conversational speech.

And so, if you want to support conversational AI, you really need to be prepared to deal with the dynamic nature of language constantly.

2 months, 1 week ago @ neptune.ai
How to Use SHAP Values to Optimize and Debug ML Models

With SHAP values, practitioners can: Uncover influential features: SHAP values highlight features with substantial impact on predictions, enabling their prioritization during feature engineering.

SHAP beeswarm plot: The SHAP beeswarm plot visualizes the distribution of SHAP values across features in a dataset.

SHAP values and the dependence plot: In this example, the SHAP dependence scatter plot showcases the non-linear relationship between the “age” feature and its corresponding SHAP values.

On the x-axis, the “age” values are displayed, while the y-axis represents the SHAP values associated with each “age” value.
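SHAP values build on Shapley values from cooperative game theory. The sketch below does not use the `shap` library; it computes exact Shapley values for a toy two-feature model by averaging each feature's marginal contribution over all feature orderings, with "absent" features set to a baseline. The model, feature values and baseline are assumptions made up for this example.

```python
from itertools import permutations
from math import factorial
from typing import Callable, List


def shapley_values(model: Callable[[List[float]], float],
                   x: List[float],
                   baseline: List[float]) -> List[float]:
    """Exact Shapley values by enumerating every feature ordering."""
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)       # start with all features "absent"
        previous = model(current)
        for i in order:
            current[i] = x[i]          # reveal feature i
            value = model(current)
            phi[i] += value - previous # marginal contribution of feature i
            previous = value
    return [p / factorial(n) for p in phi]


def toy_model(features: List[float]) -> float:
    """Hypothetical model with an interaction term between age and income."""
    age, income = features
    return 2.0 * age + 1.0 * income + 0.5 * age * income


x = [3.0, 2.0]          # hypothetical (age, income) input
baseline = [0.0, 0.0]   # hypothetical reference input
phi = shapley_values(toy_model, x, baseline)
print(phi, sum(phi), toy_model(x) - toy_model(baseline))
```

The attributions sum to the difference between the prediction and the baseline prediction, which is the additivity property that SHAP plots rely on. Enumerating orderings is only feasible for a few features; practical tools use approximations or model-specific algorithms.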

Conclusion: In this article, we explored how to…

2 months, 4 weeks ago @ neptune.ai
MLOps Landscape in 2023: Top Tools and Platforms

Like every software solution, evaluating MLOps (Machine Learning Operations) tools and platforms can be a complex task as it requires consideration of varying factors.

Talend Data Quality: Talend Data Quality is a comprehensive data quality management tool with data profiling, cleansing, and monitoring features.

OctoML: OctoML is a machine learning acceleration platform that helps engineers quickly deploy machine learning models on any hardware, cloud provider, or edge device.

Model performance tracking: Track and analyze model performance over time, including metrics such as accuracy, precision, recall, or custom-defined metrics, providing a comprehensive view of model effectiveness.
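For readers unfamiliar with the metrics named above, the following is a minimal sketch of how accuracy, precision and recall can be computed for binary labels. The labels are made up, and the `binary_metrics` helper is a hypothetical name, not part of any tool mentioned in the article.

```python
from typing import Dict, List


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision and recall for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical ground truth and predictions from one evaluation run.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Tracking "over time" then amounts to storing one such dictionary per evaluation run (for example keyed by date) and comparing the values across runs.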

2 months, 4 weeks ago @ neptune.ai
How to Build ML Model Training Pipeline

[Figure: Complete ML model training pipeline workflow | Source]

But before we delve into the step-by-step model training pipeline, it’s essential to understand the basics, architecture, motivations, challenges associated with ML pipelines, and a few tools that you will need to work with.

There are several reasons to build an ML model training pipeline (trust me!

ML model training pipeline architecture: An ML model training pipeline typically consists of several interconnected components or stages.

In this section, we will walk through a step-by-step tutorial on how to build an ML model training pipeline.

To effectively incorporate distributed training in your ML model training pipelines, here are some u…

3 months, 2 weeks ago @ neptune.ai
What Does GPT-3 Mean For the Future of MLOps? With David Hershey

Every episode is focused on one specific ML topic, and during this one, we talked to David Hershey about GPT-3 and the future of MLOps.

I give broad answers because nearly every industry product has some opportunity to incorporate or improve a feature using language models.

So I write a prompt, the language model says something back based on what the language model says back, and I send another prompt to clarify or to move in some other direction.

David: Maybe to start with the obvious and then we’ll get into the less obvious because I think that’s easy.

Happy to chat about language models, MLOps, whatever, and flush boat.

3 months, 3 weeks ago @ neptune.ai
Building ML Platform in Retail and eCommerce

An ML Platform helps in the faster iteration of an ML project lifecycle.

[Figure: The architecture of an ML Platform in eCommerce | Source: Author]

One might give a different name to a component, but the major components in an ML Platform are as follows:
1 Data platform
2 Data processing
3 Continuous integration / continuous deployment / continuous training
4 Model serving
5 Performance monitoring
These are the components we will find in any ML Platform, but what’s special about ML Platform in Retail?

Consideration for data …

3 months, 3 weeks ago @ neptune.ai
How to Build ETL Data Pipeline in ML

Data pipeline is an umbrella term for the category of moving data between different systems, and an ETL data pipeline is a type of data pipeline.

It is common to use ETL data pipeline and data pipeline interchangeably.

Comparison of ETL pipeline and data pipeline, terminology: As the abbreviation suggests, ETL involves a series of processes: extracting the data, transforming it and, at the end, loading it to the target source.

While a data pipeline can include various types of pipelines, ETL pipeline is one specific subset of a data pipeline.

ETL pipeline tools: To create an ETL pipeline, as discussed in the last section, we require tools, tools that can provide us the functionality of following b…

4 months, 1 week ago @ neptune.ai
How to Save Trained Model in Python

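    import pickle
    from sklearn.metrics import classification_report

    # Load the pickled model from disk, then evaluate it on the test set.
    with open(model_pkl_file, 'rb') as file:
        model = pickle.load(file)
    y_predict = model.predict(X_test)
    print(classification_report(y_test, y_predict))

Once loaded you can use this model to make predictions.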

Saving trained model with JSONWhen you want to have full control over the save and restore procedure of your ML model, JSON comes into play.
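A minimal sketch of the JSON approach, assuming a hand-written linear model whose only state is a weight list and a bias. The `LinearModel` class and the `to_json`/`from_json` helpers are hypothetical names for illustration and are not taken from the article.

```python
import json
from typing import List


class LinearModel:
    """Toy model: prediction is a weighted sum of the inputs plus a bias."""

    def __init__(self, weights: List[float], bias: float) -> None:
        self.weights = weights
        self.bias = bias

    def predict(self, x: List[float]) -> float:
        return sum(w * v for w, v in zip(self.weights, x)) + self.bias


def to_json(model: LinearModel) -> str:
    """Serialize only the fields we choose to persist."""
    return json.dumps({"weights": model.weights, "bias": model.bias})


def from_json(text: str) -> LinearModel:
    """Rebuild the model from its JSON representation."""
    data = json.loads(text)
    return LinearModel(data["weights"], data["bias"])


model = LinearModel(weights=[0.5, -1.25], bias=2.0)
saved = to_json(model)            # could equally be written to a file
restored = from_json(saved)
print(saved, restored.predict([1.0, 2.0]))
```

Unlike pickle, this stores plain data rather than Python objects, so the file is human-readable and loading it does not execute arbitrary code. The cost is that each model class needs its own save and restore logic.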

To save a deep learning model in TensorFlow Keras, you can use the save() method of the Keras Model object.

ONNX is primarily designed for deep learning models and may not be suitable for other types of machine learning models.

Save Both Model Architecture and Weights: In the case of DL-based models, if you save only model weight but not architecture,…

4 months, 2 weeks ago @ neptune.ai
How to Build an End-To-End ML Pipeline

Learn the essential steps and best practices machine learning engineers can follow to build robust, scalable, end-to-end machine learning pipelines.

Machine learning pipeline vs machine learning platform: The ML pipeline is part of the broader ML platform.

Machine Learning pipeline tools: The following are examples of machine learning pipeline orchestration tools and platforms: 1 Metaflow.

DEMO: End-to-end ML pipeline example: In this example, you will build an ML pipeline with Kubeflow Pipelines based on the infamous Titanic ML competition on Kaggl…

4 months, 2 weeks ago @ neptune.ai
▶️ YouTube
Yannic Kilcher
last post 1 week, 5 days ago
Retentive Network: A Successor to Transformer for Large Language Models (Paper Explained)

#ai #retnet #transformers Retention is an alternative to Attention in Transformers that can be written in both a parallel and a recurrent fashion. This means the architecture achieves training parallelism while maintaining low-cost inference. Experiments in the paper look very promising. OUTLINE:

0:00 - Intro

2:40 - The impossible triangle

6:55 - Parallel vs sequential

15:35 - Retention mechanism

21:00 - Chunkwise and multi-scale retention

24:10 - Comparison to other architectures

26:30 - Experimental evaluation Paper: https://arxiv.org/abs/2307.08621 Abstract:

In this work, we propose Retentive Network (RetNet) as a foundation architecture for large language models, simultaneously achie…
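The claim that retention can be written in both a parallel and a recurrent form can be checked numerically. The sketch below is a simplified single-head version, assuming a fixed decay `gamma` and omitting the paper's position rotation, scaling, normalization and multi-scale heads. The matrices and the decay value are made up for illustration.

```python
from typing import List

Matrix = List[List[float]]


def retention_parallel(Q: Matrix, K: Matrix, V: Matrix, gamma: float) -> Matrix:
    """Decay-masked form: o_n = sum over m <= n of gamma^(n-m) * (q_n . k_m) * v_m."""
    out = []
    for n in range(len(Q)):
        row = [0.0] * len(V[0])
        for m in range(n + 1):  # causal mask: only positions up to n contribute
            score = sum(a * b for a, b in zip(Q[n], K[m])) * gamma ** (n - m)
            row = [r + score * v for r, v in zip(row, V[m])]
        out.append(row)
    return out


def retention_recurrent(Q: Matrix, K: Matrix, V: Matrix, gamma: float) -> Matrix:
    """Recurrent form: S_n = gamma * S_{n-1} + k_n^T v_n, then o_n = q_n S_n."""
    d_k, d_v = len(K[0]), len(V[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    out = []
    for q, k, v in zip(Q, K, V):
        state = [[gamma * state[a][c] + k[a] * v[c] for c in range(d_v)]
                 for a in range(d_k)]
        out.append([sum(q[a] * state[a][c] for a in range(d_k)) for c in range(d_v)])
    return out


# Hypothetical queries, keys and values for a sequence of three tokens.
Q = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
K = [[0.5, 1.0], [1.0, 0.0], [0.2, 0.3]]
V = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]

parallel_out = retention_parallel(Q, K, V, gamma=0.9)
recurrent_out = retention_recurrent(Q, K, V, gamma=0.9)
print(parallel_out)
print(recurrent_out)
```

The first form depends only on the inputs, so all positions can be computed at once during training. The second carries a fixed-size state from token to token, which keeps per-token inference cost constant.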

1 week, 5 days ago @ youtube.com
Reinforced Self-Training (ReST) for Language Modeling (Paper Explained)

#ai #rlhf #llm ReST uses a bootstrap-like method to produce its own extended dataset and trains on ever higher-quality subsets of it to improve its own reward. The method allows for re-using the same generated data multiple times and thus has an efficiency advantage with respect to Online RL techniques like PPO. Paper: https://arxiv.org/abs/2308.08998 Abstract:

Reinforcement learning from human feedback (RLHF) can improve the quality of large language model's (LLM) outputs by aligning them with human preferences. We propose a simple algorithm for aligning LLMs with human preferences inspired by growing batch reinforcement learning (RL), which we call Reinforced Self-Training (ReST). Given an…

3 weeks ago @ youtube.com
[ML News] LLaMA2 Released | LLMs for Robots | Multimodality on the Rise

#mlnews #llama2 #openai Your regular irregular update on the world of Machine Learning. References:

https://twitter.com/ylecun/status/1681336284453781505

https://ai.meta.com/llama/

https://about.fb.com/news/2023/07/llama-2-statement-of-support/

https://247wallst.com/special-report/2023/08/12/this-is-the-biggest-social-media-platform-ranking-the-worlds-largest-networking-sites/4/

https://github.com/Alpha-VLLM/LLaMA2-Accessory

https://together.ai/blog/llama-2-7b-32k?s=09&utm_source=pocket_saves

https://github.com/imoneoi/openchat

https://twitter.com/lmsysorg/status/1686794639469371393?s=09&t=sS3awkbavmSMSmwp64Ef4A&utm_source=pocket_saves

https://huggingface.co/lmsys/vicuna-13b-v1.5-16k

https:…

1 month, 1 week ago @ youtube.com
How Cyber Criminals Are Using ChatGPT (w/ Sergey Shykevich)

#cybercrime #chatgpt #security An interview with Sergey Shykevich, Threat Intelligence Group Manager at Check Point, about how models like ChatGPT have impacted the realm of cyber crime. https://threatmap.checkpoint.com/ Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribesta…

1 month, 1 week ago @ youtube.com
Recipe AI suggests FATAL CHLORINE GAS Recipe

#llm #safety #gpt4 A prime example of intellectual dishonesty of journalists and AI critics. Article: https://gizmodo.com/paknsave-ai-savey-recipe-bot-chlorine-gas-1850725057

My Recipe AI: https://github.com/yk/recipe-ai

1 month, 1 week ago @ youtube.com
DeepFloyd IF - Pixel-Based Text-to-Image Diffusion (w/ Authors)

#ai #diffusion #stabilityai An interview with DeepFloyd members Misha Konstantinov and Daria Bakshandaeva on the release of the model IF, an open-source model following Google's implementation of Imagen. References:

https://www.deepfloyd.ai/deepfloyd-if

https://huggingface.co/DeepFloyd

https://twitter.com/_gugutse_

https://twitter.com/_bra_ket Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me fina…

1 month, 1 week ago @ youtube.com
[ML News] GPT-4 solves MIT Exam with 100% ACCURACY | OpenLLaMA 13B released

#gpt4 #mit #ai A new paper claims to use GPT-4 to solve 100% of a set of MIT university exercises. Some people are skeptical, and their investigations reveal more than one problem with this paper... OUTLINE:

0:00 - ChatGPT gives out Windows 10 keys

0:30 - MIT exam paper

2:50 - Prompt engineering

5:30 - Automatic grading

6:45 - Response by other MIT students

8:30 - Unsolvable questions

10:50 - Duplicates

13:30 - Cascading the heuristics

22:40 - Other problems

29:25 - OpenLLaMA 13B published References:

https://twitter.com/immasiddtweets/status/1669721470006857729/photo/1

https://arxiv.org/abs/2306.08997

https://arxiv.org/pdf/2306.08997.pdf

https://flower-nutria-41d.notion.site/No-GPT4-can-t-ace…

3 months ago @ youtube.com
Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust (Explained)

#stablediffusion #ai #watermark Watermarking the outputs of generative models is usually done as a post-processing step on the model outputs. Tree-Ring Watermarks are applied in the latent space at the beginning of a diffusion process, which makes them nearly undetectable, robust to strong distortions, and only recoverable by the model author. It is a very promising technique with applications potentially beyond watermarking itself. OUTLINE:

0:00 - Introduction & Overview

1:30 - Why Watermarking?

4:20 - Diffusion Models Recap

13:40 - Inverting Diffusion Models

17:05 - Tree-Ring Watermarking

26:15 - Effects of Tree-Ring Watermarks

30:00 - Experimental Results

32:40 - Limitations

34:40 - Conc…

3 months, 2 weeks ago @ youtube.com
RWKV: Reinventing RNNs for the Transformer Era (Paper Explained)

#gpt4 #rwkv #transformer We take a look at RWKV, a highly scalable architecture between Transformers and RNNs. Fully Connected (June 7th in SF) Promo Link: https://www.fullyconnected.com/?promo=ynnc OUTLINE:

0:00 - Introduction

1:50 - Fully Connected In-Person Conference in SF June 7th

3:00 - Transformers vs RNNs

8:00 - RWKV: Best of both worlds

12:30 - LSTMs

17:15 - Evolution of RWKV's Linear Attention

30:40 - RWKV's Layer Structure

49:15 - Time-Parallel vs Sequence Mode

53:55 - Experimental Results & Limitations

58:00 - Visualizations

1:01:40 - Conclusion Paper: https://arxiv.org/abs/2305.13048

Code: https://github.com/BlinkDL/RWKV-LM Abstract:

Transformers have revolutionized almost all …

3 months, 3 weeks ago @ youtube.com
Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Full Paper Review)

#gpt4 #ai #prompt Tree-of-Thought improves prompting of large language models (LLMs) by generalizing the concept of Chain-of-Thought prompting and introduces a tree search across language model thoughts, including state evaluation and backtracking. Experiments on toy tasks show large improvements over both classic and Chain-of-Thought prompting. OUTLINE:

0:00 - Introduction

1:20 - From Chain-of-Thought to Tree-of-Thought

11:10 - Formalizing the algorithm

16:00 - Game of 24 & Creative writing

18:30 - Crosswords

23:30 - Is this a general problem solver?

26:50 - Ablation studies

28:55 - Conclusion Paper: https://arxiv.org/abs/2305.10601 Abstract:

Language models are increasingly being deployed…

4 months ago @ youtube.com
OpenAI suggests AI licenses (US Senate hearing on AI regulation w/ Sam Altman)

#ai #openai #gpt4 US Senate hearing on AI regulation. MLST video on the hearing: https://www.youtube.com/watch?v=DeSXnESGxr4 Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https://www.subscribestar.com/yannickilcher

Patreon: https://www.patreon.com/yannickilcher

Bitcoin (BTC): bc1q49lsw3q325…

4 months ago @ youtube.com
[ML News] Geoff Hinton leaves Google | Google has NO MOAT | OpenAI down half a billion

#google #openai #mlnews Updates from the world of Machine Learning and AI

Great AI memes here: https://twitter.com/untitled01ipynb OUTLINE:

0:00 - Google I/O 2023: Generative AI in everything

0:20 - Anthropic announces 100k tokens context

0:35 - Intro

1:20 - Geoff Hinton leaves Google

7:00 - Google memo leaked: we have no moat

11:30 - OpenAI loses 540M

12:30 - Google AI: Product first

15:50 - Ilya Sutskever on safety vs competition

18:00 - AI works cannot be copyrighted

19:40 - OpenAI tries to trademark GPT

20:30 - StarCoder: accessible code model

21:40 - RedPajama & OpenLlama

22:55 - Mosaic 7B model

23:50 - YoloNAS

24:10 - Mojo programming language

25:30 - Random helpful things

37:40 - Dee…

4 months, 2 weeks ago @ youtube.com
Scaling Transformer to 1M tokens and beyond with RMT (Paper Explained)

#ai #transformer #gpt4 This paper promises to scale transformers to 1 million tokens and beyond. We take a look at the technique behind it, the Recurrent Memory Transformer, and at its strengths and weaknesses. OUTLINE:

0:00 - Intro

2:15 - Transformers on long sequences

4:30 - Tasks considered

8:00 - Recurrent Memory Transformer

19:40 - Experiments on scaling and attention maps

24:00 - Conclusion Paper: https://arxiv.org/abs/2304.11062 Abstract:

This technical report presents the application of a recurrent memory to extend the context length of BERT, one of the most effective Transformer-based models in natural language processing. By leveraging the Recurrent Memory Transformer archit…

5 months ago @ youtube.com
OpenAssistant RELEASED! The world's best open-source Chat AI!

#openassistant #chatgpt #mlnews Try the chat: https://open-assistant.io/chat

Homepage: https://open-assistant.io Dataset: https://huggingface.co/datasets/OpenAssistant/oasst1

Code: https://github.com/LAION-AI/Open-Assistant

Paper (temporary): https://ykilcher.com/oa-paper Links:

Homepage: https://ykilcher.com

Merch: https://ykilcher.com/merch

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://ykilcher.com/discord

LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have aske…

5 months, 1 week ago @ youtube.com
AI Alignment Livestream (aka OpenAssistant "Just Chatting")

https://open-assistant.io/chat Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yannic-kilcher

Minds: https://www.minds.com/ykilcher

Parler: https://parler.com/profile/YannicKilcher

LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

BiliBili: https://space.bilibili.com/1824646584 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):

SubscribeStar: https…

5 months, 2 weeks ago @ youtube.com
Henry AI Labs
last post None
3blue1brown
last post 3 weeks, 2 days ago
The origin of light, scattering, and polarization | Barber pole, part 2

Explaining the barber pole effect from the last video: https://youtu.be/QCX62YJCmGk

Help fund future projects: https://www.patreon.com/3blue1brown

An equally valuable form of support is to simply share the videos. Timestamps:

0:00 - Recap

0:44 - The radiation law

6:10 - Simulating the radiation law

11:11 - Why the diagonal stripes?

16:31 - Why does it twist? ------------------ These animations are largely made using a custom python library, manim. See the FAQ comments here:

https://www.3blue1brown.com/faq#manim

https://github.com/3b1b/manim

https://github.com/ManimCommunity/manim/ You can find code for specific videos and projects here:

https://github.com/3b1b/videos/ Music by Vincent Rubin…

3 weeks, 2 days ago @ youtube.com
This demo tests your understanding of light | Barber pole, part 1

Next video: https://youtu.be/aXRTczANuIs

Steve Mould's video on the topic: https://youtu.be/975r9a7FMqc

Help fund future projects: https://www.patreon.com/3blue1brown

An equally valuable form of support is to simply share the videos. Thanks to Quinn Brodsky for setting up the demo and to the MIT Physics Instructional Resources Lab for their help and materials, especially Josh Wolfe and Caleb Bonyun. ------------------ These animations are largely made using a custom python library, manim. See the FAQ comments here:

https://www.3blue1brown.com/faq#manim

https://github.com/3b1b/manim

https://github.com/ManimCommunity/manim/ You can find code for specific videos and projects here:

https://gith…

3 weeks, 2 days ago @ youtube.com
Why is the "central limit" a normal distribution?

A visual trick to compute the sum of two normally-distributed variables.

3b1b mailing list: https://3blue1brown.substack.com/

Help fund future projects: https://www.patreon.com/3blue1brown

Special thanks to these supporters: https://www.3blue1brown.com/lessons/gaussian-convolution#thanks For the technically curious who want to go deeper, here's a proof of the central limit theorem using Moment generating functions:

https://www.cs.toronto.edu/~yuvalf/CLT.pdf And here's a nice discussion of methods using entropy:

https://mathoverflow.net/questions/182752/central-limit-theorem-via-maximal-entropy Relevant previous videos Central limit theorem

https://youtu.be/zeJD6dqJ5lo Why π is there, and th…

2 months, 2 weeks ago @ youtube.com
The absurd circle division pattern (updated) | Moser's circle problem

An apparent pattern that breaks, and the reason behind it.

Summer of math exposition: https://3blue1brown.substack.com/p/some3-begins

Learn more at https://some.3b1b.co/

Help fund future projects: https://www.patreon.com/3blue1brown

An equally valuable form of support is to simply share the videos. For the long-time viewers among you, if this sounds familiar, it's because it's a remake of one of the earliest videos on the channel. It's such a wonderful problem, and the audio/pacing in earlier videos was really suboptimal, so I wanted to freshen it up a little here. Timestamps

0:00 - The pattern

2:20 - Counting chords

4:03 - Counting intersection points

6:20 - Euler's characteristic formula

2 months, 3 weeks ago @ youtube.com
How They Fool Ya (live) | Math parody of Hallelujah

Happy Tau Day! Here's something a bit out of the ordinary for you.

Thanks to Matt Parker, @standupmaths, for the invite.

More about the event: https://festivalofthespokennerd.com/show/an-evening-of-unnecessary-detail/ Thanks to Tim Blais, @acapellascience, for helpful thoughts on the song, including the key phrase "How they fool ya" Video about the circle pattern

https://youtu.be/YtkIWDE36qU Video about those integrals

https://youtu.be/851U557j6HE Video about the primes in base 4

https://youtu.be/jhObLT1Lrfo Timestamps:

0:00 - Intro

1:20 - Song ------------------ 3blue1brown is a channel about animating math, in all senses of the word animate. Website: https://www.3blue1brown.com

Mailing li…

2 months, 4 weeks ago @ youtube.com
Convolutions | Why X+Y in probability is a beautiful mess

Adding random variables, with connections to the central limit theorem.

Help fund future projects: https://www.patreon.com/3blue1brown

An equally valuable form of support is to simply share the videos. 0:00 - Intro quiz

2:24 - Discrete case, diagonal slices

6:49 - Discrete case, flip-and-slide

8:41 - The discrete formula

10:58 - Continuous case, flip-and-slide

15:53 - Example with uniform distributions

18:42 - Central limit theorem

20:50 - Continuous case, diagonal slices

25:26 - Returning to the intro quiz ------------------ These animations are largely made using a custom python library, manim. See the FAQ comments here:

https://www.3blue1brown.com/faq#manim

https://github.com/3b1b/manim

2 months, 4 weeks ago @ youtube.com
Why π is in the normal distribution (beyond integral tricks)

Where's the circle? And how does it relate to where e^(-x^2) comes from?

Help fund future projects: https://www.patreon.com/3blue1brown

Special thanks to these supporters: https://www.3blue1brown.com/lessons/gaussian-integral#thanks

An equally valuable form of support is to simply share the videos. The artwork in this video is by Kurt Bruns, aided by Midjourney Here are several other good posts about the classic Poisson proof vcubingx: https://www.youtube.com/watch?v=9CgOthUUdw4

BriTheMathGuy: https://www.youtube.com/watch?v=S79KPrIm_Gc

Dr. Alter's math library: https://idan-alter.github.io/2023/02/20/Gaussian-Integral.html And if you'd like to see many other variations on approaching this …

5 months, 3 weeks ago @ youtube.com
But what is the Central Limit Theorem?

A visual introduction to probability's most important theorem

Help fund future projects: https://www.patreon.com/3blue1brown

Special thanks to these lovely supporters: https://www.3blue1brown.com/lessons/clt#thanks

An equally valuable form of support is to simply share the videos. Galton board shown in the video: https://amzn.to/3ZJK8nY ----------------- Timestamps

0:00 - Introduction

1:53 - A simplified Galton Board

4:14 - The general idea

6:15 - Dice simulations

8:55 - The true distributions for sums

11:41 - Mean, variance, and standard deviation

15:54 - Unpacking the Gaussian formula

20:47 - The more elegant formulation

25:01 - A concrete example

27:10 - Sample means

28:10 - Underlying a…

6 months, 2 weeks ago @ youtube.com
Two Minute Papers
last post 2 days, 10 hours ago
OpenAI's DALL-E 3 - The King Is Back!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers The DALL-E 3 announcement is available here:

https://openai.com/dall-e-3 Their video is available here (thanks for the Scholarly representation!):

https://www.youtube.com/watch?v=sqQrN0iZBs0 My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan…

2 days, 10 hours ago @ youtube.com
NVIDIA’s DLSS 3.5: This Should Be Impossible!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/2mp 📝 My paper on neural rendering:

https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/ My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 Digital trends on DLSS 3: https://www.digitaltrends.com/computing/why-i-leave-dlss-3-off-in-games/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizze…

5 days, 7 hours ago @ youtube.com
OpenAI’s ChatGPT Nails 150+ Difficult Tasks!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Reflexion: Language Agents with Verbal Reinforcement Learning" is available here:

https://arxiv.org/abs/2303.11366 Video editing: https://twitter.com/goodside/status/1652540643212767234

Music taste: https://twitter.com/SHL0MS/status/1652842277788692480/ My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, …

1 week, 2 days ago @ youtube.com
AI Reads Minds of 29 Patients!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Music can be reconstructed from human auditory cortex activity using nonlinear decoding models" is available here:

https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3002176 Listen: https://news.berkeley.edu/2023/08/15/releases-20230811 My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov…

2 weeks ago @ youtube.com
Wow, NVIDIA’s Rendering, But 10X Faster!

❤️ Check out Fully Connected by Weights & Biases: https://wandb.me/papers 📝 The paper "3D Gaussian Splatting for Real-Time Radiance Field Rendering" is available here:

https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/ Unofficial implementation: https://jatentaki.github.io/portfolio/gaussian-splatting/ Community showcase links:

https://twitter.com/alexcarliera/status/1693755522636206494

https://www.linkedin.com/posts/divesh-naidoo-48809934_vfx-nerf-3d-activity-7099439156354781185-Dswa

https://www.linkedin.com/posts/huguesbruyere_gaussiansplatting-3d-ml-activity-7098993947112300544-qjCi

https://twitter.com/jonstephens85/status/1692281505526235373?t=v1hiWNMGBSUhcUDCkCihLg&s=19

https://…

3 weeks, 1 day ago @ youtube.com
1,000,000,000 Parameter Super Resolution AI!

❤️ Check out Weights & Biases and say hi in their community forum here: https://wandb.me/paperforum 📝 The paper "GigaGAN: Scaling up GANs for Text-to-Image Synthesis" is available here:

https://mingukkang.github.io/GigaGAN/ My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan Learn, B Shang, Christian Ahlin, Geronimo Moralez, Gor…

3 weeks, 4 days ago @ youtube.com
DeepMind-Like Gaming AI: Incredible Driving Skills!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The papers "MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning" and "The Power Particle-In-Cell Method" are available here:

https://sites.google.com/view/maestro-ued

https://www.seas.upenn.edu/~ziyinq/static/files/power.pdf

https://dl.acm.org/doi/abs/10.1145/3528223.3530066 My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Pa…

4 weeks ago @ youtube.com
AI Mind Reading Experiment!

❤️ Check out Weights & Biases and take their great course for free: https://wandb.me/papercourse 📝 The mind video paper "Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity" is available here:

https://mind-video.com/ My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan Learn, B Shang, Christian Ahlin, Gero…

1 month ago @ youtube.com
New AI Beats DeepMind’s AlphaGo Variants 97% Of The Time!

❤️ Check out Weights & Biases and say hi in their community forum here: https://wandb.me/paperforum 📝 The papers in this episode are available here:

https://goattack.far.ai/

https://arxiv.org/abs/2211.00241

https://arxiv.org/abs/2307.03798 My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan Learn, B Shang, Christian Ahlin, Geron…

1 month ago @ youtube.com
NVIDIA Omniverse: Virtual Worlds Come Alive!

❤️ Check out Weights & Biases and take their great course for free: https://wandb.me/papercourse 📝 My paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan Learn, B Shang, Christian Ahlin, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Kenneth Davis, Klaus Busse, Kyle Davis, Lukas Biewald, Martin, Matthew Valle, Michael …

1 month, 1 week ago @ youtube.com
NVIDIA’s New AI Is Gaming With Style!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers 📝 The paper "Synthesizing Physical Character-Scene Interactions" is available here:

https://dl.acm.org/doi/abs/10.1145/3588432.3591525

https://research.nvidia.com/publication/2023-08_synthesizing-physical-character-scene-interactions My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew …

1 month, 1 week ago @ youtube.com
Stable Diffusion XL Is Here!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers Try it here online: https://clipdrop.co/stable-diffusion

How to run it: https://stable-diffusion-art.com/sdxl-model/

Run on all platforms: https://github.com/AUTOMATIC1111/stable-diffusion-webui

Alternative, modular solution: https://github.com/comfyanonymous/ComfyUI

Great Mac app: https://drawthings.ai/ Danielle Baskin: https://twitter.com/djbaskin/status/1514735924826963981 SDXL Doré drawings: https://www.reddit.com/r/StableDiffusion/comments/155bwgz/gustave_dor%C3%A9_drawings_sdxl/

Orcton: https://twitter.com/OrctonAI/status/1684344552654610434 My latest paper on simulations that look almost like real…

1 month, 2 weeks ago @ youtube.com
NVIDIA’s New AI Trained For 5,000,000,000 Steps!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "CALM: Conditional Adversarial Latent Models for Directable Virtual Characters" is available here:

https://research.nvidia.com/labs/par/calm/ My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan Learn, B Shang, Christian Ahlin, Ger…

1 month, 2 weeks ago @ youtube.com
Microsoft’s AI Watched 100,000,000 Youtube Videos!

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers 📝 The paper "CoDi: Any-to-Any Generation via Composable Diffusion" is available here:

https://codi-gen.github.io/ 📝 The paper "Shortest Path to Boundary for Self-Intersecting Meshes" is available here:

https://arxiv.org/abs/2305.09778

https://www.youtube.com/watch?v=qRBHY9ntwbU 📝 Sound synthesis paper:

https://www.cs.cornell.edu/projects/Sound/mc/ My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.com/articles/s41567-022-01788-5 🙏 We would like to thank our ge…

1 month, 2 weeks ago @ youtube.com
NVIDIA's New AI: Text To Image Supercharged!

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers 📝 The paper "Key-Locked Rank One Editing for Text-to-Image Personalization" is available here:

https://research.nvidia.com/labs/par/Perfusion/ I made a remark in the video that I did not word very well that I had to cut out - if you see a sudden jump somewhere that you did not expect, that is it. It's not the cleanest, but it is there to make sure everything is as clear and accurate as possible. My apologies! My latest paper on simulations that look almost like reality is available for free here:

https://rdcu.be/cWPfD Or this is the orig. Nature Physics link with clickable citations:

https://www.nature.c…

1 month, 3 weeks ago @ youtube.com
DataFest Video
last post None
JetBrains Research Seminars
last post 4 months, 1 week ago
Fault detection and diagnosis using graph neural networks

Seminar announcements are available at: https://t.me/piterai Speaker: Alexander Kovalenko, junior research fellow at the AIRI Artificial Intelligence Research Institute. The talk covers fault detection and diagnosis in technological processes and industrial equipment. It analyzes the specifics of applying graph neural networks to data from industrial facilities and describes a way to obtain the graph structure using learnable adjacency matrices.

4 months, 1 week ago @ youtube.com
Attention mechanism in models based on decision trees

Seminar announcements are available at: https://t.me/piterai Speaker: Andrei Konstantinov, PhD student at the Higher School of Artificial Intelligence, Peter the Great St. Petersburg Polytechnic University. The talk covers the application of widely used attention models, originally developed for natural language and sequence processing, to classification, regression, and survival analysis tasks with small training sets. The models considered are various combinations of the attention mechanism with decision tree ensembles, and the training problems reduce both to neural network optimization and to convex programming.

5 months ago @ youtube.com
Automated time series analysis (with a focus on classification and anomaly detection)

Announcements of upcoming seminars are in the association's channel: https://t.me/piterai Link to Fedot.Industrial: https://github.com/aimclub/Fedot.Industrial Speaker: Ilya Revin, researcher at the AI research center "Strong Artificial Intelligence in Industry" and researcher at ITMO's Composite AI Laboratory. Abstract: Automated machine learning can make time series analysis more efficient by combining different feature extraction methods and machine learning models to build an optimal model for a specific task. At present, current solutions to this problem have 2 significant …

5 months, 3 weeks ago @ youtube.com
Solving the optimal transport problem with neural networks

Announcements of upcoming seminars are in the association's channel: https://t.me/piterai Speaker: Evgeny Burnaev, Doctor of Physical and Mathematical Sciences, professor, head of the Skoltech Applied AI Center, leading researcher at AIRI Abstract:

Solving optimal transport (OT) problems with neural networks has become widespread in machine learning. Most existing methods compute the OT cost and use it as a loss function for tuning the generator in generative models (Wasserstein GANs). The presentation covers a completely different, recently emerged direction: methods that compute the OT map and use it as a generative model. Recent results show…

6 months ago @ youtube.com
Yandex. Computer Science
latest post 1 month ago
03. Discussion "The near future of diffusion models"

Participants:

- Petr Ermakov, ML Brand Director, Yandex

- Ivan Barabanov, Developer, VKontakte, deep vk

- Valentin Khrulkov, Lead Researcher, Yandex Research

- Denis Dimitrov, Executive Director for Data Science at Sber AI, scientific advisor at AIRI

Together with experts in diffusion image models, we discuss where the field is heading. We talk about the current state of affairs and the technical barriers the field faces today.

1 month ago @ youtube.com
02. Practical aspects of training large-scale diffusion models - Valentin Khrulkov

Speaker: Valentin Khrulkov, Lead Researcher, Yandex Research. We cover all stages of training foundational diffusion models, from dataset preparation to regular quality measurements during training. We discuss scaling law experiments and their preliminary results. We also discuss various aspects of applying these models in practice: generating advertising banners, personalization, and the Shedevrum service and social network.

1 month ago @ youtube.com
01. Kandinsky 2 X - Andrey Kuznetsov

Speaker: Andrey Kuznetsov, Executive Director for Data Science, Sber AI. We review the theory and the available information on the diffusion process. I show in detail how the Kandinsky 2.1 architecture differs from 2.2. We discuss metrics for evaluating generation quality and talk about the key results of the releases. Together we look at the business scenarios and practical applications where our model can be used.

1 month ago @ youtube.com
ML Party 03.08.2023

Welcome to the evening meetup for ML engineers from Yandex. This time we talk about the currently fashionable generative models, specifically diffusion image models! Program:

00:00:00 – countdown timer before the broadcast

00:02:12 – opening of the conference

00:04:52 – Kandinsky 2.X

Andrey Kuznetsov, Executive Director for Data Science, Sber AI.

00:46:55 – break

01:03:40 – Practical aspects of training large-scale diffusion models. Valentin Khrulkov, Lead Researcher, Yandex Research.

01:49:34 – Discussion "The near future of diffusion models"

Join our Telegram community to keep up with all Yandex events and activities: https://t.me/yadatadojo. Question…

1 month, 3 weeks ago @ youtube.com
Bias-variance tradeoff in data analysis and real life. Maxim Nikolaev, School of Data Analysis

Lecture "Bias-variance tradeoff. Why we like asking professionals for advice" at the Open Day of the School of Data Analysis in St. Petersburg, April 22, 2023. Lecturer: Maxim Nikolaev, coordinator of Yandex's "Data Science" partner program at the Faculty of Mathematics and Computer Science of St. Petersburg State University, lecturer in statistics and ML. Anyone interested in data analysis and machine learning runs into model overfitting and underfitting. In this lecture we work out how to describe these phenomena in terms of the bias and variance of a prediction, learn how these notions relate to a model's ability to memorize the training set, and see how to improve prediction quality by skillfully limiting the expressiveness of the mod…

4 months, 3 weeks ago @ youtube.com
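
The bias and variance view of prediction error that the lecture description mentions can be illustrated with a small simulation. This is a minimal sketch and not material from the lecture: the shrunk sample-mean estimator, the function name `bias_variance_of_shrunk_mean`, and all parameter values are assumptions chosen purely for illustration.

```python
import random
import statistics


def bias_variance_of_shrunk_mean(shrink, true_mean=2.0, sigma=3.0,
                                 n=5, trials=5000, seed=0):
    """Estimate bias, variance, and MSE of the estimator shrink * sample_mean.

    All defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        # Draw a small noisy sample and apply the (possibly shrunk) estimator.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        estimates.append(shrink * statistics.fmean(sample))
    bias = statistics.fmean(estimates) - true_mean
    variance = statistics.pvariance(estimates)
    mse = statistics.fmean([(e - true_mean) ** 2 for e in estimates])
    return bias, variance, mse


if __name__ == "__main__":
    for shrink in (1.0, 0.5):
        bias, variance, mse = bias_variance_of_shrunk_mean(shrink)
        # The empirical MSE equals bias**2 + variance up to rounding.
        print(f"shrink={shrink}: bias={bias:.3f} "
              f"variance={variance:.3f} mse={mse:.3f}")
```

Under these assumed settings, shrinking the estimate toward zero adds bias but reduces variance, and the mean squared error is the sum of squared bias and variance. Restricting an estimator in this way is a toy analogue of limiting a model's expressiveness.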
Data Dojo — ML training, April 22, 2023

Data Dojo is an event where data analysis specialists meet and take part in machine learning trainings. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос so that the host reads them out live. Program: https://events.yandex.ru/events/datadojo-22-04-2023

00:00:00 - intro screen before the broadcast starts

00:02:23 - Overview of active ML competitions (May-July)

00:20:38 - Kaggle solution walkthrough: Learning Equality - Curriculum Recommendations

01:46:44 - Walkthrough of the top-2 solution at the all-Russian championship from ЦП

02:12:58 - Kaggle solution walkthrough: IceCube - Neutrinos in Deep Ice

5 months ago @ youtube.com
ML Party Yerevan — March 2, 2023

0:00 start

11:48 How Quick Answers work in Yandex Search - Anton Ivanov

1:16:53 Building an image embedding model for searching clothes with given attributes - Alexander Shishenya

2:19:17 DSSM for recommending unfamiliar artists in Yandex Music - Alexander Safronov

ML Party is a series of regular meetups about applying machine learning in IT. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос so that the host reads them out live.

6 months, 3 weeks ago @ youtube.com
Live stream by user Computer Science
6 months, 3 weeks ago @ youtube.com
Data Dojo — ML training, February 16, 2023

Data Dojo is a series of machine learning trainings and a meeting place for data analysis specialists. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос so that the host reads them out live.

0:00:00 — Start of the broadcast

0:13:52 — Overview talk on active ML competitions in 2023 / Petr Ermakov

0:39:10 — Walkthrough of Sber's AI olympiads for school students and the integration of their solutions into the bank's ecosystem / Maxim Kuznetsov and Vladimir Vorobyov

1:28:57 — Diagnosing cervical spine fractures on CT scans. Walkthrough of a solution from Kaggle RSNA 2022 / Selim Seferbekov

7 months, 1 week ago @ youtube.com
Data Dojo — New Year ML training, December 24, 2022

Data Dojo is a series of machine learning trainings and a meeting place for data analysis specialists. Ask the speakers questions in the Telegram chat (https://t.me/+OsKnLNG-7DE1ZTFi) with the hashtag #вопрос so that the host reads them out live.

0:00:00 — Start of the broadcast

0:00:55 — ML competitions 2022. Year in review / Petr Ermakov

0:12:17 — Predicting a track's artist from a set of acoustic features. Walkthrough of a solution from Yandex Cup 2022 / Vladimir Fomenko

0:37:52 — What was, what will be, and what will calm the heart: on analyzing the news feed of the past, present, and future / Elizaveta Pushkareva and Georgy Surkov

2:02:12 — The road to Kaggle Competitions Master at 17 / Vadim Timakin

2:31:10 — При…

9 months ago @ youtube.com
ML Trainings
latest post 1 month, 3 weeks ago
Nikolay Butakov - SLAMA: the finer points of scaling a distributed AutoML solution on Spark

SLAMA: the finer points of scaling a distributed AutoML solution on Spark, and performance optimization. Data Fest 2023:

https://ods.ai/events/datafestonline2023

Track "Open Source": https://ods.ai/tracks/df23-open-sourse Our social media:

Telegram: https://t.me/datafest

VKontakte: https://vk.com/datafest

1 month, 3 weeks ago @ youtube.com
Viktor Kotezhekov - Using AI in hydrodynamic modeling of oil and gas fields

Data Fest 2023:

https://ods.ai/events/datafestonline2023

Track "ML in Manufacturing": https://ods.ai/tracks/df23-ml_in_manufacturing

1 month, 3 weeks ago @ youtube.com
Alexander Khvatov - Physics-informed models from a machine learning perspective

Data Fest 2023:

https://ods.ai/events/datafestonline2023

Track "Random DS/ML": https://ods.ai/tracks/df23-random_ds-ml

1 month, 3 weeks ago @ youtube.com
Sergey Tsimfer - Seismic data through a ResNet18 lens

Data Fest 2023:

https://ods.ai/events/datafestonline2023

Track "ML in Manufacturing": https://ods.ai/tracks/df23-ml_in_manufacturing

1 month, 3 weeks ago @ youtube.com
Irina Deeva - BAMT, a framework for probabilistic modeling based on Bayesian networks

Data Fest 2023:

https://ods.ai/events/datafestonline2023

Track "Open Source": https://ods.ai/tracks/df23-open-sourse

1 month, 3 weeks ago @ youtube.com
Alexey Osipov - Modeling the behavior of an oil slick

Data Fest 2023:

https://ods.ai/events/datafestonline2023

Track "ML in Manufacturing": https://ods.ai/tracks/df23-ml_in_manufacturing

1 month, 3 weeks ago @ youtube.com
Ivan Bragin - Position bias in recommender systems

Data Fest 2023:

https://ods.ai/events/datafestonline2023

Track "RecSys": https://ods.ai/tracks/df23-recsys

1 month, 3 weeks ago @ youtube.com
Artem Erokhin - Bootstrapping time series

Download the slides: https://drive.google.com/file/d/13i2L3ByUryPO3nY0-6vRq5w86zZTxx9R/view?usp=sharing Artem Erokhin, Lead DS at X5 Tech and author of the channel @Artificial Stupid (https://t.me/artificial_stupid), gives a talk on bootstrapping time series. The talk examines the problem with applying the classical bootstrap to time series. Artem covers various bootstrap methods that account for the structure of a time series and weighs the pros and cons of the different approaches. Telegram Reliable ML: https://t.me/reliable_ml Data Fest 2023:

https://ods.ai/events/datafestonline2023

Track "Reliable ML": https://ods.ai/tracks/df23-reliable-ml

1 month, 3 weeks ago @ youtube.com
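
The description of the bootstrapping talk mentions bootstrap methods that account for time series structure without naming them in this excerpt. One well-known example is the moving block bootstrap, which resamples contiguous blocks so that short-range dependence survives inside each block. The sketch below is an assumption about what such a method can look like, not the speaker's code; the function name and parameters are hypothetical.

```python
import random


def moving_block_bootstrap(series, block_len, rng=None):
    """Return one bootstrap replicate built from overlapping blocks.

    Blocks of length block_len are drawn with replacement from all
    contiguous windows of the series and concatenated until the
    replicate is as long as the original, then trimmed.
    """
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    rng = rng or random.Random()
    n_starts = n - block_len + 1  # number of possible block start positions
    replicate = []
    while len(replicate) < n:
        start = rng.randrange(n_starts)
        replicate.extend(series[start:start + block_len])
    return replicate[:n]


if __name__ == "__main__":
    data = [0.1 * t for t in range(30)]
    print(moving_block_bootstrap(data, block_len=5, rng=random.Random(0)))
```

With `block_len=1` this reduces to the classical bootstrap, which treats observations as independent and breaks the temporal ordering. Larger blocks preserve more of the dependence but leave fewer distinct blocks to sample from.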
Marina Zavgorodnyaya - ML System Design Doc Challenge – Final

ML System Design Doc: https://github.com/IrinaGoloshchapova/ml_system_design_doc_ru Marina Zavgorodnyaya, Data Science Community Lead at Raiffeisenbank, runs the final stage of the contest on writing an ML System Design Doc, a design of an ML system for solving specific business problems. The session includes the final pitching of solutions, a review of the teams' ML System Design documents, and a summary of the results. On June 3-4 the contest participants practiced writing an ML System Design Doc (https://github.com/IrinaGoloshchapova/ml_system_design_doc_ru) on a real case. Afterwards, participants received commemorative gifts and team consultations on their design documents. We consider the contest useful for all DS specialists…

1 month, 3 weeks ago @ youtube.com
Lyudmila Konovalova - A/B tests at Kion

Data Fest 2023:

https://ods.ai/events/datafestonline2023

Track "RecSys": https://ods.ai/tracks/df23-recsys

1 month, 3 weeks ago @ youtube.com
Yuri Katser - Preprocessing and anomaly detection in time series

Download the slides: https://drive.google.com/file/d/1oNbO5IPZbj0WWted3J2AZ4I9k0LSnnHW/view?usp=drive_link Talk "Preprocessing and anomaly detection in time series" by Yuri Katser, an expert in applying DS and ML in industry, co-founder of waico.tech (http://waico.tech/), and author of the Telegram channel @DataKatser (https://t.me/DataKatser). Working with time series has its own specifics, even compared with tabular data. The first part of the talk covers the stages of time series preprocessing and the difficulties and problems one can run into while working with them. We also discuss specific approaches, methods, and libraries that can make this process au…

1 month, 3 weeks ago @ youtube.com
Bogdan Pechenkin - A bag of tricks for making your ML pipeline more reliable

Download the slides: https://drive.google.com/file/d/1Q70DQl17t7RfXIG9fhaud27a0h9CU2r1/view?usp=drive_link Bogdan Pechenkin, Senior ML Engineer at BrandsGoDigital, author of the ML engineer simulator on karpov.courses and of the Telegram channel @bogdanisssimo, presents the core techniques and tools in an ML engineer's arsenal that help insure an ML project against unexpected incidents at different stages of its life cycle and save you dozens of hours of hunting for the source of problems. Many people associate machine learning with a black box: certain data goes in, certain predictions come out, and inside is something mysterious, uncontrollable, unpredictable, and therefore unreliable (non-reliable). This…

1 month, 3 weeks ago @ youtube.com
Aslan Bayramkulov, Artem Khakimov - A/B testing workshop

Data Fest 2023:

https://ods.ai/events/datafestonline2023

Track "RecSys": https://ods.ai/tracks/df23-recsys

1 month, 3 weeks ago @ youtube.com
Kristina Lukyanova - A Bayesian approach to A/B tests, using a conversion test as an example

Download the slides: https://drive.google.com/file/d/179KsS5uoPpUdvfcYvyQ29R3sHDUXOuxR/view?usp=drive_link Kristina Lukyanova, business analyst at Glowbyte, gives an introductory talk on the Bayesian approach to A/B testing in our Reliable ML section (https://t.me/reliable_ml/140) at Data Fest 2023 (https://ods.ai/events/datafestonline2023). The Bayesian approach to A/B tests is an alternative to the frequentist approach. We discuss how to replace the p-value with more interpretable metrics using Bayesian methods, and compare the frequentist and Bayesian approaches on a conversion test. Telegram Reliable ML: https://t.me/reliable_ml Data Fest 2023:

https://ods.ai/events/dataf…

1 month, 3 weeks ago @ youtube.com
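
The talk description above contrasts p-values with more interpretable Bayesian metrics for a conversion test. One such metric is the posterior probability that variant B converts better than variant A. The sketch below is an assumed, minimal illustration and not the speaker's method: it uses a uniform Beta(1, 1) prior on each conversion rate and Monte Carlo sampling, and the function name and the example counts are hypothetical.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Estimate P(rate_B > rate_A) under independent Beta posteriors.

    With a Beta(1, 1) prior and binomial data, the posterior of each
    conversion rate is Beta(1 + conversions, 1 + non-conversions).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws


if __name__ == "__main__":
    # Hypothetical counts: 100/1000 conversions for A, 150/1000 for B.
    print(prob_b_beats_a(100, 1000, 150, 1000))
```

The result reads directly as "the probability that B is better than A given the data and the prior", which is the kind of statement a p-value does not provide. A different prior would change the numbers, so the prior is part of what should be reported.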
Irina Skorynina - Transactional activity of FMCD and MCG clients

Data Fest 2023:

https://ods.ai/events/datafestonline2023

Track "Random DS/ML": https://ods.ai/tracks/df23-random_ds-ml

1 month, 4 weeks ago @ youtube.com
🎧 Podcasts
Lex Fridman AI Podcast
latest post 6 days, 23 hours ago
#396 – James Sexton: Divorce Lawyer on Marriage, Relationships, Sex, Lies & Love

James Sexton is a divorce attorney and author.

Please support this podcast by checking out our sponsors:

– Eight Sleep: https://www.eightsleep.com/lex to get special savings

– InsideTracker: https://insidetracker.com/lex to get 20% off

– House of Macadamias: https://houseofmacadamias.com/lex and use code LEX to get 20% off first order

– MasterClass: https://masterclass.com/lexpod to get 15% off

– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil

EPISODE LINKS:

James’s Twitter: https://twitter.com/nycdivorcelaw

James’s Instagram: https://instagram.com/nycdivorcelawyer

James’s Website: https://nycdivorces.com/our-attorneys

How to Stay in Love (book): https://amzn.to/3t61uji

If You’re in My…

6 days, 23 hours ago @ lexfridman.com
#395 – Walter Isaacson: Elon Musk, Steve Jobs, Einstein, Da Vinci & Ben Franklin

Walter Isaacson is an author of biographies on Elon Musk, Steve Jobs, Einstein, Benjamin Franklin, Leonardo da Vinci, and many others.

Please support this podcast by checking out our sponsors:

– MasterClass: https://masterclass.com/lexpod to get 15% off

– NetSuite: http://netsuite.com/lex to get free product tour

– BetterHelp: https://betterhelp.com/lex to get 10% off

– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free

– Shopify: https://shopify.com/lex to get $1 per month trial

EPISODE LINKS:

Walter’s Twitter: https://twitter.com/WalterIsaacson

Walter’s Instagram: https://www.instagram.com/walter_isaacson

Walter’s Website: https://isaacson.tulane.edu

Walter’s Books:

Elon Musk: https://am…

2 weeks ago @ lexfridman.com
#394 – Neri Oxman: Biology, Art, and Science of Design & Engineering with Nature

Neri Oxman is a designer, engineer, scientist, and artist working on computational design, synthetic biology and digital fabrication, previously at MIT, and now at OXMAN.

Please support this podcast by checking out our sponsors:

– Babbel: https://babbel.com/lexpod and use code Lexpod to get 55% off

– BetterHelp: https://betterhelp.com/lex to get 10% off

– House of Macadamias: https://houseofmacadamias.com/lex and use code LEX to get 20% off first order

– InsideTracker: https://insidetracker.com/lex to get 20% off

– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free

EPISODE LINKS:

Neri’s Twitter: https://twitter.com/nerioxman

OXMAN Website: https://oxman.com/

OXMAN Instagram: https://inst…

3 weeks, 2 days ago @ lexfridman.com
#393 – Andrew Huberman: Relationships, Drama, Betrayal, Sex, and Love

Andrew Huberman is a neuroscientist at Stanford and host of the Huberman Lab Podcast.

Please support this podcast by checking out our sponsors:

– InsideTracker: https://insidetracker.com/lex to get 20% off

– Eight Sleep: https://www.eightsleep.com/lex to get special savings

– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil

– Shopify: https://shopify.com/lex to get $1 per month trial

– NetSuite: http://netsuite.com/lex to get free product tour

Transcript: https://lexfridman.com/andrew-huberman-4-transcript

EPISODE LINKS:

Andrew’s YouTube: https://www.youtube.com/AndrewHubermanLab

Andrew’s Instagram: https://instagram.com/hubermanlab

Andrew’s Website: https://hubermanlab.com

Andrew’s Twi…

1 month, 1 week ago @ lexfridman.com
#392 – Joscha Bach: Life, Intelligence, Consciousness, AI & the Future of Humans

Joscha Bach is a cognitive scientist, AI researcher, and philosopher.

Please support this podcast by checking out our sponsors:

– Numerai: https://numer.ai/lex

– Eight Sleep: https://www.eightsleep.com/lex to get special savings

– MasterClass: https://masterclass.com/lex to get 15% off

– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil

Transcript: https://lexfridman.com/joscha-bach-3-transcript

EPISODE LINKS:

Joscha’s Twitter: https://twitter.com/Plinz

Joscha’s Website: http://bach.ai

Joscha’s Substack: https://substack.com/@joscha

PODCAST INFO:

Podcast website: https://lexfridman.com/podcast

Apple Podcasts: https://apple.co/2lwqZIr

Spotify: https://spoti.fi/2nEwCF8

RSS: https://lexfridman…

1 month, 3 weeks ago @ lexfridman.com
#391 – Mohammed El-Kurd: Palestine

Mohammed El-Kurd is a Palestinian writer and poet.

Please support this podcast by checking out our sponsors:

– Factor: https://factormeals.com/lex50 and use code lex50 to get 50% off first box

– SimpliSafe: https://simplisafe.com/lex to get free security camera plus 20% off

– BetterHelp: https://betterhelp.com/lex to get 10% off

– Babbel: https://babbel.com/lexpod and use code Lexpod to get 55% off

– House of Macadamias: https://houseofmacadamias.com/lex and use code LEX to get 20% off your first order

Transcript: https://lexfridman.com/mohammed-el-kurd-transcript

EPISODE LINKS:

Mohammed’s Twitter: https://twitter.com/m7mdkurd

Mohammed’s Instagram: https://instagram.com/mohammedelkurd

Mohammed’s Webs…

2 months ago @ lexfridman.com
#390 – Yuval Noah Harari: Human Nature, Intelligence, Power, and Conspiracies

Yuval Noah Harari is a historian, philosopher, and author of Sapiens, Homo Deus, 21 Lessons for the 21st Century, and Unstoppable Us.

Please support this podcast by checking out our sponsors:

– MasterClass: https://masterclass.com/lex to get 15% off

– Eight Sleep: https://www.eightsleep.com/lex to get special savings

– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free

– InsideTracker: https://insidetracker.com/lex to get 20% off

– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil

Transcript: https://lexfridman.com/yuval-noah-harari-transcript

EPISODE LINKS:

Yuval’s Twitter: https://twitter.com/harari_yuval

Yuval’s Instagram: https://www.instagram.com/yuval_noah_harari

Yuval…

2 months, 1 week ago @ lexfridman.com
#389 – Benjamin Netanyahu: Israel, Palestine, Power, Corruption, Hate, and Peace

Benjamin Netanyahu is the Prime Minister of Israel.

Please support this podcast by checking out our sponsors:

– Numerai: https://numer.ai/lex

– BetterHelp: https://betterhelp.com/lex to get 10% off

– NetSuite: http://netsuite.com/lex to get free product tour

– Shopify: https://shopify.com/lex to get $1 per month trial

Transcript: https://lexfridman.com/benjamin-netanyahu-transcript

EPISODE LINKS:

Netanyahu’s Twitter: https://twitter.com/netanyahu

Netanyahu’s Instagram: https://www.instagram.com/b.netanyahu

Netanyahu’s Website: https://www.netanyahu.org.il

Bibi: My Story (book): https://amzn.to/3XJd6UR

A Durable Peace (book): https://amzn.to/3pIofbX

Fighting Terrorism (book): https://amzn.to/3XNp6on

PODC…

2 months, 2 weeks ago @ lexfridman.com
#388 – Robert F. Kennedy Jr: CIA, Power, Corruption, War, Freedom, and Meaning

Robert F. Kennedy Jr. is an activist, lawyer, author, and candidate for President of the United States.

Please support this podcast by checking out our sponsors:

– House of Macadamias: https://houseofmacadamias.com/lex and use code LEX to get 20% off your first order

– Eight Sleep: https://www.eightsleep.com/lex to get special savings

– InsideTracker: https://insidetracker.com/lex to get 20% off

– AG1: https://drinkag1.com/lex to get 1 month supply of fish oil

TRANSCRIPT: https://lexfridman.com/robert-f-kennedy-jr-transcript/

EPISODE LINKS:

Robert’s Twitter: https://twitter.com/RobertKennedyJr

Robert’s Instagram: https://www.instagram.com/robertfkennedyjr

Robert’s Facebook: https://www.facebook.c…

2 months, 2 weeks ago @ lexfridman.com
#387 – George Hotz: Tiny Corp, Twitter, AI Safety, Self-Driving, GPT, AGI & God

George Hotz is a programmer, hacker, and the founder of comma-ai and tiny corp.

Please support this podcast by checking out our sponsors:

– Numerai: https://numer.ai/lex

– Babbel: https://babbel.com/lexpod and use code Lexpod to get 55% off

– NetSuite: http://netsuite.com/lex to get free product tour

– InsideTracker: https://insidetracker.com/lex to get 20% off

– AG1: https://drinkag1.com/lex to get 1 year of Vitamin D and 5 free travel packs

Transcript: https://lexfridman.com/george-hotz-3-transcript

EPISODE LINKS:

George’s Twitter: https://twitter.com/realgeorgehotz

George’s Twitch: https://twitch.tv/georgehotz

George’s Instagram: https://instagram.com/georgehotz

Tiny Corp’s Twitter: https://twitter…

2 months, 3 weeks ago @ lexfridman.com
#386 – Marc Andreessen: Future of the Internet, Technology, and AI

Marc Andreessen is the co-creator of Mosaic, co-founder of Netscape, and co-founder of the venture capital firm Andreessen Horowitz.

When Reason Goes on Holiday (book): https://amzn.to/3p80b1K

Lenin (book): https://amzn.to/43L8YWD

On some podcast players you should be able to click the timestamp to jump to that time.

(00:00) – Introduction

(05:01) – Google Search

(12:49) – LLM training

(25:20) – Truth

(31:32) – Journalism

(41:24) – AI startups

(46:46) – Future of browsers

(53:09) – History of browsers

(59:10) – Steve Jobs

(1:13:45) – Software engineering

(1:21:00) – JavaScript

(1:25:18) – Netscape

(1:30:22) – Why AI will save the world

(1:38:20) – Dangers of AI

(2:08:40) – Nuclear energy

(2:20:37) – M…

3 months ago @ lexfridman.com
#385 – Jimmy Wales: Wikipedia

Jimmy Wales is the co-founder of Wikipedia.

Please support this podcast by checking out our sponsors:

– Hexclad Cookware: https://hexclad.com/lex and use code LEX to get 10% off

– Eight Sleep: https://www.eightsleep.com/lex to get special savings

– House of Macadamias: https://houseofmacadamias.com/lex and use code LEX to get 20% off your first order

Transcript: https://lexfridman.com/jimmy-wales-transcript

EPISODE LINKS:

Jimmy’s Twitter: https://twitter.com/jimmy_wales

Jimmy’s Wikipedia page: https://en.wikipedia.org/wiki/Jimmy_Wales

Donate to Wikipedia: https://donate.wikimedia.org

WT.Social: https://wt.social/

PODCAST INFO:

Podcast website: https://lexfridman.com/podcast

Apple Podcasts: https://appl…

3 months, 1 week ago @ lexfridman.com
#384 – Matthew McConaughey: Freedom, Truth, Family, Hardship, and Love

Matthew McConaughey is an Oscar-winning actor and author of Greenlights.

Please support this podcast by checking out our sponsors:

– Riverside: http://creators.riverside.fm/LEX and use code LEX to get 30% off

– MasterClass: https://masterclass.com/lex to get 15% off

– AG1: https://drinkag1.com/lex to get 1 year of Vitamin D and 5 free travel packs

EPISODE LINKS:

Matthew’s Twitter: https://twitter.com/McConaughey

Matthew’s Instagram: https://instagram.com/officiallymcconaughey/

Matthew’s YouTube: https://youtube.com/@MatthewMcConaughey

Greenlights (book): https://greenlights.com/

Roadtrip (course): https://artoflivinevent.com/roadtrip

PODCAST INFO:

Podcast website: https://lexfridman.com/podcast

Apple P…

3 months, 1 week ago @ lexfridman.com
#383 – Mark Zuckerberg: Future of AI at Meta, Facebook, Instagram, and WhatsApp

Mark Zuckerberg is CEO of Meta.

Please support this podcast by checking out our sponsors:

– Numerai: https://numer.ai/lex

– Shopify: https://shopify.com/lex to get $1 per month trial

– BetterHelp: https://betterhelp.com/lex to get 10% off

EPISODE LINKS:

Mark’s Facebook: https://facebook.com/zuck

Mark’s Instagram: https://instagram.com/zuck

Meta AI: https://ai.facebook.com/

Meta Quest: https://www.meta.com/quest/

PODCAST INFO:

Podcast website: https://lexfridman.com/podcast

Apple Podcasts: https://apple.co/2lwqZIr

Spotify: https://spoti.fi/2nEwCF8

RSS: https://lexfridman.com/feed/podcast/

YouTube Full Episodes: https://youtube.com/lexfridman

YouTube Clips: https://youtube.com/lexclips

SUPPORT & CONNECT:

– Ch…

3 months, 2 weeks ago @ lexfridman.com
#382 – Bert Kreischer: Comedy, Drinking, Rogan, Segura, Churchill & Kim Jong Un
#382 – Bert Kreischer: Comedy, Drinking, Rogan, Segura, Churchill & Kim Jong Un #382 – Bert Kreischer: Comedy, Drinking, Rogan, Segura, Churchill & Kim Jong Un

Bert Kreischer is a comedian, actor, and podcaster.

Check him out on Bertcast, 2 Bears 1 Cave, Something is Burning, and the new movie The Machine.

Please support this podcast by checking out our sponsors:
– Eight Sleep: https://www.eightsleep.com/lex to get special savings
– NetSuite: http://netsuite.com/lex to get free product tour
– ExpressVPN: https://expressvpn.com/lexpod to get 3 months free
EPISODE LINKS:
Bert’s Instagram: https://instagram.com/bertkreischer/
Bert’s Twitter: https://twitter.com/bertkreischer
Bert’s YouTube: https://www.youtube.com/@bertkreischer
Bert’s Website: https://bertbertbert.com/
2 Bears 1 Cave: https://www.youtube.com/playlist?list=PL-i3EV1v5hLeT91DuXckUf6tsbMfLgZno
Bo…

3 months, 3 weeks ago @ lexfridman.com
Microsoft Research Podcast
latest post 2 days, 7 hours ago
Abstracts: September XX, 2023

Members of the research community at Microsoft work continuously to advance their respective fields.

Abstracts brings its audience to the cutting edge with them through short, compelling conversations about new and noteworthy achievements.

In this episode, Dr. Xing Xie, a Senior Principal Research Manager of Microsoft Research Asia joins host Dr. Gretchen Huizinga to discuss “Psychometrics for Evaluating General-Purpose AI.” As AI capabilities move from task specific to more general purpose, the paper explores psychometrics, a subfield of psychology, as an alternative to traditional methods for evaluating model performance and for supporting consistent and reliable systems.

Read the paper:

2 days, 7 hours ago @ blubrry.com
149 - AI Frontiers: The future of scale with Ahmed Awadallah and Ashley Llorens

Powerful large-scale AI models like GPT-4 are showing dramatic improvements in reasoning, problem-solving, and language capabilities.

This marks a phase change for artificial intelligence—and a signal of accelerating progress to come.

In this Microsoft Research Podcast series, AI scientist and engineer Ashley Llorens hosts conversations with his collaborators and colleagues about what these models—and the models that will come next—mean for our approach to creating, understanding, and deploying AI, its applications in areas such as healthcare and education, and its potential to benefit humanity.

This episode features Senior Principal Research Manager Ahmed H. Awadallah, whose work improving…

1 week, 3 days ago @ blubrry.com
148 - Abstracts: September 13, 2023

Members of the research community at Microsoft work continuously to advance their respective fields.

Abstracts brings its audience to the cutting edge with them through short, compelling conversations about new and noteworthy achievements.

In the inaugural episode of the series, Dr. Ava Amini and Dr. Kevin K. Yang, both Senior Researchers with Microsoft Health Futures, join host Dr. Gretchen Huizinga to discuss “Protein generation with evolutionary diffusion: Sequence is all you need.” The paper introduces EvoDiff, a suite of models that leverages evolutionary-scale protein data to help design novel proteins more efficiently.

Improved protein engineering has the potential to help create new…

1 week, 4 days ago @ blubrry.com
147 - Intern Insights: Dr. Josh Benaloh with Anunay Kulshrestha and Karan Newatia

Every year, interns from academic institutions around the world apply and grow their knowledge as members of the research community at Microsoft.

In this Microsoft Research Podcast series, these students join their internship supervisors to share their experience working alongside some of the leading researchers in their respective fields.

In this episode, PhD students Anunay Kulshrestha and Karan Newatia talk to Senior Cryptographer Josh Benaloh about their work this summer on ElectionGuard, a free, open-source toolkit that enables voters to verify that their votes have been accurately counted.

Kulshrestha and Newatia discuss their contributions to extending ElectionGuard to mail-in voting…

2 weeks, 2 days ago @ blubrry.com
146 - AI Frontiers: AI in India and beyond with Sriram Rajamani

Powerful large-scale AI models like GPT-4 are showing dramatic improvements in reasoning, problem-solving, and language capabilities.

This marks a phase change for artificial intelligence—and a signal of accelerating progress to come.

This episode features Sriram Rajamani, Distinguished Scientist and Managing Director of Microsoft Research India.

Rajamani talks about how the lab’s work is being influenced by today’s rapidly advancing AI.

It’s an application Microsoft CEO Satya Nadella described as the “mic drop moment” of his trip to the lab early this year.

3 weeks, 3 days ago @ blubrry.com
145 - Collaborators: Project InnerEye with Javier Alvarez and Raj Jena

Transforming research ideas into meaningful impact is no small feat.

It often requires the knowledge and experience of individuals from across disciplines and institutions.

Collaborators, a new Microsoft Research Podcast series, explores the relationships—both expected and unexpected—behind the projects, products, and services being pursued and delivered by researchers at Microsoft and the diverse range of people they’re teaming up with.

In this episode, Dr. Gretchen Huizinga talks with Microsoft Health Futures Senior Director Javier Alvarez and Dr. Raj Jena, a radiation oncologist at Addenbrooke’s hospital, part of Cambridge University Hospitals in the United Kingdom, about Project InnerEy…

1 month, 1 week ago @ blubrry.com
144 - Collaborators: Data-driven decision-making with Jina Suh and Shamsi Iqbal

Transforming research ideas into meaningful impact is no small feat.

It often requires the knowledge and experience of individuals from across disciplines and institutions.

Collaborators, a new Microsoft Research Podcast series, explores the relationships—both expected and unexpected—behind the projects, products, and services being pursued and delivered by researchers at Microsoft and the diverse range of people they’re teaming up with.

In this episode of the podcast, Dr. Gretchen Huizinga welcomes Principal Researcher Dr. Jina Suh and Principal Applied and Data Science Manager Dr. Shamsi Iqbal to the show to discuss their most recent work together, a research project aimed at developing d…

1 month, 3 weeks ago @ blubrry.com
143 - Collaborators: Gaming AI with Haiyan Zhang

Transforming research ideas into meaningful impact is no small feat.

It often requires the knowledge and experience of individuals from across disciplines and institutions.

Collaborators, a new Microsoft Research Podcast series, explores the relationships—both expected and unexpected—behind the projects, products, and services being pursued and delivered by researchers at Microsoft and the diverse range of people they’re teaming up with.

In the world of gaming, Haiyan Zhang has situated herself where research meets real-world challenges, helping to bring product teams and researchers together to elevate the player experience with the latest AI advances even before the job became official wi…

2 months ago @ blubrry.com
142 - Collaborators: Holoportation™ communication technology with Spencer Fowers and Kwame Darko

Transforming research ideas into meaningful impact is no small feat.

It often requires the knowledge and experience of individuals from across disciplines and institutions.

Collaborators, a new Microsoft Research Podcast series, explores the relationships—both expected and unexpected—behind the projects, products, and services being pursued and delivered by researchers at Microsoft and the diverse range of people they’re teaming up with.

In this episode, host Dr. Gretchen Huizinga welcomes Dr. Spencer Fowers, a member of the Special Projects Technical Staff at Microsoft Research, and Dr. Kwame Darko, a plastic surgeon in the reconstructive plastic surgery and burns center in Ghana’s Korle B…

2 months, 2 weeks ago @ blubrry.com
141 - Collaborators: Renewable energy storage with Bichlien Nguyen and David Kwabi

Transforming research ideas into meaningful impact is no small feat.

It often requires the knowledge and experience of individuals from across disciplines and institutions.

Collaborators, a new Microsoft Research Podcast series, explores the relationships—both expected and unexpected—behind the projects, products, and services being pursued and delivered by researchers at Microsoft and the diverse range of people they’re teaming up with.

In this episode, Microsoft Principal Researcher Dr. Bichlien Nguyen and Dr. David Kwabi, Assistant Professor of Mechanical Engineering at the University of Michigan, join host Dr. Gretchen Huizinga to talk about how their respective research interests—and t…

3 months ago @ blubrry.com
140 - AI Frontiers: The future of causal reasoning with Emre Kiciman and Amit Sharma

What is causality, and how can we think about the space of different, you know, kinds of causal reasoning?

But I think this is a kind of causal reasoning that people do quite often, maybe even it’s the thing they think about when they think about causal reasoning.

LLORENS: In the paper, you explore briefly, you know, kind of actual causality that deals with responsibility or faults.

And so, Emre, can you take … take us through the experiment setup and how you had to approach that with this, you know, kind of unique, unique way of assessing causal reasoning?

I’m really looking forward to tracking your research, and the possibilities for more powerful causal reasoning with AI.

3 months, 2 weeks ago @ microsoft.com
139 - Collaborators: Gov4git with Petar Maymounkov and Kasia Sitkiewicz

Today, I’m joined by our first two guests, Petar Maymounkov and Kasia Sitkiewicz.

And what I do at GitHub, uh, I work as a product manager.

Right now, um, communities, there are few ways of like how they make decisions, either majority of the votes or through consensus.

One nice side benefit from this entire project is that Gov4git, uh, enables people to like reflect on what they’ve done and, and what is happening.

[MUSIC] HUIZINGA: Petar and Kasia, thank you so much for coming on the show today and being our first guests on the Collaborators podcast.

4 months, 3 weeks ago @ microsoft.com
138 - AI Frontiers: Models and Systems with Ece Kamar

Let’s say I want this AI system to write me an email to you.

And we just need to build these techniques and make them part of the way we build AI systems with these latest models.

Kamar: You know, the word agent comes from agency, and the question is what does agency mean for an AI system?

So those are the three main capabilities we want to have in our AI systems.

Right now, we are seeing writing documents, collecting information from the web, and presenting them, but in the future, what other creative things AI systems and humans can do together?

5 months, 2 weeks ago @ microsoft.com
137 - AI Frontiers: AI for health and the future of research with Peter Lee

Peter Lee: It’s such an important question and, you know, research in context, I think the way I explained it before is about inevitable futures.

And so knowing that, that’s a very different world than today, because today most of medical research funding is focused on cancer research, not on neurological disease.

In that transition, which is just a positive outcome of research, it also had some disruptive effect on research.

You know, in 1991, when Microsoft Research was founded, one of the founding research groups was a 3D computer graphics research group that was amongst, uh, the first three research groups for MSR.

This is a watershed moment for AI and for computing research, uh, more b…

5 months, 4 weeks ago @ microsoft.com
136 - AI Frontiers: The Physics of AI with Sébastien Bubeck

Powerful new large-scale AI models like GPT-4 are showing dramatic improvements in reasoning, problem-solving, and language capabilities.

This marks a phase change for artificial intelligence—and a signal of accelerating progress to come.

The first episode features Sébastien Bubeck, who leads the Machine Learning Foundations group at Microsoft Research in Redmond.

He and his collaborators conducted an extensive evaluation of GPT-4 while it was in development, and have published their findings in a paper that explores its capabilities and limitations—noting that it shows “sparks” of artificial general intelligence.

https://www.microsoft.com/research

6 months ago @ blubrry.com
NLP Highlights
latest post 2 months, 3 weeks ago
141 - Building an open source LM, with Iz Beltagy and Dirk Groeneveld

In this special episode of NLP Highlights, we discussed building and open sourcing language models.

What is the usual recipe for building large language models?

What does it mean to open source them?

What new research qu…

2 months, 3 weeks ago @ soundcloud.com
140 - Generative AI and Copyright, with Chris Callison-Burch

In this special episode, we chatted with Chris Callison-Burch about his testimony in the recent U.S. Congress Hearing on the Interoperability of AI and Copyright Law.

We started by asking Chris about the purpose and the …

3 months, 3 weeks ago @ soundcloud.com
139 - Coherent Long Story Generation, with Kevin Yang

How can we generate coherent long stories from language models?

Ensuring that the generated story has long range consistency and that it conforms to a high level plan is typically challenging.

In this episode, Kevin Yang…

6 months ago @ soundcloud.com
138 - Compositional Generalization in Neural Networks, with Najoung Kim

Compositional generalization refers to the capability of models to generalize to out-of-distribution instances by composing information obtained from the training data.

In this episode we chatted with Najoung Kim, on how…

8 months, 1 week ago @ soundcloud.com
137 - Nearest Neighbor Language Modeling and Machine Translation, with Urvashi Khandelwal

We invited Urvashi Khandelwal, a research scientist at Google Brain to talk about nearest neighbor language and machine translation models.

These models interpolate parametric (conditional) language models with non-param…

8 months, 2 weeks ago @ soundcloud.com
Data Skeptic
latest post 1 week, 6 days ago
The Defeat of the Winograd Schema Challenge

Our guest today is Vid Kocijan, a Machine Learning Engineer at Kumo AI.

He joins us to discuss how he built a BERT model that solved the Winograd Schema Challenge.

Vid introduced the Winograd schema challenge (WSC) and why it was challenging for deep learning models.

Resources:
Research Paper: A Surprisingly Robust Trick for Winograd Schema Challenge.

Research Paper: A Review of Winograd Schema Challenge Datasets and Approaches.

1 week, 6 days ago @ dataskeptic.com
LLMs in Social Science

Today, we are joined by Petter Törnberg, an Assistant Professor in Computational Social Science at the University of Amsterdam and a Senior Researcher at the University of Neuchatel.

His research is centered on the intersection of computational methods and their applications in social sciences.

Petter started by discussing the history of computational social science, from neoclassical economics to heterodox economics to now discovering theories from data and models.

Petter discussed how LLMs can be used in other social science fields.

He advised social science students on how to maximize LLMs in their work.

2 weeks, 6 days ago @ dataskeptic.com
LLMs in Music Composition

Carlos’s interest focuses on building new models for symbolic music generation.

Carlos began by explaining how machine learning helps music composers improve their composition.

He shared the standard data formats in which music is represented for machine learning.

He also discussed the machine learning algorithms best for music composition.

Carlos’s next research is on symbolic music generation with human emotions.

3 weeks, 6 days ago @ dataskeptic.com
Cuttlefish Model Tuning

Hongyi Wang, a Senior Researcher at the Machine Learning Department at Carnegie Mellon University, joins us.

His research is in the intersection of systems and machine learning.

He discussed his research paper, Cuttlefish: Low-Rank Model Training without All the Tuning, on today’s show.

Hongyi discussed the Cuttlefish model and how it edges out LoRA.

Rounding up, he gave his advice on how people can get into the machine learning field.

1 month ago @ dataskeptic.com
Which Professions Are Threatened by LLMs

On today’s episode, we have Daniel Rock, an Assistant Professor of Operations Information and Decisions at the Wharton School of the University of Pennsylvania.

He shared how they used the O*NET dataset and the BLS occupational employment survey to measure the impact of LLMs on different professions.

Daniel broadly highlighted professions that are most and least exposed to LLMs proliferation.

He also spoke about the risks of LLMs and his thoughts on implementing policies for regulating LLMs.

Resources:
Paper: GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models
Dataset: O*NET® Database
Dataset: BLS Occupational employ…

1 month, 1 week ago @ dataskeptic.com
Why Prompting is Hard

We are excited to be joined by J.D.

He joins us to share his work in his paper, Why Johnny can’t prompt: how non-AI experts try (and fail) to design LLM prompts.

He shared the tools he utilized to create chatbots, including those for improving the chatbot’s adaptability, managing dialogue flow, and handling decision points.

He discussed the tool he and his colleagues built, which helped non-AI experts create better prompts.

He also talked about having chatbots give results that people actually want, although measuring desirability is challenging.

1 month, 2 weeks ago @ dataskeptic.com
Automated Peer Review

In this episode, we are joined by Ryan Liu, a Computer Science graduate of Carnegie Mellon University.

An Exploratory Study on Using Large Language Models for Paper Reviewing

Ryan started by providing an overview of the peer review process.

He discussed what birthed the idea of using LLMs, such as ChatGPT, Bard, LLaMa etc., for paper review.

Ryan shared the prompts he used for evaluating the papers for various use cases.

Rounding up, he shared future use cases of using LLMs for paper reviews.

1 month, 3 weeks ago @ dataskeptic.com
Prompt Refusal

The creators of large language models impose restrictions on some of the types of requests one might make of them.

LLMs commonly refuse to give advice on committing crimes, produce adult content, or respond with any details about a variety of sensitive subjects.

As with any content filtering system, you have false positives and false negatives.

Today’s interview with Max Reuter and William Schulze discusses their paper I’m Afraid I Can’t Do That: Predicting Prompt Refusal in Black-Box Generative Language Models.

In this work, they explore what types of prompts get refused and build a machine learning classifier adept at predicting if a particular prompt will be refused or n…

2 months ago @ dataskeptic.com
A Long Way Till AGI

Our guest today is Maciej Świechowski.

Maciej joins us to discuss findings from his study, Deep Learning and Artificial General Intelligence: Still a Long Way to Go.

Maciej shared how he started his journey in AI and his thoughts about the prospect of evolutionary algorithms.

Maciej stated reasons why AGI is still a long way ahead.

Maciej discussed how AI is used in game production and the current challenges faced in the industry.

2 months, 1 week ago @ dataskeptic.com
Brain Inspired AI

Lin and Lu began by discussing the connections between the brain and neural networks.

They also shared whether there is a possibility for solid advancements in neural networks to the point of AGI.

Lin and Lu shared how the brain inspired popular machine learning algorithms like transformers.

They also shared how AI models can learn alignment from the human brain.

They juxtaposed the low energy usage of the brain compared to high-end computers and whether computers can become more energy efficient.

2 months, 2 weeks ago @ dataskeptic.com
Computable AGI

On today’s show, we are joined by Michael Timothy Bennett, a Ph.D. student at the Australian National University.

Michael’s research is centered around Artificial General Intelligence (AGI), specifically the mathematical formalism of AGIs.

He joins us to discuss findings from his study, Computable Artificial General Intelligence.

He described his results comparing weakness and simplicity for deciding between models, and what this implies.

You can learn more about Michael’s work on his website or follow him on Twitter @MiTiBennett.

2 months, 3 weeks ago @ dataskeptic.com
AGI Can Be Safe

We are joined by Koen Holtman, an independent AI researcher focusing on AI safety.

Koen is the Founder of Holtman Systems Research, a research company based in the Netherlands.

He discussed the obedience problem with AI models and the safe form of obedience.

Koen spoke about the problem of AGIs not being able to allow changing their utility function after the model is deployed.

Koen discussed the ultimate goal of a safe AI system and how to check that an AI system is indeed safe.

3 months ago @ dataskeptic.com
AI Fails on Theory of Mind Tasks

On the show today, we are joined by Tomer Ullman, an assistant professor of Psychology at Harvard University.

He discussed the evolution of published datasets to evaluate machines for the theory of mind test.

Tomer shared how he conducted his experiment that shows LLMs fail the theory of mind tests, a study he published.

He explained the Smarties tube test — another test for theory of mind.

Tomer shared his views on developing more advanced tests for the theory of mind in machines.

3 months, 1 week ago @ dataskeptic.com
AI for Mathematics Education

On today’s show, Steven Van Vaerenbergh joins us.

Steven works in the Department of Communications Engineering, University of Cantabria, and his research lies in artificial intelligence and mathematics education.

Steven started by discussing how AI is being used for mathematics education.

He discussed how he and his colleagues investigated the inner workings of such AI systems to classify the algorithms used.

Steven shared his thoughts on the prospects of advanced AI technologies for online learning.

3 months, 2 weeks ago @ dataskeptic.com
Evaluating Jokes with LLMs

On the show today, we are joined by Fabricio Goes, a lecturer at the University of Leicester.

Fabricio’s research interest lies in computational creativity and creativity evaluation.

He joins us to discuss the recent study on evaluating jokes with large language models (LLMs).

He also discussed the use of GPT for creating and evaluating jokes.

He detailed the process of evaluating jokes with GPT-3 and GPT-4.

3 months, 2 weeks ago @ dataskeptic.com
SuperDataScience
latest post 2 days, 13 hours ago
716: Happiness and Life-Fulfillment Hacks

Jon Krohn's 94-year-old grandmother, Annie, who's bursting with life and wisdom, shares her recipe to lifelong happiness and how relationships and daily intentions play an integral role.

Annie also shares her curious tak…

2 days, 13 hours ago @ soundcloud.com
715: Make Better Decisions with Data, with Dr. Allen Downey

Join us as Dr. Allen Downey, renowned author and professor, shares insights from his upcoming book 'Probably Overthinking It,' breaking down underused techniques like Survival Analysis, explaining common paradoxes, and d…

5 days, 13 hours ago @ soundcloud.com
714: Using A.I. to Overcome Blindness and Thrive as a Data Scientist

In this Friday episode, guest Tim Albiges explores with host Jon Krohn how people with blindness can have a lucrative and fulfilling career in data science, how Tim’s PhD thesis applied machine learning to help diagnose …

1 week, 2 days ago @ soundcloud.com
713: Llama 2, Toolformer and BLOOM: Open-Source LLMs with Meta's Dr. Thomas Scialom

Artificial General Intelligence, RLHF’s application in AI, and how entrepreneurs can enter the AI industry: Meta’s AI Research Scientist Thomas Scialom gives us behind-the-scenes insights into developing Llama 2 and what…

1 week, 5 days ago @ soundcloud.com
712: Code Llama

Code Llama might just be starting the revolution for how data scientists code.

In this Five-Minute Friday, host Jon Krohn investigates the suite of models under the free-to-use Code Llama and how to find the best fit for…

2 weeks, 2 days ago @ soundcloud.com
711: Image, Video and 3D-Model Generation from Natural Language, with Dr. Ajay Jain

In this episode, host Jon Krohn explores with his guest Ajay Jain, Founder of Genmo.ai, how creative general intelligence could take the video industry by storm.

They also discuss the models that got Genmo to this point,…

2 weeks, 5 days ago @ soundcloud.com
710: LangChain: Create LLM Applications Easily in Python

Discover the power of Large Language Models with Kris Ograbek as he unravels the intricacies of LangChain and showcases a chatbot in action, all while putting our host Jon Krohn in the hot seat!

Additional materials: ww…

3 weeks, 2 days ago @ soundcloud.com
709: Big A.I. R&D Risks Reap Big Societal Rewards, with Meta's Dr. Laurens van der Maaten

Meta's Senior Research Director, Dr. Laurens van der Maaten, takes center stage to unravel the captivating realm of AI innovation.

Learn about his groundbreaking contributions, including pioneering the t-SNE dimensionali…

3 weeks, 5 days ago @ soundcloud.com
708: ChatGPT Code Interpreter: 5 Hacks for Data Scientists

On this week’s Five-Minute Friday, host Jon Krohn gives five reasons why he is so excited about ChatGPT’s Code Interpreter and walks listeners through its capabilities with a practical example.

Additional materials: www…

1 month ago @ soundcloud.com
707: Vicuña, Gorilla, Chatbot Arena and Socially Beneficial LLMs, with Prof. Joey Gonzalez

LLM Vicuña, Chatbot Arena, and the race to increase LLM context windows: This episode’s guest Joey Gonzalez talks to Jon Krohn about developing models and platforms that leverage and improve LLMs, as well as the future o…

1 month ago @ soundcloud.com
706: Large Language Model Leaderboards and Benchmarks

In this episode, Caterina Constantinescu dives deep into Large Language Models (LLMs), spotlighting top leaderboards, evaluation benchmarks, and real-world user perceptions.

Plus, discover the challenges of dataset conta…

1 month, 1 week назад @ soundcloud.com
705: Feeding the World with ML-Powered Precision Agriculture

Join Jon Krohn as he chats with Syngenta Group's Feroz Sheikh, Jeremy Groeteke, and Thomas Jung about the digital revolution in agriculture.

Learn how data science is evolving farming, from precision techniques to global…

1 month, 1 week ago @ soundcloud.com
704: Jon’s “Generative A.I. with LLMs” Hands-on Training

Take on the world of GPT and learn to develop your own, commercially successful Large Language Models (LLMs) with Jon Krohn’s comprehensive, guided training video for generative AI.

Get to grips with the technology, lear…

1 month, 2 weeks ago @ soundcloud.com
703: How Data Happened: A History, with Columbia Prof. Chris Wiggins

Statistics history, interdisciplinarity, and data and society.

Chris Wiggins talks with Jon Krohn about the power dynamics of data, the transformation of the field of biology through data-driven approaches to genetic seq…

1 month, 2 weeks ago @ soundcloud.com
702: Llama 2 — It's Time to Upgrade your Open-Source LLM

This week, Jon Krohn is examining Meta's newly released open-source large language model, Llama 2, highlighting its commercial prospects, immense capacity, model variety, and unique 'time awareness' feature.

He also disc…

1 month, 3 weeks ago @ soundcloud.com
Data Science at Home
last post 4 days, 19 hours ago
Attacking LLMs for fun and profit (Ep. 239)

As a continuation of Episode 238, I explain some effective and fun attacks to conduct against LLMs.

Such attacks are even more effective on locally served models, which are hardly controlled by human feedback.

Have great fun and learn them responsibly.

References: https://www.jailbreakchat.com/ https://arxiv.org/abs/2305.13860

4 days, 19 hours ago @ datascienceathome.com
Unlocking Language Models: The Power of Prompt Engineering (Ep. 238)

Join me on an enlightening journey through the world of prompt engineering.

Explore the multifaceted skills and strategies involved in harnessing the potential of large language models for various applications.

From enhancing safety measures to augmenting models with domain knowledge, learn how prompt engineering is shaping the future of AI.

1 week, 6 days ago @ datascienceathome.com
Erosion of Software Architecture Quality in the Age of AI Code Generation (Ep. 237)

In this era of AI-powered code generation, software architects are facing a concerning decline in the quality of their creations.

The once meticulously crafted software architectures are now being compromised.

Should LLMs be responsible?

References: Program Design in the UNIX Environment, https://harmful.cat-v.org/cat-v/unix_prog_design.pdf

3 weeks, 3 days ago @ datascienceathome.com
The new dimension of AI: Vector Databases (Ep. 236)

Let’s delve into the emerging trend in database design – or is it really a new trend?

The realm of vector databases and their revolutionary influence on AI and ML is making headlines.

Come along as we investigate how these groundbreaking databases are revolutionizing the landscape of data storage, retrieval, and processing, ultimately unlocking the complete potential of artificial intelligence and machine learning.

But are they genuinely as innovative as they seem?

References: https://partee.io/2022/08/11/vector-embeddings/ https://blog.det.life/why-you-shouldnt-invest-in-vector-databases-c0cd3f59d23c https://medium.com/@ryanntk/choosing-the-right-embedding-model-a-guide-for-llm-applications-7a…

3 weeks, 3 days ago @ datascienceathome.com
Building Self Serve Business Intelligence With AI and LLMs at Zenlytic (Ep. 235)

In this episode, we dive into the world of data analytics and artificial intelligence with Ryan, the CEO, and Paul, the CTO of Zenlytic.

Having graduated from Harvard and with extensive backgrounds in venture capital, consulting, and data engineering, Ryan and Paul provide valuable insights into their journey of building Zenlytic, a cutting-edge analytics platform.

Join us as we explore how Zenlytic’s natural language interface enhances user experiences, enabling seamless access and analysis of analytics data.

Discover how their self-service platform empowers teams to leverage business intelligence effectively, and learn about the unique features that set Zenlytic apart from other analytics…

1 month, 2 weeks ago @ datascienceathome.com
Money, Cryptocurrencies, and AI: Exploring the Future of Finance with Chris Skinner (Ep. 234)

In this captivating podcast episode, join renowned financial expert Chris Skinner as he delves into the fascinating realm of the future of money.

From cryptocurrencies to government currencies, the metaverse to artificial intelligence (AI), Skinner explores the intricate interplay between technology and humanity.

Gain valuable insights as he defines the future of money, examines the potential impact of cryptocurrencies on traditional government currencies, and addresses the advantages and disadvantages of digital currencies.

Brace yourself for an enlightening discussion on the integration of AI in the financial sector and its potential impact on humanity.

Once subscribed, you get full acces…

2 months, 2 weeks ago @ datascienceathome.com
Debunking AGI Hype and Embracing Reality (Ep. 233)

In this thought-provoking episode, we sit down with the renowned AI expert, Filip Piekniewski, Ph.D., who fearlessly challenges the prevailing narratives surrounding artificial general intelligence (AGI) and singularity.

With a no-nonsense approach and a deep understanding of the field, Filip dismantles the hype and exposes some of the misconceptions about AI, LLMs, and AGI.

If you’re seeking a refreshingly pragmatic perspective on the future of AI, this episode is an absolute must-listen.

Filip Piekniewski Bio: Filip Piekniewski is a distinguished computer vision researcher and engineer, specializing in visual object tracking and perception.

He is known for his realistic perspective on AI, d…

2 months, 4 weeks ago @ datascienceathome.com
Full steam ahead! Unraveling Forward-Forward Neural Networks (Ep. 232)

In this exciting episode, we dive into the world of Forward-Forward Neural Networks, unveiling their mind-boggling power and potential.

Join us as we demystify these advanced AI algorithms and explore how they’re reshaping industries and revolutionizing machine learning.

From self-driving cars to personalized medicine, discover the cutting-edge applications that are propelling us into a new era of AI greatness.

Get ready to unlock the secrets of Forward-Forward Neural Networks and witness the future of artificial intelligence unfold before your eyes.

Don’t miss out – tune in now and be part of the AI revolution!

3 months ago @ datascienceathome.com
The LLM Battle Begins: Google Bard vs ChatGPT (Ep. 231)

Source: PCMag. Brace yourselves as we uncover the mind-blowing AI model, Google Bard, that’s poised to challenge ChatGPT and other conversational AI systems.

Join us as we explore the revolutionary features of Bard, its cutting-edge architecture, and its ability to generate human-like responses.

Discover why AI enthusiasts are buzzing with excitement.

References: [1] [2] [3]. Sponsors: Finally, a better way to do B2B research.

NewtonX: The World’s Leading B2B Market Research Company

3 months, 2 weeks ago @ datascienceathome.com
Unleashing the Force: Blending Neural Networks and Physics for Epic Predictions (Ep. 230)

Source: Physics Informed Deep Learning. In this enlightening episode of our podcast, we delve into the fascinating realm of Physics Informed Neural Networks (PINNs) and explore how they combine the extraordinary prediction capabilities of neural networks with the unparalleled accuracy of physics models.

Join us as we unravel the mysteries behind PINNs and their potential to revolutionize various scientific and engineering domains.

We’ll discuss the underlying principles that enable these networks to incorporate physical laws and constraints, resulting in enhanced predictions and a deeper understanding of complex systems.

Sponsors: This episode is supported by Mimecast – the email security solut…

3 months, 3 weeks ago @ datascienceathome.com
AI’s Impact on Software Engineering: Killing Old Principles? [RB] (Ep. 229)

In this episode, we dive into the ways in which AI and machine learning are disrupting traditional software engineering principles.

However, this reliance on AI can come at a cost to the tried-and-true methods of software engineering.

From real-time market data to sophisticated analytics, powerful trading tools, and more, Bloomberg engineers work with systems that operate at scale.

If you’re a software engineer looking for an exciting and fulfilling career, head over to bloomberg.com/careers to learn more.

Industries and governments around the world are fighting back, unveiling new regulations meant to better protect data against this rising threat.

4 months ago @ datascienceathome.com
Warning! Mathematical Mayhem Ahead: Demystifying Liquid Time-Constant Networks (Ep. 228)

Source: IEEE Spectrum. Hold on to your calculators and buckle up for a wild mathematical ride in this episode!

Brace yourself as we dive into the fascinating realm of Liquid Time-Constant Networks (LTCs), where mathematical content reaches new heights of excitement.

In this mind-bending adventure, we demystify the intricacies of LTCs: from complex equations to mind-boggling mathematical concepts, we break them down into digestible explanations.

References: https://www.science.org/doi/10.1126/scirobotics.adc8892 https://spectrum.ieee.org/liquid-neural-networks#toggle-gdpr

4 months, 1 week ago @ datascienceathome.com
Efficiently Retraining Language Models: How to Level Up Without Breaking the Bank (Ep. 227)

🎙️ In our latest podcast episode, we dive deep into the world of LoRA (Low-Rank Adaptation) for large language models (LLMs).

This groundbreaking technique is revolutionizing the way we approach language model training by leveraging low-rank approximations.

We’ll explore the ingenious strategies and practical methods that empower you to fine-tune your language models without breaking the bank.

Whether you’re a researcher, developer, or language model enthusiast, this episode is packed with invaluable insights.

Tune in and join the conversation as we unravel the secrets of LoRA (low-rank adaptation) and show you how to retrain LLMs on a budget.

4 months, 2 weeks ago @ datascienceathome.com
Revolutionize Your AI Game: How Running Large Language Models Locally Gives You an Unfair Advantage Over Big Tech Giants (Ep. 226)

This is the first episode about the latest trend in artificial intelligence that’s shaking up the industry – running large language models locally on your machine.

This new approach allows you to bypass the limitations and constraints of cloud-based models controlled by big tech companies, and take control of your own AI journey.

We’ll delve into the benefits of running models locally, such as increased speed, improved privacy and security, and greater customization and flexibility.

We’ll also discuss the technical requirements and considerations for running these models on your own hardware, and provide practical tips and advice to get you started.

Join us as we uncover the secrets t…

4 months, 3 weeks ago @ datascienceathome.com