Episode

#206 – Ishan Misra: Self-Supervised Deep Learning in Computer Vision
2:35:49
Published: Sat Jul 31 2021
Description

Ishan Misra is a research scientist at FAIR working on self-supervised visual learning. Please support this podcast by checking out our sponsors:
- Onnit: https://lexfridman.com/onnit to get up to 10% off
- The Information: https://theinformation.com/lex to get 75% off first month
- Grammarly: https://grammarly.com/lex to get 20% off premium
- Athletic Greens: https://athleticgreens.com/lex and use code LEX to get 1 month of fish oil

EPISODE LINKS:
Ishan's twitter: https://twitter.com/imisra_
Ishan's website: https://imisra.github.io
Ishan's FAIR page: https://ai.facebook.com/people/ishan-misra/

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips

SUPPORT & CONNECT:
- Check out the sponsors above, it's the best way to support this podcast
- Support on Patreon: https://www.patreon.com/lexfridman
- Twitter: https://twitter.com/lexfridman
- Instagram: https://www.instagram.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/lexfridman
- Medium: https://medium.com/@lexfridman

OUTLINE:
Here's the timestamps for the episode. On some podcast players you should be able to click the timestamp to jump to that time.
(00:00) - Introduction
(07:49) - Self-supervised learning
(16:24) - Self-supervised learning is the dark matter of intelligence
(20:17) - Categorization
(28:50) - Is computer vision still really hard?
(32:35) - Understanding Language
(42:14) - Harder to solve: vision or language
(48:59) - Contrastive learning & energy-based models
(52:59) - Data augmentation
(57:19) - Fixed audio spike by lowering sound with pen tool
(1:05:33) - Real data vs. augmented data
(1:09:16) - Non-contrastive learning energy based self supervised learning methods
(1:12:54) - Unsupervised learning (SwAV)
(1:15:37) - Self-supervised Pretraining (SEER)
(1:20:44) - Self-supervised learning (SSL) architectures
(1:26:43) - VISSL pytorch-based SSL library
(1:29:38) - Multi-modal
(1:37:06) - Active learning
(1:42:45) - Autonomous driving
(1:54:12) - Limits of deep learning
(1:58:19) - Difference between learning and reasoning
(2:03:26) - Building super-human AI
(2:11:14) - Most beautiful idea in self-supervised learning
(2:15:02) - Simulation for training AI
(2:18:27) - Video games replacing reality
(2:19:40) - How to write a good research paper
(2:24:08) - Best programming language for beginners
(2:25:01) - PyTorch vs TensorFlow
(2:28:26) - Advice for getting into machine learning
(2:30:31) - Advice for young people
(2:32:58) - Meaning of life

Chapters
00:00 - 02:00 (02:00)
Self-Supervised Learning
Summary

Ishan Misra, research scientist at Facebook AI Research, discusses self-supervised machine learning in computer vision, including the use of transformers and self-attention in language models.

Episode
#206 – Ishan Misra: Self-Supervised Deep Learning in Computer Vision
Podcast
Lex Fridman Podcast
02:00 - 06:16 (04:15)
Deep work
Summary

This podcast discusses the benefits of deep work sessions when tackling specific problems that require depth versus breadth, and provides tips for improving writing and thinking. An example of the hard problem of computer vision and cautiousness is used, with the word "banana" being the canonical example at its core.

06:16 - 12:40 (06:23)
Machine Learning
Summary

The podcast discusses different learning paradigms including self-supervised and semi-supervised learning, which overcome some of the challenges of traditional supervised learning.

12:40 - 20:06 (07:26)
Machine Learning
Summary

There is growing acceptance that self-supervised learning is likely to play an important role in future machine learning algorithms. It is a powerful way to learn common sense about the world, and it sidesteps the difficulty of labeling data.

20:06 - 26:03 (05:56)
Problem Solving
Summary

The usefulness of categorization and its limitations in problem-solving are discussed, with a focus on the role of self-supervised learning versus supervised learning.

26:03 - 31:26 (05:23)
AI
Summary

The podcast discusses the importance of common sense in building a deep understanding of the world and how self-supervised learning can play a role in achieving this understanding.

31:26 - 36:37 (05:11)
Self-Supervised Learning
Summary

Self-supervised learning can be useful for many tasks, particularly prior to the point where the machine needs to communicate with a human. By having words in the same context, useful information can be learned about how words are related.

36:39 - 42:53 (06:13)
Neural Networks and Natural Language Processing
Summary

Neural networks for natural language processing and image recognition have improved by exploiting context: wide context to understand a word, or local context to understand a pattern in an image. Scaling up data and using better neural network architectures have further improved predictive power.

42:53 - 53:07 (10:14)
Machine Learning
Summary

This podcast explores how modern machine learning methods such as GANs, VAEs, and contrastive models are related through a common language and energy function.

53:07 - 58:25 (05:17)
Data Augmentation
Summary

This podcast discusses the concept of data augmentation, which involves perturbing and augmenting data to improve a neural network's performance. It explores the idea of incorporating wild and physically consistent data, and the importance of feature vectors.
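As a rough illustration of what "perturbing and augmenting data" can look like, here is a minimal sketch in plain Python. It treats an image as a nested list of pixel values and produces two randomly perturbed views of it. The helper names (`random_crop`, `horizontal_flip`, `augment`) and the choice of crop plus flip are assumptions made for this example, not the specific augmentations discussed in the episode.

```python
import random


def random_crop(image, crop_h, crop_w, rng):
    """Return a crop_h x crop_w window taken at a random position."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_h)   # randint is inclusive on both ends
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def augment(image, crop_h, crop_w, rng):
    """Hypothetical augmentation pipeline: random crop, then a coin-flip mirror."""
    view = random_crop(image, crop_h, crop_w, rng)
    if rng.random() < 0.5:
        view = horizontal_flip(view)
    return view


rng = random.Random(0)
# Toy 4x4 "image" whose pixel values are just 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
view_a = augment(image, 3, 3, rng)
view_b = augment(image, 3, 3, rng)
```

In a self-supervised setup, two such views of the same image would typically be fed to the network, which is trained to produce similar feature vectors for both.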

58:25 - 1:05:22 (06:56)
Machine Learning
Summary

This podcast discusses the importance and potential benefits of data augmentation in machine learning, particularly in the context of medical imaging. The speaker argues that incorporating data augmentation into the learning process can improve model performance and accuracy.

1:05:22 - 1:15:27 (10:05)
Learning Algorithms
Summary

The success of learning algorithms for vision is heavily dependent on good data augmentation, even with an infinite source of image data. Without it, neural networks may not learn well and struggle to differentiate between different images.

1:15:27 - 1:21:17 (05:49)
Self-Supervised Learning
Summary

The use of uncurated data for self-supervised learning presents challenges due to the inherent biases of photographers, and the reliance on data augmentation techniques designed for ImageNet.

1:21:17 - 1:30:44 (09:27)
Neural Network
Summary

This podcast discusses the efficiency of the neural network architectures used for self-supervised learning, which can fit large models on a single GPU through efficient use of memory. The team puts its self-supervised learning methods into VISSL, a PyTorch-based SSL library.

1:30:44 - 1:35:19 (04:34)
Multimodal Learning
Summary

The idea of multimodal learning is to take audio and video signals and learn a common embedding space where the two modalities can be closely related. By doing so, the learned representation can be used to recognize human actions and different types of sounds in downstream tasks.
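One common way to learn such a shared embedding space is a contrastive objective, sketched below in plain Python. The summary does not say which loss is used in the work discussed, so the InfoNCE-style loss, the `temperature` value, and the function names here are assumptions for illustration only. Each audio embedding is pulled toward the video embedding from the same clip and pushed away from the others.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(audio, video, temperature=0.1):
    """InfoNCE-style loss: audio[i] and video[i] are assumed to come from the same clip."""
    total = 0.0
    for i, a in enumerate(audio):
        logits = [cosine(a, v) / temperature for v in video]
        # Numerically stable log-sum-exp over all candidate videos.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the matching video.
        total += log_denominator - logits[i]
    return total / len(audio)


# Toy 2-D embeddings for two clips.
audio = [[1.0, 0.0], [0.0, 1.0]]
video_aligned = [[0.9, 0.1], [0.1, 0.9]]
video_shuffled = video_aligned[::-1]

loss_aligned = contrastive_loss(audio, video_aligned)
loss_shuffled = contrastive_loss(audio, video_shuffled)
```

The loss is small when matching audio and video embeddings point in the same direction and large when they are mismatched, which is what drives the two modalities toward a common space.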

1:35:20 - 1:42:50 (07:29)
Active Learning
Summary

In order to ask good questions about an image, one must first understand something about the image. Active learning involves interactive exploration of the data, and this plays an important part in the solution to intelligence.

1:42:50 - 1:53:06 (10:15)
Autonomous driving
Summary

This episode discusses the successes and challenges of computer vision-based autonomous driving systems, particularly on highways and freeways, and the need for AGI to overcome the limitations of edge cases. Also, possible applications of self-supervised learning in this context are explored.

1:53:06 - 1:59:20 (06:14)
Self-Supervised Learning, Deep Learning
Summary

The limits of self-supervised learning and deep learning primarily involve the need for an interface to communicate with humans and the lack of guarantees in the system.

1:59:20 - 2:09:43 (10:22)
Consciousness
Summary

The concept of consciousness plays a crucial role in our ability to connect with other beings and perceive the world around us. In order to truly interact with the physical world, we may need to find a way to directly engage with it rather than relying solely on passive computer vision.

2:09:44 - 2:19:39 (09:54)
Virtual Reality
Summary

One of the most fundamental problems in computer graphics is accurately modeling how light reflects off objects. Concepts such as object permanence, which emerge from images or videos, can also be incorporated into virtual reality games to create more realistic worlds that people would want to remain in.

2:19:40 - 2:24:46 (05:05)
Research Paper
Summary

Starting early and focusing on one interesting problem can help generate better ideas. When it comes to papers, it is important to avoid cramming too many things into one paper.

2:24:46 - 2:29:56 (05:09)
Python
Summary

Python is increasingly being used to teach programming and data science in universities, while its competition with R keeps both languages improving.

2:29:56 - 2:35:00 (05:04)
Failure
Summary

Failure is something that happens almost every day, especially in research, and it is important to keep pushing through and maintain a certain attitude in order to acquire the skills and discipline to succeed.

2:35:00 - 2:35:40 (00:40)
AI/Computer Vision
Summary

Ishan Misra discusses his research in computer vision at FAIR with Lex Fridman.
