Episode
#110 – Jitendra Malik: Computer Vision
Description
Jitendra Malik is a professor at Berkeley and one of the seminal figures in the field of computer vision, the kind before the deep learning revolution, and the kind after. He has been cited over 180,000 times and has mentored many world-class researchers in computer science.
Support this podcast by supporting our sponsors:
- BetterHelp: http://betterhelp.com/lex
- ExpressVPN: https://www.expressvpn.com/lexpod
If you would like to get more information about this podcast go to https://lexfridman.com/ai or connect with @lexfridman on Twitter, LinkedIn, Facebook, Medium, or YouTube where you can watch the video versions of these conversations. If you enjoy the podcast, please rate it 5 stars on Apple Podcasts, follow on Spotify, or support it on Patreon.
Here's the outline of the episode. On some podcast players you should be able to click the timestamp to jump to that time.
OUTLINE:
00:00 - Introduction
03:17 - Computer vision is hard
10:05 - Tesla Autopilot
21:20 - Human brain vs computers
23:14 - The general problem of computer vision
29:09 - Images vs video in computer vision
37:47 - Benchmarks in computer vision
40:06 - Active learning
45:34 - From pixels to semantics
52:47 - Semantic segmentation
57:05 - The three R's of computer vision
1:02:52 - End-to-end learning in computer vision
1:04:24 - 6 lessons we can learn from children
1:08:36 - Vision and language
1:12:30 - Turing test
1:16:17 - Open problems in computer vision
1:24:49 - AGI
1:35:47 - Pick the right problem
Chapters
The speaker is promoting ExpressVPN and BetterHelp in this podcast episode, encouraging listeners to visit their respective websites and make a purchase.
00:00 - 03:12 (03:12)
Summary
The speaker is promoting ExpressVPN and BetterHelp in this podcast episode, encouraging listeners to visit their respective websites and make a purchase. They are also discussing their personal demons and how they deal with them.
Podcast: Lex Fridman Podcast
The issue with computer vision and driving is not just about pattern recognition or data but also about physical action, interaction with the environment, and creating safety policies to prevent accidents.
03:12 - 16:50 (13:38)
Summary
The issue with computer vision and driving is not just about pattern recognition or data but also about physical action, interaction with the environment, and creating safety policies to prevent accidents. AI needs to learn what a human driver learns from experience, including safety and ethical concerns as well as the details of traffic rules and human-road dynamics.
The process of accumulating knowledge through neural networks has some challenges, including power consumption and the need for significant evolution in learning techniques.
16:50 - 24:41 (07:51)
Summary
The process of accumulating knowledge through neural networks has some challenges, including power consumption and the need for significant evolution in learning techniques.
Computer vision aims to build a model of the external world to aid action, just like humans.
24:42 - 32:36 (07:54)
Summary
Computer vision aims to build a model of the external world to aid action, just like humans. Applications of computer vision range from analyzing dynamic videos to robotic vision.
The process of a child learning how to act in the world is different from current computer vision technology, which is mostly focused on short-term video understanding.
32:36 - 42:33 (09:56)
Summary
The process of a child learning how to act in the world is different from current computer vision technology, which is mostly focused on short-term video understanding. Mimicking a child's learning process in computer vision can lead to progress in areas such as autonomous driving and robotics.
The use of computationally intensive models to build causal models for vision can provide increasingly realistic scene understanding, aided by static and dynamic scene understanding and the success of image compression.
42:33 - 49:29 (06:56)
Summary
The use of computationally intensive models to build causal models for vision can provide increasingly realistic scene understanding, aided by static and dynamic scene understanding and the success of image compression.
This podcast discusses the differences between recognition and segmentation in computer vision, highlighting the importance of segmentation as a way to define objects without labeling or understanding their properties.
49:29 - 59:08 (09:38)
Summary
This podcast discusses the differences between recognition and segmentation in computer vision, highlighting the importance of segmentation as a way to define objects without labeling or understanding their properties.
The concept of end-to-end learning is often limited to end-to-end supervised learning for a specific task, which is a restricted view of the learning process.
59:08 - 1:07:31 (08:23)
Summary
The concept of end-to-end learning is often limited to end-to-end supervised learning for a specific task, which is a restricted view of the learning process. This approach divides vision into separate modules for low-, mid-, and high-level vision without considering the nuances of learning.
The evolution of bipedalism in humans can be traced back to five million years ago when the first bipedal primate appeared.
1:07:32 - 1:12:13 (04:41)
Summary
The evolution of bipedalism in humans can be traced back to five million years ago when the first bipedal primate appeared. This development freed the hands for manipulation and preceded the increase in brain size.
The Turing test, proposed in 1950, may have solved a certain problem back then, but today the focus should shift more toward manipulation, navigation, visual scene understanding, and reading- and comprehension-based tasks as opposed to long-range video understanding.
1:12:13 - 1:18:46 (06:32)
Summary
The Turing test, proposed in 1950, may have solved a certain problem back then, but today the focus should shift more toward manipulation, navigation, visual scene understanding, and reading- and comprehension-based tasks as opposed to long-range video understanding.
The podcast discusses how the known unknowns in computer vision restrict the richness of 3D understanding and how narratives or stories are used to explain the workings of a black box.
1:18:47 - 1:27:36 (08:49)
Summary
The podcast discusses how the known unknowns in computer vision restrict the richness of 3D understanding and how narratives or stories are used to explain the workings of a black box.
The podcast explores the effects of artificial intelligence on society and its unknown repercussions on scientific research.
1:27:37 - 1:36:02 (08:25)
Summary
The podcast explores the effects of artificial intelligence on society and its unknown repercussions on scientific research. The speaker expresses their fortune in joining the field when it was in its early stages, and reflects on the uncertainty of AI's impact on the future.
The speaker believes that advising students on which problems are good ones to work on is an important skill to have.
1:36:03 - 1:41:45 (05:42)
Summary
The speaker believes that advising students on which problems are good ones to work on is an important skill to have. He thinks that smart students coming into Berkeley are very profound thinkers who can go deep down one particular path.