#110 – Jitendra Malik: Computer Vision

Lex Fridman Podcast

1:42:04

Published: Tue Jul 21 2020

Description

Jitendra Malik is a professor at Berkeley and one of the seminal figures in the field of computer vision, the kind before the deep learning revolution, and the kind after. He has been cited over 180,000 times and has mentored many world-class researchers in computer science.

Support this podcast by supporting our sponsors:
- BetterHelp: http://betterhelp.com/lex
- ExpressVPN: https://www.expressvpn.com/lexpod

If you would like to get more information about this podcast go to https://lexfridman.com/ai or connect with @lexfridman on Twitter, LinkedIn, Facebook, Medium, or YouTube where you can watch the video versions of these conversations. If you enjoy the podcast, please rate it 5 stars on Apple Podcasts, follow on Spotify, or support it on Patreon.

Here's the outline of the episode. On some podcast players you should be able to click the timestamp to jump to that time.

OUTLINE:
00:00 - Introduction
03:17 - Computer vision is hard
10:05 - Tesla Autopilot
21:20 - Human brain vs computers
23:14 - The general problem of computer vision
29:09 - Images vs video in computer vision
37:47 - Benchmarks in computer vision
40:06 - Active learning
45:34 - From pixels to semantics
52:47 - Semantic segmentation
57:05 - The three R's of computer vision
1:02:52 - End-to-end learning in computer vision
1:04:24 - 6 lessons we can learn from children
1:08:36 - Vision and language
1:12:30 - Turing test
1:16:17 - Open problems in computer vision
1:24:49 - AGI
1:35:47 - Pick the right problem

Chapters

Ad for ExpressVPN and BetterHelp

The speaker is promoting ExpressVPN and BetterHelp in this podcast episode, encouraging listeners to visit their respective websites and make a purchase.

00:00 - 03:12 (03:12)

ExpressVPN, BetterHelp

Summary

The speaker is promoting ExpressVPN and BetterHelp in this podcast episode, encouraging listeners to visit their respective websites and make a purchase. They are also discussing their personal demons and how they deal with them.

Computer Vision and Driving: Problems and Possibilities

The issue with computer vision and driving is not just about pattern recognition or data but also about physical action, interaction with the environment, and creating safety policies to prevent accidents.

03:12 - 16:50 (13:38)

computer vision

Summary

The issue with computer vision and driving is not just about pattern recognition or data but also about physical action, interaction with the environment, and creating safety policies to prevent accidents. AI needs to learn what a human driver learns from experience including safety and ethical concerns, as well as the details of traffic rules and human-road dynamics.

The Challenges of AI in Accumulating Knowledge through Neural Networks

The process of accumulating knowledge through neural networks has some challenges, including power consumption and the need for significant evolution in learning techniques.

16:50 - 24:41 (07:51)

Artificial Intelligence


The goal of computer vision is to sense the world to aid action

Computer vision aims to build a model of the external world to aid action, just like humans.

24:42 - 32:36 (07:54)

Computer Vision

Summary

Computer vision aims to build a model of the external world to aid action, just like humans. Applications of computer vision range from analyzing dynamic videos to robotic vision.

Challenges in computer vision that mimic a child's learning process

The process of a child learning how to act in the world is different from current computer vision technology, which is mostly focused on short-term video understanding.

32:36 - 42:33 (09:56)

Computer Vision

Summary

The process of a child learning how to act in the world is different from current computer vision technology, which is mostly focused on short-term video understanding. Mimicking a child's learning process in computer vision can lead to progress in areas such as autonomous driving and robotics.

Compute-Intensive Models for Understanding Vision

Using compute-intensive models to build causal models for vision can yield increasingly realistic scene understanding, drawing on both static and dynamic scene understanding and on the success of image compression.

42:33 - 49:29 (06:56)

Computer Science


Understanding the Importance of Segmentation in Computer Vision

This podcast discusses the differences between recognition and segmentation in computer vision, highlighting the importance of segmentation as a way to define objects without labeling or understanding their properties.

49:29 - 59:08 (09:38)

Computer Vision


End-to-End Learning and Its Limitations in Machine Learning

The concept of end-to-end learning is often limited to end-to-end supervised learning for a specific task, which is a restricted view of the learning process.

59:08 - 1:07:31 (08:23)

Machine Learning

Summary

The concept of end-to-end learning is often limited to end-to-end supervised learning for a specific task, which is a restricted view of the learning process. This approach divides vision into different modules for lower, mid and high levels of vision without considering the nuances of learning.

Evolution of Bipedalism in Humans

The evolution of bipedalism in humans can be traced back to five million years ago when the first bipedal primate appeared.

1:07:32 - 1:12:13 (04:41)

Evolution, Bipedalism

Summary

The evolution of bipedalism in humans can be traced back to five million years ago when the first bipedal primate appeared. This development freed the hands for manipulation and preceded the growth in brain size.

The Turing Test and Its Limitations Today

The Turing test, proposed in 1950, may have addressed a particular problem of its time, but today the focus should shift toward tasks based on manipulation, navigation, visual scene understanding, and reading comprehension, as opposed to long-range video understanding.

1:12:13 - 1:18:46 (06:32)


Understanding the Known Unknowns in Computer Vision Systems

The podcast discusses how the known unknowns in computer vision restrict the richness of 3D understanding and how narratives or stories are used to explain the workings of a black box.

1:18:47 - 1:27:36 (08:49)

Computer Vision


Being a Scientist when AI is Transforming the Future

The podcast explores the effects of artificial intelligence on society and its unknown repercussions on scientific research.

1:27:37 - 1:36:02 (08:25)

Summary

The podcast explores the effects of artificial intelligence on society and its unknown repercussions on scientific research. The speaker expresses their fortune in joining the field when it was in its early stages, and reflects on the uncertainty of AI's impact on the future.

Advising students to solve good problems

The speaker believes that advising students on what are good problems is an important skill to have.

1:36:03 - 1:41:45 (05:42)

Education

Summary

The speaker believes that advising students on what are good problems is an important skill to have. He thinks that smart students coming into Berkeley are very profound thinkers who can go deep down one particular path.