What is Computer Vision?

2023-07-08

Illustration: © IoT For All

In the 2021 movie The Matrix: Resurrections, Neo, the protagonist fighting against a computer-generated world called the Matrix, and his team of fighters gather around a set of computers, trying to locate Trinity, Neo’s partner, who is still stuck in the Matrix. Neo points to the distinctive green code that characterizes the Matrix on a computer screen and says, “That’s her, on the bridge.” The camera then cuts to Trinity riding her motorcycle across a bridge, unaware that she is part of a simulation. How was the computer able to “see” Trinity on the bridge? That’s not the concern of the movie (they need to save humanity), but it is the focus of this article.

What we see play out in The Matrix: Resurrections is called “computer vision.” This technology is what allows computers to “see” and make sense of visual information. Computer vision relies on a combination of algorithms and artificial intelligence to process information like shapes, colors, and textures to understand what is before it. The computer in The Matrix was able to spot Trinity by matching visual cues such as her hair, her facial structure, and her clothes to the person known as Trinity, much as we would use the same cues to identify a person – even if it sometimes takes us longer than a computer!

Although the computer was able to recognize Trinity inside the Matrix in a matter of seconds, there were a few steps that took place before the computer could correctly identify her.

  1. Image Acquisition: Computer vision requires visual input. In the movie, we see that Neo and his friends have tapped into the many cameras and sensors in the unnamed city inside the Matrix where Trinity is living. 

  2. Preprocessing: The visual input may need some tweaking to enhance its quality before analysis can begin. This can include resizing, noise reduction, and more. In the movie, we see a hint of this process when the camera starts to zoom in on Trinity riding her motorcycle. The first shot of her is an aerial view, but the picture becomes more focused as the camera lands squarely on her.

  3. Feature Extraction: Once the visual input has been “cleaned,” algorithms start extracting relevant visual attributes to help understand the content of the image or video.

  4. Feature Representation: The visual features or attributes extracted must then be represented in a way that can be processed by machine learning algorithms. In the movie, we see that the visual input of Trinity is represented on the computer screen by a series of numbers and characters.

  5. Machine Learning and Training: In this step, the computer is trained to understand the characteristics of new visual inputs based on the inputs it has previously been fed. For example, a computer can learn to identify a human it has never seen before if it has been trained on the features that make up a human, such as noses, ears, and arms.

  6. Recognition and Interpretation: A trained computer vision system can now recognize new, unseen visual data. In the movie, we see that the computer has flagged the representation of Trinity with a gold outline.
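
To make the six steps above more concrete, here is a minimal Python sketch of the same pipeline. It is only an illustration under heavy assumptions: the "images" are tiny synthetic striped grids rather than camera footage, the two features are simple brightness differences between neighbouring pixels, and the "training" is a basic nearest-centroid classifier. All function names (`make_image`, `preprocess`, `extract_features`, `train`, `recognize`) are hypothetical and chosen for this example; real systems typically rely on far richer features and learned models.

```python
import random

def make_image(kind, size=8, noise=0.1, rng=random):
    """Step 1 stand-in: synthesize a small grayscale image (values 0..1)
    with horizontal or vertical stripes plus a little random noise."""
    image = []
    for row in range(size):
        line = []
        for col in range(size):
            stripe = row if kind == "horizontal" else col
            base = 1.0 if stripe % 2 == 0 else 0.0
            value = base + rng.uniform(-noise, noise)
            line.append(min(1.0, max(0.0, value)))
        image.append(line)
    return image

def preprocess(image):
    """Step 2: rescale pixel values so they span the full 0..1 range."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid dividing by zero on a flat image
    return [[(p - lo) / span for p in row] for row in image]

def extract_features(image):
    """Steps 3 and 4: measure how much brightness changes between
    neighbouring pixels and represent the result as a numeric vector."""
    rows, cols = len(image), len(image[0])
    down = [abs(image[r + 1][c] - image[r][c])
            for r in range(rows - 1) for c in range(cols)]
    across = [abs(image[r][c + 1] - image[r][c])
              for r in range(rows) for c in range(cols - 1)]
    return [sum(down) / len(down), sum(across) / len(across)]

def train(examples):
    """Step 5: average the feature vectors seen for each label."""
    sums, counts = {}, {}
    for label, image in examples:
        vec = extract_features(preprocess(image))
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}

def recognize(model, image):
    """Step 6: return the label whose average vector is closest."""
    vec = extract_features(preprocess(image))
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(model, key=lambda label: distance(model[label]))

rng = random.Random(0)  # fixed seed so the example is repeatable
training_set = [(kind, make_image(kind, rng=rng))
                for kind in ("horizontal", "vertical") for _ in range(5)]
model = train(training_set)
print(recognize(model, make_image("vertical", rng=rng)))  # prints "vertical"
```

The point of the sketch is the shape of the pipeline rather than the specific features: each stage hands a cleaner, more compact description of the image to the next, until a new image can be compared against what the system has already learned.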

At this point, you may understand computer vision very well – and that is great! But you may be wondering, is this technology only available in the science fiction world of a movie? The answer is no! Computer vision has many uses and applications in our own world.

Let’s start with a very simple and ubiquitous example of computer vision in the real world. You’re probably reading this article on your computer or your phone. If you’re using your phone, you probably had to unlock it using a passcode, a PIN, or face ID. If you used face identification, you saw computer vision in action! Your front-facing camera captured an image of your face, identified key features, then compared them to the face data you registered when setting up face ID. Once computer vision has identified the image as a match for the authorized user (you), the phone unlocks.
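
The comparison step in face unlock can be sketched in a few lines of Python. This is a simplified illustration, not how any particular phone implements it: it assumes each face has already been reduced to a short list of numbers (a feature vector), and the vectors, the `is_authorized` helper, and the 0.95 threshold are all made up for the example.

```python
import math

def cosine_similarity(a, b):
    """Return how closely two feature vectors point in the same direction
    (1.0 means identical direction, 0.0 means unrelated)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector cannot match anything
    return dot / (norm_a * norm_b)

def is_authorized(enrolled, candidate, threshold=0.95):
    """Unlock only if the new face is similar enough to the enrolled one.
    The threshold is an illustrative value, not a real-world setting."""
    return cosine_similarity(enrolled, candidate) >= threshold

# Hypothetical feature vectors, invented for this example.
enrolled_face = [0.12, 0.80, 0.35, 0.44]
same_person = [0.10, 0.82, 0.33, 0.47]
someone_else = [0.90, 0.10, 0.05, 0.40]

print(is_authorized(enrolled_face, same_person))   # True
print(is_authorized(enrolled_face, someone_else))  # False
```

Raising the threshold makes the check stricter (fewer strangers accepted, but more retries for the owner), while lowering it does the opposite.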

Now, let’s try another example of computer vision, one that you may not interact with every day but that has slowly crept into many people’s weekly routines. You probably shop for groceries at least once a week. Does your grocery store have self-checkout kiosks? There’s typically a camera above the register, pointed right at you as you scan your groceries. Computer vision can be used to make sense of the camera feed from these self-checkout kiosks to identify and flag any suspicious behavior – such as someone pocketing an item without paying or looking around anxiously.

Here’s a third example of computer vision that, hopefully, you don’t interact with as much. Computer vision can be used in traffic cameras to detect violations like speeding and running red lights. Computer vision can then read the license plates of the offending cars so law enforcement can send tickets to the right people. If you have ever received a speeding ticket even though no police officer stopped you at the time, computer vision was probably at work.
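
As a rough illustration of the speeding check, the Python sketch below estimates a car's speed from two sightings of it in a camera feed. It assumes the hard parts are already done: the car has been detected in both frames and its position along the road has been converted from pixels to meters. The function names and the numbers are hypothetical, and reading the license plate is not attempted here.

```python
def speed_kmh(position_a_m, time_a_s, position_b_m, time_b_s):
    """Estimate speed from two sightings of the same car.
    Positions are in meters along the road, times in seconds."""
    elapsed = time_b_s - time_a_s
    if elapsed <= 0:
        raise ValueError("the second sighting must come after the first")
    meters_per_second = abs(position_b_m - position_a_m) / elapsed
    return meters_per_second * 3.6  # convert m/s to km/h

def is_speeding(speed, limit_kmh):
    """Flag the car if its estimated speed exceeds the posted limit."""
    return speed > limit_kmh

# Example: the car travels 25 meters between frames taken one second apart.
speed = speed_kmh(0.0, 0.0, 25.0, 1.0)
print(round(speed, 1))                   # 90.0
print(is_speeding(speed, limit_kmh=80))  # True
```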

So, The Matrix: Resurrections’ exploration of computer vision – even if it does not use the term – is rooted in the reality of our lives. We can use computer vision to help us unlock our phones, track goods in grocery stores, or help find loved ones.

  • Artificial Intelligence
  • Big Data
  • Data Analytics
  • Machine Learning
