Peter DeCaprio: Why I Hate Various Types of Machine Learning and Real-Time Applications

Peter DeCaprio

Machine learning is currently a very important topic in industry. People at various companies aspire to learn and apply machine learning, either for their research or to find better solutions to their business problems, says Peter DeCaprio. However, I believe that people should be careful when selecting which technology to learn, because the choice affects their career and life long after formal education and training end.

The purpose of this post is not only to discourage learners from choosing certain types of machine learning merely because they are trendy, but also to show how one can still benefit from such choices (in terms of future job prospects). The target audience includes high school students (especially those interested in science), university students, employees at IT companies and startups, and people who follow machine learning news (e.g., tech journalists and bloggers).

Machine Learning: Better or Worse in the Industry?

The original purpose of machine learning is to make life easier for scientists and engineers who solve difficult problems in fields such as computer science, engineering, economics, medicine, and physics. The following sections provide insights into how you can benefit from knowing certain types of machine learning (in terms of industry). If you are not familiar with certain types of machine learning (e.g., deep learning), it would be wise to stop reading this article now, because the later sections might confuse you even more. If you think that all aspects of machine learning (such as theory building) can be learned in an academic environment with a formal degree, then you are gravely mistaken. I am sorry for such harsh language, but that is how it is, says Peter DeCaprio.

Computer Vision: The Holy Grail of AI and Machine Learning

Computer vision, including methods based on neural networks, has probably been around for more than 50 years, although most people became aware of it only during the last decade. Its popularity grew rapidly within the last few years, after the breakthroughs in image recognition by state-of-the-art convolutional neural networks [1]. It all started with AlexNet [2] in 2012, which used deep neural networks to achieve far better image classification (and, eventually, object localization) results than earlier approaches. Claims of surpassing human accuracy on this benchmark came only with later networks. Since then, many companies (e.g., Google) have invested heavily in computer vision technologies for developing self-driving cars and providing better solutions to their customers (e.g., shopping suggestions).

The amount of investment in this area is huge (more than $1 billion by Apple alone [3]), which clearly shows the impact of computer vision on different businesses in recent years. However, it also shows how machine learning can help researchers solve very difficult problems without having to spend large sums of money on scientific experiments alone. (I am not discouraging scientific efforts here, because they are crucial, but some people seem to forget about them when talking about future expectations.)

Computer vision includes two main topics: image recognition and object localization. Image recognition is the problem of classifying images into different categories (e.g., “dog”, “cat”, etc.), while object localization is the problem of determining where in an image a certain object can be found (e.g., find all dogs in a photo of my local park). I would like to talk about object localization specifically, because it has two main tasks: bounding box regression and image segmentation, explains Peter DeCaprio.
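To make bounding box regression a little more concrete, here is a minimal sketch in plain Python. It assumes boxes are stored as (x_min, y_min, x_max, y_max) tuples, which is only one common convention, and the function names `iou` and `smooth_l1` are illustrative rather than taken from any particular library. It shows two quantities that are often used with box predictions: intersection over union (IoU) to measure overlap, and a smooth L1 loss on the coordinates.

```python
from typing import Tuple

# Assumed convention for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def smooth_l1(pred: Box, target: Box, beta: float = 1.0) -> float:
    """Smooth L1 loss summed over the four box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        # Quadratic near zero, linear for larger errors.
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total


if __name__ == "__main__":
    ground_truth = (10.0, 10.0, 50.0, 50.0)
    prediction = (12.0, 8.0, 48.0, 52.0)
    print("IoU:", iou(prediction, ground_truth))
    print("Smooth L1:", smooth_l1(prediction, ground_truth))
```

A regression model would be trained to lower the coordinate loss, while IoU is typically used to judge whether a predicted box counts as a correct detection.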

When it comes to bounding box regression, there are three classes of learning approaches, namely supervised learning, unsupervised learning, and semi-supervised learning. The first two are widely used in industry for training on both small/medium-sized datasets (e.g., ImageNet [4]) and larger datasets with millions of images spanning more than 1,000 categories (e.g., at Facebook), which are then used to train commercial off-the-shelf object recognition systems. It is interesting that even though unsupervised learning is widely used when training supervised models, it is rarely applied to purely unsupervised tasks such as identifying objects in images. This is especially true in industry, where the main goal of computer vision is to automate human tasks (i.e., let robots do manual work).

Semi-supervised learning methods include knowledge distillation [5] and data reweighting [6], both of which can be easily implemented using machine learning frameworks such as TensorFlow. Peter DeCaprio says knowledge distillation allows training a model on a large dataset (e.g., ImageNet) with millions of images, and then using the trained model's predictions to guide training on a smaller dataset (e.g., a few thousand images) in order to achieve better performance for object recognition. Data reweighting is an interesting method because it can be used for both supervised and unsupervised tasks. This means that it is possible to use it for both image segmentation and bounding box regression.
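As a rough illustration of the idea behind knowledge distillation, the sketch below computes a commonly used distillation loss in plain Python: a "student" model is pushed toward the softened predictions of an already trained "teacher" model, while still being scored against the true label. The function names, the `temperature` and `alpha` parameters, and the example logits are assumptions made for this sketch and are not taken from [5] or from TensorFlow.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Softmax with a temperature; higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    true_label: int,
    temperature: float = 2.0,  # hypothetical default
    alpha: float = 0.5,        # hypothetical weight between soft and hard terms
) -> float:
    """Weighted sum of a soft (teacher) term and a hard (label) term."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft term: KL divergence from teacher to student at the given temperature.
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # Hard term: cross-entropy of the student against the true label.
    ce = -math.log(softmax(student_logits)[true_label])
    # The T^2 factor keeps the soft term on a comparable scale across temperatures.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


if __name__ == "__main__":
    teacher = [1.0, 2.0, 5.0]
    student = [0.5, 1.5, 2.0]
    print(distillation_loss(student, teacher, true_label=2))
```

In a real training loop this loss would be computed per batch inside a framework such as TensorFlow and minimized with respect to the student's parameters only.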

Conclusion:

There are many different algorithms for performing both image segmentation and bounding box regression. This means that there is no single solution to the object localization problem that is used by all companies involved in the machine learning business.

Now it’s time to talk about real-time applications, because this term has become very popular lately among computer vision researchers as well as my blog readers (I am mentioning your names here: Michał, Oleksandr, and Wojciech). Peter DeCaprio agrees that it would be great if we could build powerful models on GPUs and deliver better experiences to our users. But we also need to keep in mind that modern CPUs do not work very efficiently with large images, such as those produced by high-resolution cameras.
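To give a feel for why large images are a problem for real-time work, here is a small back-of-the-envelope sketch in Python. The frame rate, the image sizes, and the helper names are all assumptions chosen for illustration. It computes how much time is available per frame at a target frame rate and how many more pixels a high-resolution photo contains than a small model input.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / target_fps


def megapixels(width: int, height: int) -> float:
    """Image size in millions of pixels."""
    return width * height / 1_000_000


def meets_budget(per_frame_ms: float, target_fps: float) -> bool:
    """True if the measured per-frame time fits within the frame budget."""
    return per_frame_ms <= frame_budget_ms(target_fps)


if __name__ == "__main__":
    # Hypothetical sizes: a 4000x3000 photo versus a 224x224 model input.
    photo = megapixels(4000, 3000)
    small = megapixels(224, 224)
    print("Budget at 30 FPS (ms):", frame_budget_ms(30))
    print("Photo has", photo / small, "times as many pixels as the small input")
    print("50 ms per frame fits 30 FPS:", meets_budget(50.0, 30))
```

Per-frame work usually grows with the pixel count, which is why images are commonly downscaled before they reach a model in a real-time pipeline.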