Research on food safety sampling inspection system based on deep learning

With numerous successful applications in image processing, speech recognition, object detection, and other fields, deep learning (DL) has proven to be an advanced tool for big data analysis. It has recently been applied in food science and engineering as well. To our knowledge, this is the first survey of DL in the food domain. In this paper, we give a brief overview of DL, together with descriptions of the structure of some common deep neural network (DNN) architectures and training approaches. We reviewed hundreds of publications that used DL as a data processing method to address problems in the food domain, such as food identification, calorie estimation, quality detection of fruit, potatoes, meat, and aquatic products, the food supply chain, and food contamination. For each study, we examined the particular challenges, datasets, preprocessing techniques, networks and frameworks used, the performance achieved, and comparisons with other common solutions. We also examined the degree to which big data is being used in the food safety domain and found some positive developments. According to our results, DL outperforms other approaches such as manual feature extractors and traditional machine learning algorithms, and it is a promising technique for food quality and safety inspection.


Introduction
Every day, more individuals and economic sectors become involved in globalization. The transportation of goods is one of the sectors where the effect has been greatest (Carvalho, 2017). At the same time, food and feed safety is a highly sensitive issue: food must be safe for consumers (Ionel, 2018; Kah et al., 2019). The economic effects and health risks of recent food crises demonstrate the value of food safety. Walls et al. (2019) describe the decline in beef consumption after the mad cow disease epidemic. Panghal et al. (2018) demonstrate the importance of trustworthiness in public notices, as shown by the botched detection of E. coli-contaminated cucumbers. Food is also a major trade issue: according to the World Trade Organization (WTO), food and farm products account for roughly 10% of all exports. The European Commission in particular has made setting high food safety standards a top policy priority (Flynn et al., 2019; Jagadeesan et al., 2019). In 2002, the European Food Safety Authority (EFSA) was established to carry out this strategy by providing scientific guidance and communicating the risks associated with the food chain.
The ability to determine food attributes quickly, accurately, and automatically is a practical requirement in everyday life (Patterson & Gibson, 2017). Food characteristics have been measured using modern techniques such as electronic noses, computer vision, spectroscopy, and spectral imaging (Kamilaris & Prenafeta-Boldú, 2018). These methods can collect a large amount of digital data about food properties. Data processing is critical in these methods because the data contain a great deal of redundant and irrelevant information. How to handle such a volume of data and extract useful features from it is a pressing problem, as well as an obstacle to putting these techniques into application (APP) (Buduma & Locascio, 2017).
A vast amount of data is being generated around the world in almost all segments of society, including industry, health care, government, and research disciplines such as the natural sciences, life sciences, social sciences, humanities, and engineering. As more big data becomes accessible, it can be used to provide new information, improve decision-making, and improve the efficiency of products and services (Nogales et al., 2020). Machine learning methods find patterns in data by being trained on a historical dataset and then apply those patterns to new data to make automated predictions or decisions. Among machine learning methods, deep learning (DL) has been particularly successful in recent years, providing good results in prediction problems (Brei et al., 2020; Kloeckner et al., 2020). DL is carried out by deep neural networks (DNNs), a branch of artificial neural networks (ANNs) inspired by how neurons in the human brain behave. DNNs are models that can learn multiple levels of abstraction for data representations (Zhou et al., 2019). As noted above, sensing methods collect large amounts of digital data on food properties, much of it redundant, so extracting useful features is a critical step in putting these methods into practice. Principal component analysis (PCA), for example, is used to extract features (Affonso et al., 2017; Fan et al., 2019; Nijhawan et al., 2019; Yin & Zhao, 2016). DL is a set of machine learning algorithms that:
• use a cascade of multiple layers of non-linear processing units to extract and transform features, where each layer uses the output of the previous layer as input;
• learn in a supervised manner (such as classification) or an unsupervised manner (such as pattern analysis);
• learn multiple levels of representation corresponding to different levels of abstraction; these levels form a hierarchy of concepts.
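The layered processing described in the list above can be illustrated with a minimal sketch in plain Python. It assumes a tiny fully connected network (3 inputs, two hidden layers, 1 output); the weights in `LAYERS` and the input vector are arbitrary illustrative values, not taken from any of the reviewed studies. The point is only that each layer applies a non-linear transformation to the output of the previous layer.

```python
import math

def dense(inputs, weights, biases):
    """Affine map: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Non-linear activation used in the hidden layers."""
    return [max(0.0, v) for v in values]

def sigmoid(v):
    """Squashes the final score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, layers):
    """Feed x through the network; each layer consumes the previous layer's output."""
    activations = [x]
    for index, (weights, biases) in enumerate(layers):
        z = dense(activations[-1], weights, biases)
        is_last = index == len(layers) - 1
        activations.append([sigmoid(v) for v in z] if is_last else relu(z))
    return activations

# Hypothetical weights for a 3 -> 4 -> 2 -> 1 network (illustrative only).
LAYERS = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5], [-0.6, 0.1, 0.9], [0.2, 0.2, 0.2]],
     [0.0, 0.1, -0.1, 0.0]),
    ([[1.0, -1.0, 0.5, 0.3], [-0.4, 0.6, 0.2, 0.9]], [0.05, -0.05]),
    ([[0.7, -1.2]], [0.1]),
]

acts = forward([0.9, 0.1, 0.4], LAYERS)
score = acts[-1][0]  # output of the last layer, a value between 0 and 1
```

In a trained network the weights would be learned from data rather than fixed by hand; the list `acts` keeps every intermediate representation so that the successive levels can be inspected.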
Most modern DL approaches are based on ANNs, although they may also include propositional formulas or latent variables organized layer-wise in generative models, such as the nodes in deep belief networks (DBNs) and deep Boltzmann machines (DBMs). In deep learning, each level learns to transform its input data into a slightly more abstract and composite representation. In an image recognition application, the raw input can be a matrix of pixels; the first representational layer can abstract the pixels and encode edges; the second layer can compose and encode arrangements of edges; the third layer can encode the eyes and nose; and the fourth layer can recognize that the image contains a face. Importantly, a deep learning process can learn on its own which features to place at which level. Development in food production is a very lengthy process. Using artificial intelligence (AI), very large agricultural information databases can be generated in great detail and at high speed. Machine learning can be used to identify nutritious foods and spices and to develop cooking recipes. As such systems develop, there are many opportunities for preparing food with new flavors tailored to consumers' tastes.
Machine learning has been used in a variety of areas as a useful method for data analysis. Conventional machine learning methods normally require a manual feature extraction step because they cannot process raw natural data directly (Tsoumakas, 2019). With representation learning, by contrast, a computer can derive the features needed for detection, classification, or regression from raw data. Convolution is the main idea behind convolutional neural networks (CNNs), and how it is used is a major determinant of network performance. The first stage is the convolution layer, which extracts new features from the image using various kernels. It is followed by the max-pooling operation, which reduces the size of the feature maps and the number of network parameters. The output of this stage is converted to a one-dimensional vector and passed to the fully connected layers, where standard neural network algorithms are used. The convolution + max-pooling block can be repeated many times to build a deeper network, and the number of fully connected layers is also chosen by the user. CNNs are a class of DNNs mostly used for analyzing visual or spoken data in machine learning (Ciocca et al., 2017). A standard CNN structure for image classification is shown in Figure 1.
Fruit is a vital source of nutrition for humans. Fruit production and sales face the same issues as crop production and sales, such as pests, disease, and bruising. Fruit is also a high-value farm crop, so freshness, nutritional quality, and safety assurance must be considered as well. Fruit and vegetable quality identification is currently a popular and difficult research topic.
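The CNN pipeline described earlier in this section (convolution, max pooling, flattening, fully connected layer) can be sketched in plain Python as follows. This is a toy forward pass under stated assumptions, not the architecture of Figure 1: the 6x6 `IMAGE`, the two hand-written edge kernels, and the fully connected weights `FC_WEIGHTS` are all hypothetical, and a real CNN would learn its kernels and weights during training.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation of an image (list of rows) with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu2d(fmap):
    """Element-wise ReLU on a feature map."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    rows, cols = len(fmap) // size, len(fmap[0]) // size
    return [[max(fmap[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]

def flatten(fmaps):
    """Concatenate all feature maps into one one-dimensional vector."""
    return [v for fmap in fmaps for row in fmap for v in row]

def fully_connected(vector, weights, biases):
    """One dense layer producing a score per class."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]

# Hypothetical 6x6 image: dark left half, bright right half (a vertical edge).
IMAGE = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
KERNELS = [
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],    # responds to vertical edges
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],    # responds to horizontal edges
]
# Hypothetical weights: class 0 = "vertical edge", class 1 = "horizontal edge".
FC_WEIGHTS = [[0.1] * 4 + [-0.1] * 4, [-0.1] * 4 + [0.1] * 4]
FC_BIASES = [0.0, 0.0]

feature_maps = [max_pool(relu2d(conv2d(IMAGE, k))) for k in KERNELS]
features = flatten(feature_maps)             # 2 maps of 2x2 -> 8 values
logits = fully_connected(features, FC_WEIGHTS, FC_BIASES)
predicted = logits.index(max(logits))
```

Here the 6x6 input shrinks to a 4x4 map after the 3x3 convolution and to 2x2 after pooling, which shows how pooling reduces the number of values the fully connected layer has to handle.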
DL is part of a larger family of machine learning approaches based on learning data representations rather than task-specific algorithms. Learning may be supervised, semi-supervised, or unsupervised. DL architectures such as DNNs, recursive neural networks, and DBNs have been applied in areas such as natural language processing, computer vision, speech recognition, social media filtering, bioinformatics, drug design, machine translation, and board game programs, where they have produced results comparable to, and sometimes better than, those of human experts. DL models are loosely inspired by information processing and communication patterns in biological nervous systems, but they differ structurally and functionally from biological brains, which makes them inconsistent with the evidence from neuroscience (Koturwar & Merchant, 2017; Pujol et al., 2019).

Materials and methods
Food classification and identification are essential tasks that help people keep track of their daily diets. Food images are one of the most valuable sources of information about the characteristics of food, and image sensing is an effective and convenient way of acquiring information for food appearance analysis. Recognizing natural products such as food is difficult because of the large variations in food type, quantity, appearance, color, and composition. Food identification and labeling are also affected by the background and layout of food items. Thanks to the widespread use of CNNs, image analysis is now the most widely used approach to food identification and labeling. DL is one of the most important approaches in machine learning, and the CNN is one of its architectures that has been widely used in digital image processing. Network architectures with pre-trained weights can be downloaded from a model zoo (Wang et al., 2019; Thenmozhi & Srinivasulu Reddy, 2019). Transfer learning is a very common approach in DL: for example, pre-trained models are used as a starting point in computer vision (Khan et al., 2020) and are then retrained on the target dataset. This retraining process is known as "fine-tuning," and it has been shown to be an effective method for reducing training time and obtaining a more reliable outcome. Food/non-food classification, food type discrimination, and ingredient recognition have also benefited from convolutional networks (Alom et al., 2018).
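The idea of reusing a pre-trained model can be sketched in plain Python. The example below is a toy illustration of the simplest form of transfer learning, in which the pre-trained feature extractor is frozen and only a new classification head is trained on the target data; full fine-tuning would additionally update the backbone weights, usually with a small learning rate. `FROZEN_W` is a hypothetical stand-in for a pre-trained CNN backbone, and `DATA` is an invented four-sample target dataset.

```python
import math

# Stand-in for a pre-trained backbone: these weights are fixed and never updated.
FROZEN_W = [[1.0, -1.0, 0.0],
            [0.0, 1.0, -1.0]]

def backbone(x):
    """Frozen feature extractor: linear map followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in FROZEN_W]

def head(head_w, head_b, feats):
    """New task-specific classifier (logistic regression on backbone features)."""
    z = sum(w * f for w, f in zip(head_w, feats)) + head_b
    return 1.0 / (1.0 + math.exp(-z))

def train_head(data, epochs=500, lr=0.5):
    """Train only the head with stochastic gradient descent on the log loss."""
    head_w, head_b = [0.0] * len(FROZEN_W), 0.0
    for _ in range(epochs):
        for x, y in data:
            feats = backbone(x)                     # backbone is used, not changed
            err = head(head_w, head_b, feats) - y   # gradient of log loss w.r.t. z
            head_w = [w - lr * err * f for w, f in zip(head_w, feats)]
            head_b -= lr * err
    return head_w, head_b

# Hypothetical target task: label 1 vs. label 0 for 3-value input vectors.
DATA = [
    ([1.0, 0.0, 0.0], 1), ([0.9, 0.1, 0.0], 1),
    ([0.0, 1.0, 0.0], 0), ([0.0, 0.9, 0.1], 0),
]

head_w, head_b = train_head(DATA)
predictions = [round(head(head_w, head_b, backbone(x))) for x, _ in DATA]
```

Because only the two head weights and one bias are trained, far fewer parameters have to be fitted than when training the whole network from scratch, which is the source of the reduced training time mentioned above.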
Big data applications in agriculture address important issues of sustainability, global food security, safety, and efficiency improvement. These global issues have broadened the scope of big data beyond agriculture to include the entire food supply chain. With the development of the Internet, the various components of agriculture and the supply chain have become wirelessly connected, producing data that is accessible in real time. The primary data sources are operations, transactions, and images and videos captured by sensors and robots. However, efficient analysis is the key to unlocking the full potential of these data. Big data has enabled the development of applications for risk management, sensor deployment, forecasting, and benchmarking (Granados-Chinchilla et al., 2017; van der Fels-Klerx et al., 2012). Anyone can obtain real-time information about disease outbreaks and consumer safety measures on the Internet at any time. This data analysis also provides useful information to large companies that deal with food safety and security. The two concepts of food safety and food security are closely related: food security and long-term access to food cannot be achieved without food safety. Food safety and security is a global issue that increasingly affects developing countries.
This area also offers a good opportunity for young people and startups to develop their innovation potential through research and development of ideas in manufacturing, quality control, maintenance, packaging, and process simplification. The majority of population problems can be solved with access to food (Nightingale et al., 2004). Figure 2 shows how elements from different data sources can be linked to create added value; the data linkages are similar to those used by the WHO in FOSCOLLAB, but draw on different data sources (Marvin et al., 2017).

Results and discussion
Feature learning, or representation learning, is a set of machine learning methods that allow a system to automatically discover the representations required for feature detection or classification from raw data. These methods replace manual "feature engineering," allowing the machine to learn features and use them to perform a specific task. Another strength of DL is its capacity to transfer knowledge. We found that the majority of the aforementioned studies used CNN models pre-trained on large datasets and then fine-tuned them on their own datasets, reducing the overall complexity and time needed to train a model. In addition, some authors used CNN features to train another classifier, such as a support vector machine (SVM), in order to transfer information from the CNN to the new classifier. Compared with traditional data analysis approaches, DL requires a more complex model structure and greater computational effort, which has hampered its growth and application in the past. Thanks to scientists' efforts and a global emphasis on DL, many resources have emerged to help researchers get a fast start on developing a DL-based application. In terms of software, we recommend a few common frameworks for researchers who have difficulty with programming: Theano, TensorFlow, Caffe, PyTorch, MXNet, Keras, and MatConvNet for MATLAB (Savaş et al., 2019).
NVIDIA's CUDA Deep Neural Network library (cuDNN) and Compute Unified Device Architecture (CUDA) Toolkit can accelerate DL computations by combining hardware and software. NVIDIA's libraries made it possible to build the first DL libraries in CUDA, while no comparably comprehensive libraries existed for Open Computing Language (OpenCL) on Advanced Micro Devices (AMD) hardware. This early advantage, coupled with strong support from NVIDIA, rapidly increased the use of CUDA. As a result, users of NVIDIA GPUs can easily find a solution when something goes wrong, can find support and advice when programming CUDA themselves, and will find that most DL libraries have the best support for NVIDIA GPUs. This is a major strength of NVIDIA GPUs.
NVIDIA currently has a policy that allows CUDA to be used in data centers only with Tesla GPUs and not with RTX or GTX cards.
NVIDIA has been able to impose this restriction unhindered, which is a sign of its dominant market position. Anyone who chooses NVIDIA graphics cards for their major benefits in terms of support must also accept such restrictions. HIP, via ROCm, unifies AMD and NVIDIA GPUs under a common programming language, which is compiled into the respective GPU language before being compiled to GPU assembly. Having all GPU code in HIP would be an important milestone. In practice, however, the story is more complicated, because the TensorFlow and PyTorch code bases are very difficult to port.
The ROCm community is not very large and is therefore unable to solve problems quickly. AMD will apparently have to invest more in developing and promoting DL support, which is currently progressing slowly. Nevertheless, AMD GPUs perform well compared with NVIDIA GPUs, and the next generation of AMD GPUs, Vega 20, is a compute-oriented processor with compute units similar to Tensor Cores. For ordinary users who simply want their GPUs to work, AMD GPUs are not recommended. More experienced users should have fewer problems and, by supporting AMD GPUs and ROCm/HIP developers, can help counter NVIDIA's dominant position, which will benefit everyone in the long run. GPU programmers who want to make a significant contribution to GPU-based computing may find that AMD GPUs are the best way to have a lasting effect. For everyone else, NVIDIA GPUs may be the better choice.
These tools help accelerate the DL models described previously. Hardware and software optimization greatly reduces computing time and makes real-time computation possible. Nevertheless, DL has drawbacks that cannot be overlooked. Model refinement is difficult and slow because of the long training cycle and hardware constraints, as well as the high complexity and many hyperparameters of the models. GPUs and the other hardware needed for computational acceleration are costly, and training a DNN using only CPUs as the computational resource takes even longer. DL often requires a vast amount of data, and finding a reliable large dataset is challenging. Collecting and annotating data takes considerable time and effort. Some free datasets for academic study and challenge competitions are gathered and labeled by volunteers or experts, while others are downloaded directly from the Web by computers; in both cases, some errors are inevitable. It is also worth noting that some published datasets contain only limited data pertinent to the target problem. For example, it is difficult for a model trained on the UECFood-256 database to correctly classify meals from other countries, because the database consists largely of photographs of Japanese food. Large databases of food photographs from around the world should be built to achieve a more reliable and robust food recognition system. While DL has been widely used in food identification, few studies link it to food calorie estimation, the supply chain, or safety issues. Moreover, although hundreds of papers have reported DL applications for food recognition, RGB images are the only data source they use to recognize food categories.
Since the authors of some of the above food classification studies were mainly computer scientists and image-processing specialists, general image features received more attention than food-specific image features. To capture the intrinsic properties of food, characterization techniques such as thermal imaging and hyperspectral imaging have been used in addition to photographs.
Taking potatoes as an example, DNN approaches based on the average spectrum of a sample can only predict whether the target region is damaged; they cannot determine the exact location of the damage. To reduce the size of the spectral images, it may be necessary to extract only the most important information. One approach, which focuses on spatial information, is to find the optimal bands that best reflect the variation across samples and then recombine the resulting layers into new images. Another is to use pixel-level spectra to train the network and then reconstruct the predicted label of each pixel into an output mask, as in Figure 3. Patel et al. (1996) and Xu et al. (2020) describe how one-dimensional convolution is implemented. If the problem at hand has little to do with spatial or structural information, predicting the value of each pixel independently may be a reasonable way to get around hardware capacity constraints. Furthermore, as in the solution presented by Berisha et al. (2019), a combination of spectral and spatial features may be used to solve specific problems.
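The pixel-level strategy described above (classify each pixel's spectrum independently with a one-dimensional convolution, then reassemble the labels into a mask) can be sketched in plain Python. This is a minimal illustration under stated assumptions and is not the method of the cited papers: the hand-set `KERNEL` and `threshold` stand in for a trained one-dimensional CNN, and the 2x3 `CUBE` with five bands per pixel is invented data.

```python
def conv1d(spectrum, kernel):
    """Valid 1-D cross-correlation along the wavelength axis."""
    k = len(kernel)
    return [sum(spectrum[i + j] * kernel[j] for j in range(k))
            for i in range(len(spectrum) - k + 1)]

def classify_pixel(spectrum, kernel, threshold):
    """Label a pixel 1 (e.g. damaged) if the strongest filter response exceeds the threshold."""
    return 1 if max(conv1d(spectrum, kernel)) > threshold else 0

def predict_mask(cube, kernel, threshold):
    """Classify every pixel independently and reassemble the labels as a 2-D mask."""
    return [[classify_pixel(pixel, kernel, threshold) for pixel in row]
            for row in cube]

# Hypothetical spectra with 5 bands: a flat "sound" spectrum and one with a peak.
SOUND = [0.2, 0.2, 0.2, 0.2, 0.2]
DAMAGED = [0.2, 0.2, 0.8, 0.2, 0.2]

# Hypothetical 2x3 hyperspectral cube: rows x columns x bands.
CUBE = [[SOUND, DAMAGED, SOUND],
        [SOUND, SOUND, DAMAGED]]

KERNEL = [-1.0, 2.0, -1.0]   # peak detector; illustrative, not a learned filter
mask = predict_mask(CUBE, KERNEL, threshold=0.5)
```

Unlike a prediction made from the average spectrum of the whole sample, the resulting `mask` shows where in the image the flagged pixels are located, and each pixel can be processed separately, which keeps memory requirements low.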
More types of food data are needed to train DL models. We can tell what kind of food something is and judge its quality from its density, smell, texture, firmness, flavor, and the sound it makes when struck. A range of sensors is available for non-destructive measurement, including vibration sensors, electronic balances, electronic noses, and sound sensors, as well as advanced detection technologies (Adão et al., 2017; Gowen et al., 2012). Once images containing food have been recognized, the next step is food labeling, which is a multiclass classification problem. Several open-access food image datasets of various types are available, such as UECFood-100, UECFood-256, and Food-11 (see Table 1). These large food image collections can provide plenty of visual features for training a DNN model for food classification.
Multisource data fusion has not been thoroughly exploited for evaluating food quality and safety with DL. In one example discussed in this article, a combination of image and mass information was used to identify fruits more precisely. Fusing more types of data from sensing instruments may yield a more robust and reliable food assessment. Liquid foods such as milk and other beverages, as well as seafood, fish, poultry, and fruits, will be analyzed in future studies. The time dimension must also be considered, because some problems are difficult to describe with static data. For example, images or other data depicting the current state of dough fermentation are inadequate to characterize the process; parameters measured at different times throughout the whole fermentation process should be used instead.
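As a minimal sketch of how image and mass information might be fused, the plain-Python example below concatenates color features with a scaled mass value (feature-level fusion) and classifies with a nearest-centroid rule. Everything here is an illustrative assumption: the class names, the color and mass values in `TRAIN`, the `mass_scale` constant, and the nearest-centroid classifier, which stands in for the DL model that a real study would train on the fused features.

```python
import math

def fuse(image_features, mass_g, mass_scale=100.0):
    """Feature-level fusion: append the scaled mass to the image feature vector."""
    return list(image_features) + [mass_g / mass_scale]

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_class(vector, centroids):
    """Return the label whose centroid is closest to the vector."""
    return min(centroids, key=lambda label: distance(vector, centroids[label]))

# Hypothetical training samples: (mean R, mean G, mean B) and mass in grams.
TRAIN = {
    "cherry_tomato": [((0.80, 0.20, 0.15), 15.0), ((0.78, 0.22, 0.16), 18.0)],
    "tomato": [((0.79, 0.21, 0.15), 120.0), ((0.81, 0.20, 0.14), 140.0)],
}

image_only = {label: centroid([list(f) for f, _ in samples])
              for label, samples in TRAIN.items()}
fused = {label: centroid([fuse(f, m) for f, m in samples])
         for label, samples in TRAIN.items()}

# A new sample whose color resembles both classes but whose mass is large.
sample_features, sample_mass = (0.79, 0.21, 0.155), 130.0
label_image_only = nearest_class(list(sample_features), image_only)
label_fused = nearest_class(fuse(sample_features, sample_mass), fused)
```

In this invented example the two classes are nearly indistinguishable by color alone, so the image-only classifier and the fused classifier disagree; adding the mass channel separates them. The scaling of each source matters, since an unscaled mass in grams would dominate the distance.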

Conclusion
DL has become an established part of AI and helps AI respond more naturally to human needs. Although AI was created relatively recently, it has already been applied in a wide variety of fields. DL goes hand in hand with the digital age, which is considered a time of information explosion. People today have access to a wide range of information from around the world and from a variety of sources. Social media, search engines, e-commerce platforms, online video platforms, and other services are constantly gathering information, which is very useful for training AI. This vast amount of information can also be disseminated easily: information equivalent in size to a national library can be sent to someone on the other side of the world in a matter of minutes. However, the unclassified information on the Internet is so vast that humans cannot digest, analyze, and learn from it unaided; doing so would take decades, and a normal human lifespan would not be enough.
The exponential development of the Internet, social networking, smartphone applications, and other technologies has led to increasingly sophisticated approaches to data collection, enabling more individuals to contribute food information such as photographs and text descriptions and, eventually, allowing even larger datasets to be built. Food quality and safety inspections are carried out by academics and research institutions worldwide using their own databases, so data collection is restricted by the capacity of a single individual, research team, or organization. It is expected that food-related datasets collected from consumers, researchers, and institutes worldwide using modern sensors and instruments will be incorporated into large global databases. These datasets can be analyzed quickly with the help of DL, which will support food researchers and institutes.
We reviewed a wide variety of recent articles on DL applications in food, describing in each case the proposed framework, training methods, and final evaluation results of the DL models used to process food images, spectra, text, and other data. We compared DNNs with other common approaches in terms of results and found that DL outperformed the other methods in the studies we examined. We also discussed the benefits and drawbacks of DL approaches, as well as the challenges and future prospects of DNNs in the food domain. To the best of our knowledge, no other survey of DL applications in the food domain exists. This study aims to encourage researchers and practitioners in this field to conduct further food-related studies using DL approaches, to provide practical solutions to regression and classification problems, and to integrate these solutions for the benefit of food safety and quality inspection and human health. Finally, we suggest that: (1) a combination of DL and multisource data fusion, including RGB images, spectra, smell, flavor, and so on, be used to allow a more accurate evaluation of food; (2) future studies concentrate on developing highly automated data acquisition equipment and local and global food information sharing platforms, since collecting big data relevant to food remains problematic owing to the use of semi-automatic or even manual data acquisition instruments and incomplete data processing and sharing platforms; (3) the data mining capabilities of DL be tested in food-related fields that are seldom explored, such as food safety; and (4) food image recognition, intelligent meal recommendation applications, and fruit quality evaluation systems, all promising DL use cases, be developed into practical products.