Abstract:
Harvest timing and load information is critical to farm management (of labour and packing consumables) for reducing fruit loss and ensuring quality. Early harvest brings fruit of poor eating quality to the market, while late harvest shortens the available shelf life of the fruit. These factors drive the need for quantitative tools for fruit maturity and quality testing. Harvest time is generally assessed on the basis of time (number of days from flowering) and physical features (size, shape, surface characteristics, firmness and pulp colour). The assessment of these physical features is subjective and requires experienced labour. Current quality inspection methods in Pakistan include weight-based segregation and packaging; as a result, the quality of each individual fruit is not traceable. A few value chains (e.g. the Australian Mango Industry Association) have now set standards for fruit dry matter (DM) content at the time of harvest, assessed non-destructively via near infrared spectroscopy (NIRS). The Pakistani supply chain also needs to adopt such a system, one that provides traceability and visibility for each sample within fruit packs.
This thesis investigates short-wave NIRS (SWNIRS) for fruit quality inspection and presents a decision support system for Pakistani horticulture. First, I have developed a decision support system for predicting the quality index values of Pakistani fruit using SWNIRS. The investigated fruits, i.e. export varieties of mango (‘Sindhri’, ‘Samar Bahisht Chaunsa’ and ‘Sufaid Chaunsa’), an export variety of mandarin (‘Kinnow’) and loquat, are of high commercial significance to Pakistan and also provide examples of fruit with relatively thin and thick skin and with relatively thick and thin edible flesh. These differences in morphology can be expected to affect the non-invasive assessment of flesh characteristics using SWNIRS. Locally developed partial least squares regression (PLSR) models returned a coefficient of determination (R²) of 0.90 and a root mean square error (RMSE) of 0.95 °Brix for soluble solids content (SSC), and an R² of 0.80 and RMSE of 1.17% for DM, in the prediction of a mango test set, and an R² of 0.71 and RMSE of 0.65 °Brix in the prediction of SSC in a Kinnow mandarin test set. For cultivar ‘Tanaka’ loquats, the locally developed PLSR model achieved an R² of 0.90 and RMSE of 0.95 °Brix in the prediction of a test set. The results confirm the suitability of NIRS for the non-invasive evaluation of thin-skinned fruit and highlight the need to train the model on spectra obtained from multiple varieties when quality control of prediction performance indicates it.
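As a minimal illustration of the two figures of merit quoted above, the following Python sketch computes R² and RMSE from reference and predicted values. It assumes R² is defined as 1 − SS_res/SS_tot (some NIRS studies report the squared correlation coefficient instead). The function names and the sample values are hypothetical and are not data from this work.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("reference values have zero variance")
    return 1 - ss_res / ss_tot


# Hypothetical reference and predicted SSC values in °Brix (illustration only).
reference_brix = [10.0, 12.0, 14.0, 16.0]
predicted_brix = [11.0, 12.0, 13.0, 16.0]

print(f"R2   = {r_squared(reference_brix, predicted_brix):.2f}")
print(f"RMSE = {rmse(reference_brix, predicted_brix):.2f} °Brix")
```

RMSE is expressed in the units of the predicted attribute (°Brix for SSC, % for DM), whereas R² is unitless and depends on the spread of the reference values in the test set, so the two are best read together.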
Most of the reported literature on non-destructive fruit quality estimation classifies fruit samples indirectly: a machine learning regression algorithm predicts the quality index value, and the sample quality is then judged from the predicted value (which requires prior knowledge of the applicable standards). Second, I have proposed a direct sweetness classifier as an alternative to this thresholding-based indirect use of the quality index value. I have defined acceptance criteria for melons and oranges based on the direct classification method to predict the sweetness level using NIR spectroscopy, and compared the performance of this classification-based approach with that of the regression-based thresholding methods reported in the literature. The proposed classifier has been tested for sweetness classification of Pakistani melon (variety: ‘Honey melons’) and orange (varieties: ‘Blood red’, ‘Mosambi’ and ‘Succari’) fruits. For melons, the best SSC model was obtained using multiple linear regression on the second derivative of the spectral data (wavelength range 729–975 nm), with a correlation coefficient (R) of 0.93 and an RMSE of 1.63 on test samples. Sweetness classes of the test samples obtained by °Brix thresholding had an accuracy of 55.45% for three classes. The best direct sweetness classifier was obtained using k-nearest neighbours (KNN) on the second derivative of the spectral data (wavelength range 729–975 nm), with an accuracy of 70.3% for three classes on test samples. For oranges, PLSR models were developed for estimating Brix, titratable acidity (TA), Brix:TA, and BrimA (Brix minus acids), with correlation coefficients of 0.57, 0.73, 0.66, and 0.55, respectively, on independent test data. An ensemble classifier achieved 81.03% accuracy in direct three-class (sweet, mixed, and acidic) classification on independent test data.
Extensive evaluation supports the argument that, for sweetness classification using NIR spectroscopy, modelling a direct sweetness classifier is a better approach than estimating quality indices and thresholding them.
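The difference between the two routes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used in the thesis: the °Brix cut-offs, the class names, the two-element feature vectors and the helper names `class_from_brix` and `knn_predict` are all hypothetical.

```python
from collections import Counter
from math import dist

# Hypothetical cut-offs in °Brix; the abstract does not state the thresholds used.
LOW_CUTOFF, HIGH_CUTOFF = 9.0, 12.0


def class_from_brix(brix, low=LOW_CUTOFF, high=HIGH_CUTOFF):
    """Indirect route: map a (predicted) SSC value to one of three classes."""
    if brix < low:
        return "low"
    if brix < high:
        return "medium"
    return "high"


def knn_predict(train_x, train_y, x, k=3):
    """Direct route: majority vote among the k nearest training feature vectors."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Indirect route: a regression error smaller than the reported RMSE can move a
# sample across a cut-off and change its class.
true_brix, predicted_brix = 12.2, 11.5  # hypothetical values
indirect_true = class_from_brix(true_brix)
indirect_pred = class_from_brix(predicted_brix)

# Direct route: the classifier is trained on class labels, so no cut-off is
# applied to a noisy intermediate estimate. Toy two-feature vectors only.
train_x = [(0.10, 0.20), (0.12, 0.22), (0.50, 0.60),
           (0.52, 0.58), (0.90, 0.95), (0.88, 0.97)]
train_y = ["low", "low", "medium", "medium", "high", "high"]
direct_pred = knn_predict(train_x, train_y, (0.87, 0.93), k=3)

print(indirect_true, indirect_pred, direct_pred)
```

In the indirect route the classification error inherits the regression error near every cut-off, which is one plausible reading of why thresholding a well-correlated SSC model can still give a low three-class accuracy.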
Automated fruit classification is a significant task in many industrial applications. For instance, it may help a supermarket cashier identify a fruit, its cultivar and subsequently its price. Computer vision based automatic fruit classification is now a relatively mature field, but it requires complex computer vision algorithms and systems to classify different fruits accurately. Third, I present a classifier based on SWNIR spectral data for fruit classification problems. The research focuses on the O-H and C-H overtone features of fruit and their correlation with SWNIR spectra, and therefore opens a new dimension for fruit classification using SWNIRS. Eleven fruits (apple, cherry, Hass avocado, kiwi, grapes, mango, melon, orange, loquat, plum, and apricot) were used in this study to cover a range of physical characteristics such as peel thickness, pulp, seed thickness, and size. Different shallow machine learning architectures were trained to classify fruits from spectral feature vectors: first using 83 features within the range 725–975 nm (3 nm resolution), and then using only four features at the wavelengths 770 nm, 840 nm, 910 nm, and 960 nm (corresponding to O-H and C-H overtone features). With the 83 spectral features as input, the quadratic discriminant analysis (QDA) classifier achieved a cross-validation accuracy of 100% and a test data accuracy of 93.02%. With the four-feature vector as input, the QDA classifier achieved a cross-validation accuracy of 97.1% and a test data accuracy of 90.38%. The results indicate that fruit classification depends mainly on the absorptivity of SWNIR radiation at the O-H and C-H overtone features. An LED-based device built mainly around LEDs at 770 nm, 840 nm, 910 nm, and 960 nm can be used in applications that require automated fruit classification.
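The reduction from the full spectrum to the four overtone bands can be sketched as follows. The sampling grid is an assumption: the abstract gives only the range, the resolution and the feature count, so the sketch assumes 83 points starting at 725 nm in 3 nm steps. The helper names `nearest_index` and `select_bands` are hypothetical, and the classifier itself (QDA in the thesis) is not reproduced here.

```python
# Assumed sampling grid: 83 points from 725 nm in 3 nm steps (725, 728, ..., 971).
START_NM, STEP_NM, N_POINTS = 725, 3, 83
GRID_NM = [START_NM + STEP_NM * i for i in range(N_POINTS)]

# Bands named in the abstract as corresponding to O-H and C-H overtone features.
TARGET_BANDS_NM = (770, 840, 910, 960)


def nearest_index(wavelength_nm, grid=GRID_NM):
    """Index of the grid point closest to the requested wavelength."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - wavelength_nm))


def select_bands(spectrum, bands=TARGET_BANDS_NM, grid=GRID_NM):
    """Reduce a full spectrum to the readings closest to the target bands."""
    if len(spectrum) != len(grid):
        raise ValueError("spectrum length must match the wavelength grid")
    return [spectrum[nearest_index(band, grid)] for band in bands]


# Dummy spectrum whose value at each position equals its index, so the output
# shows which grid positions are kept (illustration only, not measured data).
dummy_spectrum = list(range(N_POINTS))
selected = select_bands(dummy_spectrum)
selected_wavelengths = [GRID_NM[i] for i in selected]
print(selected, selected_wavelengths)
```

On this assumed grid only 770 nm falls exactly on a sampling point; the other three bands map to the nearest neighbour, 1 nm away. The four-element output is the kind of feature vector a classifier would receive in the reduced setting.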
The decision support system presented in this dissertation will not only aid the Pakistani supply chain in the automated, efficient and non-destructive quality assessment of fruits, but more generally also opens a new dimension for fruit segregation in two applications: 1) direct classification based on acceptance criteria for sweetness instead of quantitative assessment of quality attributes, and 2) fruit type classification using SWNIR spectral features.