Predicting Food Wastage

Food waste is a significant sustainability and efficiency challenge. One of its main sources is the restaurant industry, where only a small portion of excess food is recycled or donated. This wastage harms the environment and also carries a financial cost for businesses.

Technology offers new ways to address this problem. On the Actable AI platform, Machine Learning (ML) and Artificial Intelligence (AI) can be used to predict the amount of food waste and to identify the factors contributing to it.

It is always important to first understand the data that you have collected. After the data is uploaded to the Actable AI platform (as an Excel spreadsheet or CSV file, by connecting directly to a database, or by using the Google Sheets add-on), a number of tools can be used to visualize it and understand the relationships between different variables.

One such tool is the correlational analysis tool, which measures the strength of the relationships between variables. Its parameters are set as follows.

The variable of interest, namely the amount of food waste, is specified in the ‘correlation target’ field. Meanwhile, any features for which the correlation needs to be measured with the target are specified in the ‘compared factors’ field. Other options are also available, such as the number of factors to display and whether values should be shown on the bar chart.

After clicking the ‘Run’ button and waiting a few moments, the results are displayed:

Bar chart showing the results of the correlational analysis

As can be observed, one of the most important factors appears to be the pricing. Specifically, high pricing is associated with more waste, whereas moderate and low pricing are associated with less.

Other factors that contribute to waste include the number of guests and the quantity of food. This is perhaps not surprising, given that a higher number of people (each contributing to wasted food) will inevitably lead to more overall waste, while having too much food will also increase the likelihood of food not being consumed and eventually thrown away.
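
As a rough illustration of what a correlational analysis computes, the sketch below calculates the Pearson correlation between a target and each compared factor, turning a categorical factor into 0/1 indicators first. The records and column names are made up for illustration, and this is not necessarily how the Actable AI platform computes its scores.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy records; not taken from the data set used in this article.
rows = [
    {"guests": 200, "pricing": "Low", "waste": 18.0},
    {"guests": 250, "pricing": "Moderate", "waste": 22.0},
    {"guests": 300, "pricing": "High", "waste": 35.0},
    {"guests": 350, "pricing": "Low", "waste": 24.0},
    {"guests": 400, "pricing": "High", "waste": 42.0},
    {"guests": 450, "pricing": "Moderate", "waste": 33.0},
]

target = [r["waste"] for r in rows]

# Numeric factors are used directly; each category becomes a 0/1 indicator.
factors = {"guests": [r["guests"] for r in rows]}
for level in ("Low", "Moderate", "High"):
    factors[f"pricing={level}"] = [1.0 if r["pricing"] == level else 0.0 for r in rows]

correlations = {name: pearson(values, target) for name, values in factors.items()}

# Print the factors ordered by the strength of their correlation with the target.
for name, value in sorted(correlations.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>18}: {value:+.2f}")
```

A positive value means the factor tends to rise together with the target, a negative value means the opposite. Neither says anything about cause and effect on its own.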

Several graphs are also generated, which can be used to examine the relationships among features:

Comparison of wastage when pricing is high with wastage when pricing is low or moderate

Correlation between food waste and the number of guests. As the number of guests increases, so does the amount of food waste.

Now that we have a better understanding of our features, we can train a machine learning model to predict the amount of food wastage. This is done by selecting the ‘regression’ analytic and setting its options as described below.

Similar to the correlational analysis, the outcome that should be predicted (amount of food wastage) should be specified in the ‘predicted target’ field, while any other features that should be used to predict the target are specified in the ‘predictors’ field.

Several other options can also be specified.

In this case, the ‘explain predictions’ option has been selected. This will enable the generation of what are known as Shapley values that can help us understand to what extent each variable has increased or decreased the prediction.
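
To make the idea of Shapley values more concrete, the sketch below computes them exactly for a tiny hand-written model by averaging each feature's marginal contribution over every possible ordering of the features. The model, the feature names, and the choice of a fixed baseline sample to represent "absent" features are all assumptions made for illustration. Practical tools rely on approximations, and this is not the platform's implementation.

```python
from itertools import permutations

def shapley_values(predict, sample, baseline):
    """Exact Shapley values of `predict` at `sample`, relative to `baseline`.

    A feature that is "absent" from a coalition takes its baseline value.
    Enumerating all orderings is only feasible for a handful of features.
    """
    names = list(sample)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = sample[name]      # add the feature to the coalition
            value = predict(current)
            totals[name] += value - previous  # its marginal contribution
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}

# Hypothetical stand-in for a trained model (not the model from this article).
def toy_model(x):
    waste = 5.0 + 0.05 * x["guests"] + 0.02 * x["food_quantity"]
    if x["pricing_high"]:
        waste += 0.03 * x["guests"]  # high pricing matters more for large events
    return waste

baseline = {"guests": 300, "food_quantity": 400, "pricing_high": 0}
sample = {"guests": 400, "food_quantity": 450, "pricing_high": 1}

contributions = shapley_values(toy_model, sample, baseline)
for name, value in contributions.items():
    print(f"{name:>14}: {value:+.2f}")
```

A positive contribution means the feature pushed the prediction above the baseline prediction and a negative one means it pushed it below. The contributions always add up to the difference between the prediction for the sample and the prediction for the baseline.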

More advanced options can also be specified, such as the models to be trained and their hyperparameters. While the default settings generally work well, you may want to set certain values yourself or tune them to improve performance. Actable AI then uses AutoML techniques to automatically train several models with different hyperparameters and select the one that achieves the best performance.
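
The idea of training several candidates and keeping the best one can be sketched as follows. This is a deliberately tiny stand-in, assuming a k-nearest-neighbours regressor on a single feature, made-up data, and RMSE on a validation split as the selection metric. It does not reproduce the models or the search strategy used by Actable AI.

```python
import math

def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical (guests, waste) pairs, split into training and validation sets.
train = [(100, 10.0), (150, 14.0), (200, 21.0), (250, 24.0),
         (300, 31.0), (350, 34.0), (400, 41.0), (450, 44.0)]
valid = [(125, 12.0), (275, 28.0), (425, 43.0)]

# Each candidate is one hyperparameter setting; all are scored the same way.
leaderboard = []
for k in (1, 2, 4, 8):
    predictions = [knn_predict(train, x, k) for x, _ in valid]
    score = rmse([y for _, y in valid], predictions)
    leaderboard.append({"k": k, "rmse": score})

leaderboard.sort(key=lambda entry: entry["rmse"])  # lower RMSE is better
best = leaderboard[0]
print("best candidate:", best)
```

Scoring on data held out from training matters here: a candidate chosen for its fit to the training data alone could simply be the one that memorizes it best.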

The metric used for optimization can also be specified. More details on all of the options available in the regression analytic can be found in the user documentation.

Once we are satisfied with the settings, the ‘Run’ button can be clicked to start the model training process. When training is complete, a number of results are displayed.

First of all, we can check the performance of the model using a number of metrics. Each of these compares the ground-truth values of the amount of food wastage with those predicted by the best model. As can be observed, the results in this case are very good: the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Median Absolute Error are all very close to 0 (the optimal value), and R² is very close to its optimal value of 1.0. In this case, the model can be said to have a relative error of approximately 8%.
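
For reference, the four metrics can be computed from the ground-truth and predicted values as in the sketch below, which uses their standard definitions. The numbers are made up and are not the results reported above.

```python
import math
from statistics import mean, median

def regression_metrics(actual, predicted):
    """Standard regression metrics comparing ground-truth and predicted values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    avg = mean(actual)
    ss_res = sum(e ** 2 for e in errors)          # residual sum of squares
    ss_tot = sum((a - avg) ** 2 for a in actual)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / len(errors)),
        "mae": mean(abs(e) for e in errors),
        "median_ae": median(abs(e) for e in errors),
        "r2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical values, for illustration only.
actual = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 29.0, 40.0, 47.0]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name:>10}: {value:.3f}")
```

The three error metrics are expressed in the units of the target, so whether a value counts as "close to 0" depends on the typical size of the target. R² is unitless, which makes it easier to compare across data sets.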

These metrics suggest that the model should perform well on real-world unseen data, that is, data that was not used to train the model.

We can then observe which features the model deems to be important.

In this case, pricing is clearly a very important factor affecting the outcome, namely the amount of food waste. The number of guests and food quantity are also very important, as also observed in the correlational analysis. At the other end of the table, the event type and seasonality are evidently not important and do not significantly affect the amount of food waste.

Next, we can look at the raw values of the predictions and the Shapley values mentioned earlier:

Predicted values, ground-truth values, and Shapley values

Comparing the ground-truth values (column ‘Wastage_Food_Amount’) with the predicted values (column ‘Wastage_Food_Amount_predicted’), it is clear that the predictions are indeed very close to the actual values. The extent to which each variable affects the outcome is shown in red or green: a red value indicates that the feature value has decreased the prediction (i.e. the predicted amount of food wastage), while a green value indicates that it has increased the prediction. These values are generated for each individual sample, enabling highly granular analysis of the model and of how each variable affects the outcome. This also helps determine how the amount of food waste can be reduced.

How the model predictions vary across different values of the variables can be analyzed further with the recently introduced PDP and ICE plots.

An Individual Conditional Expectation (ICE) plot shows the effect of a feature on the outcome for a single sample, by freezing all of the sample's values except for the feature being investigated. Averaging the ICE curves across all samples yields the Partial Dependence Plot (PDP). In these plots, it is again evident that higher pricing and a larger number of guests increase the predicted amount of food wastage.
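
The relationship between the two kinds of plot can be sketched in a few lines: an ICE curve re-predicts one sample while sweeping a single feature over a grid of values, and the partial dependence curve is the point-by-point average of all the ICE curves. The model, samples, and grid below are hypothetical.

```python
def ice_curve(predict, sample, feature, grid):
    """Predictions for one sample as `feature` sweeps over `grid`, all else frozen."""
    return [predict({**sample, feature: value}) for value in grid]

def partial_dependence(predict, samples, feature, grid):
    """Average the ICE curves of all samples, grid point by grid point."""
    curves = [ice_curve(predict, s, feature, grid) for s in samples]
    return [sum(column) / len(curves) for column in zip(*curves)]

# Hypothetical stand-in for a trained model (not the model from this article).
def toy_model(x):
    rate = 0.08 if x["pricing_high"] else 0.05
    return 5.0 + rate * x["guests"]

samples = [
    {"guests": 250, "pricing_high": 0},
    {"guests": 300, "pricing_high": 1},
    {"guests": 400, "pricing_high": 1},
    {"guests": 450, "pricing_high": 0},
]
grid = [200, 300, 400, 500]  # values of `guests` to evaluate

ice = [ice_curve(toy_model, s, "guests", grid) for s in samples]
pdp = partial_dependence(toy_model, samples, "guests", grid)

for sample, curve in zip(samples, ice):
    print("ICE", sample, [round(v, 2) for v in curve])
print("PDP", [round(v, 2) for v in pdp])
```

In this toy model the ICE curves for high-pricing samples rise more steeply than the others. The averaged PDP hides that difference, which is why the two plots are usually read together.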

More information on the best model and the other models that have been trained can be viewed in the ‘leaderboard’ tab.

Apart from the chosen evaluation metric, the time required to train the model and to generate predictions is given. This helps us determine whether the model is fast enough for the given application. Note that the desired inference time can also be specified in the ‘Advanced’ tab. The hyperparameters and variables used by the model are also shown, giving better insight into how the model is composed.

Once we are satisfied with the trained model, it can be used with new data by selecting the ‘Live Model’ tab, where predictions can be generated for a new data set. Predictor values can also be entered interactively in a form, with predictions generated on the fly.

If the options in the ‘Intervention’ tab are set, then it is also possible to use counterfactual analysis to determine the effect of a treatment variable (e.g. pricing) on the predicted outcome, and obtain a new prediction. In other words, what happens to the predicted food waste if the price is changed? Common causes, also known as confounders, are variables that can affect both the outcome and the intervened variable. These can also be specified in the ‘common causes’ field, enabling causal inference techniques to yield new estimates on food wastage based on causal relationships:

‘Live Model’ tab when setting ‘current intervention’ and ‘common causes’

In this example, the amount of food waste equals 36.19 for the provided sample. If pricing is changed from ‘High’ to ‘Low’, the expected food waste decreases by 43% to 20.50. In this way, we can easily determine how each factor can be varied in order to minimize food wastage.
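
The quoted percentage follows directly from the two predictions, as this short check shows. The variable names are ours, and the numbers are the ones given above.

```python
def relative_change(before, after):
    """Signed change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before

waste_high_pricing = 36.19  # prediction quoted above for the original sample
waste_low_pricing = 20.50   # prediction after the intervention on pricing

change = relative_change(waste_high_pricing, waste_low_pricing)
print(f"Change in predicted waste: {change:.1%}")
```

The result is a reduction of just over 43%, consistent with the figure quoted above.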

An API can also be used to integrate the model into your existing application (web app, mobile app, etc.). Click on the ‘Live API’ tab to see all the details of the API.

Finally, the trained model can also be exported and used directly within Python by following the instructions in the ‘Export Model’ tab.

The above example demonstrated the use of the Actable AI platform to analyze data and generate a predictive model capable of estimating the amount of food wastage from factors such as the pricing, number of guests, and event type. We also gained several insights into the variables that most influence the prediction, which helps us understand both the model and the most relevant factors affecting the amount of food waste. These insights can in turn help prioritize which measures to design and implement in order to reduce food wastage, which benefits the environment and reduces costs.

More information on other functionalities of the platform can also be found in the user documentation.
