Food Classification Methods

This repository contains the source code for my thesis in the Department of Electrical and Computer Engineering at Aristotle University of Thessaloniki. The project applies Computer Vision and Natural Language Processing techniques to the complex problem of food categorization. It primarily evaluates two classification techniques: classification directly from the visual features of an image, and classification via the intermediate step of predicting a list of ingredients.

Abstract

This thesis studies the feasibility of using an ingredient prediction system to categorize food images. The problem is known in the literature as multiclass classification and requires predicting the correct class for a given input image. Specifically, this work compares three methodologies, each of which begins by computing abstract vector representations of the images.

The first methodology takes the output of the penultimate layer of ResNet-50 as this representation. The ResNet-50 parameters are initialized with weights pre-trained on ImageNet. A fully connected layer with as many output neurons as there are classes is then applied and acts as a direct classifier.
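The classification head of the first methodology can be illustrated with a minimal pure-Python sketch. It is not the thesis code: the 4-dimensional feature vector stands in for the 2048-dimensional ResNet-50 representation, and the weights, biases, and function names are hypothetical toy values chosen for illustration.

```python
import math

def linear_head(features, weights, biases):
    """Fully connected layer: one logit per class (one weight row per class)."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Convert logits to probabilities; subtract the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(features, weights, biases):
    """Return the index of the most probable class and all class probabilities."""
    probs = softmax(linear_head(features, weights, biases))
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs

# Toy stand-in for an image representation (assumption: 4 dims, 3 classes).
features = [0.5, -1.0, 2.0, 0.0]
weights = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0]]
biases = [0.0, 0.0, 0.0]

label, probs = predict(features, weights, biases)
```

In a real setup the weights would be learned during training rather than fixed by hand.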

The second methodology uses a recent model that predicts ingredients and recipes from a food image and is built on transformers and the attention mechanism. This model receives the vector representations of the images as input. The predicted ingredients are then fed into a classifier, which attempts to predict the correct class. In addition, a simpler classification method based on probability theory is implemented.
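The abstract does not say which probabilistic method is used, so the following is only one plausible illustration: a naive Bayes classifier with add-one smoothing over predicted ingredient lists. The function names and the toy training samples are assumptions made for this sketch.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """Count classes and per-class ingredient occurrences.

    samples: list of (ingredient_list, class_label) pairs.
    """
    class_counts = Counter()
    ingredient_counts = defaultdict(Counter)
    vocab = set()
    for ingredients, label in samples:
        class_counts[label] += 1
        for ingredient in set(ingredients):
            ingredient_counts[label][ingredient] += 1
            vocab.add(ingredient)
    return class_counts, ingredient_counts, vocab

def classify(ingredients, model):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, ingredient_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(ingredient_counts[label].values()) + len(vocab)
        for ingredient in set(ingredients):
            if ingredient in vocab:  # unseen ingredients are skipped
                score += math.log((ingredient_counts[label][ingredient] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data, not taken from the thesis datasets.
samples = [
    (["flour", "tomato", "cheese"], "pizza"),
    (["flour", "cheese", "basil"], "pizza"),
    (["rice", "fish", "seaweed"], "sushi"),
    (["rice", "fish", "soy sauce"], "sushi"),
]
model = train(samples)
prediction = classify(["cheese", "tomato"], model)
```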

The third and most essential methodology combines techniques from different areas of machine learning to solve the multiclass problem. Like the first two, it computes the vector representation and the predicted ingredients of an image. It then maps the words of the ingredients and of the class to the vector space of a pre-trained word2vec model with a wide vocabulary. Not all ingredients and classes belong to the model vocabulary, so some entries are ignored. The vector sequences of the ingredients are processed by a series of LSTM networks to produce a single vector in the same vector space.
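The vocabulary filtering described above can be sketched as follows. This is a simplified illustration under one reading of "ignored" (out-of-vocabulary words are dropped): a small dictionary with hypothetical 2-dimensional vectors stands in for the pre-trained word2vec model, and the LSTM stage is not shown.

```python
def embed_sequence(words, vectors):
    """Map words to vectors, skipping any word missing from the vocabulary."""
    return [vectors[word] for word in words if word in vectors]

def usable_classes(class_names, vectors):
    """Keep only the classes whose name has a vector in the vocabulary."""
    return [name for name in class_names if name in vectors]

# Toy stand-in for a word2vec vocabulary (assumption: 2-dimensional vectors).
toy_vectors = {
    "tomato": [0.9, 0.1],
    "cheese": [0.8, 0.3],
    "rice": [0.1, 0.9],
    "pizza": [0.9, 0.2],
}

# "za'atar" is out of vocabulary here, so it is dropped from the sequence.
sequence = embed_sequence(["tomato", "za'atar", "cheese"], toy_vectors)
classes = usable_classes(["pizza", "bibimbap"], toy_vectors)
```

The resulting vector sequence is what a sequence model such as an LSTM would consume.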

Finally, a nearest-neighbor search is used to classify each sample. The networks are trained on various datasets, and the training statistics and evaluation results are presented in tables to facilitate comparison of the methodologies.
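A minimal sketch of the nearest-neighbor step, assuming cosine similarity as the distance measure and a single nearest class vector as the decision rule (the abstract specifies neither). The vectors and names are hypothetical toy values.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_class(vector, class_vectors):
    """Return the class whose vector is most similar to the given vector."""
    return max(class_vectors, key=lambda name: cosine(vector, class_vectors[name]))

# Toy class vectors standing in for word2vec class embeddings (assumption).
class_vectors = {"pizza": [0.9, 0.2], "sushi": [0.1, 0.95]}

# Toy stand-in for the vector produced from an image's ingredient sequence.
predicted = nearest_class([0.7, 0.3], class_vectors)
```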

