Image classification | TensorFlow Core


This tutorial shows how to classify images of flowers using a tf.keras.Sequential model and load data using tf.keras.utils.image_dataset_from_directory. It demonstrates the following concepts:

  • Efficiently loading a dataset off disk.
  • Identifying overfitting and applying techniques to mitigate it, including data augmentation and dropout.

This tutorial follows a basic machine learning workflow:

  1. Examine and understand data
  2. Build an input pipeline
  3. Build the model
  4. Train the model
  5. Test the model
  6. Improve the model and repeat the process

In addition, the notebook demonstrates how to convert a saved model to a TensorFlow Lite model for on-device machine learning on mobile, embedded, and IoT devices.

Setup

Import TensorFlow and other necessary libraries:

import matplotlib.pyplot as plt
import numpy as np
import PIL
import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

Download and explore the dataset

This tutorial uses a dataset of about 3,700 photos of flowers. The dataset contains five sub-directories, one per class:

flower_photo/
  daisy/
  dandelion/
  roses/
  sunflowers/
  tulips/

import pathlib

dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)

After downloading, you should now have a copy of the dataset available. There are 3,670 total images:

image_count = len(list(data_dir.glob('*/*.jpg')))
print(image_count)
3670

Here are some roses:

roses = list(data_dir.glob('roses/*'))
PIL.Image.open(str(roses[0]))

[Image: a rose from the dataset]

PIL.Image.open(str(roses[1]))

[Image: another rose from the dataset]

And some tulips:

tulips = list(data_dir.glob('tulips/*'))
PIL.Image.open(str(tulips[0]))

[Image: a tulip from the dataset]

PIL.Image.open(str(tulips[1]))

[Image: another tulip from the dataset]


Load data using a Keras utility

Next, load these images off disk using the helpful tf.keras.utils.image_dataset_from_directory utility. This will take you from a directory of images on disk to a tf.data.Dataset in just a couple lines of code. If you like, you can also write your own data loading code from scratch by visiting the Load and preprocess images tutorial.

Create a dataset

Define some parameters for the loader:

batch_size = 32
img_height = 180
img_width = 180

It's good practice to use a validation split when developing your model. Use 80% of the images for training and 20% for validation.

train_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
Found 3670 files belonging to 5 classes.
Using 2936 files for training.
val_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
Found 3670 files belonging to 5 classes.
Using 734 files for validation.

You can find the class names in the class_names attribute on these datasets. These correspond to the directory names in alphabetical order.

class_names = train_ds.class_names
print(class_names)
['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']

Visualize the data

Here are the first nine images from the training dataset:

import matplotlib.pyplot as plt

plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
  for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title(class_names[labels[i]])
    plt.axis("off")

[Image: a 3x3 grid of training images with their class names]

You will pass these datasets to the Keras Model.fit method for training later in this tutorial. If you like, you can also manually iterate over the dataset and retrieve batches of images:

for image_batch, labels_batch in train_ds:
  print(image_batch.shape)
  print(labels_batch.shape)
  break
(32, 180, 180, 3)
(32,)

The image_batch is a tensor of the shape (32, 180, 180, 3). This is a batch of 32 images of shape 180x180x3 (the last dimension refers to the RGB color channels). The labels_batch is a tensor of the shape (32,); these are the labels corresponding to the 32 images.

You can call .numpy() on the image_batch and labels_batch tensors to convert them to a numpy.ndarray.
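
For instance, here is a minimal sketch (not part of the original notebook, reusing image_batch and labels_batch from the loop above) of what that conversion looks like:

# Convert one batch of tensors to numpy.ndarray objects.
images_np = image_batch.numpy()   # float32 array of shape (32, 180, 180, 3)
labels_np = labels_batch.numpy()  # int32 array of shape (32,)
print(type(images_np), images_np.shape)
print(type(labels_np), labels_np.shape)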

Configure the dataset for performance

Make sure to use buffered prefetching, so you can yield data from disk without I/O becoming blocking. These are two important methods you should use when loading data:

  • Dataset.cache keeps the images in memory after they're loaded off disk during the first epoch. This will ensure the dataset does not become a bottleneck while training your model. If your dataset is too large to fit into memory, you can also use this method to create a performant on-disk cache.
  • Dataset.prefetch overlaps data preprocessing and model execution while training.

Interested readers can learn more about both methods, as well as how to cache data to disk in the Prefetching section of the Better performance with the tf.data API guide.

AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

Standardize the data

The RGB channel values are in the [0, 255] range. This is not ideal for a neural network; in general you should seek to make your input values small.


Here, you will standardize values to be in the [0, 1] range by using tf.keras.layers.Rescaling:

normalization_layer = layers.Rescaling(1./255)

There are two ways to use this layer. You can apply it to the dataset by calling Dataset.map:

normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
# Notice the pixel values are now in `[0,1]`.
print(np.min(first_image), np.max(first_image))
0.0 0.9970691

Or, you can include the layer inside your model definition, which can simplify deployment. Use the second approach here.

A basic Keras model

Create the model

The Keras Sequential model consists of three convolution blocks (tf.keras.layers.Conv2D) with a max pooling layer (tf.keras.layers.MaxPooling2D) in each of them. There's a fully-connected layer (tf.keras.layers.Dense) with 128 units on top of it that is activated by a ReLU activation function ('relu'). This model has not been tuned for high accuracy; the goal of this tutorial is to show a standard approach.

num_classes = len(class_names)

model = Sequential([
  layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])

Compile the model

For this tutorial, choose the tf.keras.optimizers.Adam optimizer and tf.keras.losses.SparseCategoricalCrossentropy loss function. To view training and validation accuracy for each training epoch, pass the metrics argument to Model.compile.

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

Model summary

View all the layers of the network using the Keras Model.summary method:

model.summary()
Model: "sequential"_________________________________________________________________ Layer (type) Output Shape Param # ================================================================= rescaling_1 (Rescaling) (None, 180, 180, 3) 0 conv2d (Conv2D) (None, 180, 180, 16) 448 max_pooling2d (MaxPooling2D (None, 90, 90, 16) 0 ) conv2d_1 (Conv2D) (None, 90, 90, 32) 4640 max_pooling2d_1 (MaxPooling (None, 45, 45, 32) 0 2D) conv2d_2 (Conv2D) (None, 45, 45, 64) 18496 max_pooling2d_2 (MaxPooling (None, 22, 22, 64) 0 2D) flatten (Flatten) (None, 30976) 0 dense (Dense) (None, 128) 3965056 dense_1 (Dense) (None, 5) 645 =================================================================Total params: 3,989,285Trainable params: 3,989,285Non-trainable params: 0_________________________________________________________________

Train the model

Train the model for 10 epochs with the Keras Model.fit method:

epochs = 10
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)
Epoch 1/10
92/92 [==============================] - 3s 19ms/step - loss: 1.2910 - accuracy: 0.4479 - val_loss: 1.0880 - val_accuracy: 0.5490
Epoch 2/10
92/92 [==============================] - 1s 15ms/step - loss: 0.9705 - accuracy: 0.6281 - val_loss: 0.9521 - val_accuracy: 0.6117
Epoch 3/10
92/92 [==============================] - 1s 15ms/step - loss: 0.7879 - accuracy: 0.7071 - val_loss: 0.9698 - val_accuracy: 0.6104
Epoch 4/10
92/92 [==============================] - 1s 15ms/step - loss: 0.5747 - accuracy: 0.7919 - val_loss: 0.9599 - val_accuracy: 0.6417
Epoch 5/10
92/92 [==============================] - 1s 15ms/step - loss: 0.3659 - accuracy: 0.8716 - val_loss: 1.1058 - val_accuracy: 0.6471
Epoch 6/10
92/92 [==============================] - 1s 15ms/step - loss: 0.2075 - accuracy: 0.9298 - val_loss: 1.2884 - val_accuracy: 0.6158
Epoch 7/10
92/92 [==============================] - 1s 15ms/step - loss: 0.1258 - accuracy: 0.9646 - val_loss: 1.6268 - val_accuracy: 0.6322
Epoch 8/10
92/92 [==============================] - 1s 15ms/step - loss: 0.0825 - accuracy: 0.9768 - val_loss: 1.7297 - val_accuracy: 0.6335
Epoch 9/10
92/92 [==============================] - 1s 15ms/step - loss: 0.0522 - accuracy: 0.9850 - val_loss: 1.7817 - val_accuracy: 0.6117
Epoch 10/10
92/92 [==============================] - 1s 15ms/step - loss: 0.0581 - accuracy: 0.9837 - val_loss: 1.8938 - val_accuracy: 0.6281

Visualize training results

Create plots of the loss and accuracy on the training and validation sets:

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

[Plot: training and validation accuracy and loss over 10 epochs]

The plots show that training accuracy and validation accuracy diverge by a large margin, and the model has achieved only around 60% accuracy on the validation set.

The following tutorial sections show how to inspect what went wrong and try to increase the overall performance of the model.

Overfitting

In the plots above, the training accuracy increases linearly over time, whereas the validation accuracy stalls around 60% during training. The noticeable gap between training and validation accuracy is a sign of overfitting.


When there are a small number of training examples, the model sometimes learns from noise or unwanted details in the training examples, to an extent that negatively impacts the performance of the model on new examples. This phenomenon is known as overfitting. It means that the model will have a difficult time generalizing to a new dataset.

There are multiple ways to fight overfitting in the training process. In this tutorial, you'll use data augmentation and add dropout to your model.

Data augmentation

Overfitting generally occurs when there are a small number of training examples. Data augmentation takes the approach of generating additional training data from your existing examples by augmenting them using random transformations that yield believable-looking images. This helps expose the model to more aspects of the data and generalize better.

You will implement data augmentation using the following Keras preprocessing layers: tf.keras.layers.RandomFlip, tf.keras.layers.RandomRotation, and tf.keras.layers.RandomZoom. These can be included inside your model like other layers, and run on the GPU.

data_augmentation = keras.Sequential(
  [
    layers.RandomFlip("horizontal",
                      input_shape=(img_height, img_width, 3)),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
  ]
)

Visualize a few augmented examples by applying data augmentation to the same image several times:

plt.figure(figsize=(10, 10))
for images, _ in train_ds.take(1):
  for i in range(9):
    augmented_images = data_augmentation(images)
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_images[0].numpy().astype("uint8"))
    plt.axis("off")

[Image: the same training example shown with nine different random augmentations]

You will add data augmentation to your model before training in the next step.

Dropout

Another technique to reduce overfitting is to introduce dropout regularization to the network.

When you apply dropout to a layer, it randomly drops out (by setting the activation to zero) a number of output units from the layer during the training process. Dropout takes a fractional rate as its argument, such as 0.1, 0.2, or 0.4, which means randomly dropping out 10%, 20%, or 40% of the output units of the layer it is applied to; the short sketch below illustrates this behavior.
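
As an illustrative aside (not part of the original notebook), you can observe the effect of a standalone Dropout layer on a toy tensor; passing training=True forces the layer to behave as it does during Model.fit:

import tensorflow as tf

# A Dropout layer with a 20% drop rate (illustrative value).
dropout = tf.keras.layers.Dropout(0.2)
x = tf.ones((1, 10))

# During training, roughly 20% of the units are zeroed and the surviving
# units are scaled by 1 / (1 - 0.2) to preserve the expected sum.
print(dropout(x, training=True).numpy())

# At inference time (training=False, the default), dropout is a no-op.
print(dropout(x, training=False).numpy())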

Create a new neural network with tf.keras.layers.Dropout before training it using the augmented images:

model = Sequential([
  data_augmentation,
  layers.Rescaling(1./255),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Dropout(0.2),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes, name="outputs")
])

Compile and train the model

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.summary()
Model: "sequential_2"_________________________________________________________________ Layer (type) Output Shape Param # ================================================================= sequential_1 (Sequential) (None, 180, 180, 3) 0 rescaling_2 (Rescaling) (None, 180, 180, 3) 0 conv2d_3 (Conv2D) (None, 180, 180, 16) 448 max_pooling2d_3 (MaxPooling (None, 90, 90, 16) 0 2D) conv2d_4 (Conv2D) (None, 90, 90, 32) 4640 max_pooling2d_4 (MaxPooling (None, 45, 45, 32) 0 2D) conv2d_5 (Conv2D) (None, 45, 45, 64) 18496 max_pooling2d_5 (MaxPooling (None, 22, 22, 64) 0 2D) dropout (Dropout) (None, 22, 22, 64) 0 flatten_1 (Flatten) (None, 30976) 0 dense_2 (Dense) (None, 128) 3965056 outputs (Dense) (None, 5) 645 =================================================================Total params: 3,989,285Trainable params: 3,989,285Non-trainable params: 0_________________________________________________________________
epochs = 15
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)
Epoch 1/15
92/92 [==============================] - 4s 30ms/step - loss: 1.4681 - accuracy: 0.3931 - val_loss: 1.1181 - val_accuracy: 0.5272
Epoch 2/15
92/92 [==============================] - 3s 28ms/step - loss: 1.1017 - accuracy: 0.5504 - val_loss: 1.0365 - val_accuracy: 0.5913
Epoch 3/15
92/92 [==============================] - 3s 28ms/step - loss: 1.0359 - accuracy: 0.5943 - val_loss: 1.0137 - val_accuracy: 0.6104
Epoch 4/15
92/92 [==============================] - 3s 29ms/step - loss: 0.9272 - accuracy: 0.6410 - val_loss: 0.8631 - val_accuracy: 0.6662
Epoch 5/15
92/92 [==============================] - 3s 28ms/step - loss: 0.8648 - accuracy: 0.6706 - val_loss: 0.8629 - val_accuracy: 0.6649
Epoch 6/15
92/92 [==============================] - 3s 29ms/step - loss: 0.8158 - accuracy: 0.6887 - val_loss: 0.8362 - val_accuracy: 0.6771
Epoch 7/15
92/92 [==============================] - 3s 29ms/step - loss: 0.7793 - accuracy: 0.6962 - val_loss: 0.8271 - val_accuracy: 0.6839
Epoch 8/15
92/92 [==============================] - 3s 28ms/step - loss: 0.7310 - accuracy: 0.7214 - val_loss: 0.8039 - val_accuracy: 0.7003
Epoch 9/15
92/92 [==============================] - 3s 28ms/step - loss: 0.7222 - accuracy: 0.7275 - val_loss: 0.7739 - val_accuracy: 0.6921
Epoch 10/15
92/92 [==============================] - 3s 28ms/step - loss: 0.6738 - accuracy: 0.7466 - val_loss: 0.8141 - val_accuracy: 0.6866
Epoch 11/15
92/92 [==============================] - 3s 27ms/step - loss: 0.6683 - accuracy: 0.7480 - val_loss: 0.7524 - val_accuracy: 0.7125
Epoch 12/15
92/92 [==============================] - 3s 28ms/step - loss: 0.6372 - accuracy: 0.7524 - val_loss: 0.7144 - val_accuracy: 0.7248
Epoch 13/15
92/92 [==============================] - 3s 28ms/step - loss: 0.6151 - accuracy: 0.7575 - val_loss: 0.7929 - val_accuracy: 0.6744
Epoch 14/15
92/92 [==============================] - 3s 28ms/step - loss: 0.5850 - accuracy: 0.7786 - val_loss: 0.7484 - val_accuracy: 0.7234
Epoch 15/15
92/92 [==============================] - 3s 28ms/step - loss: 0.5567 - accuracy: 0.7916 - val_loss: 0.8348 - val_accuracy: 0.7057

Visualize training results

After applying data augmentation and tf.keras.layers.Dropout, there is less overfitting than before, and training and validation accuracy are more closely aligned:

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

[Plot: training and validation accuracy and loss over 15 epochs]


Predict on new data

Use your model to classify an image that wasn't included in the training or validation sets.

sunflower_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/592px-Red_sunflower.jpg"
sunflower_path = tf.keras.utils.get_file('Red_sunflower', origin=sunflower_url)

img = tf.keras.utils.load_img(
    sunflower_path, target_size=(img_height, img_width)
)
img_array = tf.keras.utils.img_to_array(img)
img_array = tf.expand_dims(img_array, 0)  # Create a batch

predictions = model.predict(img_array)
score = tf.nn.softmax(predictions[0])

print(
    "This image most likely belongs to {} with a {:.2f} percent confidence."
    .format(class_names[np.argmax(score)], 100 * np.max(score))
)
1/1 [==============================] - 0s 133ms/step
This image most likely belongs to sunflowers with a 98.02 percent confidence.

Use TensorFlow Lite

TensorFlow Lite is a set of tools that enables on-device machine learning by helping developers run their models on mobile, embedded, and edge devices.

Convert the Keras Sequential model to a TensorFlow Lite model

To use the trained model with on-device applications, first convert it to a smaller and more efficient model format called a TensorFlow Lite model.

In this example, take the trained Keras Sequential model and use tf.lite.TFLiteConverter.from_keras_model to generate a TensorFlow Lite model:

# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model.
with open('model.tflite', 'wb') as f:
  f.write(tflite_model)

The TensorFlow Lite model you saved in the previous step can contain several function signatures. The Keras model converter API uses the default signature automatically. Learn more about TensorFlow Lite signatures.

Run the TensorFlow Lite model

You can access the TensorFlow Lite saved model signatures in Python via the tf.lite.Interpreter class.

Load the model with the Interpreter:

TF_MODEL_FILE_PATH = 'model.tflite'  # The default path to the saved TensorFlow Lite model

interpreter = tf.lite.Interpreter(model_path=TF_MODEL_FILE_PATH)

Print the signatures from the converted model to obtain the names of the inputs (and outputs):

interpreter.get_signature_list()
{'serving_default': {'inputs': ['sequential_1_input'], 'outputs': ['outputs']}}

In this example, you have one default signature called serving_default. In addition, the name of the 'inputs' is 'sequential_1_input', while the 'outputs' are called 'outputs'. You can look up these first and last Keras layer names when running Model.summary, as demonstrated earlier in this tutorial.

Now you can test the loaded TensorFlow Lite model by performing inference on a sample image with tf.lite.Interpreter.get_signature_runner, passing the signature name as follows:

classify_lite = interpreter.get_signature_runner('serving_default')
classify_lite
<tensorflow.lite.python.interpreter.SignatureRunner at 0x7ff35478f3a0>

Similar to what you did earlier in the tutorial, you can use the TensorFlow Lite model to classify images that weren't included in the training or validation sets.

You have already tensorized that image and saved it as img_array. Now, pass it to the first argument (the name of the 'inputs') of the loaded TensorFlow Lite model (predictions_lite), compute softmax activations, and then print the prediction for the class with the highest computed probability.


predictions_lite = classify_lite(sequential_1_input=img_array)['outputs']
score_lite = tf.nn.softmax(predictions_lite)

assert np.allclose(predictions, predictions_lite)

print(
    "This image most likely belongs to {} with a {:.2f} percent confidence."
    .format(class_names[np.argmax(score_lite)], 100 * np.max(score_lite))
)
This image most likely belongs to sunflowers with a 98.02 percent confidence.

Of the five classes—'daisy', 'dandelion', 'roses', 'sunflowers', and 'tulips'—the model should predict the image belongs to sunflowers, which is the same result as before the TensorFlow Lite conversion.

Next steps

This tutorial showed how to train a model for image classification, test it, convert it to the TensorFlow Lite format for on-device applications (such as an image classification app), and perform inference with the TensorFlow Lite model with the Python API.

You can learn more about TensorFlow Lite through tutorials and guides.

FAQs

What are the classifications of images?

Image classification is the process of categorizing and labeling groups of pixels or vectors within an image based on specific rules. The categorization law can be devised using one or more spectral or textural characteristics. Two general methods of classification are 'supervised' and 'unsupervised'.

What is the best classifier for image classification?

The Convolutional Neural Network (CNN) is the most popular neural network model used for image classification problems.

What is image classification, with an example?

The task of identifying what an image represents is called image classification. An image classification model is trained to recognize various classes of images. For example, you may train a model to recognize photos representing three different types of animals: rabbits, hamsters, and dogs.

Is CNN good for image classification?

The Convolutional Neural Network (CNN or ConvNet) is a subtype of neural network that is mainly used for applications in image and speech recognition. Its built-in convolutional layers reduce the high dimensionality of images without losing information. That is why CNNs are especially suited for this use case.

Why is image classification used?

The objective of image classification is to identify and portray, as a unique gray level (or color), the features occurring in an image in terms of the object or type of land cover these features actually represent on the ground. Image classification is perhaps the most important part of digital image analysis.

Why do we do image classification?

Image classification is an essential part of building your machine learning algorithm. Your model can be constructed using supervised learning (typically with CNNs) or unsupervised learning. The approach you decide to go with is highly dependent on your data, what you need to achieve, and which method is best for your workflow.

What are the two types of image classification?

Unsupervised and supervised image classification are the two most common approaches. However, object-based classification has gained more popularity because it's useful for high-resolution data.

Which CNN model is best for image classification?

VGG16 is a pre-trained CNN model used for image classification. It is trained on a large and varied dataset and can be fine-tuned to fit image classification datasets with ease.

Is SVM good for image classification?

SVM is a very good algorithm for classification. It's a supervised learning algorithm that is mainly used to classify data into different classes, and it trains on a set of labeled data.

What is image classification in simple words?

In simple words, image classification is a technique used to classify or predict the class of a specific object in an image. The main goal of this technique is to accurately identify the features in an image.

Is image classification supervised or unsupervised?

Image classification is mainly divided into two categories: (1) supervised image classification and (2) unsupervised image classification. Supervised image classification requires a training stage, which means we first need to select some pixels from each class, called training pixels.

How does an image classifier work?

Image classification is a supervised learning problem: define a set of target classes (objects to identify in images), and train a model to recognize them using labeled example photos. Early computer vision models relied on raw pixel data as the input to the model.

Why is CNN better than SVM?

Clearly, the CNN outperformed the SVM classifier in terms of testing accuracy. In comparing the overall accuracies of the CNN and SVM classifiers, the CNN was determined to have a statistically significant advantage over the SVM when pixel-based reflectance samples were used, without regard to segmentation size.

Why is CNN better than KNN?

When a CNN implemented in Keras on top of TensorFlow is compared with KNN, the two perform competitively on this dataset, but the CNN produces higher accuracy than KNN and is hence chosen as the better approach.

Which algorithm is best for image recognition?

The CNN is a powerful algorithm for image processing. These algorithms are currently the best we have for the automated processing of images; many companies use them for tasks such as identifying the objects in an image. Images contain data as combinations of RGB values.

Why is CNN used for image classification?

CNNs are used for image classification and recognition because of their high accuracy. The CNN was proposed by computer scientist Yann LeCun in the late 1990s, inspired by the way human visual perception recognizes things.

Why is deep learning best for image classification?

Image classification with deep learning most often involves convolutional neural networks, or CNNs. In CNNs, the nodes in the hidden layers don't always share their output with every node in the next layer (these are known as convolutional layers). Deep learning allows machines to identify and extract features from images.

What are features in image classification?

Well-known examples of image features include corners, SIFT and SURF descriptors, blobs, and edges. Not all of them fulfill the invariance and insensitivity requirements of ideal features. However, depending on the classification task and the expected geometry of the objects, features can be wisely selected.

What is image classification in AI?

Image classification involves teaching an Artificial Intelligence (AI) how to detect objects in an image based on their unique properties. An example of image classification is an AI that detects how likely an object in an image is to be an apple, an orange, or a pear.

What are the different types of classification?

The three types of classification are artificial classification, natural classification, and phylogenetic classification.

What is image classification in GIS?

Image classification refers to the task of assigning classes (defined in a land cover and land use classification system, known as the schema) to all the pixels in a remotely sensed image. The output raster from image classification can be used to create thematic maps.

Which method is more preferable in image classification?

Maximum likelihood image analysis is the best method for land use / land cover classification; it is based on probability values and the occurrence of parametric values of multispectral wavelengths ranging from visible to microwave.

Which optimizer is best for image classification?

Adam is the best optimizer if one wants to train a neural network in less time and more efficiently. For sparse data, use optimizers with a dynamic learning rate. If you want to use a gradient descent algorithm, mini-batch gradient descent is the best option.

How many layers are there in image classification?

There are three such layers (convolution and max pooling) to extract the features of images. If very complex features need to be learned, more layers should be added, making the model much deeper.

Is KNN good for image classification?

k-NN: A Simple Classifier

The k-Nearest Neighbor classifier is by far the simplest machine learning and image classification algorithm. In fact, it's so simple that it doesn't actually “learn” anything.

Which is better, SVM or CNN?

Classification accuracy of SVM and CNN: in this study, it is shown that SVM outperforms CNN, giving the best results in classification; the accuracy on the PCA band is 97.44% for linear SVM, 98.84% for SVM-RBF, and 94.01% for the CNN, but on all bands linear SVM achieves only 96.35% accuracy due to the big hyperspectral data ...

Why does CNN use SVM?

The proposed hybrid model combines the key properties of both classifiers: the CNN works as an automatic feature extractor and the SVM works as a binary classifier. The MNIST dataset of handwritten digits is used for training and testing the algorithm adopted in the proposed model.

Can we use logistic regression for image classification?

Logistic regression is very popular in machine learning and statistics, and it works well on both binary and multiclass classification; the short sketch below illustrates the idea.
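
As an illustrative aside (not from the original article), here is a minimal scikit-learn sketch that classifies the built-in 8x8 digit images with logistic regression:

from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load 8x8 grayscale digit images; each image is already flattened to 64 features.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0)

# Multiclass classification is handled automatically by scikit-learn.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))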

What are the steps in image classification?

Remember to make appropriate changes according to your setup; a rough code sketch of these steps follows the list.
  1. Choose a dataset. ...
  2. Prepare the dataset for training. ...
  3. Create training data. ...
  4. Shuffle the dataset. ...
  5. Assign labels and features. ...
  6. Normalise X and convert labels to categorical data. ...
  7. Split X and Y for use in a CNN.
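
As a loose illustration of these steps, here is a hedged Python sketch; the 'my_dataset/' path, the 64x64 image size, and the 80/20 split are hypothetical choices, not values from the article:

import numpy as np
import tensorflow as tf

# Steps 1-2: choose a dataset and prepare it for training
# ('my_dataset/' is a hypothetical folder with one sub-directory per class).
ds = tf.keras.utils.image_dataset_from_directory(
    'my_dataset/', image_size=(64, 64), batch_size=None)

# Steps 3 and 5: create training data and assign labels and features.
features, labels = [], []
for image, label in ds:
    features.append(image.numpy())
    labels.append(int(label.numpy()))
X = np.stack(features)
Y = np.array(labels)

# Step 4: shuffle the dataset.
idx = np.random.permutation(len(X))
X, Y = X[idx], Y[idx]

# Step 6: normalise X and convert labels to categorical (one-hot) data.
X = X / 255.0
Y = tf.keras.utils.to_categorical(Y)

# Step 7: split X and Y for use in a CNN.
split = int(0.8 * len(X))
x_train, x_val = X[:split], X[split:]
y_train, y_val = Y[:split], Y[split:]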

Is image classification computer vision?

Probably one of the most well-known tasks in computer vision is image classification. It allows for the classification of a given image as belonging to one of a set of predefined categories. Let's take a simple binary example: we want to categorize images according to whether they contain a tourist attraction or not.

What is the difference between image classification and object recognition?

Image classification involves predicting the class of one object in an image. Object localization refers to identifying the location of one or more objects in an image and drawing a bounding box around their extent. Object detection combines these two tasks, localizing and classifying one or more objects in an image.

Is image classification a reinforcement learning problem?

Issues related to image classification motivated researchers to use Reinforcement Learning (RL) in image classification experiments to enhance it. RL is a self-learning approach where machines can learn from experience. This paper aims to study the influence of RL on image classification trials.

Is CNN supervised or unsupervised?

The CNN is a supervised type of deep learning, most often used in image recognition and computer vision.

Which is better, supervised or unsupervised classification?

Supervised techniques deal with labeled data, where the output data patterns are known to the system. This makes supervised learning models more accurate than unsupervised learning models, as the expected output is known beforehand.

What is image classification in computer vision?

Image classification is a subdomain of computer vision dealing with categorizing and labeling groups of pixels or vectors within an image using a collection of predefined tags or categories that an algorithm has been trained on. We can distinguish between supervised and unsupervised classification.

How is image classification defined in papers?

Image classification is a fundamental task that attempts to comprehend an entire image as a whole. The goal is to classify the image by assigning it to a specific label. Typically, image classification refers to images in which only one object appears and is analyzed.

What are image classification algorithms?

Put simply, image classification in a computer's view is the analysis of statistical image data using algorithms. In digital image processing, image classification is done by automatically grouping pixels into specified categories, so-called “classes.”

What is CNN image classification?

A CNN is a machine learning algorithm that enables a machine to understand the features of an image and remember them, so that it can guess the class of a new image fed to it.

What is image classification using deep learning?

Image classification is one of the most important applications of deep learning and Artificial Intelligence. It refers to assigning labels to images based on certain characteristics or features present in them.

What architecture does CNN use?

The LeNet-5 architecture is perhaps the most widely known CNN architecture. It was created by Yann LeCun in 1998 and is widely used for handwritten digit recognition (MNIST).

Which machine learning algorithm is best for image classification?

In the image classification field, traditional machine learning algorithms, such as K-Nearest Neighbor (KNN) and Support Vector Machine (SVM), are widely adopted to solve classification problems and perform especially well on small datasets.

What is VGG16 in deep learning?

VGG-16 is a convolutional neural network that is 16 layers deep. You can load a pretrained version of the network trained on more than a million images from the ImageNet database [1]. The pretrained network can classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals.
