ICDAR 2005 Robust Reading Competitions

Created: 2011-06-28
Last updated: 2013-01-19

Contact Author

Prof Simon Lucas
School of Computer Science and Electronic Engineering
University of Essex
Email: sml@essex.ac.uk

Current Version



Scene text, character recognition, word recognition, text localization, robust reading


The datasets below were created for the ICDAR 2005 Robust Reading competitions organised by Prof Simon Lucas. You can find more details about these competitions at the ICDAR 2005 competition page.

Four independent competitions were organised: Character Recognition, Word Recognition, Text Locating and Reading. These tasks were run in a closed mode, meaning that participants had to submit an operational version of their system for independent testing. Of these tasks, training datasets are available only for the character recognition competition. The datasets used for final performance evaluation are not available for any of the competitions.

Character Recognition

Sample digits of the dataset

The character recognition datasets are in the simple MNIST format, with the same image size as the original MNIST dataset (28x28 pixels). Each pixel is represented as a grey level in the range 0 (black) to 255 (white). A random selection of 10 digits from each class is shown in the image. All segmentation and labelling was performed while observing the full-size colour images (i.e. including the surrounding context).

Three datasets are provided, covering digits, lower case characters and upper case characters. For each dataset there is an images.bin file that contains the images in the MNIST format and a labels.bin file that contains the class labels in the MNIST format.
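As a rough illustration of reading such files, the sketch below parses image and label buffers. It assumes that images.bin and labels.bin follow the standard MNIST IDX layout (big-endian header with magic number 2051 for images and 2049 for labels, followed by one unsigned byte per pixel or label); this page does not confirm the exact header, so check it against the downloaded files. The function names parse_idx_images, parse_idx_labels and load_dataset are hypothetical.

```python
import struct


def parse_idx_images(data: bytes):
    """Parse an IDX3 image buffer into a list of images (lists of pixel rows).

    Assumption: standard MNIST header, i.e. four big-endian uint32 values
    (magic=2051, image count, rows, cols) followed by raw pixel bytes.
    """
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError(f"unexpected image magic number: {magic}")
    size = rows * cols
    if len(data) < 16 + count * size:
        raise ValueError("image buffer is shorter than its header claims")
    images = []
    for i in range(count):
        start = 16 + i * size
        pixels = data[start:start + size]
        # Split the flat pixel run into rows of `cols` grey levels (0-255).
        images.append([list(pixels[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images


def parse_idx_labels(data: bytes):
    """Parse an IDX1 label buffer into a list of integer class labels.

    Assumption: standard MNIST header, i.e. two big-endian uint32 values
    (magic=2049, label count) followed by one byte per label.
    """
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:
        raise ValueError(f"unexpected label magic number: {magic}")
    if len(data) < 8 + count:
        raise ValueError("label buffer is shorter than its header claims")
    return list(data[8:8 + count])


def load_dataset(images_path: str, labels_path: str):
    """Read both files from disk and return (images, labels)."""
    with open(images_path, "rb") as f:
        images = parse_idx_images(f.read())
    with open(labels_path, "rb") as f:
        labels = parse_idx_labels(f.read())
    if len(images) != len(labels):
        raise ValueError("image and label counts differ")
    return images, labels
```

For example, load_dataset("images.bin", "labels.bin") would return the images as nested lists of grey levels together with their labels. How label values map to characters in the lower case and upper case sets is not stated here and would need to be checked against the data.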

For the digits, the data is also available as a directory tree of GIF images in addition to the MNIST format. This allows easy viewing without a special-purpose application. For each image in the directory tree, the file c*.gif is the rectangular grey-level image of the character, normalised so that the maximum dimension is 56 pixels, and the file n*.gif is the same image centred on a 28 x 28 square with the margins filled with Gaussian noise (with mean and standard deviation derived from the statistics of that image).
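The centring step described above can be sketched roughly as follows. This is not the procedure used to build the dataset, only an illustration under stated assumptions: the input image already fits inside the square, the noise statistics are the mean and population standard deviation of all pixels in that image, and noise values are rounded and clamped to the range 0 to 255. The function name centre_with_noise and its parameters are hypothetical.

```python
import random
import statistics


def centre_with_noise(image, size=28, seed=None):
    """Centre a grey-level image on a size x size canvas of Gaussian noise.

    `image` is a list of rows, each a list of ints in 0..255.
    Assumptions (not confirmed by the dataset description): the image
    already fits inside the canvas, and the noise uses the mean and
    population standard deviation of the image's own pixels.
    """
    rows, cols = len(image), len(image[0])
    if rows > size or cols > size:
        raise ValueError("image must fit inside the target square")

    pixels = [p for row in image for p in row]
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)

    rng = random.Random(seed)
    # Fill the whole canvas with noise, clamped to valid grey levels.
    canvas = [
        [min(255, max(0, round(rng.gauss(mean, std)))) for _ in range(size)]
        for _ in range(size)
    ]

    # Paste the original image over the centre of the noisy canvas.
    top = (size - rows) // 2
    left = (size - cols) // 2
    for r in range(rows):
        canvas[top + r][left:left + cols] = list(image[r])
    return canvas
```

Passing a fixed seed makes the noise reproducible, which is convenient when comparing the output against the n*.gif files by eye.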


  1. S.M. Lucas, "ICDAR 2005 Text Locating Competition Results", Proc. of the 8th Int. Conf. on Document Analysis and Recognition (ICDAR 2005), pp. 80-84, Vol. 1, 2005

Submitted Files

Version 1.0

Character Recognition
