============================================================================
IAPR TC-11 Newsletter
September 2011
http://www.iapr-tc11.org

========== Contents ========================================================

* Message from the Editor
* Dates 'n' Deadlines
  - DAS 2012, Gold Coast, Australia, October 14 (extended!)
* Short Summary: First ICDAR Doctoral Consortium
* Report:
  - First International Workshop on Automated Forensic Handwriting
    Analysis (AFHA 2011), in conjunction with ICDAR 2011
* Call for Participation:
  - RISOT 2011: Retrieval from Indic Script OCRed Text
* Announcement:
  - Second NIST OpenHaRT Evaluation 2012
* Call for Contributions

============================================================================
========== Message from the Editor =========================================

Having just returned from a successful ICDAR 2011, I would like to welcome
you to the September edition of our newsletter. This year ICDAR attracted
387 participants from 35 different countries and featured 90 oral and 188
poster presentations as well as 3 invited talks, a panel discussion, and
16 competitions (a record for ICDAR). ICDAR was accompanied by the
first-ever ICDAR Doctoral Consortium, for which you will find a short
summary and a link to available materials below. In addition, ICDAR
featured 4 successful satellite workshops (for AFHA 2011 you will find a
report below), as well as GREC, and 6 tutorials (more detailed reports on
some of these events will be published in future editions of this
newsletter).

The next ICDAR in 2013 will be held in Washington, DC. The ICDAR audience
chose Tunis, Tunisia as the venue for ICDAR 2015, so in 2015 ICDAR will be
held in Africa for the first time.

For those of you thinking about a trip "Down Under", the good news is that
the deadline for DAS 2012, to be held in Gold Coast, Australia in March
next year, has been extended to October 14.
Furthermore, in this newsletter you will also find a Call for Participation
for RISOT 2011, a text retrieval competition based on OCRed Indic texts, and
the announcement of the Second NIST OpenHaRT Evaluation 2012.

Gernot A. Fink, IAPR-TC11 Newsletter Editor
Gernot.Fink@udo.edu

============================================================================
========== Dates 'n' Deadlines =============================================

Event/Location/Web:                Event Date:            Deadline
                                                          (paper submission):
----------------------------------------------------------------------------
* DAS 2012, Gold Coast, Australia  March 27-29, 2012      October 14 (!!!)
  (http://www.ict.griffith.edu.au/das2012)
* ICFHR 2012, Bari, Italy          September 18-20, 2012  February 28
  (http://www.icfhr2012.uniba.it)
* ICPR 2012, Tsukuba, Japan        November 11-15, 2012   March 31
  (http://www.icpr2012.org)

============================================================================
========== Short Summary: First ICDAR Doctoral Consortium ==================

ICDAR 2011 Doctoral Consortium
September 18, 2011

The first-ever ICDAR Doctoral Consortium was held on September 18, the day
before the main conference. A total of 21 Ph.D. students participated,
along with a majority of the 19 volunteer mentors who had been working with
them over the past several months. The event was regarded as a big success
by all involved and will no doubt be repeated at ICDAR 2013 in Washington,
DC. Look for a more complete report next month. For now, we refer you to
the webpage we have set up on the TC-11 website:

http://www.iapr-tc11.org/mediawiki/index.php/ICDAR2011_Doctoral_Consortium

Daniel P. Lopresti, IAPR-TC11 Chair
on behalf of the ICDAR 2011 Doctoral Consortium Organizing Committee

============================================================================
========== Report: AFHA 2011 ===============================================

First International Workshop on Automated Forensic Handwriting Analysis
(AFHA 2011)
17-18 September 2011, Beijing, China

Workshop Report:
AFHA 2011 Chairs: Marcus Liwicki, Michael Blumenstein, Elisa van den
Heuvel, Bryan Found, Charles Berger, Reinoud Stoel
Report prepared by: Muhammad Imran Malik

Online proceedings available @ http://ceur-ws.org/Vol-768
The slides of the tutorial sessions are available @
http://www.dfki.de/~liwicki/sigTutorial2011/

The 1st International Workshop on Automated Forensic Handwriting Analysis
was held as a satellite workshop of ICDAR 2011 in Beijing, China, on
September 17-18, 2011. The aim of AFHA 2011 was to bring together
researchers in the field of automated handwriting analysis and signature
verification and experts from the forensic handwriting examination
community. It was organized as a two-day combined workshop and tutorial in
which participants from the two communities had an open opportunity to
interact directly and to understand each other's demands with respect to
forensic handwriting analysis.

The response from the two communities was very encouraging, with more than
thirty registrations. About ten Forensic Handwriting Examiners (FHEs) from
Argentina, Australia, Canada, China, Greece, Hungary, South Africa, and the
Netherlands participated in the event. The remaining participants included
experts and students from the pattern recognition (PR) community of
different countries including Australia, China, France, Germany, India,
Italy, Pakistan, the Philippines, Qatar, Tunisia, and the USA.

On the first day, an introductory tutorial on forensic handwriting
examination was given.
There were three sessions in all, dedicated to FHEs, PR experts, and
plenary discussion. In the first session, Bryan Found (Australia), Charles
Berger (the Netherlands), and Reinoud Stoel (the Netherlands), being FHEs
themselves, presented their viewpoint. They explicitly outlined what FHEs
demand of automated systems. They also described the complexities involved
in forensic cases and why the output generated by automated handwriting
analysis systems so far is not acceptable in a court of law. The FHEs
specifically described the types of genuine and forged handwriting they
have to deal with in real forensic scenarios. To give the PR community a
clear idea of their work, the FHEs provided various examples from their
real casework, and the PR participants were involved in solving them in
practice.

The viewpoint of the PR community, and how PR researchers approach
handwriting in general, was presented by Marcus Liwicki (Germany) and
Michael Blumenstein (Australia) in the second session. Their talks gave an
overview of the field, historical perspectives, and various approaches used
in computer science to perform automated handwriting analysis. The results
of some recent signature verification competitions containing data from
FHEs were also presented and discussed.

The last session of the day was an initial plenary discussion. It served as
an icebreaker, allowing people from the two communities to discuss their
ideas. Participants commented on what they thought of current research on
handwriting analysis with respect to their particular background (FHE, PR
expert, or student). It was suggested that FHEs should encourage the use of
automated systems in their real casework, while PR researchers should focus
explicitly on the demands of FHEs so that the automated systems they
develop fulfill the real casework needs of FHEs.

The AFHA 2011 participants had an opportunity to socialize at the dinner
given by Marcus Liwicki at the end of the first day. This was important, as
it allowed people from the two communities to better understand each other
and to discuss related issues in a much more candid environment.

The second day was dedicated to a workshop on recent research activities.
It had two sessions. First, participants with accepted report papers had
the opportunity to talk about their research. Subsequently, in a panel
discussion session, all participants were able to state their points of
view and discuss selected topics of the two communities.

The first session, focusing on paper submissions about emerging approaches,
was divided into three sections: General Aspects, Features, and Automatic
Verification. Eight papers were accepted for presentation. In the first
section, chaired by Marcus Liwicki, two papers were presented, covering the
general aspects of signature verification and the effects of data selection
and sampling on signature verification. The second section, chaired by
Muhammad Imran Malik, focused on the classification of features into strong
and weak commodities for signature verification and on the comparison of
forensic and computing features for document retrieval. The third section,
chaired by Michael Blumenstein, contained four papers that presented
different approaches for automatic identification and verification of
handwriting.

The second session of the day was a plenary discussion. It was one of the
most important sessions of the entire two-day event. Here the major
findings of the interaction between the two communities, i.e., the forensic
document examination community and the pattern recognition community, were
summarized. Various terms for specifying different types of handwriting
forgeries were suggested and finalized according to suggestions from the
two communities.
It was also agreed that researchers in computer science/pattern recognition
and forensic science will use the definitions of terms agreed here. Various
important issues regarding the future of AFHA were also discussed. A
detailed report on this plenary session will appear in the next IAPR
newsletter. In summary, it was agreed to hold the tutorial session every
year along with the ICFHR conference, and the combined tutorial plus
workshop every two years as a satellite workshop of ICDAR.

This two-day combined workshop and tutorial was a success according to the
participants from both the FHE and PR communities. The FHEs found the event
much better than expected. It was a good opportunity for them to look under
the hood of various automated handwriting analysis systems. They
appreciated the non-mathematical explanations of different automated
systems, especially those by Marcus Liwicki and Michael Blumenstein. The PR
experts said that they now feel better positioned to work in line with the
FHEs' expectations. Both groups said they now felt less hesitant to draw on
each other's experience for possible future research and joint funding of
projects.

This unique combination of tutorial and workshop was very beneficial for
newcomers to the field, as well as for those who had interesting ongoing
research results and wanted to discuss them and other topics in a broad
group consisting of experts from the document analysis field as well as
experts from the forensic handwriting examination community. Since this was
the first workshop of the series and the response was enormous, we hope to
organize even better workshops in the future, in which we will try to
further widen the scope of our topics and invite more FHEs and PR experts
from different countries.
============================================================================
========== Call for Participation: RISOT 2011 ==============================

RISOT: Retrieval from Indic Script OCRed Text
(from http://www.isical.ac.in/~clia/risot/risot.html)

Introduction

RISOT focuses on evaluating IR effectiveness on Indic script OCRed text.
The participants will be provided with a relevance-judged collection of
62,825 articles of a leading Bangla newspaper, Anandabazar Patrika
(2004-2006). For each article, both the original digital text and the
corresponding OCR results are given. Relevance judgments are available for
92 topics. The OCR output is obtained by rendering each digital document as
a document image, which is then processed by a Bangla OCR system. The
document images vary in font faces, character styles, and sizes. The
character-level (more specifically, Unicode-level) accuracy of the OCR
engine is about 92%. For instance, if <#><#> is misrecognized as <#>, this
incurs 2 Unicode-level errors; if <#><#><#> is misrecognized as <#>, then 3
errors are counted.

The participants in the 2011 RISOT pilot task are expected to develop IR
techniques to retrieve documents from these collections and report the MAP
and Precision@10 separately for the digital text collection and for the OCR
collection. Retrieval from the OCR collection is expected to show
degradation in IR effectiveness, and therefore the search algorithms are
expected to make use of additional techniques (e.g., OCR error correction,
modeling of OCR errors for IR purposes, etc.) to improve the performance of
IR from OCRed text.

In subsequent years of FIRE we anticipate conducting an extended version of
RISOT. Although we are asking participants to compute their own results
using existing relevance judgments for the 2011 pilot task, in future years
we would expect to conduct blinded evaluations using new relevance
judgments.
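
As an illustration of the two measures that pilot-task participants are
asked to report, the following is a minimal Python sketch of MAP and
Precision@10 computed from ranked result lists and relevance judgments. It
is not the organizers' evaluation tool. The function names, the dictionary
layout (topic id mapped to a ranked list or to a set of relevant document
ids), and the toy data are assumptions made for this example, and each
ranked list is assumed to contain no duplicate document ids.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Uninterpolated average precision for one topic.

    Precision is taken at the rank of every relevant document retrieved,
    summed, and divided by the total number of relevant documents.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def evaluate(runs: Dict[str, List[str]],
             qrels: Dict[str, Set[str]],
             k: int = 10) -> Dict[str, float]:
    """Average both measures over all judged topics (hypothetical layout)."""
    topics = sorted(qrels)
    if not topics:
        raise ValueError("no judged topics")
    ap = [average_precision(runs.get(t, []), qrels[t]) for t in topics]
    pk = [precision_at_k(runs.get(t, []), qrels[t], k) for t in topics]
    return {"MAP": sum(ap) / len(topics), f"P@{k}": sum(pk) / len(topics)}


# Toy data only: two topics, one run on clean text and one on OCR output.
toy_qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}
toy_text_run = {"q1": ["d1", "d2", "d3"], "q2": ["d2", "d1"]}
toy_ocr_run = {"q1": ["d2", "d1", "d4"], "q2": ["d3", "d2"]}

print("text:", evaluate(toy_text_run, toy_qrels))
print("ocr: ", evaluate(toy_ocr_run, toy_qrels))
```

Running the same function once per collection gives the two sets of
figures side by side, which makes any loss in effectiveness on the OCR
collection directly visible.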

For the 2011 pilot task we have generated clean images from the text pages,
but image degradation models could be applied before running the OCR. We
could model the actual application with even higher fidelity by actually
printing and then re-scanning at least a part of the collection. Even
higher fidelity could be achieved by finding a subset of documents that
have actually been printed in the newspaper and scanning them. This could
generate as many as four different versions of the OCR collection. Some
participants in future years might also wish to contribute additional OCR
results. In this case, the participants would be provided with the image
dataset along with the text collection. Adding documents in other Indic
scripts such as Devanagari will also be considered in future years. We may
consider adoption of additional evaluation measures. The specific design of
the task in future years will, of course, be discussed among the potential
participants. We therefore encourage the broadest possible participation in
the 2011 pilot task in order to provide a basis for those discussions.

Important Dates

Corpus and Query Release     Aug 25 2011
Submission of Results Due    Oct 25 2011
Working Note Due             Nov 25 2011

Data

Registered participants will download the corpus from the following URL.
Two collections (i.e., text and OCR) are given in two different
directories. A text document and its corresponding OCRed document have the
same name. The topic set contains 92 topics, which are taken from the FIRE
2008 and 2010 topic sets. Each topic consists of three parts, namely title,
description (desc), and narrative (narr), along with a unique query number.
The title gives the focus of the information need, the description field
states the information need somewhat more clearly, and the narrative field
provides content guidelines for the documents relevant to the topic. Here
is the structure of a sample topic. The participants can build queries
using these parts.
Therefore, in the working note the participants should explain how they
have made their queries. For example, a query can be title only, or a
title-description query (title and description fields combined). Existing
relevance judgments have also been provided.

Task Organizers

Utpal Garain, ISI, Kolkata
Jiaul Paik, ISI, Kolkata
Tamaltaru Pal, ISI, Kolkata
Prasenjit Majumder, DAIICT, Gandhinagar
David Doermann, University of Maryland, College Park, USA
Doug Oard, University of Maryland, College Park, USA

Registration

For registration (or for any queries), please mail Utpal Garain
(utpal@isical.ac.in) or Jiaul Paik (jia.paik@gmail.com), giving the
following details:
1) Name(s) of the participant(s);
2) Affiliation(s); and
3) Contact details (Contact person, Email, and Telephone)

Run Submissions

Participants are expected to report MAP and Precision@10. The reported
results will have the following format:

Number of queries = ???
Retrieved = ???
Relevant = ???
Relevant retrieved = ???
------------------------
Average Precision : ?????
R Precision : ?????
------------------------

============================================================================
========== Announcement: 2nd NIST OpenHaRT Evaluation ======================

Dear All,

We are pleased to announce the second NIST OpenHaRT evaluation, to take
place in the spring of 2012. The evaluation tasks will be similar to those
evaluated in 2010, again focusing on recognition and translation
technologies for document images containing Arabic handwritten script.
Some highlights are:
1) Continuation of a news-focused data domain collected in a controlled
   environment
2) Inclusion of a new data domain collected in an unrestrictive environment
   that captures natural variations found in real world data
3) Line segmentation will be the primary (and only) segmentation condition
   evaluated

The details of the evaluation are described in the evaluation plan and can
be found at the URL:
http://www.nist.gov/itl/iad/mig/hart2012.cfm

Please pass this announcement to others who might find the evaluation of
interest. Contact hart_poc@nist.gov if you have any questions.

Best Regards,
NIST OpenHaRT team

--
Audrey Tong
NIST Multimodal Information Group
100 Bureau Drive, Stop 8940
Gaithersburg, MD 20899 U.S.A.
Tel: 301-975-6091
Fax: 301-670-0939

============================================================================
========== Call for Contributions ==========================================

This newsletter needs your support in order to provide useful information
to the TC11 community. Therefore, please contribute relevant news by
sending a short notice to the newsletter editor, Gernot A. Fink. Such news
could be the obvious announcements of conferences and workshops, job
opportunities, reports on past conferences, book reviews, or anything that
might be of interest to a wider audience involved in the construction of
reading systems.

============================================================================
============================================================================
========== Subscription Information ========================================

This newsletter is sent to subscribers of the IAPR TC11 mailing list. To
manage your subscription, please visit the mailing list homepage at:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=IAPR-TC11

The homepage for IAPR TC11 is http://www.iapr-tc11.org
============================================================================