Research results

Working papers

This paper addresses the challenge of evaluating page segmentation methods when extracting historical job advertisements from digitized newspapers. Accurate segmentation is essential for high-quality Optical Character Recognition (OCR) results, yet the methodology for comparing and evaluating segmentation algorithms has received limited attention in Digital Humanities. The paper presents an evaluation framework developed within the JobAds Project that focuses on textual congruence between predicted and ground-truth regions. Such a framework supports evidence-based selection of a segmentation algorithm and offers insight into the quality of the segmented data, which in turn affects research outcomes. The paper examines three evaluation features: intersection area, text similarity based on Levenshtein distance, and text presence/absence in the non-intersecting parts of the predicted region and its ground truth. Their effectiveness is assessed with logistic regression models. The method relies on manually created ground truth and aims for an automatic metric that quantifies textual congruence. Results show that combining the text presence/absence feature with Hausdorff distance achieves the highest performance, reaching an F1 score of 0.957 on the testing subset. The study emphasizes the need for tailored evaluation metrics in Digital Humanities and highlights the challenges posed by OCR errors and irregular layouts, while underscoring the importance of transparency in research. The proposed evaluation framework offers insights for segmentation assessment in historical newspapers and can be applied beyond the specific dataset and use case.
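Two of the features named in this abstract, text similarity based on Levenshtein distance and the overlap between a predicted and a ground-truth region, can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the rectangular box format, the normalization of the edit distance by the longer string, and the use of intersection over union as the overlap measure are all assumptions made for illustration.

```python
from typing import Tuple

# Assumed region format: axis-aligned box (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def text_similarity(predicted: str, truth: str) -> float:
    """Similarity in [0, 1]; normalizing by the longer string is an assumption."""
    longest = max(len(predicted), len(truth))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(predicted, truth) / longest


def intersection_over_union(a: Box, b: Box) -> float:
    """Overlap of two boxes; IoU is one possible way to use intersection area."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    intersection = max(width, 0.0) * max(height, 0.0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0
```

Feature values of this kind could then serve as inputs to a classifier such as logistic regression, which is how the abstract describes assessing their effectiveness.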

In the last 200 years, the division of labor has increased drastically. Different skills and knowledge need to be combined for production. How is dispersed knowledge brought to the place where it creates particularly large value? To assess this matching, we study labor markets as the device that facilitates such processes in a decentralized manner. We start our investigation in the middle of the 19th century, which marked the beginning of the 'modern' labor market, and follow the market for 100 years. We use job ads in newspapers as our major data source. The analysis is framed in terms of the emergence, development and functioning of markets as a means of facilitating matching. The labor market was 'created' by the initiative of many actors, including some public actors at a later time. The market changed over time without losing robustness and functionality. The changes increased the stability of matches and followed social preferences.

Working paper

Published Papers

Historical job advertisements provide invaluable insights into the evolution of labor markets and societal dynamics. However, extracting structured information, such as job titles, from these OCRed and unstructured texts presents significant challenges. This study evaluates four distinct computational approaches for job title extraction: a dictionary-based method, a rule-based approach leveraging linguistic patterns, a Named Entity Recognition (NER) model fine-tuned on historical data, and a text generation model designed to rewrite advertisements into structured lists. Our analysis spans multiple versions of the ANNO dataset, including raw OCR, automatically post-corrected, and human-corrected text, as well as an external dataset of German historical job advertisements. Results demonstrate that the NER approach consistently outperforms other methods, showcasing robustness to OCR errors and variability in text quality. The text generation approach performs well on high-quality data but exhibits greater sensitivity to OCR-induced noise. While the rule-based method is less effective overall, it performs relatively well for ambiguous entities. The dictionary-based approach, though limited in precision, remains stable across datasets. This study highlights the impact of text quality on extraction performance and underscores the need for adaptable, generalizable methods. Future work should focus on integrating hybrid approaches, expanding annotated datasets, and improving OCR correction techniques to enhance the extraction of structured information from historical texts. These advancements will enable deeper exploration of labor market trends and contribute to the broader field of digital humanities.
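To make the simplest of the four approaches concrete, the following is a minimal sketch of dictionary-based job title extraction. It is an illustration only, not the method evaluated in the paper: the lexicon entries, the function name `extract_titles`, and the tokenization by word characters with case-insensitive exact matching are hypothetical choices.

```python
import re
from typing import Iterable, List

# Hypothetical mini-lexicon; a real one would hold many historical job titles.
JOB_TITLE_LEXICON = {"kellner", "köchin", "buchhalter", "lehrling"}


def extract_titles(text: str, lexicon: Iterable[str] = JOB_TITLE_LEXICON) -> List[str]:
    """Return tokens of `text` that appear in the lexicon, in reading order."""
    known = {entry.casefold() for entry in lexicon}
    tokens = re.findall(r"\w+", text)  # \w matches Unicode letters such as ö
    return [token for token in tokens if token.casefold() in known]
```

For example, `extract_titles("Ein tüchtiger Kellner und eine Köchin werden gesucht.")` returns `["Kellner", "Köchin"]`. Exact matching of this kind misses any title that OCR has distorted (for instance `Kellncr`) and cannot use context to rule out lexicon words used in another sense; these are general properties of the sketch, not results reported by the study.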

Paper: jdmdh.episciences.org/15373

Conference participation

DHd2025 Conference 2025, Bielefeld - Conference Paper
DHd2025 Conference 2025, Bielefeld - Conference Poster
CHR2024 Conference 2024, Aarhus - Conference Paper
CHR2024 Conference 2024, Aarhus - Conference Poster
NLP4DH Conference 2024, Miami - Conference Paper
Zeitgeschichtetag 2024 - Panel Accessing the “Invisible” Histories: Digital Data and the New Historical Perspectives in Historical Research
DHd Conference 2024, Passau - Conference Poster
DHd Conference 2024, Passau - Conference Paper
