Assistance for Historians | A Chatbot to Aid in Researching Historical Newspapers #25

Thukyd · 2024-01-29T21:57:51Z

I'm not sure yet whether I will go ahead with this project, but I've had an idea for an application for a while. I want to build a chatbot that helps researchers query vast amounts of "difficult-to-process" sources.

Context

More specifically, I am experimenting with historical newspapers such as the "Jüdische Zeitung", a pre-war Jewish newspaper from Vienna. The documents are printed in the "Fraktur" typeface (screenshot). They are a rich resource for historians but are really hard to work with, because Fraktur is difficult for modern German speakers to read.

Goals

Accessibility: Studying newspapers in an antiquated typeface takes a lot of mental effort and time. Read-throughs take up a large chunk of a historian's work. A tool that supports this would make a huge difference.
Search by topics: Sources such as the "Jüdische Zeitung" were published weekly. As historians read these sources with a specific question in mind, much of the information won't be relevant. The task is to find the articles that are.
Uncover hidden gems: Newspapers were subject to censorship in wartime. Editors had to find more subtle ways to report on sensitive issues. Historians are aware of these "codes", but in the original sources they are easily overlooked. An LLM-based RAG chatbot would be very helpful for finding them.

Potential Challenges

(Optical) Character Recognition: I am not sure how to deal with the "Fraktur" typeface yet. Documents are uploaded, but I don't know how to check whether the text was detected fully. Are there ways to evaluate and improve the text inputs?
Checking for Quality and Utilisation: When I query for articles about the region "Galicia", I expect the app to show me the most relevant articles first but still return all articles that fit the theme. What methods are there to achieve this?
Preference for sources: Alongside the original sources, I use literature/excerpts about the topic. When I query, I always prefer the original sources, though. The literature should only provide additional context. How can I set up such preferences?
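
To make the last two questions more concrete, here is a rough sketch of the behaviour I have in mind, not a working solution. It assumes some retriever has already given each document a relevance score between 0 and 1, and that each document is tagged as either an original source or literature. The names `Hit`, `rank_hits`, `min_relevance`, and `primary_boost`, the tag values, and the document IDs and scores are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    source_type: str  # hypothetical tags: "primary" or "literature"
    relevance: float  # score from any retriever, higher means more relevant


def rank_hits(hits, min_relevance=0.2, primary_boost=0.15):
    """Keep every hit above a threshold and sort so that the most
    relevant come first, with original sources nudged upwards."""
    # Keep everything that fits the theme at all, not just a top-k.
    kept = [h for h in hits if h.relevance >= min_relevance]

    def sort_key(hit):
        # Assumption: a fixed bonus expresses "prefer original sources".
        boost = primary_boost if hit.source_type == "primary" else 0.0
        return hit.relevance + boost

    return sorted(kept, key=sort_key, reverse=True)


# Made-up example data, only to show the intended ordering.
hits = [
    Hit("newspaper-issue-a", "primary", 0.71),
    Hit("literature-excerpt", "literature", 0.80),
    Hit("newspaper-issue-b", "primary", 0.35),
    Hit("newspaper-issue-c", "primary", 0.05),
]

ranked = rank_hits(hits)
for hit in ranked:
    print(hit.doc_id, hit.source_type, hit.relevance)
```

The threshold keeps all articles that match the theme instead of cutting off after a fixed number, and the bonus lets a slightly less relevant original source outrank literature without hiding the literature entirely. Both numbers are guesses and would need tuning against real queries.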

If you have any ideas or potential solutions for tackling these challenges, comment below! It would be much appreciated!

Replies: 0 comments

Select a reply

Assistance for Historians | A Chatbot to Aid in Researching Historical Newspapers #25

Thukyd Jan 29, 2024

Context

Goals

Potential Challenges

Replies: 0 comments

Thukyd
Jan 29, 2024