Automatic Dating of Hebrew Manuscripts from the Cairo Genizah
Israel Ministry of Science & Technology research grant, Digital Humanities (320,000 NIS for 3 years)
Dr. Daria Vasyutinsky Shapira, The Open University of Israel (postdoc)
Prof. Ophir Münz-Manor, The Open University of Israel
This project aims to develop deep learning algorithms for the automatic dating of Hebrew liturgical poems (piyyutim) preserved in genizot, the synagogue storerooms where out-of-use manuscripts were kept. We study manuscripts from the Cairo Genizah, using datasets our group built in previous studies. Many libraries and archives are now digitizing significant manuscript collections, and thousands of manuscript images are already available. One of the primary desiderata of digital historical and cultural research is therefore to find efficient new methods for studying these collections, and determining the date of copying of unidentified digitized manuscripts is among the most sought-after of these techniques.
Deep learning for image processing is a cutting-edge technology in digital manuscript research. This project is at the forefront of digital humanities, as we explore problems that have not yet been solved anywhere in the world. We are exploring the potential of semi-supervised and unsupervised algorithms on hard- and soft-labeled datasets.
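To illustrate the difference between hard and soft labels, here is a minimal sketch in plain Python. It is not our training code: the three date bins, the logits, and the function names `softmax` and `cross_entropy` are hypothetical and chosen only for the example. A hard label places a manuscript in exactly one date bin, while a soft label spreads the probability over neighbouring bins when the dating is uncertain. The same cross-entropy loss handles both cases.

```python
import math

def softmax(logits):
    """Turn raw model scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Cross-entropy between model scores and a target distribution.

    `target` may be one-hot (hard label) or any probability
    distribution over the classes (soft label).
    """
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

# Hypothetical example with three date bins (say, three consecutive centuries).
logits = [0.5, 2.0, 0.1]       # illustrative model output
hard_label = [0.0, 1.0, 0.0]   # manuscript firmly dated to the middle bin
soft_label = [0.2, 0.6, 0.2]   # dating uncertain, mass on neighbouring bins

print("hard-label loss:", cross_entropy(logits, hard_label))
print("soft-label loss:", cross_entropy(logits, soft_label))
```

With a soft target, the model is also penalized for assigning very low probability to the neighbouring bins, which is one way to encode an approximate date rather than a certain one.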
Our research is supervised by Prof. Ophir Münz-Manor in collaboration with the DHSS Hub at the Open University, the Visual Media Lab at Ben-Gurion University, and the National Library of Israel.
Preliminary results:
Malachi Beit-Arié and his team described the dated Hebrew manuscripts in the Sfardata database, which is kept at the National Library of Israel. The Visual Media Lab team and I extracted the Sfardata records for research purposes and are currently training deep learning models on this new dataset. We will then add the few existing dated manuscripts from the Cairo Genizah to the training sets. Once the models have been successfully pretrained on these datasets, we will apply them to the much larger body of poetic manuscripts from the Genizah.
The accuracy of different models during supervised training:¹
¹ Ahmad Droby, Irina Rabaev, Daria Vasyutinsky Shapira, Berat Kurar Barakat, and Jihad El-Sana. “Digital Hebrew Paleography: Script Types and Modes.” Journal of Imaging 2022, 8(5), 143; doi:10.3390/jimaging8050143
Siamese network for the unsupervised classification:
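For readers unfamiliar with the idea, a Siamese network passes two inputs through the same encoder and compares the resulting embeddings. The sketch below is only an illustration under stated assumptions, not our model: the one-layer `embed` function stands in for a deep image encoder, and the contrastive loss and its `margin` are a common choice that we assume here for the example. How pairs are marked as similar (for instance, two patches cut from the same page) is likewise an assumption.

```python
import math

def embed(x, weights):
    """Toy shared encoder: a single linear layer.

    Both inputs of a pair go through this same function with the
    same weights, which is what makes the network 'Siamese'.
    """
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def contrastive_loss(d, same, margin=1.0):
    """Contrastive loss on one pair at embedding distance d.

    Similar pairs are pulled together (loss grows with distance);
    dissimilar pairs are pushed apart until they are at least
    `margin` away from each other.
    """
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Hypothetical 2-dimensional inputs and identity weights, for illustration only.
weights = [[1.0, 0.0], [0.0, 1.0]]
patch_a = [0.2, 0.4]
patch_b = [0.5, 0.8]

d = distance(embed(patch_a, weights), embed(patch_b, weights))
print("distance:", d)
print("loss if treated as similar:", contrastive_loss(d, same=True))
print("loss if treated as dissimilar:", contrastive_loss(d, same=False))
```

After training, manuscripts whose embeddings lie close together can be grouped without date labels, for example by clustering the embeddings.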