Fall Training Series: Text & Data Mining

What You'll Want to Know

This comprehensive eight-session course equips students with the essential skills and knowledge needed to undertake text and data mining tasks. Students will be introduced to the key concepts and tools of text and data mining, including data types, data structures, data pre-processing, text processing, data mining techniques, text mining techniques, and advanced topics in both data and text mining. Each session includes a Python component that discusses the role of Python and its libraries in handling various aspects of text and data mining. Students are not expected to know Python; rather, they will be shown how Python can solve key problems so that they are aware of its capabilities. By the end of the course, participants will have a solid understanding of text and data mining concepts, be proficient in using Python for text and data mining tasks, and be able to apply these skills to real-world library applications and case studies.

Learning Outcomes:

1. Understanding of data, data structures, and complex data types
2. Understanding of the main types of machine learning and their applications
3. Understanding of the key Python libraries for text and data mining
4. Understanding of the primary methods for performing text and data mining

More details will be added to the NISO event page in the coming weeks. Register now!

About Our Instructor

William Mattingly, PhD, is a Postdoctoral Fellow in the Smithsonian Institution's Data Science Lab. He has worked with data scientists, archivists, medical professionals, social scientists, geographers, and historians. He specializes in applying machine learning and natural language processing to archival and historical documents. He is the author of Introduction to Python for Digital Humanists (2023), content creator for Python Tutorials for Digital Humanities, and lead developer for the Bitter Aloe Project.

Dates and Time

The training series consists of eight sessions, held on Thursdays from October 12, 2023, through December 7, 2023.

NOTE: Due to the Thanksgiving holiday, there will be no training session on Thursday, November 23, 2023.

Each session will be 90 minutes long, from 11:00 a.m. to 12:30 p.m. Eastern Time.