Uncovering the Past through Digital Research
Text mining and machine learning helped uncover Jim Crow laws in archived North Carolina General Statutes
When a high school teacher asked UNC librarian Sarah Carrier for a comprehensive list of North Carolina’s Jim Crow laws in 2017, Carrier did not have a complete resource. Manually searching through decades of volumes of the North Carolina General Statutes would have been a near-impossible task.
Carrier worked with her library colleagues in Digital Research Services to find an automated solution. Amanda Henley, head of Digital Research Services, engaged more than 30 people in a multi-year project using text mining and machine learning to identify racist language in legal documents. To date, they’ve discovered nearly 2,000 Jim Crow laws in North Carolina.
In August 2020, the group released the project, called “On the Books: Jim Crow and Algorithms of Resistance,” to the public. Users can search and download text files of all the North Carolina statutes from 1866 to 1967.
When the Mellon Foundation heard about On the Books, it contacted Henley and has since provided additional funding for her team to expand the project. Over the next two years, Henley’s team will identify Jim Crow laws in two additional states and will help research and teaching fellows learn how to use these data in their own projects and schools.
“I think the collaborative nature of this project is one of the reasons why the University Libraries is a good home for it,” said Henley. “Because of where we sit on campus, we know what other people are doing and who has different areas of expertise. We also have a broad range of expertise within the libraries. That’s what allowed us to be so successful.”
Read more about the impact of the University’s Libraries across campus…