News
150,000 word mark passed
September 22, 2020
Students of my introductory course "From Text to Linguistic Evidence," autumn semester 2022, have managed to add more than 10 new transcripts to the corpus. Their hard work has increased its total word count to 152,304 tokens, or more than 12 hours of speech - a very substantial increase. Among their additions are commentaries on the Ukraine War, an interview with Harper Lee, author of "To Kill a Mockingbird," and a talk by climate scientist Roger Revelle.
I would like to thank every student for their contribution, commitment and professionalism. I wish everybody all the best for the future!
Second UG Scholars project over
June 30, 2022
This year's undergraduate scholars Sadie Barlow and Xiaohui (Iris) Chen have added another 10,000 words in four new transcripts to the spoken corpus. Thank you!
They then investigated variation between 'very' and 'really' in the corpus. Below, you can find a video of their competent and insightful final presentation. Have fun with it :)
The Spoken Corpus is one today!
September 1, 2021
UG Scholars project concludes successfully
June 23, 2021
Three students took part in this year's Undergraduate Scholars program: Wenjing Shi, Brandon Cox and Jamie-Lea Carter. They have made fantastically accurate transcripts and, in so doing, added many thousands of words to the corpus. Many thanks for their hard work and dedication!
You can watch the final presentation of their research outcomes below. Enjoy!
UG Scholars to enlarge the corpus
February 11, 2021
The UG Scholars program gives students an opportunity to engage in actual research. Three amazing students have volunteered for this program to help increase the size of the Student-Transcribed Corpus of Spoken American English. They will learn how to make transcripts for the corpus, cut up audio clips, annotate the files in XML and conduct research on spoken language. Their contribution will be greatly appreciated!

The Undergraduate Scholars Program is an opportunity for students to engage in real research and gain extra-curricular experience for their CV
Thank you to all students!
December 17, 2020
'Tis done! The first cohort of students working on the corpus have finished their semester. They jointly added more than 60,000 new word tokens, quite an amazing feat. I'm very impressed by their dedication and attention to detail. Thanks to everybody who contributed to this project. Well done, guys!
Divide and conquer!
October 14, 2020
Get this - the results table is now sortable! I shamelessly stole some code from the internets to implement this functionality. Furthermore, the user can now choose how many hits are displayed at a time. The programming is a bit sloppy perhaps; every time you click on "Next hits", the whole search is run again from scratch. This could become an issue down the line if and when the corpus grows larger, but for now, sorting and cutting up the results seems to be working fine ...
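For the curious, here is a minimal sketch of one way the repeated searching could be avoided: run the search once, keep the hits, and do the sorting and the "Next hits" paging on the stored list. This is not the site's actual code. The names `Hit`, `CachedSearch` and `searches_run`, the three-word context window and the toy corpus are all made up for illustration, and Python is used only because it is easy to read.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hit:
    """One concordance line (hypothetical structure)."""
    transcript: str  # name of the transcript the hit comes from
    left: str        # up to three tokens of left context
    keyword: str     # the matched token as it appears in the text
    right: str       # up to three tokens of right context


class CachedSearch:
    """Search the corpus once per query, then page and sort from memory."""

    def __init__(self, corpus: dict[str, str]) -> None:
        self.corpus = corpus
        self._cache: dict[str, list[Hit]] = {}
        self.searches_run = 0  # counts full scans of the corpus

    def _search(self, query: str) -> list[Hit]:
        """Scan every transcript for the (already lowercased) query."""
        self.searches_run += 1
        hits = []
        for name, text in self.corpus.items():
            tokens = text.split()
            for i, token in enumerate(tokens):
                if token.lower().strip(".,!?") == query:
                    left = " ".join(tokens[max(0, i - 3):i])
                    right = " ".join(tokens[i + 1:i + 4])
                    hits.append(Hit(name, left, token, right))
        return hits

    def hits(self, query: str) -> list[Hit]:
        """Return all hits, scanning the corpus only on the first request."""
        query = query.lower()
        if query not in self._cache:
            self._cache[query] = self._search(query)
        return self._cache[query]

    def page(
        self,
        query: str,
        page: int = 0,
        per_page: int = 10,
        sort_key: Optional[Callable[[Hit], str]] = None,
    ) -> list[Hit]:
        """Return one page of hits, optionally sorted, without re-searching."""
        hits = self.hits(query)
        if sort_key is not None:
            hits = sorted(hits, key=sort_key)  # sorts a copy of the cached list
        start = page * per_page
        return hits[start:start + per_page]


# Toy corpus: transcript name -> whitespace-tokenized text (invented data).
corpus = {
    "t1": "It was really good and very long .",
    "t2": "That is very nice , very nice indeed .",
}
search = CachedSearch(corpus)
first_page = search.page("very", page=0, per_page=2)
next_page = search.page("very", page=1, per_page=2)  # served from the cache
by_right_context = search.page("very", sort_key=lambda h: h.right.lower())
```

Here, clicking "Next hits" corresponds to calling `page` with a higher page number, and sorting the table to passing a different `sort_key`. Neither touches the corpus again. The trade-off is that the stored hits take up memory and have to be thrown away whenever new transcripts are added.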
A logo for the Spoken Corpus
September 29, 2020
The corpus now has a logo - a speech bubble, representing its spoken mode, filled with the Star-Spangled Banner, because the corpus consists of speech from native speakers of American English. The logo is also used as a favicon next to the page's title in the browser tabs.

The new logo of the Student-Transcribed Corpus of Spoken American English
The first 10,000 words
September 22, 2020
The Student-Transcribed Corpus of Spoken American English has reached its first milestone - it now includes 10,000 words of transcribed speech (including disfluencies and punctuation marks). Let's hope it will keep growing at a brisk and steady pace.
Hello World!
September 1, 2020
Happy birthday, Student-Transcribed Corpus of Spoken American English! Your domain has been bought, your website has been set up, a search interface has been programmed. You're off to a good start :-)
Who knows, really, what this project might someday be used for? Hopefully it will fulfill some sort of purpose.
“The two most important days in your life are the day you are born and the day you find out why.” - Mark Twain
So, here is to hoping that you will grow nicely and become big, that lots of people around the world find you interesting and useful, for research, for teaching, or just for fun, and that the internet gods will always smile on you and spare you from bugs and crashes. Cheers :)