Concordance to TLK
by Thomas Sweet


Character Analysis:

Simba
Nala
Mufasa
Scar
Zazu
Timon
Pumbaa
Rafiki
Shenzi
Banzai
Ed
Sarabi
Gopher
Sarafina
Male Singer
Female Singer
Chorus

Entire Cast

The following is the explanatory material Thomas Sweet wrote to accompany the package of input, output, and program files for his Concordance project.


I took the initiative to create a concordance for The Lion King. After seeing the movie, I loved everything about it, and wanted to do something special with my computer programming talents.

This file describes the process of creating the concordance for that movie.

First, I created a program called ONELINER.BAS, which takes every word in the script and puts it into a "one word to a line" format. It takes the file SCRIPT.TXT and sends the output to the file SCRIPT.DAT.
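The BASIC source of ONELINER.BAS is not reproduced here. As a rough illustration only, the "one word to a line" step could be sketched in Python as follows. The function name and the sample text are made up for this example, and it assumes that words are separated by whitespace.

```python
def one_word_per_line(text):
    """Put every whitespace-separated word of the text on its own line."""
    # str.split() with no arguments splits on any run of spaces, tabs, or newlines.
    return "\n".join(text.split())


# Made-up sample input, not a quotation from SCRIPT.TXT.
sample = "Simba: Hello   there\n[waves] friend"
print(one_word_per_line(sample))
```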

Second, I created a program called COMMENTS.BAS, which removes all the comments from the script, so that what is left is just the words of the characters. It searches for brackets [] and braces {} and removes everything inside. It takes the file SCRIPT.DAT and sends the output to the file SCRIPT.OUT.
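A minimal Python sketch of this comment-stripping idea is shown below; it is not the original COMMENTS.BAS. It assumes that comments are never nested, that every opening bracket or brace has a matching close, and that the delimiters themselves are dropped along with their contents. The function name is hypothetical.

```python
def strip_comments(text):
    """Drop everything between [ ] or { }, then re-split to one word per line."""
    closers = {"[": "]", "{": "}"}
    kept = []
    waiting_for = None  # the closing character we are skipping towards, if any
    for ch in text:
        if waiting_for is not None:
            # Inside a comment: discard characters until the matching close.
            if ch == waiting_for:
                waiting_for = None
        elif ch in closers:
            waiting_for = closers[ch]
        else:
            kept.append(ch)
    # A comment may span several one-word lines, so rebuild the line layout.
    return "\n".join("".join(kept).split())


# Made-up sample in the one-word-per-line layout.
print(strip_comments("Simba:\nHello\n[waves\na\npaw]\nthere"))
```

Scanning a character at a time means a comment that spans several lines is removed in one pass, without any special handling of the line breaks.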

Third, I created a program called VOCAB.BAS, which creates the individual vocabulary files for each character. It searches word by word (with each word on a line of its own), and checks the rightmost character of each word. If it happens to be a colon (:), then it reads the word to the left of the colon to get the name. When the desired name is found, it copies every word to that character's vocabulary file until it gets to the next colon. It takes the file SCRIPT.OUT and sends the output to the file <character name>.TXT.
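The speaker-splitting logic can be sketched in Python as below. This is an illustration under stated assumptions, not the original VOCAB.BAS: it collects the words of all speakers in one pass instead of looking for one desired name, it treats the single word in front of the colon as the name (so a two-word name such as "Male Singer" would need extra handling), and it returns lists instead of writing <character name>.TXT files. The function name is hypothetical.

```python
from collections import defaultdict


def split_by_speaker(words):
    """Group words under the most recent word that ended with a colon."""
    vocab = defaultdict(list)
    speaker = None
    for word in words:
        if word.endswith(":"):
            # The rightmost character is a colon: the rest of the word is the name.
            speaker = word[:-1]
        elif speaker is not None:
            vocab[speaker].append(word)
    return dict(vocab)


# Made-up sample, one word per list item.
sample = ["Simba:", "Hello", "Nala:", "Hi", "there", "Simba:", "Bye"]
print(split_by_speaker(sample))
```

One caveat of the colon rule: any spoken word that happens to end in a colon would be mistaken for a speaker name.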

Fourth, I created a program called REDUNDNT.BAS to remove all redundant and non-alphanumeric characters, except the apostrophe. It scans a character at a time and prints only the necessary characters to the output file. It also converts every letter to lowercase. It takes the file <character name>.TXT and outputs to <character name>.OUT.
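As a hedged sketch of this clean-up step (not the original REDUNDNT.BAS), the character-by-character filter might look like this in Python. The function names are hypothetical, and dropping words that end up empty is an assumption of this example.

```python
def clean_word(word):
    """Lowercase a word and keep only letters, digits, and apostrophes."""
    return "".join(ch for ch in word.lower() if ch.isalnum() or ch == "'")


def clean_words(words):
    """Clean every word and drop any that become empty (assumed behavior)."""
    cleaned = (clean_word(word) for word in words)
    return [word for word in cleaned if word]


# Made-up sample words.
print(clean_words(["Don't!", "Hakuna,", "--", "Matata."]))
```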

Fifth, I created a program called ALPHABET.BAS to alphabetize the new files. This makes it easier to find repeated words, since copies of the same word end up next to each other. This procedure takes perhaps the longest. It takes the file <character name>.OUT and outputs to <character name>.DOC.

Sixth, I created a program called CONCORD.BAS to count the number of occurrences of each word and print the output. It calculates the total number of words spoken and the vocabulary (the number of different words, with no repeats), and also prints out each word, in alphabetical order, with the number of times it was spoken beside it. This is the final program, which creates the final output. It takes the file <character name>.DOC and outputs to <character name>.
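Because the input is already alphabetized, counting reduces to measuring runs of identical neighbouring words. The Python sketch below illustrates that idea; it is not the original CONCORD.BAS, the function name is hypothetical, and the sample words are made up.

```python
from itertools import groupby


def concordance(sorted_words):
    """Return (total words, vocabulary size, [(word, count), ...])."""
    # groupby yields one group per run of equal adjacent items, which is
    # why the word list has to be sorted before it is passed in.
    counts = [(word, len(list(run))) for word, run in groupby(sorted_words)]
    total = sum(count for _, count in counts)
    return total, len(counts), counts


words = sorted(["the", "king", "the", "a", "king", "the"])
total, vocabulary, counts = concordance(words)
print("Total words:", total)
print("Vocabulary:", vocabulary)
for word, count in counts:
    print(word, count)
```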

The last four programs have the option of processing a single character file, or doing all characters in CHARFILE.DAT, or even creating a file called EVERYONE that has the entire script.

If anyone has any questions, feel free to visit my website:

http://www.geocities.com/Athens/Olympus/4449

Or email me at: poetic_physicist@hotmail.com

- Thomas Sweet (Warlock Gold)


Back to the Other Goodies page
The Lion King WWW Archive