Development Notes
Development Platform
The project has been written in C/C++, with as much of the work as feasible done in C++. Glade has been used as the GUI (Graphical User Interface) editor. It has allowed me to develop the screens quickly as required and to spend more time on more interesting things.
Well, I have finally got there with my initial file release. Further progress should no longer be as formidable as it was, as I feel that I have passed the steepest part of my learning curve with some of the toolkits involved. XDB tended to produce very bloated files and required so much machinery to manage that I have postponed any further messing around in that corner. The translations have been organised into CSV (comma-separated values) files that can be read with any text editor. The same goes for the IDX files, which simply hold the byte offsets at which the different books of the Bible start.
Much of the current implementation may seem amateurish. However, I decided that it was best to get something released rather than take forever and a day to come up with something that people could download. It isn't perfect but it's there, well, partially. This way I can get feedback sooner rather than later. All those people who have some great ideas for truth2000, let me know. And if you wish to get involved in some other way I would welcome the help. :)
Festival
The speech is generated using the Festival software. Festival software is open source and an ideal environment for tailoring speech requirements. Currently I am running Festival as a server process. This means Festival has to be run before Truth2000. Truth2000 connects to the Festival Server using TCP port 1314 and sends the speech commands using this connection.
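A minimal sketch of such a client, assuming POSIX sockets and a Festival server already listening on the local machine, might look like the following. The function names are hypothetical, and the use of Festival's `SayText` Scheme command is an assumption about what gets sent; the actual Truth2000 code may issue different commands. The sketch also ignores whatever the server writes back.

```cpp
#include <arpa/inet.h>
#include <cstring>
#include <netinet/in.h>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Wrap text in a SayText Scheme command, escaping the characters that
// would otherwise end the string literal early.
std::string make_say_text(const std::string& text) {
    std::string cmd = "(SayText \"";
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '"' || text[i] == '\\') cmd += '\\';
        cmd += text[i];
    }
    cmd += "\")\n";
    return cmd;
}

// Connect to a Festival server on the local machine. Returns a socket
// descriptor, or -1 on failure (for example when Festival is not running).
int connect_to_festival(unsigned short port = 1314) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Send one command; returns true only if every byte was written.
bool send_command(int fd, const std::string& cmd) {
    std::size_t sent = 0;
    while (sent < cmd.size()) {
        ssize_t n = send(fd, cmd.data() + sent, cmd.size() - sent, 0);
        if (n <= 0) return false;
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```

A caller would check the result of `connect_to_festival()` first, since a return value of -1 is the usual sign that the server has not been started.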
Eventually I would like to link in the Festival library as a static library, thus removing the requirement for a TCP connection. However, there are a number of issues to be resolved.
Although there is no apparent limit to the amount of text that I can send to Festival to be rendered into speech there are practical considerations. The more text that is sent to Festival the longer the delay before Festival utters its first word. Send too much text and Festival takes several minutes before it starts. I suspect that Festival converts the entire text into a speech waveform before audio output commences. The solution that I have adopted is to only send one chapter to Festival at a time and then wait for Festival to finish speaking before sending subsequent chapters.
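The chapter-at-a-time approach can be sketched as a simple loop. This is only an illustration: `speak_and_wait` is a hypothetical callback standing in for whatever sends a chapter to Festival and returns once the chapter has been spoken.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback: sends one chapter to the speech engine and returns
// only when that chapter has been spoken (true on success, false on failure).
typedef std::function<bool(const std::string&)> SpeakAndWait;

// Send chapters one at a time so the engine never has more than a single
// chapter to render before the first word is heard.
// Returns the number of chapters spoken successfully.
std::size_t speak_chapters(const std::vector<std::string>& chapters,
                           const SpeakAndWait& speak_and_wait) {
    std::size_t spoken = 0;
    for (std::size_t i = 0; i < chapters.size(); ++i) {
        if (!speak_and_wait(chapters[i])) break;  // stop at the first failure
        ++spoken;
    }
    return spoken;
}
```

Because each call blocks until the chapter is finished, the start-up delay is bounded by the length of one chapter rather than the whole selection.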
How do I conveniently stop Festival from speaking? There is a command that I can send to stop it. The problem is that this conflicts with the need to determine when Festival has finished speaking. If I set Festival to async mode I can stop it speaking, but I can't determine whether it has finished. If I set Festival to sync mode, the stop command does not stop the speech.
If I link in the Festival library directly then I have an issue with blocking I/O. Because this program relies on the GTK event queue for responsive buttons and mouse clicks I cannot allow the program to block. This issue needs to be resolved in some fashion.
Portability
Why at this stage have I decided to use Glade and GTK for the graphical user interface? Why not Qt? This has been partly motivated by my desire to have this system eventually go multi-platform, with a view to not limiting my options with respect to porting the code to Win32. Another part of the motivation is that GTK is the underlying framework for wxWindows, a set of APIs for Win32/Linux cross-compatibility. Again, it is a matter of not cutting off my options. At this stage I am targeting the Linux platform, but as far as I am aware the code should also compile under Cygwin. I have seen instructions (somewhere) under the Festival project saying that their system will also compile under Cygwin.
File Formats
The file formats are currently CSV but this could change in the future. It was necessary to make a decision as to whether to try and achieve a perfect design first off or get a workable downloadable version available for release.
File formats for The Talking Bible
This is subject to change. Here is the current method for storing the following scripture in a CSV file. The scripture is taken from James 1:1 from the NASB translation.
James, a bond-servant of God and of the Lord Jesus Christ, To the twelve tribes who are dispersed abroad: Greetings
nas.csv

| Word # | Book | Chapter | Verse | Position # | Actual Word |
|---|---|---|---|---|---|
| 1 | James | 1 | 1 | 1 | James, |
| 2 | James | 1 | 1 | 2 | a |
| 3 | James | 1 | 1 | 3 | bond-servant |
| 4 | James | 1 | 1 | 4 | of |
| 5 | James | 1 | 1 | 5 | God. |

Notice that the name of the translation, nas, is embedded in the name of the table. After due consideration I judged it prudent to separate the different translations into different files. This enables them to have different distribution policies to satisfy the different requirements of their respective bible societies.
The Word # is unique for the entire table, but the Position # will start again at 1 at the start of the next verse. The reason for laying out each scripture this way, instead of storing each verse as a sentence, is to facilitate the building of a concordance table. This will enable fast searching of different scriptures.
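To make the row layout concrete, here is a minimal sketch of reading one such row into a struct. It rests on an assumption the notes do not confirm: that fields are separated by plain commas with no quoting, and that the last field runs to the end of the line, so a word such as "James," keeps its trailing comma. The names `WordRow` and `parse_word_row` are hypothetical.

```cpp
#include <cstddef>
#include <exception>
#include <string>

struct WordRow {
    long word_no;       // unique across the whole table
    std::string book;
    int chapter;
    int verse;
    int position;       // restarts at 1 in every verse
    std::string word;
};

// Parse one row; returns false (leaving 'out' untouched) if the row has
// too few fields or a numeric field does not hold a number.
bool parse_word_row(const std::string& line, WordRow& out) {
    std::string fields[5];
    std::size_t start = 0;
    for (int i = 0; i < 5; ++i) {
        std::size_t comma = line.find(',', start);
        if (comma == std::string::npos) return false;
        fields[i] = line.substr(start, comma - start);
        start = comma + 1;
    }
    WordRow row;
    try {
        row.word_no = std::stol(fields[0]);
        row.book = fields[1];
        row.chapter = std::stoi(fields[2]);
        row.verse = std::stoi(fields[3]);
        row.position = std::stoi(fields[4]);
    } catch (const std::exception&) {
        return false;
    }
    row.word = line.substr(start);  // remainder of the line, commas included
    out = row;
    return true;
}
```

With one word per row, building a concordance reduces to reading rows in order and recording each word against its book, chapter, verse and position.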
Concordance
The concordance has been stored in gdbm files. I think this will suffice for the moment until a better way becomes known. The problem was again to prevent the files from becoming too large. 100 MB is too large!! So I have resorted to gdbm files rather than XDB for the time being. GDBM functions much like an associative array which is stored on disk. If you are familiar with Perl you will be familiar with associative arrays. In Python I believe they exist under a different name, "dictionaries". The index of the array is a word (instead of a number) that appears in the scriptures. The contents of each array element are a series of scripture references joined together by a separating colon ':'. An example follows:
nas_concordance.db

| Index | Contents |
|---|---|
| holiday | est,2,18,21:est,8,17,32:est,9,19,23:est,9,22,31 |
| malchiel | gen,46,17,23:num,26,45,14:1ch,7,31,8 |

So as an array this means that nas_concordance['malchiel'] = "gen,46,17,23:num,26,45,14:1ch,7,31,8". See the Linux man page on gdbm for more details.
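Once the contents string for a word has been fetched from the gdbm file, it can be split back into individual references. The sketch below shows one way to do that. The names `Reference` and `parse_concordance_entry` are hypothetical, and reading the four fields as book, chapter, verse and word position is an assumption based on the examples above.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <vector>

struct Reference {
    std::string book;
    int chapter;
    int verse;
    int position;  // assumed meaning of the fourth field
};

// Split "gen,46,17,23:num,26,45,14" into one Reference per colon-separated
// entry. Malformed entries are skipped rather than reported.
std::vector<Reference> parse_concordance_entry(const std::string& contents) {
    std::vector<Reference> refs;
    std::istringstream entries(contents);
    std::string entry;
    while (std::getline(entries, entry, ':')) {
        std::istringstream fields(entry);
        Reference ref;
        std::string chapter, verse, position;
        if (!std::getline(fields, ref.book, ',') ||
            !std::getline(fields, chapter, ',') ||
            !std::getline(fields, verse, ',') ||
            !std::getline(fields, position)) {
            continue;  // too few fields
        }
        try {
            ref.chapter = std::stoi(chapter);
            ref.verse = std::stoi(verse);
            ref.position = std::stoi(position);
        } catch (const std::exception&) {
            continue;  // a numeric field did not hold a number
        }
        refs.push_back(ref);
    }
    return refs;
}
```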
Structure of the source code
All of the Glade stuff and the Glade-generated GTK source code are stored in the gui/ and gui/src/ subdirectories respectively. The only file worth looking at here is callbacks.c, which contains all the event code. The procedures and functions in here are automatically invoked as a result of events such as pressing a button or leaving a text box.
Everything else is in the support/ subdirectory. The important files in here are:
TScripture.cpp contains the C++ class for looking up scriptures. In the future it should handle all my concordance requirements as well, and eventually lexicons, chain references, etc. This hasn't been finalized yet but the intention is to have the interface consistent irrespective of the file format that is eventually settled upon.
TVerse.cpp is a support file for TScripture.cpp which really should be incorporated into TScripture.cpp. Expect this file to disappear in the future.
The file TSpeech.cpp is the C++ class I want to use for all my speech interfaces. Again this is to keep the program structure independent of any speech implementation.
The file t2speech.cpp controls the scripture/speech handling requirements. It calls the scripture class to locate the requested scripture and calls the TSpeech class to generate the speech output using Festival.
The file special.cpp is a dumping ground for procedures I wish to call from callbacks.c but are unlikely to be of any use outside of this project.
The file utility.cpp is the opposite. There are trim functions and tests for file existence which could be used anywhere. I prefer to keep these functions in this file.
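The idea of keeping the program independent of any one speech implementation, as described for TSpeech.cpp above, can be illustrated with an abstract interface. This is a sketch only: `SpeechEngine`, `RecordingEngine` and `read_verse` are hypothetical names, and the real TSpeech class may look quite different.

```cpp
#include <string>
#include <vector>

// Hypothetical interface: calling code sees only this, never Festival itself.
class SpeechEngine {
public:
    virtual ~SpeechEngine() {}
    // Speak the given text; returns false if the engine could not do so.
    virtual bool say(const std::string& text) = 0;
};

// Stand-in engine that records text instead of speaking it, useful for
// exercising calling code without a speech server running.
class RecordingEngine : public SpeechEngine {
public:
    bool say(const std::string& text) override {
        spoken.push_back(text);
        return true;
    }
    std::vector<std::string> spoken;
};

// Example caller that depends only on the interface.
bool read_verse(SpeechEngine& engine, const std::string& verse_text) {
    if (verse_text.empty()) return false;  // nothing to say
    return engine.say(verse_text);
}
```

A Festival-backed class would implement the same `say` method, so switching engines would not require changes to the callers.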
The To Do List is now part of the source distribution and is comprehensively documented there. Look for a file called TODO in the top level directory.
The ultimate goal of this project is to have software which is usable under either Windows or Linux and is easily installable. Initially the software will be developed to work on Linux, and once a sufficiently working version has been developed the software will be migrated to wxWindows. For the Win32 (Microsoft Windows 95/98/ME/NT/2000) platforms there will be additional issues which need to be resolved. Hopefully most of them will be addressed through the use of wxWindows as a portable API. However, as yet I do not know of any CD writer API standard for the Win32 environments, so I do not know how I can automatically burn audio to CD, in a Win32 environment, from the talking bible software. Hopefully this will become clear at some point in the future. In Linux this can be achieved simply, at the most rudimentary level, by invoking a shell to run the standard command 'cdrecord', available on nearly every full-sized Linux distribution.
The goal is therefore to have a working beta version out as soon as possible. Hopefully this will inspire further interest in the project, with some feedback about features that people would like to see in this software. Once this has taken place, a redesign of the software will incorporate as many of these changes as is logical and achievable, and the code will be revamped to utilize the wxWindows API and the Win32 version of Festival via the sockets API.
From what I understand from Festival's project documentation the Festival text-to-speech engine will run as a service on a Microsoft Windows computer.
Some links which may be useful:
I am also watching with interest the following open source projects that may be useful for incorporation into this project.
Licensing
The talking bible project will be GPLed (and hence available to all).