"Automated syncing of subtitlesr" by Gutto Silva, Universal Subtitles Summer of Code Ideas, 2012
Subtitles are currently synced by hand, which demands a great deal of attention and time. This motivates the development of a system that syncs subtitles to movies automatically. The difficulty is that automatic subtitle syncing is very hard and tends to give low-quality results. One way to address this would be a syncing system based on capturing the speech.
The project consists of three components:
GPML to Reactome XML layout converter
Unlike the Reactome XML format, GPML mainly describes the graphical representation of pathways and does not contain the semantics of the reactions. To produce Reactome XML, therefore, the converter must employ certain heuristics to infer semantic relations from the graphical representation and eliminate ambiguities. The heuristics will follow SBGN as closely as possible while still retaining compatibility with other formatting conventions.
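As an illustration of the kind of heuristic involved, the sketch below infers a reaction from purely graphical data: an arrow whose two endpoints each fall inside a node's box is read as a conversion from the start node to the end node, and any arrow that does not touch two distinct nodes is reported as ambiguous rather than guessed. The `Node`, `Arrow`, and `infer_reactions` names and the bounding-box rule are assumptions made for this example; they are not the converter's actual heuristics, nor part of GPML or Reactome.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """A drawn box: identifier, centre coordinates, width and height (hypothetical model)."""
    node_id: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        # True when the point lies inside (or on the edge of) the box.
        return (abs(px - self.x) <= self.width / 2
                and abs(py - self.y) <= self.height / 2)


@dataclass(frozen=True)
class Arrow:
    """A drawn arrow from (x1, y1) to (x2, y2), with the head at the second point."""
    x1: float
    y1: float
    x2: float
    y2: float


def node_at(nodes, px, py) -> Optional[Node]:
    # Return the single node containing the point; None if zero or several match.
    hits = [n for n in nodes if n.contains(px, py)]
    return hits[0] if len(hits) == 1 else None


def infer_reactions(nodes, arrows):
    """Split arrows into inferred (input, output) pairs and ambiguous leftovers."""
    reactions, ambiguous = [], []
    for arrow in arrows:
        source = node_at(nodes, arrow.x1, arrow.y1)
        target = node_at(nodes, arrow.x2, arrow.y2)
        if source is not None and target is not None and source != target:
            reactions.append((source.node_id, target.node_id))
        else:
            # Do not guess: leave the arrow for manual review or another heuristic.
            ambiguous.append(arrow)
    return reactions, ambiguous


nodes = [Node("glucose", 0, 0, 80, 20), Node("g6p", 200, 0, 80, 20)]
arrows = [Arrow(40, 0, 160, 0), Arrow(40, 0, 500, 500)]
reactions, ambiguous = infer_reactions(nodes, arrows)
print(reactions)
print(len(ambiguous))
```

Keeping ambiguous arrows in a separate list, instead of forcing an interpretation, is one way to make the "eliminate ambiguities" step explicit and reviewable.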
Reactome XML layout to GPML converter
The Reactome XML layout contains additional pathway data that cannot be viewed in GPML. Therefore, the GPML produced by the conversion will carry extra comments containing the Reactome data, or at least their identifiers, so that the data are preserved when a back-conversion (from GPML to Reactome XML) occurs.
During the conversion, SBGN semantics will be employed to provide unambiguous back-conversion to Reactome XML later when necessary. Some additional shapes might need to be implemented in GPML, or alternatively comments can be written to differentiate SBGN symbols that do not have corresponding graphical representation in GPML.
During the development of this converter, a schema for Reactome XML will also be written so that converted test files can be easily validated.
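The comment-based preservation described above could look roughly like the following sketch, written with Python's standard `xml.etree.ElementTree`. The element and attribute names (`Pathway`, `DataNode`, `Comment`, `Source`) are a simplified, namespace-free stand-in for GPML, and the `Reactome-id` label and the identifier value are made up for illustration; the actual convention would be settled during the project.

```python
import xml.etree.ElementTree as ET

REACTOME_SOURCE = "Reactome-id"  # hypothetical label marking preserved Reactome data


def attach_reactome_id(element, reactome_id):
    """Store a Reactome identifier on a GPML-like element as a Comment child."""
    comment = ET.SubElement(element, "Comment", {"Source": REACTOME_SOURCE})
    comment.text = reactome_id
    return comment


def read_reactome_id(element):
    """Recover the identifier during back-conversion; None if it was never stored."""
    for comment in element.findall("Comment"):
        if comment.get("Source") == REACTOME_SOURCE:
            return comment.text
    return None


# Forward conversion: write the identifier into the GPML-like document.
pathway = ET.Element("Pathway", {"Name": "Example pathway"})
node = ET.SubElement(pathway, "DataNode", {"TextLabel": "ATP"})
attach_reactome_id(node, "example-id-123")  # made-up identifier

# Serialize and parse again, as a back-conversion would.
serialized = ET.tostring(pathway, encoding="unicode")
reparsed = ET.fromstring(serialized)
recovered = read_reactome_id(reparsed.find("DataNode"))
print(recovered)
```

Because the identifier travels inside the file itself, the back-conversion needs no side table to match GPML elements to their Reactome counterparts.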
Automatic update mechanism between WikiPathways and Reactome
A separate script will be written that periodically pulls updates from WikiPathways and converts them to the Reactome XML layout. The script can be set to update the pathways in Reactome automatically if correct credentials are provided. This will mainly be done for pathways that are already tagged as high quality.
The script will also pull updates from Reactome and push new pathways to WikiPathways. Only Reactome pathways that have XML layout will be pushed to WikiPathways.
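The pull direction of such a script might be structured as in the sketch below. It rests on stated assumptions: `PathwayInfo`, the `high_quality` flag, and the `convert` and `upload` callables are hypothetical stand-ins, and no real WikiPathways or Reactome web service call is shown.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class PathwayInfo:
    """Hypothetical summary of one WikiPathways pathway."""
    pathway_id: str
    last_modified: datetime
    high_quality: bool  # stands in for the quality tag mentioned above
    gpml: str


def select_updates(pathways: Iterable[PathwayInfo],
                   last_sync: datetime) -> List[PathwayInfo]:
    # Keep only high-quality pathways modified after the previous sync.
    return [p for p in pathways
            if p.high_quality and p.last_modified > last_sync]


def sync_once(pathways, last_sync,
              convert: Callable[[str], str],
              upload: Callable[[str, str], None]) -> List[str]:
    """Convert and upload every selected pathway; return the ids that were pushed."""
    pushed = []
    for pathway in select_updates(pathways, last_sync):
        upload(pathway.pathway_id, convert(pathway.gpml))
        pushed.append(pathway.pathway_id)
    return pushed


# Usage with in-memory stand-ins instead of real services.
catalogue = [
    PathwayInfo("WP-A", datetime(2020, 7, 20), True, "<gpml-a/>"),
    PathwayInfo("WP-B", datetime(2020, 7, 20), False, "<gpml-b/>"),
    PathwayInfo("WP-C", datetime(2020, 6, 1), True, "<gpml-c/>"),
]
uploaded = {}
pushed = sync_once(
    catalogue,
    last_sync=datetime(2020, 7, 1),
    convert=lambda gpml: gpml.replace("gpml", "reactome"),  # placeholder conversion
    upload=uploaded.__setitem__,
)
print(pushed)
```

Passing the converter and uploader in as callables keeps the selection logic testable without credentials; a scheduler such as cron could then call `sync_once` at whatever interval is chosen.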
The deliverables of the project are:
an XML schema to validate the new Reactome XML format;
a GPML to Reactome XML layout converter and a Reactome XML layout to GPML converter, both available as a command line tool and as a library that can be integrated with the WikiPathways infrastructure;
a system using the above converters, integrated into WikiPathways, that will periodically check for updates on both WikiPathways and Reactome and update the websites accordingly;
proper documentation and tests for the above-mentioned components.
This week-by-week timeline provides a rough guideline of how the project will be done.
3 -- 16 May
Become familiar with the code and the community, the version control system, the documentation and test systems in use, and the new Reactome version.
17 -- 30 May
Write the Reactome XML layout schema and the command line Reactome XML to GPML converter, keeping in mind that the internals are to be used subsequently as a library.
31 May -- 6 June
Test and document existing code more thoroughly.
7 -- 20 June
Determine the algorithms to be used for converting GPML graphical representations to Reactome XML. Then, write the command line GPML to Reactome XML converter, keeping in mind that the internals are to be used subsequently as a library.
21 -- 27 June
Test and document the GPML to Reactome XML converter and the heuristic algorithm more thoroughly.
28 June -- 11 July
Ensure that round-trip conversion works flawlessly (i.e. no data is lost when converting GPML to Reactome XML to GPML again, and vice versa). Also test and document round-trip conversions.
12 -- 25 July
Integrate the converters into WikiPathways. Write a system that periodically checks for updates on both WikiPathways and Reactome and updates the websites accordingly.
26 July -- 1 August
Test and document the periodic push/pull mechanism more thoroughly.
2 -- 16 August
Further refine tests and documentation for the whole project.