Generating EPF XML Import file from external content [message #589685] |
Tue, 24 June 2008 14:04 |
Kristian Mandrup Messages: 44 Registered: July 2009 |
After some hard work I managed to get the "xsmall" XML pipeline project
(from java.net) up and running and integrated with my work-in-progress
EPF import framework.
The cookbook so far is as follows:
1. Recursively walk all subdirectories of the starting directory with an
(htm, html) filter - Use DirectoryWalker with FileFilters (Apache Commons IO)
2. For each htm or html file [file], check if <o:DocumentProperties> is
present (i.e. Word htm, NOT the "filtered" kind)
2a. Extract Metadata into [file]-metadata.xml
2b. Extract folder location relative to starting directory for each file
and use this to define the package structure (placement) of the file. This
will correspond to the package structure in EPF
2c. Set <package> according to package structure from file location (or if
<o:Category> is set, this tag overrules package structure by location!)
2d. If no Metadata present, create default metadata file (using template
in directory or closest parent directory where such a template is present)
2e. Tidy each htm or html file to generate minimalistic xhtml file that
conforms to EPF style guide.
2f. Merge metadata file with xhtml file (using xsmall), creating a
content-file for each
3. Use xsmall pipeline to build one MyPlugin.xml file using all the
content-files as input
4. User must manually load MyPlugin.xml into a Library of choice
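To make steps 1, 2, 2b and 2c a bit more concrete, here is a minimal sketch in Java. It is not the actual framework code: the class and method names (EpfImportSketch, findHtmlFiles, hasDocumentProperties, packageFor) are made up for illustration, it uses the JDK's Files.walk instead of the Commons IO DirectoryWalker so that it runs without extra jars, and joining the folder names with "/" is only an assumed way of writing the package path.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, for illustration only.
final class EpfImportSketch {

    // Step 1: collect every .htm/.html file below root, recursively.
    // (Stand-in for DirectoryWalker + FileFilters from Commons IO.)
    public static List<Path> findHtmlFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(EpfImportSketch::isHtml)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    // Case-insensitive extension check for htm/html.
    public static boolean isHtml(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".htm") || name.endsWith(".html");
    }

    // Step 2: plain substring check for the Word metadata block.
    // ISO-8859-1 is used because it decodes any byte sequence without error.
    public static boolean hasDocumentProperties(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
        return text.contains("<o:DocumentProperties>");
    }

    // Steps 2b/2c: package comes from the folder relative to root,
    // unless a non-empty category (from <o:Category>) overrules it.
    // The "/" separator is an assumption of this sketch.
    public static String packageFor(Path root, Path file, String category) {
        if (category != null && !category.trim().isEmpty()) {
            return category.trim();
        }
        Path parent = root.relativize(file).getParent();
        if (parent == null) {
            return ""; // file sits directly in the starting directory
        }
        List<String> parts = new ArrayList<>();
        for (Path segment : parent) {
            parts.add(segment.toString());
        }
        return String.join("/", parts);
    }
}
```

With a starting directory containing guidance/checklists/review.htm, packageFor would give "guidance/checklists" for that file when no category is set. Extracting the metadata itself (2a), the tidy step (2e) and the xsmall merge are left out here.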
Suggestions are welcome!
Kristian