WordHtmForEPF - Preparing Word saved .htm pages for instant EPF import [message #53550] Thu, 11 September 2008 03:36
Kristian Mandrup
Hi,

I was annoyed that I couldn't paste a .htm document saved from Word into an
EPF rich text editor without getting rather unusable results, so I wrote a
small Groovy script that greatly improves the process.

Usage guide:
Save the Word doc as a filtered .htm page. Word will create a folder with the
external resources used, such as images, numbered image001, image002 and so
on. This naming really sucks for importing into EPF, since every document
reuses the same image names.

Run WordHtmForEPF.groovy in the directory containing the saved .htm files,
optionally with a file pattern argument (a regular expression) selecting
which .htm files to process, e.g.:
WordHtmForEPF.groovy "Guide.*\.htm"
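
The script compiles the pattern argument as a regular expression and matches it against the whole file name, so glob-style text does not behave as it would in a shell. A small sketch of the difference, using made-up file names and a hypothetical helper `matchesPattern` that is not part of the script:

```groovy
// Hypothetical helper: mirrors how eachFileMatch tests a name against a pattern
// (the whole name has to match, not just a part of it).
boolean matchesPattern(String pattern, String fileName) {
    fileName ==~ pattern
}

// Made-up file names for illustration.
def names = ['Guide1.htm', 'Guide-intro.htm', 'EPF_Guide1.htm', 'Notes.htm']

// Glob-style text read as a regex: 'e*' means "zero or more e", '.' means "any one character".
def globLike = names.findAll { matchesPattern('Guide*.htm', it) }

// Regex form: '.*' for "anything", '\.' for a literal dot.
def regex = names.findAll { matchesPattern(/Guide.*\.htm/, it) }

println "glob-like: ${globLike}"
println "regex:     ${regex}"
```

Quoting the argument on the command line also keeps the shell from expanding it before the script sees it.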

The script will create a new .htm file with an EPF_ prefix for each .htm file
matching the file pattern. These new .htm files have their image src
attributes redirected to images in a resources folder.
The images from the Word saved resource folders will be renamed more
usefully (with a document name prefix) and copied to the new resources
folder.
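
To make the src redirection concrete, here is a minimal sketch of the idea on a single tag. The sample `<img>` tag, the helper `dashSpaces` and the exact pattern are assumptions for illustration, not a copy of the script's code:

```groovy
// Hypothetical helper: the space handling from the script's encode(), on its own.
String dashSpaces(String s) {
    s.replaceAll(/%20|\s/, '-')
}

// Sample tag as Word might write it (assumed; real output varies by Word version).
def html = '<img width=200 height=100 src="My%20Guide-filer/image001.jpg">'

// Capture the folder and the file name, then point src at resources/<folder>-<file>.
def rewritten = html.replaceAll(/src="([^"\/]+)\/([^"\/]+)"/) { full, folder, file ->
    'src="resources/' + dashSpaces(folder) + '-' + file + '"'
}

println rewritten
```

The folder name ends up as part of the image name, which is what keeps image001 from one document apart from image001 in another.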

Now if you open an EPF_xxx.htm file in a browser, you can copy and paste it
into an EPF RTE and the result will be just beautiful. All images are
transferred, stored in the EPF /resources folder for the type of method
element, and each image is named to indicate the method element it is part of :)

It even handles special characters (only Danish characters as of now...)
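
If other accented characters turn up, one possible extension is to keep the explicit Danish mappings and let `java.text.Normalizer` strip the remaining accents. This is only a sketch: the helper `toAsciiName` and the sample name are hypothetical, and it assumes the script file is saved and read as UTF-8.

```groovy
import java.text.Normalizer

// Explicit transliterations, the same mappings the script uses for Danish letters.
def danish = ['æ': 'ae', 'Æ': 'ae', 'ø': 'oe', 'Ø': 'oe', 'å': 'aa', 'Å': 'aa']

// Hypothetical helper (not in the script): apply the Danish map first, then strip
// any remaining accents by decomposing characters and dropping combining marks.
String toAsciiName(String name, Map<String, String> map) {
    def mapped = name.collect { map.getOrDefault(it, it) }.join()
    def decomposed = Normalizer.normalize(mapped, Normalizer.Form.NFD)
    decomposed.replaceAll(/\p{M}/, '').replaceAll(/\s/, '-')
}

println toAsciiName('Vejledning på café', danish)
```

The explicit map is still needed because æ and ø have no accent to strip; decomposition alone leaves them unchanged.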

---

/**
 * Prepares Word "filtered" .htm files for pasting into the EPF rich text editor.
 *
 * @author kristian@mandrup.dk
 */
public class WordHtmForEPF {

    /**
     * Makes a name safe for files and URLs: spaces (literal or %20) become '-',
     * Danish letters and é are transliterated.
     */
    private static String encode(String name) {
        String result = name.replaceAll(/%20|\s/, '-')
        result = result.replaceAll(/æ|Æ|%C3%A6/, 'ae')
        result = result.replaceAll(/ø|Ø/, 'oe')
        result = result.replaceAll(/å|Å/, 'aa')
        result = result.replaceAll(/é|É/, 'e')
        return result
    }

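    /**
     * @param args optional: [0] regular expression selecting the .htm files to process,
     *             [1] 'remove' to strip the document-name prefix from image names
     */
    public static void main(String[] args) {
        println "WordHtmEPF v.1.0 - by Kristian Mandrup consulting"
        // Default: every .htm file that is not already an EPF_ output file.
        // (A character class like [^EPF_] would only exclude single leading letters.)
        def filePattern = /(?!EPF_).*\.htm/
        if (args.length > 0) {
            filePattern = args[0]
        }
        def removePrefix = args.length > 1 && args[1] == 'remove'

        def dir = new File(".")
        dir.eachFileMatch(~filePattern) { File f ->
            println "Generating EPF image references for: ${f.name}"
            def str = f.getText()
            // Point every image src at resources/<folder>-<image>.
            // [^"\/]+ keeps each match inside a single src attribute.
            def replaced = str.replaceAll(/src="([^"\/]+)\/([^"\/]+)"/) { fullMatch, first, second ->
                return 'src="resources/' + encode(first) + '-' + second + '"'
            }
            // Overwrite rather than append, so re-running does not duplicate the page.
            def f2 = new File('EPF_' + f.name)
            f2.text = replaced
            // Word's resource folder. '-filer' is the suffix the author's Word produces;
            // other language versions use a different one (English Word uses '_files').
            String dirName = f.name.replaceFirst(/\.htm$/, '') + '-filer'
            File fdir = new File(dirName)
            if (!fdir.isDirectory()) {
                println "No resource folder ${dirName} found, skipping images"
                return
            }
            renameResources(fdir, removePrefix)
            // If nothing was copied, fall back to renaming the Word folder itself.
            File resDir = new File('resources')
            if (!resDir.exists()) {
                fdir.renameTo(resDir)
            }
        }
    }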

    /** Copies all images from a Word resource folder into resources/, prefixed with the folder name. */
    private static void renameResources(File directory, boolean removePrefix) {
        def filePattern = /(?i).*\.(jpg|gif|png|bmp)/
        // Folder name, independent of the platform's path separator.
        String prefix = directory.canonicalFile.name
        println "Directory name used as default prefix: ${prefix}"
        println "Renaming all pictures (jpg|gif|png|bmp, any case)"

        directory.eachFileMatch(~filePattern) { File f ->
            boolean hasPrefix = f.name.startsWith(prefix)
            String replaceName
            if (removePrefix) {
                // strip "<prefix>-" when it is present
                replaceName = hasPrefix ? f.name[(prefix.length() + 1)..-1] : f.name
            } else {
                // don't add the prefix if it is present already
                String newFileName = "${prefix}-${f.name}"
                replaceName = hasPrefix ? f.name : encode(newFileName)
            }

            // copy all files to the resources dir, creating it on first use
            File resourcesDir = new File('resources')
            resourcesDir.mkdir()
            File newResourceFile = new File(resourcesDir, replaceName)
            println "Copy ${f.name} to resources/${newResourceFile.name}"
            // Plain JDK copy; avoids depending on AntBuilder being on the classpath.
            java.nio.file.Files.copy(f.toPath(), newResourceFile.toPath(),
                    java.nio.file.StandardCopyOption.REPLACE_EXISTING)
        }
    }

}