A simple ".docx" converter implemented in Go. It converts ".docx" files to plain text.
This repository is an alpha version. Breaking changes may still be introduced.
MIT
- Few dependencies.
- No need for Microsoft Office.
- In limited environments, ".doc" files can also be converted:
  - Windows with MS Office installed.
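As background on why Microsoft Office is not needed: a ".docx" file is a ZIP archive whose main text is stored in word/document.xml, with each paragraph in a w:p element and its text in w:t elements. The sketch below illustrates that idea with the standard library only. It is not this package's actual implementation; paragraphsFromXML and wordNS are hypothetical names, the sample XML is made up, and details such as text boxes and tabs are ignored.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// wordNS is the WordprocessingML namespace used by w:p and w:t elements.
const wordNS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

// paragraphsFromXML (hypothetical helper) returns the text of each w:p
// element found in the given document.xml content.
func paragraphsFromXML(doc string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var (
		ps     []string
		b      strings.Builder
		inText bool // true while inside a w:t element
	)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return ps, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if t.Name.Space != wordNS {
				continue
			}
			switch t.Name.Local {
			case "p":
				b.Reset() // a new paragraph starts
			case "t":
				inText = true
			}
		case xml.EndElement:
			if t.Name.Space != wordNS {
				continue
			}
			switch t.Name.Local {
			case "p":
				ps = append(ps, b.String())
			case "t":
				inText = false
			}
		case xml.CharData:
			if inText {
				b.Write(t) // copy the text run into the current paragraph
			}
		}
	}
}

func main() {
	// Made-up sample resembling a tiny word/document.xml.
	doc := `<w:document xmlns:w="` + wordNS + `"><w:body>` +
		`<w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>` +
		`<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>` +
		`</w:body></w:document>`
	ps, err := paragraphsFromXML(doc)
	if err != nil {
		panic(err)
	}
	for _, p := range ps {
		fmt.Println(p)
	}
}
```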
This simple example reads all paragraphs at once.
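package main

import (
	"fmt"
	"path/filepath"

	docc "github.com/tenkoh/go-docc"
)

func main() {
	fp := filepath.Clean("./target.docx")
	r, err := docc.NewReader(fp)
	if err != nil {
		panic(err)
	}
	defer r.Close()
	ps, err := r.ReadAll()
	if err != nil {
		panic(err)
	}
	// do something with ps ([]string); here, print the paragraph count
	fmt.Println(len(ps))
}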
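As one possible way to "do something" with the result, the sketch below joins a []string of paragraphs into a single plain-text string, skipping paragraphs that are empty or whitespace only. joinParagraphs is a hypothetical helper, not part of this package, and the sample slice stands in for the value returned by ReadAll.

```go
package main

import (
	"fmt"
	"strings"
)

// joinParagraphs (hypothetical helper) writes each non-blank paragraph
// on its own line and returns the resulting text.
func joinParagraphs(ps []string) string {
	var b strings.Builder
	for _, p := range ps {
		if strings.TrimSpace(p) == "" {
			continue // skip empty paragraphs
		}
		b.WriteString(p)
		b.WriteByte('\n')
	}
	return b.String()
}

func main() {
	// Sample data standing in for the paragraphs read from a document.
	ps := []string{"Title", "", "First paragraph.", "  ", "Second paragraph."}
	fmt.Print(joinParagraphs(ps))
}
```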
To read the document one paragraph at a time, use the following example.
package main

import (
	"fmt"
	"io"
	"path/filepath"

	docc "github.com/tenkoh/go-docc"
)

func main() {
	fp := filepath.Clean("./target.docx")
	r, err := docc.NewReader(fp)
	if err != nil {
		panic(err)
	}
	defer r.Close()
	for {
		p, err := r.Read()
		if err == io.EOF {
			return
		} else if err != nil {
			panic(err)
		}
		// do something with p (string); here, print it
		fmt.Println(p)
	}
}
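Reading paragraph by paragraph makes it possible to stop early, for example to take only the first few paragraphs of a long document. The sketch below shows that pattern against a small interface that mirrors the Read method used above. paragraphReader, firstN, and sliceReader are hypothetical names introduced for illustration, and sliceReader is an in-memory stand-in so that the example runs without a ".docx" file.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// paragraphReader (hypothetical) is anything that yields one paragraph
// per call and returns io.EOF when no paragraphs remain.
type paragraphReader interface {
	Read() (string, error)
}

// firstN (hypothetical helper) reads at most n paragraphs and stops
// early at io.EOF.
func firstN(r paragraphReader, n int) ([]string, error) {
	var ps []string
	for len(ps) < n {
		p, err := r.Read()
		if errors.Is(err, io.EOF) {
			break
		}
		if err != nil {
			return ps, err
		}
		ps = append(ps, p)
	}
	return ps, nil
}

// sliceReader is an in-memory stand-in used only for this example.
type sliceReader struct {
	ps []string
	i  int
}

func (s *sliceReader) Read() (string, error) {
	if s.i >= len(s.ps) {
		return "", io.EOF
	}
	p := s.ps[s.i]
	s.i++
	return p, nil
}

func main() {
	r := &sliceReader{ps: []string{"one", "two", "three"}}
	ps, err := firstN(r, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(ps)
}
```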
Before compiling, run go mod tidy or go get github.com/tenkoh/go-docc to fetch this package.
A command-line tool is also available via go install:
go install github.com/tenkoh/go-docc/cmd/docc@latest
After that, the docc command can be used. A simple example:
docc target.docx > plain.txt
Contributions are very welcome!
tenkoh