At a guess, it will have taken the average number of words per line and multiplied that by the number of lines with content, which would be reasonably accurate but would skew as the numbers get larger. Converting the file with antiword (doc to txt; I presume there is something similar for odt) and running wc on the result should be a little less involved if you need more accuracy (though I'm only assuming wc is accurate).
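
To show why an estimate like that drifts, here is a rough Python sketch comparing the two approaches on plain text. It only illustrates my guess above, not how the program actually counts; the function names and the `sample_lines` parameter are made up for the example, and the exact count splits on whitespace, which is roughly what `wc -w` does.

```python
def exact_word_count(text: str) -> int:
    """Count whitespace-separated words (roughly what `wc -w` reports)."""
    return len(text.split())


def estimated_word_count(text: str, sample_lines: int = 3) -> int:
    """Guess the word count as (average words per sampled line) * (non-blank lines).

    Assumption: the average is taken from the first `sample_lines`
    non-blank lines. This is a hypothetical model of the estimate.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return 0
    sample = lines[:sample_lines]
    average = sum(len(line.split()) for line in sample) / len(sample)
    return round(average * len(lines))


if __name__ == "__main__":
    text = "one two three\nfour five\n\nsix seven eight nine ten\neleven\n"
    print("exact:", exact_word_count(text))
    print("estimate:", estimated_word_count(text, sample_lines=2))
```

The more the line lengths vary, and the more lines there are, the further the estimate can wander from the exact count, which is why converting to text and counting directly is the safer route.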