
I am trying to use python-docx to read an MS Word document, but I can't find a function or method in the API for reading formulas. If I want to read the formulas in an MS Word file, is there any advice? Update: I tried printing every text attribute of the document object with the following code, but it doesn't show any formula information at all.

from collections.abc import Iterable

from docx import Document


def object_walk(obj, stack):
    # Skip objects that were already visited, to avoid walking cycles.
    if id(obj) in id_set:
        return
    id_set.add(id(obj))
    # Stop at None or once the attribute path is 8 levels deep.
    if len(stack) == 8 or obj is None:
        return
    for attr in (name for name in dir(obj) if not name.startswith('_')):
        if attr == "text":
            print(getattr(obj, attr), "============", stack)
        if isinstance(obj, Iterable):
            # Walk the items of a container, labelling each with its index.
            for i, item in enumerate(obj):
                stack.append(attr + str(i))
                object_walk(item, stack)
                stack.pop()
        else:
            stack.append(attr)
            try:
                object_walk(getattr(obj, attr), stack)
            except Exception:
                # Some properties raise on access; ignore them.
                pass
            stack.pop()


id_set = set()  # ids of objects already visited
document = Document("demo.docx")
object_walk(document, ["root"])
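
A possible direction, offered as a sketch rather than a confirmed solution: Word stores equations as Office Math Markup Language (OMML) elements such as `m:oMath`, which sit outside the ordinary `w:t` text runs, and that is probably why no `text` attribute shows them. The example below bypasses python-docx and reads `word/document.xml` straight from the .docx archive using only the standard library. The function name `extract_formula_text` is hypothetical, and it returns only the literal characters of each equation, not its structure (fractions, superscripts, and so on).

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of Office Math Markup Language (OMML), used for Word equations.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"


def extract_formula_text(docx_file):
    """Return one string per <m:oMath> element in the main document part.

    `docx_file` is a path or a binary file-like object (hypothetical helper,
    not part of python-docx).
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    formulas = []
    for omath in root.iter(f"{{{M_NS}}}oMath"):
        # <m:t> elements hold the literal characters of the equation.
        pieces = [t.text or "" for t in omath.iter(f"{{{M_NS}}}t")]
        formulas.append("".join(pieces))
    return formulas
```

Calling `extract_formula_text("demo.docx")` should then give a list with one entry per equation; converting the OMML tree to LaTeX or MathML would need additional work that is not shown here.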
FavorMylikes
  • Did you try anything? [Extract text from document](http://stackoverflow.com/questions/25228106/how-to-extract-text-from-an-existing-docx-file-using-python-docx). After being able to read it, it's probably just a matter of encoding. – Simon Jul 26 '16 at 04:37
  • Thanks for your response. I checked whether there is any content in document.paragraphs. If I create an MS Word file that contains only a formula, document.paragraphs gives just a "\n". – FavorMylikes Jul 26 '16 at 04:55
  • @FavorMylikes did you manage to find a solution for this? I'm trying to extract formula from a docx too. – snowflake Dec 18 '18 at 11:23

0 Answers