Add custom properties support + unit-tests, feature-tests #1273

Open
wants to merge 19 commits into
base: master

Conversation

michael-koeller

Follow-up to MR #580, which I closed unintentionally.

Sorry for any inconvenience.

@michael-koeller
Author

Hi @scanny,

I did the rebase, as you suggested.

Is there anything else you need done to get this long-running MR merged?

Best regards,
Michael

@Ricyteach

Still looking forward to this since 2018! ;)

Thanks for all the hard work!!!!

@ssmoliarchuk

Hi, first of all, thanks for your work.
It's exactly what I'm looking for, and it works perfectly from your branch.

There is only one issue for me:
I modified the custom properties, saved the file, and reopened it. The custom properties had changed, but the changes were not applied in the document body (I insert custom property values inside the document).
There is a way to trigger the update manually inside Word, but I'm looking for a way to do it automatically.
Maybe I need to trigger something else or change some metadata? (I understand this relates more to Word, but maybe you know how to do it.)

@michael-koeller
Author

> There is only one issue for me: I modified the custom properties, saved the file, and reopened it. The custom properties had changed, but the changes were not applied in the document body (I insert custom property values inside the document). There is a way to trigger the update manually inside Word, but I'm looking for a way to do it automatically. Maybe I need to trigger something else or change some metadata? (I understand this relates more to Word, but maybe you know how to do it.)

@ssmoliarchuk: Yes, I know about this issue. My workaround is to select the area in the document and have Word refresh all dynamic fields. Interestingly, the issue does not occur in LibreOffice.
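
A possible way to avoid the manual refresh, which is not part of this PR, is the `w:updateFields` setting in `word/settings.xml`, which asks Word to offer a field update when the document is opened. The sketch below is a minimal illustration on a stand-in settings fragment, using only the standard library. The helper name `request_field_update` is hypothetical. A real settings part declares further namespaces that a production version would need to preserve, and this sketch simply appends the element without trying to honor the schema's element order.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)  # keep the conventional "w" prefix on output


def request_field_update(settings_xml):
    """Return settings XML with <w:updateFields w:val="true"/> present.

    Hypothetical helper for illustration; not part of python-docx or this PR.
    """
    root = ET.fromstring(settings_xml)
    tag = "{%s}updateFields" % W_NS
    flag = root.find(tag)
    if flag is None:
        # Appended at the end; schema element order is not considered here.
        flag = ET.SubElement(root, tag)
    flag.set("{%s}val" % W_NS, "true")
    return ET.tostring(root, encoding="unicode")


# Stand-in for the content of word/settings.xml (assumption: heavily simplified).
minimal = '<w:settings xmlns:w="%s"><w:zoom w:percent="100"/></w:settings>' % W_NS
print(request_field_update(minimal))
```

With this flag set, Word typically asks the user for confirmation before updating fields, so it reduces the manual step rather than removing it entirely.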

@chrisopenhab

Hi, thanks for your work! It's sad that this feature is still not merged :-(
It would be great if the feature became available in a future release.

@ryanamannion

Hi all,

I have been using this change for a few months now and have enjoyed it, thank you for your work on this @michael-koeller.

My team and I have noticed, however, that when opening documents created with python-docx with this change, MS Word warns of "Unreadable content"

[Screenshot: MS Word "Unreadable content" warning]

We have been able to trace this back to the namespace prefix used in this PR for custom-properties, namely cup. I haven't been able to find references to cup in any ooxml specs I have seen.

[Screenshot: custom.xml generated by this PR, using the cup prefix]

To recreate the above custom.xml you can run the following

from docx import Document
d = Document()
d.add_paragraph("Test text.")
d.custom_properties["test_prop"] = "foo"
d.save("/path/to/document.docx")
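
To look at the custom.xml that such a script produces, the saved .docx can be opened as a ZIP archive. The following is a minimal sketch using only the standard library. It assumes the part is stored at `docProps/custom.xml`, which is where Word places it, and the helper name `read_custom_xml` is hypothetical. The in-memory archive at the end is a stand-in so that the sketch runs without a real document.

```python
import io
import zipfile

# Assumed part name; Word stores custom properties at this location.
CUSTOM_PART = "docProps/custom.xml"


def read_custom_xml(docx_file):
    """Return the text of docProps/custom.xml, or None if the part is absent."""
    with zipfile.ZipFile(docx_file) as package:
        if CUSTOM_PART not in package.namelist():
            return None
        return package.read(CUSTOM_PART).decode("utf-8")


# Stand-in package for demonstration; pass a real path such as
# "/path/to/document.docx" to inspect an actual file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr(CUSTOM_PART, "<Properties/>")
buffer.seek(0)
print(read_custom_xml(buffer))
```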

Microsoft's DocumentFormat.OpenXml documentation says the prefix is op, and this article on adding custom properties via the SDK also shows the prefix as op. Yet when I changed the prefix from cup to op locally, I still got the "Unreadable content" warning in Word.

It looks like creating a new custom.xml in Word, by adding a custom property to a blank document, yields a Properties tag with no namespace prefix.
[Screenshot: custom.xml created by MS Word, with an unprefixed Properties root element]

To recreate the above document in MS Word:

  1. Open a blank Word document.
  2. Optionally add some paragraph text to document.
  3. File > Info > Click the word "Properties" top right to reveal "Advanced Properties" > Custom > Add a custom property

To test a crude potential fix, I changed the definition of the _customProperties_tmpl attribute of CT_CustomProperties in oxml/customprops.py to:

xmlns = 'xmlns="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"'
_customProperties_tmpl = "<Properties %s/>\n" % (xmlns + " " + nsdecls("vt"))

I also changed the definitions of cup to op, namely in docx/oxml/ns.py and in docx/oxml/__init__.py for register_element_cls. Doing this seems to preserve custom property functionality, and the result works with all editors I have tested (LibreOffice, OnlyOffice, MS Word).
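
As a side note on why renaming the prefix alone made no difference: to a namespace-aware XML parser, the prefix is only a local alias for the namespace URI. The sketch below uses the standard library to illustrate this; the three root elements are hand-written stand-ins, not output copied from python-docx or Word.

```python
import xml.etree.ElementTree as ET

CUSTOM_NS = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"

# Three spellings of the same root element: two prefixes and a default namespace.
prefixed = '<op:Properties xmlns:op="%s"/>' % CUSTOM_NS
renamed = '<cup:Properties xmlns:cup="%s"/>' % CUSTOM_NS
default = '<Properties xmlns="%s"/>' % CUSTOM_NS

# ElementTree reports tags as "{namespace-uri}localname", so the prefix disappears.
tags = {ET.fromstring(xml).tag for xml in (prefixed, renamed, default)}
print(tags)
```

All three variants resolve to the same qualified name, so the set contains a single entry. What changes with a default namespace is that unprefixed child elements inherit it too, which may be why this variant was accepted by Word.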

Any thoughts? I'd love to contribute by cleaning up my changes and opening a PR into your branch @michael-koeller

@michael-koeller
Author

@ryanamannion: Thanks for the detailed input. I'll have a look at it soon.

@ryanamannion

I have been looking into this more to make a proper fix for BlackBoiler's fork of python-docx. Validation with dotnet's Open XML Productivity Tool revealed that the actual issue was the missing namespace prefix on the property children of <op:Properties>. Adding the prefix to each property element, e.g. <op:property>, in CustomProperties.__setitem__ seems to have fixed the issue, and the output is compliant with the spec because the children are now in the correct namespace.

python-docx now generates a custom.xml that looks like the following (using the same code snippet from above):

[Screenshot: custom.xml generated by python-docx after the fix]

The resulting document does not raise "Unreadable content" warnings when opened with MS Word, and it validates successfully with the Open XML Productivity Tool.
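
The effect of the missing prefix can be reproduced with the standard library. This is a minimal sketch in which the markup is reconstructed from the description above rather than copied from the screenshots, and the helper name `property_tags` is hypothetical. Under a prefixed root element, an unprefixed child belongs to no namespace at all, whereas a prefixed child lands in the custom-properties namespace.

```python
import xml.etree.ElementTree as ET

CUSTOM_NS = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"
VT_NS = "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes"
# Format ID used for custom document properties.
FMTID = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}"

# Reconstructed custom.xml body; {child} is the element name under test.
template = (
    '<op:Properties xmlns:op="{op}" xmlns:vt="{vt}">'
    '<{child} fmtid="{fmtid}" pid="2" name="test_prop">'
    "<vt:lpwstr>foo</vt:lpwstr>"
    "</{child}>"
    "</op:Properties>"
)


def property_tags(custom_xml):
    """Return the qualified tag of each child of the root element."""
    return [child.tag for child in ET.fromstring(custom_xml)]


broken = template.format(op=CUSTOM_NS, vt=VT_NS, fmtid=FMTID, child="property")
fixed = template.format(op=CUSTOM_NS, vt=VT_NS, fmtid=FMTID, child="op:property")

print(property_tags(broken))  # child has no namespace
print(property_tags(fixed))   # child is in the custom-properties namespace
```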

The code changes to achieve this can be viewed at a glance here: BlackBoiler#25

@michael-koeller
Author

@ryanamannion: I just merged your fix from BlackBoiler#25 into this pull request. Thanks again for your efforts.

5 participants