Setting core_properties.last_modified_by makes document invalid #1037
Comments
Please post the code you used. |
|
It seems to happen when the |
Paste in a snippet of the |
Also, what do you mean by:
That is a namespace declaration not a property and who is adding it and how can you tell? |
<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
<cp:coreProperties
xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcmitype="http://purl.org/dc/dcmitype/"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>
<dc:title/>
<dc:subject/>
<dc:creator>Author</dc:creator>
<cp:keywords/>
<dc:description></dc:description>
<cp:lastModifiedBy>first_author</cp:lastModifiedBy>
<cp:revision>30</cp:revision>
<dcterms:created xsi:type="dcterms:W3CDTF">2020-12-15T09:27:00Z</dcterms:created>
<dcterms:modified xsi:type="dcterms:W3CDTF">2021-11-22T20:52:00Z</dcterms:modified>
<cp:lastModifiedBy xmlns:cp="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties">
Second author
</cp:lastModifiedBy>
</cp:coreProperties> |
so cp:lastModifiedBy |
You're right, it's not an attribute, my bad ;-) |
Okay, so this looks like a namespace collision. What is the provenance of the document you are making this change to? What happens if you make this change to a newly-created Word document? The In addition to an account of the provenance, please paste in the original Also, show the output of the following:
>>> document = Document(...)
>>> core_properties = document.core_properties
>>> core_properties.last_modified_by
...
>>> core_properties.last_modified_by = 'Someone'
>>> core_properties.last_modified_by
... |
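To make the namespace collision visible without opening Word, something like the following standard-library sketch could be used. It lists every `lastModifiedBy` child of `docProps/core.xml` together with the namespace URI it resolves to. The helper name `last_modified_by_elements` and the in-memory sample package are hypothetical and only for illustration; the sample XML is a trimmed copy of the `core.xml` pasted above, and none of this is part of python-docx.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def last_modified_by_elements(docx_file):
    """Return (namespace_uri, text) for each lastModifiedBy child of core.xml.

    Hypothetical helper for illustration; not part of python-docx.
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("docProps/core.xml"))
    found = []
    for child in root:
        tag = child.tag
        # ElementTree spells qualified tags as "{namespace-uri}localname".
        if tag.startswith("{"):
            uri, _, local = tag[1:].partition("}")
        else:
            uri, local = "", tag
        if local == "lastModifiedBy":
            found.append((uri, (child.text or "").strip()))
    return found


# Trimmed version of the core.xml shown in this thread: the second element
# rebinds the "cp" prefix to a different namespace URI.
CORE_XML = b"""<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties">
  <cp:lastModifiedBy>first_author</cp:lastModifiedBy>
  <cp:lastModifiedBy
      xmlns:cp="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties">
    Second author
  </cp:lastModifiedBy>
</cp:coreProperties>"""

# Build a minimal in-memory package so the sketch runs without a real .docx.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("docProps/core.xml", CORE_XML)
buf.seek(0)

elements = last_modified_by_elements(buf)
for uri, text in elements:
    print(uri, "->", text)
```

Both elements are written with the `cp:` prefix, but they resolve to different namespace URIs, which is why a reader that only looks at the prefix sees the property "twice". Passing a real file path instead of `buf` should work the same way.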
I found the solution and it's not in
However I think that the bug can be fixed on Why is it that if we write |
We are encountering a somewhat related problem while using python-docx (indirectly via docxtpl). Simply reading the template's properties by calling doc.core_properties produces a duplicate docProps/core.xml entry in the saved file. Ref: elapouya/python-docx-template#558
The problem occurs when Input.docx is generated by LibreOffice; if Input.docx is generated by MS Word, it does not happen. Run the following code against any LibreOffice-generated docx file to reproduce the issue.
from docx import Document
# Load any document generated using LibreOffice
doc = Document('Input.docx')
# Following line generates UserWarning and output file is corrupt. It could not be opened using MS Word 2013.
# ...\Lib\zipfile.py:1566: UserWarning: Duplicate name: 'docProps/core.xml'
#   return self._open_to_write(zinfo, force_zip64=force_zip64)
doc.save('Output.docx') |
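The `Duplicate name` warning means the saved package contains two ZIP members with the same name. A rough way to confirm that on an output file, using only the standard library, might look like this. The helper name `duplicate_members` is hypothetical, and the in-memory archive merely imitates the corrupt output rather than being produced by python-docx.

```python
import io
import warnings
import zipfile
from collections import Counter


def duplicate_members(docx_file):
    """Return the sorted member names that occur more than once in the ZIP.

    Hypothetical helper for illustration; not part of python-docx.
    """
    with zipfile.ZipFile(docx_file) as zf:
        # infolist() keeps one entry per central-directory record, so a
        # name that was written twice shows up twice here.
        counts = Counter(info.filename for info in zf.infolist())
    return sorted(name for name, n in counts.items() if n > 1)


# Imitate the corrupt output: write docProps/core.xml twice.
buf = io.BytesIO()
with warnings.catch_warnings():
    # zipfile emits "UserWarning: Duplicate name: ..." on the second write.
    warnings.simplefilter("ignore", UserWarning)
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", "<document/>")
        zf.writestr("docProps/core.xml", "<first/>")
        zf.writestr("docProps/core.xml", "<second/>")
buf.seek(0)

duplicates = duplicate_members(buf)
print(duplicates)
```

Running `duplicate_members('Output.docx')` on a file saved from a LibreOffice-generated template would be expected to report `docProps/core.xml` if the problem described here is present.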
@bhavin-qryptal I had the same problem and I made a PR with a fix: #1436 |
Looks like this is a documented non-conformance, aka. "Normative Variation": I'd be interested to see what Word does when you load a file like this and change a core-property. I'm inclined to think it would convert the namespace to the I don't see how we could have aliases for a namespace. It's one thing to recognize that a part is already present and not add a new one, it would be something else to decide which namespace to use for the elements of that part at runtime. |
Attached are some test files: |
@cip91sk have you observed that new DOCX files produced by LibreOffice when choosing the I'm inclined to think the desirable behavior for This could happen when accessing the core-properties part, in the same spot where On access to
|
@scanny I added a commit to the PR that should do what you asked |
I can confirm that I get the error if I generate the template with LibreOffice, and I don't get it if I generate the template with MS Word or WordPad. I also don't get the error if I generate the file with MS Word or WordPad and later edit it with LibreOffice, overwriting the file. |
Changing the last-modified user through the core properties makes the document invalid for Word, and the last-modified user is still the old one.
If I look in core.xml, the cp:lastModifiedBy property is there twice.