Page 1 of 1

PPTX 2007 -- updating chart spreadsheet

PostPosted: Thu Jan 10, 2013 7:26 am
by cdobrem
I created the following PowerPoint 2007 slide, it has two slides. One a title slide and another with a TwoColumn chart.

I ran the pptx example EditEmbeddedCharts, updating the code to reflect the name difference of the embedded spreadsheet. The program completes with no ERRORS, but does have the following WARN

Code: Select all
...preceding messages excluded...

WARN org.docx4j.openpackaging.parts.Parts .put line 75 - Overwriting existing part /ppt/embeddings/Microsoft_Office_Excel_Worksheet1.xlsx



After running the program it creates a modified version with a different name. When I try and open the newly created powerpoint and inspect the modified spreadsheet, nothing appears. If I unzip the pptx and traverse down to the spreadsheet and open the Excel file it complains about the file being corrupted. I did execute the RoundTripTest on both the source and created powerpoint decks and didn't see any errors.

The name of the embedded spreadsheet filename is different than from the example used by EditEmbeddedChart code, but as far as I can tell that is the only difference.

I have attached the sample powerpoint that I am using.

Thanks in Advance.

Re: PPTX 2007 -- updating chart spreadsheet

PostPosted: Tue Jan 15, 2013 6:42 pm
by jason
Hi

Running your pptx through the sample produced output which opened fine in Powerpoint 2010.

This was using my current dev code, which is now on GitHub https://github.com/plutext/docx4j/commi ... 9e1bb91777

Could you please try again with that?

.. Jason

Re: PPTX 2007 -- updating chart spreadsheet

PostPosted: Wed Jan 16, 2013 11:36 am
by cdobrem
Jason,

Thanks for the reply, concidently I did a new pull yesterday from github and re-ran some tests to make sure I was getting consistent results.

Just to comment on your post, yes I was able to open the updated document in Office 2007-SP3, but when you right click on the chart and select "Edit Data", then nothing appears. I ended up performing the following set of tests trying to figure out what is wrong.

Test 1:
---------
1) Ran the EditEmbeddedChart code against my sample-2007.pptx document, creating a new file xxx-updated.pptx.
2) Opened "OUT_EditEmbeddedCharts-1358262369593.pptx" using the OpenXML productivity tool and performed "validate".

The following error appeared:

Related Part:
-------------
/ppt/embeddings/Microsoft_Office_Excel_Worksheet1.xlsx

Description:
------------
The package/part "PresentationDocument' cannot have a relationship that targets part "EmbeddedPackagePart{/ppt/embeddings/Microsoft_Office_Excel_Worksheet1.xlsx}"



3) Looked at the document tree in the left-hand side, and saw the two spreadsheet entries were present.

one under: Doc -> "ppt" -> "presentation" -> "slide2" -> "chart1" -> Microsoft_Office_Excel_Worksheet1.xlsx (part Id rId1)
another under: Doc -> "ppt" -> "embeddings" -> Microsoft_Office_Excel_Worksheet1.xlsx (partId rId5)

4) Opened the "source" pptx using OpenXML Productivity Tool and only see one entry under:

Doc -> "ppt" -> "presentation.xml" -> "slide2" -> "chart1" -> Microsoft_Office_Excel_Worksheet1.xlsx (part Id rId1)

5) Ran the RountTrip example on both pptx files everything appeared to work, no Errors in the output.


Test2:
--------

Modified EditEmbeddedChart code to the following:

Code: Select all
public static void main(String[] args) throws Docx4JException {

        // Input file
        String inputfilepath = System.getProperty("user.dir") + "/sample-docs/pptx/sample-2007-sp3-v1.pptx";
//        String inputfilepath = System.getProperty("user.dir") + "/sample-docs/pptx/pptx-chart.pptx";

        // Output file
        String outputfilepath = System.getProperty("user.dir")
                + "/OUT_ExploreCharts-"
                + System.currentTimeMillis() + ".pptx";
        String xlsxfile = System.getProperty("user.dir")
                + "/OUT_internal-xslx-"
                + System.currentTimeMillis() + ".xlsx";

        // Values to change
        Random rand = new Random();

        String firstValue = String.valueOf(rand.nextInt(99));
        String secondValue = String.valueOf(rand.nextInt(99));

        // Open the PPT template file
        PresentationMLPackage ppt = (PresentationMLPackage) OpcPackage
                .load(new java.io.File(inputfilepath));

        /*
         * Get the Chart object and update the values. Afterwards, we'll update
         * the associated spreadsheet so that the data is synchronized.
         */
        Chart chart = (Chart) ppt.getParts().get(new PartName("/ppt/charts/chart1.xml"));

        List<Relationship> relationships = chart.getRelationshipsPart().getRelationships().getRelationship();
        for (Relationship relation : relationships) {
            System.out.println("relation.id " + relation.getId());
            System.out.println("relation.target " + relation.getTarget());
            System.out.println("relation.type " + relation.getType());
            System.out.println("relation.value " + relation.getValue());
        }
        EmbeddedPackagePart epp = (EmbeddedPackagePart) chart.getRelationshipsPart().getPart("rId1");

        InputStream is = BufferUtil.newInputStream(epp.getBuffer());

        SpreadsheetMLPackage spreadSheet = (SpreadsheetMLPackage) SpreadsheetMLPackage.load(is);


        /*
         * Convert the Spreadsheet to a binary format, set it on the
         * EmbeddedPackagePart, add it back onto the deck and save to a file.
         * 
         */
        ByteArrayOutputStream baos = new ByteArrayOutputStream();

        SaveToZipFile saver = new SaveToZipFile(spreadSheet);

        saver.save(baos);
        epp.setBinaryData(baos.toByteArray());

        chart.addTargetPart(epp, "rId1");

        // Write the new file to disk
        ppt.save(new java.io.File(outputfilepath));

        spreadSheet.save(new java.io.File(xlsxfile));

        System.out.println("\n\n done .. saved " + outputfilepath);
    }
}

Files created: OUT_ExploreCharts-1358262412255.pptx and OUT_internal-xslx-1358262412255.xlsx

This example does not update the spreadsheet backing the chart, instead it extracts the spreadsheet associated with the chart and saves it to my local disk. I validate the extracted "xlsx file" using the OpenXML Productitivity tool and "no errors" are reported.

I open the extracted "xlsx file" using Excel, and I get the following error message:

"Excel found unreadable content in 'OUT_internal-xslx-1358262412255.xlsx'. Do you want to recover the contents of this workbook?"


Once opened it reports the following issues were corrected.

"Repaired Records: Table from /xl/tables/table1.xml part (Table)"



Observations:
----------------

When comparing the " /xl/tables/table1.xml" files between the "modified/correct xlsx" and the "xlsx" file extracted via the ExploreCharts code. The difference was in the the line:

Line from the Extracted Spreadsheet File (OUT_internal-xslx-1358262412255.xlsx):
------------------------------------------------------------------------------------------------------------------
Code: Select all
<tableStyleInfo showFirstColumn="false" showLastColumn="false" showRowStripes="true" showColumnStripes="false"/>



Excel appeared to correct this line (OUT_Mod.xlsx):
---------------------------------------------------------------------------
Code: Select all
<tableStyleInfo showFirstColumn="0" showLastColumn="0" showRowStripes="1" showColumnStripes="0"/>



The library appears to using boolean values and Excel is using numbers.

My apologies for the long post, but wanted to make sure you have all of my data gathered during the tests.

Thanks in Advance.

Re: PPTX 2007 -- updating chart spreadsheet

PostPosted: Wed Jan 16, 2013 9:54 pm
by jason
The problem with the part being attached incorrectly can be addressed by commenting out the line:

Code: Select all
ppt.addTargetPart(epp);


Thanks for noticing the boolean / numeric difference. I'll look into this and get back to you.

Re: PPTX 2007 -- updating chart spreadsheet

PostPosted: Sun Jan 20, 2013 3:48 pm
by jason
The xsd:boolean stuff was a red herring. Excel is happy with either format.

The actual problem is to do with the correspondence between the table part and sheet1.xml contents, which in this case depended in turn on the sharedStrings.xml part contents.

The problem was that there was an xml:space="preserve" in the shared strings part, which was required in order for Excel to be able to read the file. That attribute however, is not present in the relevant ECMA 376 schema (up to edition 4), so JAXB was dropping it.

So I added that attribute to the XSD, and at the same time, used XJC to regenerate the sml classes from the 4th edition. I also added the parent pointer.

The EditEmbeddedCharts sample now uses STCellType.STR for its new values, avoiding the shared strings table. See http://openxmldeveloper.org/blog/b/open ... heets.aspx

All should be good with https://github.com/plutext/docx4j/commi ... 2f493bfdee