Summary:
The main idea behind our simple program is that a .pptx file is just a ZIP archive (you can try unzipping one yourself).
Inside the archive there are many XML files, so we need to find the ones that contain the text.
Note: you can use SharpZipLib to unzip files in your project.
Get it from http://icsharpcode.net/OpenSource/SharpZipLib
I found that the text is in %file%/ppt/slides, where %file% stands for the unzipped .pptx.
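To check this from code rather than by unzipping by hand, the archive entries can be listed directly. This is only a sketch under assumptions: it uses .NET's built-in System.IO.Compression (available from .NET 4.5) instead of SharpZipLib, and the names PptxInspector and ListSlideEntries are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of the original program.
static class PptxInspector
{
    // Returns the names of the slide XML entries (ppt/slides/slideN.xml)
    // found in a .pptx stream.
    public static List<string> ListSlideEntries(Stream pptx)
    {
        var names = new List<string>();
        // leaveOpen: true so the caller stays in charge of the stream
        using (var zip = new ZipArchive(pptx, ZipArchiveMode.Read, true))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                // skips ppt/slides/_rels/... and everything outside ppt/slides
                if (entry.FullName.StartsWith("ppt/slides/slide", StringComparison.Ordinal)
                    && entry.FullName.EndsWith(".xml", StringComparison.Ordinal))
                {
                    names.Add(entry.FullName);
                }
            }
        }
        return names;
    }
}
```

Calling it with File.OpenRead on a presentation and printing each returned name should show one entry per slide file.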
The XML tag that contains the text data is "a:t", so we need an XML reader to read this data and write it to our rich text box.
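Here is a small sketch of that reading step on its own, working on an XML string so it can be tried without a real presentation. The names SlideText and ExtractRuns are made up for this example. One difference from a plain name check is an assumption on my part: it matches the element by its DrawingML namespace instead of the literal prefix "a", so it still works if a file binds that namespace to another prefix.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original program.
static class SlideText
{
    // Namespace that the "a" prefix is normally bound to in slide XML.
    const string DrawingNs = "http://schemas.openxmlformats.org/drawingml/2006/main";

    // Collects the content of every a:t element in the given slide XML.
    public static List<string> ExtractRuns(string slideXml)
    {
        var runs = new List<string>();
        using (XmlReader rdr = XmlReader.Create(new StringReader(slideXml)))
        {
            while (!rdr.EOF)
            {
                if (rdr.NodeType == XmlNodeType.Element
                    && rdr.LocalName == "t" && rdr.NamespaceURI == DrawingNs)
                {
                    // ReadElementContentAsString already moves past the end tag,
                    // so we must not call Read() again on this pass
                    runs.Add(rdr.ReadElementContentAsString());
                }
                else
                {
                    rdr.Read();
                }
            }
        }
        return runs;
    }
}
```

Given a slide whose paragraph holds two runs with the texts Hello and World, ExtractRuns returns those two strings in document order.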
Then I added some dialogs to turn it into a real program.
The main code:
    // needs: using System; using System.IO; using System.Xml;
    //        using System.Windows.Forms; using ICSharpCode.SharpZipLib.Zip;
    try
    {
        // instance of the FastZip helper from SharpZipLib
        FastZip unzip = new FastZip();
        // unzip to a fresh subfolder of the Windows temp folder, so slide files
        // left over from an earlier run are not counted by mistake
        string tmploc = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        // we only need the ppt/slides folder, NOT the whole archive (kinder to slow computers)
        unzip.ExtractZip(openFileDialog1.FileName, tmploc, "ppt/slides");
        // folder that now holds slide1.xml, slide2.xml, ...
        string slidesDir = Path.Combine(tmploc, @"ppt\slides");
        // count the XML files once; the loop stops after reaching the last one
        int slideCount = Directory.GetFiles(slidesDir, "*.xml").Length;
        for (int i = 1; i <= slideCount; i++)
        {
            // the file name changes on every pass to get the next slide;
            // "using" closes the reader even if reading throws
            using (XmlReader rdr = XmlReader.Create(Path.Combine(slidesDir, "slide" + i + ".xml")))
            {
                while (rdr.Read())
                {
                    // we only care about element nodes with the tag ( a:t )
                    if (rdr.NodeType == XmlNodeType.Element && rdr.Name == "a:t")
                    {
                        // read the element content as a string and add it to the rich text box
                        textdata.Text += rdr.ReadElementContentAsString() + "\n";
                    }
                }
            }
        }
    }
    // catch any error and show a message to the user instead of terminating the program
    catch (Exception err) { MessageBox.Show(err.Message); }
Thursday, March 26, 2009
PPTX extractor code
Labels: Simple Programs