xml-packages-example.txt

---
jupytext:
  cell_metadata_filter: -all
  formats: md:myst
  text_representation:
    extension: .md
    format_name: myst
    format_version: 0.13
    jupytext_version: 1.11.5
kernelspec:
  display_name: Python 3
  language: python
  name: python3
---

# Lesson ?: Practical session: Working with XMLX, ElemenTree and Beautiful Soup

In this lesson, we are going to explore how the different packages work. We use the same example file that was used in lesson 2 (click here to download the file).


This lesson is divided intro three sessions, where every session works with one of the Python packages that was introduced in lesson ?.
With every package, we follow these steps:
- Open the XML file
- Display the structure of the XML file
- Extraxt the booktitles
- Extract name and surname of the author
- Store the information in a .csv of .txt file

## ?a: The ElemenTree

ElemenTree is part of the standard library and therefore does not need to be installed.

Before we can use the package, we have to let Python know we want to use it. We do this by importing the package.
Type in a code cell:

```
jupyter-book myst init path/to/markdownfile.md
```


Now, we want to open the XML file from which we want to extract information. 
Add a new code cell and type:

```
tree = ET.parse('path_to_file/name_file.xml')
root = tree.getroot()
```
```{note}
In the code above, alter the 'path_to_file/name_file' with the path to the folder and the filename. 
For example: D:\Projects\XML workshop\data\example.xml 
```

When you want to extract information from an XML file, it is important that you are familair with the structure of the file. 
There are two ways to do this.