Parsing big XML files in Python is hard. On the one hand, regular XML libraries load the
whole file into memory, which can crash the process if the file is too big. On the other
hand, streaming solutions such as iterparse do read the file as they parse it, but they are
complex to use if you don't want to run out of memory.
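
To illustrate the kind of manual bookkeeping involved, here is a minimal sketch using `iterparse` from the standard library's `xml.etree.ElementTree`. The helper name `iter_item_texts`, the `item` tag, and the sample document are assumptions made up for this example, and the approach assumes the elements of interest are direct children of the root.

```python
import io
import xml.etree.ElementTree as ET


def iter_item_texts(stream, tag="item"):
    """Yield the text of each <tag> element while trying to keep memory bounded.

    Hypothetical helper for illustration; assumes <tag> elements are direct
    children of the root element.
    """
    # Request "start" events as well, so the root element can be captured first.
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem.text
            # Without this call, every parsed element stays attached to the
            # root and memory use grows with the size of the file.
            root.clear()


# Any binary stream works here, not only a file on disk.
sample = io.BytesIO(b"<root><item>a</item><item>b</item></root>")
texts = list(iter_item_texts(sample))
```

Forgetting the `clear()` call, or clearing at the wrong depth, quietly brings the memory problem back.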
This is where the BigXML library shines:
- Works with XML files of any size
- No need to do memory management yourself
- Pythonic API
- Any stream can easily be parsed, not just files
- Secure against the usual attacks on XML parsers