Bigjson - Read JSON files of any size with Python
Published on Nov. 10, 2020
I was using Python to parse the Wikidata dumps, which are tens of gigabytes of JSON. I thought it would be nice if I could open the JSON even though it does not fit in memory. That is not possible with the built-in json module, so I made this library. It reads JSON files of any size.
Installing
The library is available on PyPI, so you can install it by typing:
pip install bigjson
Example
The library works somewhat like the built-in Python json module. Here is an example where a 78 GB JSON file is opened and read:
import bigjson

with open('wikidata-latest-all.json', 'rb') as f:
    j = bigjson.load(f)
    element = j[4]
    print(element['type'])
    print(element['id'])
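For comparison, here is a minimal sketch of the same access pattern using only the built-in json module. The file name and its contents are made up for illustration and stand in for a real dump. The point is that json.load has to parse the whole file into Python objects before the fifth element can be read, which is fine for a tiny file like this one but not for a file larger than memory.

```python
import json
import os
import tempfile

# Hypothetical sample data standing in for a (much larger) dump.
sample = [{"type": "item", "id": f"Q{n}"} for n in range(1, 7)]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)

    # json.load reads and parses the entire file before returning anything.
    with open(path, "rb") as f:
        data = json.load(f)

# Same indexing as in the bigjson example, but on fully loaded data.
element = data[4]
print(element["type"])
print(element["id"])
```

The indexing looks the same in both versions. The difference is when the file gets read: all at once here, and only as needed in the bigjson example.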