Spark offers a very convenient way to read JSON data, but that convenience has performance implications when the JSON files are very large. Let's look at them.
Let’s assume we have a JSON file with records like:
{"a":1, "b":3, "c":7}
{"a":11, "b":13, "c":17}
{"a":31, "b":33, "c":37, "d":71}
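One thing worth noticing in these records is that the field "d" appears only in the last one, so the full set of columns cannot be known from the first record alone. The plain-Python sketch below (not Spark code) makes that concrete: it parses the three sample records and collects the union of their field names. The names `SAMPLE` and `union_of_fields` are hypothetical, chosen only for this illustration, and the one-object-per-line layout is an assumption about how the file is stored.

```python
import json

# The three sample records, assumed to be stored one JSON object per line.
SAMPLE = """\
{"a":1, "b":3, "c":7}
{"a":11, "b":13, "c":17}
{"a":31, "b":33, "c":37, "d":71}
"""


def union_of_fields(lines):
    """Return every field name seen in any record, in first-seen order."""
    fields = {}  # dict used as an ordered set
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        for key in json.loads(line):
            fields.setdefault(key, None)
    return list(fields)


# Parse each non-blank line into a dict.
records = [json.loads(line) for line in SAMPLE.splitlines() if line.strip()]

# Every record has to be visited before the column list is complete.
fields = union_of_fields(SAMPLE.splitlines())

# Align each record to the full column list; absent fields become None.
rows = [tuple(record.get(name) for name in fields) for record in records]

print(fields)
for row in rows:
    print(row)
```

The point of the sketch is only that the column list is complete after every record has been visited. When no schema is supplied, Spark's JSON reader likewise has to go through the input to determine one, although its actual inference also resolves types and is considerably more involved than this.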