When I try to read a Parquet folder that is currently being written by another Spark Streaming job, using the option "mergeSchema": "true", I get an error:
java.io.IOException: Could not read footer for file
Without schema merging I can read the folder fine. Is it possible to read such a folder with schema merging enabled, even while other jobs are updating it?
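For reference, here is a minimal PySpark sketch of the two reads being compared. The helper name `read_parquet_folder` and the `demo` function are hypothetical, and `hdfs://path.parquet` is just the anonymized folder from the exception, so treat this as an illustration of the setup rather than the exact job code.

```python
def read_parquet_folder(spark, path, merge_schema=True):
    """Read a Parquet folder, optionally merging the schemas of all part files.

    `spark` is expected to be an active SparkSession. With merge_schema=True,
    Spark reads the footers of the part files to build a combined schema,
    which is the step that raises "Could not read footer for file" here.
    """
    reader = spark.read.option("mergeSchema", "true" if merge_schema else "false")
    return reader.parquet(path)


def demo():
    # Hypothetical driver; needs pyspark installed and access to the folder.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("merge-schema-read").getOrCreate()

    # Works while the streaming job is writing:
    read_parquet_folder(spark, "hdfs://path.parquet", merge_schema=False).printSchema()

    # Fails with java.io.IOException while the streaming job is writing:
    read_parquet_folder(spark, "hdfs://path.parquet", merge_schema=True).printSchema()
```

Calling `demo()` against a folder that a streaming job is actively writing to should show the difference between the two reads.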
Full exception:
java.io.IOException: Could not read footer for file: FileStatus{path=hdfs://path.parquet/part-00000-20199ef6-4ff8-4ee0-93cc-79d47d2da37d-c000.snappy.parquet; isDirectory=false; length=0; replication=0; …