Abstract: Big Data systems such as Apache Spark and Hadoop are cornerstones of large-scale data processing. However, they are used in only one or a few steps of a larger data processing pipeline ...