The introduction of Apache Arrow in Spark makes it possible to evaluate Python UDFs as vectorized functions. Beyond the performance benefits of vectorization, this also opens up more possibilities by using Pandas for the input and output of the UDF. This post shows some details of ongoing work I have been doing in this area and how to put it to use.