Qubole, the cloud big data-as-a-service company, and Snowflake Computing, the only data warehouse built for the cloud, recently announced a new partnership that enables organizations to use Apache Spark in Qubole with data stored in Snowflake. With the new integration between cloud services, data teams can build, train and put into production powerful machine learning (ML) and artificial intelligence (AI) models in Spark using information stored in Snowflake. The integration also enables data engineers to use Qubole to read and write data in Snowflake for advanced data preparation such as data wrangling, data augmentation and advanced ETL to refine existing Snowflake data sets.

Qubole provides an enterprise cloud data platform for all types of big data workloads. It enables organizations to operationalize data lakes through infrastructure automation and cloud optimizations for leading open source engines. The new partnership automates the connection between Qubole and Snowflake, eliminating the complexity of manually configuring Spark and reducing the time to train and deploy ML and AI models with Snowflake data. This integration also provides one-time, secure credential management between Qubole and Snowflake, and access to Snowflake data through Scala and Python via Qubole’s DataFrame API for Apache Spark.
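
For readers who want a sense of what DataFrame-based access to Snowflake from Spark can look like in Python, the following is a minimal sketch. It assumes the open-source Snowflake Connector for Spark (data source name `net.snowflake.spark.snowflake`) rather than Qubole's own integration, whose exact API is not described in this announcement. The helper names and all account values are hypothetical placeholders.

```python
# Sketch only: based on the open-source Snowflake Connector for Spark,
# not on Qubole's own wrapper. Helper names are hypothetical.
SNOWFLAKE_SOURCE = "net.snowflake.spark.snowflake"


def snowflake_options(account_url, user, password, database, schema, warehouse):
    """Build the option map that the Snowflake Spark connector reads."""
    return {
        "sfURL": account_url,
        "sfUser": user,
        "sfPassword": password,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def read_table(spark, options, table):
    """Load a Snowflake table as a Spark DataFrame (for example, as model input)."""
    return (
        spark.read.format(SNOWFLAKE_SOURCE)
        .options(**options)
        .option("dbtable", table)
        .load()
    )


def write_table(df, options, table, mode="overwrite"):
    """Write a Spark DataFrame (for example, model output) back to Snowflake."""
    (
        df.write.format(SNOWFLAKE_SOURCE)
        .options(**options)
        .option("dbtable", table)
        .mode(mode)
        .save()
    )
```

In a notebook with an active `SparkSession` and the connector on the classpath, a call such as `read_table(spark, opts, "SOME_TABLE")` would return a DataFrame that can feed Spark ML, and `write_table` would store the results back in Snowflake. With the managed integration described above, the credentials would not need to be passed by hand in this way.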

Businesses are increasingly looking to build a cloud-based data infrastructure to gain agility, scale broader analytics capabilities, and lower cost of ownership. At the same time, moving data warehouse infrastructures to the cloud and building data lakes dramatically improve organizations’ performance, concurrency and simplicity. With this announcement, enterprises have the best of both worlds: a simple, out-of-the-box integration between Qubole and Snowflake that offers a more performant, more cost-effective, and more proven solution than the alternatives on the market.

“For the first time our joint customers can leverage the computing power of Apache Spark and Snowflake together in an optimized environment. Customers can now benefit through our integration with Qubole to transform an organization’s decision making abilities through more advanced analytics. Our customers who have been asking for additional ML capabilities can now build unique models to improve their business. We believe so much in this partnership and the value it brings to our customers that Snowflake is already using Qubole internally to augment our security capabilities using ML techniques,” said Walter Aldana, Snowflake Vice President of Alliances.

“Enabling AI and Machine Learning is a strategic imperative for every company,” Qubole COO Kevin Kennedy said. “The fastest path to success is combining Qubole’s cloud-optimized Apache Spark implementation with data stored in Snowflake to get frictionless deployment, lowest TCO, and unlimited scale.”

ReturnPath specializes in optimizing email marketing and is a joint customer of Qubole and Snowflake. “The integration between Qubole and Snowflake is an integral enhancement to our data platform and overall data flow. It should have an immediate impact on the speed and performance of Qubole-based reporting and modeling. Through the beta we were able to leverage Qubole’s Apache Spark capabilities for machine learning algorithms while using data that resides in Snowflake, dramatically speeding up the time to build and implement models, while maintaining Snowflake as our datastore for model inputs and outputs. Additionally, this integration enables us to seamlessly leverage the value of Qubole-based reporting dashboards alongside the speed of Snowflake’s data pre-processing. These new capabilities formed through this partnership allow us to gain greater value from our data platforms and, in turn, help us deliver greater value to our customers,” said Anna Sheets, Manager of Data Science and Analytics at ReturnPath.