Posts

Showing posts from July, 2018
Bulk Data Load Using Apache Beam JDBC

Sometimes we have to load data from a relational database into BigQuery. In this article I will describe how to do this using MySQL as the data source, Apache Beam, and Google Dataflow. Moreover, we will do it in parallel. Let's start.

First of all, we will create an Apache Beam project (see the Apache Beam quick start guide). Then we will go step by step:

1) Add the dependency on the Beam JDBC module:

<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-io-jdbc</artifactId>
  <version>${beam.version}</version>
</dependency>

2) Because we will load data in parallel, it's a good idea to have a connection pool. I chose the c3p0 connection pool:

<dependency>
  <groupId>c3p0</groupId>
  <artifactId>c3p0</artifactId>
  <version>0.9.1</version>
...
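
The excerpt above stops before showing how the load is parallelized. One common approach, assumed here and not confirmed by the post, is to split a numeric primary key range into contiguous chunks and issue one bounded query per chunk, so that each worker takes its own connection from the pool. The sketch below is plain Java with no Beam API in it; the class `RangePartitioner`, the table `orders`, and the column `id` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of Apache Beam or c3p0.
class RangePartitioner {

    // Splits the inclusive id range [minId, maxId] into at most `partitions`
    // contiguous half-open ranges {from, to}; chunk sizes differ by at most one.
    static List<long[]> split(long minId, long maxId, int partitions) {
        if (partitions <= 0 || maxId < minId) {
            throw new IllegalArgumentException("need partitions > 0 and maxId >= minId");
        }
        long total = maxId - minId + 1;
        long base = total / partitions;
        long remainder = total % partitions;
        List<long[]> ranges = new ArrayList<>();
        long start = minId;
        for (int i = 0; i < partitions; i++) {
            // The first `remainder` chunks take one extra id each.
            long size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                continue; // more partitions than ids: skip empty chunks
            }
            ranges.add(new long[] {start, start + size});
            start += size;
        }
        return ranges;
    }

    // Builds the bounded query for one chunk. Table and column names are
    // concatenated directly, so they must be trusted constants, never user input.
    static String toQuery(String table, String idColumn, long[] range) {
        return "SELECT * FROM " + table
                + " WHERE " + idColumn + " >= " + range[0]
                + " AND " + idColumn + " < " + range[1];
    }

    public static void main(String[] args) {
        // Assumed example: ids 1..10 split across 3 workers.
        for (long[] range : split(1, 10, 3)) {
            System.out.println(toQuery("orders", "id", range));
        }
    }
}
```

Running `main` prints three queries covering `id >= 1 AND id < 5`, `id >= 5 AND id < 8`, and `id >= 8 AND id < 11`. Half-open ranges keep the chunks from overlapping, so no row is read twice. In a real pipeline each range would become one element that a JDBC read step turns into a query.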