<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<!--[if IE]><meta http-equiv="X-UA-Compatible" content="IE=edge"><![endif]-->
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="generator" content="Asciidoctor 1.5.8">
<title>BigQuery</title>
<link rel="stylesheet" href="css/spring.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">

<style>
.hidden {
  display: none;
}

.switch {
  border-width: 1px 1px 0 1px;
  border-style: solid;
  border-color: #7a2518;
  display: inline-block;
}

.switch--item {
  padding: 10px;
  background-color: #ffffff;
  color: #7a2518;
  display: inline-block;
  cursor: pointer;
}

.switch--item:not(:first-child) {
  border-width: 0 0 0 1px;
  border-style: solid;
  border-color: #7a2518;
}

.switch--item.selected {
  background-color: #7a2519;
  color: #ffffff;
}
</style>
<script src="https://cdnjs.cloudflare.com/ajax/libs/zepto/1.2.0/zepto.min.js"></script>
<script type="text/javascript">
function addBlockSwitches() {
  $('.primary').each(function() {
    primary = $(this);
    createSwitchItem(primary, createBlockSwitch(primary)).item.addClass("selected");
    primary.children('.title').remove();
  });
  $('.secondary').each(function(idx, node) {
    secondary = $(node);
    primary = findPrimary(secondary);
    switchItem = createSwitchItem(secondary, primary.children('.switch'));
    switchItem.content.addClass('hidden');
    findPrimary(secondary).append(switchItem.content);
    secondary.remove();
  });
}

function createBlockSwitch(primary) {
  blockSwitch = $('<div class="switch"></div>');
  primary.prepend(blockSwitch);
  return blockSwitch;
}

function findPrimary(secondary) {
  candidate = secondary.prev();
  while (!candidate.is('.primary')) {
    candidate = candidate.prev();
  }
  return candidate;
}

function createSwitchItem(block, blockSwitch) {
  blockName = block.children('.title').text();
  content = block.children('.content').first().append(block.next('.colist'));
  item = $('<div class="switch--item">' + blockName + '</div>');
  item.on('click', '', content, function(e) {
    $(this).addClass('selected');
    $(this).siblings().removeClass('selected');
    e.data.siblings('.content').addClass('hidden');
    e.data.removeClass('hidden');
  });
  blockSwitch.append(item);
  return {'item': item, 'content': content};
}

$(addBlockSwitches);
</script>

</head>
<body class="book toc2 toc-left">
<div id="header">
<div id="toc" class="toc2">
<div id="toctitle">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_bigquery">BigQuery</a>
<ul class="sectlevel2">
<li><a href="#_configuration">Configuration</a></li>
<li><a href="#_spring_integration">Spring Integration</a></li>
<li><a href="#_sample">Sample</a></li>
</ul>
</li>
</ul>
</div>
</div>
<div id="content">
<div class="sect1">
<h2 id="_bigquery"><a class="link" href="#_bigquery">BigQuery</a></h2>
<div class="sectionbody">
<div class="paragraph">
<p><a href="https://cloud.google.com/bigquery">Google Cloud BigQuery</a> is a fully managed, petabyte-scale, low-cost analytics data warehouse.</p>
</div>
<div class="paragraph">
<p>Spring Cloud GCP provides:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>A convenience starter that autoconfigures the <a href="https://googleapis.dev/java/google-cloud-clients/latest/com/google/cloud/bigquery/BigQuery.html"><code>BigQuery</code></a> client object with the credentials needed to interface with BigQuery.</p>
</li>
<li>
<p>A Spring Integration message handler for loading data into BigQuery tables in your Spring Integration pipelines.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Maven coordinates, using the <a href="getting-started.html#_bill_of_materials">Spring Cloud GCP BOM</a>:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlightjs highlight"><code class="language-xml hljs" data-lang="xml">&lt;dependency&gt;
    &lt;groupId&gt;org.springframework.cloud&lt;/groupId&gt;
    &lt;artifactId&gt;spring-cloud-gcp-bigquery-starter&lt;/artifactId&gt;
&lt;/dependency&gt;</code></pre>
</div>
</div>
<div class="paragraph">
<p>Gradle coordinates:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlightjs highlight"><code>dependencies {
    compile group: 'org.springframework.cloud', name: 'spring-cloud-gcp-bigquery-starter'
}</code></pre>
</div>
</div>
<div class="sect2">
<h3 id="_configuration"><a class="link" href="#_configuration">Configuration</a></h3>
<div class="paragraph">
<p>The following application properties can be configured for the Spring Cloud GCP BigQuery libraries.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 25%;">
<col style="width: 25%;">
<col style="width: 25%;">
<col style="width: 25%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">Name</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Description</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Required</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Default value</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>spring.cloud.gcp.bigquery.datasetName</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">The BigQuery dataset that the <code>BigQueryTemplate</code> and <code>BigQueryFileMessageHandler</code> are scoped to.</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Yes</p></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>spring.cloud.gcp.bigquery.enabled</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Enables or disables Spring Cloud GCP BigQuery autoconfiguration.</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">No</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>true</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>spring.cloud.gcp.bigquery.project-id</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">GCP project ID of the project using BigQuery APIs, if different from the one in the <a href="#spring-cloud-gcp-core">Spring Cloud GCP Core Module</a>.</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">No</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Project ID is typically inferred from <a href="https://cloud.google.com/sdk/gcloud/reference/config/set"><code>gcloud</code></a> configuration.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>spring.cloud.gcp.bigquery.credentials.location</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Credentials file location for authenticating with the Google Cloud BigQuery APIs, if different from the one in the <a href="#spring-cloud-gcp-core">Spring Cloud GCP Core Module</a>.</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">No</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Inferred from <a href="https://cloud.google.com/docs/authentication/production">Application Default Credentials</a>, typically set by <a href="https://cloud.google.com/sdk/gcloud/reference/auth/application-default"><code>gcloud</code></a>.</p></td>
</tr>
</tbody>
</table>
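<div class="paragraph">
<p>As a minimal sketch, the properties above might be set in <code>application.properties</code> as follows. The dataset name, project ID, and credentials path are placeholder values chosen for illustration, not defaults; only <code>datasetName</code> is required.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlightjs highlight"><code class="language-properties hljs" data-lang="properties"># Required: dataset that BigQueryTemplate and BigQueryFileMessageHandler are scoped to (placeholder name)
spring.cloud.gcp.bigquery.datasetName=my_dataset

# Optional overrides; omit them to fall back to the Core Module and gcloud settings (placeholder values)
spring.cloud.gcp.bigquery.project-id=my-gcp-project
spring.cloud.gcp.bigquery.credentials.location=file:/path/to/service-account.json</code></pre>
</div>
</div>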
<div class="sect3">
<h4 id="_bigquery_client_object"><a class="link" href="#_bigquery_client_object">BigQuery Client Object</a></h4>
<div class="paragraph">
<p>The <code>GcpBigQueryAutoConfiguration</code> class configures an instance of <a href="https://googleapis.dev/java/google-cloud-clients/latest/com/google/cloud/bigquery/BigQuery.html"><code>BigQuery</code></a> for you by inferring your credentials and Project ID from the machine’s environment.</p>
</div>
<div class="paragraph">
<p>Example usage:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlightjs highlight"><code class="language-java hljs" data-lang="java">// BigQuery client object provided by our autoconfiguration.
@Autowired
BigQuery bigquery;

public void runQuery() throws InterruptedException {
  String query = "SELECT column FROM table;";
  QueryJobConfiguration queryConfig =
      QueryJobConfiguration.newBuilder(query).build();

  // Run the query using the BigQuery object
  for (FieldValueList row : bigquery.query(queryConfig).iterateAll()) {
    for (FieldValue val : row) {
      System.out.println(val);
    }
  }
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>This object is used to interface with all BigQuery services.
For more information, see the <a href="https://cloud.google.com/bigquery/docs/reference/libraries#using_the_client_library">BigQuery Client Library usage examples</a>.</p>
</div>
</div>
<div class="sect3">
<h4 id="_bigquerytemplate"><a class="link" href="#_bigquerytemplate">BigQueryTemplate</a></h4>
<div class="paragraph">
<p>The <code>BigQueryTemplate</code> class is a wrapper over the <code>BigQuery</code> client object and makes it easier to load data into BigQuery tables.
A <code>BigQueryTemplate</code> is scoped to a single dataset.
The autoconfigured <code>BigQueryTemplate</code> instance uses the dataset provided through the property <code>spring.cloud.gcp.bigquery.datasetName</code>.</p>
</div>
<div class="paragraph">
<p>The following snippet loads CSV data from an <code>InputStream</code> into a BigQuery table.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlightjs highlight"><code class="language-java hljs" data-lang="java">// BigQueryTemplate object provided by our autoconfiguration.
@Autowired
BigQueryTemplate bigQueryTemplate;

public void loadData(InputStream dataInputStream, String tableName)
    throws InterruptedException, ExecutionException {
  ListenableFuture&lt;Job&gt; bigQueryJobFuture =
      bigQueryTemplate.writeDataToTable(
          tableName,
          dataInputStream,
          FormatOptions.csv());

  // get() blocks until the load job completes; once it returns, the data is loaded.
  Job job = bigQueryJobFuture.get();
}</code></pre>
</div>
</div>
</div>
</div>
<div class="sect2">
<h3 id="_spring_integration"><a class="link" href="#_spring_integration">Spring Integration</a></h3>
<div class="paragraph">
<p>Spring Cloud GCP BigQuery also provides a Spring Integration message handler, <code>BigQueryFileMessageHandler</code>.
It is useful for incorporating BigQuery data loading operations into a Spring Integration pipeline.</p>
</div>
<div class="paragraph">
<p>Below is an example that configures a <code>ServiceActivator</code> bean using the <code>BigQueryFileMessageHandler</code>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlightjs highlight"><code class="language-java hljs" data-lang="java">@Bean
public DirectChannel bigQueryWriteDataChannel() {
  return new DirectChannel();
}

@Bean
public DirectChannel bigQueryJobReplyChannel() {
  return new DirectChannel();
}

@Bean
@ServiceActivator(inputChannel = "bigQueryWriteDataChannel")
public MessageHandler messageSender(BigQueryTemplate bigQueryTemplate) {
  BigQueryFileMessageHandler messageHandler = new BigQueryFileMessageHandler(bigQueryTemplate);
  messageHandler.setFormatOptions(FormatOptions.csv());
  messageHandler.setOutputChannel(bigQueryJobReplyChannel());
  return messageHandler;
}</code></pre>
</div>
</div>
<div class="sect3">
<h4 id="_bigquery_message_handling"><a class="link" href="#_bigquery_message_handling">BigQuery Message Handling</a></h4>
<div class="paragraph">
<p>The <code>BigQueryFileMessageHandler</code> accepts the following message payload types for loading into BigQuery: <code>java.io.File</code>, <code>byte[]</code>, <code>org.springframework.core.io.Resource</code>, and <code>java.io.InputStream</code>.
The message payload is streamed and written to the BigQuery table you specify.</p>
</div>
<div class="paragraph">
<p>By default, the <code>BigQueryFileMessageHandler</code> reads the headers of the messages it receives to determine how to load the data.
The headers are defined in the class <code>BigQuerySpringMessageHeaders</code> and summarized below.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">Header</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Description</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>BigQuerySpringMessageHeaders.TABLE_NAME</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Specifies the BigQuery table within your dataset to write to.</p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>BigQuerySpringMessageHeaders.FORMAT_OPTIONS</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock">Describes the format of the data to load (e.g., CSV or JSON).</p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>Alternatively, you may omit these headers and explicitly set the table name or format options by calling <code>setTableName(…)</code> and <code>setFormatOptions(…)</code>.</p>
</div>
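<div class="paragraph">
<p>The header-based approach can be sketched as follows. This is an illustrative fragment rather than code from the library: the <code>sendCsv</code> method and the table name <code>my_table</code> are hypothetical, the channel is assumed to be the <code>bigQueryWriteDataChannel</code> bean from the example above, and the import for <code>BigQuerySpringMessageHeaders</code> is left out because its package depends on your Spring Cloud GCP version.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlightjs highlight"><code class="language-java hljs" data-lang="java">import com.google.cloud.bigquery.FormatOptions;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.messaging.Message;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.support.MessageBuilder;

// Channel consumed by the BigQueryFileMessageHandler service activator.
@Autowired
@Qualifier("bigQueryWriteDataChannel")
MessageChannel bigQueryWriteDataChannel;

// Hypothetical helper: wraps CSV bytes in a message and tells the handler where to load them.
public void sendCsv(byte[] csvData) {
  Message&lt;byte[]&gt; message = MessageBuilder.withPayload(csvData)
      // "my_table" is a placeholder table name within the configured dataset.
      .setHeader(BigQuerySpringMessageHeaders.TABLE_NAME, "my_table")
      .setHeader(BigQuerySpringMessageHeaders.FORMAT_OPTIONS, FormatOptions.csv())
      .build();
  bigQueryWriteDataChannel.send(message);
}</code></pre>
</div>
</div>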
</div>
<div class="sect3">
<h4 id="_bigquery_message_reply"><a class="link" href="#_bigquery_message_reply">BigQuery Message Reply</a></h4>
<div class="paragraph">
<p>After the <code>BigQueryFileMessageHandler</code> processes a message to load data into your BigQuery table, it responds with a <code>Job</code> on the reply channel.
The <a href="https://googleapis.dev/java/google-cloud-clients/latest/index.html?com/google/cloud/bigquery/package-summary.html">Job object</a> provides metadata and information about the file load operation.</p>
</div>
<div class="paragraph">
<p>By default, the <code>BigQueryFileMessageHandler</code> runs in asynchronous mode, with <code>setSync(false)</code>, and replies with a <code>ListenableFuture&lt;Job&gt;</code> on the reply channel.
The future is tied to the status of the data loading job and completes when the job completes.</p>
</div>
<div class="paragraph">
<p>If the handler runs in synchronous mode, with <code>setSync(true)</code>, it blocks until the loading job is complete.</p>
</div>
<div class="admonitionblock note">
<table>
<tr>
<td class="icon">
<i class="fa icon-note" title="Note"></i>
</td>
<td class="content">
If you decide to use Spring Integration Gateways and you wish to receive a <code>ListenableFuture&lt;Job&gt;</code> as the reply object in the Gateway, you will have to call <code>.setAsyncExecutor(null)</code> on your <code>GatewayProxyFactoryBean</code>.
This is needed to indicate that you wish to rely on the handler's built-in async support rather than on the async handling of the gateway.
</td>
</tr>
</table>
</div>
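<div class="paragraph">
<p>A gateway configured this way might look like the sketch below. The interface name <code>BigQueryFileGateway</code> and its method signature are illustrative assumptions, and the channel beans are the ones defined in the <code>ServiceActivator</code> example above; imports are omitted, as in the other snippets.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlightjs highlight"><code class="language-java hljs" data-lang="java">// Hypothetical gateway interface; the name and method signature are illustrative.
public interface BigQueryFileGateway {
  ListenableFuture&lt;Job&gt; writeToBigQueryTable(
      byte[] csvData, @Header(BigQuerySpringMessageHeaders.TABLE_NAME) String tableName);
}

@Bean
public GatewayProxyFactoryBean gatewayProxyFactoryBean() {
  GatewayProxyFactoryBean factoryBean = new GatewayProxyFactoryBean(BigQueryFileGateway.class);
  factoryBean.setDefaultRequestChannel(bigQueryWriteDataChannel());
  factoryBean.setDefaultReplyChannel(bigQueryJobReplyChannel());
  // Disable the gateway's own async handling so the handler's ListenableFuture is returned as-is.
  factoryBean.setAsyncExecutor(null);
  return factoryBean;
}</code></pre>
</div>
</div>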
</div>
</div>
<div class="sect2">
<h3 id="_sample"><a class="link" href="#_sample">Sample</a></h3>
<div class="paragraph">
<p>A BigQuery <a href="https://github.com/spring-cloud/spring-cloud-gcp/tree/master/spring-cloud-gcp-samples/spring-cloud-gcp-bigquery-sample">sample application</a> is available.</p>
</div>
</div>
</div>
</div>
</div>
<script type="text/javascript" src="js/tocbot/tocbot.min.js"></script>
<script type="text/javascript" src="js/toc.js"></script>
<link rel="stylesheet" href="js/highlight/styles/atom-one-dark-reasonable.min.css">
<script src="js/highlight/highlight.min.js"></script>
<script>hljs.initHighlighting()</script>
</body>
</html>