June 9 – June 13
- Introducing Pub/Sub Single Message Transforms (SMTs): SMTs make it easy to perform simple data transformations, such as validating, filtering, enriching, and altering individual messages as they move in real time, right within Pub/Sub. The first SMT is available now: JavaScript User-Defined Functions (UDFs), which allow you to perform simple, lightweight modifications to message attributes and/or the data directly within Pub/Sub via snippets of JavaScript code. Learn more in the launch blog.
- Serverless Spark is now generally available directly within BigQuery: Formerly Dataproc Serverless, the fully managed Google Cloud Serverless for Apache Spark helps reduce total cost of ownership (TCO), provides strong performance with the new Lightning Engine, integrates with and leverages AI, and is enterprise-ready. And by bringing Apache Spark directly into BigQuery, you can now develop, run, and deploy Spark code interactively in BigQuery Studio. Read all about it here.
- Next-Gen data pipelines: Airflow 3 arrives on Google Cloud Composer: Google is the first hyperscaler to provide selected customers with access to Apache Airflow 3, integrated into our fully managed Cloud Composer 3 service. This is a significant step forward, allowing data teams to explore the next generation of workflow orchestration within a robust Google Cloud environment. Airflow 3 introduces powerful capabilities, including DAG versioning for enhanced auditability, scheduler-managed backfills for simpler historical data reprocessing, a modern React-based UI for more efficient operations, and many more features.
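To make the Pub/Sub JavaScript UDFs mentioned above more concrete, here is a minimal sketch of what such a function might look like. It assumes the UDF receives a message object with a string `data` field and an `attributes` map plus a `metadata` argument, and that returning `null` drops the message; check the launch blog and the Pub/Sub documentation for the exact contract. The function name `validateAndRedact`, the `region` and `ssn` fields, and the `validated` attribute are hypothetical and chosen only for illustration.

```javascript
// Sketch of a Pub/Sub SMT JavaScript UDF (assumed contract, see lead-in).
// The function name, the "region"/"ssn" fields, and the "validated"
// attribute are hypothetical.
function validateAndRedact(message, metadata) {
  let payload;
  try {
    // Assumption: message.data is a string containing JSON.
    payload = JSON.parse(message.data);
  } catch (err) {
    return null; // Filter: drop messages whose data is not valid JSON.
  }

  // Validate: drop messages that are not objects or lack a string "region".
  if (payload === null || typeof payload !== "object" ||
      typeof payload.region !== "string") {
    return null;
  }

  // Alter: remove a sensitive field from the data.
  delete payload.ssn;

  // Enrich: record the outcome as a message attribute.
  message.attributes = message.attributes || {};
  message.attributes.validated = "true";

  message.data = JSON.stringify(payload);
  return message;
}

// Local usage example, run outside Pub/Sub with a hand-built message.
const sample = {
  data: JSON.stringify({ region: "us-east1", ssn: "000-00-0000", total: 12 }),
  attributes: { source: "checkout" },
};
console.log(validateAndRedact(sample, {}));
```

Keeping the function free of I/O and external dependencies, as here, fits the "simple, lightweight" modifications the announcement describes and makes the logic easy to test locally before attaching it to a topic or subscription.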
June 2 – June 6
- Enhancing BigQuery workload management: BigQuery workload management provides comprehensive control mechanisms to optimize workloads and resource allocation, preventing performance issues and resource contention, especially in high-volume environments. To make it even more useful, we announced several updates to BigQuery workload management: improvements to reservation fairness, predictability, flexibility, and “securability”; new reservation labels; and autoscaler improvements. Get all the details here.
- Bigtable Spark connector is now GA: The latest version of the Bigtable Spark connector opens up a world of possibilities for Bigtable and Apache Spark applications, not least of which is additional support for Bigtable and Apache Iceberg, the open table format for large analytical datasets. Learn how to use the Bigtable Spark connector to interact with data stored in Bigtable from Apache Spark, and delve into powerful use cases that leverage Apache Iceberg in this post.
- BigQuery gets transactional: Over the years, we’ve added several capabilities to BigQuery to bring near-real-time, transactional-style operations directly into your data warehouse, so you can handle common data management tasks more efficiently from within the BigQuery ecosystem. In this blog post, you can learn about three of them: efficient fine-grained DML mutations; change history support for updates and deletes; and real-time updates with DML over streaming data.
- Google Cloud databases integrate with MCP: We announced capabilities in MCP Toolbox for Databases (Toolbox) to make it easier to connect databases to AI assistants in your IDE. MCP Toolbox supports BigQuery, AlloyDB (including AlloyDB Omni), Cloud SQL for MySQL, Cloud SQL for PostgreSQL, Cloud SQL for SQL Server, Spanner, and self-managed open-source databases including PostgreSQL, MySQL, and SQLite, as well as databases from a growing list of other vendors, including Neo4j, Dgraph, and more. Get all the details here.