Use Spark with AWS Glue Iceberg REST API and S3 Tables

A quick experiment using Spark with Iceberg tables stored in S3 table buckets and managed by the Glue Data Catalog via the Iceberg REST API.

Study Apache Iceberg ecosystems in AWS

[WIP] Study note about Apache Iceberg ecosystems in AWS.

Setup development environment on MacBook Pro

Setup log for my new work MBP.

Read Book: Systems Performance 2nd edition

Notes while reading Systems Performance: Enterprise and the Cloud, 2nd Edition. This is still WIP as I’m still reading. I read the 1st edition 9 years ago. It turns out...

Read papers about DB in April 2024

This is a note on papers about analytical DB technologies that I enjoyed reading recently. I summarize the highlights of some of the more interesting ones.

Study AWS Aurora Serverless v2

Amazon launched Aurora Serverless v2 on Apr 21, 2022. Aurora Serverless v2 is designed for applications that have variable workloads. The DB capacity dynamically changes as the workload changes, so...

Rethink scale up for analytical DBs

I came across a blog post BIG DATA IS DEAD that describes how big data processing is rare in the real world. Data generated by computers and stored on persistent...

Read: The Amazon Builders' Library Part 2

The Amazon Builders’ Library was published in 2019, and new articles appear to have been added continually since I last read it. Here I summarize the new articles that left an impression on me.

Read Book: Data Governance: The Definitive Guide

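Book read: Evren Eryurek, et al. Data Governance: The Definitive Guide: People, Processes, and Tools to Operationalize Data Trustworthiness. As the title suggests, this book explains the full picture of data governance, from its definition through building policies and processes to embedding them in an organization. Several colleagues who had read it spoke well of it, and I have recently had more opportunities to work on data-governance-related development, so I read it to acquire and organize my knowledge. My reading notes follow.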

Study Note of AWS Neptune

AWS Neptune is a managed graph database service that supports the graph query languages Apache TinkerPop Gremlin and W3C’s SPARQL. This post is my study note for understanding what Neptune is.