Run locally with a container
Run an interactive PySpark session
podman run -it -v ./:/opt/spark/work-dir spark:4.0.0-java21-python3 /opt/spark/bin/pyspark
Submit a PySpark app
podman run -it spark:4.0.0-java21-python3 \
/opt/spark/bin/spark-submit \
--conf spark.log.level=WARN \
/opt/spark/examples/src/main/python/pi.py
Iceberg
Write dummy data
spark.sql("create database test_db location 's3://bucket_name/iceberg/test_db'")
spark.sql(
"""
create table test_db.test_tbl (id int)
using iceberg
location 's3://bucket_name/iceberg/test_db/test_tbl'
"""
)
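# Append 100 rows (id 0..99) to the table created above, then check the row count.
# The database is test_db, so the table must be referenced as test_db.test_tbl.
df = spark.range(0, 100)
df.writeTo("test_db.test_tbl").append()
spark.sql("select count(*) from test_db.test_tbl").show()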
Inspecting metadata
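A minimal sketch of one way to look at the table's metadata, assuming the session is configured with an Iceberg catalog that resolves `test_db.test_tbl`. Iceberg exposes metadata tables such as `history`, `snapshots`, `files`, `manifests` and `partitions` as a suffix on the table name. The helper names `metadata_query` and `show_metadata` are hypothetical and only illustrate the pattern; whether the three-part name resolves depends on how the catalog is set up.

```python
# Iceberg metadata tables are addressed as <db>.<table>.<metadata_table>.
# This tuple is a subset chosen for illustration, not the full list.
METADATA_TABLES = ("history", "snapshots", "files", "manifests", "partitions")


def metadata_query(table: str, metadata_table: str) -> str:
    """Build the SQL that selects everything from one Iceberg metadata table."""
    if metadata_table not in METADATA_TABLES:
        raise ValueError(f"unsupported metadata table: {metadata_table}")
    return f"select * from {table}.{metadata_table}"


def show_metadata(spark, table: str = "test_db.test_tbl") -> None:
    """Print each metadata table in turn (hypothetical helper).

    `spark` is the SparkSession that the interactive pyspark shell
    already provides.
    """
    for name in METADATA_TABLES:
        print(f"-- {name}")
        spark.sql(metadata_query(table, name)).show(truncate=False)


# In the interactive pyspark session:
# show_metadata(spark)
```

After the append above, `snapshots` should list one snapshot for the write and `files` the data files it produced, which makes these two a reasonable starting point.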
Links
- Spark doc