Hi, I'm Vytautas.

Data engineer with hands-on experience building ELT pipelines and lakehouse-style analytics stacks in regulated environments. I like owning pipelines end-to-end, improving reliability, and delivering measurable outcomes.

Data Engineer dbt Airflow AWS

Experience

Lakehouse architectures, Airflow orchestration and practical data quality.

Data Engineer · Bank of Lithuania
2025 — present
  • Built an S3‑based lakehouse (Apache Iceberg + Trino) with a dbt layer for analytics and regulatory reporting.
  • Developed Airflow DAGs for multi‑source ingestion (retries/backoff, on‑failure alerts); contributed to coding and security standards.
  • Implemented data quality checks and documentation for regulatory frameworks (EIOPA/ESMA/DORA).
Data Engineer Intern · Wix.com
2024 — 2025
  • Built an automated incremental pipeline from MySQL to Apache Iceberg via Trino, orchestrated with Airflow.
  • Reduced query latency and streamlined analytics workflows.
Architect · Architecture & Project Delivery
2008 — 2024
  • Led projects with strict compliance requirements; mentored peers and managed timelines and documentation.

Skills

Languages

Python SQL

Data Engineering

Airflow dbt Apache Iceberg Trino / Starburst PostgreSQL Spark (basics)

Infrastructure / Cloud

Terraform (basics) Docker Kubernetes / OpenShift (basics) AWS GitLab CI/CD GitHub Actions Linux

Data Quality / Governance

Great Expectations dbt tests DataHub

Nice to have

Streamlit FastAPI Postman n8n

Projects

ELT architecture

Focus: Airflow, dbt, S3 data lake

This ELT project loads source files from object storage, stages them in an analytical database, and then transforms them into standardized, history-aware dimensional tables. Raw data is first copied into a dated backup area, then ingested into staging schemas with minimal structural changes. Next, transformation models build SCD2-style dimensions, applying data quality checks to ensure consistency and traceability. The entire pipeline (backup → load → transform → test → document) runs on a Kubernetes cluster and is orchestrated by Airflow.

ELT architecture with Apache Airflow, dbt and an S3 data lake.
ELT dbt Airflow
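
For illustration, a minimal sketch of the backup → load → transform → test → document flow described above. Task names, the schedule, and the dbt commands are assumptions, not the production DAG.

```python
# Minimal sketch of the backup → load → transform → test → document pipeline.
# Task names, the schedule, and the dbt invocations are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def backup_raw_files(**context):
    # Copy source CSVs into a dated backup prefix, e.g. backups/2025-01-01/.
    ...


def load_to_staging(**context):
    # Ingest raw files into staging schemas with minimal structure changes.
    ...


with DAG(
    dag_id="elt_pipeline",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={"retries": 3, "retry_delay": timedelta(minutes=5)},
) as dag:
    backup = PythonOperator(task_id="backup", python_callable=backup_raw_files)
    load = PythonOperator(task_id="load", python_callable=load_to_staging)
    # dbt builds the SCD2 dimensions, then runs tests and generates docs.
    transform = BashOperator(task_id="dbt_run", bash_command="dbt run")
    test = BashOperator(task_id="dbt_test", bash_command="dbt test")
    document = BashOperator(task_id="dbt_docs", bash_command="dbt docs generate")

    backup >> load >> transform >> test >> document
```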

Backup/load CLI

Focus: explaining the code logic

Python CLI that orchestrates three data maintenance modes over S3 and Trino/Iceberg: backup (copy CSVs between S3 accounts into date-based folders), load (normalize columns and write into Trino *_staging tables), and cleanup (delete old backups by retention period). All behavior is configured via environment variables and CLI arguments.

UML-like flowchart of a Python CLI that backs up CSVs between S3 buckets, loads them into Trino staging tables, and cleans up old backup folders.
Python S3 flowchart
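
A hedged skeleton of how the three modes could be wired with argparse: the subcommand names follow the description above, while the env-var name and function bodies are illustrative.

```python
# Sketch of the three-mode CLI described above; the env-var name and
# function bodies are illustrative, not the actual implementation.
import argparse
import os


def backup(args):
    # Copy CSVs between S3 accounts into a date-based folder.
    ...


def load(args):
    # Normalize column names and write into Trino *_staging tables.
    ...


def cleanup(args):
    # Delete backup folders older than the retention period.
    ...


def main():
    parser = argparse.ArgumentParser(description="S3/Trino data maintenance CLI")
    sub = parser.add_subparsers(dest="mode", required=True)

    sub.add_parser("backup").set_defaults(func=backup)
    sub.add_parser("load").set_defaults(func=load)

    cleanup_p = sub.add_parser("cleanup")
    cleanup_p.add_argument(
        "--retention-days",
        type=int,
        default=int(os.environ.get("RETENTION_DAYS", "30")),
    )
    cleanup_p.set_defaults(func=cleanup)

    args = parser.parse_args()
    args.func(args)


if __name__ == "__main__":
    main()
```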

Static portfolio website: S3 hosting

Focus: HTTPS, cost-efficiency

Simple website deployment automation with GitHub Actions and Terraform.

Static portfolio website S3 hosting architecture
S3 IAM GitHub Actions Terraform

Weather Data System: Serverless ingestion

Focus: serverless simplicity and cost control (2024)

Automated weather data ingestion to Postgres with subsequent analysis.

Serverless ingestion
Serverless Lambda Postgres
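
As a sketch only: a Lambda-style handler that fetches a weather observation and inserts it into Postgres with psycopg2. The API endpoint, table name, and env-var names are assumptions.

```python
# Illustrative Lambda handler: fetch current weather and insert into Postgres.
# The API endpoint, table name, and env-var names are assumptions.
import json
import os
import urllib.request

import psycopg2


def handler(event, context):
    # Fetch a weather observation (the endpoint is a placeholder).
    with urllib.request.urlopen(os.environ["WEATHER_API_URL"]) as resp:
        obs = json.load(resp)

    conn = psycopg2.connect(os.environ["POSTGRES_DSN"])
    try:
        with conn, conn.cursor() as cur:
            cur.execute(
                "INSERT INTO weather_observations (city, temperature, raw) "
                "VALUES (%s, %s, %s)",
                (obs.get("city"), obs.get("temperature"), json.dumps(obs)),
            )
    finally:
        conn.close()

    return {"statusCode": 200}
```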

Machine Learning Models: ingestion + price prediction

Focus: experimentation pipeline (2024)

Pipelines for ML experiments and price prediction.

ML experimentation and price-prediction pipeline
FastAPI ML Docker
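
A minimal sketch of a price-prediction endpoint in FastAPI; the feature fields and the placeholder scoring rule are illustrative stand-ins for the real model.

```python
# Illustrative FastAPI endpoint serving a price-prediction model.
# The feature names and the scoring rule are assumptions, not the real service.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class Features(BaseModel):
    area_m2: float
    rooms: int
    year_built: int


@app.post("/predict")
def predict(features: Features) -> dict:
    # A real service would load a trained model (e.g. with joblib) at startup;
    # here a placeholder linear rule stands in for model.predict().
    price = 1500 * features.area_m2 + 5000 * features.rooms
    return {"predicted_price": round(price, 2)}
```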

Recommendation Engine: front‑end & back‑end

Focus: end‑to‑end demo (2024)

Prototype with data preparation and a web UI.

Recommendation engine: data preparation and web UI
ETL FastAPI Streamlit
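
A small Streamlit sketch of this kind of UI, calling a hypothetical FastAPI backend; the URL and the response shape are assumptions.

```python
# Illustrative Streamlit front end calling a recommendation API.
# The backend URL and response shape are assumptions for this sketch.
import requests
import streamlit as st

st.title("Recommendation demo")
user_id = st.number_input("User ID", min_value=1, step=1)

if st.button("Recommend"):
    resp = requests.get(
        f"http://localhost:8000/recommendations/{int(user_id)}", timeout=5
    )
    for item in resp.json().get("items", []):
        st.write(item)
```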

Data Processing Pipeline: Docker + Airflow

Focus: reproducibility (2024)

Containerised example with Airflow orchestration.

Data processing pipeline with Docker and Airflow
Docker Airflow

Certifications

AWS Certified Cloud Practitioner
Status: achieved
AWS Certified Solutions Architect – Associate
Status: in progress

Contact

Location: Vilnius, Lithuania

LinkedIn: linkedin.com/in/pliadis

Email: pliadis@pm.me