  • Currently at Roku, was at Microsoft
  • San Francisco Bay Area, CA

jiegzhan/README.md

Hi there 👋

  • 🔭 I am Zhang Jie (张 杰), a Senior Software Engineer on the Roku 💜 Big Data Platform team, where I provide data infrastructure and data solutions for both large-scale 🔵 real-time streaming processing and data warehouse batch processing.

  • 🌱 Tech Stack: Flink, Spark, Kafka, Kafka Connect, Presto, Hive, Hadoop Ecosystem, Airflow, Kubernetes, Docker, AWS Stack, DataDog, Jupyter Notebook, Superset, Looker.

Real Time Streaming Processing

Architected and led a Flink & Kubernetes powered real-time streaming platform that lets teams build Flink streaming applications and run them seamlessly on Kubernetes clusters. Onboarded other engineering teams and promoted streaming best practices.

Data Warehouse Batch Processing

Built and maintained a Hive & Spark & S3 & Airflow based data warehouse. Architected and implemented distributed data ingestion, storage, and processing pipelines.

Pinned

  1. Classify Kaggle Consumer Finance Complaints into 11 classes. Build the model with CNN (Convolutional Neural Network) and Word Embeddings on TensorFlow.

    Python 423 202

  2. Classify Kaggle San Francisco Crime Description into 39 classes. Build the model with CNN, RNN (GRU and LSTM) and Word Embeddings on TensorFlow.

    Python 583 262

  3. Classify MNIST image dataset into 10 classes. Build an image classifier with Recurrent Neural Network (RNN: LSTM) on TensorFlow.

    Python 78 48

  4. Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

    Java 6.7k 2.1k

  5. apache/hudi Public

    Upserts, Deletes And Incremental Processing on Big Data.

    Java 3.7k 1.7k

  6. The official home of the Presto distributed SQL query engine for big data

    Java 14.2k 4.9k

356 contributions in the last year

Contribution activity

November 2022

6 contributions in private repositories Nov 16