Enhancing Data Science Outcomes With Efficient Workflow (EDSOEW) – Outline

Detailed Course Outline

Introduction

  • Meet the instructor.
  • Create an account at courses.nvidia.com/join.

Advanced Extract, Transform, and Load (ETL)

  • Learn to process large volumes of data efficiently for downstream analysis:
    • Discuss current challenges of growing data sizes.
    • Perform ETL efficiently on large datasets.
    • Discuss hidden slowdowns and perform DataFrame transformations properly.
    • Discuss diagnostic tools to monitor and optimize hardware utilization.
    • Persist data in a way that’s conducive to downstream analytics.

Training on Multiple GPUs With PyTorch Distributed Data Parallel (DDP)

  • Learn how to improve data analysis on large datasets:
    • Build and compare classification models.
    • Perform feature selection based on predictive power of new and existing features.
    • Perform hyperparameter tuning.
    • Create embeddings using deep learning and perform clustering on the embeddings.

Deployment

  • Learn how to deploy and measure the performance of an accelerated data processing pipeline:
    • Deploy a data processing pipeline with Triton Inference Server.
    • Discuss various tuning parameters to optimize performance.

Assessment and Q&A