[Remote] Data Engineer – Data Architecture for Data Science & Machine Learning

Remote · USA · Full-time

Note: This is a remote position open to candidates in the USA. Penn State is seeking a Senior Data Engineer with deep expertise in database design, optimization, and data access strategies to support its data science and machine learning initiatives. The role involves architecting and optimizing data systems to support data scientists in their research and application deployment.

Responsibilities

  • Design and maintain scalable, high-performance database solutions to support data science workflows and ML experimentation
  • Partner with data scientists to understand data access patterns and design storage strategies that accelerate analysis and model training
  • Serve as the internal subject matter expert on PostgreSQL—including schema design, indexing, partitioning, and query optimization
  • Evaluate and integrate alternative database technologies (e.g., Cassandra) where they provide clear advantages
  • Lead efforts to optimize data pipelines for both structured and unstructured data used in algorithm development
  • Ensure data governance across storage systems
  • Implement monitoring, automation, and performance-tuning tools for database environments
  • Advise on data lifecycle management—balancing accessibility for R&D with efficiency and compliance requirements

Skills

  • 5+ years of experience in data engineering, database architecture, or related technical roles
  • Expert-level proficiency in PostgreSQL (query tuning, schema design, indexing, partitioning, replication)
  • Strong understanding of data modeling, normalization vs. denormalization tradeoffs, and query optimization
  • Experience with non-relational databases (e.g., Cassandra or DynamoDB)
  • Familiarity with machine learning workflows and how data is consumed for training, evaluation, and deployment
  • Experience with cloud database services (AWS RDS/Aurora, GCP Cloud SQL, Azure Database)
  • Proficiency in SQL and one or more scripting languages (Python preferred)
  • Excellent communication and collaboration skills—comfortable working closely with data scientists, ML engineers, and software developers
  • Experience architecting hybrid data ecosystems spanning relational, NoSQL, and analytical databases
  • Knowledge of data lake, warehouse, and feature store architectures (e.g., Redshift, BigQuery, Feast)
  • Familiarity with ETL/ELT frameworks and data orchestration tools (e.g., Airflow, dbt)
  • Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field

Benefits

  • Penn State provides a competitive benefits package for full-time employees designed to support both personal and professional well-being.
  • Comprehensive medical, dental, and vision coverage
  • Robust retirement plans
  • Substantial paid time off, including holidays, vacation, and sick time
  • Generous 75% tuition discount, available to employees as well as eligible spouses and children

Company Overview

  • There’s a reason Penn State consistently ranks among the top one percent of the world’s universities. Founded in 1855, it is headquartered in University Park, PA, US, and has a workforce of 10,001+ employees. Its website is http://psu.edu.