The Role

We are looking for a highly skilled Data Engineer with strong expertise in Python programming, data processing, and analytical problem-solving. The role combines analysis, engineering, and hands-on data manipulation to derive actionable insights, build efficient pipelines, and support data-driven decision-making across teams.

Responsibilities:

Data Exploration & Analysis:

Analyze large and complex datasets to extract meaningful insights and drive decision-making processes.
Identify trends, anomalies, and opportunities for improvement in datasets, and communicate findings clearly to stakeholders.
Collaborate with cross-functional teams to understand business requirements and transform them into technical solutions.

Data Pipeline Development:

Design, develop, and maintain robust data pipelines for efficient data ingestion, transformation, and storage.
Optimize and automate data workflows to improve data availability, quality, and processing efficiency.
Implement ETL (Extract, Transform, Load) processes to support analytics and reporting needs.

Data Modeling & Feature Engineering:

Build, validate, and maintain data models to support machine learning and statistical analysis needs.
Engineer and preprocess features for machine learning algorithms and ensure data quality and consistency.
Develop scalable solutions for feature storage, retrieval, and real-time model serving.

Programming & Scripting:

Write efficient, scalable, and well-documented Python code to support data engineering and analysis tasks.
Collaborate on code reviews, optimize code performance, and apply best practices in coding and version control.
Use Python libraries (e.g., Pandas, NumPy, SQLAlchemy) to streamline data workflows and support analysis.

Performance Optimization & Troubleshooting:

Monitor, troubleshoot, and enhance the performance of data systems and pipelines.
Address data integrity and pipeline issues promptly to ensure reliable data availability and system uptime.
Implement monitoring and logging to detect and resolve issues before they affect downstream users.

Collaboration & Communication:

Work closely with data scientists, analysts, and other engineers to develop cohesive data solutions.
Translate complex technical issues into non-technical language for clear communication with stakeholders.
Contribute to documentation, data standards, and best practices to foster a data-centric culture.

Job Requirements:

Technical Skills: Strong proficiency in Python and familiarity with data processing libraries (e.g., Pandas, NumPy, PySpark). Experience with SQL for data extraction and manipulation.
Data Engineering Knowledge: Experience in designing, building, and managing data pipelines, ETL workflows, and data warehousing solutions.
Statistical & Analytical Skills: Ability to apply statistical methods for data analysis and familiarity with machine learning concepts.
Problem-Solving Mindset: Proven ability to troubleshoot complex data issues and continuously improve workflows for efficiency and accuracy.
Communication: Effective communication skills to convey data insights to technical and non-technical stakeholders alike.
Bonus: Experience with cloud platforms (e.g., AWS, GCP), containerization (e.g., Docker), and orchestration tools (e.g., Airflow).

Preferred Education & Experience:

Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, Mathematics, or a related field.
3+ years of experience in a data science or data engineering role.

Benefits

Compensation commensurate with experience
Unlimited vacation
Ongoing education and training
Bonuses and profit-sharing