Join an innovative software company on the cutting edge of AI. We're seeking a talented Data Engineer to contribute to our data-driven solutions. As a Data Engineer, you will play a pivotal role in designing and implementing scalable data pipelines and in ensuring the availability and reliability of our data infrastructure.
Responsibilities:
- Utilize your strong ETL experience to ingest data from diverse sources, transform it, and load it into analytical or data science structures.
- Employ your expertise in SQL and Python to write efficient queries and to transform data.
- Work with distributed processing frameworks like Spark to handle large-scale data processing tasks.
- Implement and manage CI/CD pipelines to ensure smooth and efficient deployment of data solutions.
- Leverage your understanding of data lineage to track and maintain the integrity of data throughout its lifecycle.
- Utilize various data sources, including flat files, APIs, and databases, to extract valuable insights.
Requirements:
- Degree in Information Technology or a similar field.
- Proficiency in SQL and Python for data manipulation and transformation tasks.
- Experience with distributed processing frameworks.
- Familiarity with running CI/CD pipelines and a basic understanding of their mechanics.
- Strong understanding of ETL principles and best practices.
- Experience working with data from various sources, including flat files, APIs, and databases.
- Knowledge of transformation tools such as dbt, Databricks, or Python (Spark experience is beneficial).
- Familiarity with batch and stream processing is highly desirable.
Note: Please include your resume and any relevant projects or code samples in your application.
