Data Engineer @Sparibis
Software Development
Salary unspecified
Remote Location
πŸ‡ΊπŸ‡Έ USA Only
Job Type full-time
Posted 1mth ago


Role Description

  • Plan, create, and maintain data architectures, ensuring alignment with business requirements.
  • Obtain data, formulate dataset processes, and store optimized data.
  • Identify problems and inefficiencies and apply solutions.
  • Determine tasks where manual participation can be eliminated with automation.
  • Identify and optimize data bottlenecks, leveraging automation where possible.
  • Create and manage data lifecycle policies (retention, backups/restore, etc).
  • Apply in-depth knowledge to create, maintain, and manage ETL/ELT pipelines.
  • Create, maintain, and manage data transformations.
  • Maintain/update documentation.
  • Create, maintain, and manage data pipeline schedules.
  • Monitor data pipelines.
  • Create, maintain, and manage data quality gates (Great Expectations) to ensure high data quality.
  • Support AI/ML teams with optimizing feature engineering code.
  • Apply expertise in Spark, Python, Databricks, data lakes, and SQL.
  • Create, maintain, and manage Spark Structured Streaming jobs, including using the newer Delta Live Tables and/or DBT.
  • Research existing data in the data lake to determine best sources for data.
  • Create, manage, and maintain ksqlDB and Kafka Streams queries/code.
  • Perform data-driven testing to verify data quality.
  • Maintain and update Python-based data processing scripts executed on AWS Lambdas.
  • Write unit tests for all Spark, Python data processing, and Lambda code.
  • Maintain and optimize the PCIS Reporting Database data lake (performance tuning, etc.).
  • Streamline data processing, including formalizing how to handle late data, how to define windows, and how window definitions impact data freshness.
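For illustration, the data quality gates mentioned above (e.g. with Great Expectations) boil down to running each batch through a set of expectations and blocking it downstream on any failure. The sketch below is library-agnostic plain Python; the function names, column names, and sample batch are invented for illustration, not part of any real API.

```python
# Minimal sketch of a data quality gate, in the spirit of tools like
# Great Expectations: every expectation must pass before a batch is
# allowed downstream. All names here are illustrative assumptions.

def expect_not_null(rows, column):
    # Fails if any row is missing a value for `column`.
    failures = [r for r in rows if r.get(column) is None]
    return ("expect_not_null", column, len(failures) == 0)

def expect_between(rows, column, low, high):
    # Fails if any present value falls outside [low, high].
    failures = [r for r in rows if not (low <= r.get(column, low) <= high)]
    return ("expect_between", column, len(failures) == 0)

def quality_gate(rows, expectations):
    """Run every expectation; return (passed, per-check results)."""
    results = [check(rows) for check in expectations]
    return all(ok for _, _, ok in results), results

# Hypothetical batch of order records.
batch = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": 990.0},
]
passed, results = quality_gate(batch, [
    lambda rows: expect_not_null(rows, "order_id"),
    lambda rows: expect_between(rows, "amount", 0, 1000),
])
```

In a real pipeline, a failed gate would typically quarantine the batch and alert the on-call engineer rather than silently dropping rows.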

Qualifications

  • 5+ years of IT experience focusing on enterprise data architecture and management.
  • Must have an active Secret security clearance.
  • Bachelor's degree required.
  • CompTIA Security+ certification preferred. If selected, must be able to obtain a CompTIA Security+ certification prior to beginning support of the program.
  • Experience in Conceptual/Logical/Physical Data Modeling & expertise in Relational and Dimensional Data Modeling.
  • Experience with Databricks and Python Development, Structured Streaming, Delta Lake concepts, and Delta Live Tables required.
  • Additional experience with Spark, Spark SQL, Spark DataFrames and DataSets, and PySpark.
  • Experience with data lake concepts such as time travel, schema evolution, and optimization.
  • Structured Streaming and Delta Live Tables with Databricks a bonus.
  • Knowledge of Python (Python 3.X) for CI/CD pipelines required.
  • Familiarity with Pytest and Unittest a bonus.
  • Experience leading and architecting enterprise-wide initiatives specifically system integration, data migration, transformation, data warehouse build, data mart build, and data lakes implementation/support.
  • Advanced level understanding of streaming data pipelines and how they differ from batch systems.
  • Ability to formalize how to handle late data, define windows, and manage data freshness.
  • Advanced understanding of ETL and ELT and ETL/ELT tools such as SSIS, Pentaho, Data Migration Service etc.
  • Understanding of concepts and implementation strategies for different incremental data loads such as tumbling window, sliding window, high watermark, etc.
  • Familiarity and/or expertise with Great Expectations or other data quality/data validation frameworks a bonus.
  • Understanding of streaming data pipelines and batch systems.
  • Familiarity with concepts such as late data, defining windows, and how window definitions impact data freshness.
  • Advanced level SQL experience (Joins, Aggregation, Windowing functions, Common Table Expressions, RDBMS schema design, Postgres performance optimization).
  • Indexing and partitioning strategy experience.
  • Ability to debug, troubleshoot, design, and implement solutions to complex technical issues.
  • Experience with large-scale, high-performance enterprise big data application deployment and solution.
  • Understanding of how to create DAGs to define workflows.
  • Familiarity with CI/CD pipelines, containerization, and pipeline orchestration tools such as Airflow, Prefect, etc. a bonus but not required.
  • Architecture experience in AWS environment a bonus.
  • Familiarity with Kinesis and/or Lambda, specifically how to push and pull data, how to use AWS tools to view data in Kinesis streams, and how to process massive data at scale, a bonus.
  • Experience with Docker, Jenkins, and CloudWatch.
  • Ability to write and maintain Jenkinsfiles for supporting CI/CD pipelines.
  • Experience working with AWS Lambdas for configuration and optimization.
  • Experience working with DynamoDB to query and write data.
  • Experience with S3.
  • Experience working with JSON and defining JSON Schemas a bonus.
  • Experience setting up and managing Confluent/Kafka topics and ensuring Kafka performance a bonus.
  • Familiarity with Schema Registry, message formats such as Avro, ORC, etc.
  • Understanding of how to manage ksqlDB SQL files, migrations, and Kafka Streams.
  • Ability to thrive in a team-based environment.
  • Experience briefing the benefits and constraints of technology solutions to technology partners, stakeholders, team members, and senior level of management.
  • Proficiency using Git for version control: repository setup and management; branching strategies (feature, develop, main); merging and resolving conflicts; creating and reviewing pull requests; commit best practices (clear messages, atomic commits); tagging and release management.
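As a concrete illustration of one incremental-load strategy named above, a high-watermark load pulls only rows newer than the last recorded watermark, then advances the watermark for the next run. The sketch below is minimal Python; the column names and timestamps are invented, and it relies on ISO-8601 timestamp strings comparing correctly as plain strings.

```python
# Sketch of a high-watermark incremental load: each run extracts only
# rows whose updated_at is strictly greater than the stored watermark.
# Table/column names and sample data are hypothetical.

def incremental_load(source_rows, watermark):
    """Return (rows newer than watermark, new watermark)."""
    new_rows = [r for r in source_rows if r["updated_at"] > watermark]
    # Advance the watermark to the newest row seen; keep it unchanged
    # if nothing new arrived.
    new_watermark = max((r["updated_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark

source = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00"},
    {"id": 2, "updated_at": "2024-01-02T00:00:00"},
    {"id": 3, "updated_at": "2024-01-03T00:00:00"},
]

# A run with a stored watermark between rows 1 and 2 picks up only 2 and 3.
rows, wm = incremental_load(source, "2024-01-01T12:00:00")
```

Tumbling- and sliding-window loads differ mainly in using fixed or overlapping time ranges instead of a single monotonically advancing cutoff.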

Requirements

  • 5+ years Professional Experience.
  • Bachelor’s Degree in IT related field.
  • Applicants must be able to obtain and maintain a secret security clearance.
  • United States Citizenship is required as part of the eligibility criteria to be able to obtain this type of security clearance.
  • CompTIA Security+ certification required.

Benefits

  • Remote work flexibility.
  • Equal opportunity employer that values diversity.

Company Description

Sparibis LLC is a professional solutions firm that clients rely on to access the best talent to drive their business success.

Before You Apply
πŸ‡ΊπŸ‡Έ Be aware of the location restriction for this remote position: USA Only
β€Ό Beware of scams! When applying for jobs, you should NEVER have to pay anything. Learn more.