Data Analyst @ Remote – 2 Roles

Key Responsibilities:

  • Develop, maintain, and optimize data pipelines using Python and PySpark for large-scale data processing.
  • Build scalable data solutions on AWS cloud infrastructure, leveraging services such as S3, Redshift, Glue, and Lambda.
  • Collaborate with cross-functional teams to design and implement robust data architectures for ETL and data transformation processes.
  • Ensure high-quality code through unit testing, code reviews, and adherence to best practices in data engineering.
  • Perform performance tuning and troubleshooting of PySpark jobs and data pipelines.
  • Conduct data analysis and generate meaningful insights to support business decision-making.
  • Participate in PySpark coding tests as part of the evaluation process.

Key Skills & Qualifications:

  • 3+ years of hands-on experience with Python and PySpark.
  • Strong experience with AWS services for data storage, processing, and analytics.
  • Expertise in building and managing ETL pipelines and data workflows.
  • Familiarity with CI/CD pipelines for data engineering projects.
  • Strong problem-solving skills and ability to optimize data processes for performance and efficiency.
  • Good communication skills and ability to collaborate with teams.

Preferred Qualifications:

  • Experience with data modeling and database design.
  • Familiarity with other big data tools like Hadoop or Kafka is a plus.
  • Knowledge of GCP or Azure cloud platforms.

Note: A coding test on PySpark will be part of the selection process.
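
For reference, below is a minimal sketch of the kind of PySpark ETL work described above. Bucket names, paths, and column names are illustrative placeholders only, not the actual pipelines used in this role.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

def main():
    # Spark entry point; the application name is a placeholder.
    spark = SparkSession.builder.appName("example-etl").getOrCreate()

    # Extract: read raw JSON events from S3 (placeholder bucket/path).
    raw = spark.read.json("s3a://example-bucket/raw/events/")

    # Transform: basic cleaning plus a daily aggregate per event type.
    daily = (
        raw.filter(F.col("event_type").isNotNull())
           .withColumn("event_date", F.to_date("event_ts"))
           .groupBy("event_date", "event_type")
           .count()
    )

    # Load: write partitioned Parquet back to S3 for downstream use
    # (for example, a Glue catalog table or Redshift Spectrum).
    (
        daily.write
             .mode("overwrite")
             .partitionBy("event_date")
             .parquet("s3a://example-bucket/curated/daily_event_counts/")
    )

    spark.stop()

if __name__ == "__main__":
    main()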

Job Category: Engineer
Job Type: Full Time
Job Location: India
Job Status: Remote
