Responsibilities:
● Design, develop, and maintain scalable ETL/ELT pipelines to process structured and unstructured data.
● Manage and optimize databases, data warehouses, and data lakes (SQL/NoSQL, BigQuery, etc.).
● Ensure data quality, governance, and reliability through validation, monitoring, and automation (an illustrative sketch follows this list).
● Optimize pipelines and queries for large datasets and high-volume transactions.
● Explore and analyze datasets using statistical methods to identify trends and insights.
● Build, validate, and fine-tune predictive and prescriptive machine learning models relevant to the business.
● Deploy models into production by integrating them with data pipelines and business applications.
● Communicate findings and recommendations through dashboards, visualizations, and reports.
● Set up, maintain, and monitor Google Cloud Platform services (BigQuery, Dataflow, Cloud Storage, Pub/Sub, Composer).
● Manage ETL workflows on GCP, ensuring reliability, scalability, and cost efficiency.
● Implement security, access control, and compliance in cloud-based data systems.
● Collaborate with cross-functional teams (business, sales, operations) to ensure data is actionable and accessible.
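For candidates curious about the day-to-day work, below is a minimal Python sketch of the kind of pipeline task described above: loading a CSV file into BigQuery and applying a basic row-count validation, using the google-cloud-bigquery client library. The project, dataset, and table names are hypothetical placeholders, not references to our actual systems.

    from google.cloud import bigquery

    # Hypothetical identifiers, for illustration only.
    PROJECT = "example-project"
    TABLE_ID = "example-project.analytics.daily_events"

    def load_and_validate(csv_path: str, expected_min_rows: int) -> None:
        client = bigquery.Client(project=PROJECT)

        # Configure a CSV load job with schema autodetection.
        job_config = bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.CSV,
            skip_leading_rows=1,
            autodetect=True,
            write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
        )

        with open(csv_path, "rb") as source_file:
            load_job = client.load_table_from_file(
                source_file, TABLE_ID, job_config=job_config
            )
        load_job.result()  # Block until the load job completes.

        # Basic data-quality check: fail loudly if the table looks too small.
        table = client.get_table(TABLE_ID)
        if table.num_rows < expected_min_rows:
            raise ValueError(
                f"Validation failed: {table.num_rows} rows, "
                f"expected at least {expected_min_rows}"
            )

In practice this kind of check would typically run inside an orchestrated workflow (e.g., a Composer/Airflow task) with monitoring and alerting around it; the sketch only shows the core load-and-validate step.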
Requirements & Qualifications:
● Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent experience).
● 2+ years of experience in Data Engineering, including data modeling, warehousing, and distributed systems.
● Strong proficiency in Python for data pipelines and machine learning.
● Strong knowledge of SQL and NoSQL databases (MySQL, PostgreSQL, MongoDB); experience with ClickHouse is a plus.
● Familiarity with RESTful APIs (design, development, integration).
● Familiarity with machine learning frameworks (TensorFlow, scikit-learn, PyTorch); an illustrative sketch follows this list.
● Hands-on expertise with Google Cloud Platform (BigQuery, Dataflow, Pub/Sub, Composer, Cloud Storage); experience in GCP infra setup is a strong plus.
● Proficient in Linux-based servers and deployment workflows.
● Skilled in Git/Bitbucket for version control and collaborative development.
● Experience implementing security best practices in both web applications and data infrastructure.
● Strong problem-solving, analytical, and collaboration skills.
● Experience with Zoho Deluge scripting is a plus.
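As an illustration of the modeling skills listed above (Python with scikit-learn), here is a minimal, self-contained sketch of training and validating a predictive model. It uses synthetic data as a stand-in for real business features and labels; it is a generic example, not a description of our production workflow.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # Synthetic data standing in for real business features and labels.
    X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Train a baseline classifier and validate it on held-out data.
    model = RandomForestClassifier(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"Held-out ROC AUC: {auc:.3f}")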