Data & AI Engineer

Catherine Varas

Building AI-Powered Solutions

AI & Data Engineer with 5+ years of experience designing and implementing scalable data pipelines, cloud infrastructure, software solutions and intelligent automation, along with 2+ years focused on AI systems.

5+
Years Experience
6
Cloud Certifications

// about

The Journey

Lima 🇵🇪 → Sydney 🇦🇺

Having worked across data engineering, software architecture, and applied AI helps me design systems that are robust and hold up in production. I bring a research-driven approach to every problem: I learn the domain, understand the data, and deliver solutions that work reliably at scale.

"Ready to create innovative solutions and use the latest technology to help solve problems."

Tech Stack

Engineering

Python R Java SQL Bash C# JavaScript

Cloud

Azure Azure ML Azure Functions Azure AI Search Azure Data Factory AWS GCP HPC GPU Streaming

AI & ML

YOLO SAM2 Neural Networks HuggingFace RAG scikit-learn Deep Learning Computer Vision Agentic AI LangChain n8n

Tools

Dash/Plotly Power BI Streamlit matplotlib Git CI/CD

Databases

Oracle SQL Server PostgreSQL MySQL Elasticsearch

// experience

Career Timeline

AIBN — University of Queensland

Australian Institute for Bioengineering and Nanotechnology

Software Engineer — Research Data Pipelines

Mar 2026 – Present

Brisbane, Australia

  • Developed a full-stack data analysis pipeline for fermentation research: volume reconstruction, mass balances, growth rate estimation, and automated phase detection
  • Built an interactive Dash/Plotly dashboard for researchers to explore data and compare replicates
  • Documented every formula, assumption, and design decision to meet reproducibility standards
Python Dash/Plotly Data Analysis Machine Learning

UQ Queensland Brain Institute

Research Experience

Research Volunteer — Computer Vision

Nov 2025 – Present

Brisbane, Australia

  • Built a multi-model pipeline (YOLO + SAM2) for automated animal tracking
  • Engineered chunked GPU streaming and HPC parallel execution for long video processing
  • Designed a geometric classifier detecting 6 social interaction types from spatial coordinates
  • Reconstructed neural network architectures from raw checkpoint weights
YOLO SAM2 Computer Vision HPC GPU

One51

Data & Analytics Consultancy

Data & AI Consultant

Jul 2025 – Mar 2026

Sydney, Australia

  • Designed AI orchestration workflows using Azure ML, Durable Functions, Azure AI Search, and HuggingFace
  • Developed AI RAG systems to automate analytics platform migrations
  • Constructed an AI-based accounting automation pipeline on Databricks integrating OCR and LLM-assisted semantic resolution
  • Built interactive applications for metadata visualization and dependency tracking
Azure ML RAG Databricks HuggingFace

BBVA

One of Latin America's Largest Banks

Associate Software & Data Architect

Jul 2024 – Jan 2025

Lima, Peru

  • Led migration of millions of contractual records from IBM Mainframe to cloud infrastructure using Apache Spark
  • Designed frameworks for data quality, scalability, and compliance with banking regulatory standards
  • Developed Java backend routines dynamically querying Mainframe or Oracle systems
  • Managed a centralized contracts module exposing data across banking systems
Apache Spark Java Oracle IBM Mainframe

BBVA

Software Architect

Oct 2022 – Jul 2024

Lima, Peru

  • Engineered scalable batch pipelines using Apache Spark and Java for multi-product Mainframe-to-cloud migration
  • Developed reusable Spark routines for data transformation, validation, and reconciliation in Parquet format
  • Automated business-rule validations and anomaly detection across millions of contract records
  • Deployed and maintained 70+ reusable components (Java backend, Spark batch, database scripts)
Spark Java Parquet CI/CD

Vida Software

BI & Data Analyst

Jul 2021 – Feb 2022

Lima, Peru

  • Built Big Data pipelines for performance reporting; automated daily processing of 200+ SFTP files
  • Designed dashboards for inventory and sales forecasting for multinational retail clients
  • Built integration pipelines between Hasura GraphQL and SQL Server
Python Power BI GraphQL Node.js

// projects

Featured Work

Databricks RAG Migration Tool

AI-powered system that transforms legacy T-SQL into modern Databricks SQL using RAG, LLM reasoning, and deterministic validators for safe, reliable migration.

Python RAG LLM Databricks Azure Azure Foundry

Computer Vision Tracking

Computer vision pipeline built with neural network models such as YOLO and SAM2 to track animal movement for the Queensland Brain Institute.

Python YOLOv8 SAM2 Computer Vision

Predictive Maintenance Lakehouse

Predictive maintenance pipeline for industrial IoT using a lakehouse architecture.

Jupyter IoT Machine Learning Data Pipeline

Traffic Analysis Spark HDFS

Big data pipeline built with Spark over files stored in HDFS to analyze traffic in Sydney and identify its causes.

Spark HDFS Docker Big Data

Cloud-Native Healthcare Platform

End-to-end cloud architecture for a healthcare startup, covering infrastructure, CI/CD, and role-based access control. Demo note: the live site runs in demo mode and may take ~2 min to load (cold start). Test credentials — operador@cmep.local / operador123 (OPERADOR), gestor@cmep.local / gestor123 (GESTOR), medico@cmep.local / medico123 (MÉDICO).

Python AWS CI/CD Cloud Architecture

MineOps Dispatch

Agentic AI system that automates mining field operations via WhatsApp. An AI agent receives operator reports, classifies incidents, updates the control room database, and sends real-time confirmations — fully autonomous, end to end.

n8n AI Agent WhatsApp Vercel

// certifications

Credentials

Microsoft

Fabric Data Engineer Associate

DP-700

Microsoft

Fabric Analytics Engineer Associate

DP-600

Microsoft

Power BI Data Analyst Associate

PL-300

AWS

Cloud Practitioner

CLF-C02

Microsoft

Azure Fundamentals

AZ-900

Databricks

Generative AI Fundamentals

// education

Academic Path

University of Queensland

Master of Data Science

2025 – Jun 2026

Brisbane, Australia

Cloud Computing Machine Learning AI Deep Learning Bioinformatics

International University of La Rioja (UNIR)

Master of Artificial Intelligence

2024 – 2025

Spain

Generative AI Machine Learning Computer Vision Neural Networks

Peruvian University of Applied Sciences (UPC)

Bachelor of Industrial Engineering

2019 – 2023

Lima, Peru

Project Management Process Optimization

Cibertec Institute

Associate's Degree in Computer Science

2019 – 2022

Lima, Peru

SQL Java Python Web Development

// contact

Let's Build Something Together

Open to AI engineer roles. Available for full-time work under a postgraduate visa from July 2026. Let's connect.

Get in Touch