Hadoop Developer
Pavan

Professional Summary

10 years of experience in Information Technology with special emphasis on the design, development, administration and support of HDFS clusters, Big Data analytics, BI Reporting, ETL and Database Applications.

  • 3 years of multifaceted experience in Big Data using Hadoop, HDFS, MapReduce, Spark and the Hadoop ecosystem (Pig, Hive, Sqoop, Flume, HBase, Oozie, Impala, Tez, ZooKeeper).
  • Experienced Team Lead and Developer performing business analysis, gathering business requirements and preparing proofs of concept efficiently.
  • Experienced in installing, configuring and administering Hadoop clusters on major Cloudera distributions.
  • Experienced with deployment automation tools such as Ansible, alert setup in Nagios and metrics monitoring using InfluxDB and Grafana analytics.
  • Experienced in shell scripting, Python, Pig Latin and NoSQL environments.
  • Expertise in ingesting data from RDBMS sources and rolling logs into HDFS to enable consistent data mining, oversight and ad hoc reporting.
  • Configured ZooKeeper to monitor Hadoop clusters, Hive servers and HBase, and to feed notifications to Nagios.
  • Configured HiveServer2 (HS2) to enable analytical tools such as MicroStrategy, Tableau and custom VISA applications to interact with Hive tables.
  • Good understanding of languages C, C++, Java, HTML and XML.
  • Proficient in deploying and generating real-time dashboards using BI reporting tools like MicroStrategy, Tableau, Oracle BI.
  • Able to produce thorough, well-researched technical documentation.
  • Meticulous, well-organized, quick learner, self-motivated team player with experience in all phases including project definition, analysis, design, coding, testing, implementation and support.
  • Excellent analytical, programming, written and verbal communication skills.

Education

  • Master's in Computer Engineering, University of New Haven, CT, USA
  • Bachelor of Technology, JNTU, India

Technical Skills

Operating Systems
UNIX, Linux, AIX, RHEL 6/7, Solaris and Windows 98/NT/2000/2003/2008/XP/Vista/7
Hadoop Ecosystems
Hadoop, HDFS, MapReduce, YARN (MRv2), Pig, ZooKeeper, Hive, Sqoop, Flume, Oozie, HBase, Cloudera Impala, Tez, HUE, Cloudera Search, Cloudera Distributions for Hadoop (CDH3, CDH4, CDH5), Cloudera Sentry
Business Intelligence
MicroStrategy, Tableau, Oracle BI, Siebel Analytics
RDBMS
Oracle 9i/10g/11g, MySQL, Teradata, DB2, PostgreSQL
Languages
SQL, PL/SQL, Shell/Bash Scripting, Python, C, C++, Java, HTML, Javascript, XML, Perl
Tools
RMAN, OEM, SQL*Loader, TOAD, SQL Navigator, EXPORT, IMPORT, DataPump, SQL Developer, Pro*C

Professional Experience

VISA, Foster City, CA
Duration
Dec 2013 - Present
Role
Big Data Hadoop Engineer
Responsibilities
VISA is the global leader in credit card and debit card transaction processing through Visa's global network (VisaNet), which processes billions of transactions every year.

Domain: Big Data - Long Time Architecture (LTA). LTA is an internal team within VISA that renders Big Data services for firm-wide use, providing massively scalable parallel processing and storage capability. The goals are to perform sophisticated, detailed processing and analysis of data at very high speed, and to store data inexpensively and reliably so that big data clusters serve all of the firm's data needs. The services offered by the LTA team include advisory and architecture services, hosting Big Data, and development and distribution of Big Data platform enhancements. The team also develops solutions for different LOBs to implement their use cases and ease their transition into Big Data.

  • Onsite Lead & SME working directly with the stakeholders and Product Offices at VISA headquarters, helping manage onsite and offshore resources.
  • Installed and configured a fully distributed Cloudera Hadoop cluster.
  • Performed Hadoop cluster administration via Cloudera Manager, including adding and removing cluster nodes, cluster capacity planning, performance tuning, cluster monitoring and troubleshooting.
  • Tuned Hadoop cluster workloads, resolved bottlenecks and optimized job queuing.
  • Used Oozie to automate and schedule business workflows that invoke Sqoop, MapReduce and Pig jobs as per the requirements.
  • Worked extensively on rule development to introduce same-day fraud detection using the Spark framework.
  • Exposure to configuration management using Ansible and Puppet.
  • Involved in setting up alerts in Nagios and feeding stats into the InfluxDB time-series database for front-end Grafana analytics.
  • Developed Sqoop scripts to import and export data from relational sources, handling incremental loading of customer and transaction data by date.
  • Worked with various HDFS file formats such as Avro and SequenceFile, and compression codecs such as Snappy and bzip2.
  • Developed Pig and Hive UDFs in Python to add custom fields and pre-process data for analysis.
  • Loaded data into the cluster from dynamically generated event files using Flume and from RDBMS using Sqoop.
  • Developed Hive queries for data sampling and analysis, and custom HiveQL to generate aggregated data from the detail data in HDFS, pumping these aggregates via Sqoop to RDBMS for use in BI applications.
  • Involved in the setup of a dedicated HBase cluster for real-time updates in the VROL application.
  • Developed custom Python and UNIX shell scripts for data sampling and pre- and post-validations of master and slave nodes.
  • Developed Pig scripts to replace Ab Initio jobs and to process and query flat files in HDFS that cannot be accessed using Hive.
  • Supported MapReduce programs running on the cluster.
  • Identified several PL/SQL batch applications in General Ledger processing and conducted performance comparison to demonstrate the benefits of migrating to Hadoop.
  • Configured Sentry to secure role-based access to underlying applications in HDFS.
  • Involved in several POCs for different LOBs to benchmark the performance of data mining using Hadoop.
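The Python-based Hive pre-processing described above can be sketched as a minimal streaming script invoked via Hive's TRANSFORM clause. This is an illustrative sketch only: the field layout (txn_id, amount, txn_date) and the bucketing rule are hypothetical, not VISA's actual schema or logic.

```python
#!/usr/bin/env python
# Minimal sketch of a Hive streaming script used via TRANSFORM.
# Hive pipes rows to stdin as tab-separated text; the script emits
# enriched rows to stdout. Field names are hypothetical examples.
import sys


def enrich(line):
    """Parse one tab-separated row and append a derived amount bucket."""
    txn_id, amount, txn_date = line.rstrip("\n").split("\t")
    value = float(amount)
    if value < 100:
        bucket = "LOW"
    elif value < 1000:
        bucket = "MEDIUM"
    else:
        bucket = "HIGH"
    return "\t".join([txn_id, amount, txn_date, bucket])


if __name__ == "__main__":
    for row in sys.stdin:
        print(enrich(row))
```

In HiveQL such a script would be wired in with something like: ADD FILE enrich.py; SELECT TRANSFORM (txn_id, amount, txn_date) USING 'python enrich.py' AS (txn_id, amount, txn_date, bucket) FROM transactions;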
Environment
RedHat Linux 6/7, MS SQL Server, Oracle, Hadoop CDH 3/4/5, PIG, Hive, ZooKeeper, Mahout, HDFS, HBase, Sqoop, Python, Java, Oozie, Hue, Tez, UNIX Shell Scripting, PL/SQL, Ansible, Puppet, Maven, Ant
VISA, Foster City, CA
Duration
Jan 2011 - Nov 2013
Role
Lead BI Developer
Responsibilities
Domain: Common Data Interface Analytics (CDIA)

  • Worked as the Lead BI Developer & Support SME for CDIA on multiple projects at VISA covering EDW enterprise warehouse reporting, GMBS member billing services, MARS and the Commercial Reporting platform.
  • Developed multiple applications such as GMBS (Billing), Perform Source, VISA Commercial Dashboard (VCD) and VISA Money Transfer (VMT).
  • Led the global EDW development and production release in MARS (Member Activity Reporting System), an end-to-end MicroStrategy implementation for 2 new projects covering the whole BI SDLC.
  • Played a major role at the forefront, implementing and leading vital projects such as EDW, GMBS, MARS, Commercial Reporting, various MicroStrategy upgrades, VISA Money Transfer (VMT), Visa Payment Control (VPC) and VISA Alerts Reporting (VAR).
  • Worked extensively on the various MicroStrategy upgrades, new features implemented through the CDIMI interface and other new functionality introduced in EDW.
  • Performed performance tuning of the MicroStrategy Intelligence Server, reports and dashboards, along with SQL tuning.
  • Developed and supported new Report Services Documents (RSDs) and dynamic dashboards according to business requirements using MicroStrategy OLAP technology.
  • Worked with Business Analysts, Program Managers and Technical Analysts to develop business intelligence solutions supporting decision support systems.
  • Designed and architected Business Intelligence reports using various technologies including MicroStrategy, DB2, Freeform SQL and Microsoft Office.
Environment
MicroStrategy 9.4.1, 9.0.2, 8.1.1, Tableau, MicroStrategy Administrator, MicroStrategy Desktop, Object Manager, MicroStrategy Report Services, MSTR Enterprise Manager, MicroStrategy Web, BMC Remedy VIPER, QTP, MicroStrategy Intelligence Server, MSTR Narrowcast Server, Ab Initio, DB2, Windows XP/7/2003, AIX, UNIX, LINUX

PricewaterhouseCoopers LLP, Tampa, FL
Duration
Aug 2007 – Dec 2010
Role
Lead BI Developer
Responsibilities
PricewaterhouseCoopers Global data warehouse is a framework architecture of data that is created and maintained using time and expense data. The data warehouse offers efficient financial management reporting and analysis capabilities at the individual staff level using MicroStrategy.

  • Worked as the Lead BI Developer & Admin for the GDW (Global Data Warehouse) and PDQ (Pulse Data Query) applications, dealing with all components of MicroStrategy: report development, architecture, user support, administration, user training, maintaining and leading offshore resources, and implementing major upgrades.
  • Developed many critical and complicated reports according to business requirements and addressed user concerns.
  • Developed a single sign-on universal login by integrating the MicroStrategy metadata password with SiteMinder.
  • Served as Technical Lead and successfully implemented major MicroStrategy upgrades, including the 8.1.1 and 9.0.1 upgrades.
  • Played a major role at the forefront, leading the implementation of key projects such as geography changes for the Network of the Future, data warehouse and datamart separation, major MicroStrategy upgrades, Oracle 11g implementation, client segmentation & DUNS validity, and Swiss & Israel data inclusion.
  • Developed new dashboards according to business requirements from various grid reports using the Document Editor.
  • Created new attributes and the necessary facts from newly loaded tables, along with transformations and metrics, to support new US finance reports previously hosted in SAP BW.
  • Created new user groups and security filters to restrict user access per business requirements and assigned privileges.
  • Served as Technical Lead for the MicroStrategy dashboards implementation and offshore coordination.
  • Worked extensively on Narrowcast, automating monthly report delivery to end users via script files that run the Narrowcast services through batch files triggered by Maestro immediately after the data load.
  • Served as the MicroStrategy SME, helping users and other groups within PwC that were planning to implement MicroStrategy going forward.
Environment
MicroStrategy 9.0.1, 8.1.1, 8.0.1, MicroStrategy Administrator, MicroStrategy Desktop, Object Manager, MicroStrategy Report Services, MSTR Enterprise Manager, MicroStrategy Web, MicroStrategy Intelligence Server, MSTR Narrowcast Server, Informatica 8.x/7.x, Oracle 9i/10g/11g, Windows XP/2003/2000, UNIX

Vcommerce, Scottsdale, AZ
Duration
Jan 2007 - July 2007
Role
BI Developer
Responsibilities
Vcommerce is the industry leader in enterprise eCommerce solutions for online retail. Vcommerce provides reporting using the MicroStrategy platform to clients like Target, M-TV, BJ’s, Trans World Entertainment, Univision.com, Overstock.com and many others.

  • Worked as the MicroStrategy developer catering to the reporting needs of all the clients.
  • Gathered the necessary requirements for the design of new MicroStrategy reports and documents, and for the architectural changes needed to meet various client needs.
  • Designed templates and filters to generate reports in grid and graph mode; the auto-prompt filter feature gave end users a choice of different filtering criteria each time they ran the report.
  • Transformed various grid-based reports into documents using the Document Editor for Narrowcasting.
  • Created custom-designed reports based on the requirements of the business teams using various report objects such as filters, metrics, consolidations, custom groups and attributes.
  • Created Freeform SQL reports and grid-based Report Services documents and optimized them for Excel and HTML.
  • Modified and created schema objects such as attributes and transformations based on the reporting requirements.
Environment
MicroStrategy 8.0.2, MicroStrategy Administrator, MicroStrategy Desktop, Object Manager, MicroStrategy Report Services, MicroStrategy Web, MicroStrategy Intelligence Server, MicroStrategy Narrowcast Server, Informatica Power Center 6.0, Oracle 9i, SQL Server, PostgreSQL, Teradata, Windows XP/2003

Novartis Pharmaceuticals, East Hanover, NJ
Duration
July 2006 - Dec 2006
Role
BI/DB Consultant
Responsibilities
Novartis Pharmaceuticals Corporation researches, develops, manufactures and markets leading innovative prescription drugs used to treat a number of diseases and conditions, including those in the cardiovascular, central nervous system, cancer, ophthalmics, organ transplantation and respiratory areas.

  • Worked initially as a BI analyst for the Information Technology team, supplying various detailed reports and Report Services documents to the business.
  • Customized the BI Web interface and the Intelligence Server settings for the project.
  • Designed and customized reports in development and production to supply the business with improved flexibility and functionality.
  • Created complex metrics with different level dimensionality, conditions and transformations for user- and business-specific reporting needs.
  • Used table-based transformation objects for comparative metric analysis such as this year, last year and year-to-date analyses.
  • Designed several metrics, filters, prompts and consolidation objects for the development of new reports, and created reports based on the requirements.
  • Implemented Narrowcast Server services to send reports via e-mail to various Novartis groups (CIC, MIC, SONIC, IRMA) based on scheduled, static and dynamic subscriptions.
Environment
MicroStrategy 8.0.1/8.0/7.5.2, Siebel Analytics, Intelligence Server, Narrowcast Server, Informatica Power Center 6.0, IBM DB2/AS400, IBM i-Series, Windows XP/2003