Apache Pig

Pig is a high-level scripting platform used with Hadoop. It lets users write complex MapReduce transformations in a simple scripting language, without knowing Java. Pig Latin (Pig’s SQL-like scripting language) defines a set of transformations on a data set, such as aggregate, join and sort. Pig translates a Pig Latin script into MapReduce jobs so that it can be executed within Hadoop. Pig Latin can be extended with UDFs (User Defined Functions), which the user writes in Java or a scripting language and then calls directly from Pig Latin.
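To make the idea of translating a transformation into MapReduce more concrete, here is a toy, in-memory Python sketch of a group-and-count followed by a sort. It roughly mirrors what a Pig Latin GROUP ... BY, FOREACH ... GENERATE group, COUNT(...) and ORDER ... BY sequence expresses. The sample records and all function names are made up for illustration, and this is not the plan that Pig's compiler actually generates.

```python
from collections import defaultdict

# Hypothetical input: (user, page) click records.
records = [("ann", "home"), ("bob", "home"), ("ann", "cart"), ("ann", "home")]


def map_phase(rows):
    """Emit (key, 1) for every record; the key is the field being grouped on."""
    for user, _page in rows:
        yield user, 1


def shuffle(pairs):
    """Collect all values sharing a key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Aggregate each key's values; here the aggregate is a count."""
    return {key: sum(values) for key, values in grouped.items()}


def clicks_per_user(rows):
    """Chain the three phases into one group-and-count job."""
    return reduce_phase(shuffle(map_phase(rows)))


# Sort by count, descending, mirroring an ORDER ... BY step.
result = sorted(clicks_per_user(records).items(), key=lambda kv: kv[1], reverse=True)
print(result)
```

In Pig Latin the same flow would be a few statements, and Pig would take care of splitting the work into map and reduce stages across the cluster.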

Pig was designed for performing a long series of data operations, making it well suited to three categories of Big Data operations: standard extract-transform-load (ETL) data pipelines, research on raw data, and iterative processing of data.

The main characteristics of Pig are the following:

  • Extensible. Pig users can create custom functions to meet their particular processing requirements.
  • Easy to program. Complex tasks involving interrelated data transformations can be simplified and encoded as data flow sequences. Pig programs accomplish huge tasks, but they are easy to write and maintain.
  • Self-optimizing. The system automatically optimizes execution of Pig jobs, so the user can focus on semantics.

Pig Latin is a data flow language, whereas SQL is a declarative language. SQL is great for asking a question of your data, while Pig Latin allows you to write a data flow that describes how your data will be transformed. Since Pig Latin scripts can be graphs (instead of requiring a single output), it is possible to build complex data flows involving multiple inputs, transforms, and outputs. Users can extend Pig Latin by writing their own functions in Java, Python, Ruby, or other scripting languages.
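As an illustration of extending Pig Latin with a user-defined function, the following is a minimal sketch of a Python UDF. The file name string_udfs.py, the function email_domain and the relation and field names in the comments are hypothetical. The sketch assumes the script is registered through Pig's CPython streaming engine, where the outputSchema decorator is imported from pig_util; a small fallback is defined so the file can also be run outside Pig.

```python
# Hypothetical file name: string_udfs.py
#
# Assumed usage from a Pig Latin script (relation and field names are made up):
#   REGISTER 'string_udfs.py' USING streaming_python AS string_udfs;
#   domains = FOREACH users GENERATE string_udfs.email_domain(email);

try:
    # Provided by Pig when the script runs under the streaming_python engine.
    from pig_util import outputSchema
except ImportError:
    # Fallback so the function can be imported and tested without Pig.
    def outputSchema(schema):
        def decorator(func):
            func.outputSchema = schema
            return func
        return decorator


@outputSchema("domain:chararray")
def email_domain(email):
    """Return the lower-cased domain of an e-mail address, or None if malformed."""
    if email is None or "@" not in email:
        return None
    return email.rsplit("@", 1)[1].lower()


if __name__ == "__main__":
    print(email_domain("Ann@Example.COM"))
```

Returning None for missing or malformed input lets the surrounding data flow decide what to do with bad records, for example filtering them out in a later step.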
