Draft:Analytics Engineering

From Wikipedia, the free encyclopedia

Analytics engineering sits at the intersection of business, data analysis, and data engineering. It is responsible for delivering modeled, robust, efficient, and integrated data products. An analytics engineer works with business stakeholders to collect business requirements and then models the data in the data warehouse to reflect the business. Once the data is modeled in the data warehouse, the analytics engineer is responsible for delivering it to an information mart, where data analysts and business intelligence teams consume it to produce charts and dashboards that meet business requirements. In short, analytics engineers provide clean data sets to end users, modeling data in a way that empowers end users to answer their own questions.[1]

Tools

An analytics engineer is not responsible for extracting data from source systems; that is usually handled by the data engineering team. As such, the tools used by analytics engineers are geared more towards business requirement collection, data modelling, and data warehousing. To use the metaphor from Dataform, "Data engineers build the cupboard, they gather together the wood and the tools and put it together. The Analytics Engineers open the cupboard and start putting in the plates, mugs, bowls, and arrange them in a certain order. This could be arranging them into particular colours, shapes or sizes. Data analysts then go into the cupboard and they know where everything lives as it is arranged nicely. They can then grab the small blue mug they were looking for and go make a cup of tea!"[2]

Modelling

Conceptual Modelling

Conceptual modelling is typically the starting point and requires an understanding of the business, gained from the subject-matter experts in each area. Mind-mapping tools and other business requirement gathering tools are typically used. Conceptual modelling determines the structure of the business reality and the terms used for business objects, such as Customers, Orders, and Invoices.[3]
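
As an illustration only, the business objects named above and the relationships between them can be captured in a small data structure before any storage design begins. The following Python sketch is a hypothetical example: the class names, the definitions, and the verbs ("places", "is billed by") are assumptions made for illustration and do not come from any particular tool or methodology.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessObject:
    """A named concept agreed with subject-matter experts."""
    name: str
    definition: str


@dataclass
class Relationship:
    """A verb phrase linking two business objects."""
    source: str
    verb: str
    target: str


@dataclass
class ConceptualModel:
    objects: dict[str, BusinessObject] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)

    def add_object(self, name: str, definition: str) -> None:
        self.objects[name] = BusinessObject(name, definition)

    def relate(self, source: str, verb: str, target: str) -> None:
        # Refuse relationships that mention terms nobody has defined yet.
        for term in (source, target):
            if term not in self.objects:
                raise KeyError(f"Unknown business object: {term}")
        self.relationships.append(Relationship(source, verb, target))

    def sentences(self) -> list[str]:
        # Render each relationship as a plain sentence for business review.
        return [f"{r.source} {r.verb} {r.target}" for r in self.relationships]


# Hypothetical definitions, for illustration only.
model = ConceptualModel()
model.add_object("Customer", "A party that buys goods or services")
model.add_object("Order", "A request by a customer to purchase")
model.add_object("Invoice", "A bill issued for an order")
model.relate("Customer", "places", "Order")
model.relate("Order", "is billed by", "Invoice")

for sentence in model.sentences():
    print(sentence)
```

Rendering the model as plain sentences lets subject-matter experts confirm or correct the vocabulary without reading any schema.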

Logical Modelling

In analytics engineering, logical modelling involves modelling business processes and entities.

Warehouse Modelling

Warehouse modelling involves modelling the actual storage structures, which depend on the warehouse modelling methodology being implemented. For example, data vault modeling requires defining hubs, links, and satellites to reflect the business.
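
A minimal sketch of the hub, link, and satellite structures mentioned above, using Python's built-in sqlite3 module as a stand-in for a warehouse. All table names, column names, and sample values are hypothetical, and the use of an MD5 hash of the business key as a surrogate key is an assumption made for illustration, not a requirement stated in this article.

```python
import hashlib
import sqlite3
from datetime import datetime, timezone


def hash_key(*parts: str) -> str:
    """Deterministic surrogate key built from business key parts (assumed convention)."""
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hubs hold one row per business key.
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,
    customer_id   TEXT NOT NULL UNIQUE,
    load_ts       TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_order (
    order_hk      TEXT PRIMARY KEY,
    order_id      TEXT NOT NULL UNIQUE,
    load_ts       TEXT NOT NULL,
    record_source TEXT NOT NULL
);
-- Links record relationships between hubs.
CREATE TABLE link_customer_order (
    customer_order_hk TEXT PRIMARY KEY,
    customer_hk       TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    order_hk          TEXT NOT NULL REFERENCES hub_order(order_hk),
    load_ts           TEXT NOT NULL,
    record_source     TEXT NOT NULL
);
-- Satellites hold descriptive attributes, versioned by load time.
CREATE TABLE sat_customer_details (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_ts       TEXT NOT NULL,
    name          TEXT,
    city          TEXT,
    record_source TEXT NOT NULL,
    PRIMARY KEY (customer_hk, load_ts)
);
""")

now = datetime.now(timezone.utc).isoformat()
c_hk = hash_key("C-001")
o_hk = hash_key("O-1001")

# Hypothetical sample rows from an assumed "crm" and "shop" source.
conn.execute("INSERT INTO hub_customer VALUES (?, ?, ?, ?)", (c_hk, "C-001", now, "crm"))
conn.execute("INSERT INTO hub_order VALUES (?, ?, ?, ?)", (o_hk, "O-1001", now, "shop"))
conn.execute(
    "INSERT INTO link_customer_order VALUES (?, ?, ?, ?, ?)",
    (hash_key("C-001", "O-1001"), c_hk, o_hk, now, "shop"),
)
conn.execute(
    "INSERT INTO sat_customer_details VALUES (?, ?, ?, ?, ?)",
    (c_hk, now, "Ada", "London", "crm"),
)

# Reassemble a business view by joining hub, satellite, and link.
rows = conn.execute("""
    SELECT hc.customer_id, s.name, ho.order_id
    FROM hub_customer hc
    JOIN sat_customer_details s ON s.customer_hk = hc.customer_hk
    JOIN link_customer_order l  ON l.customer_hk = hc.customer_hk
    JOIN hub_order ho           ON ho.order_hk = l.order_hk
""").fetchall()
print(rows)
```

Keeping business keys, relationships, and descriptive attributes in separate tables is what allows each to be loaded and versioned independently in this style of model.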

Storage

Data is stored in a variety of ways; one of the key deciding factors is how the data will be used. Data engineers optimize data storage and processing systems to reduce costs, using techniques such as data compression, partitioning, and archiving.
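
As a rough illustration of partitioning combined with compression, the following Python sketch writes records into one gzip-compressed file per date, so that a reader interested in a single day only opens that day's file. The directory layout (`event_date=YYYY-MM-DD`), the file name, and the sample records are assumptions chosen for the example; production systems typically rely on columnar formats and warehouse-native partitioning instead.

```python
import gzip
import json
import tempfile
from collections import defaultdict
from pathlib import Path

# Hypothetical sample records.
events = [
    {"event_date": "2024-03-26", "order_id": "O-1001", "amount": 40.0},
    {"event_date": "2024-03-26", "order_id": "O-1002", "amount": 15.5},
    {"event_date": "2024-03-27", "order_id": "O-1003", "amount": 99.0},
]


def write_partitioned(records, root):
    """Group records by date and write one gzip-compressed JSON Lines file per date."""
    by_date = defaultdict(list)
    for record in records:
        by_date[record["event_date"]].append(record)

    paths = []
    for date, rows in sorted(by_date.items()):
        part_dir = Path(root) / f"event_date={date}"
        part_dir.mkdir(parents=True, exist_ok=True)
        path = part_dir / "part-0000.jsonl.gz"
        with gzip.open(path, "wt", encoding="utf-8") as fh:
            for row in rows:
                fh.write(json.dumps(row) + "\n")
        paths.append(path)
    return paths


def read_partition(root, date):
    """Read only the partition for one date, leaving the other files untouched."""
    path = Path(root) / f"event_date={date}" / "part-0000.jsonl.gz"
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh]


root = tempfile.mkdtemp()
paths = write_partitioned(events, root)
one_day = read_partition(root, "2024-03-27")
print(len(paths), one_day)
```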

Data warehouses

Main article: Data warehouse

If the data is structured and online analytical processing is required (but not online transaction processing), then a data warehouse is a common choice. Data warehouses enable data analysis, mining, and artificial intelligence on a much larger scale than databases allow, and data often flows from databases into data warehouses. Business analysts, data engineers, and data scientists can access data warehouses using tools such as SQL or business intelligence software.
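
To make the analytical access pattern described above concrete, the following Python sketch runs an aggregate SQL query over a small fact table and dimension table. It uses the built-in sqlite3 module purely as a stand-in for a warehouse engine; the table names, columns, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    city         TEXT NOT NULL
);
CREATE TABLE fact_orders (
    order_key    INTEGER PRIMARY KEY,
    customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    amount       REAL NOT NULL
);
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Bo", "Oslo"), (3, "Cy", "London")],
)
conn.executemany(
    "INSERT INTO fact_orders VALUES (?, ?, ?)",
    [(1, 1, 40.0), (2, 1, 10.0), (3, 2, 15.5), (4, 3, 20.0)],
)

# An analytical query summarizes many rows rather than
# reading or updating a single record.
result = conn.execute("""
    SELECT d.city, COUNT(*) AS order_count, SUM(f.amount) AS revenue
    FROM fact_orders f
    JOIN dim_customer d ON d.customer_key = f.customer_key
    GROUP BY d.city
    ORDER BY d.city
""").fetchall()

for city, order_count, revenue in result:
    print(city, order_count, revenue)
```

A transactional workload would instead insert or update individual orders; the grouping and summing over a whole table is what characterizes the analytical case.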

Data lakes

A data lake is a centralized repository for storing, processing, and securing large volumes of data. A data lake can contain structured data from relational databases, semi-structured data, unstructured data, and binary data. A data lake can be created on premises or in a cloud-based environment using the services from public cloud vendors such as Amazon, Microsoft, or Google.

References

  1. ^ Carroll, Claire. "What is Analytics Engineering?". dbt Labs. Retrieved 2024-03-27.
  2. ^ "What do Analytics Engineers Actually Do? | Dataform". 2023-11-26. Archived from the original on 2023-11-26. Retrieved 2024-03-27.
  3. ^ "Juha Korpela on LinkedIn: Physical data modeling is going to die. Now, you're probably thinking… | 50 comments". www.linkedin.com. Retrieved 2024-03-27.