Big Data


A modern, scalable approach to Business Intelligence

Business Intelligence (B.I.) is traditionally defined as the set of processes, technologies and tools that help organisations collect, organise, analyse and present data to support business decisions. Applied to today’s reality, this definition cannot ignore technical and technological complexities such as:

  • Processing of large quantities of data produced at high frequency, often in real time.
  • Design of storage and processing architectures that can grow gradually over time in proportion to requirements, safeguarding hardware and software investments.
  • Need to integrate information organised in very heterogeneous formats, ranging, for example, from text files to relational databases to video streams.

These aspects, along with many others (data quality, governance, AI, etc.), characterise the sector now generally known as Big Data.

Humanativa boasts pioneering experience in this field, thanks to which it is able to provide expertise covering every aspect in terms of methodology, technology, design and implementation.

Data Platform

Choosing an organisational model

Many features and principles underlying the new B.I. require not only adequate technical tools but, above all, changes at the organisational level: the creation of competencies and the definition of responsibilities. For instance, it is necessary to establish which corporate roles or units are responsible for:

  • Defining governance policies
  • Monitoring compliance with governance criteria (e.g. auditing)
  • Defining quality criteria
  • Monitoring compliance with quality criteria (e.g. data stewardship)

Two approaches are currently considered the main technical/organisational options for the proper design and management of a data platform: Data Fabric and Data Mesh.

Humanativa offers its expertise in this area to guide customers in making the most suitable choices for their context.

Data Fabric

The Forrester Wave 2016 Q4 publication illustrates the fundamental properties of this architecture, which is centralised from both a technical and an organisational point of view. These characteristics make it suitable for small, medium or large companies with a typically pyramid-shaped organisation chart.

Humanativa applies its Big Data expertise to support customers, both large enterprises and SMEs, in building data fabrics in the cloud or on premises, based on latest-generation commercial and open source products.

Data Mesh

In 2018, Zhamak Dehghani, an expert in emerging technologies at Thoughtworks, formulated the Data Mesh paradigm for data platforms. Because it provides for decentralised platform management, it appears most applicable to medium- and large-sized organisations characterised by multiple organisational units with a high degree of independence.

This paradigm requires the support of advanced enabling technologies, such as data virtualisation, query federation, identity federation, data product lifecycle management, etc., which the major cloud players have only recently begun to make available. Humanativa’s skills can guide the customer in making full use of them.

From Data Warehouse to Data Lakehouse

The architectural foundations of a data platform

Data Pipeline

Architectures for data acquisition in the Big Data context

Lambda Architecture

Humanativa has gained extensive experience in designing and implementing Big Data pipelines based on the well-established Lambda architecture, which makes it possible to tap into both batch and real-time data sources. These pipelines use open source technologies such as Kafka and Spark, including their commercial serverless cloud equivalents, as well as data integration products such as Talend, DataStage, Power BI, etc.
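As a rough illustration of the Lambda idea, the following plain-Python sketch computes a batch view from historical data and a speed view from recent events, then merges the two at query time. It deliberately uses in-memory lists rather than Kafka or Spark; the event shape, the function names and the sample figures are all assumptions made for this example and do not describe any specific Humanativa implementation.

```python
from collections import defaultdict

# Hypothetical event shape: (customer_id, amount). Illustrative only.
Event = tuple[str, int]


def aggregate(events: list[Event]) -> dict[str, int]:
    """Sum amounts per customer. In a real Lambda setup the batch and
    speed layers would run equivalent logic on different engines."""
    totals: dict[str, int] = defaultdict(int)
    for customer, amount in events:
        totals[customer] += amount
    return dict(totals)


def merge_views(batch: dict[str, int], speed: dict[str, int]) -> dict[str, int]:
    """Serving layer: combine the batch view with the speed view."""
    merged = dict(batch)
    for customer, amount in speed.items():
        merged[customer] = merged.get(customer, 0) + amount
    return merged


# Batch layer input: the full historical dataset, recomputed periodically.
historical = [("acme", 100), ("globex", 40), ("acme", 25)]
# Speed layer input: events that arrived after the last batch run.
recent = [("acme", 5), ("initech", 12)]

batch = aggregate(historical)
speed = aggregate(recent)
serving = merge_views(batch, speed)
print(serving)
```

The point of the split is that the batch view can be slow but complete, while the speed view covers only the gap since the last batch run; queries always read the merged result.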

Kappa Architecture

Thanks to its mastery of streaming technologies such as Kafka, Spark Streaming, Flink, etc., which can also be used in serverless mode on the major cloud platforms, Humanativa can effectively support the customer in building data pipelines that are more oriented towards streaming. These pipelines follow the so-called Kappa paradigm, which emphasises the role of real-time processing in both data acquisition and use.
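To make the Kappa idea more concrete, here is a minimal plain-Python sketch in which every event goes through a single append-only log and every view is derived by replaying that log through a stream function. The `EventLog` class is a hypothetical in-memory stand-in for a replayable log such as a Kafka topic; it, the reducer names and the sample events are assumptions for illustration only.

```python
from typing import Callable, Iterator

Event = dict[str, str]
Reducer = Callable[[dict[str, int], Event], None]


class EventLog:
    """Hypothetical in-memory stand-in for an append-only, replayable log."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> int:
        """Store the event and return its offset in the log."""
        self._events.append(event)
        return len(self._events) - 1

    def read_from(self, offset: int = 0) -> Iterator[Event]:
        """Yield events in order, starting at the given offset."""
        yield from self._events[offset:]


def build_view(log: EventLog, reducer: Reducer, start_offset: int = 0) -> dict[str, int]:
    """Derive a view by streaming the log through a reducer."""
    view: dict[str, int] = {}
    for event in log.read_from(start_offset):
        reducer(view, event)
    return view


def count_by_page(view: dict[str, int], event: Event) -> None:
    view[event["page"]] = view.get(event["page"], 0) + 1


def count_by_user(view: dict[str, int], event: Event) -> None:
    view[event["user"]] = view.get(event["user"], 0) + 1


log = EventLog()
for e in [
    {"user": "u1", "page": "/home"},
    {"user": "u2", "page": "/home"},
    {"user": "u1", "page": "/pricing"},
]:
    log.append(e)

page_view = build_view(log, count_by_page)
# New processing logic needs no separate batch job: replay the same log.
user_view = build_view(log, count_by_user)
print(page_view, user_view)
```

Unlike the Lambda approach, there is only one code path here: reprocessing history means replaying the log from offset 0 with the new logic.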

Vendors and Technologies for Data Platforms

The business intelligence sector, particularly in its application to the Big Data context, is still expanding rapidly in terms of both market and technology. Humanativa endeavours to keep itself constantly up to date with the products and services on offer, in both the commercial and open source spheres.

Consultants certified in Microsoft Azure, Amazon AWS, Google Cloud Platform, Cloudera Data Platform and Databricks support the customer in all phases of data platform implementation, both in the cloud and on premises, following an approach that meets the customer's specific technical, security and business requirements.

Thanks to its long-standing, in-depth experience with open source technologies for Big Data, Humanativa is able to operate at any level of the architectural stack with products such as Apache Hadoop, Ozone, Iceberg, Delta Lake, Spark, Kafka, Trino, Ranger, Atlas, etc.

On the (visual) data integration side, Humanativa offers expertise in widely used commercial products such as Talend, Microsoft Power BI, IBM DataStage and Informatica, as well as open source products such as Apache NiFi. On the front-end side, know-how ranges from established products such as Tableau and Qlik to emerging open source products such as Apache Superset and Metabase.

Advanced applications

Thanks to its in-depth knowledge of programming languages such as Scala and Python, as well as of data integration and data science frameworks such as Apache Spark, TensorFlow, Keras, Pandas and scikit-learn, Humanativa is able to build high-performance data pipelines, both in the ingestion phases and in the application of machine learning models.

For many clients with particularly critical performance requirements, Humanativa has built complex Spark processes for data extraction, loading and transformation (ELT) based entirely on dynamic Scala code. These processes can be configured in a user-friendly manner while remaining extremely performant, robust and versatile in their possible applications.
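The general idea of a configuration-driven transformation pipeline can be sketched as follows. Note that this is plain Python operating on lists of dictionaries, not the Scala/Spark code mentioned above; the step names (`rename`, `keep_where`, `uppercase`), the configuration format and the sample rows are all hypothetical and chosen only to show how declarative configuration can drive generic transformation code.

```python
from typing import Callable

Row = dict[str, object]


def rename(rows: list[Row], *, source: str, target: str) -> list[Row]:
    """Rename one column in every row."""
    return [{(target if k == source else k): v for k, v in row.items()} for row in rows]


def keep_where(rows: list[Row], *, column: str, equals: object) -> list[Row]:
    """Keep only rows whose column matches the given value."""
    return [row for row in rows if row.get(column) == equals]


def uppercase(rows: list[Row], *, column: str) -> list[Row]:
    """Upper-case the string value of one column."""
    return [{**row, column: str(row[column]).upper()} for row in rows]


# Registry mapping configuration keywords to transformation functions.
STEPS: dict[str, Callable[..., list[Row]]] = {
    "rename": rename,
    "keep_where": keep_where,
    "uppercase": uppercase,
}


def run_pipeline(rows: list[Row], config: list[dict]) -> list[Row]:
    """Apply each configured step in order; 'op' selects the function,
    the remaining keys are passed to it as keyword arguments."""
    for step in config:
        params = {k: v for k, v in step.items() if k != "op"}
        rows = STEPS[step["op"]](rows, **params)
    return rows


# The pipeline is described as data, so it can be changed without new code.
config = [
    {"op": "rename", "source": "cust", "target": "customer"},
    {"op": "keep_where", "column": "country", "equals": "IT"},
    {"op": "uppercase", "column": "customer"},
]
raw = [
    {"cust": "acme", "country": "IT"},
    {"cust": "globex", "country": "US"},
]
result = run_pipeline(raw, config)
print(result)
```

In a design like this, the configuration could just as well be loaded from a file, which is what makes such a pipeline adjustable by users who do not write code.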

Our Open Source Data Fabric Solution

The Big Data architectural solution developed by Humanativa is designed, on the one hand, to lower recurring licence costs and, on the other, to make use of state-of-the-art technologies actively supported by the open source community. It consists of a family of products distributed in containerised form, both on premises and in the cloud, which together cover all the functional requirements of a modern data fabric:

  • Data Lakehouse
  • Data ingestion / data processing
  • Data analytics / business intelligence
  • Data governance / data quality / security
  • Monitoring / auditing