
Data Quality Management in the Data Lakehouse with Spark

MD2 has extensive experience in projects that integrate databases across domains such as customers and products into a single data platform (master data management), as well as in data integration and analysis projects for major corporations in the domestic and international markets. We hold a considerable portfolio of successfully delivered projects in this segment, and recognition from our clients and from the vendor IBM places us in a prominent position in the Brazilian IT market.


MDM is rooted in data reengineering. It involves requirements such as broad connectivity to varied data sources; high performance through massively parallel processing to handle tens of millions of records; a multilayer data model that ensures complete governance at every stage of processing, from data collection through validation and cleansing to the application of quality-treatment routines and retention for curation; and, above all, mature identity resolution mechanisms that unify records at the corporate level securely and reliably while maintaining traceability to their respective origins.


An MDM solution requires sophisticated, mature algorithms for correlating person records. These algorithms demand significant investment to build (drawing on experience implementing them for different industry segments and across the range of person profiles handled when generating the data hub) and must evolve continuously.
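As a rough illustration of what correlating person records with traceability can look like, here is a minimal sketch in plain Python. It is not the MD2 algorithm: the matching rule (same birth date plus similar normalized name), the 0.85 threshold, the field names, and the sample records are all assumptions made up for this example.

```python
from difflib import SequenceMatcher
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def same_person(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Hypothetical rule: equal birth date and sufficiently similar name."""
    if a["birth_date"] != b["birth_date"]:
        return False
    ratio = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    return ratio >= threshold


def unify(records: list) -> list:
    """Greedy clustering: each record joins the first golden record it matches."""
    golden = []
    for rec in records:
        for g in golden:
            if same_person(g, rec):
                # Keep traceability: remember which source row fed this golden record.
                g["sources"].append((rec["system"], rec["id"]))
                break
        else:
            golden.append({
                "name": rec["name"],
                "birth_date": rec["birth_date"],
                "sources": [(rec["system"], rec["id"])],
            })
    return golden


# Illustrative records from three hypothetical source systems.
records = [
    {"system": "crm", "id": "C-1", "name": "João da Silva", "birth_date": "1980-05-17"},
    {"system": "billing", "id": "B-9", "name": "Joao da  Silva", "birth_date": "1980-05-17"},
    {"system": "sales", "id": "S-4", "name": "Maria Souza", "birth_date": "1975-01-02"},
]
golden_records = unify(records)
```

Each golden record keeps a `sources` list of (system, id) pairs, so a unified person can always be traced back to the rows it came from. A production matcher would need blocking, more attributes, and tuning per industry segment, which is where the investment described above goes.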

The MD2 Sparked Master Data Manager data model is designed to address data privacy in every layer of the medallion architecture suggested for data lakehouses. The medallion architecture is a recommended approach for organizing and improving data quality in a data lakehouse, letting data flow progressively through layers of increasing quality and structure: bronze, silver, and gold.


Each layer in the medallion architecture has its own data quality criteria: the bronze layer holds raw, unprocessed data; the silver layer holds cleaned and organized data; and the gold layer holds refined, high-quality data.
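The bronze, silver, and gold progression can be sketched in a few lines. The example below uses plain Python lists and dictionaries rather than Spark DataFrames so that it runs anywhere; the cleaning rules, the column names (`customer_id`, `city`, `amount`), and the sample rows are assumptions chosen only to illustrate the flow.

```python
def to_silver(bronze_rows):
    """Clean raw rows: trim text, drop rows without an id, remove duplicate ids."""
    seen = set()
    silver = []
    for row in bronze_rows:
        customer_id = (row.get("customer_id") or "").strip()
        if not customer_id or customer_id in seen:
            continue  # fails the silver-layer quality criteria
        seen.add(customer_id)
        silver.append({
            "customer_id": customer_id,
            "city": (row.get("city") or "").strip().title(),
            "amount": float(row.get("amount") or 0),
        })
    return silver


def to_gold(silver_rows):
    """Aggregate cleaned rows into a business-level view: total amount per city."""
    totals = {}
    for row in silver_rows:
        totals[row["city"]] = totals.get(row["city"], 0.0) + row["amount"]
    return totals


# Bronze: raw data exactly as it arrived (illustrative rows).
bronze = [
    {"customer_id": " 001 ", "city": "curitiba ", "amount": "10.5"},
    {"customer_id": "001", "city": "Curitiba", "amount": "10.5"},  # duplicate id
    {"customer_id": None, "city": "Recife", "amount": "3"},        # missing id
    {"customer_id": "002", "city": "recife", "amount": "4.5"},
]
silver = to_silver(bronze)
gold = to_gold(silver)
```

In a Spark-based lakehouse the same steps would typically be DataFrame transformations writing to one table per layer; the point here is only that each layer applies stricter criteria than the one before it.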

The layered architecture is subdivided into:


Master data is the sensitive information shared across business processes. It provides the context for the company's day-to-day transactions and operations, and is typically information about people (customers, suppliers, prospects, employees, collaborators), products, and services.

Large organizations run many systems, usually specialized in specific functions, that create and maintain records of people (customers, prospects, patients, partners, suppliers) and things (products and services). Because these records were designed for specific functions, they fragment the view of these information assets and limit or prevent a corporate view of them. The result is rework, less effective communication with people, and difficulty identifying products and services.

Simple questions like "how many customers do we have?" receive different answers, because Marketing has one view, Operations another, and Sales yet another. Creating a unified view of people, products, and services opens up new business opportunities, since a centralized master data base can support many business initiatives: customer experience, CRM, marketing, communication, and so on.


Data quality treatment deserves professionalism, along with high-performing methods and artifacts specific to each layer.

With MD2 Sparked Master Data Manager, data can be managed securely and privately in every layer of the medallion architecture. This includes encryption and data security techniques to protect data privacy, as well as data management capabilities to ensure compliance with privacy regulations and policies.
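One common way to protect personal data as it moves between layers is pseudonymization. The sketch below uses keyed hashing (HMAC-SHA256) from the Python standard library as a stand-in; it is not encryption and is not necessarily the technique the product uses. The key, the field names, and the sample row are assumptions for illustration.

```python
import hashlib
import hmac

# Assumption: in practice the key would be loaded from a managed secret store.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed token for a sensitive value (HMAC-SHA256, hex)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_record(record: dict, sensitive_fields: set) -> dict:
    """Copy the record, replacing sensitive fields with their tokens."""
    return {
        field: pseudonymize(str(val)) if field in sensitive_fields else val
        for field, val in record.items()
    }


# Illustrative row; the values are made up.
silver_row = {"cpf": "123.456.789-09", "name": "Maria Souza", "city": "Recife"}
masked = mask_record(silver_row, {"cpf", "name"})
```

Because the same input and key always produce the same token, masked records can still be joined and deduplicated downstream without exposing the original values. Whether this is sufficient for a given LGPD obligation is a legal question that the sketch does not settle.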

Performance, standardization rules, enrichment, legal-basis management, life cycle management, and unification of person records, to deliver quality data for good business and to ensure diligence under the LGPD.


MD2 Sparked Master Data Manager has full support for integration with the Databricks Data Lakehouse platform, a data, analytics, and artificial intelligence platform that is revolutionizing the world of data management and analysis.


LGPD Clinic: a complete project, prepared by a team of experts in the health segment!

Don't waste any more time! Fill in your details and talk to our experts.
