
Tag: DATA ENGINEERING


IBM Cloud Pak for Data will transform your cloud solutions in 5 key ways
Cloud Pak for Data: a solution for multi-cloud, OpenShift containers, data virtualization and integration, DataOps, and AI pipelines.

How to detect bad behavior in your data warehouse? Measure it with SLA
Imagine having thousands of processes in your data warehouse handling millions of records. How do you know which ones are critical? Which processes consume the most CPU or memory? Which ones consume and produce large amounts of data? How do you find out? The answer to all these questions is SLA.

How to save time by doing more? Learn it with CROZ Data Engineering team!
Documenting doesn’t have to be toil. It can be fast and easy, yet extremely effective and useful. If team members communicate and make a note of their work on a daily basis, the team is more productive, no longer dependent on irreplaceable individuals, and headed for success.

R SHINY – Shiny side of data story
R is a free software environment for statistical computing that also supports a tremendous range of graphs, maps, tables and other visualizations. R is also an extraordinarily user-friendly and accessible tool when it comes to building visualizations.

How to achieve correct, unambiguous, consistent and complete data with Deequ Check
Deequ is a great tool for exploratory data analysis as well as for in-depth data quality evaluation.
Here at the CROZ Data Engineering team, we are excited to use Deequ in our data processing pipeline and are looking...

Making data ingestion easier with DBImport
DBImport is an open-source ingestion tool that uses Sqoop or Spark to ingest data from relational databases into a Hive database in a Hadoop cluster.

SQL Server – Building Dynamic Partition in One Click
Partitioning improves query performance on large tables: queries can access partitions in parallel, data deletion and data loads are faster, manageability is better, and there are many other benefits.

Summer Internship at CROZ: Spending the Summer in Zagreb was Worth It
If you want to feel accepted, understood and stimulated for new challenges, apply for Summer Accelerator – a student internship program at CROZ.

Delta Lake – extract the real value from Data Lake
Delta Lake provides great features and solves some of the biggest issues that come with a data lake. On top of that, it is easy to use! Keep reading this post for some useful tips and tricks.