Documenting doesn’t have to be a chore. It can be fast and easy, yet extremely effective and useful. If team members communicate and make a note of their work on a daily basis, the team is more productive, no longer dependent on irreplaceable individuals, and headed for success.
Tag: DATA ENGINEERING
R is a free software environment for statistical computing that also supports a tremendous range of graphs, maps, tables and other visualizations. R is also an extraordinarily easy and accessible tool when it comes to building visualizations.
Deequ is a great tool for exploratory data analysis as well as for in-depth data quality evaluation.
Here at the CROZ Data Engineering Team, we are excited to use Deequ in our data processing pipeline and are looking
DBImport is an open-source ingestion tool that uses Sqoop or Spark to ingest data from relational databases into a Hive database in a Hadoop cluster.
Partitioning improves query performance on large tables: queries can access partitions in parallel, data deletion and data loading are faster, manageability is better, and there are many other benefits.
If you want to feel accepted, understood and motivated to take on new challenges, apply for Summer Accelerator, a student internship program at CROZ.
Delta Lake provides great features and solves some of the biggest issues that come with a data lake. On top of that, it is easy to use! Keep reading this post for some useful tips and tricks.
The main focus of DataOps is monitoring and optimizing the so-called data factory.
The neat thing about graph databases is that they are graphs, and there are a lot of graph algorithms that can be used to find a solution to your problem. That is why they have been gaining traction lately and why I decided to look deeper into the matter.
If you need help with your data or if you need someone to give you more insight into your data, why not call our Data Engineering department?