Hortonworks – analysing Big Data
Big data is one of the newer buzzwords to have caught the attention of companies around the world. As companies collected ever larger amounts of data, the question arose of what to do with it all. It soon became clear that traditional technologies and tools were not suited to processing data at that scale. To solve this problem, Google came up with three technologies of its own: MapReduce, Google File System and Bigtable. A few years later, Doug Cutting and Mike Cafarella started an open-source project based on Google’s solutions and called it Apache Hadoop. Hadoop consisted primarily of MapReduce and HDFS (the Hadoop Distributed File System). It didn’t take long for the community to realize the value of processing large amounts of data. As a result, dozens of technologies built around the Apache Hadoop ecosystem are at our disposal today.
Hortonworks Data Platform and Open Source Philosophy
Many large companies have invested in the development of big data technologies, which has resulted in multiple distributions of Apache Hadoop being available on the market today. One of the best-known distributions is Hortonworks Data Platform (HDP), developed by Hortonworks.
The HDP distribution comes with a number of big data technologies, such as Spark, Kafka, Ranger, Zookeeper and Zeppelin. What sets Hortonworks apart from its competitors is its commitment to the open-source philosophy, which, as the company claims, encourages innovation. One advantage of this philosophy is that, thanks to its involvement in open-source communities, Hortonworks keeps up with the technology and adds the newest versions to HDP quickly.
Hortonworks was criticized early on: critics claimed it would never succeed because its security made it unfit for enterprise use. Hortonworks has since proven them wrong. With Ambari, an interface for cluster administration, and Ranger for authorization and auditing, HDP is nowadays more than simply enterprise-ready. This is shown by the fact that leading companies worldwide, such as IBM, Yahoo, T-Mobile and Symantec, use it on their clusters.
The Partnership of Hortonworks and CROZ
CROZ has recognized many advantages of HDP over competing distributions, which is why we have decided to partner with Hortonworks. The strength of HDP is underlined by the fact that IBM stopped developing its own distribution last summer and formed a partnership with Hortonworks, to which IBM contributes its Big SQL engine and Data Science Experience environment.
HDP is free of charge; anybody can download it and install it on their cluster. Paid support is available if you want it. However, it is important to note that all the functions of HDP are available in the free version.
CROZ’s consulting services include, among other things, architecture consulting, the installation and configuration of the HDP distribution on your own cluster, and the possibility of piloting HDP on CROZ’s cluster.
The development of big data technologies is in full swing, with no end in sight. Hortonworks keeps introducing new features in every version of HDP, making your cluster safer and more efficient at processing large amounts of data. However, the IT community has only recently started to trust big data technologies and is slowly moving from the pilot phase to developing large projects that will run a significant part of tomorrow’s world.
Get in touch
Want to hear more about our services and projects? Feel free to contact us.