Impala's performance - see consistenly high performance compared to HiveQL; Cloudera Manager is easy to understand; provides complete Big Data applications stack on Hadoop HDFS
Cloudera has very little things to not like it until it fails for some (usually) unknown reason. Crashes are not common but it's very annoying when it happens on a very long job. When we deal with distributed systems, however, failures are a very common...
A great experience that combines ML-Runtimes - MLFlow and Spark. The ability to use Python, and SQL seamlessly in one platform. Since databricks notebooks can be saved as python scripts in the background it is amazing to have both notebook and script...
The biggest kink in Lakehouse platform is its speed. It does not deliver on the performance promised. In addition, the Databricks UI is not easy to use. It feels like it's a smartphone app. On the side of technology, it is slow and expensive, with...
Impala's performance - see consistenly high performance compared to HiveQL; Cloudera Manager is easy to understand; provides complete Big Data applications stack on Hadoop HDFS
A great experience that combines ML-Runtimes - MLFlow and Spark. The ability to use Python, and SQL seamlessly in one platform. Since databricks notebooks can be saved as python scripts in the background it is amazing to have both notebook and script...
Cloudera has very little things to not like it until it fails for some (usually) unknown reason. Crashes are not common but it's very annoying when it happens on a very long job. When we deal with distributed systems, however, failures are a very common...
The biggest kink in Lakehouse platform is its speed. It does not deliver on the performance promised. In addition, the Databricks UI is not easy to use. It feels like it's a smartphone app. On the side of technology, it is slow and expensive, with...