Friday, June 5, 2020
ML cloud services are evolving fast and furious. GCP and AWS are the leading players. Here is a quick visual peak at both ML tech stacks.
AWS has SageMaker as the centerpiece:
Then there is GCP with its Kubeflow angle and on-premises hybrid cloud options:
Posted by Sam Taha at 3:02:00 PM
Tuesday, June 2, 2020
Which to choose for your cloud OLAP engine? There are a lot of choices when it comes to cloud based analytics engines. All the major clouds have their homemade solution (GCP/BigQuery, AWS/Redshift, Azure) and their are plenty of independent options from Snowflake to Databricks to mention a few.
Which is right for your business and in what situation? Needs can vary from internal data exploration to driving downstream analytics with tight SLA. I am a strong proponent of the approach that no matter what you do that you have start with a foundational data lake blueprint and you then choose to build that with either an open source analytics engine on top of your cloud data lake or license a commercial analytic engine such as Redshift, Snowflake or BigQuery.
There is no one answer without looking at your business needs, existing technical foundation and strategic direction, but I have to say have I am getting more impressed with Snowflake as the product matures. Without getting to deep into the details, Snowflake is sort of an in memory (backed by public cloud object storage) data lake with a highly elastic in-memory MPP layer. There are many pros and cons in selecting the best option for your business. The edge Snowflake has it is cloud agnostic (sort of the Anthos of the data cloud) and I really like their cross cloud and data center replication feature (recently released feature) and cross cloud management.
If you want to discuss how to approach making this decision process look me up!
Posted by Sam Taha at 11:11:00 AM