Informatica is an enterprise cloud data management company with a full suite of products focused on data integration and data management. In fact, they are a leader in five different Magic Quadrants, including Enterprise Integration Platform as a Service, Data Quality Tools, and Master Data Management Solutions.
Lior Mechlovich is an Architect for Informatica Cloud, which is Informatica’s Enterprise Integration Platform as a Service offering. Informatica supports 80 of the Fortune top 100 companies, so they understandably have a heavy focus on reliability. They also have a long-term strategy to move their products and offerings from on-premises infrastructure to microservices and the cloud.
A migration of this scale does not happen overnight, and the effort at Informatica has been underway for the last 4 years. Lior and his team are responsible for Informatica Cloud and for onboarding other Informatica products to its cloud infrastructure. As they scaled out from 1 production cluster to 15, they started to hit a wall with their existing architecture. The level of automation and deployment they wanted to achieve was not possible with their existing container strategy. They chose Kubernetes for many reasons, but the primary driver was to increase their deployment cadence. While Kubernetes provides automation, self-healing, and immutable infrastructure, it also comes with a learning curve.
As they moved to Kubernetes, they needed to figure out how to build the organizational confidence to complete the move successfully. Some key tenets they kept in mind throughout the process are:
- Getting visibility
- Gaining context through metadata
- Migrating one piece at a time via side-by-side architecture
- Training and onboarding engineers through platform visibility
Getting visibility
Informatica has been a Sumo Logic customer for 4 years, with hundreds of users across the organization using it to get operational and business insights. Lior is most interested in how different product groups can use data to align, basing discussions and decisions on a shared understanding. When they moved to Kubernetes, it was no different. A variety of teams need visibility into what is going on across their products and services.
By setting up collection into Sumo Logic, they were able to get visibility across all of their clusters. Lior mentioned that in a lot of ways collection setup is more automated with Kubernetes than without. Previously, you had to remember to set up collection for each new VM or server, but with Kubernetes you can use DaemonSets, which deploy a collector on each new node in the cluster. The collectors forward all log, metric, and event data to Sumo Logic, building out a centralized view.
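To make the DaemonSet idea concrete, here is a minimal sketch in Python that builds a DaemonSet manifest as a plain dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The collector name, image, namespace, and the `/var/log` host mount are placeholders chosen for illustration; they are not Informatica's or Sumo Logic's actual configuration.

```python
import json


def collector_daemonset(name: str, image: str, namespace: str = "monitoring") -> dict:
    """Build a DaemonSet manifest that runs one collector pod on every node.

    The name, image, and namespace are hypothetical placeholders.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Assumption: the collector reads node logs from /var/log.
                            "volumeMounts": [
                                {"name": "varlog", "mountPath": "/var/log", "readOnly": True}
                            ],
                        }
                    ],
                    "volumes": [{"name": "varlog", "hostPath": {"path": "/var/log"}}],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = collector_daemonset("log-collector", "example.com/log-collector:1.0")
    print(json.dumps(manifest, indent=2))
```

Because a DaemonSet schedules its pod onto every node, a node added to the cluster later gets a collector without anyone having to remember to install one.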
Getting context through metadata
Context is another crucial element of monitoring Kubernetes with confidence. This means tying together metrics, events, and logs. Sumo Logic uses a centralized metadata pipeline to tag all data with this valuable contextual information, which makes it easier to trace problems during troubleshooting.
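The value of shared metadata can be illustrated with a small Python sketch. It is not Sumo Logic's pipeline: the `PodMetadata` fields, `enrich`, and `related` are hypothetical names invented here to show how attaching the same tags to logs, metrics, and events lets them be looked up together.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class PodMetadata:
    """Hypothetical set of Kubernetes tags shared by every signal from a pod."""
    cluster: str
    namespace: str
    deployment: str
    pod: str


def enrich(record: Dict, meta: PodMetadata) -> Dict:
    """Return a copy of the record with the pod's metadata tags attached."""
    tagged = dict(record)
    tagged.update(asdict(meta))
    return tagged


def related(records: Iterable[Dict], **tags: str) -> List[Dict]:
    """Select every record, of any type, whose tags match the given values."""
    return [r for r in records if all(r.get(k) == v for k, v in tags.items())]


if __name__ == "__main__":
    meta = PodMetadata("prod-1", "billing", "invoice-api", "invoice-api-abc12")
    records = [
        enrich({"type": "log", "message": "timeout calling database"}, meta),
        enrich({"type": "metric", "name": "cpu_usage", "value": 0.93}, meta),
        enrich({"type": "event", "reason": "Unhealthy"}, meta),
    ]
    # One query by deployment returns the log, the metric, and the event together.
    for record in related(records, deployment="invoice-api"):
        print(record["type"], record["pod"])
```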
Migrating one piece at a time
Phased migration is the core of Informatica's migration strategy. They used a side-by-side architecture in which they could slowly push some traffic to the new Kubernetes architecture while keeping the ability to roll back if needed.
To implement this strategy, they set up side-by-side dashboards in Sumo Logic so they could closely monitor both environments as they moved traffic over. As they gained confidence, they moved traffic over service by service and environment by environment.
Training and onboarding engineers
Finally, it was important to ensure that the developers troubleshooting problems had knowledge of the entire system. They needed to train developers to find their services and check the logs for the services they own. Using Sumo Logic, developers can visually see the architecture, go right to their service, and get the logs they need.
What's next for Informatica
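A gradual shift with a rollback path can be sketched as a weighted router. The sketch below is an assumption-laden illustration rather than Informatica's implementation: the `TrafficSplitter` class, the step size, and the error-rate threshold used to trigger rollback are all hypothetical, since the source does not say how the rollback decision was made.

```python
import random
from typing import Optional


class TrafficSplitter:
    """Send a configurable fraction of requests to the new environment."""

    def __init__(self, new_weight: float = 0.0, rng: Optional[random.Random] = None):
        self.new_weight = new_weight  # fraction of traffic sent to Kubernetes
        self._rng = rng or random.Random()

    def route(self) -> str:
        """Pick a backend for one request according to the current weight."""
        return "kubernetes" if self._rng.random() < self.new_weight else "legacy"

    def shift(self, step: float, new_error_rate: float, max_error_rate: float) -> float:
        """Move more traffic over, or roll back fully if the new side looks unhealthy."""
        if new_error_rate > max_error_rate:
            self.new_weight = 0.0  # roll back: all traffic returns to legacy
        else:
            self.new_weight = min(1.0, self.new_weight + step)
        return self.new_weight


if __name__ == "__main__":
    splitter = TrafficSplitter()
    # Error rates as they might be read off the side-by-side dashboards.
    for observed in (0.001, 0.002, 0.050, 0.001):
        weight = splitter.shift(step=0.25, new_error_rate=observed, max_error_rate=0.01)
        print(f"error rate {observed:.3f}: {weight:.0%} of traffic on Kubernetes")
```

Keeping the legacy path live throughout is what makes the rollback cheap: reverting is a weight change, not a redeployment.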
Lior and Informatica are still at the beginning of their migration but are learning a lot through the process. As they continue to roll out more services, they have built out additional organizational rules to further streamline their process: setting requests and limits on pods, converting useful dashboards to alerts, and setting liveness and readiness probes. Check out Lior’s full talk about Kubernetes Migration.
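Rules like these lend themselves to an automated check. As a rough sketch, assuming the rules are enforced by inspecting container specs before deployment (the source does not say how Informatica enforces them), the hypothetical `check_container` function below flags a container that lacks resource requests, resource limits, or either probe. The field names are the standard Kubernetes container fields.

```python
from typing import Dict, List


def check_container(container: Dict) -> List[str]:
    """Return the list of rule violations for one container spec."""
    problems = []
    resources = container.get("resources", {})
    for section in ("requests", "limits"):
        if not resources.get(section):
            problems.append(f"missing resources.{section}")
    for probe in ("livenessProbe", "readinessProbe"):
        if probe not in container:
            problems.append(f"missing {probe}")
    return problems


if __name__ == "__main__":
    # Example container spec with placeholder values.
    container = {
        "name": "invoice-api",
        "image": "example.com/invoice-api:1.0",
        "resources": {"requests": {"cpu": "250m", "memory": "256Mi"}},
        "livenessProbe": {"httpGet": {"path": "/healthz", "port": 8080}},
    }
    for problem in check_container(container):
        print(f"{container['name']}: {problem}")
```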