IT infrastructure engineering and operations transformation: The Now, Next, and Beyond!


Organizations need to continuously transform their infrastructure platforms and operational agility to keep pace with new technological transformations and customer preferences.

This is an exclusive interview conducted by the Editor Team of CIO News with Dr. Rajesh Puneyani, Interim Site Leader, Kenvue India.


The retail ecosystem is expanding rapidly across the globe and becoming more demanding. Technology systems no longer serve customers only locally or within a single geography; their footprint is now global. Customers access these systems from one part of the world to search and shop for products, and have them delivered anywhere else in the world. While this is an excellent trend for businesses, it puts tremendous stress on technology systems in terms of availability, reliability, performance (faster access), and the need to provide a safe and secure environment for customers to surf and shop.

Thus, organizations need to continuously transform their infrastructure platforms and operational agility to keep pace with new technological transformations and customer preferences. These need to be at the core of business processes and infrastructure strategy if an organization is to offer highly reliable and secure systems that are available 24×7 and monitored around the clock for seamless operations.

Challenges Galore:

Regular system maintenance and system reliability are among the top challenges facing the industry. On top of that, the flow of information between various upstream and downstream systems (some internal, some belonging to third-party vendors) adds reliability-specific complexity to providing a seamless omnichannel experience to customers. One cannot underestimate the importance of system reliability, yet it has to be balanced against other things, including new functionality and features for customers. An additional consideration for technology platforms is improving the overall speed of development.

Infrastructure is probably transforming faster than applications. Along with on-premises infrastructure in data centres, organizations are making themselves more resilient by adopting hybrid-cloud and multi-cloud strategies to avoid putting all their eggs in one basket. Key considerations for designing highly reliable, available, scalable, and secure IT infrastructure systems include:

  • High availability
  • Continuous monitoring: eyes on the glass
  • Automation
  • Recovery and business continuity plans, built and tested regularly
  • Flexibility to adopt new application enhancements

But these complexities multiply as systems need to adapt to the latest innovations while the workforce is prepared to manage them (i.e., upskilling the engineers) and other fundamental requirements, such as the total cost of ownership (TCO) of system management and upgrades, are tackled. Put simply, how much workload should be kept on-premises and how much in the cloud, and how do applications transform to work across multiple infrastructure ecosystems?

Getting deeper into opportunities and transformation levers:

Now let us take this landscape further and reflect on how these challenges are perfect opportunities to see and solve things through the lens of transformation and operational excellence. We need to look at process maturity (the ITSM value stream), automation, and self-healing capabilities, along with the people aspects, then slowly shift gears towards SRE transformation, and finally bring ITSM, SRE, and AI together.

In the interest of keeping it short and crisp, I will break down the overall thought process into multiple articles, and I promise there will not be much of a delay between them.

In this article, I will largely focus on the overall philosophy of transformation and excellence from a larger infrastructure and operations point of view.

As our applications become more complex, tightly integrated, and robust, they put a lot of demand on our infrastructure (whether cloud or on-premises). Every senior tech leader has the goal (whether explicitly called out or not) of reducing the opex spent on keeping the lights on (KTLO) and keeping it as low as possible.

Transformation is the equal responsibility of everyone: Too often, operations teams are caught off guard when a new service or application goes live in production, and they have no clue what to do if something breaks. That leads to chaos, confusion, escalations, and delayed service restoration, all of which are easily avoidable. Agile and DevOps practices are important and need to be embraced to streamline operations and enhance collaboration between development and operations teams.

Let there be a top-down mandate all the time to not bypass any part of a new service introduction or other ITIL process in the name of a “speedy path to production”. In the absence of that mandate, new services may get rolled out to production faster, but at what operational risk and at whose risk? Who will own that?

Engineering and application teams need to be made equally responsible for building products with stability and reliability in mind, not just for rolling out feature-rich content.

Process Optimisation: In my previous role, I coined the term SAFE (Simplify, Automate, Federate, Eliminate) as the master term for all transformation and excellence initiatives. Under it, identify every inefficient component in the operational processes across your infrastructure and operations domains. Conduct a thorough analysis of your workflows, identify bottlenecks and waste, and implement process improvements that eliminate waste and manual tasks and enhance efficiency through automation wherever possible. I would add that I have often seen operations teams carrying out activities that belong to engineering teams, adding no value to those activities other than ticking a box. Often this happens because no one has questioned or challenged the status quo, and often the ops team does it because no one in the engineering world wants to do those “boring and trivial activities.” Hence, I would urge everyone not to hesitate to call out such activities, change management exercises for example, so that they can be federated to engineering teams.

Data-Driven Decision Making: Data is the next gold (we all know it), so the focus should be on leveraging analytics across all operational data, including reactive and proactive work, monitoring, and events, to gain insights into the performance, reliability, and stability of your infrastructure and operations. For this to happen, monitoring and alerting should be evaluated at regular intervals, and the approach should move away from component-based monitoring and alerting towards service-based monitoring.
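To make the shift from component-based to service-based monitoring a little more concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than a prescribed design: the "checkout" service, the component names, the `critical` flag, and the three status labels are all hypothetical. The idea is simply that individual component checks are rolled up into one status for the service the customer actually experiences, so alerts are raised on the service rather than on each component.

```python
from dataclasses import dataclass


@dataclass
class ComponentCheck:
    name: str
    healthy: bool
    # Hypothetical flag: True if the service cannot work without this component.
    critical: bool


def service_status(checks: list[ComponentCheck]) -> str:
    """Roll individual component checks up into one service-level status."""
    if any(c.critical and not c.healthy for c in checks):
        return "DOWN"  # a critical component has failed
    if any(not c.healthy for c in checks):
        return "DEGRADED"  # only non-critical components have failed
    return "HEALTHY"


# Illustrative data for a hypothetical "checkout" service.
checkout = [
    ComponentCheck("web-frontend", healthy=True, critical=True),
    ComponentCheck("payment-gateway", healthy=True, critical=True),
    ComponentCheck("recommendation-engine", healthy=False, critical=False),
]

print(service_status(checkout))  # prints "DEGRADED"
```

In this sketch, a failed recommendation engine marks the service as degraded instead of paging someone for a component outage, which is the kind of judgment service-based monitoring is meant to encode.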

Every bit of data tells us something if we have ears to listen to the story behind it. Analyze it to identify areas for improvement, and slowly create a segue towards strong predictive analytics and machine learning algorithms that predict situations before they happen and optimize resource allocation through adequate capacity planning (of all tech and people assets) and performance management.

Talent Development and Organizational Culture: I&O roles can be monotonous and boring at times, and very stressful at others. So, in order to drive holistic transformation, people and their development should always be at the centre of all actions. Keep them motivated, excited, and engaged, and invest in developing the skills and capabilities of your infrastructure and operations teams.

Let us face it: operational transformation will have some impact on people’s roles and jobs, but proactively identifying the roles and people to be repurposed and redeployed is the key to avoiding panic and nervousness in the system and to truly enjoying the whole excellence journey. To that end, identify training opportunities that keep people updated on the latest technologies and industry trends, so that they can move to other engineering and high-value roles. Fostering a culture of continuous learning is the key to embracing transformational initiatives and driving change.

Vendor and Partner Collaboration: In today’s world, it is unlikely that any of our integrated systems run completely in-house without vendor involvement of some sort. So when it comes to operations and transformation, we can’t take our eyes off our vendors. Engaging with them to support our infrastructure and operations transformation is vital, as we need to address every link in the chain of operations and service providers.

Change Management and Communication: Last but not least, change management. In my experience, wonderful planning goes into all the transformation work, but the last mile of change management is forgotten, and this is where resistance, panic, and disengagement arise within the workforce. Transformation initiatives generally encounter resistance and challenges. Implementing a robust change management strategy, one that is honest and transparent with people, is the least every organization should do to manage and mitigate the impact of change on employees and stakeholders. Clearly communicating the vision, goals, and benefits of the transformation, and engaging key stakeholders early to gain their support and address their concerns effectively, are the keys to a successful transformation.

To conclude this episode, I would like to leave you with the questions below, which each leader should ask at various levels, not once but regularly.

Are we so engrossed in daily firefighting that, even though we want to carry out transformation and develop self-healing and auto-healing capabilities, we cannot find enough time for them? If the answer is yes, then, my friend, this needs to be corrected immediately. Firefighting will never end, and no matter how hard the team works (24×7) to KTLO, it will never be sufficient. It is like running on a treadmill: we run very hard and very fast, but in the end we are still standing in the same place.

Have we challenged our engineers and leaders with enough transformation and excellence goals for them to be aggressive? If not, then that is where the focus needs to be. Everyone needs to have skin in the game for holistic operational excellence and transformation to yield results.

Are we constantly thinking about leveraging software-defined networking (SDN) and software-defined storage (SDS) or deploying automation and orchestration tools to keep everyone’s life simpler from day 1 of any new service in BAU?

Are people’s concerns and their development part of the design of our transformation roadmap? If not, then it is surely a recipe for a human disaster.

Follow SAFE diligently and fearlessly; operational transformation results will follow seamlessly.

More will come next on Ops transformation through ITSM process excellence and maturity, along with AI and automation, and big-ticket SRE transformation.


About us:

CIO News, a proprietary of Mercadeo, produces award-winning content and resources for IT leaders across any industry through print articles and recorded video interviews on topics in the technology sector such as Digital Transformation, Artificial Intelligence (AI), Machine Learning (ML), Cloud, Robotics, Cyber-security, Data, Analytics, SOC, and SASE, among other technology topics.