Imperative Health Matters of Data Centre

Bharat Tank, Associate Director - IT & Operations, RICS School of Built Environment | Tuesday, 12 December 2017, 10:34 IST

Rahul Kumar is a seasoned IT professional. He was thrilled when he was offered a job at a multinational company that was entering the Indian market by establishing a manufacturing plant in a Special Economic Zone in the Delhi NCR. Armed with over 18 years of experience in managing IT infrastructure, Kumar was hired to set up a high-quality data centre, which would be the core of the manufacturing plant from the IT operations perspective. The plant was to be fully automated and would use the company's legacy ERP application for local production.

It took almost four months, from infrastructure design through implementation and user training to production, for the data centre to go live. The manufacturing unit started production as per the defined timelines and the company's customers started receiving quality products on time. The IT team grew in strength and Rahul was happy to showcase this as yet another achievement.

The joy was, however, short-lived. Within a year of setting up the data centre, the IT team realized that the rate of hardware failure was increasing, resulting in a lot of calls to OEMs (Original Equipment Manufacturers) asking for the faulty hardware to be replaced. It started with one or two cases, but the failure rate rose steadily. Initially, Rahul and his team treated each failure as an isolated case, without realizing that there was a larger problem.

Nonetheless, hardware failure became a regular talking point in the company's weekly IT meetings. The IT team was puzzled about the reason for the failures. It had been maintaining the recommended cooling and humidity in the data centre and running hardware diagnostics at regular intervals. Apparently, there was nothing wrong with the data centre, and the cause of the failures looked to be a mystery. The failures had begun to affect the company's business. OEMs started refusing to replace their hardware as the failure rate continued unchecked.

Overcoming Dilemma

It was time to set things right. Rahul and his team decided to perform a thorough health check-up of the data centre and its environment. The final report found that the server room air was contaminated with corrosive gases such as sulphur dioxide and hydrogen sulphide, which corroded IT components containing copper and silver parts. Further investigation showed that open drains near the manufacturing plant, which carried Noida city's garbage and the untreated waste water of manufacturing and dyeing units, were emitting highly poisonous and corrosive gases into the environment. Though the production unit was far from the open drains, it was not possible to block the flow of contaminated air, which was affecting the data centre.

The immediate requirement was to improve the air quality of the data centre, and the team had to find ways either to stop the corrosion or at least to slow the rate at which it was damaging IT components. The OEMs were assured that an air quality control unit would be installed in the data centre to stop further damage. The IT team evaluated multiple industrial air purifier systems available in the market and checked recommendations from existing users who had faced similar data centre problems. An air purifier system was installed in the data centre and the air quality improved substantially. After six months, the frequency of hardware failure had fallen to one or two cases in six months to a year, compared with two to three cases a quarter on average earlier.
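To compare the before and after figures on a common footing, the quoted ranges can be converted to cases per year. The following is only an illustrative sketch: the helper name `annualize` is hypothetical, and the "after" figure is computed for both readings of "one or two cases in six months or a year".

```python
def annualize(cases_low, cases_high, period_months):
    """Scale a (low, high) failure count over a period to a per-year range."""
    factor = 12 / period_months
    return cases_low * factor, cases_high * factor

# Before the purifier: two to three cases per quarter (3 months).
before = annualize(2, 3, 3)

# After the purifier: one or two cases per six months (pessimistic reading)
# or per year (optimistic reading).
after_six_months = annualize(1, 2, 6)
after_one_year = annualize(1, 2, 12)

print(before)            # (8.0, 12.0)
print(after_six_months)  # (2.0, 4.0)
print(after_one_year)    # (1.0, 2.0)
```

Even on the pessimistic reading, the annual failure count drops from roughly 8 to 12 cases to at most 4.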

One of the key factors in deciding the capacity of the air purifier is the size of the data centre (similar to deciding the tonnage of an AC depending on the size of the room). Air purifiers need to run 24x7, and their filters need to be cleaned and changed regularly, depending on how poor the air quality in the data centre is. Significantly, the air quality of the data centre should be checked every quarter, or at least twice a year, to gauge the effectiveness of the air purifier. It is equally important to ensure that the data centre has no openings through which external air can enter, since outside air reduces the effectiveness of the air purifier.
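As a rough illustration of how room size drives purifier capacity, a common rule of thumb multiplies the room volume by a target number of air changes per hour. The sketch below assumes that rule; the function name, the room dimensions and the figure of 6 air changes per hour are all hypothetical, and actual sizing should follow the purifier vendor's guidance for the measured contamination level.

```python
def required_airflow_m3_per_hour(length_m, width_m, height_m, air_changes_per_hour):
    """Estimate purifier airflow as room volume times target air changes per hour."""
    room_volume_m3 = length_m * width_m * height_m
    return room_volume_m3 * air_changes_per_hour

# Hypothetical 10 m x 6 m x 3 m server room, assumed 6 air changes per hour.
airflow = required_airflow_m3_per_hour(10, 6, 3, 6)
print(airflow)  # 1080
```

A larger room or a higher air-change target raises the required airflow proportionally, which is the same reasoning as picking a higher AC tonnage for a bigger room.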

Air Quality around Data Centre Matters

IT heads should pay attention to the quality of air around the data centre. When deciding the budget for your company's IT infrastructure, it is highly recommended that a small portion be set aside for testing the air quality of the data centre. Frequent hardware failures can distract the IT department from important matters and waste its time on petty issues. If the data centre's air quality is unhealthy and you do not have options to improve it, please take your IT partner or vendor into confidence and explain the situation before placing any IT infrastructure order. The OEMs should be made aware of the situation, and it is advisable to obtain their confirmation in advance so that they can solve problems if any arise. Document everything diligently, so that you will have the support of your IT partner in case downtime occurs.
