When an SAP system is implemented, it is difficult to predict precisely how its database will grow in the future. Of course, database growth over the coming years is estimated from the available data during the technical design phase, and disk space is sized accordingly.

It is harder, though, to determine how data growth will affect the various aspects of the system’s operation. That depends on how the company develops: the number of operations carried out, the growth in orders, and the number of products and invoices issued.

Today’s hard disk technologies offer large capacity at relatively low prices. However, a disk subsystem that both provides enough space and is fast enough to handle numerous read and write operations is considerably more expensive and harder to build.

Disk space is only one of the problems that come with the quantitative growth of any IT system, including SAP. Other problems include:

  • reduced system performance caused by longer data read and write times (especially for large tables);
  • the need for an efficient backup mechanism that can restore the data after a failure within a time acceptable to the business;
  • greater demand for hardware resources, such as processors and RAM;
  • the long time needed to refresh the test environment with data from the production system (the so-called “client copy”);
  • long downtimes during upgrades.

Large databases bring further inconveniences, such as long refresh times for database statistics or difficulty reorganizing data effectively. All of this leads to a substantial increase in the cost of maintaining the system, in terms of both hardware and human resources.

Tidy up your desk

In business, data on purchases, payments and invoices from 10 years ago is regarded as ancient. It is rarely used for long-term reports or analyses. It is often kept only because of legal requirements to store data and make it available to the relevant institutions (e.g. the Tax Office or the Social Insurance Institution). Statistics also show that data entered into the system a few years ago is read extremely rarely.

In an SAP environment, archiving is the most effective solution to problems with database size. The SAP data archiving mechanism limits the size of the database by moving data to an external archive. A properly set up mechanism makes it possible to run this process regularly so that the database does not grow excessively again. At the same time, access to the archived data is maintained. Because archived data is used only occasionally, it makes sense to store it on media that cost much less than database storage. Moreover, the archived data is compressed (by up to a factor of ten), which adds to the effectiveness of the whole solution.

SAP systems work with a whole range of solutions for storing archived data, from standard disk space, through HSM (Hierarchical Storage Management) devices, to advanced solutions dedicated to archive storage (OpenText, FileNet, etc.).

BCC offers you an approach to archiving historical data in SAP that addresses the cause of the problem rather than its symptoms. An archiving project (that is, moving data that is no longer changed in the system to external media) is worth carrying out because it frees up space in the systems.

Data archiving can be compared to tidying one’s own desk. We cannot keep every document we have on it. Documents we rarely look through go on a shelf, while those we are currently working on stay close at hand.

Limit the database size

DHL Express (Poland) is an example of a company that decided to archive its SAP data. The project, implemented by BCC, was completed at the turn of 2010. The scope of archiving included data from the FI, CO and SD areas.

Since 1999, the amount of data in the system had been growing rapidly as the company expanded and won the Polish courier shipment market. In mid-2009 the SAP database at DHL Express (Poland) was about 2.4 TB in size.

Moreover, forecasts for the coming years showed further database growth. To keep the size of the database from affecting system performance, as much data as possible had to be archived. Another, equally important, goal of the archiving project was to limit database growth in the following years. By taking part in a project led by the SAP Competence Center in cooperation with DHL Express in Poland, DHL consultants gained practical knowledge that would let them archive SAP databases in the following years.

In the first step, the current state was analyzed and an archiving concept was developed. In the second, the solution was configured and tested, and workshops were held to familiarize key users with how the implemented solution works. Finally, in the third step, the solution went into production.

Archiving analysis and concept

During the technical analysis of the database at DHL Express, the biggest and the fastest-growing tables were identified. The biggest was the table of CO line items (COEP); the next two contained sales and distribution data. Based on the analysis, suitable archiving objects were then selected. Archiving covered data from the period 1999 to 2005.

System analysis and the subsequent development of the concept form the most crucial stage of the project. Errors at this stage may lead to problems accessing data, or to a solution that is less effective than expected (e.g. because only a small amount of data turns out to be archivable).

Choosing archiving objects is not simple. It requires broad knowledge of the relevant SAP modules and of the dependencies between the data being archived, which determine the order in which particular tables are archived. One has to bear in mind that CO line items (COEP table) are also archived when sales orders are archived. The archiving scenario should therefore archive SD tables first and CO tables later.

Properly selected objects, together with their dependencies, make it possible to create an archiving map that shows what data will be archived and in what order.
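
One way to picture such an archiving map is as a dependency graph that is sorted into a valid sequence. The Python sketch below is only an illustration: the object labels and the FI entry are assumptions made for the example, and the only rule taken from the text is that SD data is archived before CO data.

```python
from graphlib import TopologicalSorter

# Each key must be archived AFTER every object listed in its value.
# Only the "SD before CO" rule comes from the article; the labels and
# the FI entry are illustrative assumptions, not real SAP object names.
must_follow = {
    "SD sales orders": [],
    "SD billing documents": [],
    "CO line items": ["SD sales orders", "SD billing documents"],
    "FI documents": [],
}


def archiving_order(dependencies):
    """Return one valid archiving sequence.

    graphlib raises CycleError if the rules contradict each other.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = archiving_order(must_follow)
for step, name in enumerate(order, start=1):
    print(step, name)
```

Any sequence the sorter returns places both SD entries ahead of the CO line items; objects without dependencies may appear at any position.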

At the concept stage it is also crucial to define the period from which data will be archived and which data should stay in the database for ease and speed of access. It is essential to find a compromise between the users’ expectation that all data remain available “as usual” and the need to move part of the data to an external archive to improve database performance.

Another significant aspect of the concept is defining access to the archived data. Access is not much different from access to data still in the database. For some data, however, access is possible only through dedicated transactions, and the presentation may differ from that of data read from the database.

Information structures

It was also necessary to activate suitable archive information structures to provide correct access to the data. Information structures are tables that store selected fields from the archived data, which makes access to archived information quick and easy. Optimally selected information structures store only the data needed to read archived information effectively.
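
The idea behind an information structure can be sketched as a small lookup table that maps a few chosen fields to the place where each record sits in the archive. The Python example below is a simplified model, not SAP's implementation; the field names, file names and sample records are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveLocation:
    archive_file: str  # hypothetical archive file name
    offset: int        # hypothetical position of the record in that file


# Invented sample of archived records as they might sit in external storage.
archived_records = [
    {"doc_no": "0001", "customer": "C100", "year": 2003,
     "file": "arch_2003.dat", "offset": 0},
    {"doc_no": "0002", "customer": "C100", "year": 2003,
     "file": "arch_2003.dat", "offset": 512},
    {"doc_no": "0003", "customer": "C200", "year": 2004,
     "file": "arch_2004.dat", "offset": 0},
]


def build_info_structure(records, key_fields):
    """Index archived records by the chosen fields only."""
    index = {}
    for rec in records:
        key = tuple(rec[field] for field in key_fields)
        location = ArchiveLocation(rec["file"], rec["offset"])
        index.setdefault(key, []).append(location)
    return index


by_customer_year = build_info_structure(archived_records, ("customer", "year"))
hits = by_customer_year.get(("C100", 2003), [])
print(hits)
```

In this model a search by a field that was left out of `key_fields`, such as `doc_no`, cannot use the index and would need either a full scan of the archive or a second structure.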

On the one hand, information structures that are too extensive need more space to maintain. On the other hand, structures that are too sparse do not allow the archived data to be searched effectively by all the criteria (fields) of interest.

It is worth remembering that if SAP ERP feeds other solutions (e.g. an SAP BW data store), this should be taken into account at the concept stage, and loading those systems should be planned before archiving.

Note that archiving is a cyclical activity: data that is fresh and indispensable now will become archive data in a few years’ time. It is therefore important to document the activities needed to carry out the whole process properly in subsequent years.

At DHL Express, all arrangements were recorded in a document describing the archiving concept, and each archiving object has its own separate document/guide describing how to use it.

Configuration and tests

In the next stage of the project we prepared a configuration based on the concept, which was later applied to the DHL production system.

The test results confirmed that the assumptions were sound in terms of the volume of archived data, and that the archived data is available to users and can be used effectively.

Thanks to the tests, it was possible to solve several problems that affected the amount of data we could archive. One of the most common problems was the incorrect status of documents or objects to be archived (not closed, not settled, still being processed, etc.). As at the analysis and concept stage, close cooperation with key SAP users is a condition for completing this stage successfully.

In an archiving project, configuration covers the technical part, such as preparing storage space and connecting it to the SAP system (depending on the adopted solution: disks, repositories for archived data, dedicated software, etc.), as well as the business part: specifying retention times for documents, accounts and objects, and setting appropriate flags for specified types of objects and documents.
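
The two business-side conditions mentioned above, a completed status and an elapsed retention time, can be illustrated with a simple eligibility check. This Python sketch is a simplification under stated assumptions: the five-year retention values, the status names and the sample documents are hypothetical and do not come from the project.

```python
# Hypothetical retention times in years per document type (assumed values).
RETENTION_YEARS = {"sales_order": 5, "invoice": 5}

# Statuses that block archiving, modelled on the examples in the text.
BLOCKING_STATUSES = {"not_closed", "not_settled", "in_process"}


def is_archivable(doc, current_year):
    """Return True if the document is old enough and fully processed."""
    retention = RETENTION_YEARS.get(doc["type"])
    if retention is None:
        return False  # no rule configured: keep the document in the database
    if doc["status"] in BLOCKING_STATUSES:
        return False  # open business process: must be completed first
    return doc["fiscal_year"] <= current_year - retention


# Invented sample documents.
docs = [
    {"id": "A", "type": "sales_order", "fiscal_year": 2004, "status": "closed"},
    {"id": "B", "type": "sales_order", "fiscal_year": 2004, "status": "not_settled"},
    {"id": "C", "type": "invoice", "fiscal_year": 2008, "status": "closed"},
]

archivable = [d["id"] for d in docs if is_archivable(d, current_year=2010)]
print(archivable)
```

Document B shows why status problems reduce the archived volume: it is old enough, but it stays in the database until it has been settled.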

The final tests at this stage have to confirm that the desired result is achievable, in terms of both the amount of data that can be removed from the database and the method of accessing the archive.

630 GB of data to be archived

After confirming the concept’s assumptions at the testing stage, we proceeded to production archiving. Unlike in testing, we had to bear in mind that running archiving tasks on the production system could affect its operation. Archiving tasks put a substantial load on the system, so they had to be scheduled for periods when no business-critical operations, such as periodic settlements or month-end closing, were running. Because of the high volume of data, a single archiving task could take up to a few days.

Over the three months that the initial archiving process took at DHL, more than 800 archiving-related tasks were started (preprocessing, archiving, deletion, postprocessing). Over 630 GB of data was archived. Thanks to effective data compression, the archive took up only 73 GB, a high compression ratio of about 8.6.

The numbers clearly show the results of the project. According to estimates, the SAP database at DHL will have grown to about 2.7 TB by 2013. Without archiving historical data it would grow to about 4.2 TB, more than one and a half times as large.
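
The figures quoted in the last two paragraphs can be checked with a few lines of arithmetic. The Python sketch below uses only the rounded numbers given in the text, so its results are approximate; the variable names are chosen for this example.

```python
# Figures quoted in the article (rounded values).
archived_gb = 630           # data removed from the database
archive_gb = 73             # size of the compressed archive
forecast_with_tb = 2.7      # forecast for 2013 with archiving
forecast_without_tb = 4.2   # forecast for 2013 without archiving

# How many times smaller the archive is than the data it holds.
compression_ratio = archived_gb / archive_gb

# Database growth avoided by 2013, in absolute and relative terms.
avoided_tb = forecast_without_tb - forecast_with_tb
size_factor = forecast_without_tb / forecast_with_tb

print(f"compression ratio: {compression_ratio:.1f}")
print(f"growth avoided by 2013: {avoided_tb:.1f} TB")
print(f"unarchived size relative to archived: {size_factor:.2f}x")
```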

Along with the smaller database, better system performance is another result of the project. The system now responds more quickly to queries on the “slimmed” tables. Other advantages of archiving include shorter backup times and shorter restore times after a failure. The cost of the hardware platform needed to maintain the system is lower, too.

The archiving project at DHL took over 6 months, owing to the size of the system and the volume of data. Both the data analysis tasks and the archiving tasks themselves ran for a long time (single archiving tasks took up to several days).

BCC consultants were responsible for archiving data up to 2005. SAP administrators at DHL are now preparing to archive data from subsequent years on their own.

“This period is a compromise between business requirements, such as high-volume data availability, and a reasonable database size that does not affect the system response time. Each year older data will be moved to the archive, where it will also be available to authorised users,” points out Paweł Rutkowski, an SAP Consultant at DHL Express.

It is worth considering archiving early, while the database is still far from reaching terabyte size. An archiving project involves anywhere from dozens to a thousand operations, and these run more easily and quickly on a relatively small database.

Once developed, the archiving map and procedures will be used on a recurring basis to limit database growth and keep it at a reasonable level. This is the greatest added value of archiving in SAP.

DHL Express (Poland), the leader in express courier services on the Polish market, provides nationwide and international courier and express services for companies and institutions. It is part of the DHL group, owned by Deutsche Post DHL, the world’s largest logistics and mail delivery group. The company has three state-of-the-art mail sorting centers, 38 domestic mail terminals, 6 airport sorting centers, 47 service points, 3 aircraft and a fleet of over 2200 trucks. It employs more than 5000 workers and couriers.
For more information: www.dhl.com.pl