This is a continuation of the Data Cleanup series. One common form of bad data is called crosstab. Consolidating this type of data requires lot of manual effort. Pivot – Multiple Consolidation Ranges is a fantastic feature available in Excel which can do this job in a snap. Here is an example of cross tab data…
Let us see how to get this done.
Download Crosstab Consolidation.xlsx (17 kb) from here
For those in a hurry, here are the steps.
- Type the shortcut Alt D P to start Pivot Table Wizard
- Choose Multiple consolidation ranges option, Next.
- Choose I will create page fields, Next.
- Select each crosstab block of data and click Add.
- Choose 1 page fields, Click each range and type a name for the block of data
- Click Finish. Now a pivot table is created
- Combine spelling mistakes or similar names using Pivot Grouping.
- Double click on the grand total to get data in GOOD format.
What is a crosstab?
It is like two tables mixed into one. The months are column titles, and product names are row titles. So we have headings for rows and columns.
This block also has a sort of a name to indicate that this data is from the northern region. This name is just a text caption and may be given near the data or specified in the sheet or file name.
Compare this with the following data which is just tabular. Tabular data only has column headings.
Crosstab is bad data because it requires manual work to analyze further.
Crosstab is like Pivot Table output
Crosstab data is like a report generated from Pivot Table with one item in Row area, one in column area and some calculation in the data area.
Tabular data is raw data – which can then be converted to a report which is usually a crosstab.
Crosstab as input is a bad idea
However, in many cases, the crosstab data itself is your input. For example, let us consider these four blocks of data – sales across four regions.
Now if you want to get consolidated sales across the blocks, you need to perform lots of manual work.. Notice that the products and months are not in same order. Therefore, you will need to copy paste things in the correct order before you can put a combined formula.
We do have the Data tab – Consolidate feature. But that is a restrictive thing to use.
Fortunately, there is a brilliant feature available, which is a part of Pivot Table wizard. Unfortunately, hardly anyone has noticed it.
Solution: Pivot Table Multiple Consolidation Ranges
The older version Pivot Table wizard has this very useful option (2003 and before). The new version pivot table dialog does not have this option. Therefore, we must use the older wizard. To start the wizard, type Alt D followed by P.
The wizard appears. Choose Multiple Consolidation Ranges and click Next.
In the next step, it asks you whether you want to create the page fields. What is a page field? We are choosing multiple blocks of data and combining them. The combined version is will be a pivot table. In that pivot table, we will have a field which allows us to identify each block of data. In this example, each piece of data belongs to different regions. That region name is called the Page name. It will be added to the Page field in the output pivot table.
If you don’t create page fields, Pivot table creates them automatically, calling them Item1, Item2, Item3 and Item4. This can be renamed later, but it is better to do it earlier.
Choose the Second option – I will create page fields. If you don’t need the page names, you can choose the first option.
Select all blocks of data and add the ranges
Now a big, complex looking dialog appears. It looks complex. But it is not. Just select each block of data and click on the Add button.
Data blocks can be in the same sheet, different sheets or different files. It does not matter. The data blocks do not need to be repaired for row and column name consistency. The row and column names can be in any order in the raw data. Do NOT waste time in making the naming of rows and columns by manual copy pasting. Excel will match the names automatically.
The next step is to give legible names to each block of data. Otherwise Excel will use default names like Item1, Item2 and so on.
Click on the number 1 under How many page fields do you want? This means we want to specify 1 set of names. In this case each name will specify the region name.
You can use up to Four sets of names. For example, in this case, one set of names are the region names. Second set of names could be Zone1 and Zone2 which are combination of two regions.
If you have monthly data being consolidated, first set of page names will be base months. But you can also create quarters or H1/H2 by using additional page fields.
Using more than one page field gives you the ability to have multiple classifications of the base blocks of data.
In this case we will choose the region names as the page names – first set.
Click the range on top and then type a name at the Field One textbox. Repeat this process for all the ranges. Resist the urge to press Enter!
When all names are specified, choose Next, select New worksheet option and click Finish.
Now you have a pivot table with all the data consolidated automatically.
This pivot table will be in Compact Mode. If required choose Pivot Table – Design tab – Report Layout – Tabular…
Notice the field list of the pivot table. It shows four fields. Row, Column, Value and Page1. Row shows the combined row headings, column shows combined column headings, values is the sum of all the data and page1 is the field containing the names of regions.
Combining more blocks of data
What if you receive more data from another region, AFTER you have created this consolidated pivot table?
No problem, just keep that file open. Click anywhere inside the pivot table, type Alt D P to get the dialog, click Back button and add the new range. Specify a name for it and click Finish. This way, the data can be combined on an ongoing basis.
From crosstab to tabular in just ONE double click
What we just did is use multiple blocks of crosstab data – which is bad data – and found a quick way of consolidating it. Although our problem is solved, it is a good idea to ask for the data in good format – which is just simple tabular data. That should have four columns…
Region, Product, Month and Amount.
Now go to the grand total of the Pivot table and double click on it. Miraculously, you will see a new sheet added with data converted to nice tabular format.
This is how you convert the BAD crosstab to GOOD tabular data.
Remember that you can use this method to convert bad to good data even if you have just ONE block of cross tab.
This method works only if there is one column containing row headings. For example, data shown here will not be consolidated using the Pivot Table method.
Pivot Table consolidation works only when there is one item in the row labels of column labels. In this case we have two columns in the row labels – the product name and the product ID.
If you want to consolidate this data using Pivot table, you must use only the Product name column and exclude the product id column from the consolidation selection.
Once the data is consolidated and converted to tabular data (double click on the grand total), it is easy to create a vlookup range containing product name first followed by the product id and then use VLOOKUP to add the product id to the final data block.
The same concept should be used if you have multiple layered column headings.
Excel MVP Ashish Mathur has written a very interesting and useful article about how to handle this type of data consolidation.
We will see how Power Query can be used in this situation as an alternative to Pivot Table Consolidation.